First, a note: the import statements here assume you are working in the GCToo directory (e.g., wherever you've forked l1ktools: l1ktools/python/broadinstitute_cmap/io/GCToo). If you'd like to use nested imports, first run setup.py (found in l1ktools/python/broadinstitute_cmap); type

```
$ python setup.py --help 
```

on the command line for details. Then, from your Python session, you should be able to access GCToo with import statements of the following form:

```python
from broadinstitute_cmap.io.GCToo import [package]
```
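
If you are unsure whether the nested form is available in your session, a quick check along these lines may help. This is only a sketch: the helper name `gctoo_is_installed` is made up for illustration, and it assumes that a failed nested import raises `ImportError`.

```python
def gctoo_is_installed():
    """Return True if the nested import path can be resolved in this session."""
    try:
        # Nested form; expected to work only after setup.py has been run.
        import broadinstitute_cmap.io.GCToo  # noqa: F401
    except ImportError:
        return False
    return True


if gctoo_is_installed():
    print("nested imports are available")
else:
    print("nested imports are not available; work from the GCToo directory instead")
```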


# Reading in gct/x files 

#### If you'd like to read in an entire (.gct or .gctx) file to a GCToo instance:

```python
import parse

my_gctoo = parse.parse("functional_tests/both_metadata_example_n1476x978.gctx")
my_gctoo
```

```
<GCToo.GCToo at 0x1149c99d0>
```

#### If you're using the .gctx format, you can also read in only the row or column metadata from a .gctx file.

```python
# read in row metadata only
row_metadata_only = parse.parse("functional_tests/mini_gctx_with_metadata_n2x3.gctx", meta_only="row")
row_metadata_only
```

```
rhd          pr_analyte_id  pr_analyte_num  pr_gene_id  pr_model_id
rid
200814_at    Analyte 11     11              5720
218597_s_at  Analyte 12     12              55847
217140_s_at  Analyte 12     12              7416
```


```python
# read in column metadata only
col_metadata_only = parse.parse("functional_tests/mini_gctx_with_metadata_n2x3.gctx", meta_only="col")
col_metadata_only
```

```
chd                               bead_batch  bead_revision
cid
LJP005_A375_24H:DMSO:-666         b19         r2
LJP005_A375_24H:BRD-K76908866:10  b19         r2
```


#### You can also read in only a certain subset of rids and/or cids to a GCToo instance.

Practically speaking, this is more useful for GCTX files than for GCT files: a GCT file is plain text, so the entire file has to be read in anyway. **You'll need to have a list of the desired rids and/or cids already (this can be obtained by reading in only the metadata first, then subsetting).**
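
One way to build such a list is to filter the metadata with ordinary pandas operations. The sketch below uses a hand-built DataFrame as a stand-in for the result of a metadata-only read (values are copied from the mini example in this document), so it runs without GCToo; the filter criterion is arbitrary.

```python
import pandas as pd

# Hypothetical stand-in for row metadata obtained from a metadata-only read.
row_meta = pd.DataFrame(
    {
        "pr_analyte_id": ["Analyte 11", "Analyte 12", "Analyte 12"],
        "pr_gene_id": [5720, 55847, 7416],
    },
    index=pd.Index(["200814_at", "218597_s_at", "217140_s_at"], name="rid"),
)

# Keep only the rids whose metadata matches the criterion of interest.
mask = row_meta["pr_analyte_id"] == "Analyte 12"
my_rids = row_meta.loc[mask].index.tolist()

print(my_rids)  # ['218597_s_at', '217140_s_at']
```

The resulting list can then be passed as the `rid` argument when parsing.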

```python
my_rids = ["218597_s_at", "200814_at"]
my_cids = ["LJP005_A375_24H:BRD-K76908866:10"]

# you can subset by rids, cids, or both rids and cids
mini_gctoo_subset = parse.parse("functional_tests/mini_gctx_with_metadata_n2x3.gctx", rid=my_rids, cid=my_cids)
mini_gctoo_subset.row_metadata_df
```

```
rhd          pr_analyte_id  pr_analyte_num  pr_gene_id  pr_model_id
rid
200814_at    Analyte 11     11              5720
218597_s_at  Analyte 12     12              55847
```


```python
mini_gctoo_subset.data_df.shape
```

```
(2, 1)
```

```python
mini_gctoo_subset.col_metadata_df
```

```
chd                               bead_batch  bead_revision
cid
LJP005_A375_24H:BRD-K76908866:10  b19         r2
```
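
Note that the subset's row metadata lists `200814_at` first even though it was requested second, which suggests that rows keep their order from the file. The pandas-only sketch below mimics that behaviour with boolean masks on a made-up data matrix. It illustrates the observed result and is not a description of how the parser is implemented.

```python
import pandas as pd

# Hypothetical data matrix: 3 rids x 2 cids, with made-up values.
data_df = pd.DataFrame(
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    index=["200814_at", "218597_s_at", "217140_s_at"],
    columns=["LJP005_A375_24H:DMSO:-666", "LJP005_A375_24H:BRD-K76908866:10"],
)

my_rids = ["218597_s_at", "200814_at"]
my_cids = ["LJP005_A375_24H:BRD-K76908866:10"]

# Boolean masks keep rows and columns in their original order,
# regardless of the order in which the ids were requested.
subset = data_df.loc[data_df.index.isin(my_rids), data_df.columns.isin(my_cids)]

print(subset.shape)        # (2, 1)
print(list(subset.index))  # ['200814_at', '218597_s_at']
```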


# Construct a GCToo object programmatically

For instance, say we want to create a data-only GCToo instance (one that lacks metadata) from a file we've already read in (for this example, let's use `my_gctoo`).

```python
import pandas as pd
import GCToo
import parse

my_gctoo = parse.parse("functional_tests/both_metadata_example_n1476x978.gctx")

# metadata frames that carry only the ids, with no metadata fields
minimal_row_meta = pd.DataFrame(index=my_gctoo.row_metadata_df.index)
minimal_col_meta = pd.DataFrame(index=my_gctoo.col_metadata_df.index)

data_only_gctoo = GCToo.GCToo(data_df=my_gctoo.data_df,
                              row_metadata_df=minimal_row_meta,
                              col_metadata_df=minimal_col_meta)

data_only_gctoo.row_metadata_df.shape
```

Note that the nested form of these imports (`from broadinstitute_cmap.io.GCToo import GCToo` and `from broadinstitute_cmap.io.GCToo import parse`) requires running setup.py first. Without it, the imports fail with:

```
ImportError: No module named broadinstitute_cmap.io.GCToo
```
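
To see what such "minimal" metadata frames contain, here is a pandas-only sketch that runs without GCToo. The data matrix and its ids (`rid_1`, `cid_1`, and so on) are made up for illustration, and the sketch assumes rids label the rows and cids label the columns of the data matrix.

```python
import pandas as pd

# Hypothetical data matrix: 2 rids x 3 cids.
data_df = pd.DataFrame(
    [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
    index=["rid_1", "rid_2"],
    columns=["cid_1", "cid_2", "cid_3"],
)

# A DataFrame built from an index alone has one row per id and no columns.
minimal_row_meta = pd.DataFrame(index=data_df.index)
minimal_col_meta = pd.DataFrame(index=data_df.columns)

print(minimal_row_meta.shape)  # (2, 0)
print(minimal_col_meta.shape)  # (3, 0)
```

Each frame has zero metadata fields but still carries the ids, so the metadata stays aligned with the rows and columns of the data matrix.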