## From ImmuneDB Link
For a hosted ImmuneDB instance, you can download and load data directly from the website URL. Depending on the database size, the initial download may take some time. Once downloaded, the cached copy is used on subsequent calls unless it is explicitly deleted.

In [1]:
import hicutils.core.io as io

df = io.pull_immunedb_data(
    'https://myurl.com/immunedb',
    'mydb',
    'example_data_immunedb'
)

# Show a snippet of the resulting DataFrame
df[['clone_id', 'subject', 'v_gene', 'j_gene', 'cdr3_aa', 'copies', 'shm']]

| | clone_id | subject | v_gene | j_gene | cdr3_aa | copies | shm |
|---|---|---|---|---|---|---|---|
| 8248 | 6311533 | HPAP015 | IGHV2-5 | IGHJ4 | CAHSWVRYNSGWGFHYW | 34 | 1.14 |
| 9810 | 6326493 | HPAP015 | IGHV2-70\|2-70D | IGHJ4 | CARPHGSSGWYYFDYW | 31 | 4.34 |
| 8697 | 6315829 | HPAP015 | IGHV2-5 | IGHJ4 | CARGQWLAPNHFDYW | 27 | 4.56 |
| 7970 | 6308963 | HPAP015 | IGHV2-5 | IGHJ4 | CAHRGSSWDYW | 24 | 1.37 |
| 8549 | 6314347 | HPAP015 | IGHV2-5 | IGHJ4 | CAHSTIRFQYYFDSW | 22 | 3.01 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 4137 | 7029341 | HPAP017 | IGHV1-46 | IGHJ3 | CAAVRYYDSSGYFAAGDSDYGRAGAFDIW | 1 | 3.32 |
| 4135 | 7029336 | HPAP017 | IGHV1-46 | IGHJ3 | CAAANYYDXSGYYHYAFDIW | 1 | 3.79 |
| 4134 | 7029309 | HPAP017 | IGHV1-46 | IGHJ3 | CARDLYDSIGYYRAXAFDIW | 1 | 2.31 |
| 4133 | 7029295 | HPAP017 | IGHV1-46 | IGHJ3 | XARDKYSGSYYLSDAFDIW | 1 | 0.46 |
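
To force a fresh download, the cached copy has to be removed before calling `pull_immunedb_data` again. The sketch below assumes that the third argument in the call above (`'example_data_immunedb'`) names the local directory holding the cached files; that is an assumption about hicutils, not confirmed behavior. `clear_cached_data` is a hypothetical helper and not part of the library.

```python
import shutil
from pathlib import Path


def clear_cached_data(cache_dir):
    """Delete a local cache directory; return True if something was removed.

    Assumption: cache_dir is the directory where the downloaded
    files were stored (hypothetical helper, not part of hicutils).
    """
    path = Path(cache_dir)
    if not path.is_dir():
        # Nothing cached under this name, so there is nothing to delete
        return False
    shutil.rmtree(path)
    return True
```

Under that assumption, calling `clear_cached_data('example_data_immunedb')` before the next `pull_immunedb_data` call should cause the data to be gathered from the server again.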


## From existing files with metadata in filenames
Alternatively, if you have existing files that were exported from ImmuneDB (either using `immunedb_export ... clones ...` or via the website), they can be imported directly. Take, for example, the files below:

In [2]:
%%bash
ls example_data_meta_in_names

HPAP015.T1D.pooled.tsv
HPAP017.Control.pooled.tsv


The files can be imported with the following:

In [3]:
import hicutils.core.io as io

# Specify that the metadata in the filename is the disease status.
# If there are multiple features separated by the _AND_ string,
# per the ImmuneDB specification, the second parameter should
# be a list of all features (e.g. for age and disease, ['age', 'disease']).
df = io.read_tsvs('example_data_meta_in_names', ['disease'])

# Show a snippet of the resulting DataFrame
df[['clone_id', 'subject', 'v_gene', 'j_gene', 'cdr3_aa', 'copies', 'shm']]

| | clone_id | subject | v_gene | j_gene | cdr3_aa | copies | shm |
|---|---|---|---|---|---|---|---|
| 16548 | 6310562 | HPAP015 | IGHV2-5 | IGHJ4\|5 | CARARGAYW | 41 | 4.415122 |
| 16771 | 6311533 | HPAP015 | IGHV2-5 | IGHJ4 | CAHSWVRYNSGWGFHYW | 34 | 1.140000 |
| 19430 | 6326493 | HPAP015 | IGHV2-70\|2-70D | IGHJ4 | CARPHGSSGWYYFDYW | 31 | 4.340000 |
| 17713 | 6315829 | HPAP015 | IGHV2-5 | IGHJ4 | CARGQWLAPNHFDYW | 30 | 4.629000 |
| 7648 | 6262779 | HPAP015 | IGHV1-3 | IGHJ4 | CARAVENHFDWLSNYW | 30 | 5.996667 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 8487 | 7016857 | HPAP017 | IGHV1-3 | IGHJ3 | XXRQGA*QWLVLWGGDAFDIW | 1 | 3.270000 |
| 8488 | 7016859 | HPAP017 | IGHV1-3 | IGHJ3 | CARVMVGYSGYGGXYXVSGYAFDIW | 1 | 2.790000 |
| 8492 | 7016881 | HPAP017 | IGHV1-3 | IGHJ3 | CARGGXRQRVANYXGSGRGAFDIW | 1 | 4.190000 |
| 8493 | 7016885 | HPAP017 | IGHV1-3 | IGHJ3 | CARVSSYGWESAGPDAFDXW | 1 | 4.650000 |
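
To make the filename convention concrete, the sketch below shows how a name such as `HPAP015.T1D.pooled.tsv` could be split into a subject and its feature values. It assumes the layout `<subject>.<value>[_AND_<value>...].pooled.tsv`, inferred from the listing above. `parse_filename_metadata` is a hypothetical helper, not the hicutils implementation, and the second filename in the usage lines is invented for illustration.

```python
from pathlib import Path


def parse_filename_metadata(filename, features):
    """Map a name like 'HPAP015.T1D.pooled.tsv' to subject and feature values.

    Assumed layout: <subject>.<value>[_AND_<value>...].pooled.tsv
    (hypothetical helper, not part of hicutils).
    """
    # Split into subject, the metadata field, and the remaining suffix
    subject, values, _suffix = Path(filename).name.split('.', 2)
    parts = values.split('_AND_')
    if len(parts) != len(features):
        raise ValueError(
            f'expected {len(features)} feature(s) in {filename!r}, '
            f'found {len(parts)}'
        )
    # Pair each feature name with its value, in order
    return {'subject': subject, **dict(zip(features, parts))}


single = parse_filename_metadata('HPAP015.T1D.pooled.tsv', ['disease'])
# Invented filename showing two features joined by _AND_
double = parse_filename_metadata(
    'HPAP017.34_AND_Control.pooled.tsv', ['age', 'disease']
)
print(single)
print(double)
```

The order of the feature list matters: values are matched to names by position, so `['age', 'disease']` and `['disease', 'age']` would label the same file differently.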


## From existing replicate files and metadata file

Finally, if you have AIRR-seq files for each replicate and a metadata file, use the following to load the data.

In [4]:
%%bash
ls example_data_immunedb

HPAP015.IgH_HPAP015_rep1_200p0ng.pooled.tsv
HPAP015.IgH_HPAP015_rep2_200p0ng.pooled.tsv
HPAP017.IgH_HPAP017_rep1_200p0ng.pooled.tsv
HPAP017.IgH_HPAP017_rep2_200p0ng.pooled.tsv
metadata.tsv


In [5]:
import hicutils.core.io as io

df = io.read_directory('example_data_immunedb')

# Show a snippet of the resulting DataFrame
df[['clone_id', 'subject', 'v_gene', 'j_gene', 'cdr3_aa', 'copies', 'shm']]

| | clone_id | subject | v_gene | j_gene | cdr3_aa | copies | shm |
|---|---|---|---|---|---|---|---|
| 8248 | 6311533 | HPAP015 | IGHV2-5 | IGHJ4 | CAHSWVRYNSGWGFHYW | 34 | 1.14 |
| 9810 | 6326493 | HPAP015 | IGHV2-70\|2-70D | IGHJ4 | CARPHGSSGWYYFDYW | 31 | 4.34 |
| 8697 | 6315829 | HPAP015 | IGHV2-5 | IGHJ4 | CARGQWLAPNHFDYW | 27 | 4.56 |
| 7970 | 6308963 | HPAP015 | IGHV2-5 | IGHJ4 | CAHRGSSWDYW | 24 | 1.37 |
| 8549 | 6314347 | HPAP015 | IGHV2-5 | IGHJ4 | CAHSTIRFQYYFDSW | 22 | 3.01 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 4137 | 7029341 | HPAP017 | IGHV1-46 | IGHJ3 | CAAVRYYDSSGYFAAGDSDYGRAGAFDIW | 1 | 3.32 |
| 4135 | 7029336 | HPAP017 | IGHV1-46 | IGHJ3 | CAAANYYDXSGYYHYAFDIW | 1 | 3.79 |
| 4134 | 7029309 | HPAP017 | IGHV1-46 | IGHJ3 | CARDLYDSIGYYRAXAFDIW | 1 | 2.31 |
| 4133 | 7029295 | HPAP017 | IGHV1-46 | IGHJ3 | XARDKYSGSYYLSDAFDIW | 1 | 0.46 |
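
Before loading a directory like the one listed above, it can help to check which replicate files belong to which subject. The sketch below groups the filenames by their first dot-separated field. It assumes the layout `<subject>.<replicate>.pooled.tsv`, inferred from the listing, and it does not describe what `read_directory` does internally. `group_replicates_by_subject` is a hypothetical helper.

```python
from collections import defaultdict
from pathlib import Path


def group_replicates_by_subject(filenames):
    """Group replicate names by subject.

    Assumed layout: <subject>.<replicate>.pooled.tsv
    (hypothetical helper, not part of hicutils).
    """
    groups = defaultdict(list)
    for name in filenames:
        base = Path(name).name
        if not base.endswith('.pooled.tsv'):
            # Skip anything else in the directory, such as metadata.tsv
            continue
        subject, replicate = base.split('.', 2)[:2]
        groups[subject].append(replicate)
    return dict(groups)


listing = [
    'HPAP015.IgH_HPAP015_rep1_200p0ng.pooled.tsv',
    'HPAP015.IgH_HPAP015_rep2_200p0ng.pooled.tsv',
    'HPAP017.IgH_HPAP017_rep1_200p0ng.pooled.tsv',
    'HPAP017.IgH_HPAP017_rep2_200p0ng.pooled.tsv',
    'metadata.tsv',
]
replicates = group_replicates_by_subject(listing)
for subject, names in replicates.items():
    print(subject, len(names), names)
```

With the example listing, each subject should end up with two replicates, which is a quick way to spot a missing or misnamed file before importing.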
