# Library: Quick Start

## Basic usage

#### 1. Load TCRs into a data frame

Examples of files you may want to load:

- **10X**: `filtered_contig_annotations.csv`

- **Adaptive**: `Sample_TCRB.tsv`

- **IMGT**: Output from `MiXCR` or other tools

```python
import tcrconvert
import pandas as pd

tcr_file = tcrconvert.get_example_path('tenx.csv')
tcrs = pd.read_csv(tcr_file)[['barcode', 'v_gene', 'j_gene', 'cdr3']]
tcrs
```

|   | barcode | v_gene | j_gene | cdr3 |
|---|---|---|---|---|
| 0 | AAACCTGAGACCACGA-1 | TRAV29DV5 | TRAJ12 | CAVMDSSYKLIF |
| 1 | AAACCTGAGACCACGA-1 | TRBV20/OR9-2 | TRBJ2-1 | CASSGLAGGYNEQFF |
| 2 | AAACCTGAGGCTCTTA-1 | TRDV2 | TRDJ3 | CASSGVAGGTDTQYF |
| 3 | AAACCTGAGGCTCTTA-1 | TRGV9 | TRGJ1 | CAVKDSNYQLIW |
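
The 10X example above is comma-separated, whereas files such as `Sample_TCRB.tsv` are tab-separated. A minimal sketch of loading tab-separated data with pandas follows; the in-memory text and its `cdr3` column are made up for illustration, and a real export will likely contain many more columns.

```python
import io
import pandas as pd

# Stand-in for the contents of a tab-separated export. The single row is
# illustrative only and reuses Adaptive-style names shown on this page.
adaptive_text = (
    "v_resolved\tj_resolved\tcdr3\n"
    "TCRBV20-or09_02*01\tTCRBJ02-01*01\tCASSGLAGGYNEQFF\n"
)

# sep='\t' is the only difference from reading the comma-separated 10X file.
# For a real file, pass its path instead of the StringIO object.
adaptive = pd.read_csv(io.StringIO(adaptive_text), sep="\t")
```

The resulting data frame can then be passed to `convert_gene` with `frm='adaptive'`.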


#### 2. Convert

```python
new_tcrs = tcrconvert.convert_gene(tcrs, frm='tenx', to='adaptive')
new_tcrs
```

INFO - Converting from 10X. Using *01 as allele for all genes.


|   | barcode | v_gene | j_gene | cdr3 |
|---|---|---|---|---|
| 0 | AAACCTGAGACCACGA-1 | TCRAV29-01*01 | TCRAJ12-01*01 | CAVMDSSYKLIF |
| 1 | AAACCTGAGACCACGA-1 | TCRBV20-or09_02*01 | TCRBJ02-01*01 | CASSGLAGGYNEQFF |
| 2 | AAACCTGAGGCTCTTA-1 | TCRDV02-01*01 | TCRDJ03-01*01 | CASSGVAGGTDTQYF |
| 3 | AAACCTGAGGCTCTTA-1 | TCRGV09-01*01 | TCRGJ01-01*01 | CAVKDSNYQLIW |


> **Tip**: Suppress INFO-level messages by setting `verbose=False`. Warnings and errors will still appear.

> **Tip**: If your Adaptive data lacks `x_resolved`/`xMaxResolved` columns, create them yourself by combining the `x_gene`/`xGeneName` and `x_allele`/`xGeneAllele` columns. See the FAQs.
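
One way to build such a column with pandas is sketched below. The helper `add_resolved` is hypothetical (not part of `tcrconvert`), and the output format is an assumption: gene and allele joined by `*`, with the allele zero-padded to two digits, mirroring names like `TCRBV20-or09_02*01` shown above. Check the FAQs for the format your data actually needs.

```python
import pandas as pd

def add_resolved(df, gene_col, allele_col, out_col):
    """Hypothetical helper: combine gene and allele into e.g. 'TCRBV20-01*01'."""
    out = df.copy()
    gene = out[gene_col]
    # Alleles may arrive as 1, 1.0 or '01'; normalise them to numbers first.
    allele = pd.to_numeric(out[allele_col], errors="coerce")
    has_allele = allele.notna() & gene.notna()
    # Zero-pad to two digits (assumed format). Missing alleles get a
    # placeholder here that is discarded by the `where` below.
    padded = allele.fillna(0).astype(int).astype(str).str.zfill(2)
    # Keep the bare gene name where no allele is known.
    out[out_col] = gene.where(~has_allele, gene + "*" + padded)
    return out

example = pd.DataFrame(
    {"v_gene": ["TCRBV20-01", "TCRBV05-01"], "v_allele": [1, None]}
)
resolved = add_resolved(example, "v_gene", "v_allele", "v_resolved")
```

Leaving the bare gene name when the allele is unknown is a design choice of this sketch; you may prefer to drop or flag such rows instead.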

## AIRR data

Supply the standard AIRR gene column names to `frm_cols`:

```python
# `airr` is a pandas DataFrame of AIRR-formatted data that you have already loaded.
new_airr = tcrconvert.convert_gene(
    airr,
    frm='imgt',
    to='adaptive',
    frm_cols=['v_call', 'd_call', 'j_call', 'c_call'],
)
```

## Custom column names

By default, `TCRconvert` assumes these column names based on the input nomenclature (`frm`):

- `frm='imgt'` : `['v_gene', 'd_gene', 'j_gene', 'c_gene']`

- `frm='tenx'` : `['v_gene', 'd_gene', 'j_gene', 'c_gene']`

- `frm='adaptive'` : `['v_resolved', 'd_resolved', 'j_resolved']`

- `frm='adaptivev2'` : `['vMaxResolved', 'dMaxResolved', 'jMaxResolved']`
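
Before converting, it can help to check which of these default columns your data frame actually contains. The sketch below copies the defaults from the list above into a dictionary; `DEFAULT_COLS` and `present_default_cols` are hypothetical names for illustration, not part of the `tcrconvert` API.

```python
import pandas as pd

# Default gene columns per input nomenclature, copied from the list above.
DEFAULT_COLS = {
    "imgt": ["v_gene", "d_gene", "j_gene", "c_gene"],
    "tenx": ["v_gene", "d_gene", "j_gene", "c_gene"],
    "adaptive": ["v_resolved", "d_resolved", "j_resolved"],
    "adaptivev2": ["vMaxResolved", "dMaxResolved", "jMaxResolved"],
}

def present_default_cols(df, frm):
    """Return the default gene columns for `frm` that exist in `df`."""
    return [col for col in DEFAULT_COLS[frm] if col in df.columns]

# A frame with non-default names has none of the 10X defaults.
custom_like = pd.DataFrame({"myVgene": ["TRAV1-2"], "myJgene": ["TRAJ12"]})
found = present_default_cols(custom_like, "tenx")
```

An empty result suggests the gene columns use non-default names and need to be named explicitly.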

You can override these columns using `frm_cols`:

**1. Load 10X data with custom column names**

```python
custom_file = tcrconvert.get_example_path('customcols.csv')

custom = pd.read_csv(custom_file)
custom
```

|   | myVgene | myDgene | myJgene | myCgene | myCDR3 | antigen |
|---|---|---|---|---|---|---|
| 0 | TRAV1-2 | TRBD1 | TRAJ12 | TRAC | CAVMDSSYKLIF | Flu |
| 1 | TRBV6-1 | TRBD2 | TRBJ2-1 | TRBC2 | CASSGLAGGYNEQFF | Flu |
| 2 | TRBV6-4 | TRBD2 | TRBJ2-3 | TRBC2 | CASSGVAGGTDTQYF | CMV |
| 3 | TRAV1-2 | TRBD1 | TRAJ33 | TRAC | CAVKDSNYQLIW | CMV |
| 4 | TRBV2 | TRBD1 | TRBJ1-2 | TRBC1 | CASNQGLNYGYTF | CMV |


**2. Specify names using `frm_cols` and convert to IMGT**

```python
custom_new = tcrconvert.convert_gene(
    custom,
    frm='tenx',
    to='imgt',
    verbose=False,
    frm_cols=['myVgene', 'myDgene', 'myJgene', 'myCgene'],
)
custom_new
```

|   | myVgene | myDgene | myJgene | myCgene | myCDR3 | antigen |
|---|---|---|---|---|---|---|
| 0 | TRAV1-2*01 | TRBD1*01 | TRAJ12*01 | TRAC*01 | CAVMDSSYKLIF | Flu |
| 1 | TRBV6-1*01 | TRBD2*01 | TRBJ2-1*01 | TRBC2*01 | CASSGLAGGYNEQFF | Flu |
| 2 | TRBV6-4*01 | TRBD2*01 | TRBJ2-3*01 | TRBC2*01 | CASSGVAGGTDTQYF | CMV |
| 3 | TRAV1-2*01 | TRBD1*01 | TRAJ33*01 | TRAC*01 | CAVKDSNYQLIW | CMV |
| 4 | TRBV2*01 | TRBD1*01 | TRBJ1-2*01 | TRBC1*01 | CASNQGLNYGYTF | CMV |


## Rhesus or mouse data

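For non-human data, set `species='rhesus'` or `species='mouse'`. In the example below, the three gene names listed in the output are not converted and come back empty in the result.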

```python
new_tcrs = tcrconvert.convert_gene(
    tcrs, frm='tenx', to='imgt', verbose=False, species='rhesus'
)  # or 'mouse'
new_tcrs
```

 ['TRAV29DV5', 'TRBV20/OR9-2', 'TRGJ1']


|   | barcode | v_gene | j_gene | cdr3 |
|---|---|---|---|---|
| 0 | AAACCTGAGACCACGA-1 |  | TRAJ12*01 | CAVMDSSYKLIF |
| 1 | AAACCTGAGACCACGA-1 |  | TRBJ2-1*01 | CASSGLAGGYNEQFF |
| 2 | AAACCTGAGGCTCTTA-1 | TRDV2*01 | TRDJ3*01 | CASSGVAGGTDTQYF |
| 3 | AAACCTGAGGCTCTTA-1 | TRGV9*01 |  | CAVKDSNYQLIW |
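
To see which input names were lost, you can compare the data frames from before and after conversion. The sketch below uses a hypothetical helper, `unconverted`, and assumes that unconverted genes are stored as missing values (`NaN`/`None`) and that both frames share the same row order and index. The small frames are typed in by hand from the tables on this page.

```python
import pandas as pd

def unconverted(before, after, cols):
    """Hypothetical helper: original names present in `before` but missing in `after`."""
    lost = {}
    for col in cols:
        # A gene was lost if it had a value before and has none afterwards.
        mask = before[col].notna() & after[col].isna()
        lost[col] = before.loc[mask, col].tolist()
    return lost

before = pd.DataFrame({
    "v_gene": ["TRAV29DV5", "TRBV20/OR9-2", "TRDV2", "TRGV9"],
    "j_gene": ["TRAJ12", "TRBJ2-1", "TRDJ3", "TRGJ1"],
})
after = pd.DataFrame({
    "v_gene": [None, None, "TRDV2*01", "TRGV9*01"],
    "j_gene": ["TRAJ12*01", "TRBJ2-1*01", "TRDJ3*01", None],
})
lost = unconverted(before, after, ["v_gene", "j_gene"])
```

With real data, pass the original `tcrs` and the converted `new_tcrs` in place of the hand-built frames.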
