# Usage (command-line)

## Basic usage

TCRconvert takes a `.csv` or `.tsv` file with at least one column of gene names as input. It produces a `.csv` or `.tsv` file with converted gene names as output.

**First, download the [folder](https://github.com/seshadrilab/tcrconvert/tree/1-add-cli/tcrconvert/examples) of example data from GitHub.** In this example I've cloned the entire repo into `~/workspace`.

**Inspect our input 10X data**

In [None]:
!cat ../../tcrconvert/examples/tenx.csv

**Convert gene names from 10X to Adaptive...**

In [None]:
!tcrconvert convert \
    -i ../../tcrconvert/examples/tenx.csv \
    -o ../../tcrconvert/examples/tenx2adapt.tsv \
    --frm tenx \
    --to adaptive

> Tip: Suppress warnings by including the `-q` / `--quiet` flag.

> Tip: You can experiment with enabling [tab completion](https://click.palletsprojects.com/en/stable/shell-completion/) for subcommands.

In [None]:
!cat ../../tcrconvert/examples/tenx2adapt.tsv

**...or to IMGT.**

In [None]:
!tcrconvert convert \
    -i ../../tcrconvert/examples/tenx.csv \
    -o ../../tcrconvert/examples/tenx2imgt.csv \
    --frm tenx \
    --to imgt

In [None]:
!cat ../../tcrconvert/examples/tenx2imgt.csv

**Convert back to 10X to see that no data is lost.** The final `diff` prints nothing when the two files are identical.

In [None]:
!tcrconvert convert \
    -i ../../tcrconvert/examples/tenx2imgt.csv \
    -o ../../tcrconvert/examples/imgt2tenx.csv \
    --frm imgt \
    --to tenx

In [None]:
!diff ../../tcrconvert/examples/imgt2tenx.csv \
      ../../tcrconvert/examples/tenx.csv
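For a per-column view of the same round-trip check, the comparison can be sketched in Python. This is only an illustration and not part of TCRconvert: the helper name `differing_columns` and the toy tables are made up for the example.

```python
import csv
import io


def differing_columns(text_a, text_b):
    """Return the sorted names of columns whose values differ between two CSV texts."""
    rows_a = list(csv.DictReader(io.StringIO(text_a)))
    rows_b = list(csv.DictReader(io.StringIO(text_b)))
    if len(rows_a) != len(rows_b):
        raise ValueError("The two tables have different numbers of rows")
    changed = set()
    for a, b in zip(rows_a, rows_b):
        # Compare every column that appears in either row
        for col in set(a) | set(b):
            if a.get(col) != b.get(col):
                changed.add(col)
    return sorted(changed)


# Toy tables (hypothetical values, not the real example files)
original = "v_gene,j_gene\nTRBV2,TRBJ1-1\n"
round_trip = "v_gene,j_gene\nTRBV2,TRBJ1-1\n"
altered = "v_gene,j_gene\nTRBV2,TRBJ1-2\n"

print(differing_columns(original, round_trip))  # no columns differ
print(differing_columns(original, altered))     # only j_gene differs
```

To use it on real files, read each file's text with `open(path, newline="").read()` and pass the two strings in.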

## Custom column names

TCRconvert assumes the gene column names below, depending on the value of `--frm`. Note that there are no standard IMGT column names and that Adaptive does not capture C genes.

* `--frm imgt` uses `v_gene`, `d_gene`, `j_gene`, `c_gene`
* `--frm tenx` uses `v_gene`, `d_gene`, `j_gene`, `c_gene`
* `--frm adaptive` uses `v_resolved`, `d_resolved`, `j_resolved`
* `--frm adaptivev2` uses `vMaxResolved`, `dMaxResolved`, `jMaxResolved`

At least one of the assumed columns needs to be in the input data. You can use your own columns with the `-c` / `--frm_cols` flag.

> For AIRR files, use the column names `v_call`, `d_call`, `j_call`, `c_call`.
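Before running a conversion, it can help to check which gene columns your file actually has. The sketch below copies the column lists from this section into a dictionary and reports which of them appear in a CSV header. It is a stand-alone pre-flight check, not TCRconvert's own logic, and the names `ASSUMED_COLUMNS` and `present_gene_columns` are hypothetical.

```python
import csv
import io

# Column names copied from the list in this section
ASSUMED_COLUMNS = {
    "imgt": ["v_gene", "d_gene", "j_gene", "c_gene"],
    "tenx": ["v_gene", "d_gene", "j_gene", "c_gene"],
    "adaptive": ["v_resolved", "d_resolved", "j_resolved"],
    "adaptivev2": ["vMaxResolved", "dMaxResolved", "jMaxResolved"],
}


def present_gene_columns(header, frm, custom_cols=None):
    """Return the expected gene columns that are present in the header."""
    wanted = custom_cols if custom_cols else ASSUMED_COLUMNS[frm]
    return [col for col in wanted if col in header]


# Toy input with hypothetical custom column names
sample = "myVgene,myJgene,cdr3\nTRBV2,TRBJ1-1,CASSF\n"
header = next(csv.reader(io.StringIO(sample)))

default_hits = present_gene_columns(header, "tenx")
custom_hits = present_gene_columns(
    header, "tenx", custom_cols=["myVgene", "myDgene", "myJgene", "myCgene"]
)
print(default_hits)  # none of the assumed 10X columns are present
print(custom_hits)   # the custom columns found in the header
```

An empty result for the default names suggests the file needs the `-c` / `--frm_cols` flag.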

**Inspect our 10X-format data:**

In [None]:
!cat ../../tcrconvert/examples/customcols.csv

**Specify gene column names using `-c` / `--frm_cols` and convert to IMGT:**

Note that each column name needs its own `-c` flag.

In [None]:
!tcrconvert convert \
    -i ../../tcrconvert/examples/customcols.csv \
    -o ../../tcrconvert/examples/custom2imgt.csv \
    --frm tenx \
    --to imgt \
    -c myVgene \
    -c myDgene \
    -c myJgene \
    -c myCgene

In [None]:
!cat ../../tcrconvert/examples/custom2imgt.csv

> Tip: If your Adaptive data doesn't have `x_resolved` or `xMaxResolved` columns, simply make them yourself by combining text from the gene and allele columns using `*` as a separator.
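A minimal sketch of the tip above, using only the Python standard library. The input column names (`v_gene`, `v_allele`) and the toy values are assumptions for illustration, so substitute whatever your export actually uses. Keeping the bare gene name when the allele is missing is also a choice made for this example, not documented TCRconvert behavior.

```python
import csv
import io


def add_resolved_column(rows, gene_col, allele_col, resolved_col):
    """Return copies of the rows with gene and allele joined by '*'."""
    out = []
    for row in rows:
        new_row = dict(row)
        gene = (row.get(gene_col) or "").strip()
        allele = (row.get(allele_col) or "").strip()
        # Join with '*' only when both parts are present; otherwise keep
        # the gene name as-is (an assumption made for this sketch).
        new_row[resolved_col] = f"{gene}*{allele}" if gene and allele else gene
        out.append(new_row)
    return out


# Toy table with hypothetical column names and values
sample = "v_gene,v_allele\nTCRBV05-01,01\nTCRBV06-01,\n"
rows = list(csv.DictReader(io.StringIO(sample)))
resolved = add_resolved_column(rows, "v_gene", "v_allele", "v_resolved")
print([r["v_resolved"] for r in resolved])
```

Repeat the call for the D and J columns, then write the rows back out with `csv.DictWriter` before running `tcrconvert convert`.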

## Rhesus or mouse data

**Specify the species, if not human, using the `-s` / `--species` flag:**

Note that TRBV2 is not a rhesus macaque gene.

In [None]:
!tcrconvert convert \
    -i ../../tcrconvert/examples/tenx.csv \
    -o ../../tcrconvert/examples/tenx2adapt.tsv \
    --frm tenx \
    --to adaptive \
    -s rhesus  # or mouse

## Using a custom reference

**You may want to create a reference for a species that isn't already included**, such as rabbit. To do so, you'll need FASTA files that contain TCR gene names in the headers in this format:

```
>SomeText|TRBV10-1*02|MoreText|...
```
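To check that your FASTA headers follow this layout, you can pull out the second `|`-delimited field and confirm it looks like a gene name. The sketch below is only an illustration of the header format shown above. It does not claim to reproduce how `tcrconvert build` parses files, and the function names are hypothetical.

```python
def gene_from_header(header):
    """Return the gene name stored in the second '|'-delimited field."""
    fields = header.strip().lstrip(">").split("|")
    if len(fields) < 2 or not fields[1]:
        raise ValueError(f"No gene name found in header: {header!r}")
    return fields[1]


def split_allele(name):
    """Split 'GENE*ALLELE' into (gene, allele); allele is '' if absent."""
    gene, _, allele = name.partition("*")
    return gene, allele


name = gene_from_header(">SomeText|TRBV10-1*02|MoreText|...")
print(name)
print(split_allele(name))
```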

1. The easiest way is to download the reference FASTAs for every gene group from [IMGT](https://www.imgt.org/vquest/refseqh.html) into a folder.

2. Build the lookup tables, specifying the species name you'll use when running TCRconvert:

```
$ tcrconvert build -i path/to/fasta/dir/ -s rabbit
```