Add convenience script to readme
corneliusroemer committed Aug 10, 2023
1 parent ff80c24 commit dc45bb9
Showing 1 changed file with 36 additions and 0 deletions.
36 changes: 36 additions & 0 deletions README.md
@@ -44,8 +44,44 @@ See [tests](tests/test_aliasor.py) for more examples.

## Installation

Install with any one of the following commands:

```bash
pip install pango_aliasor
conda install -c bioconda pango_aliasor
mamba install -c bioconda pango_aliasor
```
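
To confirm that the install worked, a quick check along these lines may help. This is only a sketch: the helper name `is_installed` is made up for illustration, and it relies solely on the standard library plus the import name `pango_aliasor` used elsewhere in this README.

```python
import importlib.util


def is_installed(package_name="pango_aliasor"):
    """Return True if the top-level package can be found by the import system."""
    # find_spec returns None for a top-level package that is not installed.
    return importlib.util.find_spec(package_name) is not None


print("pango_aliasor importable:", is_installed())
```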

## Convenience script

If you have a `metadata.tsv` with a `pango_lineage` column and you simply want to add a `pango_lineage_unaliased` column, you can use the convenience script below:

```py
import argparse
import sys

import pandas as pd
from pango_aliasor.aliasor import Aliasor


def add_unaliased_column(tsv_file_path, pango_column='pango_lineage', unaliased_column='pango_lineage_unaliased'):
    """Read a TSV file and return a DataFrame with an added unaliased lineage column."""
    aliasor = Aliasor()

    def uncompress_lineage(lineage):
        # Missing or empty lineages cannot be unaliased; mark them with "?".
        if not lineage or pd.isna(lineage):
            return "?"
        return aliasor.uncompress(lineage)

    df = pd.read_csv(tsv_file_path, sep='\t')
    df[unaliased_column] = df[pango_column].apply(uncompress_lineage)
    return df


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Add unaliased Pango lineage column to a TSV file.')
    parser.add_argument('--input-tsv', required=True, help='Path to the input TSV file.')
    parser.add_argument('--pango-column', default='pango_lineage', help='Name of the Pango lineage column in the input file.')
    parser.add_argument('--unaliased-column', default='pango_lineage_unaliased', help='Name of the column to use for the unaliased Pango lineage column in output.')
    args = parser.parse_args()
    df = add_unaliased_column(args.input_tsv, args.pango_column, args.unaliased_column)
    # Write the TSV to stdout so the result can be redirected to a file.
    df.to_csv(sys.stdout, sep='\t', index=False)
```
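
To illustrate what the added column contains, here is a minimal, standard-library-only sketch of the same idea. It does not use `pango_aliasor`: `TOY_ALIASES` and `toy_uncompress` are hypothetical stand-ins for `Aliasor().uncompress`, covering a single alias (`BA` for `B.1.1.529`), and the sample rows are invented for the example.

```python
import csv
import io

# Hypothetical, deliberately tiny alias table standing in for Aliasor.
# The real library resolves aliases from the full Pango alias key.
TOY_ALIASES = {"BA": "B.1.1.529"}


def toy_uncompress(lineage):
    """Expand the alias prefix of a lineage using TOY_ALIASES (illustration only)."""
    prefix, _, rest = lineage.partition(".")
    full_prefix = TOY_ALIASES.get(prefix, prefix)
    return f"{full_prefix}.{rest}" if rest else full_prefix


def add_unaliased_column(tsv_text, pango_column="pango_lineage",
                         unaliased_column="pango_lineage_unaliased"):
    """Return the TSV text with an extra unaliased lineage column appended."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames + [unaliased_column],
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    for row in reader:
        lineage = row[pango_column]
        # Same fallback as the script above: missing lineages become "?".
        row[unaliased_column] = toy_uncompress(lineage) if lineage else "?"
        writer.writerow(row)
    return out.getvalue()


example = "strain\tpango_lineage\ns1\tBA.5\ns2\t\ns3\tB.1.1.7\n"
result = add_unaliased_column(example)
print(result, end="")
```

Rows whose lineage has no known alias prefix (such as `B.1.1.7` here) pass through unchanged, and rows with an empty lineage get `?`, mirroring the fallback in the convenience script.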

## Testing