## Simple showcase of the PyGeneBe library

Install `genebe` with pip:
```
pip install -U genebe
```

In [1]:
import genebe as gnb

print('Version:' + gnb.__version__)

Version:0.0.14


GeneBe makes it easy to parse HGVS variants into genomic `chr-pos-ref-alt` strings. It accepts the `c.`, `m.`, `g.` and `n.` coordinate types. Protein-level (`p.`) descriptions are not supported.
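Because `p.` descriptions are not supported, it can help to screen a list before parsing it. The snippet below is a minimal sketch that does not use the GeneBe API at all: `hgvs_type` and `split_supported` are hypothetical helper names, the regular expression only looks at the `reference:type.` prefix, and the `p.` string in the usage example is a made-up placeholder.

```python
import re
from typing import List, Optional, Tuple

# Coordinate types listed above as supported; 'p' (protein) is not.
SUPPORTED_TYPES = {"c", "m", "g", "n"}

# Matches "<reference>:<type>." at the start of an HGVS string.
_HGVS_PREFIX = re.compile(r"^[^:\s]+:([a-z])\.")


def hgvs_type(hgvs: str) -> Optional[str]:
    """Return the one-letter coordinate type, or None if no prefix is found."""
    match = _HGVS_PREFIX.match(hgvs)
    return match.group(1) if match else None


def split_supported(hgvs_list: List[str]) -> Tuple[List[str], List[str]]:
    """Partition HGVS strings into (supported, unsupported) lists."""
    supported, unsupported = [], []
    for hgvs in hgvs_list:
        target = supported if hgvs_type(hgvs) in SUPPORTED_TYPES else unsupported
        target.append(hgvs)
    return supported, unsupported


candidates = [
    "ENST00000679957.1:c.803C>T",
    "NC_012920.1:m.1243T>C",
    "NP_000000.0:p.Ala1Val",  # made-up placeholder, not a real variant
]
ok, skipped = split_supported(candidates)
print(ok)
print(skipped)
```

Only the `ok` list would then be passed on to `gnb.parse_hgvs`.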

In [8]:
parsed = gnb.parse_hgvs(['ENST00000679957.1:c.803C>T',
                         'ENST00000404276.6:c.1100del',
                         'NC_000003.12:g.39394574A>T',
                         'NC_012920.1:m.1243T>C'] )
parsed

100%|██████████| 1/1 [00:00<00:00,  3.43it/s]


['1-230710021-G-A', '22-28695868-AG-A', '3-39394574-A-T', 'M-1243-T-C']
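The returned strings follow the `chr-pos-ref-alt` layout, so they can be split into typed fields with plain Python. This is a small sketch independent of GeneBe; `Variant` and `split_variant` are hypothetical names, and it assumes none of the four fields contains a `-` character.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str  # chromosome name, e.g. '1', '22', 'M'
    pos: int    # 1-based position
    ref: str    # reference allele
    alt: str    # alternate allele


def split_variant(variant: str) -> Variant:
    """Split a 'chr-pos-ref-alt' string; raises ValueError if it does not have four fields."""
    chrom, pos, ref, alt = variant.split("-")
    return Variant(chrom, int(pos), ref, alt)


variants = [split_variant(v) for v in
            ["1-230710021-G-A", "22-28695868-AG-A", "3-39394574-A-T", "M-1243-T-C"]]
print(variants[1])
# Variant(chrom='22', pos=28695868, ref='AG', alt='A')
```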

The following example annotates a list of genetic variants, given as `chr-pos-ref-alt` strings, and returns a list of `dict` records, one per variant.

In [9]:
flat = gnb.annotate(parsed, flatten_consequences=True, use_refseq=False, output_format="list")
flat

100%|██████████| 1/1 [00:00<00:00,  2.42it/s]


[{'chr': '1',
  'pos': 230710021,
  'ref': 'G',
  'alt': 'A',
  'transcript': 'NM_001384479.1',
  'gene_symbol': 'AGT',
  'dbsnp': '1228544607',
  'gnomad_exomes_af': 6.840659807494376e-06,
  'gnomad_exomes_ac': 10.0,
  'gnomad_exomes_homalt': 0.0,
  'revel_score': 0.18700000643730164,
  'alphamissense_score': 0.06610000133514404,
  'bayesdelnoaf_score': -0.3700000047683716,
  'phylop100way_score': 4.302999973297119,
  'acmg_score': -1,
  'acmg_classification': 'Likely_benign',
  'acmg_criteria': 'PM2_Supporting,BP4_Moderate',
  'gene_hgnc_id': 333,
  'hgvs_c': 'c.803C>T',
  'consequences': 'missense_variant'},
 {'chr': '22',
  'pos': 28695868,
  'ref': 'AG',
  'alt': 'A',
  'transcript': 'NM_007194.4',
  'gene_symbol': 'CHEK2',
  'dbsnp': '555607708',
  'gnomad_genomes_af': 0.0017200199654325843,
  'gnomad_genomes_ac': 262.0,
  'gnomad_genomes_homalt': 0.0,
  'phylop100way_score': 8.668000221252441,
  'acmg_score': 9,
  'acmg_classification': 'Likely_pathogenic',
  'acmg_criteria': 'P
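Each record in such a list is a plain `dict`, so ordinary Python is enough to filter it. The sketch below copies a few fields from the two records shown above into a small literal list rather than calling the API. `variant_key` is a hypothetical helper, and the classification label `Pathogenic` is an assumption (only `Likely_benign` and `Likely_pathogenic` appear in this output). Since keys such as `gnomad_exomes_af` are present in some records and absent in others, `.get` is used instead of indexing.

```python
# A subset of the fields shown in the output above.
records = [
    {"chr": "1", "pos": 230710021, "ref": "G", "alt": "A",
     "gene_symbol": "AGT", "acmg_score": -1,
     "acmg_classification": "Likely_benign",
     "gnomad_exomes_af": 6.840659807494376e-06},
    {"chr": "22", "pos": 28695868, "ref": "AG", "alt": "A",
     "gene_symbol": "CHEK2", "acmg_score": 9,
     "acmg_classification": "Likely_pathogenic",
     "gnomad_genomes_af": 0.0017200199654325843},
]

# Assumed set of labels to keep; adjust to the labels present in your data.
KEEP = {"Pathogenic", "Likely_pathogenic"}


def variant_key(rec: dict) -> str:
    """Rebuild the 'chr-pos-ref-alt' string from a record."""
    return f"{rec['chr']}-{rec['pos']}-{rec['ref']}-{rec['alt']}"


# .get returns None for a missing key, which is simply not in KEEP.
selected = [r for r in records if r.get("acmg_classification") in KEEP]

for rec in selected:
    print(variant_key(rec), rec["gene_symbol"], rec["acmg_score"])
```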

It is often more convenient to work with tabular data as a Pandas dataframe. The following example annotates the same list of variants and returns a Pandas dataframe. The `clingen-erepo.ipynb` example file contains more examples of annotating and joining variants using Pandas.

In [11]:
df = gnb.annotate(parsed, flatten_consequences=True, use_ensembl=False, output_format="dataframe")
df

100%|██████████| 1/1 [00:00<00:00,  3.33it/s]


| | chr | pos | ref | alt | transcript | gene_symbol | dbsnp | gnomad_exomes_af | gnomad_exomes_ac | gnomad_exomes_homalt | ... | hgvs_c | consequences | gnomad_genomes_af | gnomad_genomes_ac | gnomad_genomes_homalt | clinvar_disease | clinvar_classification | dbscsnv_ada_score | gnomad_mito_homoplasmic | gnomad_mito_heteroplasmic |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 230710021 | G | A | NM_001384479.1 | AGT | 1228544607 | 7e-06 | 10.0 | 0.0 | ... | c.803C>T | missense_variant | | | | | | | | |
| 1 | 22 | 28695868 | AG | A | NM_007194.4 | CHEK2 | 555607708 | | | | ... | c.1100delC | frameshift_variant | 0.00172 | 262.0 | 0.0 | Hereditary cancer-predisposing syndrome,Li-Fra... | Conflicting interpretations of pathogenicity | | | |
| 2 | 3 | 39394574 | A | T | NM_017875.4 | SLC25A38 | 121918332 | 1e-06 | 2.0 | 0.0 | ... | c.790A>T | stop_gained,splice_region_variant | | | | Sideroblastic anemia 2 | Pathogenic | 0.99721 | | |
| 3 | M | 1243 | T | C | | RNR1 | 28358572 | | | | ... | | | | | | not specified,not provided | Benign | | 839.0 | 1.0 |
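Once the annotations are in a dataframe, they can be joined to other tables on the four variant columns. The sketch below uses only standard Pandas: it rebuilds a few columns of the result above by hand instead of calling the API, and the `samples` table with its `sample_id` values is entirely hypothetical. It assumes `chr` is stored as a string in both tables, since a dtype mismatch on a key column would prevent rows from matching.

```python
import pandas as pd

# A few columns copied from the annotated dataframe shown above.
annotated = pd.DataFrame({
    "chr": ["1", "22", "3", "M"],
    "pos": [230710021, 28695868, 39394574, 1243],
    "ref": ["G", "AG", "A", "T"],
    "alt": ["A", "A", "T", "C"],
    "gene_symbol": ["AGT", "CHEK2", "SLC25A38", "RNR1"],
    "clinvar_classification": [
        None,
        "Conflicting interpretations of pathogenicity",
        "Pathogenic",
        "Benign",
    ],
})

# Hypothetical per-sample calls; the last variant has no annotation row.
samples = pd.DataFrame({
    "sample_id": ["S1", "S2", "S3"],
    "chr": ["22", "3", "X"],
    "pos": [28695868, 39394574, 100],
    "ref": ["AG", "A", "C"],
    "alt": ["A", "T", "T"],
})

# A left join keeps every sample row; unmatched variants get NaN annotations.
merged = samples.merge(annotated, on=["chr", "pos", "ref", "alt"], how="left")
print(merged[["sample_id", "gene_symbol", "clinvar_classification"]])
```

A left join is used here so that variants without an annotation stay visible. `how="inner"` would drop them instead.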
