# Tutorial for Using the PyPFB Library to Export Data into a Workspace

We can use the PyPFB library through the Gen3 CLI or directly in Python.

PyPFB source and documentation: https://github.com/uc-cdis/pypfb

To run any of the CLI commands from the documentation through the Gen3 CLI, prepend `gen3` to `pfb`.

For example, `cat tests/pfb-data/test.avro | pfb to tsv ./tsvs/` becomes `cat tests/pfb-data/test.avro | gen3 pfb to tsv ./tsvs/`.


In [9]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from pfb import reader
from pfb.exporters import tsv

## Convert PFB to TSV(s)

Using the Gen3 CLI, we can use Unix pipes to feed a PFB file as input.

In [10]:
# Using the Gen3 CLI
!cat PFB_example.avro | gen3 pfb to tsv ./PFB_example_tsvs/
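
The same pipe can also be driven from Python instead of the shell. The sketch below is a minimal example, assuming the `gen3` executable is on the `PATH`; the helper names `build_gen3_pfb_command` and `pfb_to_tsv_via_cli` are hypothetical and not part of PyPFB or the Gen3 SDK. It opens the PFB file and passes it to the CLI as standard input, which is what `cat ... |` does in the cell above.

```python
import subprocess
from pathlib import Path


def build_gen3_pfb_command(output_dir):
    """Return the argument list for `gen3 pfb to tsv <output_dir>`."""
    return ["gen3", "pfb", "to", "tsv", str(output_dir)]


def pfb_to_tsv_via_cli(pfb_path, output_dir):
    """Feed a PFB file to the Gen3 CLI on stdin, like `cat file | gen3 pfb to tsv dir`."""
    with Path(pfb_path).open("rb") as pfb_file:
        # check=True raises CalledProcessError if the CLI exits with a non-zero status.
        subprocess.run(build_gen3_pfb_command(output_dir), stdin=pfb_file, check=True)


# Usage (hypothetical helper; requires the Gen3 CLI to be installed):
# pfb_to_tsv_via_cli("PFB_example.avro", "./PFB_example_tsvs/")
```

Passing the command as a list rather than a shell string avoids quoting problems when file or directory names contain spaces.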

In [None]:
!rm -r PFB_example_tsvs

In [None]:
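# OR using the PyPFB SDK
# Open the PFB with a context manager instead of calling __enter__ by hand,
# so the file is closed once the export finishes.
with reader.PFBReader("PFB_example.avro") as r:
    # _to_tsv is a private helper (leading underscore); it may change between PyPFB versions.
    tsv._to_tsv(r, "PFB_example_tsvs", {})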
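
The export directory may hold more than one file, so it can help to check what was written before choosing one to load. The following is a minimal sketch, assuming the exporter writes tab-separated files with a header row and a `.tsv` extension (consistent with the `demographic.tsv` file read in the next cell); the helper name `list_tsv_columns` is hypothetical. It reads only the first row of each file, so nothing large is loaded into memory.

```python
import csv
from pathlib import Path


def list_tsv_columns(tsv_dir):
    """Map each TSV file name (without extension) to its header columns."""
    columns = {}
    for path in sorted(Path(tsv_dir).glob("*.tsv")):
        with path.open(newline="") as handle:
            # Read only the first row; an empty file yields an empty column list.
            columns[path.stem] = next(csv.reader(handle, delimiter="\t"), [])
    return columns


# Usage (hypothetical helper):
# for name, cols in list_tsv_columns("PFB_example_tsvs").items():
#     print(name, cols)
```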

In [3]:
# Read into dataframe
df = pd.read_csv('PFB_example_tsvs/demographic.tsv', sep='\t')

In [2]:
df.head()

In [1]:
df.info()

In [None]:
# Visualize some data
plt.figure(figsize=(10,5))
sns.countplot(x='annotated_sex', hue='race', data=df, orient='v')
plt.xticks(
    rotation=45, 
    horizontalalignment='right',
    fontweight='light',
    fontsize='x-large'  
)

In [7]:
# clean up
!rm -r PFB_example_tsvs