Converts Avro files to pandas DataFrames in parallel.
- fastavro
- joblib
- multiprocessing
- pandas
- Download the repository and extract it to any location
- Go to the extracted folder and run one of the following commands
# with admin rights
python setup.py install
# without admin rights
python setup.py install --user
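After installing, it can be useful to confirm that the dependencies listed above are importable. The following is a minimal sketch using only the standard library; the helper name `missing_modules` and the `REQUIRED` list are illustrative and not part of this package.

```python
import importlib.util

# Module names taken from the dependency list above.
REQUIRED = ["fastavro", "joblib", "multiprocessing", "pandas"]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    print("missing:", ", ".join(missing) if missing else "none")
```

Any name reported as missing can usually be installed with `pip install <name>`; `multiprocessing` ships with Python and should never appear in the list.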
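import fastpandavro as fpa
import pandas as pd

# Read an Avro file into a pandas DataFrame
df = fpa.avro_to_pandas(fname="test/sample.avro")
print(df.shape)

# Write the DataFrame back to Avro using the given schema file
fpa.pandas_to_avro(df, "output.avro", schema_file="test/schema.avsc")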