shemul/pandas-multiprocessing


A simple multiprocessing implementation in Python using pandas DataFrames.

git clone https://github.com/shemul/pandas-multiprocessing
cd pandas-multiprocessing
pipenv install
pipenv run python main.py --input_csv="./data/users.csv" --output_csv="./output/users.csv" --chunk_size=300 --pool=10

Here, pool sets how many worker processes are spawned, and chunk_size sets how many rows each process handles at a time.
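
To illustrate how the two options might fit together, here is a minimal sketch, not the actual contents of main.py. It assumes the input CSV is read in chunks of chunk_size rows and that each chunk is passed to a pool of worker processes. The function names `process_chunk` and `process_csv`, and the `name` column, are hypothetical and chosen only for illustration.

```python
from multiprocessing import Pool

import pandas as pd


def process_chunk(chunk: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical per-chunk work: upper-case an assumed 'name' column."""
    out = chunk.copy()
    out["name"] = out["name"].str.upper()
    return out


def process_csv(input_csv, output_csv, chunk_size=300, pool=10):
    """Read input_csv in chunks, process the chunks in parallel, write output_csv."""
    # With chunksize set, read_csv yields DataFrames of up to chunk_size rows.
    chunks = pd.read_csv(input_csv, chunksize=chunk_size)
    # Pool.map distributes the chunks across `pool` worker processes
    # and returns the results in the original chunk order.
    with Pool(processes=pool) as workers:
        results = workers.map(process_chunk, chunks)
    # Stitch the processed chunks back together and write a single CSV.
    pd.concat(results, ignore_index=True).to_csv(output_csv, index=False)
```

A call such as `process_csv("./data/users.csv", "./output/users.csv", chunk_size=300, pool=10)` would mirror the command above. It should be placed under an `if __name__ == "__main__":` guard, because worker processes may re-import the main module on some platforms.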

Todos
- update readme.md
