Moving large files into HDFS in Python using multiprocessing

novaferg/cloudera-python-test

This is my first project in a Cloudera QuickStart container. It takes a low-level approach to reading multiple large (100 GB) files and combining them in HDFS. It uses multiprocessing and runs quickly, with peak memory use of about 7-8 GB.
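The repository's own code is not reproduced here, so the following is only a minimal sketch of one way the approach described above could look. It assumes the `hdfs` command-line client is on the PATH and relies on `hdfs dfs -put - <dst>`, which reads from standard input. The function names (`hdfs_put_command`, `stream_file`, `upload_all`), the 64 MB block size, and the `/data/combined` target directory are all hypothetical choices for illustration. Each worker process streams one source file in fixed-size blocks, so memory stays bounded by roughly the number of workers times the block size rather than by the file size.

```python
import subprocess
import sys
from multiprocessing import Pool

# Assumed value: bytes each worker holds in memory at any one time.
BLOCK_SIZE = 64 * 1024 * 1024


def hdfs_put_command(dest_path):
    """Build a command that writes its stdin to dest_path in HDFS."""
    # With "-" as the source, `hdfs dfs -put` reads from standard input.
    return ["hdfs", "dfs", "-put", "-", dest_path]


def stream_file(job):
    """Pipe one local file into a sink command, one block at a time."""
    src_path, command, block_size = job
    sent = 0
    proc = subprocess.Popen(command, stdin=subprocess.PIPE)
    with open(src_path, "rb") as src:
        while True:
            block = src.read(block_size)
            if not block:  # end of file
                break
            proc.stdin.write(block)
            sent += len(block)
    proc.stdin.close()  # signals end of input to the sink
    if proc.wait() != 0:
        raise RuntimeError(f"sink command failed for {src_path}")
    return src_path, sent


def upload_all(jobs, workers=4):
    """Run the jobs in a process pool, one source file per task."""
    with Pool(processes=workers) as pool:
        return list(pool.imap_unordered(stream_file, jobs))


if __name__ == "__main__":
    sources = sys.argv[1:]
    if sources:
        # Hypothetical layout: one part file per source in a single
        # HDFS directory, so the directory acts as the combined dataset.
        jobs = [
            (path, hdfs_put_command(f"/data/combined/part-{i:05d}"), BLOCK_SIZE)
            for i, path in enumerate(sources)
        ]
        for path, sent in upload_all(jobs):
            print(f"{path}: {sent} bytes sent")
```

Writing one part file per source into a shared directory is an assumption about what "combining" means here; it avoids concurrent writers on a single HDFS file, which HDFS does not allow. Invoked as `python upload.py a.dat b.dat`, the script (hypothetically saved as `upload.py`) would start up to four uploads in parallel.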
