GitHub - scubbx/couch-bulk-multiprocess: A multiprocess bulk-uploading helper for CouchDB

Multiprocess Bulk Uploading to CouchDB

Status

This is not yet finished, but can already be used.

History

v 0.2.7

Updated readme

v 0.2.6

Introduced the jobsbuffersizemax argument to limit the number of upload processes buffered in working memory. This makes it possible to work on arbitrarily large data sets without running out of memory.

v 0.2.5

Only spawn one concurrent process for performing uploads

v 0.2.4

Working version, still spawns an unlimited number of processes

Usage

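Create a new mpcouchPusher object, passing the URL of the target database and the number of documents to collect before each bulk upload:

import mpcouch
myCouchPusher = mpcouch.mpcouchPusher("http://localhost:5984/myDatabase", 30000)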

If documents are generated faster than they can be uploaded, it may be necessary to pause document generation before another batch-upload process is buffered. The number of buffered upload processes is set by the optional jobsbuffersizemax argument, which defaults to 10. This means that when 10 batch-upload processes are already waiting to be executed, the module blocks the main Python thread until one of them has finished. By adjusting this value, you can find a balance between the size of the upload buffer (working memory) and the document generation speed.
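The blocking behaviour described above can be illustrated with a small, self-contained sketch. This is not the module's actual implementation: the class name BoundedUploader, its max_pending parameter, and the upload callable are hypothetical, and a worker thread stands in for the upload process to keep the example short.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class BoundedUploader:
    """Run upload jobs one at a time; block the caller when too many are pending."""

    def __init__(self, upload, max_pending=10):
        self._upload = upload  # hypothetical callable that uploads one batch
        self._slots = threading.BoundedSemaphore(max_pending)
        self._pool = ThreadPoolExecutor(max_workers=1)  # one upload at a time

    def submit(self, batch):
        # Blocks here while max_pending jobs are already queued or running.
        self._slots.acquire()
        future = self._pool.submit(self._upload, batch)
        # Free the slot as soon as this job has finished.
        future.add_done_callback(lambda _f: self._slots.release())
        return future

    def finish(self):
        # Wait for every queued job to complete.
        self._pool.shutdown(wait=True)


# Demo: "uploading" just appends the batch to a list.
uploaded = []
uploader = BoundedUploader(uploaded.append, max_pending=2)
for i in range(5):
    uploader.submit([i])
uploader.finish()
```

A smaller max_pending keeps fewer batches in memory but pauses the producer more often, which is the trade-off that jobsbuffersizemax is described as controlling.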

Use this object every time you have one single document ready to be stored in the database:

myCouchPusher.pushData(myNewDocument)

The module collects documents until the threshold is reached (30000 in the example above) and then uploads them as a batch to the CouchDB database specified when the object was created (myDatabase in the example URL).
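The collect-then-upload idea can be sketched as follows. The names BatchCollector, push, drain and bulk_docs_body are hypothetical and only illustrate the technique; they are not part of mpcouch. The one external fact relied on is that CouchDB's _bulk_docs endpoint accepts a JSON object with a "docs" array.

```python
import json


class BatchCollector:
    """Collect documents and hand back a full batch once the threshold is hit."""

    def __init__(self, threshold):
        self.threshold = threshold
        self._docs = []

    def push(self, doc):
        # Returns a full batch when the threshold is reached, otherwise None.
        self._docs.append(doc)
        if len(self._docs) >= self.threshold:
            return self.drain()
        return None

    def drain(self):
        # Hand over whatever has been collected and start a new, empty batch.
        batch, self._docs = self._docs, []
        return batch


def bulk_docs_body(batch):
    # Request body for POST /{db}/_bulk_docs: a JSON object with a "docs" array.
    return json.dumps({"docs": batch})


collector = BatchCollector(threshold=3)
batches = []
for n in range(7):
    full = collector.push({"value": n})
    if full is not None:
        batches.append(full)
leftover = collector.drain()  # the partial final batch
```

The leftover batch shows why a final flush is needed: documents collected after the last full batch stay in memory until they are explicitly drained.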

Since every bulk upload is performed in a separate process, the original program continues to run while the upload happens in the background.

To wait for all running uploads to finish and to make sure the very last batch of documents gets pushed to the server, run

myCouchPusher.finish()

after your final document has been passed to pushData. The module then waits for all running uploads to finish and uploads the final batch of collected documents.
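Putting the steps together, a typical loop pushes each document and calls finish() exactly once at the end. The helper push_all and the stand-in class RecordingPusher below are hypothetical; only the method names pushData and finish come from this readme. The stand-in lets the sketch run without a CouchDB server.

```python
def push_all(pusher, documents):
    """Feed every document to the pusher, then flush and wait for uploads."""
    count = 0
    for doc in documents:
        pusher.pushData(doc)
        count += 1
    # Without this call the last, partially filled batch would not be uploaded.
    pusher.finish()
    return count


class RecordingPusher:
    """Hypothetical stand-in that records calls instead of contacting CouchDB."""

    def __init__(self):
        self.docs = []
        self.finished = False

    def pushData(self, doc):
        self.docs.append(doc)

    def finish(self):
        self.finished = True


fake = RecordingPusher()
pushed = push_all(fake, ({"n": i} for i in range(4)))
```

With the real module, the object created by mpcouch.mpcouchPusher("http://localhost:5984/myDatabase", 30000) would be passed as the pusher argument instead of the stand-in.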

