GitHub - codeinthehole/foiegras: The missing CSV loading library for Postgres

+------------------+
|          __      |    _|_|_|_|          _|                  _|_|_|                                 
| uh oh >(' )      |    _|        _|_|          _|_|        _|        _|  _|_|    _|_|_|    _|_|_|   
|         )/       |    _|_|_|  _|    _|  _|  _|_|_|_|      _|  _|_|  _|_|      _|    _|  _|_|       
|        /(        |    _|      _|    _|  _|  _|            _|    _|  _|        _|    _|      _|_|   
|       /  `----/  |    _|        _|_|    _|    _|_|_|        _|_|_|  _|          _|_|_|  _|_|_|     
|       \  ~=- /   |
| ~^~^~^~^~^~^~^~^ |    Rapid loading of CSVs for Postgres
+----------------- +

In MySQL you can do this:

    mysql> LOAD DATA INFILE '/tmp/data.csv'
        -> REPLACE INTO TABLE 'my_data'
        -> FIELDS (isbn, price, stock);

to load CSV data from /tmp/data.csv into the my_data table, replacing rows that match on a unique constraint. Sadly, the Postgres's COPY command does not support this option.

This package provides a work-around to allow using CSVs to update tables in Postgres.

    $ pip install foiegras 
    $ python

    >>> import psycopg2, foiegras
    >>> conn = psycopg2.connect("dbname=mydb")
    >>> goose = foiegras.Goose(conn)
    >>> goose.load('my_data', '/tmp/data.csv', ('isbn', 'price', 'stock'))

Yay! Don't forget to close the connection:

	>>> conn.close()

Interesting... tell me more

The Postgres equivalent of LOAD DATA INFILE is the COPY ... FROM ... command, however it does not support replacing rows that match on a unique constraint. Using CSVs to load data into a table is the fastest way of updating a table - hence it's frustrating that Postgres doesn't support the REPLACE option that MySQL does.

This package works around this issue by loading the CSV to a temporary table then updating the master table using two UPDATE commmands that insert new records and update matching records respectively. All very simple.

Does this work with Django?

Yup.

	>>> from django.db import connection
	>>> connection.enter_transaction_management()
	>>> goose = foisgras.Goose(connection)
	>>> goose.load(Record._meta.db_table, '/path/to/data.csv', ('isbn', 'price', 'stock'))

Surely this has been done already?

Apparently not. There are some similar CSV loading libraries for Django

django-csvimport - This is a Django app for uploading CSV files to populate model tables. It uses the ORM directly to save new instances and so does not perform well when loading larges volumes of data.
pgloader - This looks like it might be quite good but it's hard to tell as the docs are so bad. It's also not on PyPI.

API

Very simple at the moment. All you have to do is instantiate a Goose object passing a psycopg2 connection:

	>>> conn = psycopg2.connect("dbname=mydb")
	>>> goose = foiegras.Goose(conn)

Then the only method you need to know is load which has signature:

	def load(self, table_name, filepath, fields, delimiter=",", 
	         replace_duplicates=True, has_header=False):
	    ...

If replace_duplicates == False, then rows from the CSV file that match on a unique constraint will be ignored.

## Contribute

Zero to tests passing in 6 presses of ↩:

	$ git clone git@github.com:codeinthehole/foiegras.git
	$ cd foiegras
	$ mkvirtualenv foiegras
	$ python setup.py develop
	$ pip install -r requirements.txt
	$ nosetests

Name		Name	Last commit message	Last commit date
Latest commit History 12 Commits
foiegras		foiegras
sandbox		sandbox
tests		tests
LICENSE		LICENSE
README.md		README.md
TODO		TODO
requirements.txt		requirements.txt
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Interesting... tell me more

Does this work with Django?

Surely this has been done already?

API

About

Releases

Packages

Languages

License

codeinthehole/foiegras

Folders and files

Latest commit

History

Repository files navigation

Interesting... tell me more

Does this work with Django?

Surely this has been done already?

API

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages