Skip to content

cvickery/csv2db

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

8 Commits
 
 
 
 
 
 
 
 

Repository files navigation

csv2db - Create database tables from CSV files.

Description

This is a Python script that converts a “properly structured” CSV file to a Postgres database table.

The CSV file has to have a heading row in which each column name turns into a valid db column name when lowercased and with spaces replaced by underscores. Most database column types will be text, but certain column names will trigger the use of the date or integer data types.

Usage Examples

csv2db.py -f CSVfile123.csv
  • Create a table named csvfile in the databse with the same name as the user’s username.
csv2db.py -p id region -t regions -db accounts < csvfile.csv
  • Create a table named regions in the accounts database; the primary key is (id, region)

Usage Details

  • The --table_name (-t) option gives the name of the table to create. The table will be replaced if it already exists. If the table name is not specified, it will be the stem part of the CSV file name, lowercased and with digits and hyphens stripped. If input is from stdin and no table name is specified, the table name is test.
  • Column names from the header row are used to set the types of the database columns: names ending in _date will be of type date, names ending in _id, _num_, and _nbr will be of type integer. All other columns will be of type text.
    • The command line options -date, -id, -num, and -nbr suppress the use of date/integer types for the corresponding column name patterns.
  • The --modulus (-m) n option, with a value greater than 0 for n, turns on progress reporting: display the number of lines processed every n lines (i.e., when the line count is zero modulo n).
  • The --primary_key (-p) option gives a list of column names to use as the primary key for the resulting table. If omitted, the table is created with no primary key.
  • The --columns (-c) option displays the column names and types for the generated table, and exits. Intended as a helper for setting up and checking the primary key settings.
  • There is a --debug (-d) option, but it doesn’t do anything.

Intended Use and Error Conditions

The program is intended as a development utility for loading spreadsheets into database tables so that standard sql queries can be used to explore the information in the sheet quickly rather than traditional spreadsheet pivot tables and lookup functions. Thus, rather than try to make the code robust in the face of anomalies, database errors are not handled, and the user is presumed equipped to make the necessary adjustments to the command line options or even the CSV file.

About

Postgres table from CSV file.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages