Stupid Import Tool
This program is currently under heavy development. It may be unusable for your use case, incompatible with your SQL-Server instance, or not work at all.
A line-oriented database import/export tool targeting CSV.
CSV is simple and efficient when used outside of the Microsoft ecosystem. Let's take it back.
Currently supported databases:
- SQL-Server (pymssql)
- sqlite (sqlite3)
pyodbc support may be implemented in the near future (pymssql seems hard to set up on some OSes).
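Internally, supporting several databases mostly means dispatching on a driver name. The sketch below is a hypothetical illustration of that idea, not sit's actual code (the `connect` helper and its keyword arguments are assumptions):

    import sqlite3

    def connect(driver, **kw):
        # Hypothetical dispatch on the configured driver name.
        if driver == 'sqlite':
            return sqlite3.connect(kw['database'])
        if driver == 'sqlserver':
            import pymssql  # imported lazily: only needed for SQL-Server
            return pymssql.connect(server=kw['hostname'],
                                   port=kw.get('port', 1433),
                                   user=kw['user'],
                                   password=kw['password'],
                                   database=kw['database'])
        raise ValueError('unsupported driver: {}'.format(driver))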
usage: sit [-h] [-c CONFIG] [-u USER] [-e ENGINE] [-p PASSWORD] [-t HOSTNAME]
[-d DATABASE] [--port PORT]
{raw,push,pull} ... env
positional arguments:
{raw,push,pull} * sub-commands help *
raw open an SQL prompt to the remote server. You'll
need `pygments` and `prompt_toolkit`.
push Copy data to the remote server. Can take regular
CSV directly from STDIN.
pull Retrieve data from the remote server. Outputs
regular CSV on stdout.
env environment
optional arguments:
-h, --help show this help message and exit
-c CONFIG, --config CONFIG
read configuration from file (defaults to ./sit.ini,
falls back to ~/.sit.ini) (default: None)
-u USER, --user USER username used (default: None)
-e ENGINE, --engine ENGINE
db engine/driver (default: None)
-p PASSWORD, --password PASSWORD
password used (default: None)
-t HOSTNAME, --hostname HOSTNAME
database hostname (IPs are OK) (default: None)
-d DATABASE, --database DATABASE
which database to connect to (default: None)
--port PORT service port (default: None)
Retrieve data
usage: sit pull [-h] [-f FILE] [-T [TABLE_NAME]] [-q [QUERY]]
optional arguments:
-h, --help show this help message and exit
-f FILE, --file FILE write to file instead of stdout
-T [TABLE_NAME], --table [TABLE_NAME]
extract data from this table
-q [QUERY], --query [QUERY]
run a custom SQL query for data retrieval.
example:
$ sit pull -q 'SELECT *
> FROM users U
> ORDER BY U.points
> LIMIT 100' staging > top_100_users.csv
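Under the hood, streaming a result set as CSV only takes the standard library. The following sketch illustrates the idea; the function name and cursor handling are assumptions, not sit's actual implementation:

    import csv
    import sys

    def dump_query(connection, query, out=sys.stdout):
        # Hypothetical: run `query` and stream its rows as CSV.
        cursor = connection.cursor()
        cursor.execute(query)
        writer = csv.writer(out)
        writer.writerow(col[0] for col in cursor.description)  # header row
        for row in cursor:  # stream rows without loading everything
            writer.writerow(row)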
Insert data (also, pretty obvious)
usage: sit push [-h] [-f FILE] [-T [TABLE_NAME]] [-C]
optional arguments:
-h, --help show this help message and exit
-f FILE, --file FILE read from file instead of stdin
-T [TABLE_NAME], --table [TABLE_NAME]
INSERT into this table (table name defaults to the current timestamp)
-C, --create automatically attempt to create a new table.
Column types will be naively inferred from the first
lines of data.
e.g.:
`$ sit push -f dummy.csv -CT dummy_insert dev`
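The naive inference could work roughly like the sketch below: try casts from strictest to loosest on sample values and fall back on VARCHAR. This is a hypothetical illustration of the idea, not sit's actual implementation:

    def infer_type(values):
        # Hypothetical: guess an SQL column type from sample values,
        # falling back on VARCHAR when nothing stricter fits.
        for cast, sql_type in ((int, 'INTEGER'), (float, 'FLOAT')):
            try:
                for value in values:
                    cast(value)
                return sql_type
            except ValueError:
                continue
        return 'VARCHAR(255)'

Applied to the first lines of each column, something like this yields the column definitions for a CREATE TABLE statement; anything ambiguous ends up as VARCHAR, which is exactly the limitation described under "smart stuff" below.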
When invoked, sit looks for a sit.ini file located in your current directory. This file primarily avoids very long command lines.
[dev]
hostname = MSSQL-DV1-062
port = 2003
user = DOMAIN\username
password = mypassword
database = test_database
driver = sqlserver
[local_sqlite]
database = sqlite_file.db
driver = sqlite
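Reading such a file boils down to Python's configparser, with each section naming an environment. The sketch below is an assumption about how the lookup could work, not sit's actual code:

    import configparser
    import os

    def load_env(env_name):
        # Hypothetical: read ~/.sit.ini first, then let a local
        # ./sit.ini override it, and return the requested section.
        parser = configparser.ConfigParser()
        parser.read([os.path.expanduser('~/.sit.ini'), 'sit.ini'])
        return dict(parser[env_name])

With the file above, load_env('dev') would return the hostname, port, user, password, database, and driver keys from the [dev] section.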
# Create a table containing each line of data.csv,
# naming the table after the current timestamp.
#
# The CSV comes over HTTP; thanks to pipes,
# no intermediate file is needed.
$ curl https://someurl/data.csv | sit push -C dev
# Create a table in a local sqlite database containing
# each line of records.csv. The table will be
# named "records" (-T argument).
$ sit push -f records.csv -CT records local_sqlite
# Opens a shell on the remote SQL-Server instance:
$ sit raw dev
# Fetch the "test" table's content from env `lite`, then
# insert it into the table named "test", creating it
# if it does not exist.
#
$ sit pull -T test lite | sit push -CT test dev
Install:
$ git clone git@github.com:slashdevsda/sit.git && cd sit
$ pip install .

Development setup (virtualenv):
$ git clone git@github.com:slashdevsda/sit.git && cd sit
$ virtualenv venv && source venv/bin/activate
(venv)$ pip install -r requirements.txt

Running the tests:
$ git clone git@github.com:slashdevsda/sit.git && cd sit
$ virtualenv venv && source venv/bin/activate
(venv)$ pip install .
(venv)$ pip install pytest
(venv)$ pytest tests
Raw mode opens a shell on the remote (or local) server, essentially for debugging purposes (and deleting faulty insertions).
Note: it uses prompt_toolkit and does not fall back on anything yet if the terminal does not support fancy termcaps.
This feature may disappear.
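For reference, a minimal prompt of this kind takes only a handful of lines with prompt_toolkit and pygments. This is a sketch of the general idea, assuming a DB-API style connection, not sit's actual implementation:

    from prompt_toolkit import PromptSession
    from prompt_toolkit.lexers import PygmentsLexer
    from pygments.lexers.sql import SqlLexer

    def sql_prompt(connection):
        # Hypothetical read-eval-print loop with SQL highlighting.
        session = PromptSession(lexer=PygmentsLexer(SqlLexer))
        while True:
            try:
                query = session.prompt('sql> ')
            except (EOFError, KeyboardInterrupt):
                return
            cursor = connection.cursor()
            cursor.execute(query)
            if cursor.description:  # SELECT-like statements return rows
                for row in cursor.fetchall():
                    print(row)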
The push feature is not suitable for:
- very large datasets
Lines from a CSV file are inserted in a single transaction, so it is not suitable for very large datasets and not very fault tolerant over a bad connection. Batched commits could be implemented, depending on usage (see the sketch after this list).
So far, the SQL-Server implementation has been tested with a small 1,000,000-row dataset (around 57 MB), and it succeeded at both import and export tasks.
- speed
This tool aims to speed up development and operational tasks. It is not reliable enough for more automated usage (i.e., you should not use it to perform automated backups, at least for the moment).
- smart stuff
This tool is not smart, even when it tries to guess column types. Basically, it falls back on VARCHAR and derivatives when stuck, meaning you may have to run additional queries to transform the data.
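To illustrate the batching mentioned under "very large datasets": committing every few thousand rows bounds the damage of a dropped connection and keeps transactions small. Below is a sketch assuming a DB-API connection and sqlite-style `?` placeholders (pymssql uses `%s`); none of this is sit's actual code:

    import csv
    import itertools

    def push_in_batches(connection, table, fh, batch_size=10000):
        # Hypothetical: insert CSV rows in several transactions,
        # committing every `batch_size` rows instead of once at the end.
        reader = csv.reader(fh)
        header = next(reader)
        placeholders = ', '.join('?' for _ in header)
        # NOTE: the table name is interpolated directly; fine for a
        # sketch, but a real tool should validate or quote it.
        statement = 'INSERT INTO {} VALUES ({})'.format(table, placeholders)
        cursor = connection.cursor()
        while True:
            batch = list(itertools.islice(reader, batch_size))
            if not batch:
                break
            cursor.executemany(statement, batch)
            connection.commit()  # one transaction per batch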