The main purpose of HTTP transport is to reduce the massive fetching overhead associated with large data sets (1M+ rows). It uses the native Exasol commands `EXPORT` and `IMPORT`, which are specifically designed to move large amounts of data. Data is transferred in CSV format with optional zlib compression.

This is a powerful tool which bypasses the creation of intermediate Python objects altogether and dramatically increases performance.

PyEXASOL offloads HTTP communication and decompression to a separate thread using the `threading` module. The main thread deals with a simple pipe opened in binary mode.

You may specify a custom `callback` function to read from or write to the raw pipe and to apply complex logic. Use `callback_params` to pass additional parameters to the `callback` function (e.g. options for pandas).

You may also specify `import_params` or `export_params` to alter the `IMPORT` or `EXPORT` query and to modify the CSV data stream.
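For example, a minimal sketch of how the two kinds of parameters differ, using functions described below (the option values are illustrative, and `C` is an open PyEXASOL connection as in the other examples):

# callback_params are forwarded to the callback (here: a pandas.read_csv option)
pd = C.export_to_pandas("users", callback_params={'dtype': str})

# import_params alter the generated IMPORT statement and the CSV stream itself
C.import_from_file('my_file.csv', 'users', import_params={'skip': 1})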
`export_to_pandas()` exports data from Exasol into `pandas.DataFrame`. You may use the `callback_params` argument to pass custom options to the pandas `read_csv` function.
# Read from SQL
pd = C.export_to_pandas("SELECT * FROM users")
# Read from table
pd = C.export_to_pandas("users")
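For instance, a sketch of forwarding a `read_csv` option through `callback_params` (the option shown is illustrative):

# Treat 'N/A' strings as NaN while parsing the CSV stream
pd = C.export_to_pandas("users", callback_params={'na_values': ['N/A']})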
`import_from_pandas()` imports data from `pandas.DataFrame` into an Exasol table. You may use the `callback_params` argument to pass custom options to the pandas `to_csv` function.
C.import_from_pandas(pd, "users")
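Similarly, a sketch of forwarding a `to_csv` option through `callback_params` (the option shown is illustrative):

# Serialize datetime columns with an explicit format
C.import_from_pandas(pd, "users", callback_params={'date_format': '%Y-%m-%d'})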
`import_from_iterable()` imports data from an iterable Python object (e.g. list, tuple, generator) into an Exasol table.

my_list = [
    (1, 'Bob', False, '2018-01-01'),
    (2, 'Gill', True, '2018-02-01'),
]
C.import_from_iterable(my_list, "users")
This function is also suitable for very big inserts, as long as the generator yields rows one by one and does not run out of memory.
def my_generator():
    for i in range(5):
        yield (i, 'Bob', True, '2017-01-01')
C.import_from_iterable(my_generator(), "users")
`import_from_file()` imports data from a file, path object or file-like object opened in binary mode. You may import from the process `STDIN` using `sys.stdin.buffer`.
# Import from file defined with string path
C.import_from_file('/test/my_file.csv', "users")
# Import from path object
C.import_from_file(pathlib.Path('/test/my_file.csv'), "users")
# Import from opened file
file = open('/test/my_file.csv', 'rb')
C.import_from_file(file, "users")
file.close()
# Import from STDIN
C.import_from_file(sys.stdin.buffer, "users")
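`import_params` can be combined with any of these sources. For example, a sketch of loading a gzip-compressed file (the `format` value comes from the parameter table below):

# Import a gzip-compressed CSV stream
C.import_from_file('/test/my_file.csv.gz', 'users', import_params={'format': 'gz'})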
`export_to_file()` exports data from Exasol into a file, path object or file-like object opened in binary mode. You may export to the process `STDOUT` using `sys.stdout.buffer`.
# Export into file defined with string path
C.export_to_file('my_file.csv', "users")
# Export into STDOUT
C.export_to_file(sys.stdout.buffer, "users")
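`export_params` work the same way. For example, a sketch of exporting only selected columns together with a header line (the parameter names come from the table below):

# Export two columns only and add column names as the first line
C.export_to_file('my_file.csv', 'users', export_params={'columns': ['id', 'name'], 'with_column_names': True})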
Please refer to the Exasol User Manual to learn more about `IMPORT` / `EXPORT` parameters.
Parameters accepted by `export_*` functions via `export_params`:

Name | Example | Description |
---|---|---|
`column_separator` | `,` | Column separator for CSV |
`column_delimiter` | `"` | Column delimiter for CSV (quoting) |
`columns` | `['id', 'name']` | List of table columns, useful to reorder table columns during export from a table |
`csv_cols` | `["1..5", "6 FORMAT='999.99'", "8"]` | List of CSV columns with optional numeric or date formats |
`row_separator` | `LF` | Row separator for CSV (line endings) |
`encoding` | `UTF8` | File encoding |
`with_column_names` | `True` | Add column names as the first line, useful for Pandas |
`null` | `\N` | Custom `NULL` value |
`delimit` | `AUTO` | Delimiter mode: `AUTO`, `ALWAYS`, `NEVER` |
`format` | `gz` | Export file or stream compressed with `gz`, `bzip2`, `zip` |
`comment` | This is a query description | Add a comment before the beginning of the query |
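For example, a sketch of passing `csv_cols` together with the delimiter mode through `export_params` (the values are taken from the table above):

# Apply an Exasol numeric format to CSV column 6 and never wrap values in delimiters
C.export_to_file('my_file.csv', 'users', export_params={
    'csv_cols': ["1..5", "6 FORMAT='999.99'", "8"],
    'delimit': 'NEVER',
})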
Parameters accepted by `import_*` functions via `import_params`:

Name | Example | Description |
---|---|---|
`column_separator` | `,` | Column separator for CSV |
`column_delimiter` | `"` | Column delimiter for CSV (quoting) |
`columns` | `['id', 'name']` | List of table columns in the data source, useful if the column order of the data source does not match the column order of the Exasol table |
`csv_cols` | `["1..5", "6 FORMAT='999.99'", "8"]` | List of CSV columns with optional numeric or date formats |
`row_separator` | `LF` | Row separator for CSV (line endings) |
`encoding` | `UTF8` | File encoding |
`skip` | `1` | How many first rows to skip, useful for skipping a header |
`null` | `\N` | Custom `NULL` value |
`trim` | `TRIM` | Trim mode: `TRIM`, `RTRIM`, `LTRIM` |
`format` | `gz` | Import file or stream compressed with `gz`, `bzip2`, `zip` |
`comment` | This is a query description | Add a comment before the beginning of the query |
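For example, a sketch of combining the trim mode with a custom `NULL` value through `import_params` (values taken from the table above):

# Trim whitespace around CSV values and treat \N as NULL
C.import_from_file('/test/my_file.csv', 'users', import_params={'trim': 'TRIM', 'null': r'\N'})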
For any `export_*` or `import_*` call, you can add a comment that will be inserted before the beginning of the query. This can be used for profiling and auditing. Example:
C.import_from_file('/test/my_file.csv', 'users', import_params={'comment': '''
This comment will be inserted before the query.
This query is importing user from CSV.
'''})
The comment is inserted as a block comment (`/* <comment> */`). The block comment closing sequence (`*/`) is therefore forbidden inside the comment.
The full collection of pre-defined callback functions is available in the `callback.py` module.

Example of a callback exporting into a basic Python list using `export_to_callback()`:
import csv
import io

# Define callback function
def export_to_list(pipe, dst, **kwargs):
    wrapped_pipe = io.TextIOWrapper(pipe, newline='\n')
    reader = csv.reader(wrapped_pipe, lineterminator='\n', **kwargs)

    return [row for row in reader]
# Run EXPORT using defined callback function
C.export_to_callback(export_to_list, None, 'my_table')
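You can also forward options to such a callback through `callback_params`. A sketch, assuming `export_to_callback()` accepts `callback_params` and `export_params` as described at the top of this section:

# Export with ';' as the column separator and parse it with the matching csv.reader option
my_rows = C.export_to_callback(export_to_list, None, 'my_table',
                               callback_params={'delimiter': ';'},
                               export_params={'column_separator': ';'})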
Example of a callback importing from pandas into Exasol using `import_from_callback()`:

df = <pandas.DataFrame>

def import_from_pandas(pipe, src, **kwargs):
    wrapped_pipe = io.TextIOWrapper(pipe, newline='\n')
    return src.to_csv(wrapped_pipe, header=False, index=False, quoting=csv.QUOTE_NONNUMERIC, **kwargs)

# Run IMPORT using defined callback function
C.import_from_callback(import_from_pandas, df, 'my_table')