HTTP transport (parallel)

HTTP Transport can run in parallel. The workload may be distributed across multiple CPU cores and even across multiple servers.

How it works at a high level

  1. The parent process opens the main connection to Exasol and spawns multiple child processes.
  2. Each child process connects to an individual Exasol node using http_transport(), gets the internal Exasol address (an ipaddr:port string) from the .address property, and sends it to the parent process.
  3. The parent process collects the list of internal Exasol addresses from the child processes and runs the export_parallel() or import_parallel() function to execute the SQL query.
  4. Each child process runs a callback function and either reads a chunk of data from Exasol or sends one to it.
  5. The parent process waits for the SQL query and the child processes to finish.

[Figure: Parallel export]

Please note that PyEXASOL does not provide any specific way to send internal Exasol address strings from the child processes to the parent process. You are free to choose your own method of inter-process communication. For example, you may use multiprocessing.Pipe.
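
The address hand-off described above can be sketched with the Python standard library alone. This is a minimal simulation, not PyEXASOL code: fake_http_transport() is a hypothetical stand-in for a real http_transport() connection and its .address property, the addresses are made up, and the data transfer of step 4 is omitted. It only illustrates one possible way to pass addresses back to the parent through multiprocessing.Pipe.

```python
import multiprocessing


def fake_http_transport(shard_id):
    """Hypothetical stand-in for http_transport(); returns a made-up 'ipaddr:port' string."""
    return f"192.0.2.{shard_id + 1}:{40000 + shard_id}"


def child(shard_id, write_pipe):
    # Step 2: obtain the internal address and send it to the parent process.
    address = fake_http_transport(shard_id)
    write_pipe.send(address)
    write_pipe.close()
    # Step 4 would happen here: a real child runs its callback to read or send data.


def collect_addresses(pool_size):
    # Step 1: the parent spawns one child per shard, each with its own one-way pipe.
    procs, read_pipes = [], []
    for shard_id in range(pool_size):
        read_pipe, write_pipe = multiprocessing.Pipe(duplex=False)
        proc = multiprocessing.Process(target=child, args=(shard_id, write_pipe))
        proc.start()
        write_pipe.close()  # the parent keeps only the receiving end
        procs.append(proc)
        read_pipes.append(read_pipe)

    # Step 3: collect one address per child, in shard order.
    addresses = [pipe.recv() for pipe in read_pipes]

    # Step 5: wait for all children to finish.
    for proc in procs:
        proc.join()
    return addresses


if __name__ == "__main__":
    print(collect_addresses(3))
```

In a real program, the collected list would be passed to export_parallel() or import_parallel() on the parent connection, and each child would keep its HTTP transport open to run the callback.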

Examples

Example of EXPORT query executed in Exasol

This is how the complete query looks from Exasol's perspective.

EXPORT my_table INTO CSV
AT 'http://27.1.0.30:33601' FILE '000.csv'
AT 'http://27.1.0.31:41733' FILE '001.csv'
AT 'http://27.1.0.32:45014' FILE '002.csv'
AT 'http://27.1.0.33:42071' FILE '003.csv'
AT 'http://27.1.0.34:36669' FILE '004.csv'
AT 'http://27.1.0.35:36794' FILE '005.csv'
WITH COLUMN NAMES
;
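
To make the relationship between the collected address list and the query explicit, here is a small sketch that assembles a statement of the same shape. build_export_query() is a hypothetical helper written for this illustration. It is not the function PyEXASOL uses internally, and it assumes one numbered CSV file per address and Exasol's WITH COLUMN NAMES option.

```python
def build_export_query(table, addresses, with_column_names=True):
    """Assemble an EXPORT statement with one AT ... FILE clause per address."""
    lines = [f"EXPORT {table} INTO CSV"]
    for idx, address in enumerate(addresses):
        # Each child process serves exactly one numbered file at its own address.
        lines.append(f"AT 'http://{address}' FILE '{idx:03d}.csv'")
    if with_column_names:
        lines.append("WITH COLUMN NAMES")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_export_query("my_table", ["27.1.0.30:33601", "27.1.0.31:41733"]))
```

Each entry in the address list becomes one AT clause, so the number of child processes determines how many files Exasol writes in parallel.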