
Commit

Updated user guide
morazow committed Sep 22, 2021
1 parent 0c704c9 commit aad34d0
Showing 4 changed files with 27 additions and 6 deletions.
2 changes: 1 addition & 1 deletion .editorconfig
@@ -12,7 +12,7 @@ max_line_length = 120
trim_trailing_whitespace = true

[*.md]
max_line_length = 120
max_line_length = 80
trim_trailing_whitespace = false

[Makefile]
3 changes: 2 additions & 1 deletion .github/workflows/ci-build.yml
@@ -1,6 +1,7 @@
name: CI Build

on: [ push, pull_request ]
on:
- push

jobs:
build:
4 changes: 4 additions & 0 deletions doc/changes/changes_2.0.0.md
@@ -4,6 +4,10 @@ Code name: Improved Parquet Reader

## Summary

In this release we added an optimized Parquet file importer. The previous version read each Parquet file in a single importer process. This version improves on that by virtually splitting files into fixed-size chunks that can then be imported by many parallel processes.

In addition, we added support for using proxies when accessing cloud storage systems.

## Features

* #173: Added improved chunked Parquet reader
24 changes: 20 additions & 4 deletions doc/user_guide/user_guide.md
@@ -224,8 +224,12 @@ CREATE OR REPLACE JAVA SET SCRIPT IMPORT_PATH(...) EMITS (...) AS
%jar /buckets/bfsdefault/<BUCKET>/exasol-cloud-storage-extension-<VERSION>.jar;
/

CREATE OR REPLACE JAVA SCALAR SCRIPT IMPORT_METADATA(...)
EMITS (filename VARCHAR(2000), partition_index VARCHAR(100)) AS
CREATE OR REPLACE JAVA SCALAR SCRIPT IMPORT_METADATA(...) EMITS (
filename VARCHAR(2000),
partition_index VARCHAR(100),
start_index DECIMAL(36, 0),
end_index DECIMAL(36, 0)
) AS
%scriptclass com.exasol.cloudetl.scriptclasses.FilesMetadataReader;
%jar /buckets/bfsdefault/<BUCKET>/exasol-cloud-storage-extension-<VERSION>.jar;
/
@@ -255,8 +259,12 @@ CREATE OR REPLACE JAVA SET SCRIPT IMPORT_PATH(...) EMITS (...) AS
%jar /buckets/bfsdefault/<BUCKET>/exasol-cloud-storage-extension-<VERSION>.jar;
/

CREATE OR REPLACE JAVA SCALAR SCRIPT IMPORT_METADATA(...)
EMITS (filename VARCHAR(2000), partition_index VARCHAR(100)) AS
CREATE OR REPLACE JAVA SCALAR SCRIPT IMPORT_METADATA(...) EMITS (
filename VARCHAR(2000),
partition_index VARCHAR(100),
start_index DECIMAL(36, 0),
end_index DECIMAL(36, 0)
) AS
%scriptclass com.exasol.cloudetl.scriptclasses.DockerFilesMetadataReader;
%jar /buckets/bfsdefault/<BUCKET>/exasol-cloud-storage-extension-<VERSION>.jar;
/
@@ -369,6 +377,14 @@ These are optional parameters that have default values.
in the Import SQL statement. Likewise, the default value is `iproc()` in the
Export SQL statement.

#### Import Optional Parameters

The following are optional parameters for import statements.

* ``CHUNK_SIZE`` - It specifies a file chunk size in bytes. The importer then
will try to virtually split a file into chunks of the specified size and
import each chunk in parallel. The default is `67108864` (64 MiB).
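
The chunk arithmetic can be illustrated with a minimal sketch. It is not the
extension's implementation: the function name `split_into_chunks` is
hypothetical, and treating chunk boundaries as plain byte offsets is an
assumption made for illustration (the actual reader may align chunks to
Parquet row groups instead).

```python
from typing import List, Tuple

# Default taken from the CHUNK_SIZE description: 67108864 bytes.
DEFAULT_CHUNK_SIZE = 64 * 1024 * 1024


def split_into_chunks(file_size: int, chunk_size: int = DEFAULT_CHUNK_SIZE) -> List[Tuple[int, int]]:
    """Return half-open (start, end) ranges that cover file_size bytes.

    Hypothetical helper: byte-offset boundaries are an assumption,
    not the extension's documented behavior.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Every chunk has chunk_size bytes except possibly the last one.
    return [
        (start, min(start + chunk_size, file_size))
        for start in range(0, file_size, chunk_size)
    ]
```

Under these assumptions, a 150 MiB file with the default chunk size yields
three ranges, each of which could be handed to a separate importer process.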

#### Export Optional Parameters

These optional parameters only apply to the data export statements.
