# Using Script-Language Container

## Preparing Notebook

In [91]:
!pip install -r requirements.txt



In [1]:
import bash_runner as bash # A helper to run bash with interactive output from python
import importlib
from pathlib import Path
import pyexasol
import requests
import textwrap

## Cloning the Git Repository

To use the Script-Language Container we need to clone their Git Repository with

```
git clone https://github.com/exasol/script-languages-release --recursive
```

We need to use `--recursive` because the repoistory has submodules

In [23]:
slc_path="script-languages-release"
if not Path(slc_path).exists():
    bash.run("""
    git clone https://github.com/exasol/script-languages-release --recursive
    """)
else:
    bash.run(f"""
    cd {slc_path}
    git reset --hard origin/master 
    git submodule foreach git reset --hard origin/master 
    """)

HEAD is now at 3bb2b47 Renaming certain things to udf plugins (#171)
Entering 'script-languages'
HEAD is now at ed13033 Renaming certain things to udf plugins (#168)


## Buiding and Exporting a Container

To build and export the container you can use `exaslct`. It first builds a series of docker images and then exports the container as tar.gz package. A container is here by defined as a flavor. For this example, we use `flavors/python3-ds-EXASOL-6.1.0` and export it the directory `containers`.

In [24]:
bash.run(f"""
pushd {slc_path}
./exaslct export --flavor-path flavors/python3-ds-EXASOL-6.1.0 --export-path containers | grep -E "DockerPullImageTask"
""")

~/data-science-examples/tutorials/script-languages/script-languages-release ~/data-science-examples/tutorials/script-languages
Virtualenv already exists!
Removing existing virtualenv...
Creating a virtualenv for this project...
Pipfile: /home/jupyter/data-science-examples/tutorials/script-languages/script-languages-release/script-languages/Pipfile
Using /opt/conda/bin/python3.7m (3.7.9) to create virtualenv...
⠴[0m Creating virtual environment...[Kcreated virtual environment CPython3.7.9.final.0-64 in 1452ms
  creator CPython3Posix(dest=/home/jupyter/.local/share/virtualenvs/script-languages-VP1Xj6ma, clear=False, no_vcs_ignore=False, global=False)
  seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/home/jupyter/.local/share/virtualenv)
    added seed packages: pip==20.3.1, setuptools==51.0.0, wheel==0.36.2
  activators BashActivator,CShellActivator,FishActivator,PowerShellActivator,PythonActivator,XonshActivator

[K[?25h[32m[

### What to do if something doesn't work?

During the build it can happen that external package repositories might be not available or something is wrong on your machine where you run the build. For these cases, `exaslsct` stores many logs to identify the problem after a build.

#### Exaslsct Log

The main log for `exaslct` is stored directly in `exaslct.log`. However, it gets overwritten when you start `exaslct` again.

In [25]:
bash.run(f"""tail {slc_path}/exaslct.log""")

===== Luigi Execution Summary =====

The command took 5.551279 s

Cached container under /home/jupyter/data-science-examples/tutorials/script-languages/script-languages-release/.build_output/cache/exports/python3-ds-EXASOL-6.1.0-release-EYFRS54NWXPTDZOBU2ZWGIXIID77PTCFMM2LLHPQNWDGQAQUNIAA.tar.gz

Copied container to containers/python3-ds-EXASOL-6.1.0_release.tar.gz




#### Build Output Directory

More detailed information about the build or other operations can be found in the `.build_output/jobs/*/outputs` directory. Here each run of `exaslsct` create its own directory under `.build_output/jobs`. The `outputs` directory stores each executed task of `exaslct` it outputs and log files in case it produces these. Especially, the Docker tasks such as build, pull and push store the logs returned by the Docker API. This can be helpful for finding problems during build.

In [40]:
bash.run(f"""
find {slc_path}/.build_output/jobs/*/outputs -type f
""")

script-languages-release/.build_output/jobs/2021_02_02_11_00_53_ExportContainers/outputs/ExportContainers_b032926fcd/ExportFlavorContainer_8eba5879f8/ExportContainerTask_a69810acba/logs/extract_release_file.log
script-languages-release/.build_output/jobs/2021_02_02_11_00_53_ExportContainers/outputs/ExportContainers_b032926fcd/ExportFlavorContainer_8eba5879f8/ExportContainerTask_a69810acba/logs/pack_release_file.log
script-languages-release/.build_output/jobs/2021_02_02_11_00_53_ExportContainers/outputs/ExportContainers_b032926fcd/ExportFlavorContainer_8eba5879f8/DockerCreateImageTask_f2e68ce3b8/DockerPullImageTask_f2e68ce3b8/logs/pull_docker_db_image.log
script-languages-release/.build_output/jobs/2021_02_02_11_00_53_ExportContainers/outputs/ExportContainers_b032926fcd/command_line_output
script-languages-release/.build_output/jobs/2021_02_02_11_15_40_ExportContainers/outputs/ExportContainers_b73ac2ac38/command_line_output


## Customizing Script-Language Containers

### Flavors of Containers

In [7]:
bash.run(f"""
find {slc_path}/flavors/  -maxdepth 1 -name '*EXASOL*'
""")

script-languages-release/flavors/fancyr-EXASOL-6.1.0
script-languages-release/flavors/standard-EXASOL-6.1.0
script-languages-release/flavors/python3-ds-EXASOL-6.1.0
script-languages-release/flavors/standard-EXASOL-7.0.0
script-languages-release/flavors/standard-EXASOL-6.2.0
script-languages-release/flavors/python3-ds-cuda-preview-EXASOL-6.1.0


### Flavor Definition

In [48]:
bash.run(f""" 
find -L {slc_path}/flavors/python3-ds-EXASOL-6.1.0 -maxdepth 2
""")

script-languages-release/flavors/python3-ds-EXASOL-6.1.0
script-languages-release/flavors/python3-ds-EXASOL-6.1.0/flavor_customization
script-languages-release/flavors/python3-ds-EXASOL-6.1.0/flavor_customization/Dockerfile
script-languages-release/flavors/python3-ds-EXASOL-6.1.0/flavor_customization/packages
script-languages-release/flavors/python3-ds-EXASOL-6.1.0/flavor_base
script-languages-release/flavors/python3-ds-EXASOL-6.1.0/flavor_base/base_test_build_run
script-languages-release/flavors/python3-ds-EXASOL-6.1.0/flavor_base/release
script-languages-release/flavors/python3-ds-EXASOL-6.1.0/flavor_base/testconfig
script-languages-release/flavors/python3-ds-EXASOL-6.1.0/flavor_base/flavor_test_build_run
script-languages-release/flavors/python3-ds-EXASOL-6.1.0/flavor_base/base_test_deps
script-languages-release/flavors/python3-ds-EXASOL-6.1.0/flavor_base/language_definition
script-languages-release/flavors/python3-ds-EXASOL-6.1.0/flavor_base/build_run
script-languages-release/flavor

### Flavor Customization Build Step

In [49]:
bash.run(f""" 
find -L {slc_path}/flavors/python3-ds-EXASOL-6.1.0/flavor_customization -type f
""")

script-languages-release/flavors/python3-ds-EXASOL-6.1.0/flavor_customization
script-languages-release/flavors/python3-ds-EXASOL-6.1.0/flavor_customization/Dockerfile
script-languages-release/flavors/python3-ds-EXASOL-6.1.0/flavor_customization/packages
script-languages-release/flavors/python3-ds-EXASOL-6.1.0/flavor_customization/packages/python3_pip_packages
script-languages-release/flavors/python3-ds-EXASOL-6.1.0/flavor_customization/packages/apt_get_packages


#### Dockerfile

In [50]:
bash.run(f""" 
cat {slc_path}/flavors/python3-ds-EXASOL-6.1.0/flavor_customization/Dockerfile
""")

############################################################################################
############################################################################################
# This Dockerfile allows you to extend this flavor by installing packages or adding files. 
# IF you didn't change the lines below, you can add packages and their version to the  
# files in ./packages and they get automatically installed.                                
############################################################################################
############################################################################################

#######################################################################
#######################################################################
# Do not change the following lines unless you know what you are doing 
#######################################################################
###################################################################

#### Package Lists

In [70]:
bash.run(f""" 
cat {slc_path}/flavors/python3-ds-EXASOL-6.1.0/flavor_customization/packages/python3_pip_packages
""")

# This file specifies the package list which gets installed via pip for python3.
# You must specify the the package and its version separated by a |.
# We recommend here the usage of package versions, to ensure that the container 
# builds are reproducible. However, we allow also packages without version.
# As you can see, this file can contain comments which start with #.
# If a line starts with # the whole line is a comment, however you can
# also start a comment after the package definition.

#tensorflow-probability|0.9.0


In [13]:
bash.run(f""" 
echo "dask[complete]|2021.1.1 " >> {slc_path}/flavors/python3-ds-EXASOL-6.1.0/flavor_customization/packages/python3_pip_packages
""")

In [14]:
bash.run(f""" 
cat {slc_path}/flavors/python3-ds-EXASOL-6.1.0/flavor_customization/packages/python3_pip_packages
""")

# This file specifies the package list which gets installed via pip for python3.
# You must specify the the package and its version separated by a |.
# We recommend here the usage of package versions, to ensure that the container 
# builds are reproducible. However, we allow also packages without version.
# As you can see, this file can contain comments which start with #.
# If a line starts with # the whole line is a comment, however you can
# also start a comment after the package definition.

#tensorflow-probability|0.9.0
dask[complete]|2021.1.1 


#### Rebuilding the customized Flavor

In [76]:
bash.run(f"""
pushd {slc_path}
./exaslct export --flavor-path flavors/python3-ds-EXASOL-6.1.0 --export-path containers | grep -E "DockerPullImageTask|DockerBuildImageTask"
""")

~/data-science-examples/tutorials/script-languages/script-languages-release ~/data-science-examples/tutorials/script-languages
Virtualenv already exists!
Removing existing virtualenv...
Creating a virtualenv for this project...
Pipfile: /home/jupyter/data-science-examples/tutorials/script-languages/script-languages-release/script-languages/Pipfile
Using /opt/conda/bin/python3.7m (3.7.9) to create virtualenv...
⠼[0m Creating virtual environment...[Kcreated virtual environment CPython3.7.9.final.0-64 in 195ms
  creator CPython3Posix(dest=/home/jupyter/.local/share/virtualenvs/script-languages-VP1Xj6ma, clear=False, no_vcs_ignore=False, global=False)
  seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/home/jupyter/.local/share/virtualenv)
    added seed packages: pip==20.3.1, setuptools==51.0.0, wheel==0.36.2
  activators BashActivator,CShellActivator,FishActivator,PowerShellActivator,PythonActivator,XonshActivator

[K[?25h[32m[2

In [11]:
bash.run(f"""
ls {slc_path}/.build_output/cache/exports
""")

python3-ds-EXASOL-6.1.0-release-EYFRS54NWXPTDZOBU2ZWGIXIID77PTCFMM2LLHPQNWDGQAQUNIAA.tar.gz
python3-ds-EXASOL-6.1.0-release-EYFRS54NWXPTDZOBU2ZWGIXIID77PTCFMM2LLHPQNWDGQAQUNIAA.tar.gz.sha256sum
python3-ds-EXASOL-6.1.0-release-RN5HZ72HNOSOJWNBF2M7TWKIOGWZ4U423PAIU7JYFHMRUZM55QNQ.tar.gz
python3-ds-EXASOL-6.1.0-release-RN5HZ72HNOSOJWNBF2M7TWKIOGWZ4U423PAIU7JYFHMRUZM55QNQ.tar.gz.sha256sum


## Testing the new Script-Language Container

In [5]:
DATABASE_HOST="localhost"
DATABASE_PORT=8888
DATABASE_USER="sys"
DATABASE_PASSWORD="exasol"
BUCKETFS_PORT=6666
BUCKETFS_USER="w"
BUCKETFS_PASSWORD="write"
BUCKETFS_NAME="bfsdefault"
BUCKET_NAME="default"
PATH_IN_BUCKET="container"

### Starting a local Docker-DB for Testing

In [6]:
test_env_path="integration-test-docker-environment"
if not Path(test_env_path).exists():
    bash.run("""
    git clone https://github.com/exasol/integration-test-docker-environment
    """)
else:
    bash.run(f"""
    cd {slc_path}
    git reset --hard origin/master 
    """)
    

HEAD is now at 3bb2b47 Renaming certain things to udf plugins (#171)


In [12]:
bash.run(f"""
pushd {test_env_path}
./start-test-env spawn-test-environment --environment-name test --database-port-forward {DATABASE_PORT} --bucketfs-port-forward {BUCKETFS_PORT} &> integration-test-docker-environment.log
tail integration-test-docker-environment.log
""")

~/data-science-examples/tutorials/script-languages/integration-test-docker-environment ~/data-science-examples/tutorials/script-languages
    - 1 DockerCreateImageTask_74a4edf6fc(image_name=exasol/script-language-container:db-test-container)
    - 1 DockerTestContainerBuild(caller_output_path=[])
    - 1 PopulateEngineSmallTestDataToDatabase(...)
    ...

This progress looks :) because there were no failed tasks or missing dependencies

===== Luigi Execution Summary =====

The command took 184.578282 s


### Upload the Container to the Database

In [15]:
bash.run(f"""
pushd {slc_path}
./exaslct upload \
    --flavor-path flavors/python3-ds-EXASOL-6.1.0 \
    --database-host {DATABASE_HOST}\
    --bucketfs-port {BUCKETFS_PORT} \
    --bucketfs-username {BUCKETFS_USER} \
    --bucketfs-password {BUCKETFS_PASSWORD} \
    --bucketfs-name {BUCKETFS_NAME} \
    --bucket-name {BUCKET_NAME} \
    --path-in-bucket {PATH_IN_BUCKET} \
    --release-name current &> upload.log
tail -n 30 upload.log
""")

~/data-science-examples/tutorials/script-languages/script-languages-release ~/data-science-examples/tutorials/script-languages

ALTER SYSTEM SET SCRIPT_LANGUAGES='PYTHON3=localzmq+protobuf:///bfsdefault/default/container/python3-ds-EXASOL-6.1.0-release-current?lang=python#buckets/bfsdefault/default/container/python3-ds-EXASOL-6.1.0-release-current/exaudf/exaudfclient_py3';


ependencies

===== Luigi Execution Summary =====

The command took 75.470361 s

Uploaded .build_output/cache/exports/python3-ds-EXASOL-6.1.0-release-EYFRS54NWXPTDZOBU2ZWGIXIID77PTCFMM2LLHPQNWDGQAQUNIAA.tar.gz to
http://localhost:6666/default/container/python3-ds-EXASOL-6.1.0-release-current.tar.gz


In SQL, you can activate the languages supported by the python3-ds-EXASOL-6.1.0
flavor by using the following statements:


To activate the flavor only for the current session:

ALTER SESSION SET SCRIPT_LANGUAGES='PYTHON3=localzmq+protobuf:///bfsdefault/default/container/python3-ds-EXASOL-6.1.0-release-current?lang=pyth

### Check if your customization did work

In [16]:
def connect():
    con=pyexasol.connect(dsn=f"{DATABASE_HOST}:{DATABASE_PORT}",user=DATABASE_USER,password=DATABASE_PASSWORD)
    con.execute("ALTER SESSION SET SCRIPT_LANGUAGES='PYTHON3_DS=localzmq+protobuf:///bfsdefault/default/container/python3-ds-EXASOL-6.1.0-release-current?lang=python#buckets/bfsdefault/default/container/python3-ds-EXASOL-6.1.0-release-current/exaudf/exaudfclient_py3';")
    con.execute("OPEN SCHEMA TEST")
    return con

In [17]:
con = connect()

con.execute(textwrap.dedent("""
CREATE OR REPLACE PYTHON3_DS SCALAR SCRIPT execute_shell_command_py3(command VARCHAR(2000000), split_output boolean)
EMITS (lines VARCHAR(2000000)) AS
import subprocess

def run(ctx):
    try:
        p = subprocess.Popen(ctx.command,
                             stdout    = subprocess.PIPE,
                             stderr    = subprocess.STDOUT,
                             close_fds = True,
                             shell     = True)
        out, err = p.communicate()
        if isinstance(out,bytes):
            out=out.decode('utf8')
        if ctx.split_output:
            for line in out.strip().split('\\n'):
                ctx.emit(line)
        else:
            ctx.emit(out)
    finally:
        if p is not None:
            try: p.kill()
            except: pass
/
"""))

<ExaStatement session_id=1690764294772752384 stmt_idx=3>

#### Check with "pip list" if a the package "dask" got installed

In [18]:
con = connect()
rs=con.execute("""select execute_shell_command_py3('python3 -m pip list', true)""")
for r in rs: 
    print(r[0])

Package              Version
-------------------- ---------------
absl-py              0.11.0
astor                0.8.1
autograd             1.3
autograd-gamma       0.5.0
bokeh                2.2.3
cached-property      1.5.2
click                7.1.2
cloudpickle          1.6.0
contextvars          2.4
cycler               0.10.0
dask                 2021.1.1
distributed          2021.1.1
formulaic            0.2.1
fsspec               0.8.5
future               0.18.2
gast                 0.4.0
gensim               3.8.3
grpcio               1.35.0
h5py                 3.1.0
HeapDict             1.0.1
imbalanced-learn     0.7.0
immutables           0.14
importlib-metadata   3.4.0
interface-meta       1.2.2
Jinja2               2.11.3
joblib               1.0.0
Keras                2.3.1
Keras-Applications   1.0.8
Keras-Preprocessing  1.1.2
kiwisolver           1.3.1
kmodes               0.10.2
lifelines            0.25.8
locket               0.2.1
lxml                 4.6.2
Markdown

#### Embedded Build Info of the Container

In [19]:
con = connect()
rs=con.execute("""select execute_shell_command_py3('find /build_info', true)""")
for r in rs: 
    print(r[0])

/build_info
/build_info/image_info
/build_info/image_info/python3-ds-EXASOL-6.1.0-language_deps
/build_info/image_info/python3-ds-EXASOL-6.1.0-build_deps
/build_info/image_info/python3-ds-EXASOL-6.1.0-udfclient_deps
/build_info/image_info/python3-ds-EXASOL-6.1.0-release
/build_info/image_info/python3-ds-EXASOL-6.1.0-flavor_base_deps
/build_info/image_info/python3-ds-EXASOL-6.1.0-build_run
/build_info/image_info/python3-ds-EXASOL-6.1.0-flavor_customization
/build_info/dockerfiles
/build_info/dockerfiles/python3-ds-EXASOL-6.1.0-language_deps
/build_info/dockerfiles/python3-ds-EXASOL-6.1.0-build_deps
/build_info/dockerfiles/python3-ds-EXASOL-6.1.0-udfclient_deps
/build_info/dockerfiles/python3-ds-EXASOL-6.1.0-release
/build_info/dockerfiles/python3-ds-EXASOL-6.1.0-flavor_base_deps
/build_info/dockerfiles/python3-ds-EXASOL-6.1.0-build_run
/build_info/dockerfiles/python3-ds-EXASOL-6.1.0-flavor_customization
/build_info/actual_installed_packages
/build_info/actual_installed_packages/release


In [20]:
con = connect()
rs=con.execute("""select execute_shell_command_py3('cat /build_info/actual_installed_packages/release/python3_pip_packages', true)""")
for r in rs: 
    print(r[0])

absl-py|0.11.0
astor|0.8.1
autograd|1.3
autograd-gamma|0.5.0
bokeh|2.2.3
cached-property|1.5.2
click|7.1.2
cloudpickle|1.6.0
contextvars|2.4
cycler|0.10.0
dask|2021.1.1
distributed|2021.1.1
formulaic|0.2.1
fsspec|0.8.5
future|0.18.2
gast|0.4.0
gensim|3.8.3
grpcio|1.35.0
h5py|3.1.0
HeapDict|1.0.1
imbalanced-learn|0.7.0
immutables|0.14
importlib-metadata|3.4.0
interface-meta|1.2.2
Jinja2|2.11.3
joblib|1.0.0
Keras|2.3.1
Keras-Applications|1.0.8
Keras-Preprocessing|1.1.2
kiwisolver|1.3.1
kmodes|0.10.2
lifelines|0.25.8
locket|0.2.1
lxml|4.6.2
Markdown|3.3.3
MarkupSafe|1.1.1
matplotlib|3.3.4
mock|4.0.3
msgpack|1.0.2
nltk|3.5
numpy|1.19.5
packaging|20.9
pandas|1.1.5
partd|1.1.0
patsy|0.5.1
Pillow|8.1.0
pip|20.3.4
protobuf|3.14.0
psutil|5.8.0
pyasn1|0.4.8
pycurl|7.43.0.6
pyexasol|0.16.1
pygobject|3.26.1
pyparsing|2.4.7
python-apt|1.6.5+ubuntu0.5
python-dateutil|2.8.1
pytz|2021.1
PyYAML|5.4.1
regex|2020.11.13
rsa|4.7
scikit-learn|0.24.1
scipy|1.2.1
seaborn|0.11.1
setuptools|53.0.0
six|1.15.0
sm