# FWI in Azure project

## Create Experimentation Docker image

This FWI (full-waveform inversion) demo ports devito (https://github.com/opesci/devito) to Azure and runs the tutorial notebooks at:
https://nbviewer.jupyter.org/github/opesci/devito/blob/master/examples/seismic/tutorials/



In this notebook we create a custom docker image that will be used to run the devito demo notebooks in AzureML. 

 - We transparently create a Dockerfile and a conda environment .yml file, build the Docker image, and push it to Docker Hub. Azure Container Registry (ACR) could also be used to store Docker images.
 - The conda environment .yml file lists the conda and pip installs, and separates all Python dependencies from the Docker installs.
 - The Dockerfile is generic. The only AzureML dependency is the pip-installable azureml-sdk package in the conda environment .yml file.
 - The created Docker image will be run in the following notebook, in a container on the local Azure VM or on a remote AzureML compute cluster. This AzureML pattern decouples the definition of the experimentation (or training) job (experimentation script, data location, dependencies and Docker image), which happens on the control plane machine that runs this notebook, from the elastically allocated, Azure-managed VM or cluster that does the actual training or experimentation computation.
 
<a id='user_input_requiring_steps'></a>
Steps requiring user input:
 - [Fill in and save the docker image name settings, if needed.](#docker_image_settings)
 - [Update DOCKER_CONTAINER_MOUNT_POINT to match your local path](#docker_image_settings)
 - [Set the docker build and test flags](#docker_build_test_settings)


In [1]:
# Allow multiple displays per cell
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all" 

In [2]:
import sys, os
import shutil
import urllib
import azureml.core
from azureml.core import Workspace, Experiment
from azureml.core.datastore import Datastore
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.runconfig import MpiConfiguration
from azureml.exceptions import ComputeTargetException
from azureml.data.data_reference import DataReference
from azureml.pipeline.steps import HyperDriveStep
from azureml.pipeline.core import Pipeline, PipelineData
from azureml.train.dnn import TensorFlow

from azureml.train.estimator import Estimator
from azureml.widgets import RunDetails


import platform
import math
import docker

In [3]:
print("Azure ML SDK Version: ", azureml.core.VERSION)
platform.platform()
os.getcwd()

Azure ML SDK Version:  1.0.65


'Linux-4.15.0-1060-azure-x86_64-with-debian-10.0'

'/workspace/examples/imaging/azureml_devito/notebooks'

<a id='docker_build_test_settings'></a>
#### Set up the docker image build and test process
 - The devito tests take about 15 minutes (981.41 seconds). When running this notebook for the first time, set:
     > docker_build_no_cache = '--no-cache'  
     > docker_test_run_devito_tests = True
     
[Back](#user_input_requiring_steps) to the summary of steps requiring user input.

In [4]:
docker_build_no_cache = ''  # use '--no-cache' to force a full rebuild, '' to reuse cached layers
docker_test_run_devito_tests = False  # set to True to run the devito tests (about 15 minutes)

##### Import utility functions

In [5]:
def add_path_to_sys_path(path_to_append):
    if not (any(path_to_append in paths for paths in sys.path)):
        sys.path.append(path_to_append)
        
auxiliary_files_dir = os.path.join(*(['.', 'src']))
paths_to_append = [os.path.join(os.getcwd(), auxiliary_files_dir)]
[add_path_to_sys_path(crt_path) for crt_path in paths_to_append]

import project_utils
prj_consts = project_utils.project_consts()

[None]
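
The helper above tests `path_to_append in paths` for each entry of `sys.path`, which is a substring check: a directory is skipped whenever its path is contained in a longer existing entry. A minimal sketch of an exact-match variant follows. The name `add_path_exact`, the optional `search_path` argument and the normalisation with `os.path.abspath` are assumptions made for illustration; they are not part of `project_utils`.

```python
import os
import sys


def add_path_exact(path_to_append, search_path=None):
    """Append a directory unless that exact directory is already listed.

    `search_path` defaults to sys.path; passing a plain list makes the
    helper easy to try out. Returns True if the path was appended.
    """
    if search_path is None:
        search_path = sys.path
    target = os.path.abspath(path_to_append)
    # Compare normalised absolute paths so './src' and 'src' count as the same directory.
    if any(os.path.abspath(entry) == target for entry in search_path):
        return False
    search_path.append(target)
    return True


# Hypothetical entries: the existing path contains the new one as a substring,
# so a substring check would wrongly treat the new path as already present.
demo_path = [os.path.abspath(os.path.join('notebooks', 'src_backup'))]
added = add_path_exact(os.path.join('notebooks', 'src'), demo_path)
```

Calling the helper a second time with the same directory returns `False` and leaves the list unchanged.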

##### Create experimentation docker file

In [6]:
dotenv_file_path = os.path.join(*(prj_consts.DOTENV_FILE_PATH))
dotenv_file_path

'./../not_shared/general.env'

<a id='docker_image_settings'></a>

##### Enter docker image settings here
In the cell below we use [dotenv](https://github.com/theskumar/python-dotenv) to overwrite the docker image properties already saved in dotenv_file_path. Change them as needed, e.g. update the azureml_sdk version if you use a different one.

[Back](#user_input_requiring_steps) to the summary of steps requiring user input.

In [7]:
# SDK changes often, so we'll keep its version transparent 
import dotenv

# EXPERIMENTATION_IMAGE_VERSION should:
# - match the sdk version in the fwi01_conda_env01 environment in the conda_env_fwi01_azureml_sdk.v1.0.XX.yml file below
# - match the conda env yml file name, e.g. conda_env_fwi01_azureml_sdk.v1.0.62.yml, referenced in
#      Dockerfile_fwi01_azureml_sdk.v1.0.62
dotenv.set_key(dotenv_file_path, 'EXPERIMENTATION_IMAGE_VERSION', 'sdk.v1.0.65')
dotenv.set_key(dotenv_file_path, 'EXPERIMENTATION_IMAGE_TAG', 'fwi01_azureml')
dotenv.set_key(dotenv_file_path, 'DOCKER_CONTAINER_MOUNT_POINT', '/datadrive01/prj/DeepSeismic/examples/imaging/azureml_devito/notebooks')

(True, 'EXPERIMENTATION_IMAGE_VERSION', 'sdk.v1.0.65')

(True, 'EXPERIMENTATION_IMAGE_TAG', 'fwi01_azureml')

(True,
 'DOCKER_CONTAINER_MOUNT_POINT',
 '/datadrive01/prj/DeepSeismic/examples/imaging/azureml_devito/notebooks')

In [8]:
%load_ext dotenv
%dotenv $dotenv_file_path

docker_file_location = os.path.join(*(prj_consts.AML_EXPERIMENT_DIR + ['docker_build']))

docker_file_name = 'Dockerfile'+ '_' + os.getenv('EXPERIMENTATION_IMAGE_TAG')
conda_dependency_file_name = 'conda_env'+ '_' + os.getenv('EXPERIMENTATION_IMAGE_TAG')
devito_conda_dependency_file_name = 'devito_conda_env'+'.yml'
docker_image_name = os.getenv('DOCKER_LOGIN') + '/' + os.getenv('EXPERIMENTATION_IMAGE_TAG')
image_version = os.getenv('EXPERIMENTATION_IMAGE_VERSION')
if image_version!="":
    docker_file_name = docker_file_name +'_'+ image_version
    conda_dependency_file_name = conda_dependency_file_name+'_'+ image_version
    docker_image_name = docker_image_name +':'+ image_version
conda_dependency_file_name=conda_dependency_file_name+'.yml'

docker_file_dir = os.path.join(*([os.getcwd(), docker_file_location]))
os.makedirs(docker_file_dir, exist_ok=True)
docker_file_path = os.path.join(*([docker_file_dir]+[docker_file_name]))
conda_file_path = os.path.join(*([docker_file_dir]+[conda_dependency_file_name]))

docker_image_name
conda_dependency_file_name
conda_file_path
docker_file_dir
docker_file_path

'georgedockeraccount/fwi01_azureml:sdk.v1.0.65'

'conda_env_fwi01_azureml_sdk.v1.0.65.yml'

'/workspace/examples/imaging/azureml_devito/notebooks/./../temp/docker_build/conda_env_fwi01_azureml_sdk.v1.0.65.yml'

'/workspace/examples/imaging/azureml_devito/notebooks/./../temp/docker_build'

'/workspace/examples/imaging/azureml_devito/notebooks/./../temp/docker_build/Dockerfile_fwi01_azureml_sdk.v1.0.65'
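
The cell above concatenates the results of `os.getenv` directly, so a setting that is missing from the dotenv file surfaces as a `TypeError` on `None`. The sketch below applies the same naming rules to any mapping and reports missing settings by name. The function `build_image_names` and its return keys are hypothetical names introduced here for illustration.

```python
def build_image_names(env):
    """Derive the Dockerfile name, conda file name and image name from settings.

    `env` is any mapping with the keys used in the cell above
    (os.environ works once the dotenv file has been loaded).
    """
    required = ('DOCKER_LOGIN', 'EXPERIMENTATION_IMAGE_TAG')
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise ValueError('Missing settings: ' + ', '.join(missing))

    tag = env['EXPERIMENTATION_IMAGE_TAG']
    # The version is optional, as in the original cell.
    version = env.get('EXPERIMENTATION_IMAGE_VERSION') or ''

    docker_file_name = 'Dockerfile_' + tag
    conda_file_name = 'conda_env_' + tag
    image_name = env['DOCKER_LOGIN'] + '/' + tag
    if version:
        docker_file_name += '_' + version
        conda_file_name += '_' + version
        image_name += ':' + version
    return {
        'docker_file_name': docker_file_name,
        'conda_file_name': conda_file_name + '.yml',
        'image_name': image_name,
    }


# Values taken from the outputs shown above.
names = build_image_names({
    'DOCKER_LOGIN': 'georgedockeraccount',
    'EXPERIMENTATION_IMAGE_TAG': 'fwi01_azureml',
    'EXPERIMENTATION_IMAGE_VERSION': 'sdk.v1.0.65',
})
```

In the notebook this could be called as `build_image_names(os.environ)` after the `%dotenv` magic has run.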

In [9]:
%%writefile $conda_file_path
name: fwi01_conda_env01
    
#https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.13.1-cp37-cp37m-linux_x86_64.whl    
# https://github.com/dask/dask-tutorial

channels:
  - anaconda
  - conda-forge
dependencies:
  - python=3.6 # 3.6 req by tf, not 3.7.2 
  - dask
  - distributed
  - h5py
  - matplotlib
  - nb_conda
  - notebook 
  - numpy 
  - pandas
  - pip
  - py-cpuinfo # all required by devito or dask-tutorial
  - pytables
  - python-graphviz
  - requests>=2.19.1
  - pillow
  - scipy
  - snakeviz
  - scikit-image
  - toolz
  - pip:
    - anytree # required by devito
    - azureml-sdk[notebooks,automl]==1.0.65
    - codepy # required by devito
    - papermill[azure]
    - pyrevolve # required by devito

Overwriting /workspace/examples/imaging/azureml_devito/notebooks/./../temp/docker_build/conda_env_fwi01_azureml_sdk.v1.0.65.yml
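
The settings cell notes that EXPERIMENTATION_IMAGE_VERSION should match the azureml-sdk version pinned in this conda file. A small check along those lines might look like the sketch below. The helper names `pinned_sdk_version` and `version_from_image_version`, and the assumption that the image version always carries the `sdk.v` prefix seen above, are illustrative only.

```python
import re


def pinned_sdk_version(conda_env_text):
    """Return the version pinned for azureml-sdk in a conda environment file, or None."""
    # Allow optional extras such as [notebooks,automl] between the name and '=='.
    match = re.search(r'azureml-sdk(?:\[[^\]]*\])?==([0-9][0-9A-Za-z.]*)', conda_env_text)
    return match.group(1) if match else None


def version_from_image_version(image_version):
    """Strip the 'sdk.v' prefix used by EXPERIMENTATION_IMAGE_VERSION, e.g. 'sdk.v1.0.65'."""
    prefix = 'sdk.v'
    return image_version[len(prefix):] if image_version.startswith(prefix) else image_version


# Excerpt of the conda environment file written above.
sample_env = """
dependencies:
  - pip:
    - azureml-sdk[notebooks,automl]==1.0.65
"""
is_consistent = pinned_sdk_version(sample_env) == version_from_image_version('sdk.v1.0.65')
```

In the notebook, the text could be read from `conda_file_path` and compared against `os.getenv('EXPERIMENTATION_IMAGE_VERSION')` before the image is built.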


In [10]:
%%writefile $docker_file_path 

FROM continuumio/miniconda3:4.7.10    
MAINTAINER George Iordanescu <ghiordan@microsoft.com>

RUN apt-get update --fix-missing && apt-get install -y --no-install-recommends \
    gcc g++ \
    wget bzip2 \
    curl \
    git make \
    mpich \ 
    libmpich-dev && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

ENV CONDA_ENV_FILE_NAME conda_env_fwi01_azureml_sdk.v1.0.65.yml
ADD $CONDA_ENV_FILE_NAME /tmp/$CONDA_ENV_FILE_NAME
ENV CONDA_DIR /opt/conda
ENV CONDA_ENV_NAME fwi01_conda_env

RUN git clone https://github.com/opesci/devito.git  && \
    cd devito  && \
    /opt/conda/bin/conda env create -q --name $CONDA_ENV_NAME -f environment.yml && \
    pip install -e . 
    
ENV CONDA_AUTO_UPDATE_CONDA=false
ENV CONDA_DEFAULT_ENV=$CONDA_ENV_NAME
ENV CONDA_PREFIX=$CONDA_DIR/envs/$CONDA_DEFAULT_ENV
ENV PATH=$CONDA_PREFIX/bin:/opt/conda/bin:$PATH   

RUN /opt/conda/bin/conda env update --name $CONDA_ENV_NAME -f /tmp/$CONDA_ENV_FILE_NAME && \
    /opt/conda/bin/conda clean  --yes --all

ENV PYTHONPATH=$PYTHONPATH:devito/app

# WORKDIR /devito     
    
CMD /bin/bash

Overwriting /workspace/examples/imaging/azureml_devito/notebooks/./../temp/docker_build/Dockerfile_fwi01_azureml_sdk.v1.0.65


In [11]:
! ls -l $docker_file_dir

total 24
-rw-r--r-- 1 root root 1098 Sep 25 00:39 Dockerfile_fwi01_azureml_sdk.v1.0.60
-rw-r--r-- 1 root root 1098 Sep 26 19:04 Dockerfile_fwi01_azureml_sdk.v1.0.62
-rw-r--r-- 1 root root 1085 Oct  7 22:25 Dockerfile_fwi01_azureml_sdk.v1.0.65
-rw-r--r-- 1 root root  713 Sep 25 00:39 conda_env_fwi01_azureml_sdk.v1.0.60.yml
-rw-r--r-- 1 root root  713 Sep 26 19:04 conda_env_fwi01_azureml_sdk.v1.0.62.yml
-rw-r--r-- 1 root root  733 Oct  7 22:25 conda_env_fwi01_azureml_sdk.v1.0.65.yml


In [12]:
cli_command='docker build -t '+ docker_image_name + \
' -f ' + docker_file_path + \
' ' + docker_file_dir + ' ' +\
docker_build_no_cache  #'' #' --no-cache'


cli_command
! $cli_command

'docker build -t georgedockeraccount/fwi01_azureml:sdk.v1.0.65 -f /workspace/examples/imaging/azureml_devito/notebooks/./../temp/docker_build/Dockerfile_fwi01_azureml_sdk.v1.0.65 /workspace/examples/imaging/azureml_devito/notebooks/./../temp/docker_build '

Sending build context to Docker daemon  11.78kB
Step 1/15 : FROM continuumio/miniconda3:4.7.10
 ---> 4a51de2367be
Step 2/15 : MAINTAINER George Iordanescu <ghiordan@microsoft.com>
 ---> Using cache
 ---> fd7cf6c96c9d
Step 3/15 : RUN apt-get update --fix-missing && apt-get install -y --no-install-recommends     gcc g++     wget bzip2     curl     git make     mpich     libmpich-dev &&     apt-get clean &&     rm -rf /var/lib/apt/lists/*
 ---> Using cache
 ---> 8d9b9aae4809
Step 4/15 : ENV CONDA_ENV_FILE_NAME conda_env_fwi01_azureml_sdk.v1.0.65.yml
 ---> Using cache
 ---> 98c1084d1571
Step 5/15 : ADD $CONDA_ENV_FILE_NAME /tmp/$CONDA_ENV_FILE_NAME
 ---> Using cache
 ---> 75c937721b70
Step 6/15 : ENV CONDA_DIR /opt/conda
 ---> Using cache
 ---> 3dc77d946814
Step 7/15 : ENV CONDA_ENV_NAME fwi01_conda_env
 ---> Using cache
 ---> 6c04ce507b84
Step 8/15 : RUN git clone https://github.com/opesci/devito.git  &&     cd devito  &&     /opt/conda/bin/conda env create -q --name $CONDA_ENV_NAME -f en
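
The build command above is assembled by string concatenation and run through the `!` shell escape. As an alternative sketch, the same command can be built as an argument list, which avoids quoting problems if a path contains spaces. The function name `docker_build_args` is hypothetical, and the paths below are shortened example paths, not the full ones printed above.

```python
import shlex


def docker_build_args(image_name, docker_file_path, context_dir, no_cache=False):
    """Return the `docker build` command as a list of arguments."""
    args = ['docker', 'build', '-t', image_name, '-f', docker_file_path]
    if no_cache:
        args.append('--no-cache')
    # The build context directory goes last.
    args.append(context_dir)
    return args


args = docker_build_args(
    'georgedockeraccount/fwi01_azureml:sdk.v1.0.65',
    './../temp/docker_build/Dockerfile_fwi01_azureml_sdk.v1.0.65',
    './../temp/docker_build',
    no_cache=True,
)
# Printable form for logging; each argument is quoted only if it needs it.
printable = ' '.join(shlex.quote(arg) for arg in args)
# To actually build: subprocess.run(args, check=True)
```

With `check=True`, `subprocess.run` raises `CalledProcessError` if the build fails, so later cells do not silently run against a stale image.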

Docker containers can be run using the Docker SDK for Python.

In [13]:
docker_image_name

sh_command='bash -c "pwd;python -c \'import azureml.core;print(azureml.core.VERSION)\'"'
sh_command
client = docker.from_env()
client.containers.run(docker_image_name, 
                      remove=True,
                      volumes={os.getenv('DOCKER_CONTAINER_MOUNT_POINT'): {'bind': '/workspace', 'mode': 'rw'}},
                      working_dir='/',
                      command=sh_command)

'georgedockeraccount/fwi01_azureml:sdk.v1.0.65'

'bash -c "pwd;python -c \'import azureml.core;print(azureml.core.VERSION)\'"'

b'/\n1.0.65\n'

Docker containers can also be run from the CLI.

Here we also create a log file to capture the output of the commands executed in the container. If the flag docker_test_run_devito_tests is True, we run the tests and capture their output. The tests take about 15 minutes to run. If the flag is False, we show the results of a previous session.

In [14]:
fwi01_log_file = os.path.join(*(['.', 'fwi01_azureml_buildexperimentationdockerimage.log']))
fwi01_log_file

'./fwi01_azureml_buildexperimentationdockerimage.log'

#### Create command for running devito tests, capture output in a log file, save log file outside container

In [15]:
if docker_test_run_devito_tests:
    run_devito_tests_command = ' python -m pytest tests/ '   + \
'> ' + fwi01_log_file +' 2>&1; ' + \
' mv ' + fwi01_log_file + ' /workspace/'  
    
    with open(os.path.join(*(['.', 'fwi01_azureml_buildexperimentationdockerimage.log'])), "w") as crt_log_file:
        print('Before running e13n container... ', file=crt_log_file)
    print('\ncontent of devito tests log file before testing:')
    !cat $fwi01_log_file
else:
    run_devito_tests_command =  '' 

# run_devito_tests_command =  'ls -l > ./fwi01_azureml_buildexperimentationdockerimage.log 2>&1;  mv ./fwi01_azureml_buildexperimentationdockerimage.log /workspace/'
run_devito_tests_command

''

In [16]:
cli_command='docker run -it --rm  --name fwi01_azureml_container ' +\
' -v '+os.getenv('DOCKER_CONTAINER_MOUNT_POINT')+':/workspace:rw ' + \
docker_image_name + \
' /bin/bash -c "conda env list ; ls -l /devito/tests;  '  + \
'python -c \'import azureml.core;print(azureml.core.VERSION)\'; '  + \
'cd /devito; '  + \
run_devito_tests_command +\
' "'

cli_command
! $cli_command
# # ============= 774 passed, 70 skipped, 1 xfailed in 1106.76 seconds =============
print('\ncontent of devito tests log file after testing:')
!cat $fwi01_log_file

'docker run -it --rm  --name fwi01_azureml_container  -v /datadrive01/prj/DeepSeismic/examples/imaging/azureml_devito/notebooks:/workspace:rw georgedockeraccount/fwi01_azureml:sdk.v1.0.65 /bin/bash -c "conda env list ; ls -l /devito/tests;  python -c \'import azureml.core;print(azureml.core.VERSION)\'; cd /devito;  "'

# conda environments:
#
base                     /opt/conda
fwi01_conda_env       *  /opt/conda/envs/fwi01_conda_env

total 504
-rw-r--r-- 1 root root 11521 Oct  7 21:43 conftest.py
-rw-r--r-- 1 root root  6425 Oct  7 21:43 test_adjoint.py
-rw-r--r-- 1 root root 13882 Oct  7 21:43 test_autotuner.py
-rw-r--r-- 1 root root  9727 Oct  7 21:43 test_checkpointing.py
-rw-r--r-- 1 root root  1095 Oct  7 21:43 test_constant.py
-rw-r--r-- 1 root root 52392 Oct  7 21:43 test_data.py
-rw-r--r-- 1 root root   481 Oct  7 21:43 test_dependency_bugs.py
-rw-r--r-- 1 root root 16585 Oct  7 21:43 test_derivatives.py
-rw-r--r-- 1 root root 30846 Oct  7 21:43 test_dimension.py
-rw-r--r-- 1 root root 21233 Oct  7 21:43 test_dle.py
-rw-r--r-- 1 root root  1138 Oct  7 21:43 test_docstrings.py
-rw-r--r-- 1 root root 26251 Oct  7 21:43 test_dse.py
-rw-r--r-- 1 root root  8612 Oct  7 21:43 test_gradient.py
-rw-r--r-- 1 root root 15229 Oct  7 21:43 test_interpolation.py
-rw-r--r-- 1 root root 31514 Oct  7 21:43 
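
When the tests do run, the log ends with a pytest summary line such as the one quoted in the cell's comment (`774 passed, 70 skipped, 1 xfailed in 1106.76 seconds`). The sketch below shows one way to pull the counts out of the log text so that a later cell could assert on them. The function name `parse_pytest_summary` is hypothetical, and the regular expressions assume the summary format quoted in this notebook.

```python
import re


def parse_pytest_summary(log_text):
    """Extract outcome counts from the last pytest summary line found in a log.

    Returns a dict such as {'passed': 774, 'skipped': 70}, or {} if no
    summary line is found.
    """
    summary = None
    for line in log_text.splitlines():
        # Summary lines look like '==== 774 passed, 70 skipped in 1106.76 seconds ===='.
        if line.strip().startswith('=') and re.search(r'\d+ \w+.* in [\d.]+', line):
            summary = line
    if summary is None:
        return {}
    # Keep only the part before ' in <duration>' and read the '<count> <outcome>' pairs.
    outcomes = summary.split(' in ')[0]
    return {name: int(count) for count, name in re.findall(r'(\d+) ([a-z]+)', outcomes)}


sample_log = (
    'Before running e13n container... \n'
    '============= 774 passed, 70 skipped, 1 xfailed in 1106.76 seconds =============\n'
)
counts = parse_pytest_summary(sample_log)
```

In the notebook, the text would come from reading `fwi01_log_file` after the container has finished.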

In [17]:
docker_pwd = os.getenv('DOCKER_PWD')
docker_login = os.getenv('DOCKER_LOGIN')
!docker login -u=$docker_login -p=$docker_pwd

https://docs.docker.com/engine/reference/commandline/login/#credentials-store

Login Succeeded
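
Passing the password with `-p` puts it on the command line, where it can show up in process listings and in the saved notebook. The Docker CLI also accepts the password on standard input through `--password-stdin`. The sketch below wraps that call; the function name `docker_login` and the `runner` parameter are assumptions added so the example can be exercised without a Docker installation.

```python
import subprocess


def docker_login(username, password, runner=subprocess.run):
    """Log in to the registry, sending the password on stdin instead of the command line.

    `runner` defaults to subprocess.run; it is a parameter only so the
    call can be tried without Docker installed.
    """
    args = ['docker', 'login', '--username', username, '--password-stdin']
    return runner(args, input=password.encode(), check=True)


# Dry run with a stand-in runner that records the call instead of executing it.
recorded = {}


def fake_runner(args, **kwargs):
    recorded['args'] = args
    recorded['kwargs'] = kwargs


docker_login('example_user', 'example_password', runner=fake_runner)
```

In the notebook, the call would be `docker_login(os.getenv('DOCKER_LOGIN'), os.getenv('DOCKER_PWD'))`, which keeps the password out of the rendered command.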


In [18]:
# %%bash
!docker push {docker_image_name}

The push refers to repository [docker.io/georgedockeraccount/fwi01_azureml]

8da5e589: Preparing
a559fdba: Preparing
772bb00d: Preparing
54377100: Preparing
f8fc4c9a: Preparing
ba47210e: Preparing
54377100: Layer already exists
sdk.v1.0.65: digest: sha256:f327c88a842c9e77df4df9ae1b980367ea053be4a0c778e1647e105d2cbf08a3 size: 1800


In [19]:
# !jupyter nbconvert 010_CreateExperimentationDockerImage_GeophysicsTutorial_FWI_Azure_devito --to html
print('Finished running 010_CreateExperimentationDockerImage_GeophysicsTutorial_FWI_Azure_devito!')

Finished running 010_CreateExperimentationDockerImage_GeophysicsTutorial_FWI_Azure_devito!
