<!--
#  Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
#    Licensed under the Apache License, Version 2.0 (the "License").
#    You may not use this file except in compliance with the License.
#    You may obtain a copy of the License at
#
#        http://www.apache.org/licenses/LICENSE-2.0
#
#    Unless required by applicable law or agreed to in writing, software
#    distributed under the License is distributed on an "AS IS" BASIS,
#    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#    See the License for the specific language governing permissions and
#    limitations under the License.
-->

# Sample notebook to build a Jupyter image with a Spark kernel

## Content
1. [Configuration](#Configuration)
2. [Build Image](#Build-Image)
3. [Building the profile for the Image](#Building-the-profile-for-the-Image)
4. [Running Container using the profile](#Running-container)


### Configuration

In [1]:
profile_name='spark'
image_name = 'spark'
folder_name = 'spark'

### Build Image

Let's see how `orbit build image` works.

In [2]:
!orbit build image --help

Usage: orbit build image [OPTIONS]

  Build and Deploy a new Docker image into ECR.

Options:
  -e, --env TEXT        Orbit Environment.  [required]
  -d, --dir TEXT        Dockerfile directory.  [required]
  -n, --name TEXT       Image name.  [required]
  -s, --script TEXT     Build script to run before the image build.
  -t, --team TEXT       One or more Teams to deploy the image to (can de
                        declared multiple times).

  --build-arg TEXT      One or more --build-arg parameters to pass to the
                        Docker build command.

  --debug / --no-debug  Enable detailed logging.  [default: False]
  --help                Show this message and exit.


Get our Orbit environment and team names.

In [3]:
env_name = %env AWS_ORBIT_ENV
team_name = %env AWS_ORBIT_TEAM_SPACE
(env_name,team_name)

('dev-env', 'lake-user')

The repository name is built from the image name, prefixed with the environment context. Users can only manipulate ECR repositories whose names start with 'orbit-{env_name}/users/'.

In [4]:
repository_name = (f"orbit-{env_name}/users/{image_name}")
repository_name

'orbit-dev-env-users-spark'
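
The naming rule can be checked before any ECR command is run. The sketch below is a minimal illustration, assuming the prefix is exactly `orbit-{env_name}/users/` as stated in the text (the recorded output above uses hyphens instead, so the separator may differ in your environment). The helper names `user_repository_name` and `is_user_repository` are hypothetical and are not part of the Orbit CLI.

```python
def user_repository_name(env_name: str, image_name: str) -> str:
    """Build the repository name from the environment and image names."""
    return f"orbit-{env_name}/users/{image_name}"


def is_user_repository(env_name: str, repository_name: str) -> bool:
    """Return True if the repository sits under the user-writable prefix."""
    prefix = f"orbit-{env_name}/users/"
    # A bare prefix with no image name after it is not a usable repository.
    return repository_name.startswith(prefix) and len(repository_name) > len(prefix)


print(user_repository_name("dev-env", "spark"))
print(is_user_repository("dev-env", "orbit-dev-env/users/spark"))
print(is_user_repository("dev-env", "orbit-other-env/users/spark"))
```

A guard like this could run before the `delete-repository` call so that a mistyped name is rejected locally rather than by ECR.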

In [5]:
!aws ecr delete-repository --repository-name $repository_name --force

{
    "repository": {
        "repositoryArn": "arn:aws:ecr:us-west-2:495869084367:repository/orbit-dev-env-users-spark",
        "registryId": "495869084367",
        "repositoryName": "orbit-dev-env-users-spark",
        "repositoryUri": "495869084367.dkr.ecr.us-west-2.amazonaws.com/orbit-dev-env-users-spark",
        "createdAt": 1614288506.0,
        "imageTagMutability": "MUTABLE"
    }
}


In [22]:
%cd ~/shared/samples/notebooks/I-Image/$folder_name

/home/jovyan/shared/samples/notebooks/I-Image/spark


Adding a file to our Docker image as an example.

In [23]:
pwd = %pwd
pwd

'/home/jovyan/shared/samples/notebooks/I-Image/spark'

Now let's run the command.

In [8]:
%%time

output = !orbit build image -e $env_name -d $pwd -n $image_name
output

CPU times: user 9.67 ms, sys: 10.8 ms, total: 20.5 ms
Wall time: 9min 34s


['',
 'Deploying Docker Image |\x1b[32m                   \x1b[0m|   0% \x1b[0m',
 '                                                  ',
 '',
 'Deploying Docker Image |\x1b[32m▏                  \x1b[0m|   1% \x1b[0m',
 'Deploying Docker Image |\x1b[32m▌                  \x1b[0m|   3% \x1b[0m',
 'Deploying Docker Image |\x1b[32m▉                  \x1b[0m|   5% \x1b[0m',
 'Deploying Docker Image |\x1b[32m███▊               \x1b[0m|  20% \x1b[0m',
 'Deploying Docker Image |\x1b[32m█████▏             \x1b[0m|  27% \x1b[0m',
 'Deploying Docker Image |\x1b[32m██████████████████▊\x1b[0m|  99% \x1b[0m',
 '                                                  ',
 '',
 'Deploying Docker Image |\x1b[32m██████████████████▊\x1b[0m|  99% \x1b[0m',
 '                                                  ',
 '',
 'Deploying Docker Image |\x1b[32m██████████████████▊\x1b[0m|  99% \x1b[0m',
 '                                                  ',
 '',
 'Deploying Docker Image |\x1b[32m██████████████████▊\x1b[0m| 

Let's get the image address from the output of the previous command.

In [42]:
look_for = 'ECR Image Address='
image = None
for o in output:
    if look_for in o:
        image = o[o.index(look_for) + len(look_for):]
        print(image)

assert image is not None, f"'{look_for}' not found in the build output"

495869084367.dkr.ecr.us-west-2.amazonaws.com/orbit-dev-env-users-spark
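
The same extraction can be wrapped in a small function that also removes the ANSI colour codes visible in the build output. This is a sketch only: the function name `extract_image_address` is hypothetical, the sample lines are made up for illustration, and the marker string is the one used in the cell above.

```python
import re
from typing import Iterable, Optional

# Matches ANSI colour/style sequences such as "\x1b[32m" and "\x1b[0m".
ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;]*m")


def extract_image_address(lines: Iterable[str],
                          marker: str = "ECR Image Address=") -> Optional[str]:
    """Return the text after the marker on the last matching line, or None."""
    address = None
    for line in lines:
        clean = ANSI_ESCAPE.sub("", line)
        if marker in clean:
            # Keep everything after the first occurrence of the marker.
            address = clean.split(marker, 1)[1].strip()
    return address


# Made-up lines for illustration only; real build output will differ.
sample = [
    "Deploying Docker Image |\x1b[32m███\x1b[0m| 100% \x1b[0m",
    "\x1b[94mTip\x1b[0m ECR Image Address=123456789012.dkr.ecr.us-west-2.amazonaws.com/example",
]
print(extract_image_address(sample))
```

Returning `None` instead of raising leaves the decision to the caller, which matches the explicit assertion used in the notebook.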


In [10]:
# check that the image was built
import json
print(repository_name)
images = !aws ecr list-images --repository-name $repository_name
images = "".join(images)
im = json.loads(images)
print(im['imageIds'])
assert len(im['imageIds']) > 0

orbit-dev-env-users-spark
[{'imageDigest': 'sha256:1a3a635f7db29009c48befa9cb7e31c8b12613053f3b4cdf60d5930263ddef7e', 'imageTag': 'latest'}]
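
If a specific tag matters, the JSON printed by `aws ecr list-images` can be inspected a little further. The sketch below assumes the `imageIds` / `imageTag` structure shown in the output above; the helper name `image_tags` is hypothetical and the digest in the sample is a placeholder.

```python
import json
from typing import List


def image_tags(list_images_json: str) -> List[str]:
    """Return the tags found in the JSON printed by `aws ecr list-images`."""
    payload = json.loads(list_images_json)
    # Untagged images carry only an imageDigest, so skip entries without a tag.
    return [entry["imageTag"]
            for entry in payload.get("imageIds", [])
            if "imageTag" in entry]


# Sample shaped like the output shown above; the digest is a placeholder.
sample = '{"imageIds": [{"imageDigest": "sha256:0000", "imageTag": "latest"}]}'
tags = image_tags(sample)
print(tags)
assert "latest" in tags
```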


### Building the profile for the Image

In [43]:
import json
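profile = {
    "display_name": profile_name,
    # The slug appears in the profile.json shown below, so it is set explicitly here.
    "slug": profile_name,
    "description": "Use for spark kernel",
    "kubespawner_override": {
        "image": image,
        "cpu_guarantee": 2,
        "cpu_limit": 2,
        "mem_guarantee": "1G",
        "mem_limit": "1G"
    }
}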

with open("profile.json", 'w') as f:
    json.dump(profile, f)


In [44]:
!cat profile.json

{"display_name": "spark", "slug": "spark", "description": "Use for spark kernel", "kubespawner_override": {"image": "495869084367.dkr.ecr.us-west-2.amazonaws.com/orbit-dev-env-users-spark", "cpu_guarantee": 2, "cpu_limit": 2, "mem_guarantee": "1G", "mem_limit": "1G"}}
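
When several images need profiles, the dictionary can be produced by a small helper. This is a minimal sketch, assuming the keys shown in `profile.json` above are the ones Orbit expects; the function name `make_profile` and its defaults are hypothetical, and the image address is a placeholder.

```python
import json


def make_profile(name: str, image: str, cpu: int = 2, mem: str = "1G") -> dict:
    """Build a profile dictionary with the keys shown in profile.json."""
    if not image:
        # An empty address usually means the build step did not produce one.
        raise ValueError("image address is empty; did the build succeed?")
    return {
        "display_name": name,
        "slug": name,
        "description": f"Use for {name} kernel",
        "kubespawner_override": {
            "image": image,
            # Guarantee and limit are kept equal, as in the profile above.
            "cpu_guarantee": cpu,
            "cpu_limit": cpu,
            "mem_guarantee": mem,
            "mem_limit": mem,
        },
    }


example = make_profile("spark", "123456789012.dkr.ecr.us-west-2.amazonaws.com/example")
print(json.dumps(example, indent=2))
```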

In [45]:
!orbit build profile --env $env_name --team $team_name profile.json

[ Info ] Retrieving existing profiles
[ Tip ] Profile added spark
Adding profile |███████████████████████████| 100%


In [46]:
!orbit list profile --env $env_name --team $team_name

Team profiles:
[
    {
        "description": "Use simple custom image",
        "display_name": "simple_image",
        "kubespawner_override": {
            "cpu_guarantee": 2,
            "cpu_limit": 2,
            "image": "495869084367.dkr.ecr.us-west-2.amazonaws.com/orbit-dev-env-users-custom_image2",
            "mem_guarantee": "1G",
            "mem_limit": "1G"
        },
        "slug": "nano"
    },
    {
        "description": "Use for spark kernel",
        "display_name": "spark",
        "kubespawner_override": {
            "cpu_guarantee": 2,
            "cpu_limit": 2,
            "image": "495869084367.dkr.ecr.us-west-2.amazonaws.com/orbit-dev-env-users-spark",
            "mem_guarantee": "1G",
            "mem_limit": "1G"
        },
        "slug": "spark"
    }
]
Admin deployed profiles:
[
    {
        "description": "1 CPU + 1G MEM",
        "display_name": "Nano",
        "kubespawner_override": {
            "cpu_guarantee": 1,
            "cpu_limit": 1,
 

### Running container 

Let's run a container using the profile and image we created.

In [50]:
import json
run = {
    "compute": {
        "container": {
            "p_concurrent": "1"
        },
        "node_type": "ec2",
        "profile": profile_name
    },
    "tasks": [{
        "notebookName": "test-image.ipynb",
        "sourcePath": pwd,
        "targetPath": f"/home/jovyan/shared/regression/notebooks/I-Image/{folder_name}",
        "params": {}
    }]
}

with open("run.json", 'w') as f:
    json.dump(run, f)
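
The run specification can also be generated by a helper, so the same structure is reused for other notebooks. This is a sketch under the assumption that the keys in the cell above are the complete set required; the function name `make_run_spec` is hypothetical, and the paths in the example are placeholders.

```python
import json
from typing import Optional


def make_run_spec(profile: str, notebook: str, source: str, target: str,
                  params: Optional[dict] = None) -> dict:
    """Build a run specification with one task, mirroring the cell above."""
    return {
        "compute": {
            "container": {"p_concurrent": "1"},
            "node_type": "ec2",
            "profile": profile,
        },
        "tasks": [{
            "notebookName": notebook,
            "sourcePath": source,
            "targetPath": target,
            # An empty dict means the notebook runs with its default parameters.
            "params": params or {},
        }],
    }


spec = make_run_spec("spark", "test-image.ipynb", "/path/to/source", "/path/to/target")
print(json.dumps(spec, indent=2))
```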


In [53]:
%%time

!orbit run notebook --env $env_name --team $team_name --user testing --wait --tail-logs run.json


INFO:root:using profile spark
INFO:root:Waiting for 1 tasks [{'ExecutionType': 'eks', 'Identifier': 'orbit-lake-user-ec2-runner-nxgcl', 'NodeType': 'ec2', 'tasks': [{'notebookName': 'test-image.ipynb', 'sourcePath': '/home/jovyan/shared/samples/notebooks/I-Image/spark', 'targetPath': '/home/jovyan/shared/regression/notebooks/I-Image/spark', 'params': {}, 'ExecutionType': 'ecs'}]}]
INFO:root:Watching task: 'orbit-lake-user-ec2-runner-nxgcl'
INFO:root:Running: 1 Completed: 0 Errored: 0
INFO:root:waiting for [{'ExecutionType': 'eks', 'Identifier': 'orbit-lake-user-ec2-runner-nxgcl', 'NodeType': 'ec2', 'tasks': [{'notebookName': 'test-image.ipynb', 'sourcePath': '/home/jovyan/shared/samples/notebooks/I-Image/spark', 'targetPath': '/home/jovyan/shared/regression/notebooks/I-Image/spark', 'params': {}, 'ExecutionType': 'ecs'}]}]
INFO:root:Task {'ExecutionType': 'eks', 'Identifier': 'orbit-lake-user-ec2-runner-nxgcl', 'NodeType': 'ec2', 'tasks': [{'notebookName': 'test-image.ipynb', 'sourcePa