# Katib Hands-on
Use Katib to tune the hyperparameters of the training code `01-covid19-katib-train.py`.

# Prerequisite 1. Base Image
SSH into the host, then build and push the Docker image as shown below.

```sh
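# Quote the heredoc delimiter so the backslash line continuations reach the Dockerfile verbatim
cat << 'EOF' > Dockerfile.kf-base
FROM ubuntu:18.04
ARG DEBIAN_FRONTEND=noninteractive

# OS packages: Python plus the shared libraries that OpenCV typically needs
RUN apt-get update && \
    apt-get install -y python3 python3-pip python3-dev ffmpeg libsm6 libxext6

# Python packages; the PyPI name is scikit-learn (the "sklearn" alias is deprecated)
RUN pip3 install --upgrade pip && \
    pip install \
        tensorflow \
        scikit-learn \
        opencv-python \
        pillow \
        minio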
EOF
```

```sh
# docker build script
REGISTRY=registry.kube-system.svc.cluster.local:30000
TAG=$REGISTRY/kf-base:latest

docker build -f Dockerfile.kf-base -t $TAG . && \
docker push $TAG
```

# Prerequisite 2. Load the training data into the S3 repository (Minio)
- Create the model bucket in the Minio repository
- Download the Covid19 dataset
- Upload the Covid19 dataset to Minio
- Attachment: `00-uploadDataset-covid19.sh`
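Judging from its output, the attached script performs these steps with the `mc` CLI. Purely as an illustration, the upload step could also be sketched in Python against the `minio` package that the base image installs. The helper names `iter_uploads` and `upload_dir` are hypothetical, and the bucket `dataset` and prefix `covid19` are assumptions taken from the Minio URL in this section, not from the script itself.

```python
import os


def iter_uploads(local_dir, prefix):
    """Yield (file_path, object_name) pairs for every file under local_dir."""
    for root, _dirs, files in os.walk(local_dir):
        for name in sorted(files):
            file_path = os.path.join(root, name)
            # Object names always use "/" regardless of the local OS separator.
            rel = os.path.relpath(file_path, local_dir).replace(os.sep, "/")
            yield file_path, f"{prefix}/{rel}"


def upload_dir(client, bucket, local_dir, prefix):
    """Create the bucket if it is missing, then upload local_dir under prefix.

    `client` is expected to behave like a minio.Minio instance
    (bucket_exists, make_bucket, fput_object). Returns the number of files uploaded.
    """
    if not client.bucket_exists(bucket_name=bucket):
        client.make_bucket(bucket_name=bucket)
    count = 0
    for file_path, object_name in iter_uploads(local_dir, prefix):
        client.fput_object(
            bucket_name=bucket, object_name=object_name, file_path=file_path
        )
        count += 1
    return count
```

With a real server, the client would be created along the lines of `Minio("<host>:<port>", access_key="...", secret_key="...", secure=False)` and passed in as `upload_dir(client, "dataset", "Covid19-X-Rays", "covid19")`; the endpoint and credentials are placeholders to be filled in for your environment.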

Running the script from a notebook cell with `! ../01-prerequisite/00-uploadDataset-covid19.sh` produces the following output:


Install Minio CLI
----------------------------------------

/usr/bin/mc
Added `myminio` successfully.

Download the Covid19 dataset
----------------------------------------

Cloning into 'Covid19-X-Rays'...
remote: Enumerating objects: 239, done.
remote: Total 239 (delta 0), reused 0 (delta 0), pack-reused 239
Receiving objects: 100% (239/239), 74.15 MiB | 23.10 MiB/s, done.

Upload the Covid19 dataset to Minio
----------------------------------------

[2021-03-14 19:46:13 UTC]     0B test/
[2021-03-14 19:46:13 UTC]     0B train/
...4854.jpeg:  75.02 MiB / 75.02 MiB ┃▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓┃ 63.04 MiB/s 1s

Done
----------------------------------------



- Check the Minio bucket
  - http://<VM external IP>:32001/minio/dataset/covid19/

# Step 1. Containerize the logic to be hyperparameter-tuned

### Build with the docker command
**SSH into the VM** and upload `01-covid19-katib-train.py`.

Then, using the Base Image prepared in the prerequisites, build and push the Docker image as shown below.

```sh
# docker build script
REGISTRY=registry.kube-system.svc.cluster.local:30000
TAG=$REGISTRY/covid19-katib-job:latest

# The heredoc delimiter is intentionally unquoted so that $REGISTRY is expanded
cat << EOF > Dockerfile.covid
FROM $REGISTRY/kf-base:latest
COPY 01-covid19-katib-train.py /app/
CMD ["python3", "/app/01-covid19-katib-train.py"]
EOF

docker build -f Dockerfile.covid -t $TAG . && \
docker push $TAG
```
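The contents of `01-covid19-katib-train.py` are not shown here. As a minimal sketch of the shape a Katib trial script usually takes, the example below accepts hyperparameters as command-line arguments and prints metrics to stdout as `name=value` lines, which is the form Katib's default stdout metrics collector is generally expected to parse. The argument names `--learning_rate` and `--dropout_rate`, the metric names, and the placeholder `train` function are all assumptions for illustration, not the actual script.

```python
import argparse


def parse_args(argv=None):
    """Parse hyperparameters passed by the trial as --name=value arguments."""
    parser = argparse.ArgumentParser(description="Katib trial entry point (sketch)")
    # Hypothetical hyperparameters; the real script may use different ones.
    parser.add_argument("--learning_rate", type=float, default=0.001)
    parser.add_argument("--dropout_rate", type=float, default=0.2)
    # Ignore unknown arguments instead of failing on them.
    args, _unknown = parser.parse_known_args(argv)
    return args


def format_metric(name, value):
    """Format one metric as a name=value line for the metrics collector."""
    return f"{name}={value:.4f}"


def train(learning_rate, dropout_rate):
    """Placeholder: the real script would build and fit the model here.

    Returns dummy values only so that the sketch runs end to end.
    """
    return {"accuracy": 0.0, "loss": 0.0}


def main(argv=None):
    args = parse_args(argv)
    metrics = train(args.learning_rate, args.dropout_rate)
    for name, value in metrics.items():
        print(format_metric(name, value))
    return metrics


if __name__ == "__main__":
    main()
```

The metric names printed here would have to match the objective and additional metric names declared in the Experiment manifest for Katib to pick them up.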

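# Step 2. Run Katib

In the Experiment manifest (`03-covid19-katib-random.yaml`, applied below), change the trial container image to the image pushed in Step 1.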

```yaml
apiVersion: "kubeflow.org/v1alpha3"
kind: Experiment
...
spec:
...
  trialTemplate:
    goTemplate:
...
          spec:
            template:
              spec:
                containers:
                - name: {{.Trial}}
                  image: registry.kube-system.svc.cluster.local:30000/katib-job:latest  # edit this to point to the image pushed in Step 1
```
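The elided parts of the manifest define the search space and the algorithm. The file name suggests random search, so the sketch below illustrates the idea only: for each trial, one value per parameter is drawn from its range and handed to the container as `--name=value` arguments. The parameter names, types, and ranges are hypothetical and are not taken from `03-covid19-katib-random.yaml`; this is not Katib's own implementation.

```python
import random

# Hypothetical search space: argument name -> (type, min, max)
SEARCH_SPACE = {
    "--learning_rate": ("double", 0.001, 0.05),
    "--dropout_rate": ("double", 0.1, 0.5),
    "--epochs": ("int", 5, 20),
}


def sample_trial(space, rng):
    """Draw one value per parameter uniformly from its [min, max] range."""
    params = {}
    for name, (kind, low, high) in space.items():
        if kind == "int":
            params[name] = rng.randint(low, high)  # inclusive on both ends
        else:
            params[name] = rng.uniform(low, high)
    return params


def render_args(params):
    """Render sampled parameters as container arguments of the form --name=value."""
    return [f"{name}={value}" for name, value in params.items()]


if __name__ == "__main__":
    rng = random.Random()
    for trial in range(3):
        print(trial, render_args(sample_trial(SEARCH_SPACE, rng)))
```

Each rendered argument list corresponds to what one trial's container would receive in addition to its image and command.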

### Method 1. Apply with kubectl
SSH into the host and run `kubectl apply` as shown below.
```sh
kubectl apply -f 03-covid19-katib-random.yaml
```

### Method 2. Submit from the Katib UI
On the submit screen of the Katib UI, copy and paste the contents of `03-covid19-katib-random.yaml` and click the submit button.