This notebook addresses:

https://github.com/CONABIO/kube_sipecam_playground/issues/14

# Set up minikube and use the docker image for MAD-Mex + kale in AWS

We will follow:

* For minikube: [minikube_sipecam/setup](https://github.com/CONABIO/kube_sipecam/tree/master/minikube_sipecam/setup#aws)

* docker image for MAD-Mex: [kube_sipecam/dockerfiles/MAD_Mex/odc_kale](https://github.com/CONABIO/kube_sipecam/tree/master/dockerfiles/MAD_Mex/odc_kale) and [minikube_sipecam/deployments/MAD_Mex](https://github.com/CONABIO/kube_sipecam/tree/master/minikube_sipecam/deployments/MAD_Mex/)

* References for this notebook:

[1_issue_5_basic_setup_in_AWS_for_MAD_Mex_classif_pipeline](https://github.com/CONABIO/kube_sipecam_playground/blob/master/MAD_Mex/notebooks/1_issue_5_basic_setup_in_AWS_for_MAD_Mex_classif_pipeline.ipynb)

[1_issue_10_basic_setup_in_AWS_for_MAD_Mex_classif_pipeline](https://github.com/CONABIO/kube_sipecam_playground/blob/master/MAD_Mex/notebooks/2_issues_and_nbooks/1_issue_10_basic_setup_in_AWS_for_MAD_Mex_classif_pipeline.ipynb.ipynb)

We will use [minikube_sipecam/deployments/MAD_Mex/hostpath_pv](https://github.com/CONABIO/kube_sipecam/tree/master/minikube_sipecam/deployments/MAD_Mex/hostpath_pv)

## Instance

In the AWS account, select the AMI `minikube-sipecam`, which has the following description:

*Based in k8s-1.16-debian-buster-amd64-hvm-ebs-2020-04-27 - ami-0ab39819e336a3f3f Contains kubectl 1.19.1 minikube 1.13.0 kubeflow 1.0.2*

and an `m5.2xlarge` instance with `100` GB of disk.

Use the following bash script as user data:

```
#!/bin/bash
##variables:
region=us-west-2
name_instance=minikube-10-09-2020
##System update
apt-get update -yq
##Tag instance
INSTANCE_ID=$(curl -s http://instance-data/latest/meta-data/instance-id)
PUBLIC_IP=$(curl -s http://instance-data/latest/meta-data/public-ipv4)
aws ec2 create-tags --resources "$INSTANCE_ID" --tags "Key=Name,Value=${name_instance}-${PUBLIC_IP}" --region "$region"
```
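The metadata lookups in the script above are plain unauthenticated requests. If the instance is launched with IMDSv2 required, such requests are rejected and a session token has to be obtained first. The following is a minimal sketch under that assumption; the function names `imds_get` and `tag_value` are hypothetical and not part of the original setup.

```shell
# Hypothetical helper: read one metadata path using an IMDSv2 session token.
# 169.254.169.254 is the link-local instance metadata address.
imds_get() {
  local path="$1"
  local token
  token=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 300") || return 1
  curl -s -H "X-aws-ec2-metadata-token: ${token}" \
    "http://169.254.169.254/latest/meta-data/${path}"
}

# Hypothetical helper: build the Name tag value as <name>-<public ip>.
tag_value() {
  printf '%s-%s\n' "$1" "$2"
}

# Usage inside user data (not executed here):
# INSTANCE_ID=$(imds_get instance-id)
# PUBLIC_IP=$(imds_get public-ipv4)
# echo "$(tag_value "$name_instance" "$PUBLIC_IP")"
```

Keeping the tag-name construction in its own function makes it easy to check the resulting name before calling `aws ec2 create-tags`.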

**SSH to the instance; all commands will be executed as root:**

`sudo su`


**Next, start minikube and the kubeflow pods:**

```
cd /root && minikube start --driver=none

cd /opt/kf-test && /root/kfctl apply -V -f kfctl_k8s_istio.v1.0.2.yaml
```


Check pods and status with:

```
minikube status

minikube
type: Control Plane
host: Running
kubelet: Running
apiserver: Running
kubeconfig: Configured
```

```
kubectl get pods -n kubeflow

#all running except:
spark-operatorcrd-cleanup-2p7x2                                0/2     Completed   0          7m6s
```
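Reading the full pod list by eye gets tedious while kubeflow is starting. As a rough sketch, the listing can be filtered down to the pods that are neither `Running` nor `Completed`. The function name `pods_not_ready` is hypothetical, and it assumes the default column layout of `kubectl get pods` (NAME, READY, STATUS, RESTARTS, AGE).

```shell
# Hypothetical helper: read `kubectl get pods` output on stdin and print the
# names of pods whose STATUS column is neither Running nor Completed.
# The header line (NR == 1) and blank lines are skipped.
pods_not_ready() {
  awk 'NR > 1 && NF >= 3 && $3 != "Running" && $3 != "Completed" { print $1 }'
}

# Usage (not executed here):
# kubectl get pods -n kubeflow | pods_not_ready
```

An empty result means every pod is either running or has finished, which matches the state described above.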



**To access kubeflow UI set:**

```
export INGRESS_HOST=$(minikube ip)
export INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].nodePort}')
echo $INGRESS_PORT
```


**And go to:**

```
http://<ipv4 of ec2 instance>:$INGRESS_PORT
```



## Deployments and services 


**Set:**

```
MAD_MEX_LOAD_BALANCER_SERVICE=loadbalancer-mad-mex-0.1.0_1.7.0_0.5.0-hostpath-pv
MAD_MEX_PV=hostpath-pv
MAD_MEX_PVC=hostpath-pvc
MAD_MEX_JUPYTERLAB_SERVICE=jupyterlab-mad-mex-0.1.0_1.7.0_0.5.0-hostpath-pv
MAD_MEX_URL=https://raw.githubusercontent.com/CONABIO/kube_sipecam/master/minikube_sipecam/deployments/MAD_Mex
```

**Create storage:**


```
kubectl create -f $MAD_MEX_URL/hostpath_pv/$MAD_MEX_PV.yaml
kubectl create -f $MAD_MEX_URL/hostpath_pv/$MAD_MEX_PVC.yaml
```

**Create service:**

```
kubectl create -f $MAD_MEX_URL/hostpath_pv/$MAD_MEX_LOAD_BALANCER_SERVICE.yaml
```

**Create deployment:**

```
kubectl create -f $MAD_MEX_URL/hostpath_pv/$MAD_MEX_JUPYTERLAB_SERVICE.yaml
```
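The four `kubectl create` calls above all build their manifest URL the same way. A small sketch of that pattern, assuming a hypothetical helper `manifest_url` that joins the pieces and tolerates a base URL with or without a trailing slash:

```shell
# Hypothetical helper: join base URL, subdirectory and manifest name.
# A trailing slash on the base is dropped so the path never contains "//".
manifest_url() {
  local base="${1%/}"
  local subdir="$2"
  local name="$3"
  printf '%s/%s/%s.yaml\n' "$base" "$subdir" "$name"
}

# Usage (not executed here):
# for m in "$MAD_MEX_PV" "$MAD_MEX_PVC" \
#          "$MAD_MEX_LOAD_BALANCER_SERVICE" "$MAD_MEX_JUPYTERLAB_SERVICE"; do
#   kubectl create -f "$(manifest_url "$MAD_MEX_URL" hostpath_pv "$m")"
# done
```

The loop keeps the same order as the steps above: storage first, then the service, then the deployment.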

**And go to:**

```
http://<ipv4 of ec2 instance>:30001/madmexurl
```


# Set up postgresql instance in AWS

Will follow:

https://github.com/CONABIO/antares3-docker/tree/master/postgresql/local_deployment

**Clone the repo and initialize the DB:**

```
cd /shared_volume
dir=/shared_volume/postgresql_volume_docker
mkdir $dir

git clone https://github.com/CONABIO/antares3-docker.git $dir/antares3-docker

mkdir -p $dir/etc/postgresql
mkdir -p $dir/var/log/postgresql
mkdir -p $dir/var/lib/postgresql

docker run -v $dir/etc/postgresql:/etc/postgresql \
-v $dir/var/log/postgresql:/var/log/postgresql \
-v $dir/var/lib/postgresql:/var/lib/postgresql \
-v $dir/antares3-docker/postgresql/local_deployment/conf/:/home/postgres/conf/ \
-w /home/postgres \
-p 2225:22 -p 2345:5432 --name postgresql-madmex-odc --hostname postgresql-madmex \
-dit madmex/postgresql-madmex-local:v8 /bin/bash

docker exec -it postgresql-madmex-odc /usr/local/bin/entrypoint.sh
docker exec -u=postgres -it postgresql-madmex-odc /home/postgres/conf/setup.sh
```
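The later `antares` and `datacube` steps need the database to accept connections on the mapped port `2345`. A minimal sketch of waiting for that, assuming a hypothetical generic `retry` helper and the `pg_isready` tool that ships with the PostgreSQL client package:

```shell
# Hypothetical helper: run a command up to <attempts> times,
# sleeping <delay> seconds between failed attempts.
retry() {
  local attempts="$1"
  local delay="$2"
  local i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Usage (not executed here): wait up to about a minute for postgresql.
# retry 30 2 pg_isready -h 172.17.0.1 -p 2345
```

The helper returns the success status as soon as the command succeeds once, and a failure status if every attempt fails.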


# Create `/shared_volume/.geonode_conabio`:

```
HOST_NAME="<ipv4 DNS of ec2>"
USER_GEOSERVER="super"
PASSWORD_GEOSERVER="duper"
PASSWORD_DB_GEONODE_DATA="geonode"
```

## Init files for antares3 and ODC

**Run the next commands in jupyterlab**

`~/.datacube.conf`

```
[user]
default_environment: datacube
#default_environment: s3aio_env

[datacube]
db_hostname: 172.17.0.1
db_database: antares_datacube
db_username: postgres
db_password: postgres
db_port: 2345


execution_engine.use_s3: False

[s3aio_env]
db_hostname: 172.17.0.1
db_database: antares_datacube
db_username: postgres
db_password: postgres
db_port: 2345

#index_driver: s3aio_index

execution_engine.use_s3: False
```

`~/.antares`

```
# Django settings
SECRET_KEY=<key>
DEBUG=True
DJANGO_LOG_LEVEL=DEBUG
ALLOWED_HOSTS=
# Database
DATABASE_NAME=antares_datacube
DATABASE_USER=postgres
DATABASE_PASSWORD=postgres
DATABASE_HOST=172.17.0.1
DATABASE_PORT=2345
# Datacube
SERIALIZED_OBJECTS_DIR=/shared_volume/datacube_ingest/serialized_objects/
INGESTION_PATH=/shared_volume/datacube_ingest
#DRIVER=s3aio
DRIVER='NetCDF CF'
#INGESTION_BUCKET=datacube-s2-jalisco-test
# Query and download
USGS_USER=<username>
USGS_PASSWORD=<password>
SCIHUB_USER=
SCIHUB_PASSWORD=
# Misc
BIS_LICENSE=<license>
TEMP_DIR=/shared_volume/temp
SEGMENTATION_DIR=/shared_volume/segmentation/
#SEGMENTATION_BUCKET=<name of bucket>

```
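Several values in `~/.antares` are placeholders that must be filled in by hand. As a rough sketch, a hypothetical `missing_keys` helper can report which keys are absent or still empty before running `antares init`. Which keys are actually required is an assumption here, so the list in the usage line should be adjusted.

```shell
# Hypothetical helper: print every key from the argument list that has no
# non-empty KEY=value line in the given file. Commented-out lines such as
# "#DRIVER=s3aio" do not count as set.
missing_keys() {
  local file="$1"
  local key
  shift
  for key in "$@"; do
    grep -Eq "^${key}=.+" "$file" || printf '%s\n' "$key"
  done
}

# Usage (not executed here):
# missing_keys ~/.antares SECRET_KEY DATABASE_HOST DATABASE_PORT USGS_USER USGS_PASSWORD
```

No output means every listed key has a value.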

**Create the segmentation directory if it will hold the results of that process:**

`mkdir /shared_volume/segmentation/`

**Upgrade antares with no deps:**

`pip3 install --user git+https://github.com/CONABIO/antares3.git@develop --upgrade --no-deps`

**Init antares and datacube:**

```
~/.local/bin/antares init
datacube -v system init
```


**Check:**

`datacube -v system check`

**Create spatial indexes:**

```
apt-get install -y postgresql-client
psql -h 172.17.0.1 -d antares_datacube -U postgres -p 2345
# when prompted, the password is: postgres
# then run the next statements inside psql:
CREATE INDEX madmex_predictobject_gix ON public.madmex_predictobject USING GIST (the_geom);
CREATE INDEX madmex_trainobject_gix ON public.madmex_trainobject USING GIST (the_geom);
```

**Additional notes for the postgresql docker container can be found in [Notes](https://github.com/CONABIO/antares3-docker/tree/master/postgresql/local_deployment#note).**

# Register and ingest LANDSAT 8 data into ODC

S3 bucket that has data: `landsat-images-kube-sipecam-mad-mex`

**Prepare metadata:**

```
~/.local/bin/antares prepare_metadata --path "/" --bucket landsat-images-kube-sipecam-mad-mex --dataset_name landsat_espa --outfile /shared_volume/metadata_mex_l8.yaml --multi 2
```

**Datacube ingestion:**

```
datacube -v product add ~/.config/madmex/indexing/ls8_espa_scenes.yaml
datacube -v dataset add /shared_volume/metadata_mex_l8.yaml
datacube -v ingest -c ~/.config/madmex/ingestion/ls8_espa_mexico.yaml --executor multiproc 6
```

# Register and ingest SRTM data into ODC

Using https://conabio.github.io/antares3/example_s2_land_cover.html#prepare-terrain-metrics

From http://dwtkns.com/srtm/ we will download the SRTM data for Chiapas:


```
cd /shared_volume
wget http://srtm.csi.cgiar.org/wp-content/uploads/files/srtm_5x5/tiff/srtm_18_09.zip
apt-get install -y unzip
unzip srtm_18_09.zip -d /shared_volume/srtm_18_09
mkdir /shared_volume/srtm_mosaic
cp /shared_volume/srtm_18_09/srtm_18_09.tif /shared_volume/srtm_mosaic/srtm_mosaic.tif
gdaldem slope /shared_volume/srtm_mosaic/srtm_mosaic.tif /shared_volume/srtm_mosaic/slope_mosaic.tif -s 111120
gdaldem aspect /shared_volume/srtm_mosaic/srtm_mosaic.tif /shared_volume/srtm_mosaic/aspect_mosaic.tif
```

## Create product and Index mosaic


`datacube -v product add ~/.config/madmex/indexing/srtm_cgiar.yaml`


```
~/.local/bin/antares prepare_metadata --path /shared_volume/srtm_mosaic --dataset_name srtm_cgiar --outfile /shared_volume/metadata_srtm.yaml

datacube -v dataset add /shared_volume/metadata_srtm.yaml
datacube -v ingest -c ~/.config/madmex/ingestion/srtm_cgiar_mexico.yaml --executor multiproc 6
```

# Ingest Mexico's shapefile to antares-datacube DB

`~/.local/bin/antares init -c 'MEX'`

# Ingest training data in antares-datacube DB

**Training data is in bucket `training-data-kube-sipecam-mad-mex`**

```
Chiapas_31.shp
Chiapas_31.shx
Chiapas_31.prj
Chiapas_31.dbf
```
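The four files listed above need to be on local disk before they can be ingested. A minimal sketch of copying them, assuming they sit at the root of the bucket and that `/shared_volume/training_data` is the target directory; the helper name `print_s3_copy_cmds` is hypothetical. It only prints the `aws s3 cp` commands so they can be reviewed first.

```shell
# Hypothetical helper: print one `aws s3 cp` command per shapefile component.
# Assumes the files sit at the root of the bucket (not confirmed by the source).
print_s3_copy_cmds() {
  local bucket="$1"
  local stem="$2"
  local dest="$3"
  local ext
  for ext in shp shx prj dbf; do
    printf 'aws s3 cp s3://%s/%s.%s %s/\n' "$bucket" "$stem" "$ext" "$dest"
  done
}

# Usage (not executed here): review the output, then pipe it to bash.
# mkdir -p /shared_volume/training_data
# print_s3_copy_cmds training-data-kube-sipecam-mad-mex Chiapas_31 /shared_volume/training_data | bash
```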

```
~/.local/bin/antares ingest_training_from_vector /shared_volume/training_data/Chiapas_31.shp --scheme madmex --year 2015 --name train_chiapas_dummy --field class
```

# Deploy geonode

**In the EC2 instance, as root (`sudo su`)**

Following: https://github.com/CONABIO/geonode/tree/master/deployment_using_spcgeonode

Install docker-compose:

```
cd ~
curl -L "https://github.com/docker/compose/releases/download/1.26.2/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
chmod +x /usr/local/bin/docker-compose
```

Deploy geonode using the instructions in https://github.com/CONABIO/geonode/tree/master/deployment_using_spcgeonode

After cloning the geonode repo into `/shared_volume`, set `server_names_hash_bucket_size  128;` in `/shared_volume/geonode/scripts/spcgeonode/nginx/nginx.conf.envsubst` and use the IPv4 DNS of the EC2 instance in `/shared_volume/geonode/scripts/spcgeonode/.env`.

Then add a rule for port `80` to the security group of the instance.

## Deployments and services 


**Set:**

```
GEONODE_CONABIO_LOAD_BALANCER_SERVICE=loadbalancer-geonode-conabio-0.1_0.5.0-hostpath-pv
GEONODE_CONABIO_PV=hostpath-pv
GEONODE_CONABIO_PVC=hostpath-pvc
GEONODE_CONABIO_JUPYTERLAB_SERVICE_HOSTPATH_PV=jupyterlab-geonode-conabio-0.1_0.5.0-hostpath-pv
GEONODE_CONABIO_URL=https://raw.githubusercontent.com/CONABIO/kube_sipecam/master/minikube_sipecam/deployments/geonode_conabio

```

**Create storage:**


```
kubectl create -f $GEONODE_CONABIO_URL/hostpath_pv/$GEONODE_CONABIO_PV.yaml
kubectl create -f $GEONODE_CONABIO_URL/hostpath_pv/$GEONODE_CONABIO_PVC.yaml
```

**Create service:**

```
kubectl create -f $GEONODE_CONABIO_URL/hostpath_pv/$GEONODE_CONABIO_LOAD_BALANCER_SERVICE.yaml
```

**Create deployment:**

```
kubectl create -f $GEONODE_CONABIO_URL/hostpath_pv/$GEONODE_CONABIO_JUPYTERLAB_SERVICE_HOSTPATH_PV.yaml
```

**And go to:**

```
http://<ipv4 of ec2 instance>:30002/geonodeurl
```


# Note:

If the disk fills up, which can happen when a kubeflow pipeline is uploaded from kale, the upload fails with:

```
HTTP response headers: HTTPHeaderDict({'Date': 'Tue, 01 Sep 2020 18:12:22 GMT', 'Content-Length': '487', 'Content-Type': 'text/plain; charset=utf-8'})
HTTP response body: {"error_message":"Error creating pipeline: Create pipeline failed: InternalServerError: Failed to store b2fa5a70-cab4-4c89-8784-9c0cb118d1b4: Storage backend has reached its minimum free disk threshold. Please delete a few objects to proceed.","error_details":"Error creating pipeline: Create pipeline failed: InternalServerError: Failed to store b2fa5a70-cab4-4c89-8784-9c0cb118d1b4: Storage backend has reached its minimum free disk threshold. Please delete a few objects to proceed."}
```
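Before deleting anything, it can help to confirm that the disk really is the cause. A rough sketch, assuming a hypothetical `used_pct` helper that parses the POSIX-format output of `df -P`:

```shell
# Hypothetical helper: read `df -P <path>` output on stdin and print the
# used percentage (the Capacity column) of the first filesystem as an integer.
used_pct() {
  awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Usage (not executed here):
# df -P / | used_pct
# docker system df
```

A value close to 100 for `/` is consistent with the storage backend error shown above.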

To free space, delete the minikube cluster, which removes kubeflow together with the MAD-Mex and geonode deployments:

```
minikube stop
minikube delete
```

Check docker disk usage and remove leftover data:

```
docker system df
docker system prune --all --volumes
rm -r /root/.minikube/*
rm -r /root/.kube/*
rm -r /opt/kf-test
```

Start again (from root's home directory):

```
CONFIG_URI="https://raw.githubusercontent.com/kubeflow/manifests/v1.0-branch/kfdef/kfctl_k8s_istio.v1.0.2.yaml"
source ~/.profile
chmod gou+wrx -R /opt/
mkdir -p ${KF_DIR}
#minikube start
cd /root && minikube start --driver=none
#kubeflow start
cd ${KF_DIR}

wget $CONFIG_URI
wget https://codeload.github.com/kubeflow/manifests/tar.gz/v1.0.2 -O v1.0.2.tar.gz

```

In `kfctl_k8s_istio.v1.0.2.yaml`, change the `uri` at the end of the file:

```
#this section:
  repos:
  - name: manifests
    uri: https://github.com/kubeflow/manifests/archive/v1.0.2.tar.gz
#with:
  repos:
  - name: manifests
    uri: file:///opt/kf-test/v1.0.2.tar.gz
```
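The same change can be scripted instead of edited by hand. A minimal sketch, assuming the file contains the `uri:` line exactly as shown above; the helper name `use_local_manifests` is hypothetical.

```shell
# Hypothetical helper: rewrite the manifests `uri:` in a kfctl config file so
# it points at a local tarball. The result is written through a temporary file
# and copied back, so the original file keeps its permissions.
use_local_manifests() {
  local file="$1"
  local tarball="$2"
  local tmp
  tmp=$(mktemp) || return 1
  sed "s|uri: https://github.com/kubeflow/manifests/archive/v1.0.2.tar.gz|uri: file://${tarball}|" \
    "$file" > "$tmp" && cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Usage (not executed here):
# use_local_manifests /opt/kf-test/kfctl_k8s_istio.v1.0.2.yaml /opt/kf-test/v1.0.2.tar.gz
```

Passing the absolute tarball path yields `file:///opt/kf-test/v1.0.2.tar.gz`, the value shown in the edited section.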

Then:

```
kfctl apply -V -f kfctl_k8s_istio.v1.0.2.yaml
```



ref: https://github.com/aws-samples/eks-workshop/issues/639

If there are problems with geonode because the docker-compose stack was deleted, clone the repo again and redeploy geonode.