#### Note: 
1. You will work in a terminal for this chapter.
2. Change your working directory: ```cd /p/project/training2308/$USER/```

# Install Conda using Miniconda
You will be using Python and PyTorch (through MMSegmentation) for training. This requires an environment that you can use across nodes. You will use Miniconda to create a Python virtual environment for your experiments.

## Training using high performance computing and PyTorch
1. Download Miniconda from https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
2. Install conda using the Miniconda script you downloaded
3. Make sure you can use the Python from the conda environment
4. Create a conda environment using provided yaml file
5. Install required packages
6. Folder structure and files you will be using
7. Update configuration for training
8. Update the batch job file (`burn_scars.sh` or `flood.sh`)
9. Submit job
10. Check progress
11. Test saved model

## Use cloud computing for inferencing
1. Push the trained model to the AWS environment using boto3 (requires the AWS credentials shared with you)
2. Access setup environments in SageMaker
3. Load model in SageMaker
4. Deploy model
5. Test deployed model
6. Deploy API endpoint to interact with model
7. Interact with API endpoint to get inferences from the model

## Download Miniconda from https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh

Perform all of the following steps in a terminal (opened from JupyterHub).

```
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
```

## Install conda using the downloaded Miniconda shell script

```
chmod 770 Miniconda3-latest-Linux-x86_64.sh
./Miniconda3-latest-Linux-x86_64.sh
# When the installer asks where to install Miniconda, enter: /p/project/training2308/$USER/miniconda3
# When it asks whether to initialize Miniconda3, answer: yes
```


### Check if the installation updated your bashrc to automatically use Python from conda

```
cat ~/.bashrc
```
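
Instead of reading the file by eye, the same check can be scripted. The sketch below assumes the installer's initialization step wrote its usual `# >>> conda initialize >>>` marker block into `~/.bashrc`; the function name `has_conda_init` is hypothetical and not part of the tutorial code.

```python
from pathlib import Path

# Marker lines that `conda init` places around its managed block
# (assumed to be unchanged for the Miniconda version you installed).
CONDA_INIT_START = "# >>> conda initialize >>>"
CONDA_INIT_END = "# <<< conda initialize <<<"


def has_conda_init(bashrc_text: str) -> bool:
    """Return True if the text contains a complete conda initialize block."""
    start = bashrc_text.find(CONDA_INIT_START)
    end = bashrc_text.find(CONDA_INIT_END)
    # Both markers must be present, with the start marker first.
    return start != -1 and end > start


if __name__ == "__main__":
    bashrc = Path.home() / ".bashrc"
    text = bashrc.read_text(errors="replace") if bashrc.exists() else ""
    print("conda init block found:", has_conda_init(text))
```

If the block is not found, rerunning the installer's initialization step or running `conda init bash` is the usual remedy.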

## Folder structure

All of the project-related files are located at `/p/project/training2308/$USER/`. (`USER` is the environment variable that holds your user name. You can check its value with `echo $USER`.)

Change directory to the aforementioned directory.

Check for the `2023-igarss-tutorial` folder in the directory. If it is not present, download it from https://github.com/nasa-impact/2023-igarss-tutorial using `git clone https://github.com/nasa-impact/2023-igarss-tutorial.git` or using JupyterHub.

Once cloned, change the directory to the 2023-igarss-tutorial folder using `cd 2023-igarss-tutorial`

Below is the folder structure for the code:
```
|> chapter-1
    |> mmsegmentation
        |> config: `Contains configuration files`
        |> burn_scars.sh `Bash file for Burn Scars Training job submission.`
        |> flood.sh `Bash file for Flood Training job submission.`
    ...
|> chapter-2 `Contains files for loading the files in sagemaker environment and inferencing.`
|> chapter-3 `Contains files for establishing an API in cloud environment to interact with the trained model`
```

## Create conda environment
You will use the conda installation from the previous step to create the environment that is used throughout this tutorial.

In some cases, conda might not be active after installation. Refreshing your terminal with `exec bash` should activate the conda base environment.

Once conda is active, create the environment from the provided `tutorial.yml` file using `conda env create --name py39 -f tutorial.yml`

Then activate the environment you just created using `conda activate py39`
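
To confirm which environment is active, a short Python check can read the `CONDA_DEFAULT_ENV` variable that `conda activate` exports. This is a minimal sketch; the helper names are hypothetical, and `py39` is the environment name used in this tutorial.

```python
import os
from typing import Mapping, Optional


def active_conda_env(environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the name of the active conda environment, or None if none is active."""
    # `conda activate` sets CONDA_DEFAULT_ENV to the environment name.
    return environ.get("CONDA_DEFAULT_ENV") or None


def is_expected_env(expected: str = "py39",
                    environ: Mapping[str, str] = os.environ) -> bool:
    """Return True if the active conda environment matches the expected name."""
    return active_conda_env(environ) == expected


if __name__ == "__main__":
    print("active environment:", active_conda_env())
    print("py39 active:", is_expected_env())
```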

Once the environment is activated, make sure you are starting from a clean state. Use `module purge` to unload any environment modules that are already loaded.

For the purposes of this tutorial, all packages except mmcv, mmcv-full, and mmsegmentation are installed through the `tutorial.yml` file.

Install the remaining packages as follows:
```
mim install mmcv==1.5.0
mim install mmcv-full==1.5.0
mim install mmsegmentation==0.30.0
```
Once these are installed, the local version of mmsegmentation also needs to be installed:
```
cd mmsegmentation
pip install -e .
cd ..
```

All of the required packages are now installed.
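
A quick way to confirm the installation is to check that the packages can be located from Python. The sketch below assumes the usual import names (`mmcv` for mmcv / mmcv-full and `mmseg` for mmsegmentation); the helper `missing_modules` is hypothetical.

```python
import importlib.util
from typing import Iterable, List

# Import names assumed for the packages installed above:
# mmcv and mmcv-full import as `mmcv`, mmsegmentation imports as `mmseg`.
REQUIRED_MODULES = ("mmcv", "mmseg")


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

Run it with the `py39` environment active; a non-empty list points to the package that still needs to be installed.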

## Update configuration and environment variables

To find your user name, run the following command in the terminal:

`echo $USER`

Training and validation files are located at `/p/project/training2308/data/burn_scars` and `/p/project/training2308/data/flood`.

Before working on any of this, change your directory to `/p/project/training2308/$USER/2023-igarss-tutorial/chapter-1`

In the JupyterLab interface, find the `chapter-1/burn_scars.sh` or `chapter-1/flood.sh` file. Right-click it in the left pane and select `editor`. Once the file is open, replace every `<username>` placeholder with your user name.
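
If you prefer not to edit the file by hand, the placeholder replacement can be sketched in Python. The helpers `fill_username` and `update_script` are hypothetical and only assume that the script marks your user name with the literal text `<username>`.

```python
from pathlib import Path

PLACEHOLDER = "<username>"


def fill_username(script_text: str, username: str) -> str:
    """Return the script text with every <username> placeholder replaced."""
    if not username:
        raise ValueError("username must not be empty")
    return script_text.replace(PLACEHOLDER, username)


def update_script(path: str, username: str) -> int:
    """Rewrite the file in place and return how many placeholders were replaced."""
    script = Path(path)
    original = script.read_text()
    count = original.count(PLACEHOLDER)
    if count:
        script.write_text(fill_username(original, username))
    return count
```

For example, `update_script("burn_scars.sh", os.environ["USER"])` (after `import os`) would update the burn scars job file from within the `chapter-1` directory.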



# Submit Training Job
In the `burn_scars.sh` or `flood.sh` file you can specify the number of nodes you want to use for training. As an example, you are going to use 2 nodes for training.

Check details of the training job:

`cat /p/project/training2308/$USER/2023-igarss-tutorial/chapter-1/burn_scars.sh`

You can submit the training job using the `sbatch` command: `sbatch burn_scars.sh` or `sbatch flood.sh`
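
The submission can also be wrapped in Python so that the job ID is captured for later checks. This is a sketch that assumes `sbatch` prints its usual `Submitted batch job <id>` line; `parse_job_id` and `submit_job` are hypothetical helpers.

```python
import re
import subprocess
from typing import Optional


def parse_job_id(sbatch_stdout: str) -> Optional[int]:
    """Extract the job ID from sbatch output such as 'Submitted batch job 12345'."""
    match = re.search(r"Submitted batch job (\d+)", sbatch_stdout)
    return int(match.group(1)) if match else None


def submit_job(script: str = "burn_scars.sh") -> Optional[int]:
    """Run `sbatch <script>` and return the job ID (only works where Slurm is available)."""
    result = subprocess.run(
        ["sbatch", script], capture_output=True, text=True, check=True
    )
    return parse_job_id(result.stdout)
```

The returned ID can then be passed to `squeue -j <id>` to see whether the job is pending or running.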

Once submitted, two new files will be created by the process: `output.out` and `error.err`. `output.out` contains the output from your processes, and `error.err` contains warnings, automated messages, errors, and other logs from the scripts. Once the job is submitted and the files are created, you can follow updates using `tail -f output.out error.err`.

You can judge how well training is going by watching the loss values reported in `output.out` or `error.err`.
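
To track the loss without scrolling through the log, the values can be pulled out with a small script. The sketch below assumes each training log line contains a field of the form `loss: <number>`; the exact log format depends on the logger configuration, so the pattern may need adjusting. The function names are hypothetical.

```python
import re
from typing import Iterable, List

# Matches a standalone "loss: <number>" field, but not prefixed fields
# such as "decode.loss_ce: ..." (assumed log format, adjust as needed).
LOSS_PATTERN = re.compile(r"(?<![\w.])loss:\s*([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)")


def extract_losses(lines: Iterable[str]) -> List[float]:
    """Return the overall loss value from every line that reports one."""
    losses = []
    for line in lines:
        match = LOSS_PATTERN.search(line)
        if match:
            losses.append(float(match.group(1)))
    return losses


def is_improving(losses: List[float], window: int = 3) -> bool:
    """Compare the mean of the last `window` losses with the mean of the first `window`."""
    if len(losses) < 2 * window:
        return False  # not enough data points to compare
    first = sum(losses[:window]) / window
    last = sum(losses[-window:]) / window
    return last < first
```

For example, `extract_losses(open("error.err"))` returns the loss history found in that file, and `is_improving(...)` gives a rough indication of whether the loss is trending down.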

# Uploading the Model to a Cloud Environment

After training finishes, the model is stored in the location specified in your config file: `/p/project/training2308/<username>/<experiment>/training/latest.pth`, where `<username>` is your user name and `<experiment>` is one of `burn_scars` or `flood`. You will push this model to an S3 bucket using `boto3` and the credentials from the AWS account shared with you.

## Get AWS credentials
Account creation links should have been shared with you. Once the account is set up, you can obtain the credentials required for the upload from the AWS SSO homepage.
Please follow the steps listed below:

1. Navigate to https://nasa-impact.awsapps.com/start
2. Login
3. Click on `AWS Account`
4. Click on `Summer School`
5. Click on `Command line or Programmatic access`
6. Copy the `AWS Access Key Id`, `AWS Secret Access Key`, and `AWS session token` from the pop up
7. Update the following script and run it in a Python shell. (You can start a Python shell by typing `python` in the terminal.)

This will upload the file directly into the S3 bucket. You will then fetch the file from the S3 bucket into the SageMaker notebook, from where you will deploy the model and host an API to interact with it.


*Note: Please make sure the virtual environment is active while working with the Python shell.*

```
import os

import boto3

# Paste the values copied from the SSO login page between the quotes.
AWS_ACCESS_KEY_ID = "<Copied over from SSO login>"
AWS_SECRET_ACCESS_KEY = "<Copied over from SSO login>"
AWS_SESSION_TOKEN = "<Copied over from SSO login>"

BUCKET_NAME = '2023-igarss-tutorial-store'
# Please update this with either "burn_scars" or "flood"
EXPERIMENT = "<experiment>"

USER = os.environ.get('USER')


def generate_federated_session():
    """
    Create a boto3 session from the temporary SSO credentials above.
    The session is used to upload the model file from the HPC system to the S3 bucket.
    Returns:
        boto3.session.Session configured with the credentials
    """
    return boto3.session.Session(
        aws_access_key_id=AWS_ACCESS_KEY_ID,
        aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
        aws_session_token=AWS_SESSION_TOKEN
    )


# Local path of the trained checkpoint
model_filename = f"/p/project/training2308/{USER}/{EXPERIMENT}/training/latest.pth"
session = generate_federated_session()
s3_connector = session.client('s3')

# Upload to s3://<BUCKET_NAME>/<USER>/<USER>_<EXPERIMENT>.pth
s3_connector.upload_file(model_filename, BUCKET_NAME, f"{USER}/{USER}_{EXPERIMENT}.pth")
```
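
A small pre-flight check before the upload can catch a forgotten `<experiment>` placeholder or a missing checkpoint. The sketch below rebuilds the local path and S3 key with the same naming as the upload script; `build_upload_plan` and `checkpoint_exists` are hypothetical helpers, not part of the tutorial code.

```python
import os
from typing import Tuple

VALID_EXPERIMENTS = ("burn_scars", "flood")


def build_upload_plan(user: str, experiment: str) -> Tuple[str, str]:
    """Return (local checkpoint path, S3 object key) using the tutorial's naming."""
    if not user:
        raise ValueError("user must not be empty; is $USER set?")
    if experiment not in VALID_EXPERIMENTS:
        raise ValueError(f"experiment must be one of {VALID_EXPERIMENTS}")
    local_path = f"/p/project/training2308/{user}/{experiment}/training/latest.pth"
    s3_key = f"{user}/{user}_{experiment}.pth"
    return local_path, s3_key


def checkpoint_exists(local_path: str) -> bool:
    """Return True if the checkpoint file is present on disk."""
    return os.path.isfile(local_path)
```

For example, `build_upload_plan(os.environ["USER"], "flood")` returns the path to check with `checkpoint_exists` and the key under which the file should appear in the bucket.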


Once the process is done, you can check for the files in S3 using the AWS console.

1. Navigate to https://nasa-impact.awsapps.com/start
2. Login
3. Click on `AWS Account`
4. Click on `Summer School`
5. Click on `Management Console`
6. In the search bar, search for `s3`
7. Click on `s3`
8. Click on the bucket you uploaded to (the value of `BUCKET_NAME` in the upload script, `2023-igarss-tutorial-store`)
9. Click on your `username`

You should be able to view your file there now. 

# Foundation Model



# Notes on Fine-tuning

## MMSegmentation
