## Welcome to eHealth Student Lab
This lab is designed to help you gain experience in structuring data science projects properly. By completing these tasks, you'll learn how to write clean, reusable, and scalable code for real-world data science applications. The prerequisites have already been installed on the laptops provided; if you want to replicate the setup on your own laptop, follow all the instructions below.

Feel free to ask questions or consult the instructors if you get stuck at any point.

### Prerequisites
- Python
- Git
- VS Code

### i Setting Up the Environment
1. **Clone the repository**:
   ```bash
   git clone https://github.com/Marshall-mk/Ehealth-Tutorial.git
   cd Ehealth-Tutorial
   ```
2. **Create a virtual environment** (recommended):
   ```bash
   python -m venv myEhealth 
   source myEhealth/bin/activate  # For Linux/Mac
   myEhealth\Scripts\activate  # For Windows
   ```
3. **Install dependencies**:
   ```bash
   pip install -r requirements.txt  # skip this step if requirements.txt does not exist yet
   ```
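
Before installing anything, it can help to confirm that the virtual environment is actually active. The following is a minimal sketch, not part of the repository; the function name `in_virtualenv` is made up for illustration, and the check assumes the environment was created with `python -m venv` as in step 2.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Virtual environment active:", in_virtualenv())
```

If the second line prints `False`, re-run the activate command for your operating system before calling `pip install`.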


### ii Initial Exploration
0. Open the folder "Ehealth-Tutorial" in VS Code (or any code editor).
1. **Navigate to the notebooks directory**:
   The `notebooks/` folder contains the Jupyter notebooks.
2. Open `sample_notebook.ipynb` to see how a typical student would experiment on a given task.
3. Open `notebook_to_codebase.ipynb` for instructions on how to convert a typical notebook into a structured codebase.

### iii New Tools and Libraries

- MLflow and WandB: Used to track and log our experiments.
- pipreqs: Generates a `requirements.txt` file from the imports in the code.
- ruff: For code linting and formatting.
- OmegaConf and Hydra: Used together to manage our configuration files.
- torch, tensorflow and sklearn: Used to train, fine-tune, and evaluate our models.

**<H1 style="text-align:center;">  Start of the Experiment</H1>**

**Step 1** (Optional)
   - Move data loading code to `src/data.py`.
   - Place model architecture code in `src/models.py`.
   - Place model training code into `src/train.py` and `src/trainer.py`.
   - Place model evaluation code into `src/evaluate.py`.
   - Place image prediction code in `src/test.py`.
   - Place utility functions into `src/utils.py`.

**Step 2** (Optional)
   - You will now learn how to extract dependencies and create a `requirements.txt` file.
   - Install pipreqs using pip: `pip install pipreqs`.
   - In your terminal, navigate to the root folder (just inside Ehealth-Tutorial) and run `pipreqs .`.
   - This creates a `requirements.txt` file, which you can use to install the dependencies with the command from section i.
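
To make the idea behind dependency extraction concrete, the sketch below scans Python source for imports and keeps only the third-party ones. It is a simplified illustration of the concept, not pipreqs's actual implementation; the function names are hypothetical, and it assumes Python 3.10 or newer (for `sys.stdlib_module_names`).

```python
import ast
import sys
from pathlib import Path


def top_level_imports(source: str) -> set[str]:
    """Collect the top-level package names imported by a piece of Python source."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 skips relative imports such as "from . import utils"
            names.add(node.module.split(".")[0])
    return names


def third_party_imports(root: Path) -> set[str]:
    """Scan every .py file under root and drop standard-library and local modules."""
    found = set()
    local = set()
    for path in root.rglob("*.py"):
        local.add(path.stem)  # e.g. data.py, utils.py are project modules
        found |= top_level_imports(path.read_text(encoding="utf-8"))
    return found - set(sys.stdlib_module_names) - local


if __name__ == "__main__":
    example = "import os\nimport numpy as np\nfrom sklearn.metrics import f1_score\n"
    print(sorted(top_level_imports(example)))
```

Note that an import name is not always the name of the package to install: `sklearn`, for example, is installed as `scikit-learn`. Resolving that mapping and pinning versions is part of what a dedicated tool adds on top of this scan.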

**Step 3** (Optional)
   - Here you will learn how to format a codebase consistently.
   - Go back to each file. You will notice that the line lengths, spacing, and indentation are inconsistent. We will use `ruff` to fix that!
   - Use pip to install `ruff`: `pip install ruff`.
   - In the terminal, inside Ehealth-Tutorial, run `ruff format .` to reformat your codebase.

**Step 4**
   - Run your first experiment!!!
   - Create two folders in the root directory, ```checkpoints``` and ```metrics```, if they don't already exist.
   - In the terminal or command prompt, navigate into the source directory ```src```.
   - Start the MLflow server: ```mlflow server --host 127.0.0.1 --port 8080```
   - In another terminal or command prompt, navigate into the source directory ```src```.
   - Run the experiment: ```python train.py```
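
The folder-creation step above can also be scripted so that it is safe to repeat. This is a minimal sketch; `ensure_output_dirs` is a hypothetical helper, not something the repository provides, and the demo runs against a temporary directory rather than your project.

```python
import tempfile
from pathlib import Path


def ensure_output_dirs(project_root: Path, names=("checkpoints", "metrics")) -> list[Path]:
    """Create the output folders under project_root if they are missing."""
    folders = []
    for name in names:
        folder = project_root / name
        # exist_ok=True makes the call a no-op when the folder is already there
        folder.mkdir(parents=True, exist_ok=True)
        folders.append(folder)
    return folders


if __name__ == "__main__":
    # Demonstrate on a throwaway directory; pass the repository root in real use.
    with tempfile.TemporaryDirectory() as tmp:
        for folder in ensure_output_dirs(Path(tmp)):
            print(folder.name, folder.is_dir())
```

In real use you would call `ensure_output_dirs(Path("."))` from the root of Ehealth-Tutorial before starting the server.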

**Step 5**
   - To run your second experiment, modify the parameters.
   - Note: You can do this directly from the command line.
   - For example: ```python train.py model.model_name=mobilenetv3 train.batch_size=32 train.epochs=2```
   - You can run as many experiments as your time allows.
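
The `key=value` arguments above follow a dotted-path convention: each dot descends one level into a nested configuration. The sketch below illustrates that mapping in plain Python. It is a simplified stand-in for what a configuration library does, not Hydra's or OmegaConf's actual code, and the default values shown are hypothetical (the real ones live in the repository's config files).

```python
import ast
import copy


def parse_value(text: str):
    """Turn "32" into 32 and "0.001" into 0.001; leave other text as a string."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return text


def apply_overrides(config: dict, overrides: list[str]) -> dict:
    """Return a copy of config with dotted key=value overrides applied."""
    result = copy.deepcopy(config)
    for item in overrides:
        dotted_key, _, raw_value = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = result
        for key in parents:
            # Walk down the nested dicts, creating a level if it is missing
            node = node.setdefault(key, {})
        node[leaf] = parse_value(raw_value)
    return result


# Hypothetical defaults, for illustration only.
defaults = {
    "model": {"model_name": "resnet18"},
    "train": {"batch_size": 16, "epochs": 10},
}

overrides = ["model.model_name=mobilenetv3", "train.batch_size=32", "train.epochs=2"]
print(apply_overrides(defaults, overrides))
```

Because the defaults are copied rather than modified, every run starts from the same baseline and only the parameters named on the command line change.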

**Step 6**
   - Use MLflow to visualize your experiments.
   - In your command line, run ```mlflow ui``` and click on the URL to open the UI in your browser. Alternatively, if the server from Step 4 is still running, open http://127.0.0.1:8080 directly.

**<H2 style="text-align:center;">  Using a Linux GPU Server (optional)</H2>**

We will be using an online server, as we do not have access to a physical one. To do this, we need to create an account on ```RunPod```.

**Step 1**
   - Visit https://www.runpod.io/ and create an account. 


**Step 2**
   - Set up credentials to access RunPod remotely.
   1. Create a key pair in a terminal window as follows: ```ssh-keygen -t ed25519```
   2. Get your public key. If you used the defaults, run ```cat ~/.ssh/id_ed25519.pub``` on Mac and Linux, or ```type %USERPROFILE%\.ssh\id_ed25519.pub``` on Windows.
   3. Copy your public SSH key to the RunPod server. See the image below.
      ![Image Description](sshkey.png) 
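
If you would rather not remember a different command per operating system, the public key can be read in a platform-independent way. This is a minimal sketch assuming the key was generated with the default file name; both function names are hypothetical.

```python
from pathlib import Path


def default_public_key_path(home=None) -> Path:
    """Return the default location of the ed25519 public key for a home directory."""
    # Path.home() resolves to %USERPROFILE% on Windows and $HOME on Linux/macOS.
    base = Path(home) if home is not None else Path.home()
    return base / ".ssh" / "id_ed25519.pub"


def read_public_key(path: Path) -> str:
    """Read the single-line public key, stripping the trailing newline."""
    return path.read_text(encoding="utf-8").strip()


if __name__ == "__main__":
    key_path = default_public_key_path()
    if key_path.exists():
        print(read_public_key(key_path))
    else:
        print(f"No key found at {key_path}; run ssh-keygen -t ed25519 first.")
```

Only the `.pub` file should ever be copied to a server; the private key (the file without the extension) stays on your machine.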

**Step 3**
   - On RunPod, go to ```home -> Gpu-cloud``` and select the cheapest Nvidia machine.
   - The default container image is PyTorch, so leave it as is.
   - Check the ```ssh terminal access``` box and click deploy.
   - Wait a few seconds, click ```connect```, and copy the ```ssh over exposed TCP``` command.

**Step 4**
- In VS Code, press ```ctrl + shift + p``` to open the command palette.
- Select ```Remote-SSH``` and then ```add new ssh host```.
- Paste the ```ssh over exposed TCP``` command and press enter. (You should see a pop-up saying 'Host added!'.)
- Press ```ctrl + shift + p``` again, choose the Remote-SSH option to connect to a host, and select the IP address associated with your TCP command (an example is 69.30.85.70).
- Open a terminal and clone the Ehealth Git repo.
- You should be able to run your experiments just as you did on your local machine.
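
Once connected, it is worth confirming that the remote machine actually exposes a GPU before starting a training run. The sketch below is one way to do that, assuming the NVIDIA driver's `nvidia-smi` tool is on the `PATH` of the server; `gpu_summary` is a hypothetical helper name.

```python
import shutil
import subprocess


def gpu_summary() -> str:
    """Describe the NVIDIA GPUs visible on this machine, or explain why none were found."""
    if shutil.which("nvidia-smi") is None:
        return "nvidia-smi not found: no NVIDIA driver on this machine"
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,  # report failures in the return value instead of raising
    )
    if result.returncode != 0:
        return f"nvidia-smi failed: {result.stderr.strip()}"
    return result.stdout.strip()


if __name__ == "__main__":
    print(gpu_summary())
```

On the GPU server this should print one line per GPU with its name and total memory; on a laptop without an NVIDIA driver it prints the "not found" message instead.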