## Welcome to the eHealth Student Lab
This lab is designed to help you gain experience in structuring data science projects properly. By completing these tasks, you'll learn how to write clean, reusable, and scalable code for real-world data science applications.

Feel free to ask questions or consult the instructors (Tuhin and Momin) if you get stuck at any point.

### Prerequisites
- Python
- Git
- VS Code

### i Setting Up the Environment
1. **Clone the repository**:
   ```bash
   git clone https://github.com/Marshall-mk/Ehealth-Tutorial.git
   cd Ehealth-Tutorial
   ```
2. **Create a virtual environment** (recommended):
   ```bash
   python -m venv myEhealth
   source myEhealth/bin/activate    # Linux/macOS
   # myEhealth\Scripts\activate     # Windows: run this line instead of the one above
   ```
3. **Install dependencies**:
   ```bash
   pip install -r requirements.txt  # skip this step if the file does not exist yet; Step 2 creates it
   ```
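
To confirm that the virtual environment is actually active before installing anything, a small check like the one below may help. It is only a sketch and relies on the fact that `venv` makes `sys.prefix` differ from `sys.base_prefix`; the function name `in_virtualenv` is made up for this example.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
print("interpreter:", sys.executable)
```

If the first line prints `False`, activate `myEhealth` again in that terminal before running `pip install`.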


### ii Initial Exploration
0. Open the `Ehealth-Tutorial` folder in VS Code (or any other code editor).
1. **Navigate to the notebooks directory**:
   The `notebooks/` folder contains the Jupyter notebooks.
2. Open `sample_notebook.ipynb` to see how a typical student would experiment on a given task.
3. Open `notebook_to_codebase.ipynb` for instructions on how to convert a typical notebook into a structured codebase.

### iii New Tools and Libraries

- MLflow and WandB (Weights & Biases): used to track and log our experiments.
- pipreqs: generates `requirements.txt` from the imports in the code.
- ruff: lints and formats the code.
- OmegaConf and Hydra: used together to manage our configuration files.
- torch, tensorflow and sklearn: used to train, fine-tune, and evaluate our models.

**<H1 style="text-align:center;">  Start of the Experiment</H1>**

#### Explore the cloned repository. You will find that the filenames do not match the code they contain, and that the files are not in the right directories. Your task is to rename each of these files and move it to the appropriate directory.

**Step 0**
   - Solve the trivia!
   - Each trivia answer unlocks a zip file containing the code that belongs in the `src` directory.
   - All answers are in lowercase!

**Trivia 1**
   - What does this image say? Use the answer to unlock the zip file named ```data$Models```


![Image Description](1.png)

**Trivia 2**
   - Unscramble the word in this image. Use the answer to unlock the zip file named ```train$trainer```
   - Hint: The hyphens do matter.

   
![Image Description](2.png)

**Trivia 3**
   - What does this image say? Use the answer to unlock the zip file named ```others```
   - Hint: The answer can be abbreviated.
   
   
![Image Description](3.png)

**Step 1**
   - Move data loading code to `src/data.py`.
   - Place model architecture code in `src/models.py`.
   - Place model training code into `src/train.py` and `src/trainer.py`.
   - Place model evaluation code into `src/evaluate.py`.
   - Place image prediction code in `src/test.py`.
   - Place utility functions into `src/utils.py`.
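
You can do the moves above by hand, but a short script makes them repeatable. The sketch below assumes a mapping from the current filenames to the targets listed in Step 1. The keys `file_a.py` and `file_b.py` are placeholders, not the real names in the zip files, and `reorganize` is a hypothetical helper.

```python
from pathlib import Path
import shutil

# Hypothetical mapping: the keys are placeholders for whatever the unzipped
# files are actually called; the values are targets listed in Step 1.
RENAME_MAP = {
    "file_a.py": "src/data.py",
    "file_b.py": "src/models.py",
}


def reorganize(root: Path, rename_map: dict[str, str]) -> list[Path]:
    """Move each existing source file to its target path under root."""
    moved = []
    for old_name, new_name in rename_map.items():
        source = root / old_name
        target = root / new_name
        if not source.exists():
            print(f"skipping {old_name}: not found")
            continue
        if target.exists():
            # Never overwrite a file that is already in place.
            print(f"skipping {old_name}: {new_name} already exists")
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(source), str(target))
        moved.append(target)
    return moved
```

From the root of `Ehealth-Tutorial`, you would call `reorganize(Path("."), RENAME_MAP)` after filling in the real filenames.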

**Step 2** (Optional)
   - You will now learn how to extract dependencies and create a `requirements.txt` file.
   - Install pipreqs with `pip install pipreqs`.
   - In your terminal, navigate to the root folder (just inside `Ehealth-Tutorial`) and run `pipreqs .`.
   - This creates a `requirements.txt` file, which you can use to install the dependencies with the command in section i.

**Step 3** (Optional)
   - Here you will learn how to reformat a codebase automatically.
   - Go back to each file. You will notice that the line lengths, spacing, and indentation are inconsistent. We will use `ruff` to fix that!
   - Install `ruff` with `pip install ruff`.
   - In the terminal, inside `Ehealth-Tutorial`, run `ruff format .` to reformat your codebase.

**Step 4**
   - Run your first experiment!
   - Create two folders in the root directory: `checkpoints` and `metrics`.
   - In a terminal or command prompt, navigate into the source directory `src`.
   - Start the MLflow server: `mlflow server --host 127.0.0.1 --port 8080`
   - In another terminal or command prompt, navigate into the source directory `src`.
   - Run the experiment: `python train.py`
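
For orientation, here is a minimal sketch of the two things this step relies on: the output folders and a connection to the tracking server. It is not the repository's `train.py`. The helper names, the experiment name `ehealth-lab`, and the metric name `val_accuracy` are assumptions made for illustration. The MLflow calls used (`set_tracking_uri`, `set_experiment`, `start_run`, `log_params`, `log_metric`) are part of MLflow's standard Python API.

```python
from pathlib import Path

# Matches the address passed to `mlflow server` in this step.
TRACKING_URI = "http://127.0.0.1:8080"


def ensure_output_dirs(root: Path) -> list[Path]:
    """Create the checkpoints/ and metrics/ folders under root if missing."""
    dirs = [root / "checkpoints", root / "metrics"]
    for directory in dirs:
        directory.mkdir(parents=True, exist_ok=True)
    return dirs


def log_example_run(params: dict, val_accuracy: float) -> None:
    """Log one run to the tracking server (needs mlflow and a running server)."""
    import mlflow  # imported lazily so ensure_output_dirs works without mlflow

    mlflow.set_tracking_uri(TRACKING_URI)
    mlflow.set_experiment("ehealth-lab")  # hypothetical experiment name
    with mlflow.start_run():
        mlflow.log_params(params)
        mlflow.log_metric("val_accuracy", val_accuracy)  # hypothetical metric name
```

Calling `ensure_output_dirs(Path(".."))` from inside `src` would create both folders in the root directory. `log_example_run` only succeeds while the server from this step is running.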

**Step 5**
   - Identify the parameters and configuration values used in the first experiment.
   - Hint: The defaults are set in a `.yaml` file.

**Step 6**
   - To run your second experiment, modify the parameters.
   - Note: You can do this directly from the command line.
   - You can try `python train.py model.model_name=mobilenetv3 train.batch_size=32 train.epochs=2`
   - You can run multiple experiments, depending on how much time you have.
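
The command-line overrides above use dotted keys that address nested entries of the configuration. Hydra handles this for you. The sketch below only imitates the idea with plain dictionaries so you can see how `train.batch_size=32` lands in the `train` group. The default values in `DEFAULTS` are invented for the example (the real ones live in the repository's `.yaml` file), and `parse_value` and `apply_overrides` are hypothetical helpers, not Hydra functions.

```python
import ast
import copy

# Hypothetical defaults; the real values live in the repository's .yaml file.
DEFAULTS = {
    "model": {"model_name": "resnet18"},
    "train": {"batch_size": 16, "epochs": 10},
}


def parse_value(text: str):
    """Turn '32' into 32 and leave plain words such as 'mobilenetv3' as strings."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return text


def apply_overrides(config: dict, overrides: list[str]) -> dict:
    """Return a copy of config with each 'a.b=value' override applied."""
    result = copy.deepcopy(config)
    for item in overrides:
        dotted_key, _, raw_value = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = result
        for key in parents:
            node = node[key]  # raises KeyError for unknown groups
        if leaf not in node:
            raise KeyError(f"unknown config key: {dotted_key}")
        node[leaf] = parse_value(raw_value)
    return result


overrides = ["model.model_name=mobilenetv3", "train.batch_size=32", "train.epochs=2"]
print(apply_overrides(DEFAULTS, overrides))
```

Rejecting unknown keys mirrors Hydra's default behaviour, where an override for a key that is not in the config fails unless it is prefixed with `+`.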

**Step 7**
   - Use MLflow to visualize your experiments.
   - Run `mlflow ui`, or, if the server from Step 4 is still running, open `http://127.0.0.1:8080` in a browser.