# LAB | MLOps Deployment from DEV to PROD


## Objective
In this exercise, you will work in pairs to simulate the process of deploying code from a development environment (DEV) to a production environment (PRD). One student will act as the developer pushing code, while the other will be the gatekeeper ensuring the code runs correctly.

## Instructions

### Step 1: Form Pairs
- Split into pairs

### Step 2: Developer Role
1. **Push Code to Repository**
   - Create a new project or use an existing one
   - Copy the code to a folder (which will be a git repo in a few minutes)
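   - Create a venv, for example with `python -m venv .venv` (see the previous class in case you missed something)
   - Activate the venv and install all the necessary packages
   - Create the `requirements.txt` file, for example with `pip freeze > requirements.txt`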

   - Initialize a git repository in your project folder:
     ```sh
     git init
     ```
   - Add your project files to the repository - **do not add the venv folder to git**:
     ```sh
     git add [each_file]
     ```

   - Commit your changes:
     ```sh
     git commit -m "Initial commit"
     ```
   - Push your project to a remote git repository (GitHub, GitLab, etc.):
     ```sh
     git branch -M main   # name the current branch "main" (git init may default to "master")
     git remote add origin <remote-repo-url>
     git push -u origin main
     ```

2. **Create a Pull Request**
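   - Go to your remote repository and create a pull request (PR). A PR compares two branches, so push your next change on a separate branch (not `main`) and open the PR from that branch into `main`.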
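
One way to keep the venv out of the repository and to have a branch to open the PR from is sketched below. It assumes the venv lives in `.venv`; the branch name `feature/initial-project`, the file `main.py`, and the identity values are hypothetical placeholders, and the scratch directory is only there so the sketch never touches a real project.

```sh
# Work in a scratch directory so the sketch never touches a real project.
cd "$(mktemp -d)"
git init -q

# Identity set for this repository only (placeholder values).
git config user.name "Lab Student"
git config user.email "student@example.com"

# Stand-ins for the real project files and the venv folder.
mkdir .venv
touch .venv/pyvenv.cfg requirements.txt main.py

# Ignore the venv so that "git add ." cannot stage it by accident.
printf '.venv/\n__pycache__/\n' > .gitignore

git add .
git commit -q -m "Initial commit"
git branch -M main            # make sure the branch is called "main"

# Put the next change on its own branch; the PR is opened from this branch.
git checkout -q -b feature/initial-project
echo "# clean up" >> main.py
git commit -q -am "Describe the change here"

git ls-files                  # lists tracked files; nothing under .venv/ appears
```

In a real project the remote would then receive both branches (`git push -u origin main`, then `git push -u origin feature/initial-project`), and the PR would go from the feature branch into `main`.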

### Step 3: Gatekeeper Role
1. **Review and Pull Code**
   - Review the pull request created by your partner.
   - If everything looks good, merge the pull request.
   - Pull the latest changes from the remote repository:
     ```sh
     git pull origin main
     ```

2. **Setup Environment**
   - Navigate to the project directory.
   - Create and activate a fresh, empty venv.
   - Install the required dependencies listed in `requirements.txt`:
     ```sh
     pip install -r requirements.txt
     ```

3. **Run the Project**
   - Ensure the project runs without errors.
   - Provide feedback to the developer if there are any issues.
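
The gatekeeper steps above can be scripted so that a failed run leaves a log to send back to the developer. This is a minimal sketch for a POSIX shell; the helper name `run_and_report`, the log file `run.log`, and the entry point `main.py` mentioned in the comments are hypothetical.

```sh
# Hypothetical helper: run a command, keep its output in a log file,
# and print a one-line verdict the gatekeeper can paste into the PR.
run_and_report() {
    log_file="run.log"
    if "$@" > "$log_file" 2>&1; then
        echo "PASS: $*"
    else
        status=$?   # exit status of the command that just failed
        echo "FAIL (exit $status): $* (log: $log_file)"
        return "$status"
    fi
}

# Typical gatekeeper sequence (not executed here; it needs network access
# and the real project; the file name is an assumption):
#   python3 -m venv .venv && . .venv/bin/activate
#   pip install -r requirements.txt
#   run_and_report python main.py

run_and_report echo "hello"   # prints: PASS: echo hello
```

Because the helper returns the failing command's exit status, it can also be chained in a script, for example `run_and_report python main.py || exit 1`.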

### Step 4: Swap Roles and Repeat
- Swap roles and repeat the exercise, so each student gets to be both the developer and the gatekeeper.
  - Bonus: to simulate reality, try to do this again outside a video call (via Slack messages for example)

### Notes
- Ensure clear communication between partners during the process.
- Make use of version control best practices.
- Document any issues encountered and how they were resolved.

## Deliverables
- Each pair should have a functioning project that has been successfully reviewed, pulled, and run by the gatekeeper.
- A brief report on the process, any challenges faced, and how they were overcome.

Good luck and happy coding!




## Developer role

As the developer, I was responsible for creating the project `clean_csv` and ensuring it could run across different operating systems. Initially, I worked with a `requirements.yml` file generated on macOS.

- **Project Creation:** Set up the codebase and organized the project files.
- **Virtual Environment:** Created a conda environment (`test_prod_dev`) to isolate dependencies and maintain reproducibility.
- **Dependency Tracking:** Identified that `requirements.yml` contained macOS-specific packages (e.g., `libcxx`, `libedit`, `appnope`) and a Mac-specific `prefix` path. I removed those and simplified the file for cross-platform use.
- **Git Usage:** Pushed the updated `requirements.yml` and code changes to the repository.
- **PR Creation:** Submitted a Pull Request (PR) to share the adapted environment setup with collaborators.
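
The clean-up described above can be approximated with standard text tools. The sketch below assumes the file follows the usual `conda env export` layout; the sample content, the version strings, the prefix path, and the input file name `environment_macos.yml` are made up for illustration, while the list of packages to drop comes from this report.

```sh
cd "$(mktemp -d)"   # scratch directory for the demo

# Made-up example of a macOS export (versions and path are placeholders).
cat > environment_macos.yml <<'EOF'
name: test_prod_dev
channels:
  - defaults
dependencies:
  - python=3.11
  - pandas=2.1.4
  - libcxx=14.0.6
  - libedit=3.1.20221030
  - appnope=0.1.2
prefix: /Users/someone/miniconda3/envs/test_prod_dev
EOF

# Drop the prefix line and the macOS-only packages named in the report.
grep -v -E '^prefix:|^[[:space:]]*-[[:space:]]*(libcxx|libedit|appnope)([=[:space:]]|$)' \
    environment_macos.yml > environment_windows.yml

cat environment_windows.yml
```

Another option is `conda env export --from-history`, which lists only the packages that were explicitly requested and so usually leaves out OS-specific transitive dependencies; the `prefix:` line may still need to be removed by hand.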

## Gatekeeper role

The gatekeeper was responsible for verifying the changes made by the developer:

- **PR Review:** Reviewed the developer’s PR to confirm that OS-specific dependencies were removed and only cross-platform packages remained.
- **Repository Pull:** Pulled the updated repository from GitHub to ensure local code and environment files matched the developer’s changes.
- **Virtual Environment:** Created the virtual environment on Windows using the updated YAML file:
  ```sh
  conda env create -f environment_windows.yml -n test_env
  ```
- **Dependency Installation:** Verified that conda installed the required packages without errors.
- **Project Run:** Activated the environment and executed the project code to confirm that the application ran correctly:
  ```sh
  conda activate test_env
  python run.py
  ```

## Swap Roles for End-to-End Process Simulation

To simulate a real-world development workflow, roles were swapped:

The developer became the gatekeeper, testing the project setup and validating reproducibility.

The gatekeeper took the developer role, making new changes (e.g., adding missing dependencies such as pandas, numpy, or scikit-learn where needed) and opening a PR.

This simulation ensured that the process works seamlessly in both directions and validated the project’s cross-platform compatibility.

## Documentation and Brief Report on Process and Challenges

**Process:** The project required adapting a macOS-generated `environment.yml` for Windows. This involved stripping away OS-specific dependencies, removing the `prefix` path, and simplifying the environment file. The updated file was committed, reviewed, pulled, and tested successfully on Windows.

**Challenges:**

- OS-specific dependencies caused errors on Windows.
- The `prefix` path pointed to a Mac directory, which had to be removed.
- Ensuring minimal but complete dependencies required careful validation.

**Outcome:** A clean, cross-platform `environment_windows.yml` was created, enabling reproducible environments on both macOS and Windows.

**Lesson Learned:** It is best to keep environment files minimal and cross-platform, adding only essential dependencies. System-specific details (like `prefix` paths or OS-specific libraries) should not be included in shared environment files.