# Development Environment Lab

## Overview

In this lab we will set up the scaffolding for your project: its file structure, documentation, and development environment.

## Goal

To instill the practice and discipline of standardizing your development process. Please do not become too comfortable with a *specific* way of doing things. For example, it's fine to prefer organizing your projects and documentation or styling your code a certain way, but when you join a new team, you may be expected to switch over to their way of doing things. You must do this, and be a team player, even if you believe your way is the *right* way. It isn't. The *right* way is whatever the team decides, and it may change from team to team and company to company. If you are very opinionated, you may be lucky enough to join a new team that has not yet established its best practices, and then you can have your say and influence the team.

## Instructions

### IDE

You should start by deciding whether you want to work out of JupyterLab or try something new like PyCharm or VS Code. If you go with Jupyter, you may want to try JupyterLab Desktop. I, personally, will be using VS Code for all demos.

### Virtual Environments

Next, let's create a virtual environment for the work you will be doing in this class. You can use Python's built-in virtual environments, or you can use conda. It is up to you.

Using conda, here are some commands that might be useful.

`conda -V`  
`conda update conda`  

Create and activate a virtual environment with Python 3.9 in it.  
`conda create --name mlops python=3.9`  
`conda activate mlops`  
`conda info --envs`

If you need to switch out of the virtual environment, you can use  
`conda deactivate`  
Running `conda activate` with no environment name switches to the `base` environment instead.

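Or you can use Python's built-in `venv` module to create and activate a virtual environment. The activation command below is for macOS and Linux; on Windows, run `mlops\Scripts\activate` instead.  
`python3 -m venv mlops`  
`source mlops/bin/activate`  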

You should notice, in the terminal, that you have switched over to the virtual environment. To deactivate, simply run  
`deactivate`
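
If the terminal prompt does not make it obvious which environment is active, you can also ask the interpreter. The following is a minimal sketch, not part of the lab's required steps. It assumes that conda exports the `CONDA_DEFAULT_ENV` variable when an environment is active and that a `venv` environment makes `sys.prefix` differ from `sys.base_prefix`; the function name `active_environment` is made up for illustration.

```python
import os
import sys


def active_environment() -> str:
    """Return a short description of the environment running this interpreter."""
    # Inside a venv, sys.prefix points at the environment folder while
    # sys.base_prefix still points at the base Python installation.
    if sys.prefix != sys.base_prefix:
        return f"venv at {sys.prefix}"
    # conda sets CONDA_DEFAULT_ENV to the name of the active environment.
    conda_env = os.environ.get("CONDA_DEFAULT_ENV")
    if conda_env:
        return f"conda environment '{conda_env}'"
    return "no virtual environment detected"


if __name__ == "__main__":
    print(active_environment())
```

Run it once before and once after activating `mlops` to see the difference.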

1. In the terminal, navigate to where you would like to create your mlops project folder, create a folder for your project work, then cd into it. Now might be a good time to initialize a git repository using `git init`.
2. Create a virtual environment, call it mlops, and activate it.
3. Create a requirements.txt file. Include mlflow, scikit-learn, pandas, and numpy. I'm not going to be prescriptive, so please include whatever version number you want for each of these libraries.  
4. Install libraries from your requirements.txt file using `pip install -r requirements.txt`.
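
As an optional sanity check for step 3, the sketch below reads the text of a requirements file and reports which of the four libraries are missing. It is a rough illustration under stated assumptions: the helper names `listed_packages` and `missing_packages` are hypothetical, and the parsing is deliberately simple (one requirement per line, name first) rather than a full implementation of the requirements file format.

```python
import re
from pathlib import Path

# The four libraries this lab asks for.
REQUIRED = {"mlflow", "scikit-learn", "pandas", "numpy"}


def listed_packages(requirements_text: str) -> set[str]:
    """Extract lowercase package names from requirements.txt content."""
    names = set()
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        # The name is the leading run of letters, digits, '.', '_' and '-';
        # anything after it (==, >=, extras, ...) is ignored.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.add(match.group(0).lower().replace("_", "-"))
    return names


def missing_packages(requirements_text: str) -> set[str]:
    """Return the required libraries that the file does not list."""
    return REQUIRED - listed_packages(requirements_text)


if __name__ == "__main__":
    path = Path("requirements.txt")
    if path.exists():
        print(missing_packages(path.read_text(encoding="utf-8")) or "all listed")
```
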

`pip freeze` or `pip list` will allow you to see the libraries you've installed. In a freshly created virtual environment, before you install anything, `pip freeze` should return nothing, while `pip list` will typically still show pip itself.
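
The same information is available from inside Python through the standard library's `importlib.metadata` module. The sketch below is an illustration only; the function name `installed_packages` is hypothetical, and the output is formatted to resemble `pip freeze` without trying to reproduce it exactly.

```python
from importlib import metadata


def installed_packages() -> dict[str, str]:
    """Map each installed distribution name to its version."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions whose metadata is incomplete
            packages[name] = dist.version
    return packages


if __name__ == "__main__":
    # Print in a name==version style, sorted case-insensitively.
    for name, version in sorted(installed_packages().items(),
                                key=lambda item: item[0].lower()):
        print(f"{name}=={version}")
```
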


We can, but are not required to, create a **setup.py** file which would hold some metadata about our project, and also details about the libraries we installed (e.g. which libraries are *required* in production versus which are used only for development purposes). Later on in the course we will create a **pyproject.toml** file which will hold some other configurations for us, but I find having a separate setup.py file to be just another list of things that I will forget to keep track of, so I won't be using it.

### Organization and Documentation

Now that your environment is ready, let's create a project folder structure, begin our documentation, and push everything to GitHub. The project, labs, and demos will involve a lot of typical ML training work, like adding data, writing model-training scripts, using Jupyter notebooks, and writing other pieces of code.

1. Create other folders for your project. You may want a folder for all of your project notebooks, a folder for data, a folder for models. Start with the bare minimum, and add new folders when you need them, rather than creating a bunch of folders that you may not ever need.
2. Create a README.md file. Remember to come back to this file and describe your project in detail. Provide a project description, data sets, data sources, and description of solution.
3. Commit your changes (which at this point are really just the requirements.txt and README.md files) and push to GitHub.
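
Steps 1 and 2 can be done by hand, but a small script makes the layout repeatable. The sketch below is one possible starting point, not a required structure: the folder names, the README headings, and the `scaffold` function are all assumptions chosen for illustration. It adds a `.gitkeep` placeholder to each folder because Git does not track empty directories.

```python
from pathlib import Path

# Hypothetical starting layout; rename or drop folders to suit your project.
FOLDERS = ("notebooks", "data", "models")

# Skeleton headings taken from what the README is expected to describe.
README_TEMPLATE = """# {title}

## Project Description

## Data Sets and Sources

## Description of Solution
"""


def scaffold(root: Path, title: str = "MLOps Project") -> list[Path]:
    """Create the project folders and a README skeleton under root."""
    root.mkdir(parents=True, exist_ok=True)
    paths = []
    for name in FOLDERS:
        folder = root / name
        folder.mkdir(exist_ok=True)
        # Placeholder file so the otherwise empty folder can be committed.
        (folder / ".gitkeep").touch()
        paths.append(folder)
    readme = root / "README.md"
    if not readme.exists():  # never overwrite an existing README
        readme.write_text(README_TEMPLATE.format(title=title), encoding="utf-8")
    paths.append(readme)
    return paths


if __name__ == "__main__":
    for path in scaffold(Path.cwd()):
        print(path)
```

Running the script a second time is harmless: existing folders are reused and an existing README.md is left untouched.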

### Final Project

To help with completing the final project you should have:

- An organized project folder structure  
- Code with appropriate comments, or alternatively a Read the Docs website  
- Your repo on GitHub should have a fully documented README.md file with *at least* the following sections:
    - Project Description  
    - Data Set Descriptions and Sources   
    - MLOps stages that were tested  
    - Which tools were tested  
    - Comparison of tools (pros and cons)
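
Before submitting, you may want to confirm that your README.md contains each of the sections listed above. The following sketch is one way to do that, under a few assumptions: each section is a Markdown `#`-style heading, the heading text begins with the section name from the list (case-insensitive), and the function name `missing_sections` is hypothetical.

```python
import re
from pathlib import Path

# Section names copied from the checklist above.
REQUIRED_SECTIONS = (
    "Project Description",
    "Data Set Descriptions and Sources",
    "MLOps stages that were tested",
    "Which tools were tested",
    "Comparison of tools",
)


def missing_sections(readme_text: str) -> list[str]:
    """Return the required sections that have no matching heading."""
    # Collect the text of every '#'-style heading, lowercased.
    headings = [
        match.group(1).strip().lower()
        for match in re.finditer(r"^#{1,6}[ \t]+(.+?)[ \t]*$", readme_text,
                                 flags=re.MULTILINE)
    ]
    return [
        section for section in REQUIRED_SECTIONS
        if not any(h.startswith(section.lower()) for h in headings)
    ]


if __name__ == "__main__":
    readme = Path("README.md")
    if readme.exists():
        for section in missing_sections(readme.read_text(encoding="utf-8")):
            print(f"Missing section: {section}")
```
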