# Software Upgrade
### `! git clone https://github.com/ds4e/upgrade`

## 

- Disclaimer: I have never used a Mac and only use RStudio in Windows. I literally do not know what happens when you turn on a Mac. One reason I use VS Code below is to avoid dealing with your operating systems directly.
- If you use Windows, I highly recommend using the Windows Subsystem for Linux (WSL)

## Required Software
- There is some software you need installed:
    1. Python, version 3: https://www.python.org/downloads/
    2. Git: https://git-scm.com/downloads 
    3. VS Code: https://code.visualstudio.com/ 
    4. Pip is normally included with Python, but if it is missing you can install it separately: https://pip.pypa.io/en/stable/installation/

## Opening a Command Line/Terminal Window
- We want to talk directly to the computer:
    - On Windows, press the Windows key, type `cmd`, and press Enter
    - On a Mac, click Applications > Utilities > Terminal
    - On Linux with a desktop interface, press CTRL+ALT+T or pick the terminal option in the menu
- This lets you talk directly to your operating system, as it were, and ask it to run programs for you
- Open a terminal window, or figure out how to do it
- Typically, the command line also tells you where you are in the computer's file system
    - For PC/Mac/Linux, type `dir`/`ls`/`ls` to see a list of the files and directories at your current location
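
The same "where am I, and what is here?" question can also be asked from inside Python. This is only a small sketch using the standard library; the helper name `list_here` is made up for illustration.

```python
from pathlib import Path


def list_here(location="."):
    """Return the sorted names of the files and directories at `location`."""
    # iterdir() also returns names that start with a dot,
    # which a plain `ls` would leave out.
    return sorted(entry.name for entry in Path(location).iterdir())


# Path.cwd() is the current working directory, i.e. "where you are".
print("Current location:", Path.cwd())
for name in list_here():
    print(name)
```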

## Using VS Code
- I think it's great to be experienced and comfortable using the command line; I hopped on AWS for the first time, and there was nothing to learn because I'd been using Linux for years
- But VS Code certainly flattens the learning curve
- Open VS Code and navigate somewhere "safe" to work
- Create a new directory, and open that folder

## Using VS Code
- Click `Terminal`, `New Terminal`. A terminal panel should pop up at the bottom of your VS Code session, already in the directory where we're working
- Type
    - `python3 --version`
    - `pip --version`
- If you get an error, something is wrong with Python or pip
- On many Macs and some Linux distributions, `python` refers to version 2 of Python, but we want version 3. If `python --version` reports version 3 on your machine, go ahead and type `python` instead of `python3`
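
If you would rather check from inside Python, something like the following sketch should work. The function names are hypothetical; it only reports what the interpreter that runs it can see, so run it with the same command (`python` or `python3`) you plan to use.

```python
import importlib.util
import sys


def python_is_version_3():
    """True when the interpreter running this code is Python 3."""
    return sys.version_info.major == 3


def pip_is_available():
    """True when the pip package can be found by this interpreter."""
    return importlib.util.find_spec("pip") is not None


print("Python version:", sys.version.split()[0])
print("Is Python 3:", python_is_version_3())
print("pip found:", pip_is_available())
```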

## Virtual Environments
- As a programmer working on many different projects, you will probably need many different versions of packages
- This becomes a problem: I don't necessarily want everything related to machine learning installed in the same space as everything related to, say, web design. Or one project needs version 1.2 of a package while another project needs version 2.4 of the same package.
- In order to compartmentalize your projects, Python includes a concept of a "virtual environment", where packages are installed into a directory with more limited scope
- This makes it easier to tailor your computing environment to the task at hand

## Virtual Environments 
- A virtual environment is just a directory with the desired packages (e.g. NumPy, SciPy, scikit-learn) installed in it, typically in the same directory as your code
- Typically, it's named something like `.MLproject` or `.ds_packages`
- VS Code can "see" your virtual environments and make them available as "kernels" for your work
- This allows us to curate our computing environments and manage them more effectively

## Creating/Activating a Virtual Environment
- In the terminal panel, type `python3 -m venv test_venv`
- You should see a new directory appear called `test_venv`. This is like a fresh install of Python, with no packages already downloaded
- To switch to this virtual environment, type
    - Mac/Linux: `source test_venv/bin/activate`
    - Windows: `test_venv\scripts\activate`
- You should see `(test_venv)` appear in your command line, telling you the virtual environment is activated
- In both cases, there is a script called `activate`, and you are telling your operating system to run it
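
To see what the `venv` command leaves on disk, here is a rough sketch that builds an environment with Python's standard `venv` module and locates its `activate` script. It works in a throwaway temporary directory and skips installing pip (`with_pip=False`) to keep things fast, so it is not identical to the terminal command; the helper names are invented for this example.

```python
import os
import tempfile
import venv
from pathlib import Path


def create_environment(env_dir):
    """Create a virtual environment at env_dir (without pip) and return its path."""
    # Symlinks are the command-line default on Mac/Linux, copies on Windows.
    venv.create(env_dir, with_pip=False, symlinks=(os.name != "nt"))
    return Path(env_dir)


def activate_script(env_dir):
    """Return where the activate script lives for this operating system."""
    subfolder = "Scripts" if os.name == "nt" else "bin"
    return Path(env_dir) / subfolder / "activate"


# Build the environment in a scratch directory that is deleted afterwards.
with tempfile.TemporaryDirectory() as scratch:
    env = create_environment(Path(scratch) / "test_venv")
    print("pyvenv.cfg present:", (env / "pyvenv.cfg").exists())
    print("activate script:", activate_script(env))
```

The `pyvenv.cfg` file is what marks a directory as a virtual environment; everything else is just a Python executable plus a place to install packages.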

## Virtual Environments and .ipynb
- In VS Code, right click in the file panel and create a new file called `test.ipynb`
- Open `test.ipynb` as a Jupyter notebook and click on `Select Kernel` in the upper right. Under `Python Environments` you should see `test_venv`
- This will probably set off a bunch of installing packages and VS Code extensions, particularly `ipykernel`, in order to use VS Code and Jupyter together; just approve everything
- Now your terminal and ipynb file are both using the same `test_venv` virtual environment
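
One way to confirm which environment a notebook is really using is to run a cell like the sketch below. It relies on the fact that, inside an environment created by `venv`, `sys.prefix` differs from `sys.base_prefix`; the function name is just illustrative.

```python
import sys


def in_virtual_environment():
    """True when this interpreter is running inside a venv-style environment."""
    # sys.prefix points at the environment; sys.base_prefix points at the
    # Python installation the environment was created from.
    return sys.prefix != sys.base_prefix


print("Interpreter:", sys.executable)
print("Environment:", sys.prefix)
print("In a virtual environment:", in_virtual_environment())
```

If the kernel is set correctly, the interpreter path printed by the notebook should sit inside your virtual environment's directory.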

## Installing Packages into the Virtual Environment
- In the `test.ipynb` notebook, try importing NumPy; it should fail
- In the terminal at the bottom of VS Code -- with the virtual environment activated -- type `pip3 install numpy`
- In the `test.ipynb` notebook, try importing NumPy again; it should succeed 
- You can quickly set up an environment for our class with the command
    - `pip install numpy matplotlib pandas seaborn scikit-learn`
- Remember, in order to install into your virtual environment, **it must be activated in the terminal where you're running the pip command**
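
A quick way to check whether the class packages landed in the environment you are actually using is to ask Python which ones it can find. This is a hedged sketch: `CLASS_PACKAGES` and `missing_packages` are names made up here, and note that `scikit-learn` is imported as `sklearn`.

```python
import importlib.util

# pip name -> import name (they differ for scikit-learn)
CLASS_PACKAGES = {
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "pandas": "pandas",
    "seaborn": "seaborn",
    "scikit-learn": "sklearn",
}


def missing_packages(packages=CLASS_PACKAGES):
    """Return the pip names of packages this interpreter cannot find."""
    return [
        pip_name
        for pip_name, import_name in packages.items()
        if importlib.util.find_spec(import_name) is None
    ]


missing = missing_packages()
if missing:
    print("Still missing; try: pip install " + " ".join(missing))
else:
    print("All class packages are available")
```

Run it in the notebook rather than a separate terminal, so it reports on the kernel's environment.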

## Hiding Your Virtual Environment
- You typically want to hide the virtual environment, in particular from Git
- To set up a hidden virtual environment, put a . at the beginning of the name of the virtual environment: `python3 -m venv .secret_venv`
- In my VS Code session, I can see the virtual environment, but on Mac/Linux it does not show up when I type plain `ls` in the terminal (`ls -a` does list it)
- Most importantly, it does show up as an option for picking a kernel for VS Code
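
One caveat: a leading dot hides the directory from a plain file listing, but Git does not skip dot-directories on its own. The usual way to keep Git away from an environment is to list it in a `.gitignore` file. The sketch below is one possible way to do that from Python; the function name is hypothetical, and the demo runs in a temporary directory so nothing real is modified.

```python
import tempfile
from pathlib import Path


def ignore_in_git(project_dir, env_name):
    """Add `env_name/` to project_dir/.gitignore unless it is already listed."""
    gitignore = Path(project_dir) / ".gitignore"
    entry = env_name.rstrip("/") + "/"  # trailing slash: match directories only
    lines = gitignore.read_text().splitlines() if gitignore.exists() else []
    if entry not in lines:
        lines.append(entry)
        gitignore.write_text("\n".join(lines) + "\n")
    return gitignore


# Demonstrate in a throwaway directory so no real project is touched.
with tempfile.TemporaryDirectory() as scratch:
    path = ignore_in_git(scratch, ".secret_venv")
    ignore_in_git(scratch, ".secret_venv")  # a second call changes nothing
    print(path.read_text())
```

Writing the line `.secret_venv/` into `.gitignore` by hand has exactly the same effect.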

## How to Use Virtual Environments Effectively
- If I were you, I would:
    1. Pick a super-folder for my projects related to class; let's call it `FML` for "Foundations of Machine Learning"
    2. Create a virtual environment in that folder, called something like `.ds`, and install numpy/matplotlib/pandas/seaborn/scikit-learn into it
    3. When I want to work locally, navigate to FML and clone repositories from GitHub into it
    4. Use the `.ds` kernel in VS Code, but don't put a virtual environment into each of the course repos (that would use a ton of disk space and drive Git crazy)
- Let's do this, and make sure it works effectively

## 1. The Super-Folder

## 2. The Virtual Environment

## 3. Working with GitHub and VS Code

## 4. Using the .ds Kernel

- If pip/venv are working for you, the next step for machine learning/data science is miniconda: 
    1. Miniconda (not Anaconda): https://www.anaconda.com/docs/getting-started/miniconda/install#quickstart-install-instructions