# Setting up the Data Science Environment

One of the largest hurdles beginners face is setting up an environment in which they can quickly start analyzing data.

### Objectives

1. Understand the difference between interactive computing and executing a file
1. Ensure that Anaconda is installed properly with Python 3
1. Know what a path is and why it's useful
1. Understand the difference between Python, IPython, Jupyter Notebook, and Jupyter Lab
1. Know how to execute a Python file from the command line
1. Be aware of Anaconda Navigator
1. Know the most important Jupyter Notebook tips

# Interactive Computing vs Executing a File

### Interactive Computing
Nearly all the work that we do today will be done **interactively**, meaning that we will be typing one, or at most a few, lines of code into an **input** area and executing them. The result will be displayed in an **output** area.

### Executing a Python File
The other way we can execute Python code is by writing it within a file and then executing the entire contents of that file.

### Interactive Computing for Data Science
Interactive computing is the most popular way to analyze data using Python. You get instant feedback, which directs how the analysis progresses.

### Writing Code in Files to Build Software
All software is built from code written in text files. When a file is executed, its code runs in its entirety, and you cannot add or change code while it is running. Although most tutorials (including this one) will use an interactive environment to do data science, you will eventually need to take your exploratory work from an interactive session and put it inside a file.
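
As a rough sketch of what moving exploratory work into a file can look like, the hypothetical file below wraps a calculation in a function and runs it only when the file itself is executed. The file name `summarize.py`, the function `average`, and the sample data are all made up for illustration and are not part of this repository.

```python
# summarize.py: hypothetical file name used only for illustration

def average(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def main():
    # Made-up sample data standing in for whatever you explored interactively
    rainfall = [0.0, 1.2, 3.4, 0.6]
    print("Average rainfall:", average(rainfall))


# This block runs when the file is executed (python summarize.py),
# but not when the file is imported from another file or a notebook.
if __name__ == "__main__":
    main()
```

Keeping the logic in functions means the same code can later be imported into an interactive session as well as run as a file.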

# Ensuring that Anaconda is Installed Properly with Python 3
You should have already [downloaded Anaconda][1]. January 1, 2020 marked the last day that Python 2 was officially supported. Let's ensure that you are running a recent version of Python 3.

1. Open a terminal (Mac/Linux) or the Command Prompt (Windows; not the Anaconda Prompt) and enter **`python`**
1. Ensure that in the header you see Python version 3.X where X >= 6 
![][2]
1. If you don't see this header with the **`>>>`** prompt and instead see an error, work through the troubleshooting steps below.
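
Once you do reach a prompt, the same version check can be done in code rather than by reading the header. This is a minimal sketch using the standard library's `sys` module; the helper name `is_supported` is an assumption made for this example.

```python
import sys


def is_supported(version_info, minimum=(3, 6)):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


# The first word of sys.version is the version number shown in the header
print("Running Python", sys.version.split()[0])
print("Python 3.6 or later:", is_supported(sys.version_info))
```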

## Troubleshooting

### Windows
The error message that you will see is **`'python' is not recognized as an internal or external command...`**

This means that your computer cannot find where the program **`python`** is located on your machine. Let's find out where it is located.
1. Open up the program **Anaconda Prompt**
1. Type in **`python`** and you should now be able to get the interactive prompt
1. Exit out of the prompt by typing in **`exit()`**
1. The reason you cannot get **`python`** to run in the **Command Prompt** is that during installation you did not check the box to add Anaconda to your path, shown below
![](../images/addpath2.png)
1. It is perfectly fine to use **Anaconda Prompt** from now on
1. If you so desire, you can [manually configure][3] your **Command Prompt**

### Mac/Linux
The error message you should have received is **`python: command not found`**. Let's find out where Python is installed on your machine.

1. Run the command: **`$ which -a python`**    
![][4]
1. This outputs a list of all the locations where there is an executable file with the name **`python`**
1. This location must be contained in something called the **path**. The path is a list (separated by colons) containing directories to look through to find executable files
1. Let's output the path with the command: **`$ echo $PATH`**
![][5]
1. My path contains the directory (**`/Users/Ted/Anaconda/bin`**) from above so running the command **`python`** works for me.
1. If your path does not contain the directory output in step 1, then we will need to edit a file called **`.bash_profile`** (or **`.profile`** on some Linux machines)
1. Make sure you are in your home directory and run the command:
> **`nano .bash_profile`**
1. This will open up the file **`.bash_profile`**, which may be empty
1. Add the following line inside of it, substituting the directory you found in step 1: **`export PATH="/Users/Ted/anaconda3/bin:$PATH"`**
1. Exit (**`ctrl + x`**) and make sure to save
1. Close and reopen the terminal and execute: **`$ echo $PATH`**
1. The path should be updated with the Anaconda directory prepended to the front
1. Again, type in **`python`** and you should be good to go
1. **`.bash_profile`** is itself a file of commands that get executed each time you open a new terminal. 

### More on the path (all operating systems)
The path is a list of directories that the computer will search in order, from left to right, to find an executable program with the name you entered on the command line. It is possible to have many executables with the same name but in different folders. The first one found will be the one executed.
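
The left-to-right search can be sketched in a few lines of Python. The function below is a simplified illustration, not how the operating system actually implements the lookup, and the name `find_first_executable` is made up for this example. On Windows, executables also carry extensions such as `.exe`, which this sketch ignores.

```python
import os


def find_first_executable(name, path_string):
    """Search the directories in path_string from left to right and
    return the first executable file called name, or None if none exists."""
    for directory in path_string.split(os.pathsep):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate  # the first match wins; later directories are ignored
    return None


current_path = os.environ.get("PATH", "")

# Show the directories on the current path in the order they are searched
for position, directory in enumerate(current_path.split(os.pathsep), start=1):
    print(position, directory)

print("First match for python:", find_first_executable("python", current_path))
```

Swapping the order of two directories that both contain a program with the same name changes which one is found, which is why the Anaconda directory is prepended to the front of the path.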

### Displaying the path
* Windows: **`path`** or **`echo %PATH%`**
* Mac/Linux: **`$ echo $PATH`**

### Finding the location of a program
* Windows: **`where program_name`**
* Mac/Linux: **`which program_name`**
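
The same lookup is available from inside Python on every operating system through the standard library's `shutil.which`. This short sketch also prints `sys.executable`, the interpreter running the code, which may differ from the first `python` on the path. The wrapper name `locate` is an assumption made for this example.

```python
import shutil
import sys


def locate(program_name):
    """Return the full location of the first match on the path, or None."""
    return shutil.which(program_name)


print("python found at:", locate("python"))
print("this interpreter:", sys.executable)
```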

### Editing the path
* Windows: Use the [set (or setx)][6] command or a [GUI][7]
* Mac/Linux: By editing the **`.bash_profile`** as seen above

# python vs ipython
**`python`** and **`ipython`** are both executable programs that run Python interactively from the command line. The **`python`** command runs the default interpreter, which comes prepackaged with Python. For interactive work, there is almost no reason to run this program: it has been surpassed by **`ipython`** (interactive Python), which you also run from the command line and which adds functionality such as syntax highlighting and special commands.

# IPython vs Jupyter Notebook
The Jupyter Notebook is a browser-based version of IPython. Instead of being stuck within the confines of the command line, you are given a powerful web application that allows you to intertwine code, text, and images. [See this][8] for more details on the internals.
![][9]


# Jupyter Lab
Jupyter Lab is yet another interactive browser-based program that allows you to have windows for notebooks, terminals, data previews, and text editors all on one screen.

# Executing Python Files
An entire file of Python code can be executed either from the command line or from within this notebook. We execute the file by placing its location after the **`python`** command. For instance, if you are in the home directory of this repository, run the following on the command line to play a number guessing game.

**`python scripts/guess_number.py`**
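
To see the whole round trip in one place, the sketch below writes a tiny stand-in script to a temporary folder and runs it the same way the command line does, by passing the file's location to the Python executable. The script contents and the helper name `run_python_file` are made up for this example; it is not the repository's `guess_number.py`.

```python
import os
import subprocess
import sys
import tempfile

# A tiny stand-in script; it is NOT the repository's guess_number.py
script_source = 'print("Hello from a Python file")\n'


def run_python_file(path):
    """Run `python path` with the current interpreter and return what it printed."""
    result = subprocess.run(
        [sys.executable, path],
        stdout=subprocess.PIPE,
        universal_newlines=True,  # return text instead of bytes
        check=True,               # raise an error if the script fails
    )
    return result.stdout


with tempfile.TemporaryDirectory() as folder:
    script_path = os.path.join(folder, "hello.py")
    with open(script_path, "w") as script_file:
        script_file.write(script_source)
    output = run_python_file(script_path)

print(output)
```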

### Use a magic function to run a script inside the notebook
Instead of going to the command line, you can run a script directly in the notebook. Run the next two cells.


[1]: https://www.anaconda.com/download
[2]: ../images/pythonterminal.png
[3]: https://medium.com/@GalarnykMichael/install-python-on-windows-anaconda-c63c7c3d1444
[4]: ../images/which_python.png
[5]: ../images/path_mac.png
[6]: https://stackoverflow.com/questions/9546324/adding-directory-to-path-environment-variable-in-windows
[7]: https://www.computerhope.com/issues/ch000549.htm
[8]: http://jupyter.readthedocs.io/en/latest/architecture/how_jupyter_ipython_work.html
[9]: ../images/jupyter_internal.png

In [1]:
%matplotlib notebook

In [3]:
%run /Users/jasvirdhillon/Documents/GitHub/Intro-Data-Science-Python-master/scripts/rain.py

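
`%run` is an IPython magic, so it only works inside IPython or a notebook. In a plain Python session, the standard library's `runpy.run_path` does something roughly comparable: it executes a file and hands back the variables the file defined. The sketch below uses a made-up one-line script rather than the `rain.py` file run above.

```python
import os
import runpy
import tempfile

# Stand-in script; not the rain.py file referenced above
script_source = "total = sum(range(5))\n"

with tempfile.TemporaryDirectory() as folder:
    script_path = os.path.join(folder, "demo_script.py")
    with open(script_path, "w") as script_file:
        script_file.write(script_source)
    # Execute the file as if it were the main program and collect its variables
    script_globals = runpy.run_path(script_path, run_name="__main__")

print(script_globals["total"])
```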

# Anaconda Navigator vs Command Line
Anaconda comes with a simple GUI, Anaconda Navigator, for launching Jupyter Notebook, Jupyter Lab, and several other programs. It is just a point-and-click way of doing what you can also do from the command line.

# Important Jupyter Notebook Tips

### Code vs Markdown Cells
* Each cell is either a **Code** cell or a **Markdown** cell.
* Code cells always have **`In [ ]`** to the left of them and understand Python code
* Markdown cells have nothing to the left and understand [Markdown](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet), a simple language for quickly formatting text.
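
Under the hood, a notebook file (`.ipynb`) is a JSON document in which every cell records its type. The sketch below builds a minimal two-cell notebook as a plain dictionary using only the standard library. The cell contents are made up, and the field names follow version 4 of the notebook format to the best of my knowledge, so treat it as an illustration rather than a reference.

```python
import json

# A minimal, hand-written notebook structure (illustrative only)
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# A heading written in Markdown"],
        },
        {
            "cell_type": "code",
            "execution_count": None,  # shown as In [ ] until the cell is run
            "metadata": {},
            "outputs": [],
            "source": ["print('Hello from a code cell')"],
        },
    ],
}

cell_types = [cell["cell_type"] for cell in notebook["cells"]]
print(cell_types)

# The structure serializes to JSON, which is what gets saved to disk
notebook_text = json.dumps(notebook, indent=1)
```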

### Edit vs Command Mode
* Each cell is either in **edit** or **command** mode
* When in edit mode, the border of the cell will be **green** and there will be a cursor in the cell so you can type
* When in command mode, the border will be **blue** with no cursor present
* When in edit mode, press **ESC** to switch to command mode
* When in command mode, press **Enter** to switch to edit mode (or just click in the cell)

### Keyboard Shortcuts
* **Shift + Enter** executes the current code block and moves the cursor to the next cell
* **Ctrl + Enter** executes the current code block and keeps the cursor in the same cell
* Press **Tab** frequently when writing code to get a pop-up menu with the available commands
* When calling a method, press **Shift + Tab + Tab** to have a pop-up menu with the documentation
* **ESC** then **a** inserts a cell above
* **ESC** then **b** inserts a cell below
* **ESC** then **d + d** deletes a cell