# Overview

Before class, you should have completed several DataCamp courses introducing you to Python, so you should already have some familiarity with Python syntax and data structures. This week, you will start running Python on your own machine. We will be working in **Jupyter Notebooks**, an environment that allows for interactive coding and display.

We'll start with a quick overview of how to use Jupyter Notebooks. Then we'll review some of the main Python packages we'll be using in class for visualizing and analyzing data. After class, a couple of notebooks are available for you to optionally review Python and SQL basics.

## Agenda
- Introduction to Using Jupyter Notebooks
- Data Visualization in Python
- Using Pandas to Process and Analyze Data
- (Optional, at home) Python and SQL Review

## TODO and Discussion items
Throughout these notebooks, you'll see sections marked **TODO** or **Discussion**. Whenever you see a TODO section, you'll be asked to edit or complete code. Parts of the code which you need to edit will have placeholders of three underscores: `___`. You should replace these underscores with the correct snippet of code. This syntax should be familiar to you as the convention used by DataCamp.

For example, if you are asked to edit the cell below to print out the text "Hello, world!", you should change this cell (which won't run correctly until you change it):

In [None]:
___("Hello, world!")

To this:

In [None]:
print("Hello, world!")

# I. Jupyter Notebooks
Jupyter Notebooks are an environment which can be used for running code, displaying results and visualizations, and sharing human-readable information. Jupyter notebooks consist of *cells*, and each cell holds a single block of either text or code.

The cell which you're reading now is called a *Markdown* cell: It is meant to be human-readable and allows formatting like:
- Bulleted or numbered lists
- **bold text**
- *italics*

Double-click on this cell to see what the raw markdown looks like. Then press `"Run"` or hit "Shift+Enter" to re-execute the cell so it renders in your browser. 

The other main cell type is the **Code** cell. This contains executable Python code. When you run a code cell, it will execute the code within and display the output. This way, you can go through a notebook and run code step-by-step, inspecting the output as you go.


You can change a cell type by clicking the dropdown menu above which says **"Markdown"**. To make it a Python cell, select **"Code"**.

In [None]:
print("This is a code cell.")

### TODO
Change the cell below to be a code cell. Then run the cell by either clicking the `"Run"` button above (it looks like a "play" button) or hitting "Shift+Enter".

print("Hello, there!")

Note that in the code cell above, when we execute the cell it runs the single line of code and then displays the output underneath.
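
A code cell can also hold several lines. As a minimal sketch (the variable name `total` is only an illustration), the cell below prints one line and then ends with a bare expression. In a notebook, the printed text appears first, and the value of the last expression is displayed underneath it:

```python
# Compute a value and store it in a variable
total = 1 + 2

# print() writes text to the cell's output area
print("The total is", total)

# In a notebook, the value of the last expression in a cell is displayed too
total
```

If you run the same lines as a plain Python script instead of in a notebook, only the `print` line produces output.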

To create a new cell, press the **"+"** button in the menu. The default cell type is **Code**.

### TODO
Create two new cells below. Make the first cell a Markdown cell. Copy and paste the following text into the Markdown cell and edit it with your information. Run the cell. Notice how the formatting of the text is rendered once you execute the cell.

```
- **First Name**: your first name
- **Last Name**: your last name
- **Major**: Your major
```

Then, make the second cell a Code cell. Copy and paste the following code and execute it:

```
print("1 + 2 = ", 1 + 2)
```

# II. Importing libraries

Python is an open-source language with a large community. Many other people have written code which is useful to us and which we want to use in our own projects. This shared code is packaged into libraries. To use a library in our own code, we load it with the `import` statement:

In [None]:
import math

Now we can use anything which was defined in the `math` library:

In [None]:
math.sqrt(4)

In [None]:
math.floor(2.1)

In [None]:
math.pi
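
You will often see a few other forms of the `import` statement in example code. The sketch below shows three common forms using the same `math` library; the alias `m` and the variable `area` are chosen only for illustration:

```python
import math                 # import the whole module
import math as m            # the same module under a shorter alias
from math import sqrt, pi   # import specific names directly

print(math.sqrt(16))   # 4.0
print(m.floor(2.9))    # 2
print(sqrt(9))         # 3.0

# Area of a circle with radius 2, using pi without the "math." prefix
area = pi * 2 ** 2
print(round(area, 2))  # 12.57
```

All three forms load the same library. They differ only in how you refer to its contents afterwards.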

# III. Installing packages
Before importing a package (also called a library), you first have to have it installed. Some libraries, such as `math` and `random`, are part of Python's standard library, so they are always available. Other times we'll have to install the package ourselves. This is something which you don't have to deal with in DataCamp, and it can be a little confusing until you get used to it.

If you try to import a module which you don't have installed, you'll get a `ModuleNotFoundError`. This is going to be a very frequent error which we run into, so it's important that you know how to read it and how to fix it.

Let's look at an example here. We'll try to import a package called `spacy`, which we'll use in the NLP module later in the course:

In [None]:
import spacy

Unless you've already done some work installing packages, you should have gotten an error message which said:

**ModuleNotFoundError: No module named 'spacy'**

Python is telling us that we haven't installed spacy yet. 
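
To see how the error message is put together, here is a small sketch that triggers the same error on purpose and catches it. The module name `not_a_real_package_xyz` is made up so that the import is expected to fail, and `missing_name` is just an illustrative variable:

```python
try:
    # Hypothetical module name, chosen so that this import fails
    import not_a_real_package_xyz
except ModuleNotFoundError as err:
    # err.name holds the name of the module Python could not find
    missing_name = err.name
    print("Python could not find:", missing_name)
    print("A likely fix: pip install", missing_name)
```

The name in the error message is the name used in the `import` statement. For most packages this is also the name you install, but not always, so check the package's documentation if the install fails.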

To install packages we'll use a command-line tool called `pip`, which stands for **"PIP Installs Packages"**.

- Open a terminal (**"Terminal"** for Mac, **"Anaconda Prompt"** for Windows)
- Type:
```bash
pip install spacy
```

The console will print out a lot of information. You might get some warnings, which are often safe to ignore. You might get an error, in which case you'll need to do a little troubleshooting. But you should eventually get a message like this:

```bash
Successfully installed spacy-...
```
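
If you are unsure whether an install worked, one way to check from inside a notebook is to ask Python which version is installed. This is a minimal sketch using the standard library's `importlib.metadata` module (Python 3.8 or newer); the helper name `installed_version` is an assumption made for this example:

```python
from importlib.metadata import version, PackageNotFoundError

def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None

# Prints a version string if spacy is installed, otherwise None
print(installed_version("spacy"))
```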

Now, you can import spacy:

In [None]:
import spacy

We'll typically start out a coding module by installing a list of packages which will be needed for that week. If you run into errors, try to read the error message. Googling any errors is often the best way to find a solution.
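
Before starting a module, it can help to check which of the required packages are still missing. The sketch below is one possible way to do that with the standard library's `importlib.util.find_spec`; the helper `missing_packages` and the list `week_modules` are hypothetical, and the last entry is a made-up name included to show what a missing package looks like:

```python
from importlib.util import find_spec

def missing_packages(module_names):
    """Return the names from module_names that cannot be found for import."""
    # find_spec returns None when a top-level module is not available
    return [name for name in module_names if find_spec(name) is None]

# Hypothetical list of modules needed for one week
week_modules = ["math", "random", "not_a_real_package_xyz"]

for name in missing_packages(week_modules):
    print("Not installed:", name)
```

Any name that is printed can then be installed from the terminal with `pip install`, as described above.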

# Next Steps
Now that we've reviewed some basic usage of Python, let's explore some of the main libraries we'll be using in this course. Our coding workshops will be oriented to data science and we'll focus on a few very powerful data science libraries.

[./01-visualization_packages.ipynb](./01-visualization_packages.ipynb)