# Introduction to Jupyter Notebooks
Welcome, and thanks for joining us for this course on working in Jupyter Notebook environments to process data. In this first lesson of our four-part series, we'll get you set up and running in a Jupyter Notebook environment. 

As we go, be sure to ask plenty of questions, and never hesitate to let us know if we're moving too quickly. 

Before we begin, let's give a quick overview of what we'll be learning in this course. This course is meant to be a quick 'soft' curriculum to get you acquainted with some of the basic concepts of the Python programming language and data manipulation.
We hope that what you learn in this class will give you a solid foundation and a jumping-off point to continue your research and grow your knowledge.

Generally, what we'll be teaching in this four-week curriculum falls within the field of Data Science. For those of you who don't know, Data Science is an interdisciplinary field that combines statistics and computer science. It can be defined as: <br>
**The application of data-centric, computational, and inferential thinking to**
1. Understand the world (science).
2. Solve problems (engineering).


## First Steps
Now it's time to get started and see what Jupyter Notebooks are all about. Jupyter Notebooks are, at their core, environments that allow us to write code and **interpret** it in real time. Though we're using Python for this series, many other languages are commonly used in notebooks as well: R, Julia, and MATLAB, to name a few, can all be used in Jupyter Notebook environments. Let's check out some basic things we can do, such as multiplication and division:

In [5]:
3 * 5

15

In [6]:
20 / 5

4.0
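Notice that the division above produced `4.0` rather than `4`. As a small sketch (the numbers here are our own examples, not part of the exercises), Python 3's `/` operator always returns a decimal number, called a float, while the `//` operator divides and rounds down to a whole number:

```python
# The / operator always produces a float, even when the answer is whole
quotient = 20 / 5
print(quotient)        # 4.0

# When the division is not even, / keeps the decimal part
half = 7 / 2
print(half)            # 3.5

# The // operator divides and rounds down to a whole number
rounded_down = 7 // 2
print(rounded_down)    # 3
```

In most of this course, plain `/` is what we want; `//` is handy when only whole numbers make sense.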

Now, try this on your own!

In [3]:
# TODO: Add six and seven


In [4]:
# TODO: Multiply two by twenty-four


In [5]:
# TODO: Divide six by four 


In [7]:
# TODO: Subtract twenty from seventeen


Simple enough! Next, let's try creating a **variable**. A variable stores a value so that we can access it later. For example, if I set the variable *Name* to "Brandon":

In [7]:
Name = "Brandon"

I can then use *Name* later to access the value stored there:

In [8]:
Name

'Brandon'
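Variables become more useful once we reuse them in other expressions. The sketch below is only an illustration; the names `name`, `greeting`, and `count` are our own and do not appear elsewhere in the lesson:

```python
# Store a value in a variable, then reuse it in another expression
name = "Brandon"
greeting = "Hello, " + name
print(greeting)        # Hello, Brandon

# A variable can be given a new value at any time
count = 3
count = count + 1
print(count)           # 4
```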

Try it yourself by setting the variable *seven* to the number seven.

Great work! One of the last important concepts that we'll learn about today is **lists**. Lists allow us to store multiple values at once. Think of it like a grocery list: if we want to remember to buy Bananas, Captain Crunch, and Pasta, we could write it down as
```
Bananas,
Captain Crunch,
Pasta
```
We can achieve the same in Python!

In [8]:
groceries = ['Bananas', 'Captain Crunch', 'Pasta']

Since we saved our list in a variable called *groceries*, we can check it out by running:

In [10]:
groceries

['Bananas', 'Captain Crunch', 'Pasta']

How about you try making your own shopping list similar to the one above?

In [None]:
# TODO: Create a shopping list with at least four items
shopping_list = []

Pretty sweet! Now, this isn't to say that you should keep grocery lists as Python lists from now on; I doubt you want to haul your computer to the store just to buy three things. But lists are one of the most important **collections** within Python when it comes to data science. Now let's see some of the cool things we can do with lists.

## Lists, Continued
Since we're storing multiple values within one variable, it's important that we're able to access the individual values in the list. To get certain values, we can use **indexing**. Brace yourselves, because here comes a pretty tricky concept to remember. Python uses **zero-based indexing**, which means that it starts counting from zero instead of from one. For example, to access the first item in the groceries list, we would use:

In [11]:
groceries[0]

'Bananas'
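To see the pattern on a different list, here is a short sketch. The `planets` list is a made-up example used only for illustration:

```python
# A hypothetical list used only for this illustration
planets = ['Mercury', 'Venus', 'Earth', 'Mars']

print(planets[0])      # Mercury (first item, index 0)
print(planets[1])      # Venus (second item, index 1)

# len() gives the number of items; the last valid index is one less
last_index = len(planets) - 1
print(last_index)            # 3
print(planets[last_index])   # Mars
```

The rule of thumb: the index is always one less than the item's position in the list.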

This feels pretty weird at first, but it'll make more sense as you go on and practice. Try it out yourself by grabbing the third item from the grocery list.

In [12]:
# TODO: Get the third item from the groceries list
groceries[ ]

SyntaxError: invalid syntax (<ipython-input-12-0ebdb8c3904c>, line 2)

Nice job. It's cool to be able to grab a single value from a list at a time, but what if I want to get several at once? This can be achieved using a colon when selecting values, which is known as **slicing**. If we wanted to fetch the first and second values from the groceries list, we could use:

In [13]:
groceries[0:2]

['Bananas', 'Captain Crunch']

Now, that may seem off at first: why would we specify `0:2` if we want the first and the second values? Python slicing with a colon always starts at the first index that you specify and goes up to (but not including) the last index that you specify. For example, if we wanted to get the second and third values, we could use:

In [15]:
groceries[1:3]

['Captain Crunch', 'Pasta']
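One way to check a slice is to count: a slice written as `start:stop` contains `stop - start` items. A minimal sketch, using a made-up `letters` list:

```python
# A hypothetical list used only for this illustration
letters = ['a', 'b', 'c', 'd', 'e']

# Start at index 1, stop before index 4
middle = letters[1:4]
print(middle)          # ['b', 'c', 'd']

# The slice holds stop - start items: 4 - 1 = 3
print(len(middle))     # 3
```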

Try it yourself with your own shopping list! Try getting the second and third values from your list below.

In [17]:
# TODO: Get the second and third values from your shopping list
shopping_list[]

SyntaxError: invalid syntax (<ipython-input-17-8279085d4b72>, line 2)

Great work! One last thing that we'll learn is shorthand notation for list slicing. That's a whole lot of fancy words for getting ranges of values from our list more easily. This is better shown than explained, so here are some examples of how to achieve certain tasks. For starters, we'll select all values up to (but not including) the third value in the groceries list.

In [18]:
groceries[:2]

['Bananas', 'Captain Crunch']

Notice that we don't have to put a zero in front of the colon to achieve this; Python is smart and knows to start at zero. What if we wanted to select all values from the second value onward?

In [20]:
groceries[1:]

['Captain Crunch', 'Pasta']
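The two shorthands fit together neatly: for any index, the "up to" slice and the "from here onward" slice split the list into two pieces with nothing missing and nothing repeated. A small sketch, again with a made-up `letters` list:

```python
# A hypothetical list used only for this illustration
letters = ['a', 'b', 'c', 'd', 'e']

# Everything before index 3
first_part = letters[:3]
print(first_part)      # ['a', 'b', 'c']

# Everything from index 3 to the end
second_part = letters[3:]
print(second_part)     # ['d', 'e']

# Joining the two pieces with + rebuilds the original list
print(first_part + second_part == letters)   # True
```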

Sick! Now, try it yourself with your shopping list.

In [21]:
# TODO: Select all values up to the fourth value in your shopping list 
shopping_list[]

SyntaxError: invalid syntax (<ipython-input-21-7626cf116992>, line 2)

In [22]:
# TODO: Select all values after the first value in your shopping list
shopping_list[]

SyntaxError: invalid syntax (<ipython-input-22-28aee99514f1>, line 2)

## Importing Packages
Before we can get started with writing our notebook and diving into some data, we have to import some packages. Packages are essentially pre-built bundles of code that help us with common tasks that would be slow or tedious to write ourselves in plain Python. For starters, we'll be working with two of the most commonly used packages in Python data science: NumPy and Pandas.
### NumPy
NumPy, short for **Num**erical **Py**thon, is a package that provides us with tools for working with lists of numbers, or **arrays**.
### Pandas
Pandas is a package that comes with many built-in tools for examining and manipulating data. We'll use this package a lot throughout this course to help us understand and dig deeply into our data. <br>

Without further ado, let's get started by importing both NumPy and Pandas. You'll notice that we're importing numpy "as np" and importing pandas "as pd". All this means is that we're choosing to rename the packages as we import them. This is only done because we're pretty lazy, and don't feel like typing out "numpy" or "pandas" every time we want to use the package; "np" and "pd" are much quicker to type.


In [1]:
import numpy as np 
import pandas as pd 
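With both packages imported, here is a quick taste of what each one offers. This is only a sketch with made-up values (the `ages` array and `students` table are our own examples, not the course data):

```python
import numpy as np
import pandas as pd

# NumPy: an array applies arithmetic to every number at once
ages = np.array([17, 16, 18])
ages_next_year = ages + 1
print(ages_next_year.tolist())   # [18, 17, 19]

# pandas: a DataFrame is a table with named columns
students = pd.DataFrame({"Name": ["Carlos", "Sarah"], "Age": [17, 16]})
print(students.shape)            # (2, 2), meaning 2 rows and 2 columns
print(students["Age"].mean())    # 16.5
```

Notice that `np.` and `pd.` are the short names we chose when importing.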

## Getting our data
Typically, this is one of the trickiest steps in data science; data rarely comes clean, and rarely comes bundled up in one convenient source. Luckily, we've pre-bundled the data so that it's easy to import and start working with. Our data is bundled up in a CSV, or **C**omma **S**eparated **V**alues file. All this means is that each line of the file is a row, and the values within a row are separated into columns by commas. For example, if we had a dataset that stored students' names and ages, the CSV file might look like:
```
Name,Age
Carlos,17
Sarah,16
```

Feel free to take a look at the file itself if you'd like to see how this works. For now, though, we'll read in the data using pandas' built-in read_csv function. A **function** is essentially a named piece of code that carries out a specific task, and packages like pandas come with many of them. In this case, our package is pandas, and our task is to read in our data from a CSV file. To do so, we'll use the **read_csv()** function.

In [1]:
data = pd.read_csv()

NameError: name 'pd' is not defined
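A `NameError` like the one above means that `pd` did not exist yet when the cell ran, which happens when the import cell has not been run in the current session. Note also that `read_csv` needs to be told what to read, usually the path to a file. Since the path to the course data is not shown here, the sketch below uses a stand-in instead: the Name/Age example from above, held in memory with Python's `io.StringIO` (the `csv_text` and `students` names are our own):

```python
import io
import pandas as pd

# Stand-in for a CSV file on disk: the same text, held in memory
csv_text = "Name,Age\nCarlos,17\nSarah,16\n"

# read_csv accepts a file path or a file-like object such as StringIO
students = pd.read_csv(io.StringIO(csv_text))

print(list(students.columns))    # ['Name', 'Age']
print(len(students))             # 2
print(students["Age"].tolist())  # [17, 16]
```

With a real file, the path to that file would take the place of `io.StringIO(csv_text)`.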

## Practice
That's it for this week's lesson! To make sure that you understand what we learned today, be sure to complete the notebooks below.