# Demo 1: Basics of Notebooks and Python Review!

In this demo we will get started with Jupyter Notebooks, Google Colab, Git, and Python -- the basic ingredients for our data science course!



In [None]:
# Magic for showing Google Slides!
from IPython.display import IFrame
# These two imports help with Pandas output: the style below widens the notebook so data displays more easily.
from IPython.display import display, HTML
display(HTML("<style>.container { width:95% !important; }</style>"))
width=1800
height=1100

Let's start today by going over what notebooks are and the final project we'll be doing for this course.

In [None]:
IFrame(src='''https://docs.google.com/presentation/d/e/2PACX-1vQeHlnuEX_rEj3bsUvW7e9WYstQFEgXmHAS3yiKK2Sbmvbh5oXZnazvdLMvzlRw8NMGfJL-QwPW8H8-/embed?start=false&loop=false&delayms=3000&slide=id.g146df145938_1_500" frameborder="0" allowfullscreen="true" mozallowfullscreen="true" webkitallowfullscreen="true"''',
        width=width, height=height)

# Welcome to Notebooks!

There are a lot of useful keyboard shortcuts you can use -- check out the Help >> Keyboard Shortcuts menu.

We can designate cells as markdown -- which lets us do some cool stuff. Here are a few quick and useful things.

[Note: Examples Taken from Adam-p Markdown Cheat Sheet](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet)

First let's look at some headings...

# H1
## H2
### H3
#### H4
##### H5
###### H6

---

We can also section break things...


## Next is how to add emphasis to some text...

Emphasis, aka italics, with *asterisks* or _underscores_.

Strong emphasis, aka bold, with **asterisks** or __underscores__.

Combined emphasis with **asterisks and _underscores_**.

Strikethrough uses two tildes. ~~Scratch this.~~

## Let's also see how to do lists and alignment...

1. First ordered list item
2. Another item
   * Unordered sub-list.
1. Actual numbers don't matter, just that it's a number
   1. Ordered sub-list
4. And another item.

   You can have properly indented paragraphs within list items. Notice the blank line above, and the leading spaces (you need three spaces).


* Unordered list can use asterisks
- Or minuses
+ Or pluses
* add another element
* add another **fancy element**

## Links and code blocks are really important...

[I'm an inline-style link](https://www.google.com)

URLs and URLs in angle brackets will automatically get turned into links.
http://www.example.com or <http://www.example.com> and sometimes
example.com (but not on GitHub, for example).

Inline `code has backticks` around it.

```
You can also do blocks of code.
```


You can also tell markdown what type of code you are using...

```javascript
var s = "JavaScript syntax highlighting";
alert(s);
```

```python
s = "Python syntax highlighting"
print(s)
```

```
No language indicated, so no syntax highlighting.
But let's throw in a <b>tag</b>.
```


## Finally, tables are a bit cumbersome...

Markdown | Less | Pretty
--- | --- | ---
*Still* | `renders` | **nicely**
1 | 2 | 3

---

Let's go back to the slides and learn a little bit more about Git!

In [None]:
IFrame(src='''https://docs.google.com/presentation/d/e/2PACX-1vQeHlnuEX_rEj3bsUvW7e9WYstQFEgXmHAS3yiKK2Sbmvbh5oXZnazvdLMvzlRw8NMGfJL-QwPW8H8-/embed?start=false&loop=false&delayms=3000&slide=id.g146df145938_1_595" frameborder="0" allowfullscreen="true" mozallowfullscreen="true" webkitallowfullscreen="true"''',
        width=width, height=height)

# A Little About Directories, Git, Commands, and Colab


What Colab does is create a [virtual machine](https://en.wikipedia.org/wiki/Virtual_machine), which is like a fresh install of an operating system made just for you.

This means that in addition to running Python code in this notebook, you can also interact with the command-line as if you were using a terminal to navigate a computer. If you have never used the command-line before, it is worth reading through [this tutorial](https://computers.tutsplus.com/tutorials/navigating-the-terminal-a-gentle-introduction--mac-3855).

Notebooks also allow you to run shell commands by preceding the command with a `!`.

In [None]:
!pwd

In [None]:
!ls
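
For comparison, roughly the same information is available from Python's standard library without going through the shell. This is a minimal sketch; the variable names `cwd` and `entries` are illustrative and not part of the course code.

```python
import os

# Rough equivalent of `!pwd`: the notebook process's current working directory.
cwd = os.getcwd()
print(cwd)

# Rough equivalent of `!ls`: the names of the entries in that directory.
# Unlike plain `ls`, os.listdir also includes hidden dotfiles.
entries = sorted(os.listdir(cwd))
print(entries)
```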

## Cloning a Git Repo

We can create and clone a Git repo locally on our machine, but we'll learn more about this later in the course. For now let's clone the class repo into a Colab notebook!

1. Mount your personal Google Drive inside this virtual machine so you can write to your Drive from this notebook.
2. Change the current working directory to the root folder of your Google Drive.
3. Clone the repository from GitHub into this folder.

In [None]:
# Mount our personal Google Drive. This will pop up a
# confirmation screen giving this notebook access to your Google Drive.
# You will first need a Google account for this to work.
from google.colab import drive
drive.mount('/content/drive')

You should now see the contents of your Google Drive by clicking the folder icon in the left panel. It is viewable at `/content/drive/MyDrive`.

To list the contents of a folder, you can use the `ls` command. The cell below should list the contents of the root folder of your Google Drive.

In [None]:
!ls /content/drive/MyDrive

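To change the current working directory to your Google Drive, we issue a `cd` command preceded by the `%` symbol. The difference between `!` and `%` is that `!` runs the command in a temporary subshell, so a `!cd` is forgotten as soon as the command finishes, while `%cd` is an IPython magic that changes the working directory of the notebook itself and persists across cells.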

In [None]:
%cd /content/drive/MyDrive

In [None]:
!pwd
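
The difference between `!` and `%` described above can be sketched in plain Python. This is only an analogy, not how the notebook implements either prefix: `subprocess.run` stands in for `!`, `os.chdir` stands in for `%cd`, and the temporary directory used as `target` is a hypothetical destination chosen so the example runs anywhere.

```python
import os
import subprocess
import tempfile

start = os.getcwd()
# Hypothetical destination; any existing folder would do.
target = os.path.realpath(tempfile.gettempdir())

# Like `!cd ...`: the cd happens inside a short-lived child shell,
# so the working directory of this Python process is unaffected.
subprocess.run(f'cd "{target}"', shell=True, check=True)
after_shell = os.getcwd()
print(after_shell == start)

# Like `%cd ...`: the working directory of the running process itself changes.
os.chdir(target)
after_chdir = os.path.realpath(os.getcwd())
print(after_chdir == target)

# Restore the original directory so later cells are not affected.
os.chdir(start)
```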

Our next task is to clone the course repository and store it in our personal Google Drive. The course repository is viewable here: <https://github.com/nmattei/cmps3160>.

In [None]:
!git clone https://github.com/nmattei/cmps3160.git

You should now be able to list the contents of the new folder `/content/drive/MyDrive/cmps3160`, which contains all the course code and data.

In [None]:
!ls cmps3160

Additionally, if you browse your Google Drive as you normally would, you should see the new `cmps3160` folder there.

<br>


Now let's change directories to the cmps3160 folder.

In [None]:
%cd cmps3160

**Important:** Remember to always pull before you start working!

In [None]:
!git pull

Now we are going to change to the \_demos folder and check out the data that we have there. We'll see more of this data throughout the semester.

In [None]:
%cd _demos
!ls

In [None]:
!ls data

In [None]:
!head data/nba_salaries.csv
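
By default, `head` prints the first 10 lines of a file. A rough Python equivalent is sketched below. The helper name `head` and its parameters are illustrative only and are not part of the course code.

```python
from itertools import islice

def head(path, n=10):
    """Return the first n lines of a text file, without trailing newlines."""
    with open(path, encoding="utf-8") as f:
        # islice stops after n lines, so the rest of the file is never read.
        return [line.rstrip("\n") for line in islice(f, n)]
```

Calling `head("data/nba_salaries.csv")` from the `_demos` folder should return the same lines that the shell command prints.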

---

Let's go back to the slides and see a few more things about Python and why it's the best!

In [None]:
IFrame(src='''https://docs.google.com/presentation/d/e/2PACX-1vQeHlnuEX_rEj3bsUvW7e9WYstQFEgXmHAS3yiKK2Sbmvbh5oXZnazvdLMvzlRw8NMGfJL-QwPW8H8-/embed?start=false&loop=false&delayms=3000&slide=id.g146df145938_1_650" frameborder="0" allowfullscreen="true" mozallowfullscreen="true" webkitallowfullscreen="true"''',
       width=width, height=height)

# Let's do some Code!

The cell below loads up a few libraries and does some initialization.

In [None]:
### Standard Magic and startup initializers.

# Load Numpy
import numpy as np
# Load MatPlotLib
import matplotlib
import matplotlib.pyplot as plt
# Load Pandas
import pandas as pd

# This lets us show plots inline and also save PDF plots if we want them
%matplotlib inline
from matplotlib.backends.backend_pdf import PdfPages
matplotlib.style.use('fivethirtyeight')
# Seaborn is a plotting package built on Matplotlib that works well with Pandas; we'll try it out...
import seaborn as sns

# These two imports help with Pandas output: the style below widens the notebook so data displays more easily.
from IPython.display import display, HTML
display(HTML("<style>.container { width:95% !important; }</style>"))

### A quick note about the Data8 Textbook.

The [Data8 Textbook](https://www.inferentialthinking.com/chapters/intro) uses a package called `datascience`, which is a wrapper around Pandas. When you work through examples from that textbook, please remember that the corresponding Pandas calls are a bit different.

### First let's go over the examples from the slides here and make sure we understand them.

In [None]:
# Define a simple function.

def my_func(x, y):
    if x > y:
        return x
    else:
        return y

In [None]:
my_func(1,2)

In [None]:
def my_func(x, y):
    return (x-1, y+2)


Be careful with notebooks... we can have scope problems if we run cells out of order!

In [None]:
# What is in scope here?
(a, b) = my_func(1, 2)

In [None]:
print(a)

In [None]:
print(b)
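
The notebook keeps a single global namespace, so a name always refers to whatever the most recently executed cell bound to it. The sketch below puts the two definitions of `my_func` from above side by side to make the effect visible. The names `first` and `second` are illustrative.

```python
def my_func(x, y):
    # First definition: return the larger of the two arguments.
    return x if x > y else y

first = my_func(1, 2)

def my_func(x, y):
    # Same name, so this definition replaces the first one.
    return (x - 1, y + 2)

second = my_func(1, 2)
print(first, second)  # prints: 2 (0, 4)
```

If only the first definition had been run, `(a, b) = my_func(1, 2)` would raise a `TypeError`, because a single integer cannot be unpacked into two names.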

Let's look at some simple lists and data structures.

In [None]:
a = [1, 2, 4, 'a']
a

In [None]:
# len: returns the number of items in a sequence or collection
len(['c', 'm', 's', 'c', 3, 2, 0])


In [None]:
# range: returns a lazy sequence of integers; list() expands it
list(range(10))


In [None]:
# enumerate: returns an iterator of (index, element) tuples over an iterable
a = enumerate( ['311', '320', '330'] )
print(a)

In [None]:
# Recall that in Python 3 these iterators are evaluated lazily, so we have to expand them manually.
list(a)
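
One consequence of lazy evaluation is that an iterator can only be consumed once. A short sketch, using the illustrative names `courses`, `first_pass`, and `second_pass`:

```python
courses = enumerate(['311', '320', '330'])

first_pass = list(courses)   # consumes every item the iterator has
second_pass = list(courses)  # the iterator is now exhausted

print(first_pass)   # prints: [(0, '311'), (1, '320'), (2, '330')]
print(second_pass)  # prints: []
```

If the results are needed more than once, store the expanded list rather than the iterator.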

In [None]:
a = ['311', '320', '330']
for i,j in enumerate(a):
    print(i,j)

In [None]:
def squared(x):
    return x**2

In [None]:
## Map and Filter

# map: apply a function to a sequence or iterable

arr = [1, 2, 3, 4, 5]
out = map(squared, arr)
print(list(out))
print(arr)

In [None]:
new_arr = [x**3 for x in arr if x <= 3]

In [None]:
new_arr
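
The list comprehension above transforms and selects in a single expression. As a sketch of what it is doing, here it is next to the same computation written with `map` and `filter` (`filter` is covered a few cells below). The variable names are illustrative.

```python
arr = [1, 2, 3, 4, 5]

# Comprehension: keep x when x <= 3, then cube it.
with_comprehension = [x**3 for x in arr if x <= 3]

# Same result: filter selects the elements, map applies the function to them.
with_map_filter = list(map(lambda x: x**3, filter(lambda x: x <= 3, arr)))

print(with_comprehension)  # prints: [1, 8, 27]
print(with_map_filter)     # prints: [1, 8, 27]
```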

In [None]:
# What happened here??
arr = [1, 2, 3, 4, 5]
out = map(lambda x: x**2, arr)
print(out)


# Remember again lazy evaluation!
print(list(out))

In [None]:
# filter: returns an iterator over the elements for which a predicate is true

arr = [1, 2, 3, 4, 5, 6, 7]
out = filter(lambda x: x % 2 == 0, arr)
print(out)

# Remember again that we have to explicitly evaluate the iterator.
print(list(out))

In [None]:
# A more pythonic way: list comprehension...
x = [i for i in arr if i % 2 == 0]
x

In [None]:
# List comprehensions are the best!
P = [ 2**x for x in range(17) ]
P

In [None]:
# Can also do dictionaries...
D = {x:['no'] for x in range(10)}

In [None]:
D
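
A detail worth noticing in the dictionary comprehension: the expression `['no']` is evaluated once per key, so every key gets its own list. The sketch below contrasts this with `dict.fromkeys`, which reuses a single value object for all keys. The names `independent` and `shared` are illustrative.

```python
# Comprehension: a fresh ['no'] list is created for each key.
independent = {x: ['no'] for x in range(3)}
independent[0].append('yes')
print(independent)  # prints: {0: ['no', 'yes'], 1: ['no'], 2: ['no']}

# dict.fromkeys: every key points at the same list object.
shared = dict.fromkeys(range(3), ['no'])
shared[0].append('yes')
print(shared)  # prints: {0: ['no', 'yes'], 1: ['no', 'yes'], 2: ['no', 'yes']}
```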