# Day 1 Morning Review

## Intro to Jupyter Notebooks

There are different types of cells. This is a markdown cell! Markdown is just a text formatting "language" that we use to make our notebooks easy to read. The README files in each repo are also markdown files.

In [1]:
# This is a code cell! In particular, this is a comment.

To change the type of a cell, use the dropdown menu above. Or, use `Esc` to move into "command mode," where the cell outline is blue rather than green. In command mode, `y` makes the cell a code cell and `m` makes the cell a markdown cell. 

You can do all sorts of things in command mode. You can select multiple cells by holding `Shift` and pressing the up or down arrow keys, and then use `Shift + m` to merge all of those cells.

I recommend playing around with the buttons at the top of the notebook. Figure out what they all do and, if you like, commit to memory some keyboard shortcuts for them.

Above the buttons, you see a series of tabs (File, Edit, View, etc.). The important one to understand is the Kernel tab. In it, you can restart your Python kernel, clear the output, etc. You should be able to "Restart and Run All" on your notebook to ensure everything runs as you expect when you run it from top to bottom.

## Data Types Review

### Numbers

There are basically only two numeric types that you need to worry about (for now!): integers and floats.

In [2]:
# Integers
5

5

In [3]:
# Show the type built-in operation and the int built-in operation
type(5) == int

True

In [4]:
type(5)

int

In [5]:
# Floats
# Check: why are they called 'floats'?

# Same thing, show type and int
type(5 / 3) == int # is not an integer

False

In [6]:
type(5 / 3) == float

True

In [7]:
# Demonstrate that it is actually a float
type(5 / 3)

float

In [8]:
# Show the actual float created
5 / 3

1.6666666666666667

In [9]:
# Demonstrate that something can be NOT EQUAL to a type
type('5') != int # but this is just a string

True
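
As a small sketch of how the two numeric types interact (the variable names here are made up for illustration), the cell below shows that `/` always returns a float, while `//` keeps integers as integers, and that `int` and `float` can also be used to convert values:

```python
# Division with / always produces a float, even when the result is whole
whole = 6 / 3          # 2.0
floored = 6 // 3       # 2 (floor division of two ints gives an int)

# The int and float built-ins also convert values between types
from_string = int('5')     # the string '5' becomes the integer 5
as_float = float(5)        # 5.0
truncated = int(5 / 3)     # int() drops the fractional part, giving 1

print(type(whole), type(floored))
print(from_string, as_float, truncated)
```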

### Sequential Data Types

Why are they 'sequential'? Because they have _elements_! This means they can be *indexed* and *sliced*. This also means they can be _iterated upon_. They are _iterables_. We'll discuss iterables in greater depth when we discuss control flow and list comprehensions. Note: not all iterables are sequential types!
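
Here is a minimal sketch of what indexing, slicing, and iterating look like on a sequential type. The names `letters` and `unique` are illustrative only. The last part uses a set (covered later in this review) as an example of an iterable that is not sequential:

```python
letters = ('a', 'b', 'c', 'd')   # a tuple is a sequential type

first = letters[0]        # indexing: 'a'
middle = letters[1:3]     # slicing: ('b', 'c'), because stop is exclusive

# Iterating visits the elements one at a time, in order
visited = []
for letter in letters:
    visited.append(letter)

# A set is iterable but not sequential: it has no positions to index
unique = {'a', 'b'}
try:
    unique[0]
    set_is_indexable = True
except TypeError:
    set_is_indexable = False

print(first, middle, visited, set_is_indexable)
```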

#### Tuples

Tuples are the most 'basic' sequential type. They are **immutable** and **ordered**, and there's not too much you can do with them. But they're the 'default' data structure when there is more than one element:

In [10]:
# Set-up x, y, and z variables
x = 1
y = 2
z = x, y
z

(1, 2)

In [11]:
z[1]

2

In [12]:
z = x, y, x, y

In [13]:
z

(1, 2, 1, 2)

In [14]:
z = 1, 2, 3, 4, 5

In [15]:
z

(1, 2, 3, 4, 5)

In [16]:
100_000

100000

In [17]:
# Show that z is a tuple
type(z)

tuple

In [18]:
# Set-up x, y, and z variables
x = 1
y = 2
z = x, y
z

(1, 2)

`x,y` is the same as `(x,y)`:

In [19]:
# Because tuples are ordered, we can assign new variables with tuple unpacking
a, b = z

In [20]:
print("a:", a)
print("b:", b)

a: 1
b: 2


But we typically think of tuples coming to us in parentheses.
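
To make the tuple behavior above concrete, here is a short sketch (with hypothetical names `point`, `x_coord`, and `y_coord`) showing packing, unpacking, and what happens when you try to change a tuple:

```python
point = 3, 4                  # packing: the same as point = (3, 4)
x_coord, y_coord = point      # unpacking into two variables

# Unpacking also gives a compact way to swap two variables
x_coord, y_coord = y_coord, x_coord

# Tuples are immutable, so assigning to an element raises a TypeError
try:
    point[0] = 10
    changed = True
except TypeError:
    changed = False

print(point, x_coord, y_coord, changed)
```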

#### Lists

Lists are the 'go to' sequential type / data structure / object in native Python. Like tuples, they contain **ordered** heterogeneous elements. Unlike tuples, they are **mutable**, which means we can do all sorts of things to change them, like so:

LISTS ARE MUTABLE

In [21]:
# Lists come in square brackets:
x = [1,2,7,4,5, 'stuff', 'things', '1.67', 1.75]

len(x)

9

In [22]:
x[0]

1

An example of a list method is `reverse`. (A method is just a function that is defined for objects of a certain type. Instead of calling it directly, we call it _on_ an object, like so: `object.method(arguments)`.)

In [23]:
# Because they are mutable, we can do stuff with them:
x.reverse()

In [24]:
x

[1.75, '1.67', 'things', 'stuff', 5, 4, 7, 2, 1]

In [25]:
# append() and pop() are in-place operations: they change x itself
x.append(6)
print(x)
x.pop()
x

[1.75, '1.67', 'things', 'stuff', 5, 4, 7, 2, 1, 6]


[1.75, '1.67', 'things', 'stuff', 5, 4, 7, 2, 1]

In [26]:
"""Python is zero-indexed. Note that not all programming languages are like this. 
For example in MATLAB and Fortran, indices begin from 1."""

'Python is zero-indexed. Note that not all programming languages are like this. \nFor example in MATLAB and Fortran, indices begin from 1.'

The square brackets have this syntax: `[start:stop:step]`. To go backwards, specify a negative `step`.

In [27]:
x

[1.75, '1.67', 'things', 'stuff', 5, 4, 7, 2, 1]

In [28]:
# Stop is unspecified, so we start at index 2 and go to the end.
x[2:]

['things', 'stuff', 5, 4, 7, 2, 1]

In [29]:
# Start is unspecified, so we start at the beginning. 'Stop' is exclusive!
x[:5]

[1.75, '1.67', 'things', 'stuff', 5]

In [30]:
# Go from the beginning to the end by twos.
x[::2]

[1.75, 'things', 5, 7, 1]

In [31]:
# Start and stop are unspecified and step is negative, so the whole list is reversed
x[::-1]

[1, 2, 7, 4, 5, 'stuff', 'things', '1.67', 1.75]
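
The `[start:stop:step]` rules are easier to see on a list whose values match their indices. The sketch below uses an illustrative list called `nums`:

```python
nums = list(range(10))        # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

middle = nums[2:5]            # [2, 3, 4]: start is inclusive, stop is exclusive
head = nums[:3]               # [0, 1, 2]: start defaults to the beginning
every_third = nums[::3]       # [0, 3, 6, 9]: a step of 3
backwards = nums[5:2:-1]      # [5, 4, 3]: from index 5 back to (not including) index 2
last = nums[-1]               # 9: negative indices count from the end
tail = nums[-3:]              # [7, 8, 9]: the last three elements

print(middle, head, every_third, backwards, last, tail)
```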

To access the element of `x` corresponding to the index `i`, use `x[i]`:

In [32]:
x

[1.75, '1.67', 'things', 'stuff', 5, 4, 7, 2, 1]

In [33]:
x.append([1, 3, 5, 7])

In [34]:
x

[1.75, '1.67', 'things', 'stuff', 5, 4, 7, 2, 1, [1, 3, 5, 7]]

In [35]:
x.extend([1, 3, 5, 7])

In [36]:
x

[1.75, '1.67', 'things', 'stuff', 5, 4, 7, 2, 1, [1, 3, 5, 7], 1, 3, 5, 7]

#### IMPORTANT COMMANDS FOR LISTS
- .append() = adds a single value to the end of the list. The value is added exactly as given, so appending a list to a list puts a list inside the list.
- .extend() = adds values to the end of the list. Unlike append, extend does not nest a list inside a list; it adds the values directly into the current list.
- .reverse() = reverses the order of the values in the list, in place
- .pop() = removes and returns the last value in the list (or the value at a given index)
- .insert() = adds a value to the list at a specific index
- .remove() = removes the first occurrence of the given value
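
The commands above can be tried together in one cell. This is just a sketch on a made-up list called `items`, with the list's contents after each step noted in the comments:

```python
items = ['a', 'b', 'c']

items.append(['d', 'e'])    # ['a', 'b', 'c', ['d', 'e']]: the list is nested
nested = items.pop()        # removes and returns ['d', 'e']
items.extend(['d', 'e'])    # ['a', 'b', 'c', 'd', 'e']: values added one by one

items.insert(1, 'z')        # ['a', 'z', 'b', 'c', 'd', 'e']
items.append('z')           # ['a', 'z', 'b', 'c', 'd', 'e', 'z']
items.remove('z')           # only the first 'z' goes: ['a', 'b', 'c', 'd', 'e', 'z']

items.reverse()             # ['z', 'e', 'd', 'c', 'b', 'a']
last = items.pop()          # removes and returns 'a'

print(items, nested, last)
```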

### An important distinction:

You've noticed that sequential data types can be _nested_. This means that we need to respect the distinction between the type of an object and the type of the elements that object contains. 

An example:

In [37]:
# What type is x?
x = 1
type(x)

int

In [38]:
# What type is y?
y = [1]
type(y)

list

This distinction becomes very important when we begin working with NumPy and Pandas. For example, we might be dealing with a column of numbers. In Pandas terminology, the column's type is a `Series`, but the type (`dtype`) of the elements in that column is an integer type such as `int64`.
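
The same distinction can be seen in plain Python, without NumPy or Pandas. In this sketch, `nested` is an illustrative list: the container has one type, while each of its elements has its own:

```python
nested = [1, 2.5, 'three', [4, 5]]

container_type = type(nested)    # the type of the object itself: list

# The type of each element the object contains
element_types = []
for element in nested:
    element_types.append(type(element))

print(container_type)
print(element_types)
```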

#### Sets

Sets are inspired by the mathematical notion of a set. They are **mutable** but **unordered** collections of **unique** items. They come in `{}` most of the time. (Note: you can create sets with `{}` _or_ with the `set` function, but empty braces `{}` on their own create an empty dictionary, so use `set()` for an empty set. Likewise, there are `list`, `tuple`, and `dict` functions.)

In [39]:
this_is_a_set = {2, 3, 5, 3, 5, 8}
this_is_a_set # Notice that duplicates were removed

{2, 3, 5, 8}

In [40]:
z = {2, 7, 5}

In [41]:
z

{2, 5, 7}

In [42]:
# .intersection() returns the elements of this_is_a_set that are also in z
this_is_a_set.intersection(z)

{2, 5}

In [43]:
# combine the two sets with a .union()
this_is_a_set.union(z)

{2, 3, 5, 7, 8}
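
As a brief sketch of the `set` function and of mutability (all names below are made up for illustration), the following cell removes duplicates from a list, adds an element, and converts back to a list:

```python
with_duplicates = [3, 1, 3, 2, 1]

unique_values = set(with_duplicates)   # {1, 2, 3}: duplicates are dropped
empty_set = set()                      # the way to make an empty set

unique_values.add(4)                   # sets are mutable
unique_values.add(4)                   # adding an existing item changes nothing

has_two = 2 in unique_values           # membership test: True
back_to_list = sorted(unique_values)   # sorted() returns an ordered list

print(unique_values, len(empty_set), has_two, back_to_list)
```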

#### Dictionaries

The best way to understand dictionaries is to think of how a regular language dictionary works. Instead of looking up elements according to their location in a certain order, you look up elements according to a certain 'key,' which may be any immutable (more precisely, hashable) object.

Dictionaries are 'semi-ordered' collections of **key, value** pairs. I say 'semi-ordered' because, since Python 3.7, dictionaries remember the order in which their keys were inserted. Nevertheless, you cannot look up the 'first' value by position: `my_dict[0]` raises a `KeyError` unless `0` happens to be a key, because that is not how dictionaries are meant to be used!
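
A minimal sketch of this 'semi-ordered' behavior, using a small made-up dictionary called `ages` and assuming Python 3.7 or later:

```python
ages = {'b': 2, 'a': 1}
ages['c'] = 3

# Keys come back in insertion order, not sorted order
key_order = list(ages)          # ['b', 'a', 'c']

# Looking up by position fails, because 0 is treated as a key
try:
    ages[0]
    positional_lookup_works = True
except KeyError:
    positional_lookup_works = False

print(key_order, positional_lookup_works)
```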

In [44]:
instructors = {"Matt": {"likes": ['stats', 'pugs'], "phone_number": 205782998,},
               "Adi": {"likes": ['reading', 'movies', 'travel'], "phone_number": 800723589}
}

In [45]:
# Look up the key 'Adi' in the dictionary 'instructors'
instructors['Adi']

{'likes': ['reading', 'movies', 'travel'], 'phone_number': 800723589}

Notice that, as expected, the code above returns a dictionary. So the output can be treated as a dictionary.

In [46]:
# Do one further look up to find instructor's likes
instructors['Adi']['likes']

['reading', 'movies', 'travel']

In [47]:
# Demonstrate that when indexing the key it needs to be exact 
instructors['MAtt']

KeyError: 'MAtt'

The error message above is expected, because I used a key that does not exist in the dictionary. Keys are case-sensitive, so 'MAtt' is not the same key as 'Matt'.

In [48]:
# .get is a safe way of looking up a key if you're not sure it exists
instructors.get('MAtt', 'NOPE not a Key')

'NOPE not a Key'

In [49]:
instructors.update({"Josh": {"likes": ["music", "data science", "food"], "phone_number": 2025551836}})

In [50]:
instructors

{'Matt': {'likes': ['stats', 'pugs'], 'phone_number': 205782998},
 'Adi': {'likes': ['reading', 'movies', 'travel'], 'phone_number': 800723589},
 'Josh': {'likes': ['music', 'data science', 'food'],
  'phone_number': 2025551836}}

In [51]:
# .update is a way to add new values to the dictionary
instructors.update({"Kihoon": {"likes": ["his baby", "his wife", "food"], "phone_number": 865305}})

In [52]:
# Dictionary keys:
instructors.keys()

dict_keys(['Matt', 'Adi', 'Josh', 'Kihoon'])

In [53]:
# I can iterate through the keys of a dictionary:
for key in instructors.keys():
    print(key)

Matt
Adi
Josh
Kihoon


In [54]:
# Dictionary values
instructors.values()

dict_values([{'likes': ['stats', 'pugs'], 'phone_number': 205782998}, {'likes': ['reading', 'movies', 'travel'], 'phone_number': 800723589}, {'likes': ['music', 'data science', 'food'], 'phone_number': 2025551836}, {'likes': ['his baby', 'his wife', 'food'], 'phone_number': 865305}])

In [55]:
# I can iterate through the dictionary values:
for value in instructors.values():
    print(value)

{'likes': ['stats', 'pugs'], 'phone_number': 205782998}
{'likes': ['reading', 'movies', 'travel'], 'phone_number': 800723589}
{'likes': ['music', 'data science', 'food'], 'phone_number': 2025551836}
{'likes': ['his baby', 'his wife', 'food'], 'phone_number': 865305}


In [56]:
# I can see the key:value pairs using .items():
instructors.items()

dict_items([('Matt', {'likes': ['stats', 'pugs'], 'phone_number': 205782998}), ('Adi', {'likes': ['reading', 'movies', 'travel'], 'phone_number': 800723589}), ('Josh', {'likes': ['music', 'data science', 'food'], 'phone_number': 2025551836}), ('Kihoon', {'likes': ['his baby', 'his wife', 'food'], 'phone_number': 865305})])

In [57]:
# And I can iterate through the key:value pairs, but I need to use TWO loop variables: key, value
for key, value in instructors.items():
    print(key, value)

Matt {'likes': ['stats', 'pugs'], 'phone_number': 205782998}
Adi {'likes': ['reading', 'movies', 'travel'], 'phone_number': 800723589}
Josh {'likes': ['music', 'data science', 'food'], 'phone_number': 2025551836}
Kihoon {'likes': ['his baby', 'his wife', 'food'], 'phone_number': 865305}


#### IMPORTANT COMMANDS FOR DICTIONARIES
- .get()    = looks up a key and returns its value (similar to indexing with the key, but returns `None` or a default of your choice, instead of raising an error, if the key is not in the dictionary)
- .values() = shows the dictionary values
- .keys()   = shows the dictionary keys
- .items()  = shows both keys and values, as pairs
- .update() = adds the key/value pairs from another dictionary, overwriting the values of keys that already exist
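
Here is a compact sketch that exercises each of these commands on a small illustrative dictionary, `phone_book` (the name and the replacement number are hypothetical):

```python
phone_book = {'Matt': 205782998}

# .update() adds new keys and overwrites existing ones
phone_book.update({'Adi': 800723589})
phone_book.update({'Matt': 5550100})

# .get() returns None, or a default of your choice, for a missing key
missing = phone_book.get('Josh')
fallback = phone_book.get('Josh', 'unknown')

names = list(phone_book.keys())
numbers = list(phone_book.values())

# .items() yields (key, value) pairs, which unpack into two loop variables
lines = []
for name, number in phone_book.items():
    lines.append(f"{name}: {number}")

print(missing, fallback, names, numbers, lines)
```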

## Indexing and Slicing Sequential Data Types

Everywhere in Python, the square brackets (`[]`) are used to *index* and to *slice* things. You'll see them all the time when you subset your data.

But it is important to know how to do basic indexing and slicing on the basic Python sequential data types.

#### The same bracket slicing syntax is used elsewhere:

In [58]:
string = "DATA IS THE BEST!"
# Remember that strings are sequential data types!

In [59]:
# String is reversed
string[::-1]

'!TSEB EHT SI ATAD'

In [60]:
# Split the string on the character 'T'
# With no argument, split() splits on whitespace instead
string.split(sep='T')

['DA', 'A IS ', 'HE BES', '!']

#### IMPORTANT METHODS FOR STRINGS
- There are a lot. Give them a shot!
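
As a starting point, here is a sketch of a few commonly used string methods applied to an illustrative string called `text`. This is a small sample rather than a complete list:

```python
text = "  Data is the best!  "

stripped = text.strip()                            # 'Data is the best!'
shouted = stripped.upper()                         # 'DATA IS THE BEST!'
quiet = stripped.lower()                           # 'data is the best!'
replaced = stripped.replace('best', 'greatest')    # 'Data is the greatest!'

words = stripped.split()                           # splits on whitespace by default
rejoined = '-'.join(words)                         # 'Data-is-the-best!'
starts_with_data = stripped.startswith('Data')     # True

# Strings are immutable: every method above returned a new string
print(stripped, shouted, quiet, replaced, words, rejoined, starts_with_data)
```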

You can check out the reference guide to string functions here:

https://docs.python.org/3/library/string.html