# Stuff we want to do to `DataFrame`s

+ Extract a column
+ Extract a subset of the columns
+ Extract particular rows by index
+ Extract rows that match a criterion in a column
  - "I want all the rows where the `shoe_size` column has value > `7.5`."
  
This gets confusing if we're not careful. We'll explore a few concepts that will help us.

# `print` oddities

Two ways to display values in a notebook:
+ As the last expression in a code cell, which the Jupyter notebook displays automatically
+ Using the function `print`, which prints a string representation of the value

### Printing strings: there's a difference!

In [None]:
habitat = 'forest'

In [None]:
habitat

In [None]:
print(habitat)

The built-in function `print` shows the text without the quotes; the bare expression shows the quotes too.
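The difference comes from two string forms of a value. Roughly speaking, a bare expression is shown in its `repr` form, while `print` uses the `str` form. A minimal sketch, reusing `habitat` (the names `shown_by_cell` and `shown_by_print` are made up for illustration):

```python
habitat = 'forest'

# Roughly what the notebook shows for a bare expression: the repr form, quotes included.
shown_by_cell = repr(habitat)

# What print writes: the str form, without quotes.
shown_by_print = str(habitat)

print(shown_by_cell)   # 'forest'
print(shown_by_print)  # forest
```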

### Printing lists: no difference in how it looks

In [None]:
habitats = ['forest', 'park', 'street']

In [None]:
habitats

In [None]:
print(habitats)
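Why no difference here? For a `list`, the `str` form and the `repr` form are the same text, and the items inside are always shown in their `repr` form (so the strings keep their quotes). A small sketch to check that claim:

```python
habitats = ['forest', 'park', 'street']

# For a list, both string forms produce identical text.
as_str = str(habitats)
as_repr = repr(habitats)

print(as_str)             # ['forest', 'park', 'street']
print(as_str == as_repr)  # True
```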

### Printing DataFrames: big difference!

In [None]:
import pandas as df
faithful_df = df.read_csv('faithful.csv')
faithful_head_df = faithful_df.head() # Just keep the top few lines to save vertical space

In [None]:
faithful_head_df

In [None]:
print(faithful_head_df)

So: to show the contents of a `DataFrame` or a `Series`, it's nicer to use an expression as the last thing in a cell rather than calling `print`.

# Python type `list`

+ A list begins with `[` and ends with `]`.
+ In between is a comma-separated sequence of values.
+ Each value has an _index_, starting at `0`.

### Indexing

In [None]:
habitats = ['forest', 'park', 'street']
habitats[0]

In [None]:
# Here is nicer output.
print(f'There are {len(habitats)} items in the habitats list.')
print(f'0: {habitats[0]}')
print(f'1: {habitats[1]}')
print(f'2: {habitats[2]}')
print(f'3: {habitats[3]}') # IndexError: the last valid index is 2.

You can also count from the end: the last item is at index `-1`.

In [None]:
print(habitats[-1])
print(habitats[-2])
print(habitats[-3])
print(habitats[-4]) # Another IndexError: -3 is as far back as we can go.
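One way to think about negative indexes: index `-k` names the same item as index `len(habitats) - k`. Here is a short sketch of that relationship; the loop variable `k` and the name `n` are just for illustration.

```python
habitats = ['forest', 'park', 'street']
n = len(habitats)

# Index -k refers to the same item as index n - k.
for k in range(1, n + 1):
    print(-k, n - k, habitats[-k], habitats[n - k])

# Going past either end raises IndexError rather than wrapping around.
try:
    habitats[n]
except IndexError as err:
    print('IndexError:', err)
```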

### Adding lists

You can add `list`s together. This makes a new `list`.

In [None]:
habitats + habitats

In [None]:
[1, 3, 5] + [2, 4, 6]
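Two things are worth checking here: `+` joins the lists end to end (it does not add the numbers pairwise), and the original lists are left alone. A quick sketch, with the names `odds`, `evens`, and `combined` made up for the example:

```python
odds = [1, 3, 5]
evens = [2, 4, 6]

# + builds a brand new list containing the items of odds followed by the items of evens.
combined = odds + evens

print(combined)  # [1, 3, 5, 2, 4, 6]
print(odds)      # [1, 3, 5]  (unchanged)
print(evens)     # [2, 4, 6]  (unchanged)
```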

### I'll have a slice of that, please

You can extract a sublist like this:

In [None]:
coyote_counts = [59, 9, 10, 89, 19, 23, 54, 12, 0, 29, 8, 23, 30, 30]
coyote_counts[0:3]

Slicing: start at the first index, and go up to but not including the second index.
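A handy consequence of "up to but not including": when both indexes are inside the list, the slice has `stop - start` items. The sketch below uses a made-up list called `letters` so it doesn't spoil the challenges that follow.

```python
letters = ['a', 'b', 'c', 'd', 'e']  # made-up example list

start, stop = 1, 4
middle = letters[start:stop]

print(middle)       # ['b', 'c', 'd']
print(len(middle))  # 3, which is stop - start

# Unlike indexing, a slice that runs past the end is not an error;
# it just stops at the end of the list.
tail = letters[3:10]
print(tail)         # ['d', 'e']
```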

### Slicing tricks

You can omit one or both of the indexes:

In [None]:
coyote_counts[:3] # Up to but not including index 3

In [None]:
coyote_counts[3:] # Everything from index 3 to the end

In [None]:
coyote_counts[:] # Creates a copy of the whole list

This `[:]` syntax and concept will be important when we get to `DataFrame`s in a bit.
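What does "creates a copy" buy us? Changing the copy leaves the original alone, whereas a plain assignment just gives the same list a second name. A minimal sketch (the names `counts_copy` and `same_list` are made up for illustration):

```python
coyote_counts = [59, 9, 10, 89, 19, 23, 54, 12, 0, 29, 8, 23, 30, 30]

counts_copy = coyote_counts[:]  # a new list with the same items
same_list = coyote_counts       # no copy: just another name for the same list

counts_copy[0] = -1             # change the copy only

print(coyote_counts[0])              # 59, the original is untouched
print(counts_copy == coyote_counts)  # False, the copy now differs
print(same_list is coyote_counts)    # True, both names refer to one list
```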

In [None]:
# Challenge: how to extract [9, 10, 89, 19]?
coyote_counts[?:?]

In [None]:
# How do we extract the last 4 numbers, [8, 23, 30, 30]?
coyote_counts[?:?]

In [None]:
# How do we extract the 
coyote_counts[?:?]

### Heterogeneous lists

Lists can contain a mix of types.

In [None]:
list1 = ['A', 1, 'B', 2]
list2 = ['C', 3.14159]
list1 + list2
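To see the mix, we can ask each item for its type. This sketch loops over the combined list and collects the type names; `mixed` and `type_names` are names invented for the example.

```python
list1 = ['A', 1, 'B', 2]
list2 = ['C', 3.14159]
mixed = list1 + list2

# Collect the name of each item's type, in order.
type_names = []
for item in mixed:
    type_names.append(type(item).__name__)

print(type_names)  # ['str', 'int', 'str', 'int', 'str', 'float']
```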

### Some functions that use `list`s

In [None]:
print(len(coyote_counts))
print(sum(coyote_counts))
print(min(coyote_counts))
print(max(coyote_counts))
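These functions combine nicely. For example, the average count is the total divided by the number of items. A small sketch; the names `total`, `count`, and `mean` are just for illustration.

```python
coyote_counts = [59, 9, 10, 89, 19, 23, 54, 12, 0, 29, 8, 23, 30, 30]

total = sum(coyote_counts)  # add up all the counts
count = len(coyote_counts)  # how many counts there are
mean = total / count        # / always gives a float

print(total)  # 395
print(count)  # 14
print(f'Mean coyote count: {mean:.1f}')
```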

# Dictionaries: type `dict`

Dictionaries map keys to values. In human dictionaries, the keys are words and the values are their definitions.

In [None]:
# Here's a dictionary!
urbanwildlife = {
  'habitat': ['forest', 'park', 'street', 'forest', 'street', 'park', 'forest', 'park', 'street', 'forest', 'street', 'park', 'street', 'park'],
  'coyote': [59, 9, 10, 89, 19, 23, 54, 12, 0, 29, 8, 23, 30, 30],
  'dog': [72, 197, 8811, 3, 555, 374, 1535, 101, 2216, 23, 1082, 35, 1635, 1469],
  'fox': [3, 10, 63, 54, 251, 43, 69, 57, 4, 6, 0, 6, 10, 3],
  'raccoon': [986, 64, 129, 213, 221, 73, 135, 24, 17, 528, 25, 106, 140, 114],
  'site': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N'],
}

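* Each line inside the dictionary above contains a _key_, then a `:`, then a _value_.
* In the example dictionary above, each key is a string and each value is a list.
* The lines end with commas `,`. It's okay if the last line has a comma too: Python ignores it.
* Indentation! Python doesn't require it inside the `{` `}`, but it makes the dictionary much easier to read.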
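Here is a much smaller dictionary to poke at, as a sketch of those points. It holds site A's habitat, coyote count, and fox count from the data above; the names `site_a` and `one_liner` are made up for the example.

```python
# One key: value pair per line, with a trailing comma after the last one.
site_a = {
    'habitat': 'forest',
    'coyote': 59,
    'fox': 3,
}

# The same dictionary squeezed onto one line, with no trailing comma.
one_liner = {'habitat': 'forest', 'coyote': 59, 'fox': 3}

print(len(site_a))           # 3 (the number of keys)
print(list(site_a.keys()))   # ['habitat', 'coyote', 'fox']
print(site_a == one_liner)   # True: layout and the trailing comma don't matter
```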

### Looking up keys to get values

Given a key, we can look up its value:

In [None]:
urbanwildlife['fox']

What is `urbanwildlife['fox'][-1]`?

What is `urbanwildlife['habitat'][:3]`?
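Expressions like these are two steps chained together: first look up the key to get a list, then index or slice that list. The sketch below spells out the steps on a made-up dictionary called `sightings` (invented data, so the questions above are still yours to answer).

```python
# Made-up data, for illustration only.
sightings = {
    'owl': [4, 0, 7],
    'bat': [12, 9, 30],
}

owl_counts = sightings['owl']  # step 1: look up the key, get a list
last_owl = owl_counts[-1]      # step 2: index into that list

print(last_owl)               # 7
print(sightings['owl'][-1])   # 7, the same two steps in one expression
print(sightings['bat'][:2])   # [12, 9]
```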

### Making your own `DataFrame`s

You can use dictionaries to make `DataFrame`s!

In [None]:
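import pandas as pd   # Hey, that's different! `pd` is the conventional alias for pandas.

# Careful: this makes the name `df` refer to a DataFrame, not to the pandas module.
df = pd.DataFrame(urbanwildlife)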
df.head(3)
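How does the dictionary turn into a table? Each key becomes a column name, and each list becomes that column's values, so the lists need to be the same length. A minimal sketch, assuming pandas is installed, using the first three sites from the data above (the names `small` and `small_df` are made up):

```python
import pandas as pd

# The first three sites from the urbanwildlife data.
small = {
    'site': ['A', 'B', 'C'],
    'habitat': ['forest', 'park', 'street'],
    'coyote': [59, 9, 10],
}
small_df = pd.DataFrame(small)

print(list(small_df.columns))  # ['site', 'habitat', 'coyote']
print(small_df.shape)          # (3, 3): 3 rows, 3 columns

# Lists of different lengths can't be lined up into rows.
try:
    pd.DataFrame({'site': ['A', 'B'], 'coyote': [59]})
except ValueError as err:
    print('ValueError:', err)
```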