In [None]:
# Run cells by clicking on them and hitting CTRL + ENTER on your keyboard
from IPython.display import YouTubeVideo
from datascience import *
import numpy as np
%matplotlib inline

# Module 2.2 Part 4: Tables Review

In this lecture guide, you'll review the table manipulation operations that you've seen so far. You'll
also discover how comparison operators and the `where` method can be used to subset the rows of a table.

This notebook contains 4 videos, with a total runtime of 30:19.

1. [Table Methods Review](#section1) *1 video, total runtime 7:16*
2. [Advanced Where](#section2) *1 video, total runtime 8:59*
3. [Check for Understanding](#section3) *2 videos, total runtime 14:04*

Textbook readings: [Chapter 8: Functions and Tables](https://www.inferentialthinking.com/chapters/08/Functions_and_Tables.html)

<a id='section1'></a>
## 1. Table Methods Review

In the first video of this lecture guide, we'll review the most important table methods that we've seen so far.
Here's a link to the `datascience` module's [table methods documentation](http://data8.org/datascience/tables.html).
It provides a complete list of methods and examples. The course's [Python Reference](http://data8.org/su20/python-reference.html)
sheet is also a helpful resource, and is easier to navigate than the `datascience` documentation.

In [None]:
YouTubeVideo('tGQfKdCISbA')

The `drinks` and `discounts` tables presented in lecture 12.1 are provided in the cell below.
Use them to review the table methods alongside Professor DeNero.

In [None]:
# create the tables
drinks = Table(['Drink', 'Cafe', 'Price']).with_rows([
    ['Milk Tea', 'Tea One', 4],
    ['Espresso', 'Nefeli', 2],
    ['Coffee', 'Nefeli', 3],
    ['Espresso', "Abe's", 2]
])

discounts = Table().with_columns(
    'Coupon % Off', make_array(5, 50, 25),
    'Location', make_array('Tea One', 'Nefeli', 'Tea One')
)

In [None]:
# follow along here!
...

<a id='section2'></a>
## 2. Advanced Where

Next, you'll learn how to efficiently subset the rows of a table using comparison operators and the `where` method.

In [None]:
YouTubeVideo('nUZOdd-w8-s')

Using the NBA salaries data in the cell below, filter out all players whose salary was less than \\$10,000,000.
How many players had a salary of \\$10,000,000 or more during the 2015-2016 season?

In [None]:
# load the nba player salary data
nba_salaries = Table.read_table('https://www.inferentialthinking.com/data/nba_salaries.csv')

# subset the table
nba_salaries_over_10_mil = ...

# count the number of players paid $10,000,000 or more
...

<details>
    <summary>Solution</summary>
    # subset the table (salaries are recorded in millions of dollars) <br>
    nba_salaries_over_10_mil = nba_salaries.where(nba_salaries.column(3) >= 10) <br>
    <br>
    # count the number of players paid \$10,000,000 or more <br>
    nba_salaries_over_10_mil.num_rows <br>
</details>
<br>

<a id='section3'></a>
## 3. Check for Understanding

**A. True or False. The first argument of the** `apply` **method is the column of the table to which the specified function
will be applied.**

<details>
    <summary>Solution</summary>
    False. The first argument is the function to apply. The column that the function is applied to is provided
    as the second argument.
</details>
<br>

**B. Which table method does the following definition correspond to?**
"Return a new Table with selected rows taken by index."

<details>
    <summary>Solution</summary>
    This definition corresponds to the "take" method.
</details>
<br>

**C. True or False. The** `sort` **method returns a table of rows sorted in decreasing order of the values
in the selected column.**

<details>
    <summary>Solution</summary>
    False. Values are sorted in increasing order by default.
</details>
<br>

**D. Attempt the question presented in the following video. Use the code cell that is provided directly below the video.**

In [None]:
YouTubeVideo('79W7XQHnWxo')

In [None]:
# create the tables
drinks = Table(['Drink', 'Cafe', 'Price']).with_rows([
    ['Milk Tea', 'Tea One', 4],
    ['Espresso', 'Nefeli', 2],
    ['Coffee', 'Nefeli', 3],
    ['Espresso', "Abe's", 2]
])

discounts = Table().with_columns(
    'Coupon % Off', make_array(5, 50, 25),
    'Location', make_array('Tea One', 'Nefeli', 'Tea One')
)

# answer below:
...

**E. Spring 2016 Midterm, Question 2b. Use the code cells and data in the cells below the video for your scratchwork.**

In [None]:
YouTubeVideo('4ljo9LqtmYI')

In [None]:
# prepare the table
trips = Table.read_table('https://www.inferentialthinking.com/data/trip.csv') \
    .where('Duration', are.below(1800)) \
    .select(3, 6, 1) \
    .relabeled(['Start Station', 'End Station', 'Duration'], ['Start', 'End', 'Duration'])

In [None]:
# scratchwork here
...