# Tables: Part 1


### Table of Contents

1. <a href='#section 1'>Tables</a>

    a. <a href='#subsection 1a'>Attributes</a>

    b. <a href='#subsection 1b'>Transformations</a><br><br>

2. <a href='#section 2'>Sorting Tables</a>

In [None]:
# dependencies: THIS CELL MUST BE RUN
from datascience import *
import numpy as np
import math
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')
import ipywidgets as widgets
%matplotlib inline

## 1. Tables <a id='section 1'></a>

The last section covered four basic concepts of Python: data, expressions, names, and functions. In this section, we'll see just how much we can do to examine and manipulate our data with only these minimal Python skills.

**Tables** are a fundamental way of organizing and displaying data. Run the next cell to load the data.

In [None]:
ratings = Table.read_table("data/imdb_ratings.csv")
ratings

This table is organized into **columns**, one for each *category* of information collected.

You can also think about the table in terms of its **rows**. Each row represents all the information collected about a particular instance, which can be a person, location, action, or other unit. 


<div class="alert alert-warning">
<b>PRACTICE:</b> What do the rows in this table represent? By default only the first ten rows are shown. Can you see how many rows there are in total?
    </div>


Answer here: 

### 1a. Table Attributes <a id='subsection 1a'></a>

Every table has **attributes** that give information about the table, like the number of rows and the number of columns. Attributes are accessed using dot notation: the table's name, a dot, then the attribute's name. Because an attribute doesn't perform an operation on the table, it takes no parentheses (unlike a call expression).

Attributes you'll use frequently include `num_rows` and `num_columns`, which give the number of rows and columns in the table, respectively.

In [None]:
# get the number of columns
ratings.num_columns

<div class="alert alert-warning">
<b>PRACTICE:</b> Use `num_rows` to get the number of rows in our table.
</div>

In [None]:
# get the number of rows in the table


### 1b. Table Transformation <a id='subsection 1b'></a>

Not all of our columns are relevant to every question we want to ask. We can save computational resources and avoid confusion by *transforming* our table before we start work.

#### Subsetting columns with `select` and `drop`



### `select`
The `select` method returns a table containing only particular columns. `select` is called on a table using dot notation and takes one or more arguments: the names of the columns you want. Note that this does not change the original table. To keep the result, you must assign it to a name.

In [None]:
# make a new table with only selected columns
ratings.select("Votes", "Title")

In [None]:
# No changes made to the original table
ratings

### `drop`

If instead you need all columns except a few, the `drop` method can get rid of specified columns. `drop` works very similarly to `select`: call it on the table using dot notation, then give it the name or names of the columns you want to drop.

In [None]:
# drop a column
ratings.drop("Decade")

<div class="alert alert-warning">
<b>PRACTICE:</b> Pick two columns from our table. Create a new table containing only those two columns two different ways: once using `select` and once using `drop`. 
</div>

In [None]:
# use select


In [None]:
# use drop


## 2. Sorting Tables <a id='section 2'></a>

The last section covered `select` and `drop`. In this section, we'll build on what we have learned and see how to display and sort the data in a table.


### `show`

We can display a specific number of rows from a table using the `show` method. `show` takes one argument: the number of rows you want displayed.

In [None]:
# use show to display 20 rows
ratings.show(20)

<div class="alert alert-warning">
<b>PRACTICE:</b> Create a new table and display 33 different rows, using the `show` operation. 
</div>


### `sort`


Some details about `sort`, from Data 8 Lab 2 (Spring 2020):

1. The first argument to `sort` is the name of a column to sort by.
2. If the column has text in it, `sort` will sort alphabetically; if the column has numbers, it will sort numerically.
3. The `descending=True` bit is called an *optional argument*. It has a default value of `False`, so when you explicitly tell the function `descending=True`, then the function will sort in descending order.
4. The `distinct=True` bit is also an *optional argument*, with a default value of `False`. When you pass `distinct=True`, `sort` omits rows whose value in the sorted column repeats one already kept, so each value in that column appears only once.
5. Rows always stick together when a table is sorted. It wouldn't make sense to sort just one column and leave the other columns alone. For example, in this case, if we sorted just the `Year` column, the movies would all end up with the wrong year.

The general form of a call to `sort` is:



`Table.sort(column_or_label, descending=False, distinct=False)`

In [None]:
# sort a column
ratings.sort("Year")

In [None]:
# Sorted from most recent year to oldest year
ratings.sort("Year", descending=True)

<div class="alert alert-warning">
<b>PRACTICE:</b> Sort the table from highest ranked movie to lowest ranked.
</div>

<div class="alert alert-warning">
<b>PRACTICE:</b> Sort the table by year in ascending order, making sure each year is distinct.
</div>

#### References

- Sections of "Intro to Jupyter", "Table Transformation" adapted from materials by Kelly Chen and Ashley Chien in [UC Berkeley Data Science Modules core resources](http://github.com/ds-modules/core-resources)
- "A Note on Errors" subsection and "error" image adapted from materials by Chris Hench and Mariah Rogers for the Medieval Studies 250: Text Analysis for Graduate Medievalists [data science module](https://github.com/ds-modules/MEDST-250).
- Rocket Fuel data and discussion questions adapted from materials by Zsolt Katona and Brian Bell, BerkeleyHaas Case Series

Author: Keeley Takimoto