<div style="width: 38.5%;">
    <p><strong>City College of San Francisco</strong></p>
    <hr>
    <p>MATH 108 - Foundations of Data Science</p>
</div>

# Lecture 06: Tables

Associated Textbook Sections: [3.4](https://inferentialthinking.com/chapters/03/4/Introduction_to_Tables.html)

---

## Overview

* [Tables](#Tables)
* [Attributes and Properties](#Attributes-and-Properties)
* [Some Table Methods](#Some-Table-Methods)

---

## Set Up the Notebook

In [None]:
from datascience import *
import numpy as np

---

## Tables

---

### Early Beginnings

<a href="https://academic.oup.com/book/4975/chapter-abstract/147431903" title="Tables and tabular formatting in Sumer, Babylonia, and Assyria, 2500 bce–50 ce"><img src="./Shuruppag_data_table.jpeg" alt="The world’s oldest datable mathematical table, from Shuruppag, c. 2600 BCE.  The first two columns contain identical lengths in descending order from 600 to 60 rods (c. 3600–360 m) and the final column contains the square area of their product" width=40%></a>

Ancient Mesopotamia (modern-day Iraq):
* Clay tablets from Sumer, Babylonia, and Assyria, dating from around 2600-1600 BCE, provide some of the earliest recorded numerical tables
* The tablets demonstrate these cultures' proficiency in recording mathematical and astronomical data

The image above shows the world's oldest datable mathematical table (on record), from the Sumerian city of Shuruppag, c. 2600 BCE. The first two columns contain identical lengths in descending order from 600 to 60 rods (c. 3600-360 m), and the final column contains their product, the area of the corresponding square.

---

### Table Structure

* The `datascience` library contains a data type called a `Table`.
* A `Table` is a sequence of labeled columns
* Each row represents one individual
* Data within a column represents one attribute of the individuals

<img src="./table_structure.png" alt="A table with the columns and rows indicated." width=50%>

---

### Loading Data

* Data analysis usually includes connecting to various data sources
* We focus on loading data from CSV files
    * Comma Separated Values
    * `.csv` extension
* `Table.read_table` loads the contents of a CSV file into the notebook as a `Table`.

---

### Demo: Loading Data

<a href="https://upload.wikimedia.org/wikipedia/commons/b/bb/US_States_by_Total_Area.svg" title="US States by Total Area"><img src="US_States_by_Total_Area.svg" alt="US states by area" width=40%></a>

Import the data in `states_area.csv`, which contains land and water area data sourced from [Wikipedia](https://en.wikipedia.org/wiki/List_of_U.S._states_and_territories_by_area).

In [None]:
states = ...
states

In [None]:
type(states)

---

## Attributes and Properties

A `Table` stores information about itself that we can access with dot notation, without parentheses. For example:
* `t.labels` - the labels of a table called `t`
* `t.num_columns` - the number of columns in `t`
* `t.num_rows` - the number of rows in `t`
* `t.rows` - a collection of all the rows in `t`

---

### Demo: Attributes and Properties

Use dot notation to access attributes of a table such as `labels`, `num_columns`, and `num_rows`.

In [None]:
...

In [None]:
...

In [None]:
...

In [None]:
...

---

## Some Table Methods

Every `Table` comes with a collection of methods (functions attached to the table). For example:
* `t.show(n)` - displays the first `n` rows of a table called `t`
* `t.select(label)` - constructs a new table with just the specified columns
* `t.drop(label)` - constructs a new table in which the specified columns are omitted
* `t.sort(label)` - constructs a new table with rows sorted by the specified column
* `t.where(label, condition)` - constructs a new table with just the rows that match the condition
    * Initially, the `condition` will be built from predicates such as `are.above` and `are.equal_to`.
* More can be found in the [`datascience` documentation](https://datascience.readthedocs.io/en/master/tables.html)

---

### Demo: `show`

Explore the `show` table method.

In [None]:
...

In [None]:
...

In [None]:
# show(3) does not produce a Table
states_show_3 = ...
type(states_show_3)

---

### Demo: `select`

Use the `select` table method to select columns by column labels and column indexes.

In [None]:
...

In [None]:
...

In [None]:
# A NameError: State is missing its quotes, so Python treats it as an undefined variable name.
# states.select(State, 'Total Area (sq mi)')

In [None]:
# A ValueError: 'Total Area' does not exactly match any column label in the table.
# states.select('State', 'Total Area')

---

### Demo: `drop`

Use the `drop` table method to drop columns by name and by index.

In [None]:
...

In [None]:
...

---

### Demo: `sort`

Use the `sort` table method to sort the data in the table by a certain column.

In [None]:
...

In [None]:
...

In [None]:
...

In [None]:
...

---

### Demo: `where`

Use the `where` table method to filter the data in the table.

In [None]:
# Nothing Filtered
...

In [None]:
...

In [None]:
...

In [None]:
...

<footer>
    <hr>
    <p>Adapted from UC Berkeley DATA 8 course materials.</p>
    <p>This content is offered under a <a href="https://creativecommons.org/licenses/by-nc-sa/4.0/">CC Attribution Non-Commercial Share Alike</a> license.</p>
</footer>