# STOR 120 - Lab 2: Table Operations

Welcome to Lab 2!  In this lab you will: 
* import a module and practice table operations
* create tables and practice analyzing them with your knowledge of table operations.
* see how to represent and manipulate another fundamental type of data: text.

Recommended Reading:
 * [Introduction to tables](https://www.inferentialthinking.com/chapters/03/4/Introduction_to_Tables)

First run the cell below.

In [137]:
# Just run this cell

import numpy as np
from datascience import *

# 1. Importing code

![imports](https://external-preview.redd.it/ZVPjiFo_Ubl4JeiU63SaTjdIoq5zveSnNZimKpgn2I8.png?auto=webp&s=bf32c94b630befa121075c1ae99b2599af6dedc5) 

[source](https://www.reddit.com/r/ProgrammerHumor/comments/cgtk7s/theres_no_need_to_reinvent_the_wheel_oc/)

Most programming involves work that is very similar to work that has been done before.  Since writing code is time-consuming, it's good to rely on others' published code when you can.  Rather than copy-pasting, Python allows us to **import modules**. A module is a file with Python code that has defined variables and functions. By importing a module, we are able to use its code in our own notebook.

Python includes many useful modules that are just an `import` away.  We'll look at the `math` module as a first example. The `math` module is extremely useful in computing mathematical expressions in Python. 

Suppose we want to very accurately compute the area of a circle with a radius of 5 meters.  For that, we need the constant $\pi$, which is roughly 3.14.  Conveniently, the `math` module has `pi` defined for us:

In [139]:
# Just run this cell

import math
radius = 5
area_of_circle = radius**2 * math.pi
area_of_circle

78.53981633974483

In the code above, the line `import math` imports the math module. This statement creates a module and then assigns the name `math` to that module. We are now able to access any variables or functions defined within `math` by typing the name of the module followed by a dot, then followed by the name of the variable or function we want.

    <module name>.<name>

**Question 1.1.** The module `math` also provides the name `e` for the base of the natural logarithm, which is roughly 2.71. Compute $\sqrt{e^{2\pi}+30\pi}$, giving it the name `near_twentyfive`.

*Remember: You can access `pi` from the `math` module as well!*

<!--
BEGIN QUESTION
name: q11
-->

In [279]:
import math
near_twentyfive = math.sqrt(math.e**(2*math.pi))
near_twentyfive

23.14069263277926

## 1.1. Accessing functions

In the question above, you accessed variables within the `math` module. 

**Modules** also define **functions**.  For example, `math` provides the name `log` for the natural log function. Having imported `math` already, we can write `math.log(3)` to compute the natural log of 3. 

For your reference, below are some more examples of functions from the `math` module.

Notice how different functions take in different numbers of arguments. Often, the [documentation](https://docs.python.org/3/library/math.html) of the module will provide information on how many arguments are required for each function.

*Hint: If you press `shift+tab` while next to the function call, the documentation for that function will appear*

In [145]:
# Calculating logarithms (the logarithm of 8 in base 2).
# The result is 3 because 2 to the power of 3 is 8.
math.log(8, 2)

3.0

In [146]:
# Calculating square roots.
math.sqrt(5)

2.23606797749979

There are various ways to import and access code from outside sources. The method we used above — `import <module_name>` — imports the entire module and requires that we use `<module_name>.<name>` to access its code. 

We can also import a specific constant or function instead of the entire module. Notice that you don't have to use the module name beforehand to reference that particular value. However, you do have to be careful about reassigning the names of the constants or functions to other values!

In [148]:
# Importing just cos and pi from math.
# We don't have to use `math.` in front of cos or pi
import math
from math import cos, pi
print(cos(pi))

# We do have to use it in front of other functions from math, though
math.log(pi)

-1.0


1.1447298858494002

Or we can import every function and value from the entire module.

In [150]:
# Lastly, we can import everything from math using the *
# Once again, we don't have to use 'math.' beforehand 
from math import *
log(pi)

1.1447298858494002

Don't worry too much about which type of import to use. It's often a coding style choice left up to each programmer. In this course, you'll always import the necessary modules when you run the setup cell (like the first code cell in this lab).

Let's move on to practicing some of the table operations you've learned in lecture!

# 2. Table operations

The table `farmers_markets.csv` contains data on farmers' markets in the United States  (data collected by the USDA).  Each row represents one such market.

Run the next cell to load the `farmers_markets` table.

In [153]:
# Just run this cell
# The farmers market csv file must be saved in the same directory as this lab!

farmers_markets = Table.read_table('farmers_markets.csv')

Let's examine our table to see what data it contains.

**Question 2.1** Use the method `show` to display the first 5 rows of `farmers_markets`. 

*Note:* The terms "method" and "function" are technically not the same thing, but for the purposes of this course, we will use them interchangeably.

**Hint:** `tbl.show(3)` will show the first 3 rows of `tbl`. Additionally, make sure not to call `.show()` without an argument, as this will crash your kernel!


In [155]:
farmers_markets.show(5)

FMID,MarketName,street,city,County,State,zip,x,y,Website,Facebook,Twitter,Youtube,OtherMedia,Organic,Tofu,Bakedgoods,Cheese,Crafts,Flowers,Eggs,Seafood,Herbs,Vegetables,Honey,Jams,Maple,Meat,Nursery,Nuts,Plants,Poultry,Prepared,Soap,Trees,Wine,Coffee,Beans,Fruits,Grains,Juices,Mushrooms,PetFood,WildHarvested,updateTime,Location,Credit,WIC,WICcash,SFMNP,SNAP,Season1Date,Season1Time,Season2Date,Season2Time,Season3Date,Season3Time,Season4Date,Season4Time
1012063,Caledonia Farmers Market Association - Danville,,Danville,Caledonia,Vermont,5828,-72.1403,44.411,https://sites.google.com/site/caledoniafarmersmarket/,https://www.facebook.com/Danville.VT.Farmers.Market/,,,,Y,N,Y,Y,Y,Y,Y,N,Y,Y,Y,Y,Y,Y,N,N,Y,Y,Y,Y,Y,N,Y,Y,Y,N,Y,N,Y,N,6/28/2016 12:10:09 PM,,Y,Y,N,Y,N,06/08/2016 to 10/12/2016,Wed: 9:00 AM-1:00 PM;,,,,,,
1011871,Stearns Homestead Farmers' Market,6975 Ridge Road,Parma,Cuyahoga,Ohio,44130,-81.7286,41.3751,http://Stearnshomestead.com,,,,,-,N,Y,N,N,Y,Y,N,Y,Y,Y,Y,Y,Y,N,N,Y,N,N,N,N,N,N,N,Y,N,N,N,Y,N,4/9/2016 8:05:17 PM,,Y,Y,N,Y,Y,06/25/2016 to 10/01/2016,Sat: 9:00 AM-1:00 PM;,,,,,,
1011878,100 Mile Market,507 Harrison St,Kalamazoo,Kalamazoo,Michigan,49007,-85.5749,42.296,http://www.pfcmarkets.com,https://www.facebook.com/100MileMarket/?fref=ts,,,https://www.instagram.com/100milemarket/,N,N,Y,Y,N,Y,Y,N,Y,Y,Y,Y,Y,Y,N,N,N,Y,Y,Y,N,Y,N,N,Y,Y,N,N,N,N,4/16/2016 12:37:56 PM,,Y,Y,N,Y,Y,05/04/2016 to 10/12/2016,Wed: 3:00 PM-7:00 PM;,,,,,,
1009364,106 S. Main Street Farmers Market,106 S. Main Street,Six Mile,,South Carolina,29682,-82.8187,34.8042,http://thetownofsixmile.wordpress.com/,,,,,-,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,2013,,Y,N,N,N,N,,,,,,,,
1010691,10th Steet Community Farmers Market,10th Street and Poplar,Lamar,Barton,Missouri,64759,-94.2746,37.4956,,,,,http://agrimissouri.com/mo-grown/grodetail.php?type=mo-g ...,-,N,Y,N,Y,N,Y,N,Y,Y,Y,Y,N,Y,N,N,Y,Y,Y,Y,N,N,N,N,Y,N,N,N,N,N,10/28/2014 9:49:46 AM,,Y,N,N,N,N,04/02/2014 to 11/30/2014,Wed: 3:00 PM-6:00 PM;Sat: 8:00 AM-1:00 PM;,,,,,,


Notice that some of the values in this table are missing, as denoted by "nan." This means either that the value is not available (e.g. if we don’t know the market’s street address) or not applicable (e.g. if the market doesn’t have a street address). You'll also notice that the table has a large number of columns in it!

### `num_columns`

The table property `num_columns` returns the number of columns in a table. (A "property" is just a method that doesn't need to be called by adding parentheses.)

Example call: `<tbl>.num_columns`

**Question 2.2.** Use `num_columns` to find the number of columns in our farmers' markets dataset.

Assign the number of columns to `num_farmers_markets_columns`.

<!--
BEGIN QUESTION
name: q22
-->

In [157]:
num_farmers_markets_columns = farmers_markets.num_columns
print("The table has", num_farmers_markets_columns, "columns in it!")

The table has 59 columns in it!


### `select`

Most of the columns are about particular products -- whether the market sells tofu, pet food, etc.  If we're not interested in that information, it just makes the table difficult to read.  This comes up more than you might think, because people who collect and publish data may not know ahead of time what people will want to do with it.

In such situations, we can use the table method `select` to choose only the columns that we want in a particular table. It takes any number of arguments. Each should be the name of a column in the table. It returns a new table with only those columns in it. The columns are in the order *in which they were listed as arguments*.

For example, the value of `farmers_markets.select("MarketName", "State")` is a table with only the name and the state of each farmers' market in `farmers_markets`.



**Question 2.3.** Use `select` to create a table with only the markets' names, cities, counties, states, zip codes (`zip`), latitude (`x`), and longitude (`y`).  Call that new table `farmers_markets_locations`.

*Hint:* Make sure to be exact when using column names with `select`; double-check capitalization!

<!--
BEGIN QUESTION
name: q23
-->

In [159]:
farmers_markets_locations = farmers_markets.select("MarketName", "city","County","State", "zip", "x", "y")
farmers_markets_locations

MarketName,city,County,State,zip,x,y
Caledonia Farmers Market Association - Danville,Danville,Caledonia,Vermont,5828,-72.1403,44.411
Stearns Homestead Farmers' Market,Parma,Cuyahoga,Ohio,44130,-81.7286,41.3751
100 Mile Market,Kalamazoo,Kalamazoo,Michigan,49007,-85.5749,42.296
106 S. Main Street Farmers Market,Six Mile,,South Carolina,29682,-82.8187,34.8042
10th Steet Community Farmers Market,Lamar,Barton,Missouri,64759,-94.2746,37.4956
112st Madison Avenue,New York,New York,New York,10029,-73.9493,40.7939
12 South Farmers Market,Nashville,Davidson,Tennessee,37204,-86.7907,36.1184
125th Street Fresh Connect Farmers' Market,New York,New York,New York,10027,-73.9482,40.809
12th & Brandywine Urban Farm Market,Wilmington,New Castle,Delaware,19801,-75.5345,39.7421
14&U Farmers' Market,Washington,District of Columbia,District of Columbia,20009,-77.0321,38.917


### `drop`

`drop` serves the same purpose as `select`, but it takes away the columns that you provide rather than the ones that you don't provide. Like `select`, `drop` returns a new table.

**Question 2.4.** Suppose you just didn't want the `FMID` and `updateTime` columns in `farmers_markets`.  Create a table that's a copy of `farmers_markets` but doesn't include those columns.  Call that table `farmers_markets_without_fmid`.

<!--
BEGIN QUESTION
name: q24
-->

In [161]:
farmers_markets_without_fmid = farmers_markets.drop("FMID", "updateTime")
farmers_markets_without_fmid

MarketName,street,city,County,State,zip,x,y,Website,Facebook,Twitter,Youtube,OtherMedia,Organic,Tofu,Bakedgoods,Cheese,Crafts,Flowers,Eggs,Seafood,Herbs,Vegetables,Honey,Jams,Maple,Meat,Nursery,Nuts,Plants,Poultry,Prepared,Soap,Trees,Wine,Coffee,Beans,Fruits,Grains,Juices,Mushrooms,PetFood,WildHarvested,Location,Credit,WIC,WICcash,SFMNP,SNAP,Season1Date,Season1Time,Season2Date,Season2Time,Season3Date,Season3Time,Season4Date,Season4Time
Caledonia Farmers Market Association - Danville,,Danville,Caledonia,Vermont,5828,-72.1403,44.411,https://sites.google.com/site/caledoniafarmersmarket/,https://www.facebook.com/Danville.VT.Farmers.Market/,,,,Y,N,Y,Y,Y,Y,Y,N,Y,Y,Y,Y,Y,Y,N,N,Y,Y,Y,Y,Y,N,Y,Y,Y,N,Y,N,Y,N,,Y,Y,N,Y,N,06/08/2016 to 10/12/2016,Wed: 9:00 AM-1:00 PM;,,,,,,
Stearns Homestead Farmers' Market,6975 Ridge Road,Parma,Cuyahoga,Ohio,44130,-81.7286,41.3751,http://Stearnshomestead.com,,,,,-,N,Y,N,N,Y,Y,N,Y,Y,Y,Y,Y,Y,N,N,Y,N,N,N,N,N,N,N,Y,N,N,N,Y,N,,Y,Y,N,Y,Y,06/25/2016 to 10/01/2016,Sat: 9:00 AM-1:00 PM;,,,,,,
100 Mile Market,507 Harrison St,Kalamazoo,Kalamazoo,Michigan,49007,-85.5749,42.296,http://www.pfcmarkets.com,https://www.facebook.com/100MileMarket/?fref=ts,,,https://www.instagram.com/100milemarket/,N,N,Y,Y,N,Y,Y,N,Y,Y,Y,Y,Y,Y,N,N,N,Y,Y,Y,N,Y,N,N,Y,Y,N,N,N,N,,Y,Y,N,Y,Y,05/04/2016 to 10/12/2016,Wed: 3:00 PM-7:00 PM;,,,,,,
106 S. Main Street Farmers Market,106 S. Main Street,Six Mile,,South Carolina,29682,-82.8187,34.8042,http://thetownofsixmile.wordpress.com/,,,,,-,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,,Y,N,N,N,N,,,,,,,,
10th Steet Community Farmers Market,10th Street and Poplar,Lamar,Barton,Missouri,64759,-94.2746,37.4956,,,,,http://agrimissouri.com/mo-grown/grodetail.php?type=mo-g ...,-,N,Y,N,Y,N,Y,N,Y,Y,Y,Y,N,Y,N,N,Y,Y,Y,Y,N,N,N,N,Y,N,N,N,N,N,,Y,N,N,N,N,04/02/2014 to 11/30/2014,Wed: 3:00 PM-6:00 PM;Sat: 8:00 AM-1:00 PM;,,,,,,
112st Madison Avenue,112th Madison Avenue,New York,New York,New York,10029,-73.9493,40.7939,,,,,,-,N,Y,N,Y,Y,N,N,Y,Y,Y,Y,N,N,N,Y,N,N,Y,Y,N,N,N,N,N,N,N,N,N,N,Private business parking lot,N,N,Y,Y,N,July to November,Tue:8:00 am - 5:00 pm;Sat:8:00 am - 8:00 pm;,,,,,,
12 South Farmers Market,3000 Granny White Pike,Nashville,Davidson,Tennessee,37204,-86.7907,36.1184,http://www.12southfarmersmarket.com,12_South_Farmers_Market,@12southfrmsmkt,,@12southfrmsmkt,Y,N,Y,Y,N,Y,Y,N,Y,Y,Y,Y,Y,Y,N,N,N,Y,Y,Y,N,N,Y,N,Y,N,Y,Y,Y,N,,Y,N,N,N,Y,05/05/2015 to 10/27/2015,Tue: 3:30 PM-6:30 PM;,,,,,,
125th Street Fresh Connect Farmers' Market,"163 West 125th Street and Adam Clayton Powell, Jr. Blvd.",New York,New York,New York,10027,-73.9482,40.809,http://www.125thStreetFarmersMarket.com,https://www.facebook.com/125thStreetFarmersMarket,https://twitter.com/FarmMarket125th,,Instagram--> 125thStreetFarmersMarket,Y,N,Y,Y,Y,Y,Y,N,Y,Y,Y,Y,Y,Y,N,Y,N,Y,Y,Y,N,Y,Y,N,Y,N,Y,N,N,N,Federal/State government building grounds,Y,Y,N,Y,Y,06/10/2014 to 11/25/2014,Tue: 10:00 AM-7:00 PM;,,,,,,
12th & Brandywine Urban Farm Market,12th & Brandywine Streets,Wilmington,New Castle,Delaware,19801,-75.5345,39.7421,,https://www.facebook.com/pages/12th-Brandywine-Urban-Far ...,,,https://www.facebook.com/delawareurbanfarmcoalition,N,N,N,N,N,N,N,N,Y,Y,N,N,N,N,N,N,N,N,N,N,N,N,N,N,Y,N,N,N,N,N,"On a farm from: a barn, a greenhouse, a tent, a stand, etc",N,N,N,N,Y,05/16/2014 to 10/17/2014,Fri: 8:00 AM-11:00 AM;,,,,,,
14&U Farmers' Market,1400 U Street NW,Washington,District of Columbia,District of Columbia,20009,-77.0321,38.917,,https://www.facebook.com/14UFarmersMarket,https://twitter.com/14UFarmersMkt,,,Y,N,Y,Y,N,Y,Y,N,Y,Y,Y,Y,N,Y,N,Y,Y,Y,N,N,N,N,N,Y,Y,Y,Y,N,N,N,Other,Y,Y,Y,Y,Y,05/03/2014 to 11/22/2014,Sat: 9:00 AM-1:00 PM;,,,,,,


Now, suppose we want to answer some questions about farmers' markets in the US. For example, which market(s) are the farthest north (given by the largest values in the `y` column)? 

To answer this, we'll sort `farmers_markets_locations` by latitude.

In [163]:
farmers_markets_locations.sort('y')

MarketName,city,County,State,zip,x,y
Anne Heyliger Vegetable Market,Saint Croix,Frederiksted,Virgin Islands,,-64.8799,17.7099
La Reine Farmers Market,Saint Croix,Christiansted,Virgin Islands,,-64.7789,17.7322
"Christian ""Shan"" Hendricks Vegetable Market",Saint Croix,Christiansted,Virgin Islands,,-64.7043,17.7449
El Maercado Familiar,Arroyo zona urbana,Arroyo,Puerto Rico,714.0,-66.0617,17.9686
El Mercado Familiar,Santa Isabel,Santa Isabel,Puerto Rico,757.0,-66.4184,17.9723
El Mercado Familiar,Salinas zona urbana,Salinas,Puerto Rico,751.0,-66.2954,17.9782
El Mercado Familiar,Guayama,Guayama,Puerto Rico,,-66.0938,17.9792
El Mercado Familiar,Guanica,Guanica,Puerto Rico,653.0,-66.9205,17.9854
El Mercado Familiar,Patillas,Patillas,Puerto Rico,723.0,-66.0135,18.0069
El Mercado Familiar,Peñuelas,Penuelas,Puerto Rico,624.0,-66.3572,18.0089


Oops, that didn't answer our question because we sorted from smallest to largest latitude. To look at the largest latitudes, we'll have to sort in reverse order.

In [165]:
farmers_markets_locations.sort('y', descending=True)

MarketName,city,County,State,zip,x,y
Tanana Valley Farmers Market,Fairbanks,Fairbanks North Star,Alaska,99709.0,-147.781,64.8628
Ester Community Market,Ester,Fairbanks North Star,Alaska,99725.0,-148.01,64.8459
Fairbanks Downtown Market,Fairbanks,Fairbanks North Star,Alaska,99701.0,-147.72,64.8444
Nenana Open Air Market,Nenana,,Alaska,99704.0,-149.096,64.5566
Highway's End Farmers' Market,Delta Junction,Fairbanks North Star,Alaska,99737.0,-145.733,64.0385
MountainTraders,Talkeetna,Matanuska-Susitna,Alaska,99676.0,-150.118,62.3231
Talkeetna Farmers Market,Talkeetna,Matanuska-Susitna,Alaska,99676.0,-150.118,62.3228
Denali Farmers Market,Anchorage,,Alaska,,-150.234,62.3163
Kenny Lake Harvest II,Valdez,,Alaska,99686.0,-145.476,62.1079
Copper Valley Community Market,Copper Valley,,Alaska,99737.0,-145.444,62.0879


(The `descending=True` bit is called an *optional argument*. It has a default value of `False`, so when you explicitly tell the function `descending=True`, then the function will sort in descending order.)

### `sort`

Some details about sort:

1. The first argument to `sort` is the name of a column to sort by.
2. If the column has text in it, `sort` will sort alphabetically; if the column has numbers, it will sort numerically.
3. The value of `farmers_markets_locations.sort("y")` is a *copy* of `farmers_markets_locations`; the `farmers_markets_locations` table doesn't get modified. For example, if we called `farmers_markets_locations.sort("y")`, then running `farmers_markets_locations` by itself would still return the unsorted table.
4. Rows always stick together when a table is sorted.  It wouldn't make sense to sort just one column and leave the other columns alone.  For example, in this case, if we sorted just the `y` column, the farmers' markets would all end up with the wrong latitudes.

**Question 2.5.** Create a version of `farmers_markets_locations` that's sorted by **longitude (`x`)**, with the largest longitudes first.  Call it `farmers_markets_locations_by_longitude`.

<!--
BEGIN QUESTION
name: q25
-->

In [167]:
farmers_markets_locations_by_longitude = farmers_markets_locations.sort("x", descending=True)
farmers_markets_locations_by_longitude

MarketName,city,County,State,zip,x,y
"Christian ""Shan"" Hendricks Vegetable Market",Saint Croix,Christiansted,Virgin Islands,,-64.7043,17.7449
La Reine Farmers Market,Saint Croix,Christiansted,Virgin Islands,,-64.7789,17.7322
Anne Heyliger Vegetable Market,Saint Croix,Frederiksted,Virgin Islands,,-64.8799,17.7099
Rothschild Francis Vegetable Market,St. Thomas,St. Thomas,Virgin Islands,,-64.9326,18.3428
Feria Agrícola de Luquillo,Luquillo,Luquillo,Puerto Rico,773.0,-65.7207,18.3782
El Mercado Familiar,San Lorenzo,San Lorenzo,Puerto Rico,754.0,-65.9674,18.1871
El Mercado Familiar,Gurabo,Gurabo,Puerto Rico,778.0,-65.9786,18.2526
El Mercado Familiar,Patillas,Patillas,Puerto Rico,723.0,-66.0135,18.0069
El Mercado Familiar,Caguas zona urbana,Caguas,Puerto Rico,725.0,-66.039,18.2324
El Maercado Familiar,Arroyo zona urbana,Arroyo,Puerto Rico,714.0,-66.0617,17.9686


Now let's say we want a table of all farmers' markets in North Carolina. Sorting won't help us much here because North Carolina is closer to the middle of the dataset.

Instead, we use the table method `where`.

In [169]:
nc_farmers_markets = farmers_markets_locations.where('State', are.equal_to('North Carolina'))
nc_farmers_markets

MarketName,city,County,State,zip,x,y
Afton Village Farmers Market,Concord,Cabarrus,North Carolina,28027.0,-80.6702,35.414
Alamance County Farmers Market,Burlington,Alamance,North Carolina,27215.0,-79.4357,36.0943
Alexander County Farmers Market,Taylorsville,Alexander,North Carolina,28681.0,-81.1781,35.9197
Alleghany Farmers Market,Sparta,Alleghany,North Carolina,28675.0,-81.1226,36.503
Andrews Farmers Market,Andrews,Cherokee,North Carolina,,-83.8232,35.2027
Anson County Farmers Market,Wadesboro,Anson,North Carolina,28170.0,-80.0526,34.9408
Apex Farmers Market,Apex,Wake,North Carolina,27502.0,-78.8499,35.732
Ashboro Downtown Farmers Market,Asheboro,Randolph,North Carolina,28801.0,-79.8175,35.7049
Ashe County Farmers Market,West Jefferson,Ashe,North Carolina,28694.0,-81.4935,36.4025
Asheville City Market,Asheville,Buncombe,North Carolina,28801.0,-82.5489,35.5935


### `where`

Now let's dive into the details a bit more.  `where` takes 2 arguments:

1. The name of a column.  `where` finds rows where that column's values meet some criterion.
2. A predicate that describes the criterion that the column needs to meet.

The predicate in the example above called the function `are.equal_to` with the value we wanted, 'North Carolina'.  We'll see other predicates soon. 

`where` returns a table that's a copy of the original table, but **with only the rows that meet the given predicate**.

**Question 2.6.** Use `nc_farmers_markets` to create a table called `oc_markets` containing farmers' markets in Orange County, North Carolina
<!--
BEGIN QUESTION
name: q36
-->

In [213]:
oc_markets = nc_farmers_markets.where('County', are.equal_to('Orange')).where('State', are.equal_to('North Carolina'))
oc_markets

MarketName,city,County,State,zip,x,y
Carrboro Farmers' Market,Carrboro,Orange,North Carolina,27510,-79.0776,35.9109
Eno River Farmers Market in Downtown Hillsborough,Hillsborough,Orange,North Carolina,27278,-79.0978,36.074
Hillsborough Farmers Market,Hillsborough,Orange,North Carolina,27278,-79.0807,36.0555
Southern Village Farmers Market,Chapel Hill,Orange,North Carolina,27517,-79.0659,35.881
The Chapel Hill Farmers' Market,Chapel Hill,Orange,North Carolina,27514,-79.0294,35.9275


Recognize any of them?

So far we've only been using `where` with the predicate that requires finding the values in a column to be *exactly* equal to a certain value. However, there are many other predicates. Here are a few:

|Predicate|Example|Result|
|-|-|-|
|`are.equal_to`|`are.equal_to(50)`|Find rows with values equal to 50|
|`are.not_equal_to`|`are.not_equal_to(50)`|Find rows with values not equal to 50|
|`are.above`|`are.above(50)`|Find rows with values above (and not equal to) 50|
|`are.above_or_equal_to`|`are.above_or_equal_to(50)`|Find rows with values above 50 or equal to 50|
|`are.below`|`are.below(50)`|Find rows with values below 50|
|`are.between`|`are.between(2, 10)`|Find rows with values above or equal to 2 and below 10|

## 3. Analyzing a dataset

Now that you're familiar with table operations, let’s answer more questions using this data!

Often, we want to perform multiple operations - sorting, filtering, or others - in order to turn a table we have into something more useful. You can do these operations one by one, e.g.

```
first_step = original_tbl.where(“col1”, are.equal_to(12))
second_step = first_step.sort(‘col2’, descending=True)
```

However, since the value of the expression `original_tbl.where(“col1”, are.equal_to(12))` is itself a table, you can just call a table method on it:

```
original_tbl.where(“col1”, are.equal_to(12)).sort(‘col2’, descending=True)
```
You should organize your work in the way that makes the most sense to you, using informative names for any intermediate tables you create. 

**Question 3.1.** 

Suppose that you would like to visit a farmers' market in Orange County, NC that sells wine, eggs, coffee, and takes credit cards (`Credit`). Use the `farmers_markets` table to produce a new table `shopping_trip` that can help you to succinctly decide where to go.

In [229]:
shopping_trip = farmers_markets.where('County', are.equal_to('Orange')).where('State', are.equal_to('North Carolina')).select("MarketName", "city","County","State", "Wine", "Eggs", "Coffee", "Credit")
shopping_trip

MarketName,city,County,State,Wine,Eggs,Coffee,Credit
Carrboro Farmers' Market,Carrboro,Orange,North Carolina,Y,Y,Y,Y
Eno River Farmers Market in Downtown Hillsborough,Hillsborough,Orange,North Carolina,Y,Y,Y,Y
Hillsborough Farmers Market,Hillsborough,Orange,North Carolina,N,Y,N,Y
Southern Village Farmers Market,Chapel Hill,Orange,North Carolina,N,N,N,N
The Chapel Hill Farmers' Market,Chapel Hill,Orange,North Carolina,Y,Y,Y,Y


**Question 3.2.** 

Which farmers market (or markets) could you choose for this shopping trip based on the table that you constructed in the previous question?

Carrboro Farmers' Market, The Chapel Hill Farmers' Market, and Eno River Farmers Market in Downtown Hillsborough

**Question 3.3.** Use `num_rows` (and arithmetic) to find the *proportion* of farmers' markets in North Carolina that sell wine (`Wine`). Assign `NC_wine_prop` to this proportion

How does this compare the *proportion of* of all farmers' markets in the `farmers_market` table that sell wine? Assign `US_wine` to this proportion

<!--
BEGIN QUESTION
name: q33
-->

In [239]:
NC_wine_prop = farmers_markets.where('State', are.equal_to('North Carolina')).num_rows
US_wine_prop = farmers_markets.num_rows

print("Proportion of North Carolina farmers' markets selling wine:", NC_wine_prop)
print("Proportion of US farmers' markets selling wine:", US_wine_prop)

Proportion of North Carolina farmers' markets selling wine: 252
Proportion of US farmers' markets selling wine: 8546


**Question 3.4.**

Find (guess) a state where the proportion of farmers' markets selling wine is more than double the national average (`US_wine_prop`) and determine this proportion for your chosen state.

In [241]:
Chosen_State = 'California'

In [251]:
Chosen_State_wine_prop = ((farmers_markets.where('State', are.equal_to('California')).num_rows) / (farmers_markets.num_rows)) * 100
print("Proportion of", Chosen_State, "farmers' markets selling wine:", Chosen_State_wine_prop,"%")

Proportion of California farmers' markets selling wine: 8.83454247601217 %


# 4. Text
Programming doesn't just concern numbers. Text is one of the most common data types used in programs. 

Text is represented by a **string value** in Python. The word "string" is a programming term for a sequence of characters. A string might contain a single character, a word, a sentence, or a whole book.

To distinguish text data from actual code, we demarcate strings by putting quotation marks around them. Single quotes (`'`) and double quotes (`"`) are both valid, but the types of opening and closing quotation marks must match. The contents can be any sequence of characters, including numbers and symbols. 

We've seen strings before in `print` statements.  Below, two different strings are passed as arguments to the `print` function.

In [189]:
print("I <3", 'Data Science')

I <3 Data Science


Just as names can be given to numbers, names can be given to string values.  The names and strings aren't required to be similar in any way. Any name can be assigned to any string.

In [191]:
one = 'two'
plus = '*'
print(one, plus, one)

two * two


**Question 4.1.** Yuri Gagarin was the first person to travel through outer space.  When he emerged from his capsule upon landing on Earth, he [reportedly](https://en.wikiquote.org/wiki/Yuri_Gagarin) had the following conversation with a woman and girl who saw the landing:

    The woman asked: "Can it be that you have come from outer space?"
    Gagarin replied: "As a matter of fact, I have!"

The cell below contains unfinished code.  Fill in the `...`s so that it prints out this conversation *exactly* as it appears above.

<!--
BEGIN QUESTION
name: q4_1
-->

In [254]:
woman_asking = "The woman asked:"
woman_quote = '"Can it be that you have come from outer space?"'
gagarin_reply = 'Gagarin replied:'
gagarin_quote = '"As a matter of fact, I have!"'

print(woman_asking, woman_quote)
print(gagarin_reply, gagarin_quote)

The woman asked: "Can it be that you have come from outer space?"
Gagarin replied: "As a matter of fact, I have!"


## 4.1. String Methods

Strings can be transformed using **methods**. Recall that methods and functions are not technically the same thing, but we'll be using them interchangeably for the purposes of this course.

Here's a sketch of how to call methods on a string:

    <expression that evaluates to a string>.<method name>(<argument>, <argument>, ...)
    
One example of a string method is `replace`, which replaces all instances of some part of the original string (or a *substring*) with a new string. 

    <original string>.replace(<old substring>, <new substring>)
    
`replace` returns (evaluates to) a new string, leaving the original string unchanged.
    
Try to predict the output of this example, then run the cell!

In [256]:
# Replace one letter
hello = 'Hello'
print(hello.replace('o', 'a'), hello)

Hella Hello


You can call functions on the results of other functions.  For example, `max(abs(-5), abs(3))` evaluates to 5.  Similarly, you can call methods on the results of other method or function calls.

You may have already noticed one difference between functions and methods - a function like `max` does not require a `.` before it's called, but a string method like `replace` does.

In [198]:
# Calling replace on the output of another call to replace
'train'.replace('t', 'ing').replace('in', 'de')

'degrade'

Here's a picture of how Python evaluates a "chained" method call like that:

<img src="lab02-chaining_method_calls.jpg"/>

**Question 4.1.1.** Use `replace` to transform the string `'clarinetists'` into `'statistics'`. Assign your result to `new_word`.

<!--
BEGIN QUESTION
name: q411
-->

In [260]:
new_word = 'clarinetists'.replace("clarine", "sta").replace("ts", "tics")
new_word

'statistics'

There are many more string methods in Python, but most programmers don't memorize their names or how to use them.  In the "real world," people usually just search the internet for documentation and examples. A complete [list of string methods](https://docs.python.org/3/library/stdtypes.html#string-methods) appears in the Python language documentation. [Stack Overflow](http://stackoverflow.com) has a huge database of answered questions that often demonstrate how to use these methods to achieve various ends.

## 4.2. Converting to and from Strings

Strings and numbers are different *types* of values, even when a string contains the digits of a number. For example, evaluating the following cell causes an error because an integer cannot be added to a string.

In [263]:
8 + "8"

TypeError: unsupported operand type(s) for +: 'int' and 'str'

However, there are built-in functions to convert numbers to strings and strings to numbers. Some of these built-in functions have restrictions on the type of argument they take:

|Function |Description|
|-|-|
|`int`|Converts a string of digits or a float to an integer ("int") value|
|`float`|Converts a string of digits (perhaps with a decimal point) or an int to a decimal ("float") value|
|`str`|Converts any value to a string|

Try to predict what data type and value `example` evaluates to, then run the cell.

In [266]:
example = 8 + int("10") + float("8")

print(example)
print("This example returned a " + str(type(example)) + "!")

26.0
This example returned a <class 'float'>!


Suppose you're writing a program that looks for dates in a text, and you want your program to find the amount of time that elapsed between two years it has identified.  It doesn't make sense to subtract two texts, but you can first convert the text containing the years into numbers.

**Question 4.2.1.** Finish the code below to compute the number of years that elapsed between `one_year` and `another_year`.  Don't just write the numbers `1979` and `2022` (or `43`); use a conversion function to turn the given text data into numbers.

<!--
BEGIN QUESTION
name: q421
-->

In [272]:
# Some text data:
one_year = "1979"
another_year = "2022"

# Complete the next line.  Note that we can't just write:
#   another_year - one_year
# If you don't see why, try seeing what happens when you
# write that here.
difference = int(another_year) - int(one_year)
difference

43

## 4.3. Passing strings to functions

String values, like numbers, can be arguments to functions and can be returned by functions. 

The function `len` (derived from the word "length") takes a single string as its argument and returns the number of characters (including spaces) in the string.

Note that it doesn't count *words*. `len("one small step for man")` evaluates to 22, not 5.

**Question 4.3.1.**  Use `len` to find the number of characters in the long string in the next cell.  Characters include things like spaces and punctuation. Assign `speech_length` to that number.

(The string is the text of North Carolina Gov. Roy Cooper's 2021 inauguration speech from the [AP News](https://apnews.com/article/health-north-carolina-inaugurations-coronavirus-pandemic-0b2b6cc2c722a00971bc5f9e4c82a948of.asp).)  

<!--
BEGIN QUESTION
name: q431
-->

In [276]:
speech = "Well, Good morning everybody. I’m Roy Cooper. Today, I’m honored to stand before you to accept both the oath of office of Governor and the duties that come along with it. I’m thankful for my family — our First Lady Kristin Cooper and my three daughters — who inspire me every day. Transitions are a time for reflection, and a time for looking forward. My first term in this office was filled with triumphs, but also trials. First, the triumphs. Historic progress to make our state more inclusive and our environment cleaner. Record jobs announcements in rural and urban parts of our state that provided rewarding work to our people. Unrelenting efforts to make health care more accessible and public schools stronger. And as for the trials — the natural disasters. The overdue reckoning on racial justice. An unprecedented global pandemic. The earthquakes – those that shook the ground and those that shook the very foundation of our democracy. We’ve had our share of tough days. But we are North Carolinians. And in our state, difficulties don’t define us. What defines us is our strength, our resilience, our readiness to succeed at what comes next. When the pandemic forced classrooms to close in March, schools and volunteers made sure our kids got fed at home. When weary health care workers needed a boost, communities sent meals and care packages. When personal protective equipment ran short, North Carolina manufacturing companies pivoted to produce face shields, gowns, masks and more. And let’s not overlook the stories of neighbors helping neighbors. Grandchildren talking and singing through windows to their grandparents in nursing homes. A painted rainbow taped up in the hallway. Snacks and signs of encouragement left for delivery workers facing long hours. An overworked nurse getting a COVID-19 vaccine and telling everyone the biggest side effect is joy. Our state deserves a collective pat on the back. And as your governor, I say thank you. And before looking ahead, it’s worth looking back a hundred years where North Carolina was in 1920. The state had lost nearly 14,000 people in the Spanish flu pandemic. And in just a few years, North Carolina roared back. New manufacturing jobs paid reliable wages for the first time to thousands of North Carolinians. With more money in their pockets, people were able to afford to buy cars. And that created the challenge of needing roads for those cars to drive on. So, North Carolina responded and became known as the Good Roads state. Those roads got people to work, but they also enabled them to vacation and enjoy the natural beauty of our state. But now a century later, that cycle of challenge-and-response confronts North Carolina again. We are living it. We can see it. And we can solve it. As your governor, I commit to focus on our most important challenges. The challenge of emerging from this pandemic smarter and stronger than ever. The challenge of educating our people and ensuring that every North Carolinian gets health care. The challenge of overcoming disinformation and lies and recommitting to the truth. We can respect our disagreements, but we must cherish our democracy. And the challenge of forming a more perfect North Carolina, where every person has opportunity and access to the liberty that they deserve and our laws promise. As we enter 2021, we carry the imprint of our people’s frustration and loss as well as our determination and resilience. I hold close the memories of the suffering and the heroic North Carolinians. This new year and this new term as Governor is more than just turning the page of a calendar. The lessons we’ve all learned must usher in a new era. An era where we can acknowledge and work around our differences while refusing to sacrifice truth and facts at the altar of ideology. Where the dangerous events that took place at our nation’s Capitol can never be justified. So let’s reach together – to find ways all North Carolinians can afford to see a doctor. To get a quality education and a good paying job. To reform our systems that hurt people of color and to live and work in an economy that leaves no one behind, no matter who they are or where they live. Hey, let’s cast aside notions of red counties or blue counties and recognize that these are artificial divisions. Let’s place integrity at the forefront. We are all North Carolinians. These times of triumph and trial have shown us that we are more connected than we ever imagined. And one thing is clear, just as we did one hundred years ago — North Carolina is ready to roar again. And we will do it together. As the Bible tells us in the Book of Ecclesiastes, “Two are better than one, because they have good reward for their toil. For if they fall, one will lift up his fellow.” North Carolinians have shown we know how to lift one another up. I’m truly humbled by the trust that you, the people of North Carolina have placed in me to serve again as your Governor. I have faith in you, and thank you for putting your faith in me. Together, may we continue to be strong, resilient and ready. God Bless North Carolina and may God bless all of you."
speech_length = len(speech)
speech_length

5134