In [1]:
import agate

Fetch File

In [2]:
!wget https://raw.githubusercontent.com/dwillis/smpa3193-exercises/master/county_population.csv

--2017-03-07 10:13:31--  https://raw.githubusercontent.com/dwillis/smpa3193-exercises/master/county_population.csv
Resolving raw.githubusercontent.com... 151.101.32.133
Connecting to raw.githubusercontent.com|151.101.32.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 188293 (184K) [text/plain]
Saving to: 'county_population.csv'


2017-03-07 10:13:31 (2.67 MB/s) - 'county_population.csv' saved [188293/188293]



Table Structure

In [3]:
counties = agate.Table.from_csv("county_population.csv")
print(counties)

| column        | data_type |
| ------------- | --------- |
| county        | Text      |
| state         | Text      |
| estimate_2010 | Number    |
| estimate_2011 | Number    |
| estimate_2012 | Number    |
| estimate_2013 | Number    |
| estimate_2014 | Number    |
| estimate_2015 | Number    |



Calculate Percent Change, 2010 to 2015

In [4]:
popchange1016 = counties.compute([
    ('change', agate.PercentChange('estimate_2010', 'estimate_2015'))
])

In [5]:
print(popchange1016[0]['change'])

1.256860592755214050493962678
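The printed value is a percentage, not a fraction: agate.PercentChange computes (after - before) / before * 100 for each row. The snippet below is a minimal sketch of that arithmetic using Decimal. The percent_change helper and the population figures are made up for illustration and do not come from county_population.csv.

```python
from decimal import Decimal


def percent_change(before, after):
    """Return the change from `before` to `after` as a percentage of `before`."""
    return (after - before) / before * 100


# Hypothetical population estimates, not rows from the CSV.
growth = percent_change(Decimal('1000'), Decimal('1250'))
decline = percent_change(Decimal('200'), Decimal('150'))
print(growth, decline)
```

A positive result means the county grew over the period; a negative result means it shrank.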


Rounded Change

In [6]:
from decimal import Decimal

def round_change(row):
    return row['change'].quantize(Decimal('0.1'))

rounded_change = popchange1016.compute([
    ('change_rounded', agate.Formula(agate.Number(), round_change))
])

In [7]:
print(rounded_change[0]['change_rounded'])

1.3
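Decimal.quantize rounds a value to the same number of decimal places as its argument, so Decimal('0.1') keeps one place. Unless a rounding mode is passed, it uses the current decimal context, which defaults to ROUND_HALF_EVEN. The sketch below reuses the value printed earlier and adds a made-up tie value (0.25) to show where the rounding mode matters.

```python
from decimal import Decimal, ROUND_HALF_UP

# The unrounded change printed for the first row of the table.
value = Decimal('1.256860592755214050493962678')
one_place = value.quantize(Decimal('0.1'))
print(one_place)  # 1.3

# A hypothetical exact tie: the two modes disagree only in this case.
tie = Decimal('0.25')
half_even = tie.quantize(Decimal('0.1'))  # default context mode, ROUND_HALF_EVEN
half_up = tie.quantize(Decimal('0.1'), rounding=ROUND_HALF_UP)
print(half_even, half_up)  # 0.2 0.3
```

For population percentages with many decimal places, exact ties are unlikely, so the default mode is usually fine here.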


Sort by change

In [40]:
sorted_counties = rounded_change.order_by('change', reverse=True)
sorted_counties.select(['county', 'state', 'change_rounded']).print_table(max_rows=50)

| county              | state          | change_rounded |
| ------------------- | -------------- | -------------- |
| McKenzie County     | North Dakota   |          100.4 |
| Williams County     | North Dakota   |           56.3 |
| Loving County       | Texas          |           34.9 |
| Mountrail County    | North Dakota   |           33.8 |
| Stark County        | North Dakota   |           32.0 |
| Dunn County         | North Dakota   |           31.2 |
| Sumter County       | Florida        |           26.1 |
| Wasatch County      | Utah           |           23.4 |
| St. Bernard Parish  | Louisiana      |           23.4 |
| Hays County         | Texas          |           23.0 |
| Richland County     | Montana        |           22.7 |
| Andrews County      | Texas          |           22.0 |
| Billings County     | North Dakota   |           21.4 |
| Fort Bend County    | Texas          |           21.3 |
| Long County         | Georgia        |           20.7 |
| Forsyth Coun
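The cell above sorts on the unrounded change column and only displays change_rounded. That matters when two rows share a rounded value, as Wasatch County and St. Bernard Parish do at 23.4: sorting on the rounded column would leave their relative order to the input order. A small plain-Python sketch of the difference, using hypothetical rows rather than agate:

```python
from decimal import Decimal

# Hypothetical rows: (name, unrounded change, rounded change).
rows = [
    ('A County', Decimal('23.37'), Decimal('23.4')),
    ('B County', Decimal('23.44'), Decimal('23.4')),
    ('C County', Decimal('30.10'), Decimal('30.1')),
]

# Sorting on the unrounded value separates A and B.
by_exact = [r[0] for r in sorted(rows, key=lambda r: r[1], reverse=True)]

# Sorting on the rounded value ties A and B; the stable sort keeps input order.
by_rounded = [r[0] for r in sorted(rows, key=lambda r: r[2], reverse=True)]

print(by_exact)
print(by_rounded)
```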

Show all of the counties in North Dakota, in order of largest change to smallest

In [13]:
nd_counties = sorted_counties.where(lambda x: x['state'] == 'North Dakota')
nd_counties.select(['county', 'state', 'change_rounded']).print_table(max_rows=50)

| county               | state        | change_rounded |
| -------------------- | ------------ | -------------- |
| McKenzie County      | North Dakota |          100.4 |
| Williams County      | North Dakota |           56.3 |
| Mountrail County     | North Dakota |           33.8 |
| Stark County         | North Dakota |           32.0 |
| Dunn County          | North Dakota |           31.2 |
| Billings County      | North Dakota |           21.4 |
| Divide County        | North Dakota |           17.9 |
| Burke County         | North Dakota |           17.8 |
| Ward County          | North Dakota |           14.8 |
| Cass County          | North Dakota |           14.1 |
| Burleigh County      | North Dakota |           13.8 |
| McHenry County       | North Dakota |           10.6 |
| Morton County        | North Dakota |            9.9 |
| Golden Valley County | North Dakota |            9.6 |
| Hettinger County     | North Dakota |            9.3 |
| McLean County        | North 

Show the bottom 50 counties in North Dakota in terms of population change (the smallest change)

In [27]:
nd_counties = nd_counties.order_by('change')
nd_counties.select(['county', 'state', 'change_rounded']).print_table(max_rows=50)

| county               | state        | change_rounded |
| -------------------- | ------------ | -------------- |
| Nelson County        | North Dakota |           -4.9 |
| Pembina County       | North Dakota |           -4.2 |
| Griggs County        | North Dakota |           -4.0 |
| Emmons County        | North Dakota |           -4.0 |
| Cavalier County      | North Dakota |           -3.9 |
| Dickey County        | North Dakota |           -3.4 |
| Logan County         | North Dakota |           -3.2 |
| Walsh County         | North Dakota |           -1.7 |
| McIntosh County      | North Dakota |           -1.3 |
| Traill County        | North Dakota |           -1.2 |
| Steele County        | North Dakota |           -1.0 |
| Pierce County        | North Dakota |           -0.9 |
| Kidder County        | North Dakota |           -0.9 |
| Wells County         | North Dakota |           -0.7 |
| Eddy County          | North Dakota |           -0.7 |
| Sheridan County      | North 

Show the top 50 counties sorted by 2015 estimated population, with the largest population first

In [41]:
# Sort every county nationwide (not only North Dakota) by 2015 population.
by_population = rounded_change.order_by('estimate_2015', reverse=True)
by_population.select(['county', 'state', 'change_rounded', 'estimate_2015']).print_table(max_rows=50)

| county               | state          | change_rounded | estimate_2015 |
| -------------------- | -------------- | -------------- | ------------- |
| Los Angeles County   | California     |            3.5 |    10,170,292 |
| Cook County          | Illinois       |            0.7 |     5,238,216 |
| Harris County        | Texas          |           10.5 |     4,538,028 |
| Maricopa County      | Arizona        |            8.9 |     4,167,947 |
| San Diego County     | California     |            6.3 |     3,299,521 |
| Orange County        | California     |            5.0 |     3,169,776 |
| Miami-Dade County    | Florida        |            7.4 |     2,693,117 |
| Kings County         | New York       |            5.0 |     2,636,735 |
| Dallas County        | Texas          |            7.6 |     2,553,385 |
| Riverside County     | California     |            7.2 |     2,361,026 |
| Queens County        | New York       |            4.6 |     2,339,150 |
| San Bernardino Co... | 

Calculate an average change for all states and show the state and average in descending order

In [37]:
state_agg = rounded_change.group_by('state').aggregate([
    ('avg_change', agate.Mean('change'))
]).order_by('avg_change', reverse=True)
state_agg.select(['state','avg_change']).print_table(max_rows=50)

| state                | avg_change |
| -------------------- | ---------- |
| District of Columbia |    11.089… |
| North Dakota         |     7.788… |
| Delaware             |     6.248… |
| Utah                 |     4.946… |
| Florida              |     4.470… |
| Hawaii               |     4.455… |
| Washington           |     3.074… |
| Arizona              |     2.732… |
| Massachusetts        |     2.636… |
| Texas                |     2.588… |
| Montana              |     2.245… |
| Wyoming              |     2.178… |
| Colorado             |     1.933… |
| Maryland             |     1.801… |
| California           |     1.758… |
| Alaska               |     1.600… |
| Oregon               |     1.565… |
| Virginia             |     1.374… |
| South Dakota         |     1.291… |
| Tennessee            |     1.281… |
| South Carolina       |     1.267… |
| North Carolina       |     1.263… |
| Oklahoma             |     1.031… |
| New Jersey           |     0.687… |
| Nevada    
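Note that agate.Mean('change') averages the per-county percentages with equal weight, so a small county with a large swing counts as much as a large county. That is not the same as the percent change in a state's total population. The sketch below contrasts the two in plain Python; the states and figures are hypothetical, and the variable names are chosen only for this example.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical rows: (state, estimate_2010, estimate_2015).
rows = [
    ('State X', Decimal('1000'), Decimal('2000')),      # small county, +100%
    ('State X', Decimal('100000'), Decimal('100000')),  # large county, 0%
    ('State Y', Decimal('5000'), Decimal('5500')),      # +10%
]

changes = defaultdict(list)                             # per-county percent changes
totals = defaultdict(lambda: [Decimal(0), Decimal(0)])  # summed [2010, 2015] populations
for state, before, after in rows:
    changes[state].append((after - before) / before * 100)
    totals[state][0] += before
    totals[state][1] += after

# Unweighted mean of county percentages (what a mean over 'change' gives).
avg_change = {s: sum(c) / len(c) for s, c in changes.items()}
# Percent change of the state's total population.
state_change = {s: (a - b) / b * 100 for s, (b, a) in totals.items()}

for state in sorted(avg_change, key=avg_change.get, reverse=True):
    print(state, avg_change[state], state_change[state])
```

In this made-up data, State X averages 50 percent across its counties even though its total population grew by under 1 percent.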