First, import agate.

In [1]:
import agate

Then, import the data. We'll force the 'FIPS' and 'CombinedFIPS' columns to be Text.

In [2]:
tester = agate.TypeTester(force={
        'FIPS': agate.Text(),
        'CombinedFIPS': agate.Text()
    })
amenities = agate.Table.from_csv('naturalamenities.csv', column_types=tester)

Print the column names and types.

In [3]:
print(amenities)

|-------------------------+---------------|
|  column_names           | column_types  |
|-------------------------+---------------|
|  FIPS                   | Text          |
|  CombinedFIPS           | Text          |
|  STATE                  | Text          |
|  County name            | Text          |
|  CensusDivision         | Number        |
|  RuralUrbanCode         | Number        |
|  UrbanInfluenceCode     | Number        |
|  MeanJanuaryTemp        | Number        |
|  MeanJanuarySun         | Number        |
|  MeanJulyTemp           | Number        |
|  MeanJulyHumidity       | Number        |
|  TopographyCode         | Number        |
|  PercentWaterArea       | Number        |
|  NaturalLogPercentWater | Number        |
|  JanTempZScore          | Number        |
|  JanSunZScore           | Number        |
|  JulyTempZScore         | Number        |
|  JulyHumidityZScore     | Number        |
|  TopoZScore             | Number        |
|  WaterAreaZScore        | Number        |
|  ...                    | ...           |

I'm only interested in certain columns, so I'll create a new table with only those columns.

In [4]:
amenities = amenities.select(['CombinedFIPS', 'STATE', 'County name', 'NaturalAmenityScale'])

Now, I'll order the rows by the amenity scale, lowest score first.

In [5]:
ordered = amenities.order_by('NaturalAmenityScale')

And print the first 50 rows.

In [6]:
ordered.print_table(max_rows=50)

|---------------+-------+--------------+----------------------|
|  CombinedFIPS | STATE | County name  | NaturalAmenityScale  |
|---------------+-------+--------------+----------------------|
|  27125        | MN    | RED LAKE     |               -6.40  |
|  27167        | MN    | WILKIN       |               -6.10  |
|  18159        | IN    | TIPTON       |               -5.40  |
|  27107        | MN    | NORMAN       |               -5.37  |
|  27099        | MN    | MOWER        |               -5.18  |
|  38067        | ND    | PEMBINA      |               -5.18  |
|  38097        | ND    | TRAILL       |               -5.12  |
|  27039        | MN    | DODGE        |               -5.08  |
|  38035        | ND    | GRAND FORKS  |               -5.01  |
|  27113        | MN    | PENNINGTON   |               -4.97  |
|  27069        | MN    | KITTSON      |               -4.90  |
|  19075        | IA    | GRUNDY       |               -4.86  |
|  38017        | ND    | CASS         |

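So, these are the 50 worst counties, meaning the 50 with the lowest scores. Now, I'll create a new table with only these records by keeping the rows whose score falls below a cutoff of -3.949. Despite its name, top50 holds the 50 lowest-scoring counties.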

In [7]:
top50 = ordered.where(lambda row: row['NaturalAmenityScale'] < -3.949)

And print it.

In [8]:
top50.print_table()

|---------------+-------+--------------+----------------------|
|  CombinedFIPS | STATE | County name  | NaturalAmenityScale  |
|---------------+-------+--------------+----------------------|
|  27125        | MN    | RED LAKE     |               -6.40  |
|  27167        | MN    | WILKIN       |               -6.10  |
|  18159        | IN    | TIPTON       |               -5.40  |
|  27107        | MN    | NORMAN       |               -5.37  |
|  27099        | MN    | MOWER        |               -5.18  |
|  38067        | ND    | PEMBINA      |               -5.18  |
|  38097        | ND    | TRAILL       |               -5.12  |
|  27039        | MN    | DODGE        |               -5.08  |
|  38035        | ND    | GRAND FORKS  |               -5.01  |
|  27113        | MN    | PENNINGTON   |               -4.97  |
|  27069        | MN    | KITTSON      |               -4.90  |
|  19075        | IA    | GRUNDY       |               -4.86  |
|  38017        | ND    | CASS         |

Just for fun, I grouped the records by state.

In [9]:
states = top50.group_by('STATE')

Counted the records in each state.

In [10]:
states = states.aggregate([
    ('count', agate.Count())
])

Ordered the states by count.

In [11]:
states = states.order_by('count')

And printed them.

In [12]:
states.print_table()

|--------+--------|
|  STATE | count  |
|--------+--------|
|  WI    |     1  |
|  KY    |     1  |
|  NE    |     3  |
|  IN    |     5  |
|  IL    |     5  |
|  ND    |     7  |
|  IA    |    13  |
|  MN    |    15  |
|--------+--------|


Then, I wrote the top50 and ordered tables out to CSV files.

In [13]:
top50.to_csv('Top50.csv')

In [14]:
ordered.to_csv('ordered.csv')