#  60+ degrees in February in Nebraska? 
It's going to be crazy warm today in Lincoln, Neb. -- the forecast high is only two degrees below the record of 70 for the day. That's not shocking for a Spring day, but for February, it's quite surprising. It's still Winter in a place that earns a capital W on Winter. 

But how crazy is it? Let's find out.

First, we need some data. [The National Centers for Environmental Information at NOAA](http://www.ncdc.noaa.gov/) has a mind-boggling amount of climate data, and it's easy to query and download (though, warning, downloads aren't instantaneous: you have to order the data and it's processed as they get to it). I've downloaded every weather station in Lincoln for the past 50 years. That'll be overkill -- I'll cull the herd later -- but for now, it's a good solid chunk of data.

We'll use Agate to analyze the data and answer one simple question: How often does it get to be 60 degrees or more in February in Lincoln? First, some preliminaries.

In [1]:
import agate

tester = agate.TypeTester(limit=100)

fiftyyears = agate.Table.from_csv('../Data/temp1.csv', column_types=tester)

The `tester` bit just tells Agate to infer field types using only the first 100 rows of data. That speeds up the importing of a 3.5 MB text file a little.

In [2]:
print(fiftyyears)

|---------------+---------------|
|  column_names | column_types  |
|---------------+---------------|
|  STATION      | Text          |
|  STATION_NAME | Text          |
|  ELEVATION    | Number        |
|  LATITUDE     | Number        |
|  LONGITUDE    | Number        |
|  Date         | Date          |
|  TMAX         | Number        |
|  TMIN         | Number        |
|---------------+---------------|



First things first, TMAX and TMIN -- the high and low temperatures for the day -- are stored in Celsius with the decimal point dropped, so each value is in tenths of a degree. This being the United States of Fahrenheit, we have to convert them.

The general formula for converting C to F is `TempInC * 1.8 + 32` but, in our case, we need to first divide our temp by 10 to get that decimal back.
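
To make the arithmetic concrete, here is a small standalone sketch of that conversion in plain Python. The function name `tenths_c_to_f` and the sample inputs are made up for illustration; they are not part of the dataset or of Agate.

```python
from decimal import Decimal

def tenths_c_to_f(tenths_c):
    """Convert a temperature stored in tenths of a degree Celsius to Fahrenheit."""
    celsius = Decimal(tenths_c) / 10             # e.g. 211 becomes 21.1
    fahrenheit = celsius * Decimal('1.8') + 32   # the usual C-to-F formula
    return fahrenheit.quantize(Decimal('0.1'))   # keep one decimal place

# Made-up sample values, not readings from the Lincoln data.
for raw in (211, 0, -400):
    print(raw, '->', tenths_c_to_f(raw))
```

A raw value of 211 is 21.1 C, which works out to 69.98 F before rounding to one decimal place.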

To do all of this in one fell swoop, we'll use Agate's Formula capabilities to create two new columns of data called `tmax_f` and `tmin_f`, which will be what they say on the tin -- the high and low temps in Fahrenheit.

In [3]:
from decimal import Decimal

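# Decimal('1.8') keeps the factor exact; Decimal(1.8) would inherit the float's rounding error.
def make_high_f(row):
    # TMAX is in tenths of a degree C: divide by 10, convert, keep one decimal place
    return ((row['TMAX']/10)*Decimal('1.8')+32).quantize(Decimal('0.1'))

def make_low_f(row):
    # Same conversion for the daily low
    return ((row['TMIN']/10)*Decimal('1.8')+32).quantize(Decimal('0.1'))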

converted_temps = fiftyyears.compute([
    ('tmax_f', agate.Formula(agate.Number(), make_high_f)),
    ('tmin_f', agate.Formula(agate.Number(), make_low_f))
])

In [4]:
print(converted_temps)

|---------------+---------------|
|  column_names | column_types  |
|---------------+---------------|
|  STATION      | Text          |
|  STATION_NAME | Text          |
|  ELEVATION    | Number        |
|  LATITUDE     | Number        |
|  LONGITUDE    | Number        |
|  Date         | Date          |
|  TMAX         | Number        |
|  TMIN         | Number        |
|  tmax_f       | Number        |
|  tmin_f       | Number        |
|---------------+---------------|



Now, since we're only concerned with the month of February, we need to be able to filter out other months. I'm just learning Agate, and the way I know how to do this is to create a field with the month in it and filter on that. We do that much like we did the Fahrenheit fields: with Formula.

In [5]:
temps_with_months = converted_temps.compute([
    ('month', agate.Formula(agate.Text(), lambda row: '%s' % row['Date'].month))
])
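
A possible shortcut, offered as a sketch rather than the way this notebook does it: since `where` takes any function of a row, the test could look at `row['Date'].month` directly and skip the extra text column. The example below uses plain dicts as stand-ins for Agate rows so it runs on its own; `is_february` and the sample values are hypothetical.

```python
from datetime import date

def is_february(row):
    """True when the row's 'Date' value falls in February."""
    return row['Date'].month == 2

# Plain dicts stand in for Agate rows here; the values are made up.
sample_rows = [
    {'Date': date(2000, 1, 31), 'TMAX': 50},
    {'Date': date(2000, 2, 1), 'TMAX': 111},
    {'Date': date(2000, 2, 29), 'TMAX': 172},
    {'Date': date(2000, 3, 1), 'TMAX': 94},
]

february_rows = [row for row in sample_rows if is_february(row)]
print(len(february_rows))
```

Comparing the integer month also avoids having to remember that the `month` column holds text like `'2'` rather than a number.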

In [6]:
print(temps_with_months)
print(len(temps_with_months.rows))

|---------------+---------------|
|  column_names | column_types  |
|---------------+---------------|
|  STATION      | Text          |
|  STATION_NAME | Text          |
|  ELEVATION    | Number        |
|  LATITUDE     | Number        |
|  LONGITUDE    | Number        |
|  Date         | Date          |
|  TMAX         | Number        |
|  TMIN         | Number        |
|  tmax_f       | Number        |
|  tmin_f       | Number        |
|  month        | Text          |
|---------------+---------------|

43551


Now, if you were following along, you saw that I printed the number of rows in the table we're working with: 43,551. That means we have more than 43,000 weather observations across the city since 1966. Let's cut that pile down a bit. We don't need to count every 60 degree reading at every weather station, so let's just use the official weather station in town at the Lincoln Airport. I'm going to filter on the Station ID of the Airport.

In [7]:
airport = temps_with_months.where(lambda row: row['STATION'] == 'GHCND:USW00014939')
print(len(airport.rows))

15872


So now we're down to 15,872 weather observations since 1966. We just want February. 

In [8]:
airport_feb = airport.where(lambda row: row['month'] == '2')
print(len(airport_feb.rows))

1228


This is a time for a reality check. If you just took 28 times 50, you'd get 1,400, and every four years there are 29 days in February, so 50 full Februaries would come to even more than that. We have 1,228, so something's amiss here. It means we don't have all the data we think we have. Indeed, if you look at the data, you'll see there's only airport data going back to September 1972. Which means for our purposes, it's 1973 -- the first February we'd encounter. So instead of 50 years, we have 43.
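
The same reality check can be done exactly, leap days included, with the standard library. This is a sketch under an assumption: the year ranges below (1966 to 2015 for 50 Februaries, 1973 to 2015 for 43) are my reading of "50 years" and "43", not something stated in the data. The helper name `february_days` is made up.

```python
import calendar

def february_days(first_year, last_year):
    """Total days in every February from first_year through last_year, inclusive."""
    return sum(calendar.monthrange(year, 2)[1]
               for year in range(first_year, last_year + 1))

# Assumed ranges: 50 complete Februaries vs. the 43 the airport actually has.
print(february_days(1966, 2015))
print(february_days(1973, 2015))
```

Under those assumptions, 43 complete Februaries come to a bit less than the 1,228 rows in the table, which would be consistent with a few extra days from a February still in progress.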

If you want to see how many February days in the 60s we've had at the airport since 1973, it's 88.

In [9]:
warm_feb_days = airport_feb.where(lambda row: 60 <= row['tmax_f'] < 70)
print(len(warm_feb_days.rows))

88


To see that a little more graphically, we can bin the temps and `print_bars` instead. 

In [10]:
binned_temps = airport_feb.bins('tmax_f', 11, -20, 90)
binned_temps.print_bars('tmax_f', 'count', width=50)

tmax_f      count
[-20 - -10)     0 ▓                               
[-10 - 0)       3 ▓                               
[0 - 10)       22 ▓░░                             
[10 - 20)     103 ▓░░░░░░░░                       
[20 - 30)     203 ▓░░░░░░░░░░░░░░░░               
[30 - 40)     355 ▓░░░░░░░░░░░░░░░░░░░░░░░░░░░░   
[40 - 50)     233 ▓░░░░░░░░░░░░░░░░░░             
[50 - 60)     199 ▓░░░░░░░░░░░░░░░                
[60 - 70)      88 ▓░░░░░░░                        
[70 - 80)      22 ▓░░                             
[80 - 90]       0 ▓                               
                  +-------+-------+------+-------+
                  0      100     200    300    400
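
The labels in that chart suggest each bin includes its lower edge and excludes its upper edge, except the last one, which includes both. Here is a rough plain-Python illustration of that idea. It is not Agate's implementation; `bin_counts` and the sample readings are made up.

```python
def bin_counts(values, count, start, end):
    """Count values in `count` equal-width bins spanning start..end."""
    width = (end - start) / count
    counts = [0] * count
    for value in values:
        if not start <= value <= end:
            continue  # outside the binned range; skipped in this sketch
        # Every bin is half-open except the last, which also includes `end`.
        index = min(int((value - start) // width), count - 1)
        counts[index] += 1
    return counts

highs = [-5, 35.2, 39.9, 40, 61, 90]  # made-up readings, not the airport data
print(bin_counts(highs, 11, -20, 90))
```

Note how 39.9 lands in the 30s bin while 40 starts the next one, which is why a filter like `60 <= tmax_f < 70` matches the `[60 - 70)` row.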


So the next question I have is when have those 88 days occurred? Were they all recent and a scary sign of climate change? Or was there a February or two in the past that were off-the-charts warm? We've got the data. Let's find out. First we'll add a year field to our warm days dataset.

In [11]:
warm_feb_days_with_years = warm_feb_days.compute([
    ('year', agate.Formula(agate.Text(), lambda row: '%s' % row['Date'].year))
])

Now we'll group those together into unique years and then count them up. 

In [12]:
by_year = warm_feb_days_with_years.group_by('year')

In [13]:
by_year_count = by_year.aggregate([
    ('count', agate.Length())
])

In [14]:
by_year_count.print_bars('year', 'count', width=50)

year count
1974     2 ▓░░░░░░░░░░                            
1976     6 ▓░░░░░░░░░░░░░░░░░░░░░░░░░░░░          
1977     8 ▓░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
1981     6 ▓░░░░░░░░░░░░░░░░░░░░░░░░░░░░          
1982     2 ▓░░░░░░░░░░                            
1983     1 ▓░░░░░                                 
1984     3 ▓░░░░░░░░░░░░░░                        
1986     2 ▓░░░░░░░░░░                            
1987     4 ▓░░░░░░░░░░░░░░░░░░░                   
1988     1 ▓░░░░░                                 
1990     3 ▓░░░░░░░░░░░░░░                        
1991     3 ▓░░░░░░░░░░░░░░                        
1992     5 ▓░░░░░░░░░░░░░░░░░░░░░░░░              
1994     2 ▓░░░░░░░░░░                            
1995     3 ▓░░░░░░░░░░░░░░                        
1996     3 ▓░░░░░░░░░░░░░░                        
1997     2 ▓░░░░░░░░░░                            
1998     1 ▓░░░░░                                 
1999     4 ▓░░░░░░░░░░░░░░░░░░░                   
2000     7 ▓░░░░░░░░
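
If the group-then-count step feels abstract, this is roughly what it boils down to in plain Python: pull the year off each date and tally. It's only an illustration with made-up dates, not a replacement for the Agate calls above.

```python
from collections import Counter
from datetime import date

# Made-up warm days, not the actual airport readings.
warm_days = [
    date(1976, 2, 9), date(1976, 2, 10),
    date(1977, 2, 21), date(1977, 2, 22), date(1977, 2, 23),
    date(1981, 2, 14),
]

# Group by year and count the members of each group in one pass.
counts = Counter(day.year for day in warm_days)

for year in sorted(counts):
    print(year, counts[year])
```

Years with no warm days never show up in the tally, which is why the chart above skips years instead of showing zeros.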

So a few years had some warm days. What about the average? Is it going up? Do warm days have an outsize impact? How do we calculate an average for February by year to compare?

In [21]:
average_february = airport_feb.compute([
    ('year', agate.Formula(agate.Text(), lambda row: '%s' % row['Date'].year))
])

In [22]:
average_feb_by_year = average_february.group_by('year')

In [23]:
average_feb_by_year = average_feb_by_year.aggregate([
    ('average', agate.Mean('tmax_f'))
])

In [24]:
average_feb_by_year.print_bars('year', 'average', width=80)

year                       average
1973 36.80000000000000000000000000 ▓░░░░░░░░░░░░░░░░░░░░░░░░░░░                 
1974 41.85357142857142857142857143 ▓░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░             
1975 28.27857142857142857142857143 ▓░░░░░░░░░░░░░░░░░░░░░                       
1976 50.33793103448275862068965517 ▓░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░       
1977 47.46785714285714285714285714 ▓░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░         
1978 22.75000000000000000000000000 ▓░░░░░░░░░░░░░░░░░                           
1979 23.27857142857142857142857143 ▓░░░░░░░░░░░░░░░░░                           
1980 30.73448275862068965517241379 ▓░░░░░░░░░░░░░░░░░░░░░░░                     
1981 45.22500000000000000000000000 ▓░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░           
1982 33.10000000000000000000000000 ▓░░░░░░░░░░░░░░░░░░░░░░░░                    
1983 39.96071428571428571428571429 ▓░░░░░░░░░░░░░░░░░░░░░░░░░░░░░               
1984 46.35517241379310344827586207 ▓░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░    
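
Those long decimal tails come from dividing Decimal values. One way to tidy them, sketched here in plain Python with made-up readings, is to round each yearly mean with `quantize`, the same trick used in the Fahrenheit conversion. The variable names are hypothetical.

```python
from collections import defaultdict
from decimal import Decimal

# Made-up (year, high in F) pairs, not the airport data.
readings = [
    (1978, Decimal('20.5')), (1978, Decimal('25.0')), (1978, Decimal('22.9')),
    (1979, Decimal('23.0')), (1979, Decimal('23.4')), (1979, Decimal('23.5')),
]

# Group the highs by year.
highs_by_year = defaultdict(list)
for year, high in readings:
    highs_by_year[year].append(high)

# Mean per year, rounded to one decimal place.
averages = {
    year: (sum(highs) / len(highs)).quantize(Decimal('0.1'))
    for year, highs in highs_by_year.items()
}

for year in sorted(averages):
    print(year, averages[year])
```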

Back to the count of warm days by year: the most days in the 60s happened in 1977, with eight, followed by 2000 with seven. Starting in the 90s, there were seven straight years -- from 1994 to 2000 -- with at least one day in the 60s in February.

So, is a day or two in the 60s in February unheard of? Not at all. But such days aren't exactly common. So when it happens, GO OUTSIDE.