# Will it be an early Spring?

On February 2<sup>nd</sup> every year, Punxsutawney Phil predicts whether there will be an early Spring or whether Winter will continue for 6 more weeks (until about mid-March). He is, however, not very accurate (well, according to [The Inner Circle](https://www.groundhog.org/inner-circle) he is 100% correct, but the human handler may not interpret his response correctly). The overall goal is to be able to predict if it will be an early Spring.

For this project you must go through most steps in the checklist. You must write responses for all items; however, sometimes the response will simply be "does not apply". Some of the parts are a bit more nebulous, and you simply show that you have done things in general (the order doesn't really matter). Keep your progress and thoughts organized in this document and use formatting as appropriate (using markdown to add headers and sub-headers for each major part). Do not do the final part (launching the product). Your presentation will be done as information written in this document in a dedicated section, with no slides or anything like that. It should, however, include the best summary plots/graphics/data points.

You are intentionally given very little information thus far. You must communicate with your client (me) for additional information as necessary. But also make sure that your communications are efficient, thought out, and not redundant, as your client might get frustrated and "fire" you (this only applies to getting information from your client; it does not necessarily apply to asking for help with the actual project itself).

Each group from the 200-level and 300-level sections with the best results on the 10% of the data that I kept for myself will earn +5 extra credit (if multiple groups are close, points may be given to multiple groups).

Frame The Problem
----

**1. Define the objective in business terms.**  
    ACME Seed company is trying to understand weather patterns for their new corn seed product. The farmers' product yield depends on the weather: if there is an early spring, the farmers can get 2 full harvests with the ACME corn seeds, so the company wants to be able to guarantee when one is coming. Our objective is to predict when there will be an early spring and when there won't be.  
    
**2. How will your solution be used?**  
    If our model can successfully predict whether there will be an early spring (before March 15th), the company will be able to send out a guarantee for the seeds, making sales flourish.

**3. What are the current solutions/workarounds (if any)?**   
    Current solutions for predicting early springs are unreliable. Weather is constantly changing and hard to calculate. We are currently using Farmer's Almanacs, meteorologists' models, and groundhogs to predict weather trends.  

**4. How should you frame this problem (supervised/unsupervised, online/offline, ...)?**  
    This is going to be a supervised, binary classification problem, most likely with an offline system. It is supervised because we train on historical weather data labeled with whether each year had an early spring, and it is classification because the label is a yes/no outcome rather than a continuous value. For the moment we are keeping it to an offline system because the model does not need a constant input of new data; a prediction is only made once a year.  

**5. How should performance be measured? Is the performance measure aligned with the business objective?**   
    We are trying to let ACME guarantee that there will be an early spring, so a predicted early spring must be trustworthy. Performance will therefore be measured by the precision of our model: of the years we predict to be early springs, the fraction that really were. False negatives (missed early springs) are tolerable, while false positives would break the guarantee, so this measure is aligned with what is best for both the company and the farmers.  

**6. What would be the minimum performance needed to reach the business objective?**  
    Guaranteed early springs with very high certainty. There is a slight tolerance for missing a few early springs, but we do not want to inform the company that there will be an early spring if in reality it is still winter. No explicit minimum performance has been specified, but we hold high standards.  

**7. What are comparable problems? Can you reuse experience or tools?**  
    Other weather-related machine learning problems, such as predicting precipitation patterns, could be useful, but beyond those there are not many direct comparisons.  

**8. Is human expertise available?**     
    Humans on their own have almost no ability to predict the weather. There may be meteorologists who know a bit more, but predicting a whole season is not an easy feat.  

**9. How would you solve the problem manually?**  
    This is definitely not a good problem to try to solve manually. You could dedicate your life to logging and understanding weather patterns, but machine learning is the best way to go about this problem.  
    
**10. List the assumptions you (or others) have made so far. Verify assumptions if possible.**  
    One assumption is that we need to wait at least until the beginning of February every year to make the prediction. We must also assume that March 15th will be the date of the guarantee every year.  
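
The trade-off discussed under item 5 can be made concrete by counting the two kinds of mistakes separately. The sketch below uses made-up outcomes for eight years (the y_true and y_pred lists are hypothetical, not taken from our data) together with scikit-learn's standard metric functions.

```python
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Hypothetical outcomes for eight past years (True = early spring); not real data
y_true = [True, True, True, False, False, False, False, True]
y_pred = [True, False, True, False, True, False, False, False]

# With labels ordered [False, True], ravel() yields tn, fp, fn, tp
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

# Precision: of the years predicted as early springs, how many really were
precision = precision_score(y_true, y_pred)
# Recall: of the real early springs, how many were predicted
recall = recall_score(y_true, y_pred)

print(f"false positives (broken guarantees): {fp}")
print(f"false negatives (missed early springs): {fn}")
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```

A false positive here is a year where a guarantee would have been issued and then broken, while a false negative is only a missed sales opportunity.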
    

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import scipy as sp
import seaborn as sns

from sklearn import datasets
from sklearn.model_selection import train_test_split, cross_val_score

from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier, ExtraTreesClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler, FunctionTransformer
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.impute import SimpleImputer
from sklearn.compose import ColumnTransformer

from sklearn.ensemble import VotingClassifier, BaggingClassifier, AdaBoostClassifier, GradientBoostingRegressor, StackingClassifier

from sklearn.metrics import accuracy_score, mean_squared_error

Get the Data
--

**1. List the data you need and how much you need**  
We need data from January 1st to February 2nd. The data needs to be for each day. The data must contain as many features relevant to the weather as possible. We also need to know which years in the past were early springs or not. Our data should go back as far as possible.  

**2. Find and document where you can get that data**  
Done. Provided by an intern.  

**3. Get access authorizations**  
Done.  

**4. Create a workspace (with enough storage space)**  
Done. Visual Studio Code Jupyter Notebooks

**5. Get the data**  
Done.  

**6. Convert the data to a format you can easily manipulate (without changing the data itself)**

In [None]:
def load_weather_data():
    """
    Loads the CSV file which contains our data for weather.
    """
    return pd.read_csv('weather.csv')

In [None]:
def load_phil_data():
    """
    Loads the CSV file which contains our data for phil's predictions.
    """
    return pd.read_csv('phil_pred.csv')

In [None]:
def load_spring_data():
    """
    Loads the CSV file which contains our data for actuality of season.
    """
    return pd.read_csv('early_spring.csv')

In [None]:
def read_temperature_data(filename):
    """
    Reads temperature data from the given file. M values are assumed to be
    missing values (returned as nan). T values are trace values and returned as
    0.0025 inches for precipitation and snowfall and 0.025 inches for snowdepth
    (see https://www.chicagotribune.com/news/weather/ct-wea-asktom-0415-20180413-column.html).
    """
    def convert_precipitation(raw):
        return 0.0025 if raw == 'T' else np.nan if raw == 'M' else pd.to_numeric(raw)
    def convert_depth(raw):
        return 0.025 if raw == 'T' else np.nan if raw == 'M' else pd.to_numeric(raw)
    return pd.read_csv(filename, na_values=['M'], parse_dates=[0],
        converters={
            "precipitation":convert_precipitation,
            "snowfall":convert_precipitation,
            "snowdepth":convert_depth,
        })
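
To check how the trace (T) and missing (M) markers are handled, the same converters can be run on a small in-memory file. This is only a sketch: the three sample rows and the reduced column set are hypothetical, chosen to exercise each marker once.

```python
import io

import numpy as np
import pandas as pd

def convert_precipitation(raw):
    # 'T' (trace) becomes 0.0025 inches, 'M' (missing) becomes NaN
    return 0.0025 if raw == 'T' else np.nan if raw == 'M' else pd.to_numeric(raw)

def convert_depth(raw):
    # 'T' (trace) becomes 0.025 inches, 'M' (missing) becomes NaN
    return 0.025 if raw == 'T' else np.nan if raw == 'M' else pd.to_numeric(raw)

# Hypothetical three-day sample; the real file has more columns and rows
sample_csv = io.StringIO(
    "date,max_temp,precipitation,snowfall,snowdepth\n"
    "1950-01-01,34,T,0.5,T\n"
    "1950-01-02,M,0.10,M,2\n"
    "1950-01-03,29,0,T,M\n"
)

sample = pd.read_csv(sample_csv, na_values=['M'], parse_dates=[0],
    converters={
        "precipitation": convert_precipitation,
        "snowfall": convert_precipitation,
        "snowdepth": convert_depth,
    })
print(sample)
```

Columns without a converter (max_temp here) rely on na_values alone to turn M into NaN.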

In [None]:
weather_data = load_weather_data()
phil_data = load_phil_data()
spring_data = load_spring_data()

**7. Ensure sensitive information is deleted or protected (e.g. anonymized)**   
Not needed.

**8. Check the size and type of data (time series, geographical, ...)**  
weather_data:
We have 7 features, 6 of which are floats. The date feature is a string. There are 2211 entries in total.

phil_data (groundhog's predictions):
There are 2 features. One is an int and the other is a bool. There are 60 entries in total.

spring_data (which years were early spring):
There are 2 features. One is an int and the other is a bool. There are 67 entries in total.


In [None]:
weather_data.info()
weather_data.describe()

In [None]:
weather_data['date'].apply(lambda x: type(x) == str).all()

In [None]:
phil_data.info()
phil_data.describe()

In [None]:
spring_data.info()
spring_data.describe()

**9. Sample a test set, put it aside, and never look at it (no data snooping!)**  

In [None]:
#this line is used for converting strings to datetimes
weather_data['date'] = weather_data['date'].astype('datetime64[ns]')

In [None]:
#function that adds the year and the day index within each year (0 = January 1st, 32 = February 2nd)
def convert_dates_to_year():
    weather_data['year'] = weather_data['date'].dt.year

    #derive the day index from the date itself instead of assuming 67 complete years of 33 rows
    weather_data['day_of_year'] = weather_data['date'].dt.dayofyear - 1

In [None]:
convert_dates_to_year()
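
The year and day index can be sanity-checked on a few hand-picked dates. A minimal sketch, assuming the date column has already been converted to datetimes; the four dates below are hypothetical.

```python
import pandas as pd

# Hypothetical dates inside the January 1st to February 2nd window
dates = pd.Series(pd.to_datetime(["1950-01-01", "1950-01-15", "1950-02-02", "1951-01-01"]))

frame = pd.DataFrame({"date": dates})
frame["year"] = frame["date"].dt.year
# dt.dayofyear counts from 1, so subtract 1 to get a 0-based index
frame["day_of_year"] = frame["date"].dt.dayofyear - 1
print(frame)
```

February 2nd maps to index 32, which matches the 33 columns per attribute used in the rest of the notebook.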

In [None]:
weather_data.drop(columns=['date'], inplace=True)

In [None]:
pivot_weather = weather_data.pivot(index='year', columns='day_of_year')
pivot_weather

In [None]:
pivot_weather.columns = ["_".join(str(x) for x in a) for a in pivot_weather.columns.to_flat_index()]

In [None]:
def merge_spring_and_weather_data():
    return pd.merge(pivot_weather, spring_data, on='year', how='inner')

In [None]:
data = merge_spring_and_weather_data()
data
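
The reshaping above (one row per day turned into one row per year) is easier to follow on a toy frame. This is a sketch with hypothetical values for two years, two days, and two attributes; it calls reset_index before merging so that year is an ordinary column.

```python
import pandas as pd

# Hypothetical long-format data: one row per (year, day)
toy = pd.DataFrame({
    "year": [2000, 2000, 2001, 2001],
    "day_of_year": [0, 1, 0, 1],
    "max_temp": [40.0, 42.0, 30.0, 35.0],
    "snowdepth": [1.0, 0.0, 3.0, 2.0],
})

# Wide format: one row per year, one column per (attribute, day) pair
wide = toy.pivot(index="year", columns="day_of_year")
# Flatten the two-level column index into names such as max_temp_0
wide.columns = ["_".join(str(x) for x in col) for col in wide.columns.to_flat_index()]

# Hypothetical labels, joined on the year
toy_labels = pd.DataFrame({"year": [2000, 2001], "early_spring": [True, False]})
merged = pd.merge(wide.reset_index(), toy_labels, on="year", how="inner")
print(merged.columns.tolist())
```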

In [55]:
data = data[data['year'] != 2001]


In [None]:
train_set, test_set = train_test_split(data, test_size=0.2, random_state=250)
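
With only a few dozen years of data, a purely random split can leave the test set with very few early springs. One optional variation, which is not what this notebook does, is to pass the label to the stratify parameter of train_test_split so both sets keep the same class ratio. The sketch below uses a hypothetical toy frame.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical toy frame: 20 years, 5 of which are early springs
toy = pd.DataFrame({
    "year": range(1950, 1970),
    "early_spring": [True] * 5 + [False] * 15,
})

# stratify keeps the share of early springs equal in both parts
toy_train, toy_test = train_test_split(
    toy, test_size=0.2, random_state=250, stratify=toy["early_spring"]
)
print(toy_train["early_spring"].sum(), toy_test["early_spring"].sum())
```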

Explore the Data
--

**1. Copy the data for exploration, downsampling to a manageable size if necessary.**  
Downsizing not necessary

**2. Study each attribute and its characteristics: Name; Type (categorical, numerical, 
bounded, text, structured, ...); % of missing values; Noisiness and type of noise (stochastic, outliers, rounding errors, ...); 
Usefulness for the task; Type of distribution (Gaussian, uniform, logarithmic, ...)**  

In [None]:
train_set.info(verbose=True)

In [None]:
missing_values = train_set.isnull().sum() / len(train_set) * 100

In [None]:
variables = {}
attrs = ['max_temp', 'min_temp', 'avg_temp', 'precipitation', 'snowfall', 'snowdepth']
for i, attr in enumerate(attrs):
    #column 0 is year; each attribute then occupies 33 consecutive columns (one per day)
    start = 1+(33*i)
    s = missing_values.iloc[start:start+33]
    s = pd.DataFrame(s)
    variables[attr] = s

variables
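
The same per-attribute summary can also be obtained by grouping on the column-name prefix instead of on column positions. A minimal sketch on a hypothetical four-column frame, assuming the attribute_day naming used in this notebook:

```python
import numpy as np
import pandas as pd

# Hypothetical wide-format data with a few missing entries
toy = pd.DataFrame({
    "max_temp_0": [30.0, np.nan, 35.0, 40.0],
    "max_temp_1": [31.0, 33.0, 36.0, 41.0],
    "snowfall_0": [np.nan, np.nan, 0.0, 1.0],
    "snowfall_1": [0.0, 0.5, 0.0, 1.0],
})

# Percentage of missing values in each column
pct_missing = toy.isnull().mean() * 100
# Strip the trailing _<day> suffix to recover the attribute name
attr_names = [name.rsplit("_", 1)[0] for name in pct_missing.index]
# Average the per-day percentages within each attribute
mean_pct_by_attr = pct_missing.groupby(attr_names).mean()
print(mean_pct_by_attr)
```

Grouping by name keeps working if the column order changes, which positional slicing does not.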

In [None]:
variables['max_temp'].describe()

In [None]:
variables['min_temp'].describe()

In [None]:
variables['avg_temp'].describe()

In [None]:
variables['precipitation'].describe()

In [None]:
variables['snowfall'].describe()

In [None]:
variables['snowdepth'].describe()

In [None]:
plt.figure(figsize=(20,20))
for i, attr in enumerate(attrs):
    plt.subplot(6, 1, i+1)
    plt.title(attr)
    plt.hist(weather_data[attr], bins=75)

Attributes and Characteristics
---
**early_spring** -- Type: Bool, Missing Values: None, Noise: None

**max_temp** -- Type: Float64, Missing Values: Mean = 1.99%, Type of Distribution: Gaussian Distribution, No Skew.

**min_temp** -- Type: Float64, Missing Values: Mean = 1.85%, Type of Distribution: Gaussian Distribution, Left Skewed.

**avg_temp** -- Type: Float64, Missing Values: Mean = 2.30%, Type of Distribution: Gaussian Distribution, Left Skewed.

**precipitation** -- Type: Float64, Missing Values: Mean = 0.13%, Type of Distribution: Logarithmic Distribution.

**snowdepth** -- Type: Float64, Missing Values: Mean = 0.58%, Type of Distribution: Logarithmic Distribution.

**snowfall** -- Type: Float64, Missing Values: Mean = 1.67%, Type of Distribution: Logarithmic Distribution.

**year** -- Type: Int64, Missing Values: None, Noise: None

**Usefulness for Task** -- early_spring is essential, as it is the target attribute that we will base our future reasoning on. To find correlations between an early spring and weather patterns, we must use all of the given features that contain weather events and temperatures. We are able to use all features within our dataset.

In [None]:
train_set.describe()

**3. For supervised learning tasks, identify the target attribute(s)**  

The target attribute is early_spring; it is the label we are trying to predict. The input features are max_temp, min_temp, avg_temp, precipitation, snowfall, and snowdepth, all of which are useful for solving the problem. year is probably not going to have a meaningful correlation with early_spring.

**4. Visualize the data**  


In [None]:
plt.figure(figsize=(15, 5))
plt.title('correlation of numerical weather attributes with each other')
sns.heatmap(weather_data[attrs].corr(), annot=True, vmin=-1, vmax=1, cmap='seismic')
plt.tight_layout()

In [None]:
def add_number_to_attr(attr, start, stop):
    return [attr + '_' + str(i) for i in range(start, stop)]

In [None]:
plt.figure(figsize=(10, 100))
for i, attr in enumerate(attrs):
    plt.subplot(len(attrs), 1, i+1)
    plt.title('correlation of ' + attr + ' with early spring')
    correlations_with_label = train_set[add_number_to_attr(attr, 0, 33)].corrwith(train_set['early_spring'])
    correlations_with_label = pd.DataFrame(correlations_with_label)
    sns.heatmap(correlations_with_label, annot=True, vmin=-1, vmax=1, cmap='seismic')
    plt.tight_layout()

**5. Study the correlations between attributes**  

avg_temp highly correlates with both min_temp and max_temp at above 0.9. min_temp and max_temp have the next highest correlation with each other at 0.75. The temperature attributes probably correlate with each other because the extremes don't fall far from the average.

The next highest correlation is snowfall and snowdepth. This is probably because most of the time snowdepth is equal to or greater than snowfall and a high snowfall means there will be a high snowdepth.

Precipitation has no correlation with snowdepth at all. This is probably because whether or not it snowed the previous day doesn't affect whether or not it will rain the current day. However, snowfall does have a correlation with precipitation because rain can turn into snow and snow can turn into rain if the temperature changes.

max_temp, min_temp, and avg_temp correlate almost the same with precipitation, snowfall, and snowdepth. This is probably because they all correlate with each other. I will call these three attributes the temperature.

The temperature has a slight to moderate positive correlation with precipitation because it's more likely to rain if it's warmer.

The temperature has a slight to moderate negative correlation with snowfall because it's more likely to snow if it's colder.

The temperature has a moderate to strong negative correlation with snowdepth. The reason this correlation is stronger than the correlation with snowfall is probably because it's more likely that snow has accumulated compared to the probability that it is currently snowing.

The temperature from January 1st to January 10th has a slight to moderate positive correlation with whether or not there will be an early spring.

The temperature on January 12th, 13th, 17th, and February 2nd has a slight to moderate negative correlation with whether or not there will be an early spring. The only exception is min_temp on January 12th.

The correlation of precipitation and snowfall with early spring looks random. On some days there is a slight to moderate positive correlation and on other days it is negative. Most of the days have no correlation.

Snowdepth has mostly no correlation with whether or not there will be an early spring.
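
The per-day correlations described above come from correlating each daily column with the label. A minimal sketch of that calculation on hypothetical values, with the boolean label converted to 0/1 first so the correlation is well defined:

```python
import pandas as pd

# Hypothetical data: day 0 is warmer in early-spring years, day 1 is colder
toy = pd.DataFrame({
    "avg_temp_0": [40.0, 38.0, 25.0, 22.0, 35.0, 20.0],
    "avg_temp_1": [20.0, 25.0, 36.0, 41.0, 22.0, 39.0],
    "early_spring": [True, True, False, False, True, False],
})

# Treat the label as 0/1 and correlate every daily column with it
label = toy["early_spring"].astype(int)
corr = toy[["avg_temp_0", "avg_temp_1"]].corrwith(label)
print(corr)
```

A positive value means warmer readings on that day go with early springs; a negative value means the opposite.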

**6. Study how you would solve the problem manually**  

We could manually predict an early spring by seeing if the average temperature is warm from January 1st to January 10th. We could also factor in whether it snowed or rained on certain days.

**7. Identify the promising transformations you may want to apply**  

A StandardScaler could be used on max_temp, min_temp, and avg_temp. A logarithmic transformation can be used on precipitation, snowfall, and snowdepth. max_temp, min_temp, and avg_temp can be normalized to be between -1 and 1. precipitation, snowfall, and snowdepth can be normalized to be between 0 and 1.
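
The candidate transformations listed under item 7 can be compared on small arrays. This is only a sketch with hypothetical values; it uses np.log1p rather than a plain logarithm so that zero precipitation stays finite.

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer, MinMaxScaler, StandardScaler

# Hypothetical single-column samples
temps = np.array([[20.0], [30.0], [35.0], [45.0], [70.0]])
precip = np.array([[0.0], [0.0025], [0.1], [0.5], [2.0]])

# Standardization: zero mean and unit variance
temps_std = StandardScaler().fit_transform(temps)
# Normalization of temperatures into the range [-1, 1]
temps_norm = MinMaxScaler(feature_range=(-1, 1)).fit_transform(temps)

# Log transform followed by normalization into [0, 1] for skewed amounts
log_then_norm = Pipeline([
    ('log', FunctionTransformer(np.log1p)),
    ('norm', MinMaxScaler(feature_range=(0, 1))),
])
precip_norm = log_then_norm.fit_transform(precip)
print(temps_std.ravel(), temps_norm.ravel(), precip_norm.ravel())
```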

**8. Identify extra data that would be useful (go back to “Get the Data”)**  

We have all the data we need.

**9. Document what you have learned**  
We have 264 rows of data missing. There are 8 years missing.

Prepare the Data
---
**1. Data cleaning:** Fix/remove outliers (optional); Fill in missing values (with 0, mean, 
median...) or drop rows/columns 

**2. Feature selection (optional):** Drop attributes that provide no useful information 
for the task  

Remove year, since it is not expected to correlate meaningfully with early_spring.

**3. Feature engineering, where appropriate:** Discretize continuous features; Decompose features (categorical, date/time, ...), 
Add promising transformations of features (log(x), √x, x², ...); Aggregate features into promising new features  

**4. Feature scaling:** standardize or normalize features 

In [None]:
def split_labels(data, label_feature):
    """
    Split the given column off of the data, returning the full data set (without that
    feature) and the split-off feature.
    """
    return data.drop(columns=label_feature), data[label_feature]

In [None]:
data, labels = split_labels(train_set, "early_spring")

In [None]:
class RemoveFeatureTransformer(BaseEstimator, TransformerMixin):
    """
    This transformer removes an entire feature from the data.
    """
    def __init__(self, attr):
        super().__init__()
        self.attr = attr

    def fit(self, X, y=None, **kwargs):
        # This transformer has nothing to learn from the training data
        return self

    def transform(self, X):
        # drop() removes rows by default, so the column axis must be named explicitly
        return X.drop(columns=self.attr)
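
A usage sketch for dropping year, as planned under feature selection. The class below is a hypothetical stand-in named DropColumnsTransformer, repeated here so the snippet runs on its own; it passes columns= to drop explicitly so that a column, not a row, is removed.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin

class DropColumnsTransformer(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in: removes one or more columns from a DataFrame."""
    def __init__(self, attr):
        self.attr = attr

    def fit(self, X, y=None):
        # Nothing to learn from the training data
        return self

    def transform(self, X):
        # drop() returns a new frame; the input is left unchanged
        return X.drop(columns=self.attr)

# Hypothetical two-row frame
toy = pd.DataFrame({"year": [1950, 1951], "max_temp_0": [34.0, 29.0]})
without_year = DropColumnsTransformer("year").fit_transform(toy)
print(without_year.columns.tolist())
```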

In [None]:
max_temp_pipeline = Pipeline([
    #figure out how to drop 2001 or any year without filled values
    ('imputer', SimpleImputer(strategy='median')),
    ('scalar', StandardScaler()),
])

min_temp_pipeline = Pipeline([
    #figure out how to drop 2001 or any year without filled values
    ('imputer', SimpleImputer(strategy='median')),
    ('scalar', StandardScaler()),
])

avg_temp_pipeline = Pipeline([
    #figure out how to drop 2001 or any year without filled values
    ('imputer', SimpleImputer(strategy='median')),
    ('scalar', StandardScaler()),
])

precipitation_pipeline = Pipeline([
    #figure out how to drop 2001 or any year without filled values
    ('imputer', SimpleImputer(strategy='median')),
    ('log', FunctionTransformer(np.log1p)),
    ('scalar', StandardScaler()),
])

snowfall_pipeline = Pipeline([
    #figure out how to drop 2001 or any year without filled values
    ('imputer', SimpleImputer(strategy='median')),
    ('log', FunctionTransformer(np.log1p)),
    ('scalar', StandardScaler()),
])

snowdepth_pipeline = Pipeline([
    ('imputer', SimpleImputer(strategy='median')),
    ('log', FunctionTransformer(np.log1p)),
    ('scalar', StandardScaler()),
])

preprocessor = ColumnTransformer(transformers=[
    ('max_temp', max_temp_pipeline, add_number_to_attr('max_temp', 0, 33)),
    ('min_temp', min_temp_pipeline, add_number_to_attr('min_temp', 0, 33)),
    ('avg_temp', avg_temp_pipeline, add_number_to_attr('avg_temp', 0, 33)),
    ('precip', precipitation_pipeline, add_number_to_attr('precipitation', 0, 33)),
    ('snowfall', snowfall_pipeline, add_number_to_attr('snowfall', 0, 33)),
    ('snowdepth', snowdepth_pipeline, add_number_to_attr('snowdepth', 0, 33))
])
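
How the combined preprocessor behaves can be seen on a reduced example. This is a sketch with hypothetical values and only two days per attribute; note that ColumnTransformer drops any column not listed in a transformer by default, so year does not reach the output.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler

# Hypothetical wide-format data with some missing entries
toy = pd.DataFrame({
    "year": [1950, 1951, 1952, 1953],
    "max_temp_0": [34.0, np.nan, 29.0, 41.0],
    "max_temp_1": [30.0, 33.0, 28.0, np.nan],
    "snowfall_0": [0.0, 0.0025, np.nan, 2.0],
    "snowfall_1": [1.0, 0.0, 0.5, 0.0025],
})

temp_pipeline = Pipeline([
    ('imputer', SimpleImputer(strategy='median')),
    ('scaler', StandardScaler()),
])
snow_pipeline = Pipeline([
    ('imputer', SimpleImputer(strategy='median')),
    ('log', FunctionTransformer(np.log1p)),
    ('scaler', StandardScaler()),
])

# Columns not named here (year) are dropped because remainder defaults to 'drop'
toy_preprocessor = ColumnTransformer(transformers=[
    ('max_temp', temp_pipeline, ["max_temp_0", "max_temp_1"]),
    ('snowfall', snow_pipeline, ["snowfall_0", "snowfall_1"]),
])
prepped = toy_preprocessor.fit_transform(toy)
print(prepped.shape)
```

The result is a plain NumPy array with one column per listed feature, no missing values, and each column centered on zero.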


In [48]:
data

|  | year | max_temp_0 | max_temp_1 | max_temp_2 | max_temp_3 | max_temp_4 | max_temp_5 | max_temp_6 | max_temp_7 | max_temp_8 | ... | snowdepth_23 | snowdepth_24 | snowdepth_25 | snowdepth_26 | snowdepth_27 | snowdepth_28 | snowdepth_29 | snowdepth_30 | snowdepth_31 | snowdepth_32 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 52 | 2005 | 55.0 | 56.0 | 56.0 | 57.0 | 44.0 | 53.0 | 40.0 | 50.0 | 43.0 | ... | 7.0 | 7.0 | 6.0 | 5.0 | 5.0 | 5.0 | 4.0 | 3.0 | 3.0 | 2.0 |
| 47 | 2000 | 68.0 | 64.0 | 64.0 | 62.0 | 36.0 | 57.0 | 56.0 | 45.0 | 48.0 | ... | 4.0 | 2.0 | 3.0 | 3.0 | 3.0 | 2.0 | 2.0 | 4.0 | 4.0 | 5.0 |
| 39 | 1989 | 36.0 | 38.0 | 36.0 | 28.0 | 35.0 |  | 41.0 | 57.0 | 38.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 35 | 1984 | 30.0 | 34.0 | 33.0 | 37.0 | 35.0 | 37.0 |  | 37.0 | 38.0 | ... | 4.0 | 3.0 | 1.0 | 1.0 | 0.025 | 1.0 | 2.0 | 6.0 | 6.0 | 5.0 |
| 1 | 1948 | 46.0 | 41.0 | 32.0 | 34.0 | 34.0 | 32.0 | 32.0 | 41.0 | 49.0 | ... | 8.0 | 10.0 | 10.0 | 9.0 | 9.0 | 8.0 | 5.0 | 5.0 | 5.0 | 4.0 |
| 12 | 1959 | 38.0 | 42.0 | 46.0 | 41.0 | 20.0 | 18.0 | 23.0 | 27.0 | 20.0 | ... | 0.025 | 0.025 | 3.0 | 3.0 | 3.0 | 3.0 | 0.025 | 0.025 | 0.025 | 0.0 |
| 18 | 1966 | 62.0 | 42.0 | 43.0 | 43.0 | 51.0 | 43.0 | 40.0 | 33.0 | 30.0 | ... | 11.0 | 10.0 | 8.0 | 8.0 | 8.0 | 8.0 | 11.0 | 13.0 | 12.0 | 15.0 |
| 13 | 1960 | 42.0 | 46.0 | 49.0 | 31.0 | 32.0 | 37.0 | 41.0 | 40.0 | 48.0 | ... | 3.0 | 2.0 | 2.0 | 1.0 | 0.025 | 0.025 | 0.0 | 0.0 | 0.0 | 0.025 |
| 24 | 1972 | 47.0 | 42.0 | 46.0 | 42.0 | 40.0 | 27.0 | 34.0 | 37.0 | 47.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 3.0 | 3.0 | 2.0 | 2.0 | 2.0 | 1.0 |
| 36 | 1985 | 70.0 | 57.0 | 33.0 | 42.0 | 35.0 | 35.0 | 35.0 | 33.0 | 22.0 | ... | 16.0 | 18.0 | 18.0 | 18.0 | 17.0 | 14.0 | 12.0 | 10.0 | 9.0 | 13.0 |


In [49]:
preprocessor.fit(data)

ColumnTransformer(transformers=[('max_temp',
                                 Pipeline(steps=[('imputer',
                                                  SimpleImputer(strategy='median')),
                                                 ('scalar', StandardScaler())]),
                                 ['max_temp_0', 'max_temp_1', 'max_temp_2',
                                  'max_temp_3', 'max_temp_4', 'max_temp_5',
                                  'max_temp_6', 'max_temp_7', 'max_temp_8',
                                  'max_temp_9', 'max_temp_10', 'max_temp_11',
                                  'max_temp_12', 'max_temp_13', 'max_temp_14',
                                  'max_temp_15',...
                                  'snowdepth_6', 'snowdepth_7', 'snowdepth_8',
                                  'snowdepth_9', 'snowdepth_10', 'snowdepth_11',
                                  'snowdepth_12', 'snowdepth_13',
                                  'snowdepth_14', 'snowdepth_15',

In [50]:
prepped_data = preprocessor.transform(data)

In [51]:
data

