# Predict the Flip! (Data Collection)

## Imports

In [2]:
import functions as dlf
import pandas as pd
import numpy as np
import requests
import regex
import pickle
from importlib import reload
from bs4 import BeautifulSoup
from genderize import Genderize

## Links

The links below inspired this project or point to features that could be added in the future. The Wikipedia link is the source of this dataset.

> [Wikipedia](https://en.wikipedia.org/wiki/List_of_United_States_Senate_elections) <-- Where the dataset was scraped from!

> [Link to Brookings](https://www.brookings.edu/multi-chapter-report/vital-statistics-on-congress/)

> [Link to BallotPedia](https://ballotpedia.org/Legislative_Branch)

> [Link to Wiki-Category (U.S. Senator)](https://commons.wikimedia.org/wiki/Category:Senators_of_the_United_States)

## Background

This dataset will be created by scraping the various pages on Wikipedia containing information on past United States Senate elections. 

Although the Senate has existed in some form since 1788, the U.S. Senate election process we know and recognize today came to fruition after the ratification of the 17th Amendment of the U.S. Constitution in 1913.

This amendment requires that the senators representing each state be elected by popular vote rather than chosen by the state's legislature (the exception is an interim appointment that lasts until a special election is held).

Given this, the dataset covers elections from 1914 through the present. In some cases it only covers 1920 onward, because the senators sitting when the 17th Amendment was ratified were allowed to finish their six-year terms.

More information about this can be found here: [17th Amendment](https://en.wikipedia.org/wiki/Seventeenth_Amendment_to_the_United_States_Constitution)

## Building code

**<font size=3>DISCLAIMER!</font>**

This section includes all of the functions and code used to scrape and clean up this dataset. For brevity, the vast majority of the code used during this process has been moved to the *functions.py* file in this repository. I describe each step in a comment above its cell.

#### Wikitable scraping

> The following function connects to Wikipedia and uses Beautiful Soup to follow the link to each election year's page. It then separates the wikitables and the table of contents (toc) from each page and stores each in a dictionary.

In [None]:
elect_yr_tables, elect_yr_tocs = dlf.wiki_senate_scraper()
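The body of `dlf.wiki_senate_scraper` is not shown here, so the following is only a minimal sketch of the first step it describes: finding the link to each election year's page. To stay dependency-free and offline it uses the standard library's `HTMLParser` instead of Beautiful Soup and parses a hard-coded HTML snippet instead of a live response. The link pattern and the names `YearLinkParser` and `collect_year_links` are assumptions made for illustration.

```python
from html.parser import HTMLParser
import re

# Assumed link pattern for per-year election pages, e.g.
# /wiki/1914_United_States_Senate_elections (not confirmed by the notebook).
YEAR_LINK = re.compile(r"^/wiki/(\d{4})_United_States_Senate_elections?$")


class YearLinkParser(HTMLParser):
    """Collect {year: href} for every anchor that matches YEAR_LINK."""

    def __init__(self):
        super().__init__()
        self.year_links = {}

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        match = YEAR_LINK.match(href)
        if match:
            # Keep the first link seen for each year.
            self.year_links.setdefault(match.group(1), href)


def collect_year_links(html):
    """Return a dict mapping each year string to its relative link."""
    parser = YearLinkParser()
    parser.feed(html)
    return parser.year_links


# Hypothetical snippet standing in for the downloaded list page.
sample_html = """
<ul>
  <li><a href="/wiki/1914_United_States_Senate_elections">1914</a></li>
  <li><a href="/wiki/1916_United_States_Senate_elections">1916</a></li>
  <li><a href="/wiki/Seventeenth_Amendment">17th Amendment</a></li>
</ul>
"""
year_links = collect_year_links(sample_html)
print(year_links)
```

In a real run the HTML would come from `requests.get(...).text`, and each collected link would be fetched in turn to pull its tables.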

#### Wikitable Organizing

> The following function iterates through the wikitables in `elect_yr_tables` and separates them into three groups: Senate leaders, the year's summary, and state general elections. It stores the groups as a list in a dictionary keyed by year.

In [3]:
election_collection_raw = dlf.election_collector(elect_yr_tables, elect_yr_tocs)

#### Year Summary Collection & Cleaning

**<font size=3>DISCLAIMER!</font>**

*The following functions work for all years before 2018 (the targeted tables do not exist on Wikipedia for 2018).*

> The following function iterates through `election_collection_raw` and moves each year's summary into a new dictionary of wikitables stored as dataframes.

In [None]:
yr_sum_tables = dlf.yr_summary_collector(election_collection_raw)

> The following function iterates through `yr_sum_tables` to clean the summary and store a state-senator reference dataframe for the same year in a dictionary.

In [None]:
yr_sum_dict = dlf.yr_sum_formatter(yr_sum_tables)

#### State-Senator Reference Table

**<font size=3>DISCLAIMER!</font>**

*The following function works for all years before 2018 (the targeted tables do not exist on Wikipedia for 2018).*

> The following function iterates through `yr_sum_dict` to grab the reference table from every year and concatenate them into a master state-senator reference dataframe. It also adds the `Terms_in_office` and `Cln_name` features.

In [4]:
name_lookup_df = dlf.master_tabler(yr_sum_dict)
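As a rough illustration of what building such a master reference table could involve, the sketch below concatenates per-year frames and derives the two extra columns. It is not the code in `functions.py`: the counting rule for `Terms_in_office` (a running count of appearances) and the name-cleaning rule for `Cln_name` (dropping single-letter initials) are assumptions inferred from the sample output, and the names and rows are placeholders.

```python
import pandas as pd

# Illustrative per-year reference tables; the names are placeholders.
yr_refs = {
    "1914": pd.DataFrame({"State_id": ["Alabama", "Arizona"],
                          "Incumbent": ["Jane Q. Doe", "J. Richard Roe"]}),
    "1920": pd.DataFrame({"State_id": ["Alabama"],
                          "Incumbent": ["Jane Q. Doe"]}),
}

# Tag each frame with its year, then stack them in chronological order.
frames = [df.assign(Year=year) for year, df in yr_refs.items()]
master = pd.concat(frames, ignore_index=True)

# Running count of how many times each incumbent has appeared so far.
master["Terms_in_office"] = master.groupby("Incumbent").cumcount() + 1

# Drop single-letter initials such as "Q." or a leading "J." (assumed rule).
master["Cln_name"] = (
    master["Incumbent"]
    .str.replace(r"\b[A-Z]\.\s*", "", regex=True)
    .str.strip()
)
print(master)
```

The running count only makes sense if the frames are stacked in year order, which is why the dictionary is built chronologically here.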

#### Senate Leader Collection & Cleaning

> The following function iterates over `election_collection_raw` and stores every Senate-leader wikitable with its first row dropped (where possible).

In [None]:
sen_leader_tables = dlf.sen_leader_collector(election_collection_raw)

> The following function iterates over `sen_leader_tables`, first splitting each dataframe into two (if necessary). It then re-indexes, transposes, and (if necessary) concatenates the dataframe(s) before storing them in a dictionary by year.

In [None]:
sen_leader_dict = dlf.sen_leader_cleaner(sen_leader_tables)

> The following function iterates over `sen_leader_dict`, combining all of the years into one master dataframe of all-time senate leaders. 

In [None]:
ldr_master_df = dlf.master_leader_tabler(sen_leader_dict)

> The following function goes over `ldr_master_df` and does one of the following:
 * Replaces values
 * Fills NAs
 * Both of the above
     * Different strategies can be selected through the function's arguments

In [None]:
ldr_master_df_model = dlf.master_leader_cleaner(ldr_master_df, rd_stand=True, fillna_stand=True)

> The following function iterates over the leaders' names in `ldr_master_df_model` and cleans them via regex.

In [8]:
ldr_master_df_model = dlf.regex_subber_bycol(ldr_master_df_model, 'Leader',
                                             [r'\(([^\)]+)\)', r'\[([^\]]+)\]'],
                                             multi_patt=True)
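To show the kind of cleanup this step performs, here is a small standalone sketch using the standard `re` module. It is not `dlf.regex_subber_bycol`; the helper name `strip_annotations` and the sample strings are made up, and the patterns are written in the same spirit as the ones passed above (remove parenthesized and bracketed annotations).

```python
import re

# Patterns for "(...)" and "[...]" annotations (illustrative versions).
PATTERNS = [r"\([^)]*\)", r"\[[^\]]*\]"]


def strip_annotations(name, patterns=PATTERNS):
    """Remove every match of each pattern, then tidy the whitespace."""
    for pattern in patterns:
        name = re.sub(pattern, "", name)
    # Collapse any doubled or trailing spaces left behind by the removals.
    return " ".join(name.split())


cleaned = strip_annotations("Mitch McConnell (Republican)[a]")
print(cleaned)
```

Applied to a dataframe column, the same helper could be passed to `Series.apply`.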

#### State Election Collection & Cleaning

> The following function iterates over `election_collection_raw` and stores each year's state elections in a new dictionary for further cleaning.

In [None]:
yr_st_elects = dlf.st_election_collector(election_collection_raw)
print(('---'*10)+'Step 1 Complete!'+('---'*10))

> The following function iterates over `yr_st_elects` to collect each year's wikitables and clean them via the `st_election_formatter` function found in the `functions.py` file. It re-formats the wikitables by moving data from rows into columns and handling NAs, and returns a dictionary of formatted dataframes.

In [None]:
yr_st_dict = dlf.st_election_cleaner(yr_st_elects)
print(('---'*10)+'Step 2 Complete!'+('---'*10))

> The following function iterates over `yr_st_dict` to filter out dataframes that represent non-general elections. It then collects the necessary columns from each remaining dataframe and concatenates them into a single dataframe holding all of that year's state general elections.

In [None]:
yr_st_dict_joined = dlf.st_election_aggregator(yr_st_dict)
print(('---'*10)+'Step 3 Complete!'+('---'*10))

> The following function iterates over `yr_st_dict_joined`, referencing `name_lookup_df` to clean candidate names in each year's dataframe via regex and to match candidates to the correct state and election by name. It also merges the `Turnout` & `Total Votes` columns into one (edge case control). Finally, it stores each year's cleaned dataframe in a dictionary by year.

In [None]:
yr_st_dict_mapped = dlf.st_election_state_mapper(yr_st_dict_joined, name_lookup_df)
print(('---'*10)+'Step 4 Complete!'+('---'*10))
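The column merge mentioned above can be sketched with `Series.combine_first`. This is an assumption about how `Turnout` and `Total Votes` might be reconciled, not the code in `functions.py`, and the rows are invented.

```python
import numpy as np
import pandas as pd

# Illustrative frame: some tables report 'Turnout', others 'Total Votes'.
df = pd.DataFrame({
    "Candidate": ["A", "B", "C"],
    "Turnout": [1000.0, np.nan, np.nan],
    "Total Votes": [np.nan, 2500.0, np.nan],
})

# Prefer 'Turnout' and fall back to 'Total Votes' where it is missing.
df["Turnout"] = df["Turnout"].combine_first(df["Total Votes"])
df = df.drop(columns="Total Votes")
print(df)
```

Rows where both columns are missing stay NA and are left for the NA-handling step.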

> The following function iterates over `yr_st_dict_mapped` to go through the `Turnout`, `%`, `Candidate`, and `State` columns to either manually or programmatically fill NAs.

In [None]:
yr_st_mapped_full = dlf.yr_st_mapped_NA_handler(yr_st_dict_mapped)
print(('---'*10)+'Step 5 Complete!'+('---'*10))

> The following function iterates over `yr_st_mapped_full`, referencing `name_lookup_df` and `ldr_master_df_model` to augment existing columns or create new ones in each year's dataframe. Examples include `Terms_in_office` and `Seats_before%`.

In [8]:
yr_st_mapped_cln = dlf.st_mapped_cleaner(yr_st_mapped_full, name_lookup_df, ldr_master_df_model)
print(('---'*10)+'Step 6 Complete!'+('---'*10))

4 NA operations completed.
------------------------------Step 5 Complete!------------------------------
20 loops done! Just finished 1958.
20 loops done! Just finished 1998.
Finished 49 years! Latest year collected 2016.
------------------------------Step 6 Complete!------------------------------
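The notebook does not show how `Seats_up%` and `Seats_before%` are computed. One plausible reading, offered here only as an assumption based on the column names, is that they are simple ratios of the `Seats_up` and `Seats_before` counts in `ldr_master_df_model`. The helper name `seat_shares` is hypothetical.

```python
def seat_shares(seats_up, seats_before, total_seats):
    """Return (Seats_up%, Seats_before%) rounded to six decimals.

    Assumed definitions (not confirmed by functions.py):
      Seats_up%     = seats the party is defending / seats it held before
      Seats_before% = seats the party held before / all Senate seats
    """
    if seats_before == 0:
        # Avoid dividing by zero for parties that held no seats.
        return 0.0, 0.0
    return (round(seats_up / seats_before, 6),
            round(seats_before / total_seats, 6))


# Values from the 2018 Republican row of ldr_master_df_model (9 up, 51 before);
# 100 total seats is the sum of the three 2018 rows (51 + 47 + 2).
shares = seat_shares(9, 51, 100)
print(shares)
```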


> The following function iterates over `yr_st_mapped_cln` in order to concatenate every year into one master dataframe for EDA/modeling.

In [9]:
fin_df = dlf.cln_st_combiner(yr_st_mapped_cln)

## Pickles

In order to keep my computer running at an acceptable pace and save runtime during this process, it was necessary to save my variables periodically via the `pickle` package. I have left them in here for future usage.
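The repeated save and load cells could be wrapped in two small helpers. This is a sketch rather than part of `functions.py`; the names `save_pickle` and `load_pickle` are hypothetical, and the demo writes to a temporary folder so it does not touch the real `Pickles/` directory.

```python
import pickle
import tempfile
from pathlib import Path


def save_pickle(obj, name, folder="Pickles"):
    """Write obj to <folder>/<name>.pickle, creating the folder if needed."""
    path = Path(folder) / f"{name}.pickle"
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "wb") as f:  # the with-block closes the file on exit
        pickle.dump(obj, f)
    return path


def load_pickle(name, folder="Pickles"):
    """Read back the object stored in <folder>/<name>.pickle."""
    with open(Path(folder) / f"{name}.pickle", "rb") as f:
        return pickle.load(f)


# Round-trip demo in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    save_pickle({"1994": ["demo"]}, "election_collection_raw", folder=tmp)
    restored = load_pickle("election_collection_raw", folder=tmp)
print(restored)
```

Only unpickle files you created yourself, since loading a pickle can execute arbitrary code.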

### Writing..

In [None]:
# with open('Pickles/election_collection_raw.pickle', 'wb') as f:
#     pickle.dump(election_collection_raw, f)
#     f.close()

In [None]:
# with open('Pickles/yr_sum_tables.pickle', 'wb') as f:
#     pickle.dump(yr_sum_tables, f)
#     f.close()

In [None]:
# with open('Pickles/yr_sum_dict.pickle', 'wb') as f:
#     pickle.dump(yr_sum_dict, f)
#     f.close()

In [None]:
# with open('Pickles/name_lookup_df.pickle', 'wb') as f:
#     pickle.dump(name_lookup_df, f)
#     f.close()

In [None]:
# with open('Pickles/yr_st_elects.pickle', 'wb') as f:
#     pickle.dump(yr_st_elects, f)
#     f.close()

In [None]:
# with open('Pickles/yr_st_dict.pickle', 'wb') as f:
#     pickle.dump(yr_st_dict, f)
#     f.close()

In [None]:
# with open('Pickles/yr_st_dict_joined.pickle', 'wb') as f:
#     pickle.dump(yr_st_dict_joined, f)
#     f.close()

In [None]:
# with open('Pickles/yr_st_dict_mapped.pickle', 'wb') as f:
#     pickle.dump(yr_st_dict_mapped, f)
#     f.close()

In [None]:
# with open('Pickles/yr_st_mapped_full.pickle', 'wb') as f:
#     pickle.dump(yr_st_mapped_full, f)
#     f.close()

In [None]:
# with open('Pickles/yr_st_mapped_cln.pickle', 'wb') as f:
#     pickle.dump(yr_st_mapped_cln, f)
#     f.close()

In [None]:
# with open('Pickles/sen_leader_tables.pickle', 'wb') as f:
#     pickle.dump(sen_leader_tables, f)
#     f.close()

In [None]:
# with open('Pickles/sen_leader_dict.pickle', 'wb') as f:
#     pickle.dump(sen_leader_dict, f)
#     f.close()

In [None]:
# with open('Pickles/ldr_master_df.pickle', 'wb') as f:
#     pickle.dump(ldr_master_df, f)
#     f.close()

In [None]:
# with open('Pickles/ldr_master_df_model.pickle', 'wb') as f:
#     pickle.dump(ldr_master_df_model, f)
#     f.close()

In [9]:
# fin_df.to_csv(r'C:\Users\d_ful\Documents\GitHub\Capstone_Project\Senate_generals_thru_2016.csv', index=False)

### Reading..

In [None]:
# with open('Pickles/election_collection_raw.pickle', 'rb') as f:
#     election_collection_raw = pickle.load(f)
#     f.close()

## Must do this before running the code for state elections (works around an error in the code)
# do_not_use = election_collection_raw['2018'][2].pop(0)

In [None]:
# with open('Pickles/yr_sum_tables.pickle', 'rb') as f:
#     yr_sum_tables = pickle.load(f)
#     f.close()

In [12]:
with open('Pickles/yr_sum_dict.pickle', 'rb') as f:
    yr_sum_dict = pickle.load(f)  # the with-block closes the file

In [10]:
with open('Pickles/name_lookup_df.pickle', 'rb') as f:
    name_lookup_df = pickle.load(f)  # the with-block closes the file

In [None]:
# with open('Pickles/yr_st_elects.pickle', 'rb') as f:
#     yr_st_elects = pickle.load(f)
#     f.close()

In [None]:
# with open('Pickles/yr_st_dict.pickle', 'rb') as f:
#     yr_st_dict = pickle.load(f)
#     f.close()

In [None]:
# with open('Pickles/yr_st_dict_joined.pickle', 'rb') as f:
#     yr_st_dict_joined = pickle.load(f)
#     f.close()

In [None]:
# with open('Pickles/yr_st_dict_mapped.pickle', 'rb') as f:
#     yr_st_dict_mapped = pickle.load(f)
#     f.close()

In [None]:
# with open('Pickles/yr_st_mapped_full.pickle', 'rb') as f:
#     yr_st_mapped_full = pickle.load(f)
#     f.close()

In [13]:
with open('Pickles/yr_st_mapped_cln.pickle', 'rb') as f:
    yr_st_mapped_cln = pickle.load(f)  # the with-block closes the file

In [None]:
# with open('Pickles/sen_leader_tables.pickle', 'rb') as f:
#     sen_leader_tables = pickle.load(f)
#     f.close()

In [None]:
# with open('Pickles/sen_leader_dict.pickle', 'rb') as f:
#     sen_leader_dict = pickle.load(f)
#     f.close()

In [None]:
# with open('Pickles/ldr_master_df.pickle', 'rb') as f:
#     ldr_master_df = pickle.load(f)
#     f.close()

In [None]:
with open('Pickles/ldr_master_df_model.pickle', 'rb') as f:
    ldr_master_df_model = pickle.load(f)  # the with-block closes the file

In [22]:
fin_df = pd.read_csv('Senate_generals_thru_2016.csv')

## Workshop

### Dfs

In [21]:
test = dlf.table_checker(dict_=yr_sum_dict)
display(test[0].head(15))
test[0].info()

Key to pull:1994


| | State | Senator | Party | Electoral_History | Results | Candidates |
| --- | --- | --- | --- | --- | --- | --- |
| 0 | Arizona | Dennis DeConcini | Democratic | 197619821988 | Incumbent retired.New senator elected.Republic... | Jon Kyl (Republican) 53.7% Sam Coppersmith (De... |
| 1 | California | Dianne Feinstein | Democratic | 1992 (Special) | Incumbent re-elected. | Dianne Feinstein (Democratic) 46.7% Michael Hu... |
| 2 | Connecticut | Joe Lieberman | Democratic | 1988 | Incumbent re-elected. | Joe Lieberman (Democratic) 67% Jerry Labriola ... |
| 3 | Delaware | William Roth | Republican | 19701971 (Appointed)197619821988 | Incumbent re-elected. | William Roth (Republican) 55.8% Charles Oberly... |
| 4 | Florida | Connie Mack III | Republican | 1988 | Incumbent re-elected. | Connie Mack III (Republican) 70.5% Hugh Rodham... |
| 5 | Hawaii | Daniel Akaka | Democratic | 1990 (Appointed)1990 (Special) | Incumbent re-elected. | Daniel Akaka (Democratic) 71.8% Maria Hustace ... |
| 6 | Indiana | Richard Lugar | Republican | 197619821988 | Incumbent re-elected. | Richard Lugar (Republican) 67.4% Jim Jontz (De... |
| 7 | Maine | George J. Mitchell | Democratic | 1980 (Appointed)19821988 | Incumbent retired.New senator elected.Republic... | Olympia Snowe (Republican) 60.2% Thomas Andrew... |
| 8 | Maryland | Paul Sarbanes | Democratic | 197619821988 | Incumbent re-elected. | Paul Sarbanes (Democratic) 59.1% Bill Brock (R... |
| 9 | Massachusetts | Ted Kennedy | Democratic | 1962 (Special)19641970197619821988 | Incumbent re-elected. | Ted Kennedy (Democratic) 58.1% Mitt Romney (Re... |


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 35 entries, 0 to 34
Data columns (total 6 columns):
State                35 non-null object
Senator              35 non-null object
Party                35 non-null object
Electoral_History    35 non-null object
Results              35 non-null object
Candidates           35 non-null object
dtypes: object(6)
memory usage: 1.7+ KB


In [11]:
display(name_lookup_df.head(15))
name_lookup_df.info()

| | State_id | Incumbent | Party | Year | Terms_in_office | Cln_name |
| --- | --- | --- | --- | --- | --- | --- |
| 0 | Alabama | Francis S. White | Democratic | 1914 | 1 | Francis White |
| 1 | Arizona | Marcus A. Smith | Democratic | 1914 | 1 | Marcus Smith |
| 2 | Arkansas | James Paul Clarke | Democratic | 1914 | 1 | James Paul Clarke |
| 3 | California | George Clement Perkins | Republican | 1914 | 1 | George Clement Perkins |
| 4 | Colorado | Charles S. Thomas | Democratic | 1914 | 1 | Charles Thomas |
| 5 | Connecticut | Frank B. Brandegee | Republican | 1914 | 1 | Frank Brandegee |
| 6 | Florida | Duncan U. Fletcher | Democratic | 1914 | 1 | Duncan Fletcher |
| 7 | Georgia | M. Hoke Smith | Democratic | 1914 | 1 | Hoke Smith |
| 8 | Idaho | James H. Brady | Republican | 1914 | 1 | James Brady |
| 9 | Illinois | Lawrence Y. Sherman | Republican | 1914 | 1 | Lawrence Sherman |


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1899 entries, 0 to 1898
Data columns (total 6 columns):
State_id           1899 non-null object
Incumbent          1899 non-null object
Party              1899 non-null object
Year               1899 non-null object
Terms_in_office    1899 non-null int64
Cln_name           1899 non-null object
dtypes: int64(1), object(5)
memory usage: 89.1+ KB


In [20]:
test2 = dlf.table_checker(dict_=yr_st_mapped_cln)
display(test2.tail(15))
test2.info()

Key to pull:1994


| | % | Turnout | Incumb_Y | State | Cln_name | Year | Terms_in_office | Party_enc | First_name | Seats_up% | Seats_before% |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 111 | 50.32 | 211672 | 1 | Vermont | Jim Jeffords | 1994 | 1 | R | Jim | 0.295455 | 0.44 |
| 112 | 40.57 | 211672 | 0 | Vermont | Jan Backus | 1994 | 0 | D | Jan | 0.392857 | 0.56 |
| 113 | 0.65 | 211672 | 0 | Vermont | Jerry Levy | 1994 | 0 | T | Jerry | 0.0 | 0.0 |
| 114 | 45.61 | 2057463 | 1 | Virginia | Chuck Robb | 1994 | 1 | D | Chuck | 0.392857 | 0.56 |
| 115 | 42.88 | 2057463 | 0 | Virginia | Oliver North | 1994 | 0 | R | Oliver | 0.295455 | 0.44 |
| 116 | 55.8 | 1700173 | 1 | Washington | Slade Gorton | 1994 | 2 | R | Slade | 0.295455 | 0.44 |
| 117 | 44.3 | 1700173 | 0 | Washington | Ron Sims | 1994 | 0 | D | Ron | 0.392857 | 0.56 |
| 118 | 69.0 | 420936 | 1 | West_Virginia | Robert Byrd | 1994 | 6 | D | Robert | 0.392857 | 0.56 |
| 119 | 31.0 | 420936 | 0 | West_Virginia | Stan Klos | 1994 | 0 | R | Stan | 0.295455 | 0.44 |
| 120 | 58.3 | 1565090 | 1 | Wisconsin | Herb Kohl | 1994 | 1 | D | Herb | 0.392857 | 0.56 |


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 126 entries, 0 to 125
Data columns (total 11 columns):
%                  126 non-null object
Turnout            126 non-null int64
Incumb_Y           126 non-null int64
State              126 non-null object
Cln_name           126 non-null object
Year               126 non-null object
Terms_in_office    126 non-null int64
Party_enc          126 non-null object
First_name         126 non-null object
Seats_up%          126 non-null float64
Seats_before%      126 non-null float64
dtypes: float64(2), int64(3), object(6)
memory usage: 10.9+ KB


In [38]:
display(ldr_master_df_model.head(15))
ldr_master_df_model.info()

| | Leader | Leaders_seat | Leader_since | Party | Seats_up | Seats_before | Year |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | Mitch McConnell | Kentucky | January 3, 2007 | Republican | 9 | 51 | 2018 |
| 1 | Chuck Schumer | New York | January 3, 2017 | Democratic | 24 | 47 | 2018 |
| 2 | 3rd Party | 3rd Party | January 3, 2017 | Independent | 2 | 2 | 2018 |
| 3 | John W. Kern | Indiana | March 4, 1913 | Democratic | 16 | 50 | 1914 |
| 4 | Jacob H. Gallinger | New Hampshire | March 4, 1913 | Republican | 18 | 44 | 1914 |
| 5 | 3rd Party | 3rd Party | March 4, 1913 | Progressive | 0 | 1 | 1914 |
| 6 | John W. Kern | Indiana | March 4, 1913 | Democratic | 17 | 56 | 1916 |
| 7 | Jacob H. Gallinger | New Hampshire | March 4, 1913 | Republican | 15 | 40 | 1916 |
| 8 | Henry Cabot Lodge | Massachusetts | March 4, 1919 | Republican | 17 | 43 | 1918 |
| 9 | Oscar Underwood | Alabama | April 27, 1920 | Democratic | 24 | 53 | 1918 |


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 148 entries, 0 to 147
Data columns (total 7 columns):
Leader          148 non-null object
Leaders_seat    148 non-null object
Leader_since    148 non-null object
Party           148 non-null object
Seats_up        148 non-null object
Seats_before    148 non-null object
Year            148 non-null object
dtypes: object(7)
memory usage: 8.2+ KB


In [23]:
display(fin_df.tail(15))
fin_df.info()

| | % | Turnout | Incumb_Y | State | Cln_name | Year | Terms_in_office | Party_enc | First_name | Seats_up% | Seats_before% |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 5573 | 60.57 | 2049893.0 | 1 | South_Carolina | Tim Scott | 2016 | 2 | R | Tim | 0.444444 | 0.54 |
| 5574 | 36.93 | 2049893.0 | 0 | South_Carolina | Thomas Dixon | 2016 | 0 | D | Thomas | 0.227273 | 0.44 |
| 5575 | 0.09 | 2049893.0 | 0 | South_Carolina | Write-Ins | 2016 | 0 | T | Write-Ins | 0.0 | 0.0 |
| 5576 | 71.83 | 369619.0 | 1 | South_Dakota | John Thune | 2016 | 2 | R | John | 0.444444 | 0.54 |
| 5577 | 28.17 | 369619.0 | 0 | South_Dakota | Jay Williams | 2016 | 0 | D | Jay | 0.227273 | 0.44 |
| 5578 | 68.15 | 1115608.0 | 1 | Utah | Mike Lee | 2016 | 1 | R | Mike | 0.444444 | 0.54 |
| 5579 | 27.06 | 1115608.0 | 0 | Utah | Misty Snow | 2016 | 0 | D | Misty | 0.227273 | 0.44 |
| 5580 | 59.99 | 320467.0 | 1 | Vermont | Patrick Leahy | 2016 | 7 | D | Patrick | 0.227273 | 0.44 |
| 5581 | 32.34 | 320467.0 | 0 | Vermont | Scott Milne | 2016 | 0 | R | Scott | 0.444444 | 0.54 |
| 5582 | 2.86 | 320467.0 | 0 | Vermont | Cris Ericson | 2016 | 0 | T | Cris | 0.0 | 0.0 |


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5588 entries, 0 to 5587
Data columns (total 11 columns):
%                  5588 non-null float64
Turnout            5583 non-null float64
Incumb_Y           5588 non-null int64
State              5588 non-null object
Cln_name           5588 non-null object
Year               5588 non-null int64
Terms_in_office    5588 non-null int64
Party_enc          5588 non-null object
First_name         5588 non-null object
Seats_up%          5588 non-null float64
Seats_before%      5588 non-null float64
dtypes: float64(4), int64(3), object(4)
memory usage: 480.3+ KB


## Appendix

### NA Handling Plan

#### Removal plan for 'Turnout'

**1928**: MA: 36-39 [1,524,914], PA: 91-97 [3,026,864]

**1930**: MA: 42-46 [1,207,011]

**1932**: AR: 6-8 [33,980], CO: 16-18 [425,634], NC: 87-88 [706,440]

**1934**: TN: 98-100 [308,274]

**1936**: MA: 41-50 [1,803,674]

**1942**: ID: 13-14 [142,342], RI: 73-74 [238,487]

**1952**: MI: 34-39 [2,821,133]

**1958**: NC: 59-60 [616,469]

**1974**: IA: 45-47 [853,521]

**1976**: UT: 102-103 [514,169]

**1980**: OK: 83-87 [1,098,294]

**1986**: KY: 45-46 [677,105]

**1994**: HI: 20-22 [356,902]

**1998**: Just drop the other rows

**2000**: CT: 12-15 [1,311,261], MS: 61-65 [944,144], NJ: 80-90 [3,015,662], TN: 106-112 [1,928,613], TX: 113-116 [6,267,964], WV: 127-129 [603,477], WI: 130-135 [2,540,083] 

**2002**: AR: 7-8 [808,256], NH: 64-66 [444,542], NJ: 67-72 [2,112,604] 

**2004**: AR: 10-12 [1,039,439], CT: 24-27 [1,424,726]

**2006**: Just drop the other rows

**2012**: ME: 19-21 [693,787], MN: 34-36 [2,839,572]

* **1928**: MA doesn't report here. PA may be under 'Totals'.
* **1930**: MA doesn't report here.
* **1932**: AR/CO/NC don't report here.
* **1934**: TN doesn't report here.
* **1936**: MA may be under 'total'.
* **1942**: ID/RI don't report here.
* **1952**: MI doesn't report here.
* **1958**: NC doesn't report here.
* **1974**: IA doesn't report here.
* **1976**: UT doesn't report here.
* **1980**: OK doesn't report here.
* **1986**: KY doesn't report here.
* **1994**: Idx [3,4] need to be dropped. HI doesn't report here.
* **1998**: Idx [18-22] need to be dropped.
* **2000**: Idx [2-4] need to be dropped. CT/MS/NJ/TN/TX/WV/WI don't report here.
* **2002**: AR/NH/NJ don't report here.
* **2004**: AR/CT don't report here.
* **2006**: Idx [3-6] & [103-107] need to be dropped.
* **2012**: ME/MN don't report here.
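The plan above can be read as "for these row ranges, set `Turnout` to the bracketed total". Under that reading, which is my interpretation of the shorthand rather than something stated in the notebook, the 1932 entry could be applied as in the sketch below. The dataframe is a stand-in with every `Turnout` value missing.

```python
import numpy as np
import pandas as pd

# Row ranges and totals copied from the 1932 entry above:
# AR rows 6-8, CO rows 16-18, NC rows 87-88 (ranges are inclusive).
turnout_1932 = {
    (6, 8): 33_980,
    (16, 18): 425_634,
    (87, 88): 706_440,
}

# Stand-in for the 1932 dataframe: 90 rows with every Turnout missing.
df_1932 = pd.DataFrame({"Turnout": np.full(90, np.nan)})

for (start, stop), total in turnout_1932.items():
    # .loc slicing is label-based and includes both endpoints.
    df_1932.loc[start:stop, "Turnout"] = total

print(df_1932.loc[[6, 8, 16, 87, 88], "Turnout"].tolist())
```

Rows flagged for dropping in the bullet list could be removed afterwards with `DataFrame.drop(index=[...])`.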

#### Removal plan for 'Candidate'

Only in 1932:
* **One** of the rows at idx [30, 31, 32] needs to be changed to: Duncan U. Fletcher (Incumbent), Democratic, 1, Florida, 100%
* The rest need to be dropped.

#### %

* **1932**: Taken care of in 'Candidate'
* **1956**: Idx [68] needs to be changed to: 100%. Idx [69, 70] need to be dropped.
* **2012**: Idx [25, 26, 52, 53, 54] need to be (53.74, 46.19, 58.87, 39.37, 0.50), respectively.
* **2014**: Idx [91, 92, 93, 94] need to be (48.82, 47.26, 3.74, 0.18), respectively.

#### State

Just forward fill states after all other NAs have been filled or dropped.

**1924**: Idx [0-1] = 'Alabama'

**1950**: Idx [0-1] = 'Alabama'

**1960**: Idx [0-1] = 'Alabama'

**1962**: Idx [0-1] = 'Alabama'

**1966**: Idx [0-3] = 'Alabama'

**1992**: Idx [0-3] = 'Alabama'; Idx [4-9] DROP
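A minimal sketch of this two-step fix, assuming the missing labels are NaN in a `State` column: set the leading rows by hand, then forward fill the rest. The dataframe and candidate labels are invented for illustration.

```python
import numpy as np
import pandas as pd

# Stand-in for one year's dataframe; the first state label is missing.
df = pd.DataFrame({
    "State": [np.nan, np.nan, "Arkansas", np.nan, "Colorado", np.nan],
    "Candidate": ["A", "B", "C", "D", "E", "F"],
})

# Manual fix first: rows 0-1 belong to Alabama (as in the 1924 entry above).
df.loc[0:1, "State"] = "Alabama"

# Then carry each state label down to the rows beneath it.
df["State"] = df["State"].ffill()
print(df["State"].tolist())
```

The manual assignment has to come first because a forward fill cannot fill rows that have no labeled row above them.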