# Scrape the Senate "Voter Power Index" From *FiveThirtyEight*

This notebook scrapes the Senate "Voter Power Index" from *FiveThirtyEight* as it stands when the notebook is run and writes the resulting CSV to the `data/fivethirtyeight` directory.

In [1]:
import requests
import pandas as pd
import json

## Fetch *FiveThirtyEight* Senate Election Forecast

The data we want is in a JSON file that the *FiveThirtyEight* site uses to populate the forecast webpage.

In [2]:
URL = "https://projects.fivethirtyeight.com/2018-midterm-election-forecast/senate/home.json"

In [3]:
res = requests.get(URL)

In [4]:
data = json.loads(res.content)
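
Because this JSON file exists to feed the webpage, its shape may change without notice. A minimal sketch of a check that could run right after decoding, assuming the only top-level key the rest of the notebook relies on is `seatForecasts`. The helper name `load_forecast` and the sample payload are hypothetical and only for illustration:

```python
import json


def load_forecast(raw):
    """Decode raw JSON bytes and confirm the expected top-level key exists."""
    payload = json.loads(raw)
    if "seatForecasts" not in payload:
        # Hypothetical guard: fail early if the page's data layout has changed.
        raise KeyError("'seatForecasts' missing from the forecast payload")
    return payload


# Made-up payload standing in for res.content.
sample = b'{"seatForecasts": [{"state": "AZ", "class": 1}]}'
forecast = load_forecast(sample)
print(len(forecast["seatForecasts"]))
```

In the notebook itself, the call would take `res.content` instead of the sample bytes.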

## Extract "Voter Power Index"

In [5]:
seats = []
for seat in data["seatForecasts"]:
    if "vpi" in seat:
        seat_dict = {
            "state": seat["state"],
            "vpi": seat["vpi"]["classic"],
            "class": seat["class"]
        }
        seats.append(seat_dict)

In [6]:
len(seats)

35

In [7]:
voter_power_index = pd.DataFrame(seats)

In [8]:
voter_power_index.head()

| | class | state | vpi |
|---|---|---|---|
| 0 | 1 | AZ | 3.769032 |
| 1 | 1 | CA | 0.0 |
| 2 | 1 | CT | 0.065269 |
| 3 | 1 | DE | 0.081507 |
| 4 | 1 | FL | 1.030212 |


*Note: Two states, Mississippi and Minnesota, have both of their Senate seats up for election. The 35 seats therefore cover only 33 states, and summing the index across both seats gives voters in those two states more power.*

In [9]:
voter_power_index["state"].nunique()

33

In [10]:
voter_power_index["state"].value_counts()[:5]

MN    2
MS    2
NM    1
CT    1
OH    1
Name: state, dtype: int64

In [11]:
total_voter_power_index = voter_power_index.groupby("state")["vpi"].sum().to_frame()

In [12]:
len(total_voter_power_index)

33
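
The per-state sum is what gives Minnesota and Mississippi a combined index across their two seats. The same aggregation can be sketched in plain Python to make the arithmetic explicit. The `vpi` values below are made up for illustration and are not taken from the forecast:

```python
from collections import defaultdict

# Hypothetical seat-level rows: MN appears twice because both seats are up.
sample_seats = [
    {"state": "AZ", "class": 1, "vpi": 3.0},
    {"state": "MN", "class": 1, "vpi": 0.5},
    {"state": "MN", "class": 2, "vpi": 1.25},
]

# Sum the index over every seat in a state,
# mirroring groupby("state")["vpi"].sum().
totals = defaultdict(float)
for seat in sample_seats:
    totals[seat["state"]] += seat["vpi"]

print(dict(totals))
```

Three seat rows collapse to two state rows here, in the same way that the notebook's 35 seats collapse to 33 states.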

In [13]:
total_voter_power_index.to_csv("../data/fivethirtyeight/senate-voter-power-index.csv")

## Extract Senate Candidate Odds

In addition to the "Voter Power Index", we use the current forecast for each Senate race to look at the demographics of close races. This section pulls each Senate candidate's odds of winning their election.

In [14]:
candidates = []
for d in data["seatForecasts"]:
    # Skip the entry labeled "US", which is not an individual state.
    if d["state"] == "US":
        continue
    for f in d["forecast"]:
        candidates.append({
            "state": d["state"],
            "class": d["class"],
            "candidate": f["candidate"],
            "party": f["party"],
            # Win probability under the "classic" model, in percent.
            "classic_prob": f["models"]["classic"]["winprob"],
        })

In [15]:
len(candidates)

97

In [16]:
senate_candidates = pd.DataFrame(candidates)

In [17]:
senate_candidates.head()

| | candidate | class | classic_prob | party | state |
|---|---|---|---|---|---|
| 0 | Kyrsten Sinema | 1 | 61.63 | D | AZ |
| 1 | Angela Green | 1 | 0.002 | G | AZ |
| 2 | Martha McSally | 1 | 38.368 | R | AZ |
| 3 | Dianne Feinstein | 1 | 98.37 | D | CA |
| 4 | Kevin de Leon | 1 | 1.63 | D | CA |


In [18]:
senate_candidates["state"].nunique()

33

In [19]:
senate_candidates.to_csv("../data/fivethirtyeight/senate_candidate_odds.csv", index=False)
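
One way these odds could be used to flag close races is to look at the leading candidate's win probability in each race. The sketch below uses the five rows shown in the preview above. The helper name `close_races` and the 75 percent cutoff are assumptions for illustration, not values from the original analysis. Races are keyed by state and class, since Minnesota and Mississippi each have two seats on the ballot.

```python
# Rows copied from the preview of senate_candidates.
rows = [
    {"state": "AZ", "class": 1, "candidate": "Kyrsten Sinema", "classic_prob": 61.63},
    {"state": "AZ", "class": 1, "candidate": "Angela Green", "classic_prob": 0.002},
    {"state": "AZ", "class": 1, "candidate": "Martha McSally", "classic_prob": 38.368},
    {"state": "CA", "class": 1, "candidate": "Dianne Feinstein", "classic_prob": 98.37},
    {"state": "CA", "class": 1, "candidate": "Kevin de Leon", "classic_prob": 1.63},
]


def close_races(rows, cutoff=75.0):
    """Return (state, class) pairs whose front-runner is below the cutoff."""
    leaders = {}
    for row in rows:
        key = (row["state"], row["class"])
        # Track the highest win probability seen so far in each race.
        leaders[key] = max(leaders.get(key, 0.0), row["classic_prob"])
    return sorted(key for key, prob in leaders.items() if prob < cutoff)


print(close_races(rows))
```

With the hypothetical 75 percent cutoff, only the Arizona race from these five rows counts as close, because its front-runner sits at 61.63.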
