### aggregate queries ###

1. This first query counts the number of result records for each state in the 2016 Senate election, using the COUNT aggregate function.

In [2]:
%%bigquery 
SELECT hs1.state, COUNT(*) as number_of_records
FROM hdv_modeled.Results_Beam_DF hs1
JOIN hdv_modeled.Jurisdiction_Beam_DF hs2
ON hs1.state = hs2.state and hs1.jname = hs2.jname
WHERE hs1.office = 'US Senate' and hs1.year = 2016
GROUP by hs1.state
ORDER by hs1.state




Unnamed: 0,state,number_of_records
0,AK,240
1,AL,134
2,AR,225
3,AZ,165
4,CA,116
5,CO,448
6,CT,1014
7,FL,804
8,GA,477
9,HI,20


2. The following query totals the votes cast in the 2016 and 2018 House elections in Texas, using the SUM() aggregate function. The result can be used to compare turnout between a presidential election year (2016) and a midterm year (2018).

In [29]:
%%bigquery 
SELECT e1.year, SUM(e1.votes) as total_house_votes_tx
FROM hdv_modeled.Results_Beam_DF e2
JOIN hdv_modeled.Results_Beam_DF e1
ON e1.year = e2.year and e1.jname = e2.jname and e1.cname = e2.cname
WHERE e1.state = 'TX' and e1.office = 'US House'
GROUP BY e1.year
ORDER BY e1.year


Unnamed: 0,year,total_house_votes_tx
0,2016,8528526
1,2018,8202708
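
Because the query above joins `Results_Beam_DF` to itself, it is worth checking that the join keys identify exactly one row each; if they do not, every matching pair is counted and the SUM is inflated. The sketch below reproduces the effect with Python's built-in `sqlite3` module and a made-up `results` table (the table name, jurisdiction and candidate names, and vote counts are all hypothetical, not taken from the dataset). Whether the Texas totals above are affected has not been verified here.

```python
import sqlite3

# Hypothetical toy table; the rows are invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results "
    "(year INT, state TEXT, jname TEXT, cname TEXT, office TEXT, votes INT)"
)
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?, ?, ?, ?)",
    [
        # Two rows share (year, jname, cname), so the self-join keys are not unique.
        (2016, "TX", "County A", "Candidate 1", "US House", 100),
        (2016, "TX", "County A", "Candidate 1", "US House", 50),
        (2016, "TX", "County B", "Candidate 2", "US House", 200),
    ],
)

# Summing directly over the table counts each row once.
plain_total = conn.execute(
    "SELECT SUM(votes) FROM results WHERE state = 'TX' AND office = 'US House'"
).fetchone()[0]

# Same shape as the self-join above: rows with a duplicated key match twice.
self_join_total = conn.execute(
    """
    SELECT SUM(e1.votes)
    FROM results e2
    JOIN results e1
    ON e1.year = e2.year AND e1.jname = e2.jname AND e1.cname = e2.cname
    WHERE e1.state = 'TX' AND e1.office = 'US House'
    """
).fetchone()[0]

print(plain_total, self_join_total)  # 350 500
```

If the two totals differ on the real table, dropping the self-join and summing over a single scan of the table would be the simpler choice.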


3. The following query finds, for each 2016 presidential candidate and each state, the largest vote total the candidate received in any single jurisdiction of that state. It keeps the candidate/state pairs whose best jurisdiction reached at least 1000 votes and orders them by that maximum. This shows how large a candidate's strongest base of support is in a given state, and lets us compare the strengths of these bases across states and candidates.

In [66]:
%%bigquery 
SELECT s1.cname, s1.state, MAX(s1.votes) as max_votes
FROM hdv_modeled.Results_Beam_DF s1
JOIN hdv_modeled.Candidate_Beam_DF s2
ON s1.cname = s2.cname
WHERE s1.office = 'President' and s1.year = 2016
GROUP BY s1.cname, s1.state
HAVING max_votes >= 1000
ORDER BY max_votes desc
limit 11

Unnamed: 0,cname,state,max_votes
0,Hillary Clinton,CA,2464364
1,Hillary Clinton,IL,1611946
2,Donald Trump,CA,769743
3,Donald Trump,AZ,747361
4,Hillary Clinton,WA,718322
5,Hillary Clinton,TX,707914
6,Hillary Clinton,AZ,702907
7,Donald Trump,MS,700714
8,Donald Trump,KS,671018
9,Hillary Clinton,FL,624146
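
The result above reports the size of each maximum but not the jurisdiction that produced it. One way to recover the jurisdiction name is to join the per-group maximum back to the rows, as in this minimal sketch. It uses Python's built-in `sqlite3` module and a made-up `results` table; the table name and all rows are hypothetical.

```python
import sqlite3

# Hypothetical toy data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (cname TEXT, state TEXT, jname TEXT, votes INT)")
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?, ?)",
    [
        ("Candidate 1", "CA", "County A", 500),
        ("Candidate 1", "CA", "County B", 900),
        ("Candidate 1", "TX", "County C", 300),
        ("Candidate 2", "CA", "County A", 700),
        ("Candidate 2", "CA", "County B", 200),
    ],
)

# Compute MAX(votes) per candidate/state, then join back to find the matching row.
top_jurisdictions = conn.execute(
    """
    SELECT r.cname, r.state, r.jname, r.votes
    FROM results r
    JOIN (
        SELECT cname, state, MAX(votes) AS max_votes
        FROM results
        GROUP BY cname, state
    ) m
    ON r.cname = m.cname AND r.state = m.state AND r.votes = m.max_votes
    ORDER BY r.votes DESC
    """
).fetchall()

for row in top_jurisdictions:
    print(row)
```

If two jurisdictions tie for the maximum, this pattern returns both rows for that candidate/state pair.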


4. This query counts, for each election year, the distinct House candidates whose party is recorded as 'Other'; we treat these as independent candidates.

In [18]:
%%bigquery 
SELECT s1.year, COUNT(distinct cname) as number_of_independents
FROM hdv_modeled.Results_Beam_DF s1
WHERE s1.office = 'US House' and s1.party = 'Other'
GROUP BY s1.year
ORDER BY s1.year

Unnamed: 0,year,number_of_independents
0,2016,332
1,2018,341


5. Here, we compare the percentage of the vote won by the Republican presidential candidate in 2016 with the percentage of adults whose highest level of education is at most high school, for each jurisdiction. We impose a limit to keep the result short.

In [65]:
%%bigquery 
SELECT s1.state, s1.jname, s1.vote_pct as rep_support, s2.High_School_Only__ + s2.__High_School__ as hs_highest_edu
FROM hdv_modeled.Results_Beam_DF s1
JOIN education.education s2
ON (s1.fipscode) = s2.FIPS_Code and s1.state = s2.state
WHERE s2.High_School_Only > 0 and office = 'President' and s1.party = 'Republican'
ORDER BY s1.state, s1.jname
limit 11

Unnamed: 0,state,jname,rep_support,hs_highest_edu
0,AL,Autauga,72.77,43.9
1,AL,Baldwin,76.55,37.3
2,AL,Barbour,52.1,62.7
3,AL,Bibb,76.4,64.1
4,AL,Blount,89.33,53.8
5,AL,Bullock,24.2,64.5
6,AL,Butler,56.13,59.3
7,AL,Calhoun,68.66,48.3
8,AL,Chambers,56.42,57.0
9,AL,Cherokee,83.42,58.1
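
To turn a side-by-side comparison like this into a single number, one option is the Pearson correlation coefficient between the two columns. The sketch below computes it in plain Python for the ten rows displayed above; the `pearson` helper is our own illustration, not part of the notebook. Ten rows from a single state say nothing about the full result, so this only demonstrates the calculation. On the full table, BigQuery's `CORR` aggregate function could be applied to the same two columns.

```python
import math

# (rep_support, hs_highest_edu) pairs copied from the ten rows displayed above.
rows = [
    (72.77, 43.9), (76.55, 37.3), (52.1, 62.7), (76.4, 64.1), (89.33, 53.8),
    (24.2, 64.5), (56.13, 59.3), (68.66, 48.3), (56.42, 57.0), (83.42, 58.1),
]

def pearson(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    # Sums of cross-products and squared deviations around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / math.sqrt(var_x * var_y)

r = pearson(rows)
print(round(r, 3))
```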


6. In this final query, we count the number of jurisdictions in which each 2016 presidential candidate is registered. By requiring a count of at least 1000, we keep only the more seriously considered candidates.

In [19]:
%%bigquery 
SELECT  s1.cname, COUNT(*) as registered_jur
FROM hdv_modeled.Results_Beam_DF s1
JOIN hdv_modeled.Candidate_Beam_DF s2
ON s1.cname = s2.cname
WHERE s1.office = 'President' and s1.year = 2016
GROUP BY s1.cname
HAVING COUNT(*) >= 1000
ORDER BY registered_jur desc, s1.cname

limit 11

Unnamed: 0,cname,registered_jur
0,Donald Trump,4497
1,Gary Johnson,4497
2,Hillary Clinton,4497
3,Jill Stein,4178
4,Darrell Castle,2775
5,Evan Mcmullin,2774
6,Laurence Kotlikoff,2395
7,Rocky Roque De La Fuente,1987
8,Cherunda Fox,1699
9,Tom Hoefling,1490


### 2 views for visualization ###

We choose queries 5 and 6 and create corresponding views:

In [6]:
%%bigquery
create view hdv_modeled.v_Presidential_HS_Repub_Correlation as
SELECT s1.state, s1.jname, s1.vote_pct as rep_support, s2.High_School_Only__ + s2.__High_School__ as hs_highest_edu
FROM `alert-result-266803.hdv_modeled.Results_Beam_DF` s1
JOIN `alert-result-266803.education.education` s2
ON (s1.fipscode) = s2.FIPS_Code and s1.state = s2.state
WHERE s2.High_School_Only > 0 and office = 'President' and s1.party = 'Republican'
ORDER BY s1.state, s1.jname

In [5]:
%%bigquery
create view hdv_modeled.v_Presidential_Jur_Regs as
SELECT  s1.cname, COUNT(*) as registered_jur
FROM `alert-result-266803.hdv_modeled.Results_Beam_DF` s1
JOIN `alert-result-266803.hdv_modeled.Candidate_Beam_DF` s2
ON s1.cname = s2.cname
WHERE s1.office = 'President' and s1.year = 2016
GROUP BY s1.cname
HAVING COUNT(*) >= 1000
ORDER BY registered_jur desc, s1.cname

### subqueries ###

1. In this first subquery, we show which states had an above-average number of candidates running in the 2016 Senate election. We employ the COUNT and AVG aggregate functions in this query.

In [20]:
%%bigquery 
select * from 
    (select state, count(distinct cname) as num_sen_candidates
    from hdv_modeled.Results_Beam_DF
    where office = 'US Senate' and year = 2016
    group by state)
where num_sen_candidates >
    (select AVG(num_sen_candidates) from
        (select state, count(distinct cname) as num_sen_candidates
        from hdv_modeled.Results_Beam_DF
        where office = 'US Senate' and year = 2016
        group by state))

Unnamed: 0,state,num_sen_candidates
0,AK,6
1,AZ,11
2,CO,7
3,CT,6
4,FL,12
5,IL,10
6,LA,24
7,MD,9
8,MO,11
9,NV,6
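
The query above writes the per-state count twice, once for the outer filter and once inside the AVG. A `WITH` clause (which BigQuery also supports) lets the count be defined once and referenced in both places. The sketch below shows the pattern with Python's built-in `sqlite3` module and a made-up `results` table; the table name, state codes, and candidate names are hypothetical.

```python
import sqlite3

# Hypothetical toy data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (state TEXT, cname TEXT)")
conn.executemany(
    "INSERT INTO results VALUES (?, ?)",
    [
        ("S1", "A"), ("S1", "B"), ("S1", "C"), ("S1", "D"),  # 4 distinct candidates
        ("S2", "E"), ("S2", "F"),                            # 2 distinct candidates
        ("S3", "G"), ("S3", "G"),                            # 1 distinct candidate
    ],
)

# per_state is defined once and used both as the row source and inside AVG.
above_average = conn.execute(
    """
    WITH per_state AS (
        SELECT state, COUNT(DISTINCT cname) AS num_candidates
        FROM results
        GROUP BY state
    )
    SELECT state, num_candidates
    FROM per_state
    WHERE num_candidates > (SELECT AVG(num_candidates) FROM per_state)
    ORDER BY state
    """
).fetchall()

print(above_average)  # [('S1', 4)]
```

Here the average of 4, 2, and 1 is about 2.33, so only the first state passes the filter.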


2. The following query shows the states in which Hillary Clinton won over 50% of the popular vote in the 2016 presidential election.

In [14]:
%%bigquery
select state, clinton_pct
from
    (select state, sum(votes)/sum(total_votes)*100 as clinton_pct
    from hdv_modeled.Results_Beam_DF
    where cname = 'Hillary Clinton'
    group by state)
where clinton_pct > 50
order by state

Unnamed: 0,state,clinton_pct
0,CA,61.7264
1,CT,54.566301
2,DE,53.353337
3,HI,62.221492
4,IL,54.479408
5,MA,59.050415
6,MD,60.325744
7,NJ,55.453084
8,OR,50.071852
9,RI,54.354619


3. The following subquery compares the number of registered House candidates in each state between the 2016 presidential election year and the 2018 midterm year, expressed as a percent change from 2016 to 2018.

In [5]:
%%bigquery 
select state, (num_house_candidates_2018-num_house_candidates_2016)/num_house_candidates_2016*100 as pct_change_house_candidates
from
    (select x.state, num_house_candidates_2016, num_house_candidates_2018
    from
        (select state, count(distinct cname) as num_house_candidates_2016
        from hdv_modeled.Results_Beam_DF
        where office = 'US House' and year = 2016
        group by state) x
    join
        (select state, count(distinct cname) as num_house_candidates_2018
        from hdv_modeled.Results_Beam_DF
        where office = 'US House' and year = 2018
        group by state) y
    on x.state = y.state)
order by state
limit 11

Unnamed: 0,state,pct_change_house_candidates
0,AK,-50.0
1,AL,18.181818
2,AR,33.333333
3,AZ,-7.692308
4,CA,0.0
5,CO,59.090909
6,CT,-34.782609
7,DE,25.0
8,FL,-15.714286
9,GA,13.043478
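
The percent-change column follows the formula (count in 2018 minus count in 2016) divided by the count in 2016, times 100. The sketch below restates that arithmetic in plain Python; the `pct_change` helper and the counts passed to it are hypothetical, chosen only to illustrate the calculation. Because the query uses an inner join, a state with no House candidates in one of the two years would simply be absent from the result; the zero check here covers the case where a 2016 count of zero is passed in directly.

```python
def pct_change(count_2016, count_2018):
    """Percent change from the 2016 count to the 2018 count."""
    if count_2016 == 0:
        return None  # undefined when there is no 2016 baseline
    return (count_2018 - count_2016) / count_2016 * 100

# Hypothetical counts, not taken from the dataset.
print(pct_change(8, 10))  # 25.0
print(pct_change(4, 2))   # -50.0
print(pct_change(0, 3))   # None
```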


4. This query compares each state's presidential result, measured as Donald Trump's percentage of the vote, with the percentage of adults in that state whose highest educational attainment is at most high school. State-level rows of the education table are selected by keeping FIPS codes that are multiples of 1000.

In [25]:
%%bigquery 
select x.state, trump_pct, e.High_School_Only__ + e.__High_School__ as hs_highest_edu 
from
    (select state, sum(votes)/sum(total_votes)*100 as trump_pct
    from hdv_modeled.Results_Beam_DF
    where cname = 'Donald Trump'
    group by state) x
join education.education e
on x.state = e.State and mod(e.FIPS_code,1000) = 0
order by state
limit 11

Unnamed: 0,state,trump_pct,hs_highest_edu
0,AK,51.32405,35.3
1,AL,62.083092,45.1
2,AR,60.574102,47.9
3,AZ,52.373407,37.2
4,CA,31.617107,37.7
5,CO,43.251397,30.0
6,CT,40.926914,36.6
7,DE,41.922824,41.8
8,FL,49.021941,40.8
9,GA,51.048719,41.2


### 1 view for visualization ###

We create a view from subquery 3:

In [19]:
%%bigquery
create view hdv_modeled.v_General_vs_Midterm_House_Candidates_Count as
select state, (num_house_candidates_2018-num_house_candidates_2016)/num_house_candidates_2016*100 as pct_change_house_candidates
from
    (select x.state, num_house_candidates_2016, num_house_candidates_2018
    from
        (select state, count(distinct cname) as num_house_candidates_2016
        from `alert-result-266803.hdv_modeled.Results_Beam_DF`
        where office = 'US House' and year = 2016
        group by state) x
    join
        (select state, count(distinct cname) as num_house_candidates_2018
        from `alert-result-266803.hdv_modeled.Results_Beam_DF`
        where office = 'US House' and year = 2018
        group by state) y
    on x.state = y.state)
order by state
