## **Aggregate Queries**

### Below are six examples of aggregate queries for our dataset

#### This updated query joins the State and Date entity tables on dt to compare each state's temperature with the global average land temperature on the same date. The join is grouped by Date and State and shows the maximum recorded state temperature and global temperature (in case several records were taken on the same date), sorted by ascending date and then alphabetically by state name.

In [17]:
%%bigquery
SELECT State.dt as Date, 
State.State as State,
MAX(State.AverageTemperature) as StateTemp,
MAX(Date.LandAverageTemperature) as GlobalTemp
FROM kaggle_modeled.State
JOIN kaggle_modeled.Date
ON State.dt = Date.dt
WHERE State.AverageTemperature IS NOT NULL
GROUP BY Date, State
ORDER BY Date, State
LIMIT 12

Unnamed: 0,Date,State,StateTemp,GlobalTemp
0,1750-01-01,Alabama,7.444,3.034
1,1750-01-01,Arkhangel'Sk,-13.213,3.034
2,1750-01-01,Belgorod,-6.624,3.034
3,1750-01-01,Bryansk,-5.881,3.034
4,1750-01-01,Chuvash,-11.199,3.034
5,1750-01-01,City Of St. Petersburg,-4.101,3.034
6,1750-01-01,Connecticut,-4.043,3.034
7,1750-01-01,Delaware,-0.466,3.034
8,1750-01-01,District Of Columbia,-1.08,3.034
9,1750-01-01,Florida,15.298,3.034


#### This query joins the State and Country tables on date and country to return the average state and country temperatures, grouped by date and state and ordered by recording date

In [25]:
%%bigquery
SELECT State.dt as Date, 
State.State as State,
AVG(State.AverageTemperature) as AvgStateTemp,
AVG(Country.AverageTemperature) as AvgCountryTemp
FROM kaggle_modeled.State
JOIN kaggle_modeled.Country
ON State.dt = Country.dt and State.Country = Country.Country
WHERE Country.AverageTemperature IS NOT NULL
GROUP BY Date, State
ORDER BY Date
LIMIT 12

Unnamed: 0,Date,State,AvgStateTemp,AvgCountryTemp
0,1768-09-01,Newfoundland And Labrador,6.999,5.257
1,1768-09-01,Saskatchewan,9.808,5.257
2,1768-09-01,Manitoba,8.937,5.257
3,1768-09-01,Alberta,8.772,5.257
4,1768-09-01,Iowa,16.899,15.42
5,1768-09-01,Prince Edward Island,12.302,5.257
6,1768-09-01,New Brunswick,11.316,5.257
7,1768-09-01,Nova Scotia,12.612,5.257
8,1768-09-01,Ohio,16.926,15.42
9,1768-09-01,Ontario,10.016,5.257


#### The below query joins the Country beam and City beam tables on date and country to display aggregated averages for city and country temperature. The results are grouped by city, country, and recording date, and ordered by ascending city temperature and then date.

In [37]:
%%bigquery
SELECT City_Beam_DF.City, 
AVG(City_Beam_DF.AverageTemperature) as CityTemp, 
AVG(Country_Beam_DF.AverageTemperature) as CountryTemp,
Country_Beam_DF.Country,
City_Beam_DF.dt as RecordingDate
FROM kaggle_modeled.City_Beam_DF
LEFT JOIN kaggle_modeled.Country_Beam_DF
ON Country_Beam_DF.dt = City_Beam_DF.dt and Country_Beam_DF.Country = City_Beam_DF.Country
WHERE Country_Beam_DF.AverageTemperature IS NOT NULL
GROUP BY City, Country, RecordingDate
ORDER BY CityTemp, RecordingDate
LIMIT 12

Unnamed: 0,City,CityTemp,CountryTemp,Country,RecordingDate
0,Norilsk,-42.704,-24.371,Russia,1979-02-01
1,Kyzyl,-41.101,-30.497,Russia,1893-01-01
2,Norilsk,-39.919,-27.705,Russia,1966-02-01
3,Norilsk,-39.683,-27.136,Russia,1974-01-01
4,Norilsk,-39.403,-28.584,Russia,1979-01-01
5,Kyzyl,-39.038,-29.443,Russia,1919-01-01
6,Kyzyl,-38.951,-28.76,Russia,1872-01-01
7,Kyzyl,-38.906,-29.789,Russia,1969-01-01
8,Norilsk,-38.784,-29.561,Russia,1969-02-01
9,Norilsk,-38.715,-26.354,Russia,1922-02-01
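
One thing worth noting about the query above: because the WHERE clause requires Country_Beam_DF.AverageTemperature to be non-NULL, city rows without a matching country row are filtered out, so the LEFT JOIN should behave like an inner JOIN here. A minimal sketch of that effect, using Python's built-in sqlite3 module with made-up rows instead of BigQuery (the table names `city` and `country` and all values are illustrative assumptions, not the real tables):

```python
import sqlite3

# Toy stand-ins for City_Beam_DF and Country_Beam_DF (hypothetical rows).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE city (City TEXT, Country TEXT, dt TEXT, AverageTemperature REAL);
CREATE TABLE country (Country TEXT, dt TEXT, AverageTemperature REAL);
INSERT INTO city VALUES
  ('Norilsk', 'Russia', '1979-02-01', -42.7),
  ('Paris', 'France', '1979-02-01', 3.1),
  ('Kyzyl', 'Russia', '1893-01-01', -41.1);
INSERT INTO country VALUES
  ('Russia', '1979-02-01', -24.4),
  ('Russia', '1893-01-01', NULL);
""")

# Same join condition and filter as the notebook query; only the join type varies.
query = """
SELECT c.City
FROM city AS c
{join} country AS t
  ON t.dt = c.dt AND t.Country = c.Country
WHERE t.AverageTemperature IS NOT NULL
ORDER BY c.City
"""

left_rows = conn.execute(query.format(join="LEFT JOIN")).fetchall()
inner_rows = conn.execute(query.format(join="JOIN")).fetchall()

# Paris has no country row and Kyzyl's country value is NULL,
# so both variants keep only Norilsk.
print(left_rows)
print(inner_rows)
```

If unmatched cities were meant to appear in the result, the NULL check would presumably need to move into the ON clause or be dropped.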


#### The below query returns the average temperature and average temperature uncertainty for major cities (those flagged with major_city = 1) in the City beam table. The results are grouped by city and date and ordered by ascending date.

In [80]:
%%bigquery 
SELECT City_Beam_DF.City,
City_Beam_DF.dt as Date,
AVG(City_Beam_DF.AverageTemperature) as AvgTemp,
AVG(City_Beam_DF.AverageTemperatureUncertainty) as AvgTempUncertainty
FROM kaggle_modeled.City_Beam_DF
WHERE City_Beam_DF.AverageTemperature IS NOT NULL
GROUP BY City,Date
HAVING MIN(City_Beam_DF.major_city) = 1
ORDER BY Date
LIMIT 12

Unnamed: 0,City,Date,AvgTemp,AvgTempUncertainty
0,Ankara,1755-10-01,11.088,4.733
1,Montreal,1755-10-01,4.743,3.346
2,Izmir,1755-10-01,17.982,4.911
3,New York,1755-10-01,8.669,3.393
4,Moscow,1755-10-01,4.626,5.901
5,Kiev,1755-10-01,7.147,6.002
6,Chicago,1755-10-01,11.196,3.692
7,Rome,1755-10-01,11.814,6.516
8,Paris,1755-10-01,9.972,7.81
9,Istanbul,1755-10-01,14.596,5.076


#### The below query returns the global average land temperature for each date, along with the maximum of the land max temperature and the minimum of the land min temperature, in ascending order of average land temperature

In [68]:
%%bigquery
SELECT Date.dt as Date,
Date.LandAverageTemperature as AvgLandTemp,
MAX(Date.LandMaxTemperature) as AvgLandMaxTemp,
MIN(Date.LandMinTemperature) as AvgLandMinTemp,
FROM kaggle_modeled.Date
WHERE Date.LandAverageTemperature IS NOT NULL AND
Date.LandMaxTemperature IS NOT NULL AND
Date.LandMinTemperature IS NOT NULL
GROUP BY Date,AvgLandTemp
ORDER BY AvgLandTemp
LIMIT 12

Unnamed: 0,Date,AvgLandTemp,AvgLandMaxTemp,AvgLandMinTemp
0,1861-01-01,0.404,7.743,-3.256
1,1893-01-01,0.5,5.9,-5.345
2,1862-01-01,0.685,7.361,-4.11
3,1850-01-01,0.749,8.242,-3.206
4,1887-01-01,0.824,6.864,-4.678
5,1885-01-01,1.003,6.421,-4.621
6,1854-01-01,1.281,8.786,-3.552
7,1895-01-01,1.295,6.961,-4.319
8,1909-01-01,1.395,7.121,-4.298
9,1886-01-01,1.436,7.023,-3.755


#### The below query returns the aggregated maximum average temperature and maximum average temperature uncertainty per date, state, and country from the State beam table for results in the current century (2000 and onward)

In [72]:
%%bigquery
SELECT State_Beam_DF.dt as Date,
State_Beam_DF.State as State,
State_Beam_DF.Country as Country,
MAX(State_Beam_DF.AverageTemperature) as AvgTemp,
MAX(State_Beam_DF.AverageTemperatureUncertainty) as AvgTempUncertainty
FROM kaggle_modeled.State_Beam_DF
GROUP BY Date, State, Country
HAVING MAX(State_Beam_DF.dt) > '1999-12-31'
ORDER BY Date

Unnamed: 0,Date,State,Country,AvgTemp,AvgTempUncertainty
0,2000-01-01,Goa,India,25.379,0.266
1,2000-01-01,Acre,Brazil,25.934,0.379
2,2000-01-01,Amur,Russia,-28.286,0.572
3,2000-01-01,Iowa,United States,-5.230,0.184
4,2000-01-01,Komi,Russia,-18.236,0.529
...,...,...,...,...,...
39579,2013-09-01,British Columbia,Canada,10.332,2.327
39580,2013-09-01,District Of Columbia,United States,19.643,1.050
39581,2013-09-01,Prince Edward Island,Canada,15.021,1.778
39582,2013-09-01,Northwest Territories,Canada,5.831,4.031
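
Since dt is one of the grouping columns, the `HAVING MAX(State_Beam_DF.dt) > '1999-12-31'` condition should be equivalent to a plain `WHERE` filter on dt applied before grouping. The sketch below checks this on a toy table with Python's built-in sqlite3 module; the table name `state_temps`, its rows, and the TEXT-typed dt column are assumptions for illustration (ISO-formatted date strings compare correctly as text in SQLite, whereas the BigQuery table presumably uses a date type):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE state_temps (dt TEXT, State TEXT, Country TEXT, AverageTemperature REAL);
INSERT INTO state_temps VALUES
  ('1999-12-01', 'Goa', 'India', 24.9),
  ('2000-01-01', 'Goa', 'India', 25.379),
  ('2000-01-01', 'Iowa', 'United States', -5.23),
  ('2013-09-01', 'Iowa', 'United States', 19.0);
""")

# Filter applied after grouping, as in the notebook query.
having_sql = """
SELECT dt, State, Country, MAX(AverageTemperature) AS MaxTemp
FROM state_temps
GROUP BY dt, State, Country
HAVING MAX(dt) > '1999-12-31'
ORDER BY dt, State
"""

# Same filter applied before grouping.
where_sql = """
SELECT dt, State, Country, MAX(AverageTemperature) AS MaxTemp
FROM state_temps
WHERE dt > '1999-12-31'
GROUP BY dt, State, Country
ORDER BY dt, State
"""

having_rows = conn.execute(having_sql).fetchall()
where_rows = conn.execute(where_sql).fetchall()
print(having_rows == where_rows)
```

Filtering in `WHERE` lets the engine discard pre-2000 rows before it aggregates, which is usually the more conventional place for a non-aggregate condition.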


### Creating the views for Data Studio based on two of our queries

In [93]:
%%bigquery
CREATE OR REPLACE VIEW kaggle_modeled.v_Global_Land_Temperature_Variations AS
  SELECT dt as Date,
  LandAverageTemperature as AvgLandTemp,
  MAX(LandMaxTemperature) as AvgLandMaxTemp,
  MIN(LandMinTemperature) as AvgLandMinTemp,
  FROM `electric-spark-266716.kaggle_modeled.Date`
  WHERE LandAverageTemperature IS NOT NULL AND
  LandMaxTemperature IS NOT NULL AND
  LandMinTemperature IS NOT NULL
  GROUP BY Date,AvgLandTemp
  ORDER BY AvgLandTemp

In [92]:
%%bigquery
CREATE OR REPLACE VIEW kaggle_modeled.v_Major_City_Temperature AS
    SELECT City,
    dt as Date,
    AVG(AverageTemperature) as AvgTemp,
    AVG(AverageTemperatureUncertainty) as AvgTempUncertainty
    FROM `electric-spark-266716.kaggle_modeled.City_Beam_DF`
    WHERE AverageTemperature IS NOT NULL
    GROUP BY City,Date
    HAVING MIN(major_city) = 1
    ORDER BY Date


## **Queries and Subqueries**

The following query finds all dates and associated state temperatures (from any country) that are higher than the overall average temperature of states in the United States.

In [4]:
%%bigquery
SELECT s.dt, s.AverageTemperature
FROM kaggle_modeled.State_Beam_DF as s
WHERE s.AverageTemperature > 
    (SELECT AVG(AverageTemperature)
     FROM kaggle_modeled.State_Beam_DF
     WHERE Country = "United States")

Unnamed: 0,dt,AverageTemperature
0,1822-03-01,28.749
1,1815-03-01,25.881
2,1814-07-01,23.781
3,1939-09-01,25.296
4,1936-03-01,27.030
...,...,...
315686,2009-04-01,12.296
315687,1930-10-01,12.544
315688,1971-11-01,12.646
315689,1935-11-01,14.277
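
To make the behaviour of the scalar subquery concrete: the inner SELECT yields a single number (the United States average), and the outer query compares every row against it, including rows from other countries. A small sketch with Python's built-in sqlite3 module, where the table `state_temps` and its four rows are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE state_temps (dt TEXT, State TEXT, Country TEXT, AverageTemperature REAL);
INSERT INTO state_temps VALUES
  ('2000-01-01', 'Iowa', 'United States', -5.0),
  ('2000-07-01', 'Iowa', 'United States', 23.0),
  ('2000-01-01', 'Goa', 'India', 25.0),
  ('2000-01-01', 'Amur', 'Russia', -28.0);
""")

# The subquery on its own: one value, the mean over United States rows only.
us_avg = conn.execute(
    "SELECT AVG(AverageTemperature) FROM state_temps WHERE Country = 'United States'"
).fetchone()[0]

# The outer query is not restricted by country, so Goa qualifies as well.
rows = conn.execute("""
SELECT s.State, s.Country, s.AverageTemperature
FROM state_temps AS s
WHERE s.AverageTemperature >
    (SELECT AVG(AverageTemperature)
     FROM state_temps
     WHERE Country = 'United States')
ORDER BY s.State
""").fetchall()

print(us_avg)
print(rows)
```

Restricting the result to United States records would take an additional `Country = 'United States'` condition on the outer query.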


This query finds the cities and countries, with their associated temperature uncertainties, for city records whose uncertainty equals the minimum uncertainty among records with a valid AverageTemperature.

In [9]:
%%bigquery
SELECT c.City as City,
t.Country as Country,
c.AverageTemperatureUncertainty as CityTempUncertainty,
t.AverageTemperatureUncertainty as CountryTempUncertainty,
FROM kaggle_modeled.City_Beam_DF as c
LEFT JOIN kaggle_modeled.Country_Beam_DF as t
ON c.Country = t.Country
WHERE c.AverageTemperatureUncertainty =
    (SELECT MIN(AverageTemperatureUncertainty)
     FROM kaggle_modeled.City_Beam_DF
     WHERE AverageTemperature IS NOT NULL)
ORDER BY c.City

Unnamed: 0,City,Country,CityTempUncertainty,CountryTempUncertainty
0,Kroonstad,South Africa,0.034,0.250
1,Kroonstad,South Africa,0.034,0.250
2,Kroonstad,South Africa,0.034,0.250
3,Kroonstad,South Africa,0.034,0.250
4,Kroonstad,South Africa,0.034,0.250
...,...,...,...,...
5614,Welkom,South Africa,0.034,0.265
5615,Welkom,South Africa,0.034,0.265
5616,Welkom,South Africa,0.034,0.265
5617,Welkom,South Africa,0.034,0.390
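
The repeated rows in this output are most likely a join fan-out: the tables are joined on Country alone, so each city record pairs with every dated record for its country. The sketch below reproduces the effect on toy data with Python's built-in sqlite3 module and shows one possible remedy, also matching on dt. The table names `city` and `country`, the shortened column name `Unc`, and the rows are illustrative assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE city (City TEXT, Country TEXT, dt TEXT, Unc REAL);
CREATE TABLE country (Country TEXT, dt TEXT, Unc REAL);
INSERT INTO city VALUES ('Kroonstad', 'South Africa', '2000-01-01', 0.034);
INSERT INTO country VALUES
  ('South Africa', '2000-01-01', 0.25),
  ('South Africa', '2000-02-01', 0.265),
  ('South Africa', '2000-03-01', 0.39);
""")

# Joining on Country only: the single city row matches all three country rows.
country_only = conn.execute("""
SELECT c.City, t.Unc
FROM city AS c
LEFT JOIN country AS t ON c.Country = t.Country
ORDER BY t.dt
""").fetchall()

# Joining on Country and dt: the city row matches only the same-date country row.
country_and_date = conn.execute("""
SELECT c.City, t.Unc
FROM city AS c
LEFT JOIN country AS t ON c.Country = t.Country AND c.dt = t.dt
""").fetchall()

print(len(country_only), len(country_and_date))
```

Whether the date should be part of the join depends on the intent; if the goal is to pair each city reading with the country reading from the same month, matching on dt as well seems the natural choice.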


The following query selects the dates and states, with the corresponding maximum state temperatures and global temperatures, for state records whose AverageTemperatureUncertainty is greater than the maximum recorded for Texas.

In [15]:
%%bigquery
SELECT State.dt as Date,  
State.State as State, 
MAX(State.AverageTemperature) as StateTemp, 
MAX(Date.LandAverageTemperature) as GlobalTemp 
FROM kaggle_modeled.State 
JOIN kaggle_modeled.Date 
ON State.dt = Date.dt 
WHERE State.AverageTemperatureUncertainty >
    (SELECT MAX(s.AverageTemperatureUncertainty)
     FROM kaggle_modeled.State as s
     WHERE s.State = "Texas")
GROUP BY Date, State
ORDER BY Date, State


Unnamed: 0,Date,State,StateTemp,GlobalTemp
0,1752-01-01,Komi,-21.183,0.348
1,1752-01-01,Nenets,-20.383,0.348
2,1752-03-01,Komi,-8.854,5.806
3,1752-03-01,Nenets,-12.504,5.806
4,1752-04-01,Komi,-3.300,8.265
...,...,...,...,...
12728,1933-01-01,Koryak,-24.759,2.169
12729,1933-12-01,Chukot,-23.719,3.104
12730,1933-12-01,Koryak,-13.814,3.104
12731,1935-12-01,Chukot,-22.914,3.423


This query finds the City, AverageTemperature, Longitude, and Latitude of all city records (in any country) whose average temperature is greater than India's maximum country-level average temperature.

In [26]:
%%bigquery
SELECT City, AverageTemperature, Longitude, Latitude
FROM kaggle_modeled.City_Beam_DF
WHERE AverageTemperature > (
    SELECT MAX(t.AverageTemperature)
    FROM kaggle_modeled.Country_Beam_DF as t
    WHERE t.Country = "India")

Unnamed: 0,City,AverageTemperature,Longitude,Latitude
0,Jiroft,32.044,57.27E,28.13N
1,Mecca,31.843,40.67E,21.70N
2,Mecca,32.065,40.67E,21.70N
3,Rabak,33.028,32.20E,13.66N
4,Damaturu,31.955,11.51E,12.05N
...,...,...,...,...
146169,Tempe,32.086,112.02W,32.95N
146170,Peoria,32.937,112.02W,32.95N
146171,Mesa,32.765,112.02W,32.95N
146172,Mesa,32.429,112.02W,32.95N


The following creates a view from the third of the four queries above (state records with higher uncertainty than the Texas maximum):

In [30]:
%%bigquery
CREATE OR REPLACE VIEW kaggle_modeled.v_Higher_Uncertainty_than_Texas AS
    SELECT q.dt as Date,  
    q.State as State, 
    MAX(q.AverageTemperature) as StateTemp, 
    MAX(r.LandAverageTemperature) as GlobalTemp 
    FROM `electric-spark-266716.kaggle_modeled.State` as q
    JOIN `electric-spark-266716.kaggle_modeled.Date` as r
    ON q.dt = r.dt 
    WHERE q.AverageTemperatureUncertainty >
        (SELECT MAX(s.AverageTemperatureUncertainty)
         FROM `electric-spark-266716.kaggle_modeled.State` as s
         WHERE s.State = "Texas")
    GROUP BY Date, State
    ORDER BY Date, State

