# **Cross Dataset Queries**

### Below are our three cross-dataset queries and the views created from them

#### This query joins the Health Statistics table from our second dataset to the Country entity table of our primary dataset to retrieve the average temperature and the male survival rate to age 65 (% of cohort) for every country present in both datasets. The results are ordered chronologically.

In [36]:
%%bigquery
SELECT Country_Beam_DF.Country as Country,
Health_Statistics_Beam_DF.dt as Date,
AVG(Health_Statistics_Beam_DF.statistic) as MaleSurvivalTo65Percent,
AVG(Country_Beam_DF.AverageTemperature) as AvgTemp
FROM kaggle2_modeled.Health_Statistics_Beam_DF
JOIN kaggle_modeled.Country_Beam_DF
ON Health_Statistics_Beam_DF.countryName = Country_Beam_DF.Country
WHERE Health_Statistics_Beam_DF.countryName is not Null
AND Health_Statistics_Beam_DF.metric = "Survival to age 65, male (% of cohort)"
GROUP BY Country,Date
ORDER BY Date
LIMIT 12

| | Country | Date | MaleSurvivalTo65Percent | AvgTemp |
|---|---|---|---|---|
| 0 | Aruba | 1960-01-01 | 63.02099 | 27.92039 |
| 1 | Mali | 1960-01-01 | 14.52538 | 28.441977 |
| 2 | Oman | 1960-01-01 | 33.91443 | 26.916863 |
| 3 | Cuba | 1960-01-01 | 63.2238 | 25.407426 |
| 4 | Benin | 1960-01-01 | 26.84703 | 27.171999 |
| 5 | Peru | 1960-01-01 | 40.4589 | 19.935974 |
| 6 | Iraq | 1960-01-01 | 42.11173 | 21.775629 |
| 7 | Chad | 1960-01-01 | 24.68679 | 27.120466 |
| 8 | Chile | 1960-01-01 | 49.16909 | 9.383474 |
| 9 | Fiji | 1960-01-01 | 49.28894 | 25.038672 |


#### This query joins the Life Statistics table from our second dataset to the Country entity table of our primary dataset to retrieve the average temperature and the crude death rate per 1,000 people. The results are presented in chronological order. They can be used to look for a correlation, or the lack of one, between a country's average temperature and its crude death rate over time.

In [8]:
%%bigquery
SELECT Country_Beam_DF.Country as Country,
Life_Statistics_Beam_DF.dt as Date,
AVG(Life_Statistics_Beam_DF.statistic) as CrudeDeathRate,
AVG(Country_Beam_DF.AverageTemperature) as AvgTemp
FROM kaggle2_modeled.Life_Statistics_Beam_DF
JOIN kaggle_modeled.Country_Beam_DF ON Life_Statistics_Beam_DF.countryName = Country_Beam_DF.Country
WHERE Life_Statistics_Beam_DF.countryName is not Null
AND Life_Statistics_Beam_DF.metric = "Death rate, crude (per 1,000 people)"
GROUP BY Country,Date
ORDER BY Date
LIMIT 12

| | Country | Date | CrudeDeathRate | AvgTemp |
|---|---|---|---|---|
| 0 | Aruba | 1960-01-01 | 6.389 | 27.92039 |
| 1 | Mali | 1960-01-01 | 36.838 | 28.441977 |
| 2 | Oman | 1960-01-01 | 22.457 | 26.916863 |
| 3 | Cuba | 1960-01-01 | 8.83 | 25.407426 |
| 4 | Benin | 1960-01-01 | 28.827 | 27.171999 |
| 5 | Peru | 1960-01-01 | 18.736 | 19.935974 |
| 6 | Iraq | 1960-01-01 | 17.534 | 21.775629 |
| 7 | Chad | 1960-01-01 | 26.546 | 27.120466 |
| 8 | Chile | 1960-01-01 | 12.355 | 9.383474 |
| 9 | Fiji | 1960-01-01 | 11.54 | 25.038672 |
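
The correlation mentioned in the description can be quantified with a Pearson coefficient. The sketch below is only an illustration: it hard-codes the ten rows displayed above (all dated 1960-01-01), and the `pearson` helper is our own, not part of the notebook. Ten rows from a single year are far too few to draw conclusions from; the same calculation would need to be run over the full query result.

```python
import math

# The ten (Country, AvgTemp, CrudeDeathRate) rows displayed above.
rows = [
    ("Aruba", 27.92039, 6.389),
    ("Mali", 28.441977, 36.838),
    ("Oman", 26.916863, 22.457),
    ("Cuba", 25.407426, 8.83),
    ("Benin", 27.171999, 28.827),
    ("Peru", 19.935974, 18.736),
    ("Iraq", 21.775629, 17.534),
    ("Chad", 27.120466, 26.546),
    ("Chile", 9.383474, 12.355),
    ("Fiji", 25.038672, 11.54),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


temps = [temp for _, temp, _ in rows]
death_rates = [rate for _, _, rate in rows]
r = pearson(temps, death_rates)
print(f"Pearson r over {len(rows)} rows: {r:.3f}")
```

A value near 0 suggests no linear relationship, while values near 1 or -1 suggest a strong positive or negative one.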


#### This query joins the Population Statistics table from our second dataset to the Country entity table of our primary dataset to retrieve the average temperature and total population for India. Because the join matches on country name only, AvgTemp is averaged over all of India's temperature records and is the same in every row, so the results show population growth by year against a single long-run average temperature rather than year-by-year temperature change.

In [12]:
%%bigquery
SELECT Country_Beam_DF.Country as Country,
Population_Statistics_Beam_DF.dt as Date,
AVG(Population_Statistics_Beam_DF.statistic) as TotalPopulation,
AVG(Country_Beam_DF.AverageTemperature) as AvgTemp
FROM kaggle2_modeled.Population_Statistics_Beam_DF
JOIN kaggle_modeled.Country_Beam_DF ON Population_Statistics_Beam_DF.countryName = Country_Beam_DF.Country
WHERE Population_Statistics_Beam_DF.countryName is not Null
AND Population_Statistics_Beam_DF.metric = "Population, total"
AND Country_Beam_DF.Country = "India"
GROUP BY Country,Date
ORDER BY Date, TotalPopulation
LIMIT 12

| | Country | Date | TotalPopulation | AvgTemp |
|---|---|---|---|---|
| 0 | India | 1960-01-01 | 449661874.0 | 23.873789 |
| 1 | India | 1961-01-01 | 458691457.0 | 23.873789 |
| 2 | India | 1962-01-01 | 468054145.0 | 23.873789 |
| 3 | India | 1963-01-01 | 477729958.0 | 23.873789 |
| 4 | India | 1964-01-01 | 487690114.0 | 23.873789 |
| 5 | India | 1965-01-01 | 497920270.0 | 23.873789 |
| 6 | India | 1966-01-01 | 508402908.0 | 23.873789 |
| 7 | India | 1967-01-01 | 519162069.0 | 23.873789 |
| 8 | India | 1968-01-01 | 530274729.0 | 23.873789 |
| 9 | India | 1969-01-01 | 541844848.0 | 23.873789 |
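
In the output above, AvgTemp is identical in every row. This follows from the join condition, which matches on country name only: every population row is paired with the same set of temperature rows, so the average never changes. The toy sketch below reproduces the effect in plain Python. The rows are made up for illustration, and it assumes the temperature table carries one record per period, which the notebook does not show.

```python
from collections import defaultdict

# Hypothetical toy rows, NOT taken from the datasets.
# Temperature records: (country, year, temperature)
temperature_rows = [
    ("India", 1960, 23.0),
    ("India", 1961, 24.0),
    ("India", 1962, 25.0),
]
# Population records: (country, year, population)
population_rows = [
    ("India", 1960, 100.0),
    ("India", 1961, 110.0),
    ("India", 1962, 120.0),
]


def avg_temp_by_year(match_year):
    """Join the two lists and average temperature per population year.

    match_year=False joins on country only, as the query above does;
    match_year=True additionally requires the years to be equal.
    """
    temps = defaultdict(list)
    for p_country, p_year, _ in population_rows:
        for t_country, t_year, temp in temperature_rows:
            if p_country != t_country:
                continue
            if match_year and p_year != t_year:
                continue
            temps[p_year].append(temp)
    return {year: sum(vals) / len(vals) for year, vals in sorted(temps.items())}


country_only = avg_temp_by_year(match_year=False)
country_and_year = avg_temp_by_year(match_year=True)
print(country_only)      # {1960: 24.0, 1961: 24.0, 1962: 24.0}
print(country_and_year)  # {1960: 23.0, 1961: 24.0, 1962: 25.0}
```

If year-by-year temperature is the goal, the join would also need a date or year condition, provided the Country table exposes such a column.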


### Creating three views from the three queries above

In [42]:
%%bigquery
CREATE OR REPLACE VIEW kaggle2_modeled.v_Male_Survival_To_65_Percentage AS
    SELECT CB.Country as Country,
    HS.dt as Date,
    AVG(HS.statistic) as MaleSurvivalTo65Percent,
    AVG(CB.AverageTemperature) as AvgTemp
    FROM `electric-spark-266716.kaggle2_modeled.Health_Statistics_Beam_DF` as HS
    JOIN `electric-spark-266716.kaggle_modeled.Country_Beam_DF` as CB
    ON HS.countryName = CB.Country
    WHERE HS.countryName is not Null
    AND HS.metric = "Survival to age 65, male (% of cohort)"
    GROUP BY Country,Date
    ORDER BY Date

In [1]:
%%bigquery
CREATE OR REPLACE VIEW kaggle2_modeled.v_Crude_Death_Rate AS
    SELECT CB.Country as Country,
    LS.dt as Date,
    AVG(LS.statistic) as CrudeDeathRate,
    AVG(CB.AverageTemperature) as AvgTemp
    FROM `electric-spark-266716.kaggle2_modeled.Life_Statistics_Beam_DF` as LS
    JOIN `electric-spark-266716.kaggle_modeled.Country_Beam_DF` as CB
    ON LS.countryName = CB.Country
    WHERE LS.countryName is not Null
    AND LS.metric = "Death rate, crude (per 1,000 people)"
    GROUP BY Country,Date
    ORDER BY Date

In [13]:
%%bigquery
CREATE OR REPLACE VIEW kaggle2_modeled.v_Population_and_Temp_India AS
    SELECT CB.Country as Country,
    PS.dt as Date,
    AVG(PS.statistic) as TotalPopulation,
    AVG(CB.AverageTemperature) as AvgTemp
    FROM `electric-spark-266716.kaggle2_modeled.Population_Statistics_Beam_DF` as PS
    JOIN `electric-spark-266716.kaggle_modeled.Country_Beam_DF` as CB
    ON PS.countryName = CB.Country
    WHERE PS.countryName is not Null
    AND PS.metric = "Population, total"
    AND CB.Country = "India"
    GROUP BY Country,Date
    ORDER BY Date, TotalPopulation
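
The three view definitions differ only in the view name, the statistics table, the metric string, the value alias, and an optional country filter. As a sketch of how that shared shape could be captured, the hypothetical helper below (`build_view_sql` is our own name, not part of the notebook) assembles the statement text in Python. It always orders by Date alone, so the India view's secondary sort on TotalPopulation is not reproduced.

```python
PROJECT = "electric-spark-266716"  # project ID as written in the view definitions above


def build_view_sql(view_name, stats_table, metric, value_alias, country=None):
    """Return CREATE OR REPLACE VIEW text for one statistic joined to Country_Beam_DF.

    Hypothetical helper: arguments are interpolated directly into the SQL,
    so it is only suitable for trusted, hard-coded values like those below.
    """
    lines = [
        f"CREATE OR REPLACE VIEW kaggle2_modeled.{view_name} AS",
        "    SELECT CB.Country as Country,",
        "    S.dt as Date,",
        f"    AVG(S.statistic) as {value_alias},",
        "    AVG(CB.AverageTemperature) as AvgTemp",
        f"    FROM `{PROJECT}.kaggle2_modeled.{stats_table}` as S",
        f"    JOIN `{PROJECT}.kaggle_modeled.Country_Beam_DF` as CB",
        "    ON S.countryName = CB.Country",
        "    WHERE S.countryName is not Null",
        f'    AND S.metric = "{metric}"',
    ]
    if country is not None:
        # Optional filter, used only by the India view.
        lines.append(f'    AND CB.Country = "{country}"')
    lines += ["    GROUP BY Country,Date", "    ORDER BY Date"]
    return "\n".join(lines)


sql = build_view_sql(
    view_name="v_Crude_Death_Rate",
    stats_table="Life_Statistics_Beam_DF",
    metric="Death rate, crude (per 1,000 people)",
    value_alias="CrudeDeathRate",
)
print(sql)
```

The printed text can be pasted into a `%%bigquery` cell. Passing `country="India"` adds the extra filter used by the population view.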