# Where are all the athletes from

Join the athletes and oregions tables to return the countries with participating athletes.

```
-- Athlete count by country and region
SELECT reg.region
  , reg.country
  , COUNT(DISTINCT ath.athlete_id) AS no_athletes -- Athletes can compete in multiple events
FROM athletes ath
INNER JOIN oregions reg
  ON reg.olympic_cc = ath.country_code
GROUP BY reg.region, reg.country
ORDER BY no_athletes;
```

# Using different joins to explore athletes' regions

Many Russians watch figure skating. Kenyans have a strong presence in running events. Canadians dominate hockey. Do these trends hold true across regions? Do all European countries have figure skating Olympians? Do all North American countries have Olympic-level hockey teams?

See which European countries sent figure skating competitors to the 2014 Winter Olympics. The athletes table is already filtered to figure skating athletes, and the regions table is filtered to European countries. Use the different join types and compare the results.

```
SELECT reg.region, reg.country
  , COUNT(DISTINCT ath.athlete_id) AS no_athletes
FROM regions reg
LEFT  JOIN athletes ath
  ON reg.olympic_cc = ath.country_code
GROUP BY reg.region, reg.country
ORDER BY no_athletes DESC;
```

```
SELECT reg.region, reg.country
  , COUNT(DISTINCT ath.athlete_id) AS no_athletes
FROM athletes ath
RIGHT JOIN regions reg
  ON ath.country_code = reg.olympic_cc
GROUP BY reg.region, reg.country
ORDER BY no_athletes DESC;
```

```
SELECT reg.region, reg.country
  , COUNT(DISTINCT ath.athlete_id) AS no_athletes
FROM athletes ath
INNER JOIN regions reg
  ON ath.country_code = reg.olympic_cc
GROUP BY reg.region, reg.country
ORDER BY no_athletes DESC;
```
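To see how the join type changes the output, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for the course database. The three countries and three athletes are made-up toy rows, not course data; only the table and column names follow the queries above.

```python
import sqlite3

# Toy stand-in data (hypothetical rows, not the course tables)
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE regions (olympic_cc TEXT, country TEXT, region TEXT);
    CREATE TABLE athletes (athlete_id INTEGER, country_code TEXT);
    INSERT INTO regions VALUES
        ('AAA', 'Country A', 'Europe'),
        ('BBB', 'Country B', 'Europe'),
        ('CCC', 'Country C', 'Europe');
    INSERT INTO athletes VALUES (1, 'AAA'), (2, 'AAA'), (3, 'BBB');
""")

# Same query shape as above; only the join type changes
QUERY = """
    SELECT reg.country
      , COUNT(DISTINCT ath.athlete_id) AS no_athletes
    FROM regions reg
    {join_type} JOIN athletes ath
      ON reg.olympic_cc = ath.country_code
    GROUP BY reg.country
    ORDER BY no_athletes DESC, reg.country;
"""

left_rows = conn.execute(QUERY.format(join_type="LEFT")).fetchall()
inner_rows = conn.execute(QUERY.format(join_type="INNER")).fetchall()

print(left_rows)   # every region row is kept, including countries with 0 athletes
print(inner_rows)  # only countries that match at least one athlete
```

In this toy data, Country C has no athletes, so it appears with a count of 0 under the LEFT JOIN and disappears under the INNER JOIN. A RIGHT JOIN with the tables swapped, as in the second query above, should behave like the LEFT JOIN.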

# What about the weather

Regionally, Africa has a reputation for dominating in the field of running. However, Africa has the fewest athletes per (competing) country. Why?

Running events are only found in the Summer Olympics, so maybe Africa does not send many athletes to the Winter Games. This would explain the low number of athletes when looking across all Olympic Games.

Explore that hypothesis by looking at athlete counts by season (Summer versus Winter).

```
SELECT reg.region
  , ath.season
  , COUNT(DISTINCT ath.athlete_id) AS no_athletes
  , COUNT(DISTINCT reg.olympic_cc) AS no_countries
  , COUNT(DISTINCT ath.athlete_id)/COUNT(DISTINCT reg.olympic_cc) AS athletes_per_country
FROM athletes ath
INNER JOIN oregions reg
  ON ath.country_code = reg.olympic_cc
GROUP BY reg.region, ath.season -- Group by region and season
ORDER BY reg.region , athletes_per_country;
```
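One thing to watch in the athletes_per_country column: both COUNT results are integers, and in PostgreSQL dividing one integer by another truncates the result. The sketch below reproduces the effect with Python's built-in sqlite3 module, which divides integers the same way. The rows are made-up toy data, and the cast is a suggested variation rather than part of the original query.

```python
import sqlite3

# Toy stand-in data (hypothetical rows): 3 athletes from 2 countries
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE athletes (athlete_id INTEGER, country_code TEXT, season TEXT);
    CREATE TABLE oregions (olympic_cc TEXT, region TEXT);
    INSERT INTO oregions VALUES ('AAA', 'Africa'), ('BBB', 'Africa');
    INSERT INTO athletes VALUES
        (1, 'AAA', 'Summer'), (2, 'AAA', 'Summer'), (3, 'BBB', 'Summer');
""")

row = conn.execute("""
    SELECT COUNT(DISTINCT ath.athlete_id)
             / COUNT(DISTINCT reg.olympic_cc) AS truncated  -- integer division
      , CAST(COUNT(DISTINCT ath.athlete_id) AS REAL)
             / COUNT(DISTINCT reg.olympic_cc) AS exact      -- cast first
    FROM athletes ath
    INNER JOIN oregions reg
      ON ath.country_code = reg.olympic_cc
    GROUP BY reg.region, ath.season;
""").fetchone()

print(row)
```

With 3 athletes across 2 countries, the plain division yields 1 while the cast version yields 1.5. In PostgreSQL, casting one operand to `numeric` before dividing should give the same kind of fractional result.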

# Filtering to freezing with a subquery

From the first modern Olympics in 1896 through the 2016 Games, African countries have sent 7,845 athletes. However, only 55 of those athletes competed in the Winter Olympics.

One-quarter of Africa is covered by the Sahara, and the non-desert areas have year-round heat. Perhaps this lack of cold weather and snow limits the training opportunities for Winter Olympians.

Explore climate data to see if all African countries lack winter sports conditions. The World Bank collects average temperatures and precipitation for each country. Monthly and annual 40-year averages are preloaded with temperature in degrees Celsius (0 is freezing) and precipitation in millimeters.

Examine the climate data, looking for countries below freezing all year. Are there any in Africa?

```
-- Countries cold enough for snow year-round
SELECT country_code
  , country
  , COUNT(DISTINCT athlete_id) AS winter_athletes -- Athletes can compete in multiple events
FROM athletes
WHERE country_code IN (SELECT olympic_cc FROM oclimate WHERE temp_annual < 0)
AND season = 'Winter'
GROUP BY country_code, country;
```

# Where winter is white

Canada, Russia, and Mongolia are the only countries with Olympians and average annual temperatures below freezing. More commonly, countries have cold weather occurring only during winter months. Countries with only a few months of freezing temperatures and snow still provide athletes the opportunity to train for events like skiing and bobsledding.

With this in mind, you will look at climate data for countries with Olympic athletes using the 40-year average monthly temperatures. You are really intrigued by the low Winter Olympics participation in Africa and decide to look at the temperature for all the Olympic regions in the southern hemisphere.

Write the query to optimize for readability by using a common table expression (CTE).

```
WITH south_cte AS -- CTE
(
  SELECT region
    , ROUND(AVG(temp_06),2) AS avg_winter_temp
    , ROUND(AVG(precip_06),2) AS avg_winter_precip
  FROM oclimate
  WHERE region IN ('Africa','South America','Australia and Oceania')
  GROUP BY region
)

SELECT south.region, south.avg_winter_temp, south.avg_winter_precip
  , COUNT(DISTINCT ath.athlete_id) AS no_athletes
FROM south_cte AS south
INNER JOIN athletes_recent ath
  ON south.region = ath.region
  AND ath.season = 'Winter'
GROUP BY south.region, south.avg_winter_temp, south.avg_winter_precip
ORDER BY south.avg_winter_temp;
```

# Countries with subqueries or CTEs

Of the Olympic regions, Africa has the warmest winters. However, some African countries do participate in the Winter Olympics. For instance, Morocco has sent more than 20 athletes over the years. Perhaps looking at the data at the regional level is too broad. Maybe some African countries have colder winters than other African countries.

Take another look at the winter climate data, this time at the country level. The athletes have been filtered down to include only African athletes, with the first African Olympians competing in 1960.

Use both a subquery and CTE structure and compare the results.

### Subquery

```
-- Climate by country with Olympian athletes
SELECT country
  , temp_06
  , precip_06
FROM climate
WHERE region = 'Africa'
AND olympic_cc IN (SELECT DISTINCT country_code FROM athletes_wint)
ORDER BY temp_06;
```

### CTE

```
WITH countries_cte AS -- CTE
(
    SELECT olympic_cc
      , country
      , temp_06
      , precip_06
    FROM climate
    WHERE region = 'Africa'
)

SELECT DISTINCT cte.country
  , cte.temp_06
  , cte.precip_06
FROM athletes_wint AS wint
INNER JOIN countries_cte AS cte
  ON wint.country_code = cte.olympic_cc
ORDER BY temp_06;
```
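To compare the two structures side by side, the sketch below runs both queries against a small in-memory SQLite database through Python's built-in sqlite3 module. The countries, temperatures, and athlete rows are invented for illustration; only the table and column names are taken from the queries above.

```python
import sqlite3

# Toy stand-in data (hypothetical rows, not the course tables)
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE climate (olympic_cc TEXT, country TEXT, region TEXT,
                          temp_06 REAL, precip_06 REAL);
    CREATE TABLE athletes_wint (athlete_id INTEGER, country_code TEXT);
    INSERT INTO climate VALUES
        ('AAA', 'Country A', 'Africa', 12.5, 30.0),
        ('BBB', 'Country B', 'Africa', 21.0, 5.0),
        ('CCC', 'Country C', 'Africa', 25.0, 80.0),
        ('DDD', 'Country D', 'Europe', 15.0, 60.0);
    INSERT INTO athletes_wint VALUES
        (1, 'AAA'), (2, 'AAA'), (3, 'BBB'), (4, 'DDD');
""")

# Subquery version: filter climate rows by the set of athlete country codes
subquery_rows = conn.execute("""
    SELECT country, temp_06, precip_06
    FROM climate
    WHERE region = 'Africa'
    AND olympic_cc IN (SELECT DISTINCT country_code FROM athletes_wint)
    ORDER BY temp_06;
""").fetchall()

# CTE version: name the filtered climate rows, then join athletes to them
cte_rows = conn.execute("""
    WITH countries_cte AS
    (
        SELECT olympic_cc, country, temp_06, precip_06
        FROM climate
        WHERE region = 'Africa'
    )
    SELECT DISTINCT cte.country, cte.temp_06, cte.precip_06
    FROM athletes_wint AS wint
    INNER JOIN countries_cte AS cte
      ON wint.country_code = cte.olympic_cc
    ORDER BY temp_06;
""").fetchall()

print(subquery_rows)
print(cte_rows)
```

Both versions return the same two African countries here. Note that the join version needs `DISTINCT` because Country A has two athletes and would otherwise appear twice, while the `IN` filter never duplicates rows from the outer table.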

# Canadians temp table

Canada has an average annual temperature below freezing, so you decide to look at Canadian athletes. You want to know all the winter sports that Canadians participate in and which sport has the most Canadian competitors.

The preloaded base table of Olympic athletes, athletes_recent, is quite large. Even though it only includes athletes from two Olympic games, it has thousands of rows and could be slow to query.

Since you want to look at Canadian athletes only and then perform some exploratory analysis, you will first create a temporary table of Canadian athletes. Use this table to find the sport with the most athletes.

```
-- Create a temp table of Canadians
CREATE TEMP TABLE canadians AS
    (SELECT *
    FROM athletes_recent
    WHERE country_code = 'CAN'
    AND season = 'Winter'); -- The table has both summer and winter athletes

-- Find the most popular sport
SELECT sport
  , COUNT(DISTINCT athlete_id) AS no_athletes
FROM canadians
GROUP BY sport
ORDER BY no_athletes DESC;
```
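The filter-once, query-many pattern can be tried on a small scale with the sketch below, which uses Python's built-in sqlite3 module in place of PostgreSQL. The rows are invented toy data, and SQLite's `CREATE TEMP TABLE ... AS SELECT` is written without the surrounding parentheses used above.

```python
import sqlite3

# Toy stand-in data (hypothetical rows, not the course table)
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE athletes_recent (athlete_id INTEGER, country_code TEXT,
                                  season TEXT, sport TEXT);
    INSERT INTO athletes_recent VALUES
        (1, 'CAN', 'Winter', 'Ice Hockey'),
        (2, 'CAN', 'Winter', 'Ice Hockey'),
        (3, 'CAN', 'Winter', 'Curling'),
        (3, 'CAN', 'Winter', 'Curling'),   -- same athlete, second event
        (4, 'CAN', 'Summer', 'Rowing'),
        (5, 'USA', 'Winter', 'Ice Hockey');

    -- Filter once into a temp table, then query the smaller table
    CREATE TEMP TABLE canadians AS
        SELECT *
        FROM athletes_recent
        WHERE country_code = 'CAN'
        AND season = 'Winter';
""")

temp_size = conn.execute("SELECT COUNT(*) FROM canadians;").fetchone()[0]

by_sport = conn.execute("""
    SELECT sport
      , COUNT(DISTINCT athlete_id) AS no_athletes
    FROM canadians
    GROUP BY sport
    ORDER BY no_athletes DESC;
""").fetchall()

print(temp_size)
print(by_sport)
```

The temp table holds only the four Canadian winter rows, and `COUNT(DISTINCT athlete_id)` counts the curler once even though that athlete appears in two rows.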

# Analyze that temp table

You have access to the athletes table, but it has thousands of entries, which makes it slow to query. One solution is to create a temporary table of all country codes. You can then run ANALYZE on the temporary table to collect statistics about it, which allows the query planner to better optimize later queries against that table.

```
-- Create temp countries table
CREATE TEMP TABLE countries AS
    (SELECT DISTINCT o.region, a.country_code, o.country
    FROM athletes a
    INNER JOIN oregions o
      ON a.country_code = o.olympic_cc);

ANALYZE countries; -- Collect the statistics

-- Count the entries
SELECT COUNT(*) FROM countries;
```