How have American baby name tastes changed since 1920? Which names have remained popular for over 100 years, and how do those names compare to more recent top baby names? These questions matter to many new parents, but the skills you'll practice while answering them apply well beyond baby names. After all, understanding trends and popularity is important for many businesses, too!

You'll be working with data provided by the United States Social Security Administration, which lists first names along with the number and sex of the babies given each name in each year. To keep queries fast, the dataset is limited to first names given to more than 5,000 American babies in a given year. The data spans 101 years, from 1920 through 2020.

## The Data

### `baby_names`

| column       | type    | description                                                 |
| ------------ | ------- | ----------------------------------------------------------- |
| `year`       | int     | year of birth                                               |
| `first_name` | varchar | first name                                                  |
| `sex`        | varchar | `sex` of babies given `first_name`                          |
| `num`        | int     | number of babies of `sex` given `first_name` in that `year` |


In [21]:
-- Run this code to view the data in baby_names
SELECT *
FROM baby_names
LIMIT 5;

Unnamed: 0,year,first_name,sex,num
0,1920,Mary,F,70982
1,1920,Dorothy,F,36643
2,1920,Helen,F,35097
3,1920,Margaret,F,27994
4,1920,Ruth,F,26101


In [22]:
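-- Use this table for the answer to question 1:
-- List the overall top five names in alphabetical order and find out if each name is "Classic" or "Trendy."
-- Note: ORDER BY first_name with LIMIT 5 returns the first five names alphabetically,
-- not the five names with the highest totals.
SELECT
    first_name,
    SUM(num) AS sum,
    -- COUNT(year) counts rows, so "Classic" means the name has at least 50 rows in baby_names
    CASE WHEN COUNT(year) >= 50 THEN 'Classic'
         ELSE 'Trendy' END AS popularity_type
FROM baby_names
GROUP BY first_name
ORDER BY first_name
LIMIT 5;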

Unnamed: 0,first_name,sum,popularity_type
0,Aaliyah,15870,Trendy
1,Aaron,530592,Classic
2,Abigail,338485,Trendy
3,Adam,497293,Trendy
4,Addison,107433,Trendy
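
The query in the cell above orders by `first_name` before applying `LIMIT 5`, so it returns the first five names alphabetically rather than the five most popular ones. One possible fix, sketched below, is to pick the five largest totals in a subquery and sort only those alphabetically. To keep the example self-contained, the SQL runs through Python's `sqlite3` module against a tiny made-up table. The rows, the `top_five` alias, the `total` column name, and the lowered threshold of 2 rows (the notebook uses 50) are all assumptions for illustration, not real SSA data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE baby_names (year INT, first_name TEXT, sex TEXT, num INT)")

# Made-up toy rows for illustration only, NOT real SSA figures.
toy_rows = [
    (1920, "Mary", "F", 900), (1921, "Mary", "F", 800),
    (1920, "John", "M", 700), (1921, "John", "M", 600),
    (1920, "Ruth", "F", 500),
    (2020, "Liam", "M", 400),
    (2019, "Emma", "F", 250), (2020, "Emma", "F", 300),
    (2020, "Zoe", "F", 100),
    (2020, "Aaliyah", "F", 50),
]
conn.executemany("INSERT INTO baby_names VALUES (?, ?, ?, ?)", toy_rows)

# The notebook uses 50; lowered here so the toy data produces both labels.
CLASSIC_MIN_ROWS = 2

query = """
SELECT first_name, total, popularity_type
FROM (
    -- Inner query: keep only the five names with the highest totals.
    SELECT first_name,
           SUM(num) AS total,
           CASE WHEN COUNT(year) >= ? THEN 'Classic'
                ELSE 'Trendy' END AS popularity_type
    FROM baby_names
    GROUP BY first_name
    ORDER BY total DESC
    LIMIT 5
) AS top_five
-- Outer query: present those five names alphabetically.
ORDER BY first_name;
"""
top_five = conn.execute(query, (CLASSIC_MIN_ROWS,)).fetchall()
for row in top_five:
    print(row)
```

On the toy table, `Zoe` and `Aaliyah` fall outside the top five by total, even though `Aaliyah` would come first alphabetically.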


In [23]:
-- Use this table for the answer to question 2:
-- What were the top 20 male names overall, and how did the name Paul rank?
-- Note: this ranks names alphabetically, and `sum` is the number of rows per name (COUNT),
-- not the number of babies given that name.
SELECT RANK() OVER (ORDER BY first_name) AS name_rank,
       first_name,
       COUNT(num) AS sum
FROM baby_names
WHERE sex = 'M'
GROUP BY first_name
ORDER BY name_rank
LIMIT 20;


Unnamed: 0,name_rank,first_name,sum
0,1,Aaron,51
1,2,Adam,46
2,3,Adrian,23
3,4,Aidan,8
4,5,Aiden,18
5,6,Alan,21
6,7,Albert,38
7,8,Alex,24
8,9,Alexander,37
9,10,Alfred,3
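
In the output above, `name_rank` follows alphabetical order and `sum` holds a row count, so it does not show which male names were most popular or where Paul stands. A minimal sketch of a popularity ranking, assuming "top" means the highest total number of babies, ranks by `SUM(num)` in descending order. As before, the SQL is run through Python's `sqlite3` on made-up rows so the block is self-contained. The data and the `total` column name are illustrative assumptions, and window functions require SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE baby_names (year INT, first_name TEXT, sex TEXT, num INT)")

# Made-up toy rows for illustration only, NOT real SSA figures.
toy_rows = [
    (1920, "John", "M", 700), (1921, "John", "M", 600),
    (1920, "Paul", "M", 300), (1921, "Paul", "M", 200),
    (2020, "Liam", "M", 500),
    (2020, "Noah", "M", 400),
    (1920, "Mary", "F", 900),  # excluded by the sex = 'M' filter
]
conn.executemany("INSERT INTO baby_names VALUES (?, ?, ?, ?)", toy_rows)

query = """
SELECT RANK() OVER (ORDER BY SUM(num) DESC) AS name_rank,
       first_name,
       SUM(num) AS total
FROM baby_names
WHERE sex = 'M'
GROUP BY first_name
ORDER BY name_rank, first_name
LIMIT 20;
"""
ranked = conn.execute(query).fetchall()

# Look up Paul's position in the ranked result.
paul_rank = next(rank for rank, name, _ in ranked if name == "Paul")
print(ranked)
print("Paul ranks:", paul_rank)
```

Note that `RANK()` gives tied totals the same rank and then skips ahead, so in the toy data Liam and Paul share rank 2 and Noah gets rank 4. `DENSE_RANK()` would avoid the gap if that matters for the answer.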


In [24]:
-- Use this table for the answer to question 3:
-- Which female names appeared in both 1920 and 2020?
WITH first_tab AS (
    SELECT a.first_name, a.num
    FROM baby_names a
    WHERE a.year = 1920 AND a.sex = 'F'
),
second_tab AS (
    SELECT a.first_name, a.num
    FROM baby_names a
    WHERE a.year = 2020 AND a.sex = 'F'
)
SELECT f.first_name,
       f.num + s.num AS total_occurrences
FROM first_tab f
JOIN second_tab s
    ON f.first_name = s.first_name;

Unnamed: 0,first_name,total_occurrences
0,Emma,20818
1,Evelyn,23283
2,Elizabeth,23125
3,Eleanor,14832
4,Grace,12741
5,Hazel,12765


In [25]:
-- Use this table for the answer to question 3:
-- Which female names appeared in both 1920 and 2020?

WITH names_1920 AS (
    SELECT first_name, SUM(num) AS total_1920
    FROM baby_names
    WHERE year = 1920 AND sex = 'F'
    GROUP BY first_name
),
names_2020 AS (
    SELECT first_name, SUM(num) AS total_2020
    FROM baby_names
    WHERE year = 2020 AND sex = 'F'
    GROUP BY first_name
)
SELECT n1.first_name, 
       n1.total_1920 + n2.total_2020 AS total_occurrences
FROM names_1920 n1
JOIN names_2020 n2
    ON n1.first_name = n2.first_name
ORDER BY n1.first_name;

Unnamed: 0,first_name,total_occurrences
0,Eleanor,14832
1,Elizabeth,23125
2,Emma,20818
3,Evelyn,23283
4,Grace,12741
5,Hazel,12765
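
If only the list of names is needed, and not the combined totals, the same question can also be expressed with `INTERSECT` instead of joining two CTEs. The sketch below is an alternative formulation, not the notebook's method. It runs through Python's `sqlite3` on made-up rows, which are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE baby_names (year INT, first_name TEXT, sex TEXT, num INT)")

# Made-up toy rows for illustration only, NOT real SSA figures.
toy_rows = [
    (1920, "Emma", "F", 100), (2020, "Emma", "F", 200),
    (1920, "Grace", "F", 80), (2020, "Grace", "F", 90),
    (1920, "Mary", "F", 900),      # 1920 only
    (2020, "Olivia", "F", 300),    # 2020 only
    (1920, "James", "M", 500), (2020, "James", "M", 400),  # male, filtered out
]
conn.executemany("INSERT INTO baby_names VALUES (?, ?, ?, ?)", toy_rows)

query = """
SELECT first_name FROM baby_names WHERE year = 1920 AND sex = 'F'
INTERSECT
SELECT first_name FROM baby_names WHERE year = 2020 AND sex = 'F'
ORDER BY first_name;
"""
# INTERSECT keeps only the names present in both result sets.
names_in_both = [row[0] for row in conn.execute(query)]
print(names_in_both)
```

The join-based version remains the better fit when per-year counts have to be carried into the result, since `INTERSECT` compares whole rows and would need the counts to match exactly.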
