# Two hour Python test-drive

Questions:

1. How do I run a python environment? (And what is python? What is an environment?)
1. What is python good for?
1. Where can I get more information?

Objectives:

1. Define python's data types and structures.
1. Store file content in a variable.
1. Save data to file, read data from file and URL.

#### Python is an interpreted language:

Code can be run interactively using an interpreter; it does not have to be compiled first.

Things to know - 

Python versions 2 and 3 both exist, but Python 2 reached end of life in January 2020. We recommend using the latest release of version 3.

Python and many IDEs are free and open source. We recommend the Anaconda distribution.

* complexity of interpreters can vary from command line interface to full featured IDEs
* any interpreter or IDE should be able to run any python script - interoperable and cross-platform
* large and active community - where to go for more info?
    * the main python website: https://www.python.org/, includes documentation, tutorials, etc. - walk through some
    * Stack Overflow - how to get help on a specific problem (python list index out of range)
    * documentation for specific libraries (pandas, matplotlib - most have tutorials, etc.)


## The interpreter

In [1]:
# recall - python is an interpreted language
# we can execute commands - in this case mathematical operations - within the interpreter

# addition
3 + 3

6

In [2]:
# multiplication
3 * 3

9

In [3]:
# subtraction
9-5

4

In [4]:
# division - there are different types of division!
# the standard division
9/4

2.25

In [5]:
# don't linger too long on modulo and integer division
# modulo (returns the remainder)
9%4

1

In [6]:
# Integer division
# returns the whole number, no remainder
9//4

2
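
The three kinds of division are related: the whole-number result of integer division, multiplied by the divisor, plus the remainder from modulo gives back the original number. A minimal sketch of that relationship, using the built-in `divmod()` function (not used elsewhere in this lesson), which returns both pieces at once:

```python
# divmod() returns the integer quotient and the remainder together
print(divmod(9, 4))

# the two pieces recombine into the original number
print(4 * (9 // 4) + 9 % 4 == 9)
```

This prints `(2, 1)` and then `True`.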

Python recognizes different data types. We have used two common numeric data types - integer and floating point number.

In [7]:
type(9)

int

In [8]:
type(4)

int

In [9]:
type(2.25)

float

In [10]:
type(9/4)

float

Another common data type is the string - a sequence of characters.

In [11]:
'my cat is hiding'

'my cat is hiding'

In [12]:
type('my cat is hiding')

str

In [13]:
# Quotes make a string
type(9)

int

In [14]:
type('9')

str

In [15]:
type(cat)

NameError: name 'cat' is not defined

**Question:**

What is the output of the following:

```
type(True)
```

What kind of data type is 'bool'? Where can you find out more info?

In [16]:
type(True)

bool

## Variables

Variables are used to store values.

In [17]:
a = 5
b = 10
a + b

15

In [18]:
# variables can be reassigned manually
b = 24
a + b

29

In [19]:
# current value of a variable can be used to reassign that variable
t = 84
print('initial value of t:', t)
t = t + 5
print('final value of t:', t)

initial value of t: 84
final value of t: 89


In [20]:
# values of variables can also be updated programmatically
a = 5
b = 10
while a < b:
    print('value of a is:', a)
    a = a + 1

print('the final value of a is:', a)

value of a is: 5
value of a is: 6
value of a is: 7
value of a is: 8
value of a is: 9
the final value of a is: 10


In [21]:
# variables have data types
# 'a' in the next line refers to the variable with that name
type(a)

int

In [22]:
# note the difference
# 'a' in the next line does not refer to the variable but to a string
type('a')

str

In [23]:
animal = 'cat'
type(animal)

str

Given the following variable assignments:

```
x = 12
y = str(14)
z = 'donuts'
```

Predict the output of the following:
```
1. y + z
2. x + y
3. x + int(y)
4. str(x) + y
```
Check your answers in the interpreter.

### Variable Naming Rules

Variable names are case sensitive and:

1. Can only consist of one "word" (no spaces).
2. Must begin with a letter or underscore character ('_').
3. Can only use letters, numbers, and the underscore character.

We further recommend using variable names that are meaningful within the context of the script and the research.
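
The rules above can be checked from the interpreter. The following is a small sketch rather than a complete validator: the candidate names are made up for illustration, `str.isidentifier()` tests the character rules, and the standard library's `keyword` module flags reserved words such as `class`, which are an extra restriction not listed above.

```python
import keyword

# each candidate name is written as a string so we can test it safely
print('names_2010'.isidentifier())   # letters, numbers and an underscore
print('_count'.isidentifier())       # begins with an underscore
print('2010_names'.isidentifier())   # begins with a number
print('baby names'.isidentifier())   # contains a space

# reserved words pass the character rules but still cannot be variable names
print(keyword.iskeyword('class'))
```

The first two checks print `True`, the next two print `False`, and the keyword check prints `True`.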

## Read and save tabular data from a URL to a file

As an application of what we have done so far, here we demonstrate using variables to download data and save it to a file on our local system.

In [24]:
# need to add functionality to base python - import library
import requests

In [25]:
help(requests) # the basic usage is all we need for today but help is available

Help on package requests:

NAME
    requests

DESCRIPTION
    Requests HTTP Library
    ~~~~~~~~~~~~~~~~~~~~~
    
    Requests is an HTTP library, written in Python, for human beings.
    Basic GET usage:
    
       >>> import requests
       >>> r = requests.get('https://www.python.org')
       >>> r.status_code
       200
       >>> b'Python is a programming language' in r.content
       True
    
    ... or POST:
    
       >>> payload = dict(key1='value1', key2='value2')
       >>> r = requests.post('https://httpbin.org/post', data=payload)
       >>> print(r.text)
       {
         ...
         "form": {
           "key1": "value1",
           "key2": "value2"
         },
         ...
       }
    
    The other HTTP methods are supported - see `requests.api`. Full documentation
    is at <https://requests.readthedocs.io>.
    
    :copyright: (c) 2017 by Kenneth Reitz.
    :license: Apache 2.0, see LICENSE for more details.

PACKAGE CONTENTS
    __version__
    _internal_utils

In [26]:
file_url = "https://raw.githubusercontent.com/unmrds/cc-python/master/tutorials/beowulf_babynames/names/2010"

In [27]:
r = requests.get(file_url) # dot syntax - "get" is a function in the requests module, file_url is the argument

In [28]:
print(r)

<Response [200]>


In [29]:
print(type(r))

<class 'requests.models.Response'>


In [30]:
# okay but what can we do with this? Didn't we just download a file?
# NOTE: inspecting object, getting data types and help are part of a workflow!
help(r)

Help on Response in module requests.models object:

class Response(builtins.object)
 |  The :class:`Response <Response>` object, which contains a
 |  server's response to an HTTP request.
 |  
 |  Methods defined here:
 |  
 |  __bool__(self)
 |      Returns True if :attr:`status_code` is less than 400.
 |      
 |      This attribute checks if the status code of the response is between
 |      400 and 600 to see if there was a client error or a server error. If
 |      the status code, is between 200 and 400, this will return True. This
 |      is **not** a check to see if the response code is ``200 OK``.
 |  
 |  __enter__(self)
 |  
 |  __exit__(self, *args)
 |  
 |  __getstate__(self)
 |  
 |  __init__(self)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  __iter__(self)
 |      Allows you to use a response as an iterator.
 |  
 |  __nonzero__(self)
 |      Returns True if :attr:`status_code` is less than 400.
 |      
 |      This attribute checks if

In [31]:
print(r.status_code)

200


In [32]:
print(r.text)

name,sex,count
Isabella,F,22925
Sophia,F,20648
Emma,F,17354
Olivia,F,17030
Ava,F,15436
Emily,F,14278
Abigail,F,14250
Madison,F,13189
Chloe,F,11757
Mia,F,10644
Addison,F,10328
Elizabeth,F,10272
Ella,F,9879
Natalie,F,8783
Samantha,F,8408
Alexis,F,8270
Lily,F,7989
Grace,F,7685
Hailey,F,7022
Hannah,F,6996
Alyssa,F,6992
Lillian,F,6953
Avery,F,6676
Leah,F,6538
Nevaeh,F,6429
Sarah,F,6338
Anna,F,6334
Sofia,F,6327
Ashley,F,6314
Brianna,F,6279
Zoe,F,6270
Victoria,F,6230
Gabriella,F,6180
Brooklyn,F,6125
Kaylee,F,6101
Taylor,F,5899
Layla,F,5894
Allison,F,5864
Evelyn,F,5840
Riley,F,5541
Amelia,F,5461
Khloe,F,5406
Makayla,F,5396
Savannah,F,5377
Aubrey,F,5362
Charlotte,F,5357
Zoey,F,5213
Bella,F,5129
Kayla,F,5057
Alexa,F,5041
Peyton,F,4969
Audrey,F,4952
Claire,F,4916
Arianna,F,4845
Julia,F,4675
Aaliyah,F,4668
Kylie,F,4602
Lauren,F,4466
Sophie,F,4415
Sydney,F,4334
Camila,F,4304
Jasmine,F,4178
Morgan,F,4074
Alexandra,F,4020
Jocelyn,F,3992
Maya,F,3968
Gianna,F,3959
Mackenzie,F,3882
Kimberly,F,3870
Kathe

In [33]:
# save to file

with open('2010', 'w') as o:
    o.write(r.text)
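
To confirm that a save worked, we can read the file straight back into a variable. The sketch below makes a few assumptions so it runs without a network connection: `sample_text` stands in for `r.text`, the file is written to a temporary folder rather than over the real `2010` file, and the `utf-8` encoding is our choice.

```python
import os
import tempfile

# stand-in for r.text - the first rows of the 2010 file
sample_text = 'name,sex,count\nIsabella,F,22925\nSophia,F,20648\n'

# a temporary folder is removed automatically when the block ends
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, '2010_sample')

    # 'w' opens the file for writing
    with open(path, 'w', encoding='utf-8') as out_file:
        out_file.write(sample_text)

    # 'r' opens the file for reading; read() returns the whole content as one string
    with open(path, 'r', encoding='utf-8') as in_file:
        content = in_file.read()

# what we read back matches what we wrote
print(content == sample_text)
print(content.splitlines()[0])
```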

In [34]:
# Before proceeding - check whether anyone is not using Anaconda
# demo installing pandas as needed
import pandas as pd

In [61]:
names_2010 = pd.read_csv('2010', encoding='latin1')

In [62]:
names_2010 # inspect the data - pandas shows only the head and tail, the rows in between are hidden

Unnamed: 0,name,sex,count
0,Isabella,F,22925
1,Sophia,F,20648
2,Emma,F,17354
3,Olivia,F,17030
4,Ava,F,15436
...,...,...,...
34084,Zymaire,M,5
34085,Zyonne,M,5
34086,Zyquarius,M,5
34087,Zyran,M,5


In [63]:
# other ways to inspect the data - note again this is an important part of a workflow
# not just something we're demonstrating here
names_2010.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 34089 entries, 0 to 34088
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   name    34089 non-null  object
 1   sex     34089 non-null  object
 2   count   34089 non-null  int64 
dtypes: int64(1), object(2)
memory usage: 799.1+ KB


In [64]:
names_2010.head()

Unnamed: 0,name,sex,count
0,Isabella,F,22925
1,Sophia,F,20648
2,Emma,F,17354
3,Olivia,F,17030
4,Ava,F,15436


In [65]:
# attributes - no parentheses
names_2010.shape

(34089, 3)

In [66]:
# descriptive stats
# default is to only show stats for numeric data types
names_2010.describe()

Unnamed: 0,count
count,34089.0
mean,108.352812
std,697.685909
min,5.0
25%,7.0
50%,11.0
75%,29.0
max,22925.0


In [67]:
# in our case it can be useful to get all stats
names_2010.describe(include='all')

Unnamed: 0,name,sex,count
count,34089,34089,34089.0
unique,31643,2,
top,Isabella,F,
freq,2,19823,
mean,,,108.352812
std,,,697.685909
min,,,5.0
25%,,,7.0
50%,,,11.0
75%,,,29.0


In [56]:
# we know the 2010 file has 34089 rows - one per name/sex combination registered with the US SSA
# the data provide counts by name
# what about counts by sex?
# a one liner!
names_2010.groupby('sex').count()

Unnamed: 0_level_0,name,count,sex_categorical
sex,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
F,19823,19823,19823
M,14266,14266,14266


In [59]:
# that is a total count of names - there were 19823 different girl names registered, 14266 boy names
# what about the total number of boys and girls?
# also a one liner!
# note in this case we have to specify the numeric column we are summing - pandas will complain otherwise
names_2010.groupby('sex')['count'].sum()

sex
F    1776223
M    1917416
Name: count, dtype: int64

In [45]:
# we know the most popular girl name by the way the data are sorted
# what about the boy name?
# the below tells us how many boys had the most popular boy name, but not the name
names_2010.groupby('sex')['count'].max()

sex
F    22925
M    22139
Name: count, dtype: int64

In [46]:
names_2010.groupby('sex').first() # note this only works with the data sorted by count

Unnamed: 0_level_0,name,count
sex,Unnamed: 1_level_1,Unnamed: 2_level_1
F,Isabella,22925
M,Jacob,22139


In [47]:
# this *should* work with unsorted data - but this has not been tested yet
# mainly this demonstrates how we can build a short analytic pipeline into a single line of code
# using dot syntax to pipe output of one function to another
names_2010.sort_values(['count'], ascending=False).groupby('sex').first()

Unnamed: 0_level_0,name,count
sex,Unnamed: 1_level_1,Unnamed: 2_level_1
F,Isabella,22925
M,Jacob,22139


Pandas is quite powerful and we could spend days on it! For now let's dig a little deeper into data structures, beginning with lists.

In [45]:
# about lists
# collections of objects, separated by commas
# ordered and mutable

number_list = [1, 2, 3, 4, 4, 3, 9, 12]
string_list = ['cat', 'bat', 'hat', 'mat', 'pat']

In [46]:
print(number_list)

[1, 2, 3, 4, 4, 3, 9, 12]


In [47]:
print(string_list)

['cat', 'bat', 'hat', 'mat', 'pat']


In [48]:
mixed_type_list = [1, 'dog', 99, 'pencil', 3.14]

In [49]:
for i in mixed_type_list:
    print(i, type(i))

1 <class 'int'>
dog <class 'str'>
99 <class 'int'>
pencil <class 'str'>
3.14 <class 'float'>


In [50]:
# objects in a list can be other lists
# rather than create a new list, use the append() method
mixed_type_list.append(['a', 1, 'b', 2])
mixed_type_list.append({'c1': 'v1', 'c2': 'v2'})

In [51]:
# rerun our loop
for i in mixed_type_list:
    print(i, type(i))

1 <class 'int'>
dog <class 'str'>
99 <class 'int'>
pencil <class 'str'>
3.14 <class 'float'>
['a', 1, 'b', 2] <class 'list'>
{'c1': 'v1', 'c2': 'v2'} <class 'dict'>


In [52]:
# just because we can do stuff like this doesn't mean we should - it generally makes more
# sense to keep lists to a single data type like the number_list and string_list above

# list indexing - every object in a list has an index position
# the first object
number_list[0]

1

In [53]:
string_list[0]

'cat'

In [54]:
# the second
number_list[1]

2

In [55]:
string_list[1]

'bat'

In [56]:
# the last object
number_list[-1]

12

In [57]:
string_list[-1]

'pat'

In [58]:
# second from last, etc.
number_list[-2]

9

In [59]:
string_list[-2]

'mat'

In [60]:
# slicing - subsetting lists
# index pos to right of colon is up to but not including - in this case, index pos 0, 1, 2 but not 3
number_list[0:3]

[1, 2, 3]

In [61]:
# when starting from the beginning of a list, we can leave the start position out - same as above
number_list[:3]

[1, 2, 3]

In [62]:
number_list[2:5]

[3, 4, 4]

In [63]:
number_list[2:]

[3, 4, 4, 3, 9, 12]

In [64]:
number_list

[1, 2, 3, 4, 4, 3, 9, 12]

In [65]:
number_list[:]

[1, 2, 3, 4, 4, 3, 9, 12]

In [66]:
number_list[0:-1] # exercise - what will this output, and why?

[1, 2, 3, 4, 4, 3, 9]

In [67]:
# we don't always know how many objects are in a list
# to find out, use the len() function
len(number_list)

8

In [68]:
len(string_list)

5

## Start Part 2 Here

In [69]:
# nested lists are very handy - a good way to represent tabular data
# indexing and slicing nested lists
nested_list = [['a', 'b', 'c'], [1, 2, 3], [4, 5, 6]]

In [70]:
nested_list[0]

['a', 'b', 'c']

In [71]:
nested_list[0][0]

'a'
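
As a sketch of how a nested list can stand in for a table, the example below treats the first inner list as a header row and the remaining inner lists as data rows. The `table` variable is ours; its values are the first three rows of the 2010 names file.

```python
# the first inner list is the header, the rest are rows of data
table = [['name', 'sex', 'count'],
         ['Isabella', 'F', 22925],
         ['Sophia', 'F', 20648],
         ['Emma', 'F', 17354]]

header = table[0]
rows = table[1:]

# find the index position of the 'count' column
count_position = header.index('count')

# collect that column from every row
counts = []
for row in rows:
    counts.append(row[count_position])

print(counts)
print(sum(counts))
```

Looking up the column position from the header, rather than hard-coding `2`, means the loop keeps working if the columns are rearranged.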

**Exercise**

Given the following nested list:

```
my_data = [['a', 'b', 'c'], [[1, 2, 3], [4, 5, 6]], [['cat', 'cow', 'dog'], ['red', 'green', 'blue']]]
```

Write a statement to produce the following outputs:

```
5
```

and 

```
[['cat', 'cow', 'dog'], ['red', 'green', 'blue']]
```

and

```
['b', 'c']
```

**Hint:** experiment and build your statement iteratively!

In [72]:
my_data = [['a', 'b', 'c'], [[1, 2, 3], [4, 5, 6]], [['cat', 'cow', 'dog'], ['red', 'green', 'blue']]]

In [73]:
# 5
my_data[1][1][1]

5

In [74]:
# last two lists
my_data[2]

[['cat', 'cow', 'dog'], ['red', 'green', 'blue']]

In [75]:
# b and c
my_data[0][1:]

['b', 'c']

## Dictionaries

In [99]:
# Dictionaries
# a final data structure for today

# similar to lists - collections of objects
# mutable; keep insertion order (Python 3.7+) but are indexed by key, not by position
# store objects as key:value pairs

# allows us to look up a value directly by its key, even in large collections

'''
Think of things that have properties - what are those properties? For example, a car:
    
make
model
color
mpg
transmission
'''

'\nThink of things that have properties - what are those properties? For example, a car:\n    \nmake\nmodel\ncolor\nmpg\ntransmission\n'

In [100]:
my_car = {'make': 'honda',
         'model': 'fit',
         'year': '2013',
         'color': 'blue',
         'transmission': 'manual'}

In [101]:
# indexing dictionaries with keys
my_car['model']

'fit'

In [102]:
# what if we don't know the keys?
my_car.keys() # note the output is a dict_keys view, not a list - use list(my_car.keys()) to get a list

dict_keys(['make', 'model', 'year', 'color', 'transmission'])

In [103]:
# we can also get the values
my_car.values() # the output is a dict_values view - again, wrap it in list() if a list is needed

dict_values(['honda', 'fit', '2013', 'blue', 'manual'])
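
The objects returned by `keys()` and `values()` behave a little differently from lists. A minimal sketch, using a cut-down `car` dictionary of our own rather than `my_car`:

```python
car = {'make': 'honda', 'model': 'fit'}   # a small stand-in for my_car

keys_view = car.keys()
print(type(keys_view))

# convert to a list when we need indexing or slicing
keys_list = list(keys_view)
print(keys_list[0])

# the view follows later changes to the dictionary; the list is a snapshot
car['color'] = 'blue'
print(len(keys_view), len(keys_list))
```

After the new key is added, the view reports three keys while the list made earlier still holds two.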

In [104]:
# a value in a dictionary can be any data type - str, int, list, dictionary
# let's say we want to add info about optional features in my car - we can use a list
my_car['options'] = ['radio', 'air conditioning', 'seat covers']

In [105]:
my_car

{'make': 'honda',
 'model': 'fit',
 'year': '2013',
 'color': 'blue',
 'transmission': 'manual',
 'options': ['radio', 'air conditioning', 'seat covers']}

In [106]:
my_car['mpg'] = 35

In [107]:
my_car

{'make': 'honda',
 'model': 'fit',
 'year': '2013',
 'color': 'blue',
 'transmission': 'manual',
 'options': ['radio', 'air conditioning', 'seat covers'],
 'mpg': 35}

In [108]:
# we can build a catalog or directory of people's cars
deans_car = {'make': 'chevrolet', 'model': 'impala', 'color': 'black'}

In [109]:
# Exercise - create a dictionary for your vehicle (or if you don't drive, a famous movie car)

In [110]:
# now we can build a catalog of nested dictionaries
cars = {}
cars['jon'] = my_car
cars['dean'] = deans_car

In [111]:
cars

{'jon': {'make': 'honda',
  'model': 'fit',
  'year': '2013',
  'color': 'blue',
  'transmission': 'manual',
  'options': ['radio', 'air conditioning', 'seat covers'],
  'mpg': 35},
 'dean': {'make': 'chevrolet', 'model': 'impala', 'color': 'black'}}

In [112]:
# add your car to the catalog
# now how do we get nested values?
# model of Dean's car
cars['dean']['model']

'impala'
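
Not every car in the catalog has the same keys: Dean's car has no `year`, so `cars['dean']['year']` would raise a `KeyError`. One way to handle this, sketched below with a shortened copy of the catalog, is to test for the key with `in` or to use the dictionary `get()` method, which returns `None` (or a default we choose) when the key is missing.

```python
# a shortened copy of the catalog built above
cars = {'jon': {'make': 'honda', 'model': 'fit', 'year': '2013'},
        'dean': {'make': 'chevrolet', 'model': 'impala', 'color': 'black'}}

# check whether a key exists before using it
print('year' in cars['dean'])

# get() returns None instead of raising an error
print(cars['dean'].get('year'))

# or a default value of our choosing
print(cars['dean'].get('year', 'unknown'))
```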

## Loops

In [113]:
# for loops
# once we have data of specific types stored within data structures, what do we do?
# slicing and indexing is not very useful if we're just retrieving one object at a time

# so we use loops to do things

In [114]:
# syntax
# the value of the loop variable is updated each time the loop runs
# the collection can be anything

'''
for loop_variable in collection:
    do something
'''

'\nfor loop_variable in collection:\n    do something\n'

In [115]:
for letter in 'snailshell':
    print(letter)

s
n
a
i
l
s
h
e
l
l


In [116]:
s = 'snailshell'
for letter in s:
    print(letter)

s
n
a
i
l
s
h
e
l
l


In [117]:
for word in ['cat', 'hat', 9, 18, s]:
    print(word)

cat
hat
9
18
snailshell


In [118]:
some_list = ['cat', 'hat', 9, 18, s]

In [119]:
for obj in some_list:
    print(obj)

cat
hat
9
18
snailshell


In [120]:
# for loops for dictionaries are a little different
# we need loop variables for the keys and values, and the 'items()' method

'''
for key, value in dictionary.items():
   do something
'''

'\nfor key, value in dictionary.items():\n   do something\n'

In [121]:
# first of all - what is 'items()'?
# the output is a dict_items view containing (key, value) tuples
# not really going into tuples today

print(my_car.items())

dict_items([('make', 'honda'), ('model', 'fit'), ('year', '2013'), ('color', 'blue'), ('transmission', 'manual'), ('options', ['radio', 'air conditioning', 'seat covers']), ('mpg', 35)])


In [122]:
print(my_car) # compare with above

{'make': 'honda', 'model': 'fit', 'year': '2013', 'color': 'blue', 'transmission': 'manual', 'options': ['radio', 'air conditioning', 'seat covers'], 'mpg': 35}


In [123]:
# this is a common error

for key, value in my_car:
    print(key)
    print(value)

ValueError: too many values to unpack (expected 2)
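
The error arises because looping over a dictionary directly yields only its keys. Python then tries to unpack each key, which here is a string, into the two loop variables one character at a time. The sketch below uses a made-up dictionary, `tiny`, whose only key happens to be two characters long, to show why this mistake can even go unnoticed.

```python
# a made-up dictionary whose only key is two characters long
tiny = {'id': 'x1'}

# looping over a dictionary directly yields its keys
for key in tiny:
    print(key)

# a two-character key unpacks without an error - but not into what we wanted!
for first, second in tiny:
    print(first, second)

# items() yields the (key, value) pairs we actually meant
for key, value in tiny.items():
    print(key, ":", value)
```

The second loop prints `i d`, the two characters of the key, rather than the key and its value.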

In [124]:
for key, value in my_car.items():
    print(key, ":", value)

make : honda
model : fit
year : 2013
color : blue
transmission : manual
options : ['radio', 'air conditioning', 'seat covers']
mpg : 35


In [126]:
# finally - conditionals
# evaluate a statement
# do something different depending on whether the statement evaluates to True or False

a = 5
b = -19

# what do we mean by evaluate True or False?
print(a < b)

False


In [127]:
print(a > b)

True


In [128]:
# based on this evaluation, we might do something different

checking = 1000
savings = 20
bills = 250

if checking > bills:
    print('move money into savings!')

move money into savings!


In [129]:
# we can specify an else statement for all other cases
if savings > bills:
    print('move money from savings!')
else:
    print('be sure to avoid an overdraft!')

be sure to avoid an overdraft!


In [130]:
# if we only have one condition, we can use a single if statement
# but we may have multiple conditions to evaluate

checking = 20
savings = 1000
bills = 200

# additional conditions use elif (else if)
# the else statement handles all other cases - no condition needed
if checking > bills:
    print('pay bills!')
elif savings + checking > bills:
    print('move money from savings and pay bills')
else:
    print('tighten that belt!')

move money from savings and pay bills


In [2]:
# another kind of loop is the while loop - using conditional tests to determine 
# if/how we exit a loop

# define value for initial test
# while Test:
#   Do Something

In [5]:
a = 10
while a>0:
    print(a)
    a = a-1
print("Blast Off")

10
9
8
7
6
5
4
3
2
1
Blast Off


In [23]:
# let's pick some things at random using the random.choice() function
# and add them to a list of things - animals in this case
import random

menagerieLength = 3

done = False
menagerie = []
animals = ["cat","dog","parakeet","goldfish","carp","dragon"]

while not done:
    newAnimal = random.choice(animals)
    print("adding",newAnimal, "to our menagerie")
    menagerie.append(newAnimal)
    if len(menagerie) == menagerieLength:
        done = True


adding dragon to our menagerie
adding cat to our menagerie
adding dog to our menagerie
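
The `done` flag makes the exit test explicit, but the same loop can be written with the test in the `while` statement itself. This is one possible variation, not a replacement for the version above; the call to `random.seed()` is our addition and only makes the picks repeatable from run to run.

```python
import random

random.seed(0)  # optional - fixes the sequence of "random" picks

menagerie_length = 3
animals = ["cat", "dog", "parakeet", "goldfish", "carp", "dragon"]
menagerie = []

# the loop condition does the test, so no separate 'done' flag is needed
while len(menagerie) < menagerie_length:
    new_animal = random.choice(animals)
    menagerie.append(new_animal)

print(menagerie)
```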


In [24]:
# Now let's list the items in our list - our menagerie

print("Our menagerie includes ")
for animal in menagerie:
    print("  ",animal)
    

Our menagerie includes 
   dragon
   cat
   dog


In [29]:
# Finally let's count the items in our list - how many of each animal type do we have

# Approach One: use the source list as the reference for potential animals
print(type(animals),"\n")
for animal in animals:
    animalCount = menagerie.count(animal)
    print(animal,":",animalCount)

<class 'list'> 

cat : 1
dog : 1
parakeet : 0
goldfish : 0
carp : 0
dragon : 1


In [27]:
# Approach 2: use the menagerie list as the reference for animals
uniqueAnimals = set(menagerie)
print(type(uniqueAnimals),"\n")
for animal in uniqueAnimals:
    animalCount = menagerie.count(animal)
    print(animal,":",animalCount)

<class 'set'> 

dog : 1
dragon : 1
cat : 1
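
A third approach, sketched here as an option, is the `Counter` class from the standard library's `collections` module, which does the counting in one step. The `menagerie` below is a fixed list of our own so the result is the same on every run.

```python
from collections import Counter

# a fixed menagerie for illustration (the one above was picked at random)
menagerie = ['dragon', 'cat', 'dog', 'cat']

# Counter builds a dictionary-like object of item: count pairs
animal_counts = Counter(menagerie)
for animal, animal_count in animal_counts.items():
    print(animal, ":", animal_count)

# animals that were never picked simply report zero
print(animal_counts['parakeet'])
```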


In [127]:
# python includes more data types and structures than we have introduced here
# but what we have done is enough to develop some powerful workflows
# define variables and data structures, and use logic - loops and conditionals - to do things with them

## Practice on your own

In [132]:
# we have baby names from SSN applications 2010-2021, stored in a file per year
# can we get the most and least popular names for both sexes?
# Note this is exclusive of non-binary genders - for future iterations pick a different dataset

'''
think about the process - we have 12 files of baby names
we need to 

1. get a list of the files so python can open them
2. read each file individually
3. get the most common names per sex
'''

# 1. get a list of files
import glob
flist = glob.glob('./names/*')
print(flist)

['./names\\2010', './names\\2011', './names\\2012', './names\\2013', './names\\2014', './names\\2015', './names\\2016', './names\\2017', './names\\2018', './names\\2019', './names\\2020', './names\\2021']
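
The doubled backslashes in the output above are Windows path separators; on macOS or Linux the same call returns forward slashes, and `glob` does not promise any particular order. As a hedged sketch of a more portable alternative, the standard library's `pathlib` module can list the files, sort them, and return just the file name. The temporary `names` folder and its empty files are created here only so the example runs anywhere.

```python
import tempfile
from pathlib import Path

# build a throwaway 'names' folder with three empty files
with tempfile.TemporaryDirectory() as tmp_dir:
    names_dir = Path(tmp_dir) / 'names'
    names_dir.mkdir()
    for year in ['2012', '2010', '2011']:
        (names_dir / year).touch()

    # sorted() gives a predictable order; .name drops the folder part of the path
    years = []
    for path in sorted(names_dir.glob('*')):
        years.append(path.name)

print(years)
```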


In [133]:
# we need a for loop to do steps 2 and 3
# but we can use slicing to try our code on a subset of the list

for f in flist[:2]:
    print(f)

./names\2010
./names\2011


In [134]:
# reuse some code from above
import csv

for fname in flist[:2]:
    print(fname)
    name_data = []
    # use a separate name for the open file so the loop variable is not overwritten
    with open(fname, 'r') as f:
        reader = csv.DictReader(f)
        for row in reader:
            name_data.append(row)
    print(len(name_data), 'names in the file')

./names\2010
34089 names in the file
./names\2011
33923 names in the file


In [131]:
# we can iteratively develop our code and test it
import csv

for file in flist:
    name_data = []
    with open(file, 'r') as f:
        reader = csv.DictReader(f)
        for row in reader:
            name_data.append(row)
    # comparing values is a little clumsy with the CSV library
    # here's my solution - set a baseline and update by comparing the baseline with each row values
    f_max_c = 0
    f_popular = ''
    m_max_c = 0
    m_popular = ''
    for name in name_data:
        if name['sex'] == 'F':
            if int(name['count']) > int(f_max_c):
                f_max_c = int(name['count'])
                f_popular = name['name']
        elif name['sex'] == 'M':
            if int(name['count']) > int(m_max_c):
                m_max_c = int(name['count'])
                m_popular = name['name']
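    # clean up the filename a little
    # note: this pattern only matches Windows-style paths ('./names\\2010');
    # os.path.basename(file) would strip the folder on any platform
    y = file.replace('./names\\', '')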
    print(y, 'most popular girl name:', f_popular, '(', f_max_c, ')')
    print(y, 'most popular boy name:', m_popular, '(', m_max_c, ')')

./names/2013 most popular girl name: Sophia ( 21236 )
./names/2013 most popular boy name: Noah ( 18269 )
./names/2014 most popular girl name: Emma ( 20949 )
./names/2014 most popular boy name: Noah ( 19324 )
./names/2015 most popular girl name: Emma ( 20468 )
./names/2015 most popular boy name: Noah ( 19654 )
./names/2012 most popular girl name: Sophia ( 22322 )
./names/2012 most popular boy name: Jacob ( 19091 )
./names/2017 most popular girl name: Emma ( 19847 )
./names/2017 most popular boy name: Liam ( 18838 )
./names/2010 most popular girl name: Isabella ( 22925 )
./names/2010 most popular boy name: Jacob ( 22139 )
./names/2019 most popular girl name: Olivia ( 18534 )
./names/2019 most popular boy name: Liam ( 20578 )
./names/2021 most popular girl name: Olivia ( 17728 )
./names/2021 most popular boy name: Liam ( 20272 )
./names/2020 most popular girl name: Olivia ( 17641 )
./names/2020 most popular boy name: Liam ( 19777 )
./names/2018 most popular girl name: Emma ( 18786 )
./nam
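
The baseline-and-compare loop above can also be written with the built-in `max()` function. This is an alternative sketch, not the approach used above. It reads a few rows from an in-memory string instead of a file; the rows are taken from the 2010 data shown earlier, deliberately out of order. The `key` argument tells `max()` to compare rows by the integer value of their `count` field.

```python
import csv
import io

# a handful of 2010 rows, out of order, standing in for a file on disk
sample = io.StringIO('name,sex,count\n'
                     'Sophia,F,20648\n'
                     'Zyran,M,5\n'
                     'Isabella,F,22925\n'
                     'Jacob,M,22139\n')
rows = list(csv.DictReader(sample))

most_popular = {}
for sex in ['F', 'M']:
    # keep only the rows for this sex
    subset = []
    for row in rows:
        if row['sex'] == sex:
            subset.append(row)
    # csv reads every value as a string, so convert count to int before comparing
    top = max(subset, key=lambda row: int(row['count']))
    most_popular[sex] = top['name']
    print(sex, top['name'], top['count'])
```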

In [132]:
# we now have the most popular girl and boy names for each year since 2010
# How many babies were given the same name as yours each year?