## 1. Babynames analysis - A first look
<p><a href="https://www.ssa.gov/">The United States Social Security Administration</a> has made available data on the frequency of baby names from 1880 through the present. Despite first impressions, this is a really interesting dataset that offers great opportunities for exploration and visualization.</p>
    <img src="https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcRxncVEtnFae2H1JzJ1KScJPtsHaEPjRu67VA&usqp=CAU" height="200" width="250">
<p>The dataset is updated once per year and published <a href="http://www.ssa.gov/oact/babynames/limits.html">here</a>. To get started, we import the data. The files in the <em>datasets/babynames</em> folder contain comma-separated values, so the import is pretty straightforward. Let's take a first look at our dataset!</p>

In [5]:
import pandas as pd
from IPython import display

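# Import data, print first rows.
# The file has no header row, so we pass the column names explicitly;
# otherwise the first record (Mary,F,7065) would be used as the header.
names1880 = pd.read_csv("./datasets/babynames/yob1880.txt", names=["name", "sex", "births"])
names1880.head()

Unnamed: 0,name,sex,births
0,Mary,F,7065
1,Anna,F,2604
2,Emma,F,2003
3,Elizabeth,F,1939
4,Minnie,F,1746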
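
Since the yearly files carry no header row, it is worth seeing how `read_csv` treats the first record with and without explicit column names. The sketch below is only an illustration: it uses an in-memory sample (three rows copied from the 1880 data shown above) instead of the real file, and the variable names `sample`, `default` and `named` are our own.

```python
import io

import pandas as pd

# Three records copied from the 1880 file; there is no header line.
sample = "Mary,F,7065\nAnna,F,2604\nEmma,F,2003\n"

# Without names=, pandas uses the first record as the column labels.
default = pd.read_csv(io.StringIO(sample))

# With names=, every record is kept as data.
named = pd.read_csv(io.StringIO(sample), names=["name", "sex", "births"])

print(default.columns.tolist())
print(len(default), len(named))
```

Passing `names=` keeps the first record as data, which matters here because the first row of each file is the most frequent name of that year.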


## 2. Managing Data per Year
<p>OK, nothing surprising here, just our usual, regular dataframe. Still, if we open the <em>/babynames/</em> folder we will see that it contains a lot of txt files, each containing the list of baby names for the corresponding year. The next step is to import all the data and unify them into a single dataframe that will contain the <code>year</code> as an attribute.</p>

In [12]:
# Prepare index and columns for unified dataframe
years = range(1880, 2011)
columns = ['name', 'sex', 'births']
pieces = []

# For each .txt file, create a frame, assign its 'year', and collect it in "pieces"
for year in years:
    path = './datasets/babynames/yob{:d}.txt'.format(year)
    frame = pd.read_csv(path, names=columns)
    frame['year'] = year
    pieces.append(frame)

# Concat all "pieces" into a single dataframe
names = pd.concat(pieces, ignore_index=True)
names

Unnamed: 0,name,sex,births,year
0,Mary,F,7065,1880
1,Anna,F,2604,1880
2,Emma,F,2003,1880
3,Elizabeth,F,1939,1880
4,Minnie,F,1746,1880
...,...,...,...,...
1690779,Zymaire,M,5,2010
1690780,Zyonne,M,5,2010
1690781,Zyquarius,M,5,2010
1690782,Zyran,M,5,2010
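
The loop above hardcodes the year range. As a possible alternative, the years can be read from the file names themselves, so the code keeps working when newer files are added. This is a minimal sketch, not the notebook's method: the helper `load_babynames` is hypothetical, it assumes every file follows the `yobYYYY.txt` naming seen above, and the demo runs on a temporary folder holding two tiny files built from rows shown in the output.

```python
import re
import tempfile
from pathlib import Path

import pandas as pd


def load_babynames(folder):
    """Read every yobYYYY.txt file in `folder` into one DataFrame (hypothetical helper)."""
    pieces = []
    for path in sorted(Path(folder).glob("yob*.txt")):
        match = re.fullmatch(r"yob(\d{4})\.txt", path.name)
        if match is None:
            continue  # skip files that do not follow the naming pattern
        frame = pd.read_csv(path, names=["name", "sex", "births"])
        frame["year"] = int(match.group(1))  # year taken from the file name
        pieces.append(frame)
    return pd.concat(pieces, ignore_index=True)


# Demo on a throwaway folder instead of ./datasets/babynames
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "yob1880.txt").write_text("Mary,F,7065\nAnna,F,2604\n")
    Path(tmp, "yob2010.txt").write_text("Zymaire,M,5\nZyran,M,5\n")
    demo = load_babynames(tmp)

print(demo)
```

On the real data the call would be `load_babynames("./datasets/babynames")`, assuming the folder layout used in the cells above.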


## 3. Aggregating the data