# Input and Output

In [17]:
import pandas as pd
import requests
from io import StringIO

In [18]:
url = 'https://data.cityofnewyork.us/api/views/25th-nujf/rows.csv'
response = requests.get(url, verify=False)  # Skip SSL certificate verification (insecure; acceptable only for a quick demo)
baby_names = pd.read_csv(StringIO(response.text))
baby_names.head()



|   | Year of Birth | Gender | Ethnicity | Child's First Name | Count | Rank |
|---|---|---|---|---|---|---|
| 0 | 2011 | FEMALE | HISPANIC | GERALDINE | 13 | 75 |
| 1 | 2011 | FEMALE | HISPANIC | GIA | 21 | 67 |
| 2 | 2011 | FEMALE | HISPANIC | GIANNA | 49 | 42 |
| 3 | 2011 | FEMALE | HISPANIC | GISELLE | 38 | 51 |
| 4 | 2011 | FEMALE | HISPANIC | GRACE | 36 | 53 |


## Export DataFrame to CSV File
- The `to_csv` method exports a **DataFrame** to a CSV file.
- Its first argument is the filename.
- By default, pandas writes the index as the first column. Set `index=False` to exclude it.
- The `columns` parameter limits the exported columns.
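
The effect of `index` and `columns` can be seen without touching the disk: when no filename is given, `to_csv` returns the CSV text as a string. The sketch below uses a small made-up **DataFrame** (the `Name`, `Count`, and `Rank` values are illustrative, not the real dataset). It also shows that `to_csv` returns `None` when it is given a target to write to, so its return value should not be assigned back to the **DataFrame** variable.

```python
import io
import pandas as pd

# Hypothetical toy data standing in for the baby names dataset.
df = pd.DataFrame({
    "Name": ["GIA", "GRACE"],
    "Count": [21, 36],
    "Rank": [67, 53],
})

# With no path, to_csv returns the CSV text instead of writing a file.
with_index = df.to_csv()                                    # index is the first column
without_index = df.to_csv(index=False)                      # index left out
subset = df.to_csv(index=False, columns=["Name", "Count"])  # only two columns

print(with_index)
print(without_index)
print(subset)

# With a path or buffer, to_csv writes to it and returns None.
buffer = io.StringIO()
result = df.to_csv(buffer, index=False)
print(result)
```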

In [19]:
# to_csv returns None when given a filename, so do not reassign baby_names here
baby_names.to_csv('baby_names.csv', index=False)

In [52]:
baby_names = pd.read_csv('baby_names.csv')
baby_names.head()

|   | Year of Birth | Gender | Ethnicity | Child's First Name | Count | Rank |
|---|---|---|---|---|---|---|
| 0 | 2011 | FEMALE | HISPANIC | GERALDINE | 13 | 75 |
| 1 | 2011 | FEMALE | HISPANIC | GIA | 21 | 67 |
| 2 | 2011 | FEMALE | HISPANIC | GIANNA | 49 | 42 |
| 3 | 2011 | FEMALE | HISPANIC | GISELLE | 38 | 51 |
| 4 | 2011 | FEMALE | HISPANIC | GRACE | 36 | 53 |


In [53]:
# 2011 through 2021: expand the truncated ethnicity labels to their full form
baby_names['Ethnicity'] = baby_names['Ethnicity'].replace('WHITE NON HISP','WHITE NON HISPANIC').replace('ASIAN AND PACI', 'ASIAN AND PACIFIC ISLANDER').replace('BLACK NON HISP', 'BLACK NON HISPANIC')
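
Chained `replace` calls like the one above can also be written as a single call that takes a dictionary mapping old values to new ones. A minimal sketch on a made-up **Series** follows. The `full_labels` name is hypothetical, and mapping `'ASIAN AND PACI'` to `'ASIAN AND PACIFIC ISLANDER'` is an assumption based on the full label that appears elsewhere in the data.

```python
import pandas as pd

# Hypothetical sample of truncated and complete labels.
ethnicity = pd.Series(
    ["WHITE NON HISP", "HISPANIC", "BLACK NON HISP", "ASIAN AND PACI"]
)

# One mapping from truncated label to full label (assumed full forms).
full_labels = {
    "WHITE NON HISP": "WHITE NON HISPANIC",
    "BLACK NON HISP": "BLACK NON HISPANIC",
    "ASIAN AND PACI": "ASIAN AND PACIFIC ISLANDER",
}

# replace matches whole values, so "HISPANIC" is left untouched.
cleaned = ethnicity.replace(full_labels)
print(cleaned.tolist())
```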

In [54]:
is_gabriela = baby_names["Child's First Name"].str.lower() == 'gabriela'
baby_names.loc[is_gabriela, 'Ethnicity'].value_counts()

Ethnicity
HISPANIC              35
WHITE NON HISPANIC    32
Name: count, dtype: int64

## Install openpyxl Library to Read and Write Excel Files
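
pandas relies on the openpyxl package to read and write `.xlsx` files. It is usually installed from a terminal with `pip install openpyxl`. The sketch below is one way to check from Python whether the package is already available in the current environment; the `has_openpyxl` name is only illustrative.

```python
import importlib.util

# find_spec returns None when the package cannot be located.
has_openpyxl = importlib.util.find_spec("openpyxl") is not None

if has_openpyxl:
    print("openpyxl is available")
else:
    print("openpyxl is missing; install it with: pip install openpyxl")
```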

## Import Excel File into pandas
- The `read_excel` function reads an Excel file/workbook into a **DataFrame**.
- Use the `sheet_name` parameter if the workbook contains multiple worksheets. Pass a single worksheet name or a list of worksheet names/index positions.
- Pass `sheet_name=None` to load all worksheets.
- When more than one worksheet is loaded, pandas returns a Python dictionary. The keys are the worksheet names (or positions, if positions were passed), and the values are the **DataFrames**.
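
The dictionary behavior can be tried without any file on disk. The following sketch builds a small two-sheet workbook in memory (the sheet contents are made up, and it assumes openpyxl is installed), then reads it back with `sheet_name=None` and with a list that mixes a name and a zero-based position.

```python
import io
import pandas as pd

# Build a hypothetical two-sheet workbook in memory.
buffer = io.BytesIO()
with pd.ExcelWriter(buffer, engine="openpyxl") as writer:
    pd.DataFrame({"City": ["Miami", "Denver"]}).to_excel(
        writer, sheet_name="Data 1", index=False
    )
    pd.DataFrame({"City": ["Raleigh", "Bangor"]}).to_excel(
        writer, sheet_name="Data 2", index=False
    )
workbook_bytes = buffer.getvalue()

# sheet_name=None loads every worksheet into a dict of DataFrames.
sheets = pd.read_excel(io.BytesIO(workbook_bytes), sheet_name=None)
for name, frame in sheets.items():
    print(name, frame.shape)

# A list may mix names and positions; the dict keys match what was passed.
picked = pd.read_excel(io.BytesIO(workbook_bytes), sheet_name=["Data 1", 1])
print(list(picked.keys()))
```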

In [None]:
import openpyxl

In [57]:
pd.read_excel("Data - Single Worksheet.xlsx")

|   | First Name | Last Name | City | Gender |
|---|---|---|---|---|
| 0 | Brandon | James | Miami | M |
| 1 | Sean | Hawkins | Denver | M |
| 2 | Judy | Day | Los Angeles | F |
| 3 | Ashley | Ruiz | San Francisco | F |
| 4 | Stephanie | Gomez | Portland | F |


In [63]:
df1 = pd.read_excel("Data - Multiple Worksheets.xlsx", sheet_name='Data 1')
df2 = pd.read_excel("Data - Multiple Worksheets.xlsx", sheet_name='Data 2')

In [64]:
df1

|   | First Name | Last Name | City | Gender |
|---|---|---|---|---|
| 0 | Brandon | James | Miami | M |
| 1 | Sean | Hawkins | Denver | M |
| 2 | Judy | Day | Los Angeles | F |
| 3 | Ashley | Ruiz | San Francisco | F |
| 4 | Stephanie | Gomez | Portland | F |


In [65]:
df2

|   | First Name | Last Name | City | Gender |
|---|---|---|---|---|
| 0 | Parker | Power | Raleigh | F |
| 1 | Preston | Prescott | Philadelphia | F |
| 2 | Ronaldo | Donaldo | Bangor | M |
| 3 | Megan | Stiller | San Francisco | M |
| 4 | Bustin | Jieber | Austin | F |


In [66]:
pd.read_excel("Data - Multiple Worksheets.xlsx", sheet_name=['Data 1','Data 2'])

{'Data 1':   First Name Last Name           City Gender
 0    Brandon     James          Miami      M
 1       Sean   Hawkins         Denver      M
 2       Judy       Day    Los Angeles      F
 3     Ashley      Ruiz  San Francisco      F
 4  Stephanie     Gomez       Portland      F,
 'Data 2':   First Name Last Name           City Gender
 0     Parker     Power        Raleigh      F
 1    Preston  Prescott   Philadelphia      F
 2    Ronaldo   Donaldo         Bangor      M
 3      Megan   Stiller  San Francisco      M
 4     Bustin    Jieber         Austin      F}

In [68]:
data = pd.read_excel("Data - Multiple Worksheets.xlsx", sheet_name=None)

In [69]:
data['Data 1']

|   | First Name | Last Name | City | Gender |
|---|---|---|---|---|
| 0 | Brandon | James | Miami | M |
| 1 | Sean | Hawkins | Denver | M |
| 2 | Judy | Day | Los Angeles | F |
| 3 | Ashley | Ruiz | San Francisco | F |
| 4 | Stephanie | Gomez | Portland | F |


In [71]:
data['Data 2']

|   | First Name | Last Name | City | Gender |
|---|---|---|---|---|
| 0 | Parker | Power | Raleigh | F |
| 1 | Preston | Prescott | Philadelphia | F |
| 2 | Ronaldo | Donaldo | Bangor | M |
| 3 | Megan | Stiller | San Francisco | M |
| 4 | Bustin | Jieber | Austin | F |


## Export Excel File from pandas
- The **ExcelWriter** class writes one or more **DataFrames** to an Excel file.
- Create the **ExcelWriter** object in a context manager (the `with` keyword) and assign it to a variable. The workbook is saved when the `with` block exits.
- Invoke the `to_excel` method on every **DataFrame** to include in the Excel workbook and pass in the **ExcelWriter** object as the first argument.
- The `to_excel` method supports `sheet_name`, `index`, and `columns` parameters.

In [78]:
female = baby_names[baby_names['Gender'] == 'FEMALE']
male = baby_names[baby_names['Gender'] == 'MALE']

In [81]:
male

|   | Year of Birth | Gender | Ethnicity | Child's First Name | Count | Rank |
|---|---|---|---|---|---|---|
| 363 | 2013 | MALE | HISPANIC | Jared | 25 | 80 |
| 416 | 2013 | MALE | HISPANIC | Jariel | 25 | 80 |
| 547 | 2011 | MALE | ASIAN AND PACIFIC ISLANDER | AARAV | 15 | 51 |
| 548 | 2011 | MALE | ASIAN AND PACIFIC ISLANDER | AARON | 51 | 19 |
| 549 | 2011 | MALE | ASIAN AND PACIFIC ISLANDER | ABDUL | 20 | 46 |
| ... | ... | ... | ... | ... | ... | ... |
| 69200 | 2011 | MALE | HISPANIC | ANDERSON | 33 | 71 |
| 69204 | 2011 | MALE | HISPANIC | GERARDO | 12 | 92 |
| 69206 | 2013 | MALE | BLACK NON HISPANIC | Derrick | 10 | 62 |
| 69208 | 2011 | MALE | ASIAN AND PACIFIC ISLANDER | IBRAHIM | 17 | 49 |


In [82]:
with pd.ExcelWriter('NYC Baby Data.xlsx') as excel_file: # Context Manager
    female.to_excel(excel_file, sheet_name='Females', index=False)
    male.to_excel(excel_file, sheet_name='Males', index=False, columns=['Ethnicity','Count'])
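
One way to confirm what an export like this wrote is to read the workbook back with `sheet_name=None` and inspect each sheet. The sketch below repeats the same pattern on a few made-up rows and writes to an in-memory buffer instead of `NYC Baby Data.xlsx`. The `names` data and the buffer are assumptions for illustration, and openpyxl must be installed.

```python
import io
import pandas as pd

# Hypothetical toy rows standing in for the baby names data.
names = pd.DataFrame({
    "Gender": ["FEMALE", "MALE", "FEMALE"],
    "Ethnicity": ["HISPANIC", "HISPANIC", "WHITE NON HISPANIC"],
    "Count": [13, 25, 40],
})

buffer = io.BytesIO()  # in-memory stand-in for a file on disk
with pd.ExcelWriter(buffer, engine="openpyxl") as writer:
    names[names["Gender"] == "FEMALE"].to_excel(
        writer, sheet_name="Females", index=False
    )
    # columns= restricts this sheet to the two listed columns.
    names[names["Gender"] == "MALE"].to_excel(
        writer, sheet_name="Males", index=False, columns=["Ethnicity", "Count"]
    )

# Read every sheet back to confirm what was written.
written = pd.read_excel(io.BytesIO(buffer.getvalue()), sheet_name=None)
for sheet, frame in written.items():
    print(sheet, list(frame.columns), len(frame))
```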