# 1. Introduction To The Dataset

The dataset contains the following columns:

`year`: Year (`1994` to `2003`).

`month`: Month (`1` to `12`).

`date_of_month`: Day number of the month (`1` to `31`).

`day_of_week`: Day of week (`1` to `7`).

`births`: Number of births that day.

**Instructions**

- Read the CSV file "US_births_1994-2003_CDC_NCHS.csv" into a string.
- Split the string on the newline character ("\n").
- Display the first 10 values in the resulting list.

In [2]:
with open("US_births_1994-2003_CDC_NCHS.csv", "r") as f:
    text = f.read()
lines = text.split('\n')
lines[0:10]

['year,month,date_of_month,day_of_week,births',
 '1994,1,1,6,8096',
 '1994,1,2,7,7772',
 '1994,1,3,1,10142',
 '1994,1,4,2,11248',
 '1994,1,5,3,11053',
 '1994,1,6,4,11406',
 '1994,1,7,5,11251',
 '1994,1,8,6,8653',
 '1994,1,9,7,7910']
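
One detail worth knowing when splitting on `"\n"`: if the file ends with a newline, the resulting list ends with an empty string. The sketch below illustrates this on a short in-memory string (a made-up stand-in for the file contents, built from the rows shown above) and compares it with `str.splitlines()`, which does not produce that trailing element.

```python
# Stand-in for the file contents: header plus the first two rows shown above.
# The trailing "\n" is an assumption, added to show the edge case.
sample_text = (
    "year,month,date_of_month,day_of_week,births\n"
    "1994,1,1,6,8096\n"
    "1994,1,2,7,7772\n"
)

split_lines = sample_text.split("\n")   # keeps a trailing empty string
clean_lines = sample_text.splitlines()  # no trailing empty string

print(len(split_lines), repr(split_lines[-1]))
print(len(clean_lines), repr(clean_lines[-1]))
```

Whether the real CSV ends with a newline is not stated here, so either approach may be fine for this file; `splitlines()` is simply the more forgiving of the two.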

# 2. Converting Data Into A List Of Lists

**Instructions**

- Create a function named `read_csv()` that:
    - Takes a single, required argument: a string containing the file name of the CSV file.
    - Returns a list of lists of the integer values in the CSV file. Each row becomes a list of integers, and the header row is excluded.
    
- Use the `read_csv()` function to read in the file `"US_births_1994-2003_CDC_NCHS.csv"` and assign the result to `cdc_list`.

- Display the first 10 rows of `cdc_list` to confirm it's a list of lists, containing only integer values, and no header row.



In [4]:
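# Define the function
def read_csv(csv_name):
    # Read the whole file; the with-block closes it afterwards
    with open(csv_name, "r") as f:
        string_list = f.read().split('\n')
    # Drop the header row
    string_list = string_list[1:]

    final_list = []

    for string in string_list:
        # Skip blank lines (e.g. from a trailing newline), which int() cannot parse
        if not string:
            continue
        int_fields = []
        string_fields = string.split(',')
        for strf in string_fields:
            int_fields.append(int(strf))
        final_list.append(int_fields)

    return final_list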

In [5]:
# Call the function
cdc_list = read_csv("US_births_1994-2003_CDC_NCHS.csv")
cdc_list[0:10]

[[1994, 1, 1, 6, 8096],
 [1994, 1, 2, 7, 7772],
 [1994, 1, 3, 1, 10142],
 [1994, 1, 4, 2, 11248],
 [1994, 1, 5, 3, 11053],
 [1994, 1, 6, 4, 11406],
 [1994, 1, 7, 5, 11251],
 [1994, 1, 8, 6, 8653],
 [1994, 1, 9, 7, 7910],
 [1994, 1, 10, 1, 10498]]
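
As an alternative to splitting strings by hand, the same list of lists can be built with the standard library's `csv` module. This is only a sketch: the helper name `parse_csv_ints` is hypothetical, and it reads from an in-memory `io.StringIO` object (filled with rows copied from the output above) so that it runs without the data file.

```python
import csv
import io

def parse_csv_ints(file_obj):
    """Return the rows of an open CSV file as lists of ints, skipping the header."""
    reader = csv.reader(file_obj)
    next(reader)  # discard the header row
    # csv.reader yields [] for blank lines, so `if row` filters them out
    return [[int(field) for field in row] for row in reader if row]

# In-memory stand-in for the real file
sample_file = io.StringIO(
    "year,month,date_of_month,day_of_week,births\n"
    "1994,1,1,6,8096\n"
    "1994,1,2,7,7772\n"
)
sample_rows = parse_csv_ints(sample_file)
print(sample_rows)
```

With the real file, the same helper could be called inside `with open("US_births_1994-2003_CDC_NCHS.csv", newline="") as f:`.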

# 3. Calculating Number Of Births Each Month

**Instructions**

- Create a function named `month_births()` that takes a single, required argument (a list of lists) and returns a dictionary mapping each month number to the total number of births in that month.

- Use the `month_births()` function to calculate the monthly totals for the dataset and assign the result to `cdc_month_births`. Display the dictionary.

In [6]:
# Define the function
def month_births(csv_data):
    births_per_month = {}
    for line in csv_data:
        month = line[1]
        births = line[4]
        if month in births_per_month:
            births_per_month[month] += births
        else:
            births_per_month[month] = births
    return births_per_month

In [8]:
# Call the function
cdc_month_births = month_births(cdc_list)
cdc_month_births

{1: 3232517,
 2: 3018140,
 3: 3322069,
 4: 3185314,
 5: 3350907,
 6: 3296530,
 7: 3498783,
 8: 3525858,
 9: 3439698,
 10: 3378814,
 11: 3171647,
 12: 3301860}
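
The `if key in dict ... else` pattern used in `month_births()` can also be written with `collections.defaultdict`, which supplies a starting value of `0` for unseen keys. The sketch below uses a hypothetical function name, `month_births_dd`, and a few rows for illustration; the two January rows come from the output above, while the February row is made up.

```python
from collections import defaultdict

def month_births_dd(csv_data):
    """Total births per month, using defaultdict instead of an explicit key check."""
    totals = defaultdict(int)  # missing keys start at 0
    for row in csv_data:
        totals[row[1]] += row[4]  # row[1] is month, row[4] is births
    return dict(totals)  # convert back to a plain dict for display

demo_rows = [
    [1994, 1, 1, 6, 8096],
    [1994, 1, 2, 7, 7772],
    [1994, 2, 1, 2, 11000],  # made-up row, for illustration only
]
demo_month_totals = month_births_dd(demo_rows)
print(demo_month_totals)  # {1: 15868, 2: 11000}
```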

# 4. Calculating Number Of Births Each Day Of Week

**Instructions**

- Create a function named `dow_births()` that takes a single, required argument (a list of lists) and returns a dictionary containing the total number of births for each unique value of the `day_of_week` column.
- Use the `dow_births()` function to return the day-of-week totals for the dataset and assign the result to `cdc_day_births`. Display the dictionary.

In [9]:
# Define the function
def dow_births(csv_data):
    births_per_day = {}
    for line in csv_data:
        dow = line[3]
        births = line[4]
        if dow in births_per_day:
            births_per_day[dow] += births
        else:
            births_per_day[dow] = births
    return births_per_day

In [10]:
# Call the function
cdc_day_births = dow_births(cdc_list)
cdc_day_births

{1: 5789166,
 2: 6446196,
 3: 6322855,
 4: 6288429,
 5: 6233657,
 6: 4562111,
 7: 4079723}
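
The column description only says `day_of_week` runs from `1` to `7`. One way to check which day each number stands for is to compare the column with the weekday that Python's `datetime` module computes for the same date. The sketch below does this for the first three rows shown earlier; it suggests the numbering matches the ISO convention (`1` = Monday, `7` = Sunday), though this is inferred from those rows rather than stated by the dataset.

```python
from datetime import date

# First three data rows, copied from the output shown earlier
first_rows = [
    [1994, 1, 1, 6, 8096],
    [1994, 1, 2, 7, 7772],
    [1994, 1, 3, 1, 10142],
]

# date.isoweekday() returns 1 for Monday through 7 for Sunday
matches = [
    date(year, month, day).isoweekday() == dow
    for year, month, day, dow, _births in first_rows
]
print(matches)

# Lookup table for labelling the totals, assuming the ISO numbering holds throughout
DAY_NAMES = {1: "Monday", 2: "Tuesday", 3: "Wednesday", 4: "Thursday",
             5: "Friday", 6: "Saturday", 7: "Sunday"}
print(DAY_NAMES[first_rows[0][3]])
```

Under that assumption, keys `6` and `7` in `cdc_day_births` correspond to Saturday and Sunday.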

# 5. Creating A More General Function

**Instructions**

- Create a function named `calc_counts()` that takes two arguments: `data` (a list of lists) and `column` (the index of the column to calculate totals for).
- Use the `calc_counts()` function to:
    - Return the yearly totals for the dataset and assign the result to `cdc_year_births`.
    - Return the monthly totals for the dataset and assign the result to `cdc_month_births`.
    - Return the day-of-month totals for the dataset and assign the result to `cdc_dom_births`.
    - Return the day-of-week totals for the dataset and assign the result to `cdc_dow_births`.

In [13]:
# Define the function
def calc_counts(csv_data, col):
    births_per_col = {}
    for line in csv_data:
        col_value = line[col]
        births = line[4]
        if col_value in births_per_col:
            births_per_col[col_value] += births
        else:
            births_per_col[col_value] = births
    return births_per_col

In [14]:
# Function calls
cdc_year_births = calc_counts(cdc_list, 0)
cdc_month_births = calc_counts(cdc_list, 1)
cdc_dom_births = calc_counts(cdc_list, 2)
cdc_dow_births = calc_counts(cdc_list, 3)
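
A quick sanity check on any grouping function of this kind: whichever column is used as the key, the totals should add up to the same overall number of births. The sketch below shows this on three rows copied from the earlier output. To keep it runnable on its own, it uses a compact variant named `calc_counts_sketch` (a hypothetical name) built on `dict.get`, not the function defined above.

```python
def calc_counts_sketch(csv_data, col):
    """Sum the births column (index 4), grouped by the values in column `col`."""
    totals = {}
    for row in csv_data:
        key = row[col]
        totals[key] = totals.get(key, 0) + row[4]  # get() returns 0 for new keys
    return totals

demo_rows = [
    [1994, 1, 1, 6, 8096],
    [1994, 1, 2, 7, 7772],
    [1994, 1, 3, 1, 10142],
]

by_year = calc_counts_sketch(demo_rows, 0)
by_dow = calc_counts_sketch(demo_rows, 3)
print(by_year)  # {1994: 26010}
print(by_dow)   # {6: 8096, 7: 7772, 1: 10142}

# Every grouping should account for the same total number of births
print(sum(by_year.values()) == sum(by_dow.values()))
```

The same comparison could be run on `cdc_year_births`, `cdc_month_births`, `cdc_dom_births`, and `cdc_dow_births` once they have been computed from the full dataset.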