**Working With CSV Files**

CSV (comma-separated values) files store tabular data as plain text. Think of them as heavily simplified spreadsheets (like Excel files): each line is one row, and the values within a row are separated by commas.
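
To make the "plain-text spreadsheet" idea concrete, here is a minimal sketch that parses a small CSV string. The column names and values are made up for illustration and are not taken from diabetes.csv.

```python
import csv
import io

# Hypothetical sample data: a header line followed by two records.
sample_text = "id,chol,age\n1000,203,46\n1001,165,29\n"

# io.StringIO lets csv.reader treat the string like an open text file.
rows = list(csv.reader(io.StringIO(sample_text)))

# Each row comes back as a list of strings, one string per column.
for row in rows:
    print(row)
```

Note that every value is read as a string, which is why the code later in this notebook converts values with `int()` before doing arithmetic.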

**Challenge 2**

Functions allow us to organize our code and make it more modular and reusable. 
Rewrite the logic for calculating the average and the highest cholesterol values to use functions.

1. Write a function for calculating the average of a numerical column in a CSV file.
    
    - Input parameters: 
        - CSV file object
        - index of the column for which we need to calculate the average
    - Return parameter: 
        - average value
        
2. Write a function for calculating the maximum value of a numerical column in a CSV file.
    - Input parameters: 
        - CSV file object
        - index of the column for which we need to calculate the maximum
    - Return parameter: 
        - maximum value
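
As a warm-up for the challenge, here is a minimal sketch of defining a function and then calling it. The function name `average_of_list` and the sample readings are hypothetical; this version works on a plain Python list rather than on a CSV object.

```python
def average_of_list(numbers):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(numbers) / len(numbers)


# The function has to be defined above before this call can run.
readings = [203, 165, 228]
print(average_of_list(readings))
```

The same pattern (input parameters in, one value returned) is what the challenge asks for, with a CSV object and a column index as the inputs.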
        



In [5]:
# To parse CSV files, we use the csv module. CSV stands for "comma-separated values",
# where the comma is what is known as a "delimiter": the character that separates one
# value from the next. The csv module provides a number of built-in functions that make
# it easier to parse and iterate through CSV files.
import csv
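
The comma is only the default delimiter. As a small sketch (the semicolon-separated data below is hypothetical), `csv.reader` accepts a `delimiter` argument for files that use a different separator:

```python
import csv
import io

# Hypothetical data that uses semicolons instead of commas.
semicolon_text = "id;chol\n1000;203\n"

# The delimiter argument tells csv.reader which character separates fields.
parsed = list(csv.reader(io.StringIO(semicolon_text), delimiter=";"))
print(parsed)
```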

In [6]:
# Open the diabetes file. Note that open() does not copy the file's text into the variable.
# Instead, diabetes_file holds a file object: a handle that Python uses to read the
# file's contents on demand.
diabetes_file = open("diabetes.csv")


# Now we need to tell Python that the file stored in the diabetes_file variable should be
# read and interpreted as a CSV file. We do that by calling the reader() function of the csv module
diabetes_data = csv.reader(diabetes_file)
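
An alternative way to open a file is the `with` statement, which closes the file automatically. The sketch below is not how this notebook does it (the notebook keeps `diabetes_file` open so it can be rewound later); it writes a tiny hypothetical file, `sample.csv`, so that it can run on its own.

```python
import csv
import os
import tempfile

# Write a tiny hypothetical CSV file so the example is self-contained.
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, "sample.csv")
with open(sample_path, "w", newline="") as out_file:
    out_file.write("id,chol\n1000,203\n1001,165\n")

# The with statement closes the file as soon as the indented block ends.
# The csv module documentation recommends newline="" for files it reads.
with open(sample_path, newline="") as sample_file:
    sample_rows = list(csv.reader(sample_file))

print(sample_rows)
print(sample_file.closed)
```

Because the file is closed after the block, the rows are copied into a list first; a reader cannot fetch more rows from a closed file.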



In [7]:
# A function is a block of organized, reusable code that is used 
# to perform a single, related action. Functions provide better modularity 
# for your application and a high degree of code reusing.
# https://www.tutorialspoint.com/python/python_functions.htm

# Just like variables, functions need to be defined before they can be used.
# The example below shows a function for calculating an average value from a column in a CSV reader object.

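def calculate_average(csv_data, column_index):
    row_num = 0 # Counts every row read so far; row 0 is the header
    value_cnt = 0 # Counts the data rows that contain a value
    total = 0 # This variable will hold the sum of all values in the column

    for row in csv_data:
        # Skip the header row and any row where the chosen column is blank
        if row_num > 0 and row[column_index] != "":
            total = total + int(row[column_index])
            value_cnt = value_cnt + 1 # Increment the value counter by one
        row_num = row_num + 1

    # Divide by the number of values only, so the header row does not lower the average
    column_average = total / value_cnt

    return column_average



In [8]:
# Call the function
diabetes_file.seek(0) # Reset the read position of the file object
avg_chol = calculate_average(diabetes_data, 1)
print("Average cholesterol: " , round(avg_chol, 2))

Average cholesterol:  207.85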
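
The `diabetes_file.seek(0)` call matters because a csv reader hands out each row only once. Once a loop has gone through the whole file, a second loop finds nothing left unless the file is rewound first. The sketch below shows this with a small hypothetical in-memory file standing in for diabetes.csv.

```python
import csv
import io

# Hypothetical stand-in for an open CSV file.
sample_file = io.StringIO("id,chol\n1000,203\n1001,165\n")
sample_data = csv.reader(sample_file)

first_pass = list(sample_data)   # Reads every row
second_pass = list(sample_data)  # Nothing is left to read

sample_file.seek(0)              # Move back to the start of the file
third_pass = list(sample_data)   # The same reader yields the rows again

print(len(first_pass), len(second_pass), len(third_pass))
```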


In [10]:
def calculate_max(csv_data, column_index):
    cnt = 0 # Initialize a temporary counter; row 0 is the header
    max_value = 0 # This variable will hold the largest value seen so far

    for row in csv_data:
        if row[column_index] != "":
            if cnt > 0:
                # Every time through the loop (for every data row that contains a value)
                # we compare the value from the data with the value stored in
                # the max_value variable.
                # If the value from the data is larger, we set max_value to that larger value.
                # After the loop finishes running, the largest value will be stored in max_value.
                # Starting from 0 assumes the column holds no negative values.
                if max_value < int(row[column_index]):
                    max_value = int(row[column_index])
            cnt = cnt + 1 # Increment the counter by one

    return max_value


# Calculate maximum cholesterol
diabetes_file.seek(0) # Reset the read position of the file object
max_chol = calculate_max(diabetes_data, 1)
print("Maximum cholesterol: ", max_chol)

Maximum cholesterol:  443
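
For comparison, here is a shorter sketch of the same two calculations using Python's built-in `max()` and the standard library's `statistics.mean()`. It is an alternative to the hand-written loops above, not the approach this notebook teaches, and the data is hypothetical (the blank value mimics a missing measurement).

```python
import csv
import io
import statistics

# Hypothetical data; the second record has no chol value.
sample_file = io.StringIO("id,chol\n1000,203\n1001,\n1002,165\n")
sample_data = csv.reader(sample_file)

next(sample_data)  # Skip the header row

# Collect the non-blank values of column 1 as integers.
values = [int(row[1]) for row in sample_data if row[1] != ""]

print("Average:", statistics.mean(values))
print("Maximum:", max(values))
```

Collecting the column into a list first means the file only has to be read once, even when several statistics are needed.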
