# File Handling Hands On

You are going to read in a file of data, compute some statistics from the data, and write out a brief report of your findings. Start by downloading the data file _weight-height.csv_ from Canvas.

**For Google Colab Users**

* Run the following cell to give this Colab notebook access to your Google Drive.
* Then, in Colab, click the File Folder icon along the left edge. This allows you to navigate to your files. Expand _drive_ and you should see _MyDrive_ -- that's your Google Drive.
* Put the _weight-height.csv_ file somewhere in your Google Drive and note where you put it.
* Then, when the time comes later to open this file using Python code, the path to it will be
`/content/drive/MyDrive/folder/folder/.../weight-height.csv`.

In [None]:
# Only run this if you are using Google Colab
from google.colab import drive
drive.mount('/content/drive')

**For Everyone**

If you open the _weight-height.csv_ file in JupyterLab or Google Colab, you get a spreadsheet-like view rather than the raw text of the file. (If you are using JupyterLab and not Colab, you can see the raw text by right-clicking the file and choosing Open with --> Editor.)

Here's an excerpt from the file to show you what it really contains:

```
"Gender","Height","Weight"
"Male",73.847017017515,241.893563180437
"Male",68.7819040458903,162.310472521300
"Male",74.1101053917849,212.7408555565
"Male",71.7309784033377,220.042470303077
"Male",69.8817958611153,206.349800623871
. . .
. . .   (lots of lines omitted)
. . .
"Female",58.9107320370127,102.088326367840
"Female",65.2300125077128,141.305822601420
"Female",63.3690037584139,131.041402692995
"Female",64.4799974256081,128.171511221632
"Female",61.7930961472815,129.781407047572
```

The file is in CSV format, which stands for Comma-Separated Values. You'll see that every line contains three values: gender, height, and weight, with a comma separating each value. The exception is the first line, which just contains headings to describe the lines that follow.
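To make the format concrete, here is a small sketch of how one data line breaks apart with `str.split`. The sample line is copied from the excerpt above; the variable names are only illustrative. Notice that the quotation marks are part of the first field, so the gender text is `"Male"` with the quotes included unless you strip them.

```python
# One data line from the excerpt, as it would look when read from the file
line = '"Male",73.847017017515,241.893563180437\n'

# Remove the trailing newline, then split on the commas
parts = line.rstrip('\n').split(',')

gender = parts[0]           # still has its quotation marks: '"Male"'
height = float(parts[1])    # convert the text to a number
weight = float(parts[2])

# Optionally strip the quotation marks from the gender field
plain_gender = gender.strip('"')

print(parts)
print(plain_gender, height, weight)
```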

Your task is to write a program that:

* Prompts the user for the name of the data file
* Opens the data file
* Reads the data file and calculates the following values for each gender:
  * Number of data points
  * Average Height
  * Average Weight
* Writes a new file, _weight-height.txt_, that contains a report of your findings. The format of this file should be as follows:

```
Female

Number of data points: n
Average Height: hhh
Average Weight: www

Male

Number of data points: n
Average Height: hhh
Average Weight: www
```
* Closes both files (the one you read and the one you wrote)

## The Rules
* You may assume that the data file contains only gender, height, and weight in the order shown in the file. Your code does not have to work with a different file that might contain different values in a different order.
* You may **not** assume anything about how many records are in the file, or how many records there are for males and females. If I gave you a different file with a different number of male records and a different number of female records, your code would still have to work correctly.
* You may not alter the data file in any way, except that you may rename it with a shorter name if you like.
* You may **not** use pandas.
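
Because the number of records for each gender is unknown, a count could in principle be zero, and dividing by zero raises `ZeroDivisionError`. The helper below is one possible guard. The name `safe_average` and the choice to return `None` for an empty group are assumptions made for illustration, not part of the assignment.

```python
def safe_average(total, count):
    """Return total / count, or None when there are no data points."""
    if count == 0:
        return None
    return total / count

print(safe_average(10.0, 4))   # 2.5
print(safe_average(0, 0))      # None
```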

In [2]:
# Here is "pseudocode" for the game plan

# Ask the user for the name of a data file
# Open that data file

# Establish running totals for height and weight for both male and female
# Establish counters for total number of males and females

# Read and discard the first line of the file (no useful data on that line)

# For every remaining line in the data file
#   Get the gender, height, and weight from the line
#   If male...
#     Add height to total male height
#     Add weight to the total male weight
#     Add 1 to the total male count
#   Otherwise...
#     Do all the same things for the female totals and counts
#
# After all lines have been read, close the data file
# Calculate the averages
# Open the output file to write on it
# Write the output
# Close the output file

male_total_height = 0
male_total_weight = 0
female_total_height = 0
female_total_weight = 0
male_count = 0
female_count = 0

file_name = input('Enter the name of the file: ')
f = open(file_name)  # opens the file for reading by default
f.readline()  # read and discard the heading line

for line in f:
    line = line.rstrip('\n')
    line_parts = line.split(',')  # line_parts will be a list of strings
    gender = line_parts[0]
    height = float(line_parts[1])
    weight = float(line_parts[2])

    if gender == '"Male"':
        male_total_height = male_total_height + height  # update running sum
        male_total_weight = male_total_weight + weight
        male_count = male_count + 1
    else:
        female_total_height = female_total_height + height  # update running sum
        female_total_weight = female_total_weight + weight
        female_count = female_count + 1

# Close the data file now that every line has been read
f.close()

# Calculate averages
average_male_height = male_total_height / male_count
average_male_weight = male_total_weight / male_count
average_female_height = female_total_height / female_count
average_female_weight = female_total_weight / female_count

output_file = open('weight-height.txt', 'w')

output_text = f'''Female

Number of data points: {female_count}
Average Height: {average_female_height}
Average Weight: {average_female_weight}

Male

Number of data points: {male_count}
Average Height: {average_male_height}
Average Weight: {average_male_weight}
'''

output_file.write(output_text)
output_file.close()
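
An alternative to calling `close()` yourself is the `with` statement, which closes the file automatically when the indented block ends, even if an error occurs partway through. The sketch below shows the idea on a tiny sample file built from two rows of the excerpt. The file name `sample.csv`, the temporary folder, and the variable names are assumptions made for the demonstration.

```python
import os
import tempfile

# Build a tiny sample file in a temporary folder (two rows from the excerpt)
folder = tempfile.mkdtemp()
sample_path = os.path.join(folder, 'sample.csv')
with open(sample_path, 'w') as sample:
    sample.write('"Gender","Height","Weight"\n')
    sample.write('"Male",73.847017017515,241.893563180437\n')
    sample.write('"Female",58.9107320370127,102.088326367840\n')

# Read the file; it is closed automatically when the with block ends
row_count = 0
with open(sample_path) as data_file:
    data_file.readline()          # skip the heading line
    for line in data_file:
        row_count = row_count + 1

print(row_count)          # 2
print(data_file.closed)   # True
```

With this pattern there is no separate `close()` call to forget, which is why many Python programmers prefer it for both reading and writing.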
