# Working with Files in Python

### File system 
The file system is where files are stored and organized into directories. It is often represented visually through a graphical user interface.

### Path
The specific location of a file within the file system is called its path. For instance, if you had a file named file.txt on your Desktop, the file's path might be /Users/username/Desktop/file.txt on a Mac or C:\Users\username\Desktop\file.txt on a Windows machine.
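
Because path separators differ between operating systems, it can help to build paths in code instead of typing them by hand. The following is a minimal sketch using the standard-library os.path functions; the folder and file names are placeholders for illustration, not paths taken from this notebook.

```python
import os

# Build a path from its parts; os.path.join picks the right separator for the current OS.
# "Desktop" and "file.txt" are placeholder names used only for illustration.
path = os.path.join(os.path.expanduser("~"), "Desktop", "file.txt")

# Split the path back into its directory part and its file name.
directory, filename = os.path.split(path)
print(directory)
print(filename)              # file.txt

# Check whether anything actually exists at that location.
print(os.path.exists(path))
```

The printed directory and the result of os.path.exists depend on the machine the code runs on.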

### OS
The os module, part of Python's standard library, makes navigating directories and locating files easier.

In [30]:
import os

In [32]:
# find out our current working directory 

os.getcwd()

'/home/cvila/Ironhack_Prework_01/prework-datamad-no-solutions'

In [33]:
# To change the current working directory, use os.chdir(path)
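
As a rough illustration of os.chdir(), the sketch below moves into a temporary directory and back again. The temporary directory is an assumption made so the example does not depend on any real folder; in practice you would pass the path you actually want to work in.

```python
import os
import tempfile

original_dir = os.getcwd()          # remember where we started

# Use a throwaway directory so the example does not touch any real folder.
with tempfile.TemporaryDirectory() as tmp_dir:
    os.chdir(tmp_dir)               # change the current working directory
    # realpath resolves symlinks so the two paths compare equal on every OS
    moved = os.path.realpath(os.getcwd()) == os.path.realpath(tmp_dir)
    print(moved)                    # True
    print(os.listdir("."))          # [] because the new directory is empty
    os.chdir(original_dir)          # go back before the temporary directory is removed
```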

## Writing in files

### Write to a file

We can write to a file in Python by combining with and open() to open the file for writing (mode "w"), and then calling the write method to write content to that file. Note that "w" overwrites the file if it already exists.

In [35]:
with open("example.txt", "w") as f:
    f.write("Hello World! \n")
    f.write("how are you? \n")
    f.write("I'm fine.")
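
To make the difference between file modes concrete, here is a small sketch contrasting "w" (write) with "a" (append). The file name demo.txt and the temporary folder are hypothetical, chosen so that no real file is overwritten.

```python
import os
import tempfile

# Work in a temporary folder so no real file is overwritten.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.txt")   # hypothetical file name

    with open(demo_path, "w") as f:     # "w" creates the file, or empties an existing one
        f.write("first line\n")

    with open(demo_path, "a") as f:     # "a" keeps the existing content and adds to the end
        f.write("second line\n")

    with open(demo_path, "r") as f:
        content = f.read()

print(content)
```

Opening the file with "w" a second time instead of "a" would have discarded the first line.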

### Reading Files

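To read the file, we change the "w" inside the open() function to an "r" (for "reading"). We can then use the readlines method to read all the lines in the file into a list, and a simple for loop to print each of them. Each line keeps its trailing newline character and print adds another one, which is why the output below shows a blank line after each line.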

In [37]:
with open("example.txt", "r") as f:
    lines = f.readlines()
    for line in lines:
        print(line)

Hello World! 

how are you? 

I'm fine.
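
The blank lines in the output above appear because every line read from the file still ends with a newline character. One possible way to avoid them, sketched here against a temporary copy of the same three lines, is to loop over the file object directly and strip the newline before printing.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Recreate the example file in a temporary folder (an assumption for this sketch).
    path = os.path.join(tmp_dir, "example.txt")
    with open(path, "w") as f:
        f.write("Hello World! \n")
        f.write("how are you? \n")
        f.write("I'm fine.")

    cleaned = []
    with open(path, "r") as f:
        for line in f:                          # a file object can be looped over directly
            cleaned.append(line.rstrip("\n"))   # drop only the trailing newline

for line in cleaned:
    print(line)
```

Passing end="" to print would be another option, since it stops print from adding its own newline.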


## Reading Data from a CSV File 

CSV files are comma-delimited text files with a .csv file extension. In these files, commas separate values from one another and new lines separate records from each other. In this section, we are going to read data from a CSV file and perform some simple calculations.

In [41]:
# Extract the data from this file and put it into a format that we can work with

# First, we create an empty list called data
data = []

# Then we read our csv file, convert each line to a list with the split() method, and append it to our data list.
with open("/home/cvila/Downloads/weight_height.csv", "r") as f:
    lines = f.readlines()
    for line in lines:
        data.append(line.split()[0].split(","))


In this example, we used the split() method twice. The first call, with no arguments, splits the line on whitespace. This removes the trailing newline and produces a list whose only element is the comma-separated string, which is not yet useful on its own. To get at the individual values, we take the 0th element of that list (the comma-separated string) and split it wherever there is a comma. Note that this approach only works when the values themselves contain no spaces and the file has no empty lines.
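
An alternative to splitting by hand is the csv module from the standard library, which also copes with values that contain spaces or quoted commas. This is only a sketch of that alternative, not the method used in this notebook; it reads a small stand-in file (sample.csv, a hypothetical name) built from the header and the first two records shown in the output below.

```python
import csv
import os
import tempfile

# A tiny stand-in file with the same columns as the data in this section.
sample = (
    "gender,actual_weight,actual_height,reported_weight,reported_height\n"
    "M,77,182,77,180\n"
    "F,58,161,51,159\n"
)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample.csv")   # hypothetical file name
    with open(path, "w", newline="") as f:
        f.write(sample)

    # newline="" is what the csv documentation recommends when opening CSV files.
    with open(path, "r", newline="") as f:
        rows = list(csv.reader(f))               # each row becomes a list of strings

print(rows[0])
print(rows[1])
```

As with split(), every value comes back as a string, so numbers still need to be converted with int() or float().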

In [44]:
data

[['gender',
  'actual_weight',
  'actual_height',
  'reported_weight',
  'reported_height'],
 ['M', '77', '182', '77', '180'],
 ['F', '58', '161', '51', '159'],
 ['F', '53', '161', '54', '158'],
 ['M', '68', '177', '70', '175'],
 ['F', '59', '157', '59', '155'],
 ['M', '76', '170', '76', '165'],
 ['M', '76', '167', '77', '165'],
 ['M', '69', '186', '73', '180'],
 ['M', '71', '178', '71', '175'],
 ['M', '65', '171', '64', '170'],
 ['M', '70', '175', '75', '174'],
 ['F', '166', '57', '56', '163'],
 ['F', '51', '161', '52', '158'],
 ['F', '64', '168', '64', '165'],
 ['F', '52', '163', '57', '160'],
 ['F', '65', '166', '66', '165'],
 ['M', '92', '187', '101', '185'],
 ['F', '62', '168', '62', '165'],
 ['M', '76', '197', '75', '200'],
 ['F', '61', '175', '61', '171'],
 ['M', '119', '180', '124', '178'],
 ['F', '61', '170', '61', '170'],
 ['M', '65', '175', '66', '173'],
 ['M', '66', '173', '70', '170'],
 ['F', '54', '171', '59', '168'],
 ['F', '50', '166', '50', '165'],
 ['F', '63', '169', 

In [46]:
data[0]

['gender',
 'actual_weight',
 'actual_height',
 'reported_weight',
 'reported_height']

In [47]:
data[1][0]

'M'

In [48]:
heights = []

for person in data[1:]:
    height = int(person[2])
    heights.append(height)

In [49]:
avg_height = sum(heights)/len(heights)
print(avg_height)

170.14835164835165
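
The loop and the sum/len calculation above can also be written more compactly. The sketch below uses a list comprehension and statistics.mean on a few rows copied from the output shown earlier; the small sample list is a stand-in for the full data list, so its average differs from the one printed above.

```python
from statistics import mean

# A few rows copied from the output above, standing in for the full data list.
sample = [
    ["gender", "actual_weight", "actual_height", "reported_weight", "reported_height"],
    ["M", "77", "182", "77", "180"],
    ["F", "58", "161", "51", "159"],
    ["F", "53", "161", "54", "158"],
]

# Skip the header row and convert the third column to integers in one step.
sample_heights = [int(person[2]) for person in sample[1:]]

print(sample_heights)          # [182, 161, 161]
print(mean(sample_heights))    # average of the three sample heights
```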


Next, we compare the average heights of males and females.

In [52]:
# 1. Create an empty list for each group
male_heights = []
female_heights = []

# 2. Append each person's height to the list for their group
for person in data[1:]:
    height = int(person[2])
    if person[0] == 'M':
        male_heights.append(height)
    elif person[0] == 'F':
        female_heights.append(height)

# 3. Compute the average of each list
avg_male_height = sum(male_heights)/len(male_heights)
avg_female_height = sum(female_heights)/len(female_heights)
 
print("Avg male height:", avg_male_height)
print("Avg female height:", avg_female_height)

Avg male height: 178.0121951219512
Avg female height: 163.7
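
The same three steps can be generalized so that the groups do not have to be named in advance. The following sketch, which is one possible variation and not part of the original analysis, collects heights in a dictionary keyed by the value in the gender column. It uses four rows copied from the output shown earlier, so its averages differ from the full-data results above.

```python
from collections import defaultdict

# A few rows copied from the output above, standing in for the full data list.
rows = [
    ["gender", "actual_weight", "actual_height", "reported_weight", "reported_height"],
    ["M", "77", "182", "77", "180"],
    ["F", "58", "161", "51", "159"],
    ["F", "53", "161", "54", "158"],
    ["M", "68", "177", "70", "175"],
]

# 1. Create one empty list per group automatically, the first time a group is seen
heights_by_gender = defaultdict(list)

# 2. Append each person's height to the list for their group
for person in rows[1:]:
    heights_by_gender[person[0]].append(int(person[2]))

# 3. Compute the average of each group
averages = {gender: sum(h) / len(h) for gender, h in heights_by_gender.items()}

print(averages)   # {'M': 179.5, 'F': 161.0}
```

With this layout, adding a new category to the first column requires no change to the code.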
