# Starter code for reading files

[![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/enactdev/CISC_106_F18/master?filepath=starter_code/file_input.ipynb)

**Even if you have not been using the Binder links, I suggest using this one for file I/O, since the data file is already created for you and is in the proper place.**

### Read the entire file into a variable:

In [1]:
filename = '../data_files/us_census_population.csv'

# Open the file
f = open(filename, 'r')

# Read entire file
file_contents = f.read()

# Close the file
f.close()

# Check out the first 50 characters
print(file_contents[:50])

Year,Population
1790,3929326
1800,5308483
1810,723
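
The open/read/close pattern above can also be written with a `with` statement, which closes the file for you. The following is a minimal sketch, not part of the original notebook: it assumes a small made-up sample file (written to a temporary directory so it runs anywhere) in place of `us_census_population.csv`.

```python
import os
import tempfile

# Hypothetical sample data standing in for us_census_population.csv
sample = 'Year,Population\n1790,3929326\n1800,5308483\n'

# Write the sample to a temporary file so this example runs anywhere
tmp_dir = tempfile.mkdtemp()
filename = os.path.join(tmp_dir, 'sample_population.csv')
with open(filename, 'w') as f:
    f.write(sample)

# The with statement closes the file when the indented block ends,
# even if an error happens while reading
with open(filename, 'r') as f:
    file_contents = f.read()

# The file is already closed here, so no f.close() is needed
print('closed:', f.closed)
print(file_contents[:15])
```

With this form there is no `f.close()` line to forget, which is why it is often preferred once the basic open/read/close steps are familiar.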


---

### Read a file line by line:

**Just like user input with the `input()` function, data from a file starts out as a string.**

In [2]:
filename = '../data_files/us_census_population.csv'

# Open the file
f = open(filename, 'r')

# When reading a file it can be handy to know what line you are on
current_line = 0

# A for loop over a file object reads it one line at a time
for line in f:
    
    current_line += 1
    
    # For testing, print the line's data type and its value
    print('current line number:', current_line)
    print('type(line):', type(line))
    print('line:', line)
    
    # When testing, it can be useful to exit after manually looking at a few lines.
    # You do not want these two lines in production code
    if current_line >= 5:
        break
    
f.close()

current line number: 1
type(line): <class 'str'>
line: Year,Population

current line number: 2
type(line): <class 'str'>
line: 1790,3929326

current line number: 3
type(line): <class 'str'>
line: 1800,5308483

current line number: 4
type(line): <class 'str'>
line: 1810,7239881

current line number: 5
type(line): <class 'str'>
line: 1820,9638453
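
The manual `current_line` counter above can also be handled by Python's built-in `enumerate()`. This is a minimal sketch under one assumption: `io.StringIO` with a few made-up sample lines stands in for the opened census file, so the example runs without the data file.

```python
import io

# Hypothetical sample data; io.StringIO behaves like an open text file
f = io.StringIO('Year,Population\n1790,3929326\n1800,5308483\n')

numbered = []

# enumerate() pairs each line with a counter; start=1 numbers the first line 1
for current_line, line in enumerate(f, start=1):
    # strip() removes the trailing newline so nothing extra is printed
    numbered.append((current_line, line.strip()))
    print('line', current_line, ':', line.strip())

f.close()
```

The loop body no longer needs `current_line += 1`, so the counter cannot drift out of step with the lines actually read.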



---

### Converting a line into values:

In [3]:
filename = '../data_files/us_census_population.csv'

# Open the file
f = open(filename, 'r')

# This is a CSV file. As shown above, the first line is a header row. 
# Read it with the next() function, but do not do anything with it.
header_row = next(f) 

# When reading a file it can be handy to know what line you are on
# Since we read the first line and ignored it, set to 1
current_line = 1

# A for loop over a file object reads it one line at a time
for line in f:
    
    current_line += 1
    
    # Turn the line into a list of values
    line_split = line.split(',')
    
    # Use the strip() method to remove any surrounding whitespace,
    # including the newline character at the end of each line
    year = line_split[0].strip()
    population = line_split[1].strip()
    
    print('In {} the population was {}'.format(year, population))

    
f.close()

In 1790 the population was 3929326
In 1800 the population was 5308483
In 1810 the population was 7239881
In 1820 the population was 9638453
In 1830 the population was 12866020
In 1840 the population was 17069453
In 1850 the population was 23191876
In 1860 the population was 31443321
In 1870 the population was 39818449
In 1880 the population was 50189209
In 1890 the population was 62947714
In 1900 the population was 76212168
In 1910 the population was 92228496
In 1920 the population was 106021537
In 1930 the population was 122775046
In 1940 the population was 132164569
In 1950 the population was 150697361
In 1960 the population was 179323175
In 1970 the population was 203302031
In 1980 the population was 226545805
In 1990 the population was 248709873
In 2000 the population was 281421906
In 2010 the population was 308745538
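
In the example above, `year` and `population` are still strings. To do arithmetic with them, convert them with `int()`. The sketch below is an illustration rather than part of the original notebook: it assumes `io.StringIO` with the first three data rows as a stand-in for the census file, and the `changes` list is a name made up for this example.

```python
import io

# Sample rows copied from the output above; io.StringIO stands in for the file
f = io.StringIO('Year,Population\n1790,3929326\n1800,5308483\n1810,7239881\n')

# Skip the header row
header_row = next(f)

previous_population = None
changes = []  # hypothetical list of (year, change since previous census)

for line in f:
    line_split = line.split(',')

    # int() turns the cleaned-up strings into numbers
    year = int(line_split[0].strip())
    population = int(line_split[1].strip())

    # Subtraction only works because both values are now integers
    if previous_population is not None:
        changes.append((year, population - previous_population))

    previous_population = population

f.close()

print(changes)
```

Without the `int()` calls, `population - previous_population` would raise a `TypeError`, because subtraction is not defined for strings.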


---

### Basic skip lines example:

In [4]:
filename = '../data_files/us_census_population.csv'

# Open the file
f = open(filename, 'r')

# This is a CSV file. As shown above, the first line is a header row. 
# Read it with the next() function, but do not do anything with it.
header_row = next(f) 

# When reading a file it can be handy to know what line you are on
# Since we read the first line and ignored it, set to 1
current_line = 1

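# Line number to start printing from. Every line before this one is skipped,
# so with the header counted as line 1, the first 8 data lines are skipped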
skip_lines = 10

# A for loop over a file object reads it one line at a time
for line in f:
    
    current_line += 1
    
    if current_line < skip_lines:
        continue
    
    # Turn the line into a list of values
    line_split = line.split(',')
    
    # Use the strip() method to remove any surrounding whitespace,
    # including the newline character at the end of each line
    year = line_split[0].strip()
    population = line_split[1].strip()
    
    print('In {} the population was {}'.format(year, population))

    
f.close()

In 1870 the population was 39818449
In 1880 the population was 50189209
In 1890 the population was 62947714
In 1900 the population was 76212168
In 1910 the population was 92228496
In 1920 the population was 106021537
In 1930 the population was 122775046
In 1940 the population was 132164569
In 1950 the population was 150697361
In 1960 the population was 179323175
In 1970 the population was 203302031
In 1980 the population was 226545805
In 1990 the population was 248709873
In 2000 the population was 281421906
In 2010 the population was 308745538
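
Another way to skip lines is `itertools.islice()` from the standard library, which drops a fixed number of items from the start of any iterable. This is a minimal sketch with assumptions: `io.StringIO` with a few sample rows stands in for the census file, and here `skip_lines` means the number of data lines to skip after the header, which is a different meaning than in the cell above.

```python
import io
from itertools import islice

# Sample rows copied from the output above; io.StringIO stands in for the file
f = io.StringIO(
    'Year,Population\n'
    '1790,3929326\n'
    '1800,5308483\n'
    '1810,7239881\n'
    '1820,9638453\n'
)

# Skip the header row
header_row = next(f)

# Assumption for this sketch: the number of data lines to skip after the header
skip_lines = 2

years = []

# islice(f, skip_lines, None) skips the first skip_lines lines, then yields the rest
for line in islice(f, skip_lines, None):
    year, population = line.strip().split(',')
    years.append(year)
    print('In {} the population was {}'.format(year, population))

f.close()
```

Because the skipping happens before the loop body runs, no counter or `continue` statement is needed inside the loop.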
