# Importing files in Python code
This section is based on the official Python documentation (available at https://docs.python.org/3/tutorial).

## Reading and Writing Files
The most common way to read or write a file is through the file object returned by `open()`.

In [5]:
f = open('test_file.txt', 'r', encoding="utf-8")
print(f.readline()) # Read the first line
f.close() 

# This is a text file for coding.



In [None]:
f = open('test_file.txt', 'r', encoding="utf-8")
print(f.read(10)) # Read the first 10 characters
f.close() 

# This is 


Here `open()` is called with three arguments: the file name; the mode (`r` for reading, `w` for writing, which truncates an existing file, or `a` for appending); and the `encoding`. Appending `b` to the mode opens the file in binary mode, where data is read and written as `bytes` objects. You cannot specify an encoding when opening a file in binary mode.
It is good practice to use the `with` keyword when working with file objects: the file is closed properly once the block finishes, even if an exception is raised.

In [None]:
with open('test_file.txt', 'r', encoding="utf-8") as f:
    print(f.read()) # read the entire file as a single string

# This is a text file for coding.
# All phrases with a '#' in start must be ignored.
# This ficctional example show the score of students in a exam.
# All data is separated by tabular
student	scoreMath	scoreBio	scoreGrammar	totalScore
Thompson	85	78	92	255
Jackson	90	81	76	247
Marie	72	88	85	245
Alice	95	91	89	275
John	68	75	70	213
Clarice	82	94	88	264
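
The modes described above can be tried out with a minimal sketch like the one below. The scratch file name `modes_demo.txt` and the temporary directory are assumptions made for illustration, so that `test_file.txt` is left untouched.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory (not part of the tutorial data)
path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

# 'w' creates the file, or truncates it if it already exists
with open(path, "w", encoding="utf-8") as f:
    f.write("first line\n")

# 'a' keeps the existing content and writes at the end
with open(path, "a", encoding="utf-8") as f:
    f.write("second line\n")

# 'r' reads text, decoded with the given encoding
with open(path, "r", encoding="utf-8") as f:
    text = f.read()

# 'rb' reads raw bytes; no encoding argument is allowed here
with open(path, "rb") as f:
    raw = f.read()

print(text)
print(type(raw))
```

After the `with` block ends, `f.closed` is `True`, so no explicit `f.close()` call is needed.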


## Read files line by line
Reading a file line by line lets the code process and store each line separately. In this example, lines that start with '#' are ignored and only the table is printed.

In [None]:
# Print only lines that don't start with '#'
with open("test_file.txt") as f:
  for x in f:
    if x.startswith('#'):
      pass # ignore those lines
    else:
      print(x) 

student	scoreMath	scoreBio	scoreGrammar	totalScore

Thompson	85	78	92	255

Jackson	90	81	76	247

Marie	72	88	85	245

Alice	95	91	89	275

John	68	75	70	213

Clarice	82	94	88	264
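
The blank lines in the output above appear because each line read from the file keeps its trailing `\n`, and `print()` adds a second newline. One possible way to avoid this is to strip the newline before printing. The sketch below assumes a small sample file written to a temporary directory, and the helper name `non_comment_lines` is hypothetical.

```python
import os
import tempfile

def non_comment_lines(path):
    """Return the lines that don't start with '#', without their trailing newline."""
    kept = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            if not line.startswith("#"):
                kept.append(line.rstrip("\n"))
    return kept

# Hypothetical sample data that mimics the layout of test_file.txt
sample = "# a comment line\nstudent\ttotalScore\nThompson\t255\n"
path = os.path.join(tempfile.mkdtemp(), "lines_demo.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write(sample)

for line in non_comment_lines(path):
    print(line)  # no extra blank line between rows
```

An alternative with the same effect is `print(line, end="")`, which leaves the line untouched and stops `print()` from adding its own newline.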


## Read files as a list

In [10]:
# create a list of each line
with open("test_file.txt") as f:
  for x in f:
    if x.startswith('#'):
      pass # ignore those lines
    else:
      names = x.split() # split() without arguments splits on any whitespace (spaces, tabs, newlines)
      print(names)

['student', 'scoreMath', 'scoreBio', 'scoreGrammar', 'totalScore']
['Thompson', '85', '78', '92', '255']
['Jackson', '90', '81', '76', '247']
['Marie', '72', '88', '85', '245']
['Alice', '95', '91', '89', '275']
['John', '68', '75', '70', '213']
['Clarice', '82', '94', '88', '264']
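
Note that `split()` returns every field as a string, including the scores. As a minimal sketch, assuming the tab-separated layout shown above, a single row can be split on `"\t"` explicitly and its scores converted to integers. The variable names are chosen for illustration only.

```python
# One row copied from the table above
line = "Thompson\t85\t78\t92\t255\n"

# Remove the trailing newline, then split on tabs only
fields = line.rstrip("\n").split("\t")

name = fields[0]
scores = [int(value) for value in fields[1:]]  # convert '85' -> 85, etc.

print(name, scores)
# The last column should be the sum of the three subject scores
print(sum(scores[:3]) == scores[3])
```

Splitting on `"\t"` rather than on any whitespace keeps a field intact if it contains spaces (for example a full name), at the cost of being stricter about the file format.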


In [11]:
# print only the first column
with open("test_file.txt") as f:
  for x in f:
    if x.startswith('#'):
      pass # ignore those lines
    else:
      names = x.split() 
      print(names[0]) # show the first column

student
Thompson
Jackson
Marie
Alice
John
Clarice


In [14]:
# print lines
with open("test_file.txt") as f:
    for lines in f:
        if lines.startswith('#'):
            pass
        else:
            lines = f.readlines()
            print(lines)

['Thompson\t85\t78\t92\t255\n', 'Jackson\t90  81\t76   247\n', 'Marie\t72\t88\t85\t245\n', 'Alice\t95\t91\t89\t275\n', 'John\t68\t75\t70\t213\n', 'Clarice\t82\t94\t88\t264']


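In this case, the column headers aren't shown in the output, only the content: the `for` loop has already read the header line by the time `f.readlines()` collects the remaining lines. To keep every non-comment line, we need to create an empty list and append to it.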

In [20]:
with open('test_file.txt') as f:
    table_lines = [] #  empty list
    for lines in f:
        if lines.startswith('#'):
            pass
        else:
            table_lines.append(lines) # .append() adds each non-comment line to the list table_lines

    print(table_lines[0])

student scoreMath	scoreBio	scoreGrammar	totalScore



Parsing the file this way ensures that every non-comment line, including the header, is stored in our list.
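
The same list can also be built with a list comprehension, and the header can then be separated from the data rows by unpacking. This is only a sketch: the sample file below is a shortened, hypothetical copy of the data written to a temporary directory.

```python
import os
import tempfile

# Hypothetical sample data that mimics the layout of test_file.txt
sample = (
    "# This is a text file for coding.\n"
    "# All data is separated by tabs\n"
    "student\tscoreMath\tscoreBio\tscoreGrammar\ttotalScore\n"
    "Thompson\t85\t78\t92\t255\n"
    "Jackson\t90\t81\t76\t247\n"
    "Marie\t72\t88\t85\t245\n"
)
path = os.path.join(tempfile.mkdtemp(), "table_demo.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write(sample)

# Keep every non-comment line, without its trailing newline
with open(path, encoding="utf-8") as f:
    table_lines = [line.rstrip("\n") for line in f if not line.startswith("#")]

# The first kept line is the header, the rest are data rows
header, *rows = table_lines

print(header.split("\t"))
print(len(rows))
```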

## Read files as a dictionary

In [33]:
# Dictionary of values
fruits = {"banana":12,"apple":9,"pineapple":15,"tomato":3}
print(fruits["banana"])
# add new value
fruits["orange"] = 8
print(fruits)
# delete a value
del fruits["banana"]
print(fruits)
# sorted() on a dict returns a sorted list of its keys (the dict itself is unchanged)
print(sorted(fruits))

12
{'banana': 12, 'apple': 9, 'pineapple': 15, 'tomato': 3, 'orange': 8}
{'apple': 9, 'pineapple': 15, 'tomato': 3, 'orange': 8}
['apple', 'orange', 'pineapple', 'tomato']
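
Since `sorted(fruits)` only returns the keys, sorting by value needs the key/value pairs from `.items()`. The sketch below reuses the `fruits` dictionary in its final state from the cell above, and also shows `.get()`, which returns a default instead of raising `KeyError` for a missing key.

```python
fruits = {"apple": 9, "pineapple": 15, "tomato": 3, "orange": 8}

# Sort the (key, value) pairs by value, largest first
by_count = sorted(fruits.items(), key=lambda item: item[1], reverse=True)
print(by_count)

# 'banana' was deleted earlier, so fruits["banana"] would raise KeyError;
# .get() returns the given default instead
print(fruits.get("banana", 0))
```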


Let's manipulate our data.

In [None]:
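# ASSOCIATE VALUES WITH DICT - INCOMPLETE
with open("test_file.txt") as f:
  for x in f:
    if x.startswith('#'):
      # NOTE: a line that starts with '#' can never start with 'student',
      # so this inner check never matches and the header row is still printed
      if x.startswith('student'):
        pass
    else:
      print(x) 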


student scoreMath	scoreBio	scoreGrammar	totalScore

Thompson	85	78	92	255

Jackson	90  81	76   247

Marie	72	88	85	245

Alice	95	91	89	275

John	68	75	70	213

Clarice	82	94	88	264
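
One possible way to complete the cell above is to skip both the comment lines and the header row, then store each student's scores under the student's name. This is a sketch under assumptions, not the original author's solution: the sample file is a shortened, hypothetical copy of the data, and the dictionary name `scores` is chosen for illustration.

```python
import os
import tempfile

# Hypothetical sample data that mimics the layout of test_file.txt
sample = (
    "# This is a text file for coding.\n"
    "student\tscoreMath\tscoreBio\tscoreGrammar\ttotalScore\n"
    "Thompson\t85\t78\t92\t255\n"
    "Jackson\t90\t81\t76\t247\n"
)
path = os.path.join(tempfile.mkdtemp(), "scores_demo.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write(sample)

scores = {}
with open(path, encoding="utf-8") as f:
    for line in f:
        # Skip blank lines, comment lines and the header row
        if not line.strip() or line.startswith("#") or line.startswith("student"):
            continue
        # First field is the name, the remaining fields are the scores
        name, *values = line.split()
        scores[name] = [int(v) for v in values]

print(scores["Jackson"])
print(sorted(scores))
```

Because `split()` without arguments splits on any run of whitespace, this also copes with rows that mix tabs and spaces, as long as no field contains a space itself.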
