# Reading & Writing Files, the csv Module, and Lambda Expressions

### Table of Contents
1. [Reading Files](#Reading-Files)
2. [Writing and Appending Files](#Writing-and-Appending-Files)
3. [The csv Module](#The-csv-Module)
4. [Lambda Expressions](#Lambda-Expressions)

### Reading Files
[[back to top]](#Table-of-Contents)
[[documentation]](https://docs.python.org/3/tutorial/inputoutput.html#reading-and-writing-files)

To read or write files in Python, we use the built-in `open()` function. This function is most often called with two arguments: the filename and the mode. Some examples of modes include:

| Mode | Mode Description |
|:---:|------|
| 'r' | open for reading (default) |
| 'w' | open for writing, truncating the file first |
| 'x' | create a new file and open it for writing |
| 'a' | open for writing, appending to the end of the file if it exists |
| 'b' | binary mode |
| 't' | text mode (default) |
| '+' | open a disk file for updating (reading and writing) |


The mode argument is optional; if it is omitted, the default `'r'` (equivalent to `'rt'`, open for reading text) is assumed. Using `'r+'` opens the file for both reading and writing. For binary random access, the mode `'w+b'` opens and truncates the file to 0 bytes, while `'r+b'` opens the file without truncation. The `'x'` mode implies `'w'` and raises a `FileExistsError` if the file already exists.

Python distinguishes between files opened in binary and text modes, even when the underlying operating system doesn't. Normally, files are opened in text mode, which means that you read and write strings from and to the file, and those strings are encoded in a specific encoding. If the encoding is not specified, the default is platform dependent. Appending `'b'` to the mode opens the file in binary mode: the data is then read and written as bytes objects. This mode should be used for all files that don’t contain text.
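As a minimal sketch of the difference between text and binary mode, the cell below writes a short string and reads it back both ways. The file name `modes_demo.txt` and the temporary directory are assumptions made for illustration; they are not part of the sample files used in this notebook.

```python
import os
import tempfile

# hypothetical file in a temporary directory, used only for this sketch
path = os.path.join(tempfile.mkdtemp(), 'modes_demo.txt')

# text mode (the default): we write a str, encoded with the encoding we name
with open(path, 'w', encoding='utf-8') as f:
    f.write('café\n')

# text mode: read() returns a str decoded with the same encoding
with open(path, 'r', encoding='utf-8') as f:
    text = f.read()

# binary mode: read() returns the raw bytes, with no decoding
with open(path, 'rb') as f:
    raw = f.read()

print(type(text).__name__, type(raw).__name__)
print(len(text), len(raw))  # the byte count is larger because 'é' takes two bytes in UTF-8
```

Passing `encoding` explicitly avoids relying on the platform-dependent default mentioned above.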

Let's read a sample file.

In [1]:
# open the sample text file in read only mode
f = open('Sample_Text.txt', 'r')

In [2]:
# read the contents of the file
f.read()

'Hello, world!\nThis is my file. \nLine 3.'

In [3]:
# read the contents again
f.read()

''

Notice that when we read the file a second time, we get an empty string back. Imagine reading a book by following the text with your finger; this is similar to how Python reads files. In this example, the 'finger' is the file object's current position. When a file is opened for reading, the position is set to the start of the file, and each read advances it. Once `read()` has consumed the whole file, there is nothing left to return. We can see the current position with the `tell()` method and move it with the `seek()` method (seeking to `0` moves to the start of the file).

In [4]:
# give the file object's current position
f.tell()

41

In [5]:
# move the position to the start of the file
f.seek(0)

0

In [6]:
# check the position again
f.tell()

0

We can use the `.readline()` method to go through the file line-by-line. 

In [7]:
# read the first line of the file
f.readline()

'Hello, world!\n'

In [8]:
# read the second line
f.readline()

'This is my file. \n'

In [9]:
# return to the top of the file
f.seek(0)

0

We can also easily loop over the lines in a file.

In [10]:
for line in f:
    print(line, end='')

Hello, world!
This is my file. 
Line 3.

If we want to read all lines of the file, we can create a list of the lines. This is done by using either `list(f)` or `f.readlines()`. 

In [11]:
# read all lines of the file
f.seek(0)
f.readlines()

['Hello, world!\n', 'This is my file. \n', 'Line 3.']

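Before we move on, we need to close this file. If we don't, the file stays open until the file object is garbage-collected or the program exits, and on some platforms other programs may be unable to modify or delete it in the meantime.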

In [12]:
# close the file
f.close()

### Writing and Appending Files
[[back to top]](#Table-of-Contents)


Let's create a new file that we can write to. We call the `open()` function with the `'x'` mode to do this; it raises an error if the file already exists. The `'w'` mode also opens a file for writing, creating it if necessary and truncating it if it already exists, while `'a'` appends to the end of the file, creating it if necessary. When you write to a file, `write()` returns the number of characters written.

In [13]:
# create a file in write mode
f2 = open('Sample-File2.txt', 'x')

In [14]:
# write to a file
f2.write('My first line\n')

14

In [15]:
# close the file
f2.close()

While calling `open()` and `f.close()` explicitly works, there's a better way to work with files: the `with` keyword. This has the advantage that the file is properly closed after its suite finishes, even if an exception is raised on the way. It is also much shorter than writing the equivalent `try`-`finally` block.
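For comparison, here is a minimal sketch of the explicit form that a `with` statement replaces. The file name `finally_demo.txt` and the temporary directory are assumptions made for illustration.

```python
import os
import tempfile

# hypothetical file in a temporary directory, used only for this sketch
path = os.path.join(tempfile.mkdtemp(), 'finally_demo.txt')

# explicit form: close() goes in a finally clause so it runs even if write() fails
f = open(path, 'w')
try:
    f.write('written without a with statement\n')
finally:
    f.close()

# the closed attribute confirms that the file object has been closed
print(f.closed)
```

The `with` form expresses the same guarantee in two lines and is harder to get wrong.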

In [16]:
with open('Sample-File2.txt', 'r') as f3:
    read_lines = f3.readlines()

Now the file is automatically closed, and we can still reference the lines because they're stored in the `read_lines` variable.

In [17]:
read_lines

['My first line\n']
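To see how the `'x'`, `'a'`, and `'w'` modes differ, here is a small sketch that applies each of them to the same file. The file name `append_demo.txt` and the temporary directory are assumptions made for illustration.

```python
import os
import tempfile

# hypothetical file in a temporary directory, used only for this sketch
path = os.path.join(tempfile.mkdtemp(), 'append_demo.txt')

# 'x' creates the file; write() returns the number of characters written
with open(path, 'x') as f:
    count = f.write('first line\n')

# 'a' keeps the existing contents and adds to the end
with open(path, 'a') as f:
    f.write('second line\n')

with open(path) as f:
    after_append = f.read()

# 'w' truncates the file before writing, so the earlier lines are lost
with open(path, 'w') as f:
    f.write('replaced\n')

with open(path) as f:
    after_write = f.read()

# opening with 'x' again fails because the file now exists
try:
    open(path, 'x')
except FileExistsError:
    exists_error = True
else:
    exists_error = False

print(count, after_append.splitlines(), after_write.splitlines(), exists_error)
```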

### The `csv` Module
[[back to top]](#Table-of-Contents)
[[documentation]](https://docs.python.org/3/library/csv.html)

The csv module implements classes to read and write tabular data in CSV format. It allows programmers to say, “write this data in the format preferred by Excel,” or “read data from this file which was generated by Excel,” without knowing the precise details of the CSV format used by Excel. Programmers can also describe the CSV formats understood by other applications or define their own special-purpose CSV formats. The csv module’s reader and writer objects read and write sequences.

The `csv.reader()` function returns a reader object which will iterate over lines in the given csvfile. `csvfile` can be any object which supports the iterator protocol and returns a string each time its `__next__()` method is called — file objects and list objects are both suitable. If `csvfile` is a file object, it should be opened with `newline=''`. Each row read from the csv file is returned as a list of strings. No automatic data type conversion is performed unless the `QUOTE_NONNUMERIC` format option is specified (in which case unquoted fields are transformed into floats).

The `csv.writer()` function returns a writer object responsible for converting the user’s data into delimited strings on the given file-like object. `csvfile` can be any object with a `write()` method. If `csvfile` is a file object, it should be opened with `newline=''`. 
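The following sketch illustrates two points from the paragraphs above: `csv.reader()` accepts any iterable of strings (here a plain list), and `csv.writer()` accepts any object with a `write()` method (here an in-memory `io.StringIO` buffer). The sample rows are made up for illustration.

```python
import csv
import io

# a list of strings stands in for a file; the data is hypothetical
lines = ['"name","weight"', '"dog",12.5', '"cat",4']

# by default every field comes back as a string
default_rows = list(csv.reader(lines))

# with QUOTE_NONNUMERIC, unquoted fields are converted to floats
numeric_rows = list(csv.reader(lines, quoting=csv.QUOTE_NONNUMERIC))

print(default_rows)
print(numeric_rows)

# StringIO has a write() method, so it can receive the writer's output
buffer = io.StringIO()
csv.writer(buffer).writerow(['a', 1, 'b c'])
print(repr(buffer.getvalue()))
```

Note that the writer ends each row with `'\r\n'` by default, which is why real files should be opened with `newline=''`.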

In [18]:
# create some lists
list1 = list('abcdefghij')
list2 = list(range(1, 11))
list3 = ['dog', 'cat', 'mouse', 'elephant', 'horse', 
         'snake', 'bat', 'cow', 'frog', 'bird']

In [19]:
# write the three lists to a csv file
import csv
with open('sample_csv.csv', 'w', newline='') as csv1:
    writer = csv.writer(csv1)
    writer.writerow(list1)
    writer.writerow(list2)
    writer.writerow(list3)

In [20]:
# read the csv file we just wrote
with open('sample_csv.csv', 'r', newline='') as csv2:
    reader = csv.reader(csv2)
    for row in reader:
        print(row)

['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
['dog', 'cat', 'mouse', 'elephant', 'horse', 'snake', 'bat', 'cow', 'frog', 'bird']


In [21]:
# store the read data
with open('sample_csv.csv', 'r', newline='') as csv2:
    reader = csv.reader(csv2)
    saved_data = list(reader)

In [22]:
# show the saved data
saved_data

[['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'],
 ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10'],
 ['dog',
  'cat',
  'mouse',
  'elephant',
  'horse',
  'snake',
  'bat',
  'cow',
  'frog',
  'bird']]

This returns a list of lists, allowing us to work with the data like any other list.

In [23]:
# examine a particular element of the saved data
saved_data[0][2]

'c'

There are two very important keywords when dealing with csv files: `delimiter` and `lineterminator`. The `delimiter` keyword allows us to change the field separator from the default comma to something else (a tab, space, pipe, or another single character), and the `lineterminator` keyword sets the string the writer uses to end each row (the default is `'\r\n'`).

Let's do an example where we change the delimiter from a comma to a pipe. We have the `saved_data` list, so we can write this to a new csv object with a pipe delimiter. 

In [24]:
with open('my_pipe_csv.csv', 'w', newline='') as pipecsv:
    writer = csv.writer(pipecsv, delimiter='|')
    for item in saved_data:
        writer.writerow(item)

We now have a pipe-delimited csv file. 
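As a rough sketch of how both keywords behave, the cell below writes to an in-memory buffer instead of a file so the raw output is easy to inspect. The sample rows are made up for illustration.

```python
import csv
import io

rows = [['a', 'b', 'c'], ['1', '2', '3']]

# write with a pipe delimiter and a bare newline as the line terminator
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter='|', lineterminator='\n')
writer.writerows(rows)  # writerows() writes every row in one call
written = buffer.getvalue()
print(repr(written))

# the reader must be given the same delimiter to split the fields again
buffer.seek(0)
read_back = list(csv.reader(buffer, delimiter='|'))
print(read_back)

# with the default comma delimiter, each line is read as a single field
buffer.seek(0)
wrong = list(csv.reader(buffer))
print(wrong)
```

The reader recognises `'\r'` and `'\n'` as line endings regardless of `lineterminator`, so that keyword only matters when writing.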

### Lambda Expressions
[[back to top]](#Table-of-Contents)
[[documentation]](https://docs.python.org/3/tutorial/controlflow.html#lambda-expressions)

Lambda expressions, also called lambda functions, are small anonymous functions created with the `lambda` keyword. This function returns the sum of its two arguments: `lambda a, b: a+b`. Lambda functions can be used wherever function objects are required. They are syntactically restricted to a single expression. Semantically, they are just syntactic sugar for a normal function definition. Like nested function definitions, lambda functions can reference variables from the containing scope.
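A short sketch of the last point: a lambda can use a variable from the function that creates it. The names `make_incrementor` and `add_five` are chosen here for illustration.

```python
def make_incrementor(n):
    # the lambda refers to n from the enclosing function's scope
    return lambda x: x + n

add_five = make_incrementor(5)
print(add_five(10))  # 15

# a lambda can also be called directly, without giving it a name
total = (lambda a, b: a + b)(2, 3)
print(total)  # 5
```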

In [25]:
# create a normal function to add 1 to a number
def add_one(x):
    return x + 1

In [26]:
# use our function
add_one(5)

6

Now, let's make the same function using a lambda expression. 

In [27]:
# create our lambda expression
z = lambda x: x + 1

In [28]:
# use our lambda expression
z(8)

9

Let's check the types for both of these. 

In [29]:
print(type(add_one))
print(type(z))

<class 'function'>
<class 'function'>


They're both functions. Typically, we want to use the standard `def` keyword to define named functions, because it makes the code much more readable. Lambda expressions are typically passed as arguments to other functions, such as a sort key, when we have no need to keep the function for repeated use. Let's go over a use case like that.

In [30]:
# define a list of tuples
pairs = [(1, 'one'), (2, 'two'), (3, 'three'), (4, 'four')]

Tuples are compared element by element, so if we sort the pairs, they will be ordered by their first element. To demonstrate this, let's sort the pairs in reverse.

In [31]:
# sort the pairs in reverse
pairs.sort(reverse=True)
print(pairs)

[(4, 'four'), (3, 'three'), (2, 'two'), (1, 'one')]


Now what if we wanted to sort by the second value in each tuple? We can pass a lambda expression as the `key` argument, so that each pair is compared by its second value instead.

In [32]:
# sort the pairs by the second value (index = 1)
pairs.sort(key=lambda pair: pair[1])
print(pairs)

[(4, 'four'), (1, 'one'), (3, 'three'), (2, 'two')]
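The same `key` idea works with other built-ins. The sketch below uses `sorted()`, which returns a new list and leaves the original unchanged, and `max()`. It also shows `operator.itemgetter` as a standard-library alternative to the lambda; that alternative is an addition here, not something the cells above rely on.

```python
from operator import itemgetter

pairs = [(1, 'one'), (2, 'two'), (3, 'three'), (4, 'four')]

# sorted() returns a new list; pairs itself keeps its order
by_name = sorted(pairs, key=lambda pair: pair[1])
print(by_name)

# max() accepts a key too: here, the pair with the longest word
longest = max(pairs, key=lambda pair: len(pair[1]))
print(longest)

# itemgetter(1) builds a function equivalent to lambda pair: pair[1]
by_name_itemgetter = sorted(pairs, key=itemgetter(1))
print(by_name_itemgetter == by_name)
```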
