---

# Lecture 4

Learning resources:

- Pre-lecture videos, these lecture notes, and lecture 04 itself, as usual

Additional resources:

- Working with files: 
  - slides 103 to 116
  - Video: [Socratica: Text files in Python](https://youtu.be/4mX0uPQFLDU?si=nKq2jHYSwJbyqmXp)

---

- [**1. Splitting a string**](#1.-Splitting-a-string)


- [**2. Reading and writing files**](#2.-Reading-and-writing-files)

    - [2.1. Writing a text file](#2.1.-Writing-a-text-file)
    
    - [2.2. Reading a text file](#2.2.-Reading-a-text-file)

---

## 1. Splitting a string

- We will often need to split a string into separate parts. This procedure is frequently used during text file processing. We can use the string method `split()` for this. Try e.g. `help("".split)` at the Python prompt for more info:

        help("".split)
        Help on built-in function split:

        split(sep=None, maxsplit=-1) method of builtins.str instance
            Return a list of the words in the string, using sep as the delimiter string.

            sep
              The delimiter according which to split the string.
              None (the default value) means split according to any whitespace,
              and discard empty strings from the result.
            maxsplit
              Maximum number of splits to do.
              -1 (the default value) means no limit.

Example:

Take a string and display each word on a separate line.

In [1]:
c = 'This is my string'

In [2]:
csp = c.split(" ")   # produces a list whose elements are the sub-strings
                     # found between the single spaces
                     # for this string, c.split(" ") gives the same result as c.split()

In [3]:
csp

['This', 'is', 'my', 'string']

In [4]:
for i in csp:     # print on separate lines
    print(i)

This
is
my
string


In [5]:
c.split('i')   # split around the character "i"

['Th', 's ', 's my str', 'ng']
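
The two calls `split()` and `split(" ")` only agree when the words are separated by exactly one space. A minimal sketch of the difference, and of the `maxsplit` argument from the help text above (the strings used here are made up for illustration):

```python
s = "a  b"   # note: two spaces between a and b

# no argument: split on any run of whitespace, empty strings are discarded
words_default = s.split()
print(words_default)    # ['a', 'b']

# explicit " ": every single space is a separator, so an empty string appears
words_space = s.split(" ")
print(words_space)      # ['a', '', 'b']

# maxsplit limits the number of splits, counted from the left
first_two = "a,b,c,d".split(",", 2)
print(first_two)        # ['a', 'b', 'c,d']
```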

Example:

Extract the temperature from the following string and convert it to a floating point number.

In [6]:
a = "Water freezes at 273.15 K"

In [7]:
asp = a.split()   # split to list of substrings

temperature_string = asp[3]   # select the 4th element

temperature = float(temperature_string)   # convert string to float

temperature

273.15

Equivalent short solution:

In [8]:
temperature = float(a.split()[3])

temperature

273.15

---

## 2. Reading and writing files


It is a common task to

- read some input data file
- do some calculation/filtering/processing with the data
- write some output data file with results


Python distinguishes between

- _text_ files (`'t'`)
- _binary_ files (`'b'`)

If we don't specify the file type, Python assumes we mean text files.

### 2.1. Writing a text file

To write data, we need to open the file with `'w'` mode:

    f = open('test.txt', 'w')

As noted above, text mode is the default. However, we can be explicit and say that we want to create a Text file for Writing:

    f = open('test.txt', 'tw')

- If the file exists, it will be overwritten with an empty file when the open command is executed.


- The file object `f` has a method `f.write` which takes a string as its input argument.


- We must close the file at the end of the writing process using `f.close()`. It is good practice to close the file as soon as possible.


Example:

In [9]:
f = open('test.txt', 'w')  # open file test.txt for writing

In [10]:
f.write("first line\nsecond line")  # returns number of characters

22

In [11]:
f.close()  # close the file

This creates a file `test.txt` with content:

    first line
    second line
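
Note that `f.write` does not add a newline by itself, and it only accepts strings. As a small sketch of what that means in practice, the following writes a list of numbers with one number per line and reads the file back to check (reading is covered in the next section). The file name `squares.txt` and the data are made up for this illustration:

```python
values = [1, 4, 9, 16]

f = open('squares.txt', 'w')
for v in values:
    f.write(str(v) + "\n")   # convert to string and add the newline ourselves
f.close()

# read the file back to check what was written
f = open('squares.txt', 'r')
content = f.read()
f.close()

print(content, end='')
```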


### 2.2. Reading a text file

We create a file object `f` for file reading using

    f = open('test.txt', 'r')

and have different ways of reading the data:


#### 2.2.1. `f.readlines()`

- returns a list of strings (each being one line)

In [12]:
f = open('test.txt', 'r')

In [13]:
lines = f.readlines()

In [14]:
f.close()

In [15]:
lines

['first line\n', 'second line']

In [16]:
type(lines)

list
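
The strings returned by `f.readlines()` keep their trailing newline character `"\n"`. If that is not wanted, one possible approach is to strip it with the string method `rstrip`. A minimal sketch, using a made-up file name `lines_demo.txt`:

```python
# create a small file to read from
f = open('lines_demo.txt', 'w')
f.write("first line\nsecond line\n")
f.close()

f = open('lines_demo.txt', 'r')
lines = f.readlines()
f.close()

clean = []
for line in lines:
    clean.append(line.rstrip("\n"))   # remove the newline at the end, if any

print(clean)    # ['first line', 'second line']
```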

#### 2.2.2. `f.read()`

- returns one long string for the whole file


In [17]:
f = open('test.txt', 'r')

In [18]:
data = f.read()

In [19]:
f.close()

In [20]:
data

'first line\nsecond line'

In [21]:
type(data)

str

#### 2.2.3. Use text file `f` as an iterable object

- process one line in each iteration (important for large files)

In [22]:
f = open('test.txt', 'r')

In [23]:
for line in f:
    print(line, end='')

first line
second line

In [24]:
f.close()
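
Iterating over the file is handy when each line can be processed on its own. As a sketch, the following adds up the numbers in a file with one number per line, without storing all lines in a list first. The file name `numbers_demo.txt` and its content are assumptions made for this example:

```python
# create a small file with one number per line
f = open('numbers_demo.txt', 'w')
f.write("1.5\n2.5\n4.0\n")
f.close()

total = 0.0
count = 0
f = open('numbers_demo.txt', 'r')
for line in f:
    total = total + float(line)   # float() ignores the trailing "\n"
    count = count + 1
f.close()

print(count, total)   # 3 8.0
```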

#### 2.2.4. Opening and _automatic_ file closing through context manager

Python provides _context managers_, which we use via the `with` statement. For file access:

In [25]:
with open('test.txt', 'r') as f:
    data = f.read()

data

'first line\nsecond line'

- If we use the file context manager, it closes the file automatically (when the control flow leaves the indented block).


- Once you are familiar with file access, we recommend you use this method.
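
The same pattern works for writing. The sketch below also inspects the attribute `f.closed` to show when the file gets closed; the file name `with_demo.txt` is made up for this example:

```python
with open('with_demo.txt', 'w') as f:
    f.write("first line\nsecond line")
    inside = f.closed    # False: the file is still open inside the block

after = f.closed         # True: closed automatically on leaving the block

with open('with_demo.txt', 'r') as f:
    data = f.read()

print(inside, after)     # False True
```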



#### 2.2.5. Exercise - shopping list

Given the following list stored in the file `Lecture4_shopping.txt`:

    bread      1  1.39
    tomatoes   6  0.26
    milk       3  1.45
    coffee     3  2.99


Write a program that computes the total cost per item and writes it to `Lecture4_shopping_cost.txt`:

    bread      1.39
    tomatoes   1.56
    milk       4.35
    coffee     8.97

Let us first create the `Lecture4_shopping.txt` file on disk using the IPython `%%file` magic. It creates the file, and writes everything in the cell into that file:

In [26]:
%%file Lecture4_shopping.txt
bread      1  1.39
tomatoes   6  0.26
milk       3  1.45
coffee     3  2.99

Overwriting Lecture4_shopping.txt


One solution is:

In [27]:
fin = open('Lecture4_shopping.txt', 'tr')    # INput File
lines = fin.readlines()             # read the list from the input file
fin.close()                         # close file as soon as possible

In [28]:
lines

['bread      1  1.39\n',
 'tomatoes   6  0.26\n',
 'milk       3  1.45\n',
 'coffee     3  2.99\n']

In [29]:
fout = open('Lecture4_shopping_cost.txt', 'tw') # OUTput File

for line in lines:
    words = line.split()
    itemname = words[0]
    number = int(words[1])
    cost = float(words[2])
    totalcost = number * cost
    fout.write("{:20} {}\n".format(itemname, totalcost))

fout.close()

which produces the output file `Lecture4_shopping_cost.txt` with the costs shown above (the columns are further apart because of the `{:20}` field width). We use the `!cat` command to display the file:

In [30]:
!cat Lecture4_shopping_cost.txt

bread                1.39
tomatoes             1.56
milk                 4.35
coffee               8.97
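
A possible variant of this solution, sketched under a few assumptions that go beyond the original: it uses context managers instead of explicit `close()` calls, processes the input line by line, and formats the cost with `{:.2f}` so that exactly two decimals are written even when floating point arithmetic produces a long tail (as in `0.1 + 0.2`, which gives `0.30000000000000004`). The file names `shopping_demo.txt` and `shopping_demo_cost.txt` are made up so that the files from the exercise are left untouched:

```python
# create the input file with the same data as in the exercise
with open('shopping_demo.txt', 'w') as f:
    f.write("bread      1  1.39\n")
    f.write("tomatoes   6  0.26\n")
    f.write("milk       3  1.45\n")
    f.write("coffee     3  2.99\n")

with open('shopping_demo.txt', 'r') as fin:
    with open('shopping_demo_cost.txt', 'w') as fout:
        for line in fin:
            words = line.split()
            itemname = words[0]
            number = int(words[1])
            cost = float(words[2])
            # name padded to 10 characters, cost with two decimals
            fout.write("{:10} {:.2f}\n".format(itemname, number * cost))

# read the result back to check it
with open('shopping_demo_cost.txt', 'r') as f:
    result = f.read()

print(result, end='')
```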


### 2.3. Binary files

Files that store _binary_ data are opened using the `'b'` flag (instead of `'t'` for Text):

    f = open('data.dat', 'br')

- For text files, we read and write `str` objects. For binary files, use the `bytes` type instead.


- By default, store data in text files. Text files are human readable (that's good) but usually take more disk space than binary files.


- Reading and writing binary data is outside the scope of this introductory module. If you need it, do learn about the `struct` module.
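
For illustration only, here is a minimal sketch of the `str` versus `bytes` difference: it writes three raw byte values to a file and reads them back. The file name `bytes_demo.dat` and the byte values are made up, and the `struct` module is not used here:

```python
payload = bytes([0, 127, 255])   # three raw byte values

f = open('bytes_demo.dat', 'bw')  # Binary file for Writing
f.write(payload)                  # write() expects bytes, not str
f.close()

f = open('bytes_demo.dat', 'br')  # Binary file for Reading
raw = f.read()                    # read() returns a bytes object
f.close()

print(type(raw))    # <class 'bytes'>
print(list(raw))    # [0, 127, 255]
```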

---