## 5. Input and output in Python

A computer works on the *EVA principle* (German: **E**ingabe, **V**erarbeitung, **A**usgabe; in English *input-process-output*, IPO), i.e. it receives data via an input device, processes it and delivers the result to its surroundings via an output device. Examples of I/O operations are:

Input:

- Entering characters using the keyboard
- Reading files stored on peripheral storage (hard drive, CD, memory stick, etc.)

Output:

- Displaying texts and numbers on the screen in a console window
- Writing to files stored on peripheral storage

In Python, all of these operations are carried out via *file objects*, referred to below as `file` objects.

### 5.1 Files

#### 5.1.1 What is a `file`?

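A `file` is a sequence of *bits*. A group of 8 bits (an *octet*) is called a *byte*. Like all data in memory, a file is therefore stored in *binary* form.

These patterns of zeros and ones can be interpreted in different ways depending on the context. To interpret a bit pattern $b_{n-1} \dots b_1 b_0$ as a non-negative integer $z$, you can use the following formula, where $b_i$ is the bit at position $i$ (counted from the right, starting at 0) and $n$ is the number of bits: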

$$ z = \sum_{i=0}^{n-1} 2^i \;b_i$$
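
The formula can be tried out directly. The following is a minimal sketch; the helper name `bits_to_int` is made up for this illustration and is not part of Python.

```python
def bits_to_int(bits: str) -> int:
    """Interpret a string of '0'/'1' characters as a non-negative integer."""
    # The rightmost character is b_0, so the string is walked in reverse.
    return sum(2**i * int(b) for i, b in enumerate(reversed(bits)))

z = bits_to_int("01000001")
print(z)
print(z == int("01000001", 2))   # the built-in int() does the same conversion
```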

In [None]:
# TODO example binary number


The same pattern can also stand for a character. In this case, translation is done using a table.

![](https://upload.wikimedia.org/wikipedia/commons/1/1b/ASCII-Table-wide.svg)

Source: https://upload.wikimedia.org/wikipedia/commons/1/1b/ASCII-Table-wide.svg
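
As a small sketch of the table lookup, the built-in functions `chr()` and `ord()` translate between numbers and characters. The bit pattern used here is chosen arbitrarily for illustration.

```python
pattern = 0b01000001          # the bit pattern 0100 0001

print(pattern)                # interpreted as an integer
print(chr(pattern))           # interpreted as a character via the table
print(ord("A"))               # the reverse direction: character -> number
print(bytes([72, 105]).decode("ascii"))   # several bytes interpreted as text
```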

In [None]:
# TODO example characters


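A `file` object can (in principle) be of any length. Its end is not marked by a special character inside the file: the operating system keeps track of the file's length, and Python signals the *end of file* (EOF) by returning an empty result from a read operation. (The ASCII character `0x04`, *end of transmission*, is the control character produced by Ctrl+D, which Unix terminals interpret as the end of keyboard input; it is not stored in files.)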

#### 5.1.2 Create, read and close a `file` object

A `file` object is created by calling the built-in function `open()`. The general syntax is `open(filename, mode="r")`, where:

- `filename`: Path as a `string`, consisting of two parts: the name of the directory (e.g. */python/programs/*) and the file name (e.g. *textfile.txt*)
- `mode`: describes in which mode the file should be opened. Default = `"r"`

|`mode`|Explanation|
|:---|:---|
|`"r"`| The file is opened for reading only. It must already exist.|
|`"w"`| The file is opened for writing only. If a file with the same name already exists, its length is set to zero, i.e. it is **overwritten**; otherwise a new file is created.|
|`"a"`| "append": The file is opened for writing only. If a file with the same name already exists, new data is **appended** to its end; otherwise a new file is created.|
|`"r+"`| The file is opened for reading and writing. It must already exist.|
|`"w+"`| The file is opened for reading and writing. If a file with the same name already exists, it is **overwritten**.|
|`"a+"`| The file is opened for reading and writing. If a file with the same name already exists, new data is **appended** to its end.|

Files are opened in *text mode* by default. Adding a `"b"` (e.g. `"rb"`, `"wb"`, `"r+b"`, `"w+b"`) opens them in *binary mode*, i.e. their contents are returned as `bytes` objects.
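
The difference between the modes can be seen in a small experiment. This is only a sketch: the file name `modes_demo.txt` is hypothetical, and a temporary directory is used so that nothing is left behind on disk.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "modes_demo.txt")   # hypothetical file name

    f = open(path, "w")       # creates the file (or empties an existing one)
    f.write("first\n")
    f.close()

    f = open(path, "a")       # appends to the existing content
    f.write("second\n")
    f.close()

    f = open(path, "r")       # text mode: read() returns a str
    after_append = f.read()
    f.close()

    f = open(path, "w")       # "w" again: the old content is discarded
    f.write("third\n")
    f.close()

    f = open(path, "rb")      # binary mode: read() returns bytes
    after_overwrite = f.read()
    f.close()

print(repr(after_append))
print(repr(after_overwrite))
```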

Note on `filename`: Absolute paths are inherently inflexible, so relative paths should be preferred. If possible, design your programs so that the files they need are in subfolders that are easy to reach from the main program!

The `read(size)` method is used for reading. `size: int` is an optional argument. If `size` is omitted or negative, the entire `file` is read and returned; otherwise at most `size` characters (in text mode) or bytes (in binary mode) are read.

The official [Python documentation](https://docs.python.org/3/tutorial/inputoutput.html#methods-of-file-objects) comments laconically on this:

> *It's your problem if the file is twice as large as your machine's memory.*
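
The behaviour of `read()` can be sketched without a file on disk. The example assumes an in-memory text file (`io.StringIO`) as a stand-in; the text content is invented for illustration.

```python
import io

# io.StringIO is an in-memory text file, used here so that the sketch runs
# without any file on disk.
f = io.StringIO("Hello file!\nSecond line\n")

first = f.read(5)      # at most 5 characters
rest = f.read()        # everything from the current position to the end
at_end = f.read()      # nothing left: an empty string signals the end of the file

print(repr(first))
print(repr(rest))
print(repr(at_end))
f.close()
```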

In [None]:
# The file "LICENSE_PYTHON.txt" must be in the "daten" folder,
# which is on the same level as the workbook
daten = open("daten/LICENSE_PYTHON.txt")
#print(daten.read(10))
#print(daten.read())
# print("Towards the End")
# print(daten.read())

Text files often need to be processed line by line. The method `readline(size)` exists for this. `size` specifies the maximum amount of data to be read. If this argument is omitted, Python reads *the entire line*, including the line break at its end.

The method `readlines()` is similar. It reads the *entire file*, but splits it at the line breaks and packs the parts into a `list`.

In Python, a line break is represented by the character `\n`.
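
A minimal sketch of both methods, again assuming an in-memory text file (`io.StringIO`) with invented content as a stand-in for a file on disk:

```python
import io

text = "line one\nline two\nline three\n"

f = io.StringIO(text)          # in-memory stand-in for an opened text file
first = f.readline()           # reads up to and including the first "\n"
part = f.readline(4)           # at most 4 characters of the next line
remaining = f.readlines()      # the rest, split into a list of lines
f.close()

print(repr(first))
print(repr(part))
print(remaining)
```

Note that `readlines()` continues at the current cursor position, so the list starts with the unread remainder of the second line.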

In [None]:
print(daten.readline())
# TODO readlines()


Remarks:

- Different operating systems use different line endings (`\n` on Linux and macOS, `\r\n` on Windows, `\r` on classic Mac OS)

- In text mode, Python works **platform-independently** because it converts the platform-specific line endings to `\n` when reading and back again when writing.

- `file` objects are *iterators*. That's why you can only run through them once, unless the cursor is moved back (see 5.1.5).

In [None]:
# Because file objects are iterators, they can be traversed with a
# for loop
daten = open("daten/LICENSE_PYTHON.txt")
for zeile in daten:
    print(zeile, end="")   # each line already ends with "\n"


Every file that was opened should also be closed again; closing also writes any data that is still buffered to disk. You can check whether a file is closed with the `closed` attribute and close it with the `close()` method.
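
The following sketch shows the `closed` attribute before and after `close()`, and what happens on a read attempt afterwards. An in-memory text file (`io.StringIO`) is assumed as a stand-in for a file on disk.

```python
import io

f = io.StringIO("some data")   # in-memory stand-in for an opened file
closed_before = f.closed       # still open
f.close()
closed_after = f.closed        # now closed

try:
    f.read()                   # reading from a closed file is an error
    read_failed = False
except ValueError:
    read_failed = True

print(closed_before, closed_after, read_failed)
```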

In [None]:
# TODO close file
#daten.closed
daten.close()
daten.closed
#daten.read()

#### 5.1.3 Writing a `file` object

While a file is open in a write mode(!) (and `closed == False`), you can write text to it using the `write(text)` method. `text` must be of type `str`. The return value is the number of characters written.

In [None]:
# Create a new file eigeneDatei.txt and fill it with write()
eigeneDatei = open("daten/eigeneDatei.txt", "w")

eigeneDatei.write("Sehr geehrte Damen und Herren\n")
eigeneDatei.write("bla"*5 + "\n")
eigeneDatei.write("Das war jetzt mit write()\n")

eigeneDatei.close()     # save and close

Alternatively, you can also write to a file using the `print()` function. Recall the syntax of this function and its optional parameter `file`. This means that the code cell above can also be written as:

In [None]:
# Create a new file eigeneDatei.txt and fill it with print()
eigeneDatei = open("daten/eigeneDatei.txt", "w")

print("Sehr geehrte Damen und Herren", file=eigeneDatei)
print("bla"*5, file=eigeneDatei)
print("Das war jetzt mit print()", file=eigeneDatei)

eigeneDatei.close()     # save and close

In [None]:
# Reading the self-created file "eigeneDatei.txt"
eigeneDatei = open("daten/eigeneDatei.txt", "r")
text = eigeneDatei.read()
eigeneDatei.close()
print(text)

#### 5.1.4 Saving without closing

You can write the data collected so far to disk without closing the file. This is done by the `flush()` method, which empties Python's internal write buffer.
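
The effect of `flush()` can be made visible by looking at the file through a second, independent file object. This is a sketch under assumptions: the file name `flush_demo.txt` is hypothetical, and whether the first read really returns nothing depends on the buffer size, so no fixed result is claimed for it.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "flush_demo.txt")   # hypothetical file name

    writer = open(path, "w")
    writer.write("Hello darkness\n")   # usually still in Python's buffer

    reader = open(path)                # second, independent file object
    before = reader.read()             # typically empty at this point
    reader.close()

    writer.flush()                     # hand the buffered text to the operating system

    reader = open(path)
    after = reader.read()              # now the line is visible
    reader.close()

    writer.close()

print(repr(before))
print(repr(after))
```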

In [None]:
# Example of flush()
flush = open("daten/flush.txt", "w")
flush.write("Hello darkness\n")
#flush.flush()

In [None]:
# further processing of the file
print(flush.closed)
flush.write("my old friend\n")
flush.close()
flush.closed

In [None]:
# Read the file
song = open("daten/flush.txt")
print(song.read())
song.close()

#### 5.1.5 Move and determine file cursor

Sometimes it is necessary to know or change the current cursor position in the file. The methods `tell()` and `seek(offset, whence)` are used for this.

`tell()` returns the current position of the cursor, measured from the beginning of the file.

`seek(offset, whence)` moves the cursor, where:

- `offset`: Required. Specifies by how many bytes the cursor is moved relative to the reference point.
- `whence`: Optional reference point (it cannot be called `from`, because `from` is a reserved word in Python). Can only be 0, 1 or 2:
    - `0`: Default. *offset* refers to the beginning of the file.
    - `1`: *offset* refers to the *current position*.
    - `2`: *offset* refers to the *end of file*

In text mode, only positions relative to the beginning of the file are allowed (with the exception of `seek(0, 2)`, which jumps to the very end); the values 1 and 2 with other offsets require binary mode.

In [None]:
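# Length of a file, determined with the file cursor
def laengeFile(datei):
    position = datei.tell()   # remember the current cursor position
    datei.seek(0, 2)          # move the cursor to the end of the file
    laenge = datei.tell()     # position at the end = length in bytes
    datei.seek(position)      # restore the original position
    return laenge

# binary mode, so that cursor positions are plain byte offsets
with open("daten/LICENSE_PYTHON.txt", "rb") as datei:
    print(f"{datei.name} ist {laengeFile(datei)} Bytes lang")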

Unlike *classic iterators*, a `file` object can therefore be *rewound* with `seek(0)` and iterated through as often as desired.
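
A minimal sketch of rewinding, assuming an in-memory text file (`io.StringIO`) with invented content as a stand-in for a file on disk:

```python
import io

f = io.StringIO("a\nb\n")            # in-memory stand-in for an opened text file

first_pass = [line for line in f]    # consumes the iterator
second_pass = [line for line in f]   # nothing left, so this list is empty

f.seek(0)                            # move the cursor back to the start
third_pass = [line for line in f]    # the lines can be read again
f.close()

print(first_pass, second_pass, third_pass)
```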

#### 5.1.6 The `with` statement

We have seen that for a program with `file` objects to run error-free, it is essential to open files in a controlled manner and close them again after processing, even if something goes wrong in between. This is basically the idea of the `with` statement, whose syntax is structured as follows:

```python
with object as name:
    instruction block
```

or related to files:

```python
with open(filename, mode) as file_object:
    file_instructionblock
```

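Objects that can be used in a `with` statement (*context managers*) have two special methods:

- `__enter__()`: called on entering the block; for a `file` object it returns the file opened by `open()`
- `__exit__()`: called on leaving the block; for a `file` object it closes the file

The `with` statement guarantees that the `__exit__()` method will *definitely* be called if the `__enter__()` method was previously executed successfully.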
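
The guarantee can be checked with a small experiment in which an error is raised on purpose inside the block. This is a sketch only; the file name `with_demo.txt` and the simulated `RuntimeError` are made up for the illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "with_demo.txt")    # hypothetical file name

    with open(path, "w") as f:
        f.write("written inside the with block\n")
    closed_after_block = f.closed        # the file was closed on leaving the block

    try:
        with open(path) as g:
            raise RuntimeError("something went wrong")   # simulated error
    except RuntimeError:
        pass
    closed_after_error = g.closed        # closed as well, despite the exception

print(closed_after_block, closed_after_error)
```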

In [None]:
# TODO reading with with statement


In [None]:
# TODO reading with with statement alternatively


#### 5.1.7 The pseudofiles `sys.stdin` and `sys.stdout`

Until now, we have conveniently queried keyboard input using the `input()` function. In the background, this input arrives via a *pseudofile* named `sys.stdin`, also called *standard input*. It is essentially a `file` object with limited access options: it can only be read (for example with the `readline()` method), not written to.

```python
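# input() function programmed manually
# only works in the Python interactive shell
import sys
print("Input:", end=" ", flush=True)            # flush so the prompt appears immediately
user_input = sys.stdin.readline().rstrip("\n")  # readline() keeps the trailing newline
print("Your input was:", user_input)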
```

For output there is the pseudofile `sys.stdout`, which is also the default value of the `file` parameter of the built-in `print()` function. It behaves like a `file` object that cannot be read, but only written to using `write()`.
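
The role of the `file` parameter can be sketched by sending the same `print()` call once to `sys.stdout` and once to another target. The in-memory text file (`io.StringIO`) named `buffer` is an assumption made for this illustration.

```python
import io
import sys

buffer = io.StringIO()                   # in-memory text file as an alternative target

print("Hallo Welt!", file=sys.stdout)    # equivalent to print("Hallo Welt!")
print("Hallo Welt!", file=buffer)        # same call, different target
n = sys.stdout.write("Hallo Welt!\n")    # write() returns the number of characters

captured = buffer.getvalue()             # what print() wrote into the buffer
print(repr(captured), n)
```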

In [None]:
# Example of outputting a string without print()
import sys
sys.stdout.write("Hallo Welt!")

### 5.2 Saving objects with `pickle`

A *pickle* is a vegetable preserved in brine or vinegar, and *to pickle* means to preserve food in this way. The Python-specific module `pickle` provides functions for saving program data so that it survives the end (or abort) of the program. In computer science, this type of data is called *persistent data*.

The following object types can be saved using the pickle mechanism:

- Numbers
- Strings
- Functions (module-level functions, which are stored by name only)
- Arbitrary sequences (e.g. lists and tuples)
- Dictionaries
- Instances of self-defined classes (*later lecture*)

#### 5.2.1 Save and load functions

To **save** an object, the module provides the function `dump(object, file)`, where:

- `object`: any object from the list above
- `file`: a `file` object opened in binary write mode (`"wb"`)

In [None]:
# TODO example to save
telefonbuch = [("Tim", "85675"), ("Jenny", "233325"), ("Max", "89923")]

To **load** an object, there is the function `load(file)`, where:

- `file`: a `file` object opened in binary read mode (`"rb"`)
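
Saving and loading can be sketched together as a round trip. The file name `phone_book.pkl` and the variable names are hypothetical; a temporary directory is used so that nothing is left behind on disk.

```python
import os
import pickle
import tempfile

phone_book = [("Tim", "85675"), ("Jenny", "233325"), ("Max", "89923")]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "phone_book.pkl")   # hypothetical file name

    with open(path, "wb") as f:       # binary write mode
        pickle.dump(phone_book, f)

    with open(path, "rb") as f:       # binary read mode
        restored = pickle.load(f)

print(restored == phone_book)   # same content?
print(restored is phone_book)   # same object?
```

The loaded list has the same content as the original, but it is a new, independent object.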

In [None]:
# TODO example for loading

#### 5.2.2 How does `pickle` work?

When saving, a byte string (a `bytes` object) is created from the object passed in, in which the structure of the object is encoded. In particular, the data type of each elementary value is stored as well.

Creating a byte sequence that represents a data structure is also called *serialization*.

In [None]:
# Return value of the dumps() function
import pickle
s = pickle.dumps(telefonbuch)
s

When loading, the object is reconstructed from the byte string, provided that the byte string is a valid representation in the sense of the `pickle` protocol.

**Warning**:
> The `pickle` module **is not secure**. Only unpickle data you trust.

### 5.3 CSV files

CSV stands for *comma separated values* and is a data format that we are particularly familiar with from Excel. It is suitable for representing tables whose columns are separated from each other by a delimiter (usually a comma).

With `csv.reader` from the Python module `csv`, files in CSV format can be split into rows, each of which is returned as a list of strings.
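
A minimal sketch of `csv.reader`, assuming an in-memory text file (`io.StringIO`) with invented data as a stand-in for a CSV file on disk:

```python
import csv
import io

# In-memory stand-in for a CSV file; the data is invented for illustration.
csv_text = "name,number\nTim,85675\nJenny,233325\n"

with io.StringIO(csv_text) as f:
    reader = csv.reader(f)        # the default delimiter is the comma
    header = next(reader)         # first row: the column headings
    rows = list(reader)           # remaining rows, each a list of strings

print(header)
print(rows)
```

With a real file, the `csv` documentation recommends opening it with `open(path, newline="")`. All values arrive as strings, so numbers have to be converted explicitly if needed.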

In [None]:
# TODO read file with CSV reader

In [None]:
# TODO read out column headings