# Reading data from files
So how do you read from a file? As mentioned before, files are organized sequentially, i.e. they consist of consecutive
lines. The `for` loop is well suited to processing sequences, so you can iterate over the lines of a file as
follows:

In [1]:
# open file
with open("lorem_ipsum.txt", "r") as file:
    # read file line by line and output the lines
    for line in file:
        print(line)

Lorem ipsum dolor sit amet, consectetur adipiscing elit,

sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.

Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris

nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in

reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla

pariatur. Excepteur sint occaecat cupidatat non proident, sunt in

culpa qui officia deserunt mollit anim id est laborum.


If you compare the output of the program with the content of the file (e.g. in a text editor), you will notice that blank
lines have been added to the output. What is the reason for this?  
Each line in the text file ends with a line break `\n`. It is only visible indirectly, because the text
continues on the next line. On output, the function `print()` appends another line break, hence the blank line.
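
To make the hidden line break visible, the built-in function `repr()` can be used. The following is only a small sketch: `io.StringIO` and the two sample lines are assumptions that stand in for a real opened text file.

```python
import io

# io.StringIO behaves like an opened text file (stand-in for illustration)
sample = io.StringIO("first line\nsecond line\n")

shown = []
for line in sample:
    # repr() shows the line break as the escape sequence \n
    shown.append(repr(line))
    print(repr(line))
```

Each printed line ends with `\n` inside the quotes, which is the line break that `print()` would otherwise double.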

You can correct this behaviour in several ways. One way is to set the `end` parameter of the `print()` function to an
empty string: `end=""`.  
Another way is to *strip* the line first. Strings have a method `.strip()` for this. It removes spaces, tabs and line
breaks at the beginning and at the end of a string. `.strip()` is often used when processing form input, to prevent a
leading space from changing the input. With an optional argument, you can also specify which characters should be removed.  
Alternatively, `.lstrip()` or `.rstrip()` can be used. These remove characters only at the left or only at the right end
of the string.
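
The differences between these variants are easiest to see on a single string. This is a minimal sketch; the sample string and the variable names are made up for illustration.

```python
line = "  Lorem ipsum  \n"

stripped = line.strip()            # whitespace removed on both sides
left_only = line.lstrip()          # only leading whitespace removed
right_only = line.rstrip()         # only trailing whitespace removed
newline_only = line.rstrip("\n")   # only the trailing line break removed
custom = "xxhixx".strip("x")       # the argument names the characters to remove

# end="" stops print() from adding a second line break
print(line, end="")
```

`.rstrip("\n")` is a reasonable choice when leading or trailing spaces are part of the data and only the line break should go.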

In [2]:
# Open file
with open("lorem_ipsum.txt", "r") as file:
    # read file line by line, strip surrounding whitespace and output the lines
    for line in file:
        line = line.strip()
        print(line)

Lorem ipsum dolor sit amet, consectetur adipiscing elit,
sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris
nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in
reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla
pariatur. Excepteur sint occaecat cupidatat non proident, sunt in
culpa qui officia deserunt mollit anim id est laborum.


## Output the contents of a file twice
In the following program, the `for` loop is run twice. What does the output look like? Why?

In [3]:
# open file
with open("lorem_ipsum.txt", "r") as file:
    # read file line by line and print the lines
    print("First round")
    for line in file:
        line = line.strip()
        print(line)

    # read file line by line and print the lines
    print("Second round")
    for line in file:
        line = line.strip()
        print(line)

First round
Lorem ipsum dolor sit amet, consectetur adipiscing elit,
sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris
nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in
reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla
pariatur. Excepteur sint occaecat cupidatat non proident, sunt in
culpa qui officia deserunt mollit anim id est laborum.
Second round


When reading a file, the "read cursor" or "read pointer" is moved character by character through the file. If the *read
pointer* arrives at the end of the file and is **not** reset or set to another position, it cannot continue reading,
because the file ends there. This is why the second loop produces no output. To reposition the *read cursor*, the method
`.seek()` can be used. However, this is beyond the scope of the course.
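
For the curious, here is a minimal sketch of how `.seek(0)` moves the read pointer back to the start of the file. The temporary file and its two lines are assumptions, created only so that the example runs on its own.

```python
import os
import tempfile

# Create a small throwaway file so the sketch is self-contained
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("line one\nline two\n")
    path = tmp.name

first_round = []
exhausted = []
second_round = []

with open(path, "r") as file:
    for line in file:
        first_round.append(line.strip())

    # The read pointer is now at the end of the file: nothing is left to read
    for line in file:
        exhausted.append(line.strip())

    # Move the read pointer back to the beginning of the file
    file.seek(0)
    for line in file:
        second_round.append(line.strip())

os.remove(path)  # clean up the throwaway file
print(first_round, exhausted, second_round)
```

The list `exhausted` stays empty, while `second_round` contains the same lines as `first_round`.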

## Read a file into a list in one go
The line breaks may be superfluous, for example if they only exist because a printed page has a limited width.
In this case, it may make sense to read the entire text "in one go" instead of iterating over the lines with a loop. The
method `.readlines()` is useful for this. The result is a list with **one** entry per line, and each entry keeps its trailing line break `\n`.

In [4]:
# Open file
with open("lorem_ipsum.txt", "r") as file:
    # read all lines in one go; the result is a list of strings
    lines = file.readlines()
    print(lines)

['Lorem ipsum dolor sit amet, consectetur adipiscing elit,\n', 'sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.\n', 'Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris\n', 'nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in\n', 'reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla\n', 'pariatur. Excepteur sint occaecat cupidatat non proident, sunt in\n', 'culpa qui officia deserunt mollit anim id est laborum.']
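
If the text is wanted as one single string rather than a list, the method `.read()` can be used instead. The sketch below compares the two approaches; the temporary file and its content are assumptions made only so that the example is self-contained.

```python
import os
import tempfile

# Create a small throwaway file so the sketch is self-contained
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("Lorem ipsum dolor\nsit amet")
    path = tmp.name

with open(path, "r") as file:
    text = file.read()          # the whole file as one string

with open(path, "r") as file:
    lines = file.readlines()    # a list with one string per line

os.remove(path)  # clean up the throwaway file

# Replace the line breaks with spaces to get one continuous line of text
one_line = text.replace("\n", " ")

# .splitlines() splits the string into lines without the line breaks
clean_lines = text.splitlines()

print(one_line)
print(clean_lines)
```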


# Exercise 1:
In the file `numbers2.txt` there is one number per line. Read the file and sum up the numbers. Output your result.

In [5]:
with open("numbers2.txt", "r") as file:
    sum_file = 0
    for line in file:
        sum_file += int(line.strip())
    print(sum_file)

5050
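
The solution above assumes that every line contains a number. If a file may also contain blank lines, `int("")` would raise a `ValueError`. The following sketch skips such lines; the temporary file and its content are assumptions for illustration and are not the contents of `numbers2.txt`.

```python
import os
import tempfile

# Create a small throwaway file that contains a blank line
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("1\n2\n\n3\n")
    path = tmp.name

total = 0
with open(path, "r") as file:
    for line in file:
        line = line.strip()
        # Only convert lines that are not empty after stripping
        if line != "":
            total += int(line)

os.remove(path)  # clean up the throwaway file
print(total)
```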


In [9]:
sum_file = 0
with open("numbers2.txt", "r") as file:
    for line in file:
        line = line.strip()
        sum_file += int(line)
        print(line, sum_file)

0 0
1 1
2 3
3 6
4 10
5 15
6 21
7 28
8 36
9 45
10 55
11 66
12 78
13 91
14 105
15 120
16 136
17 153
18 171
19 190
20 210
21 231
22 253
23 276
24 300
25 325
26 351
27 378
28 406
29 435
30 465
31 496
32 528
33 561
34 595
35 630
36 666
37 703
38 741
39 780
40 820
41 861
42 903
43 946
44 990
45 1035
46 1081
47 1128
48 1176
49 1225
50 1275
51 1326
52 1378
53 1431
54 1485
55 1540
56 1596
57 1653
58 1711
59 1770
60 1830
61 1891
62 1953
63 2016
64 2080
65 2145
66 2211
67 2278
68 2346
69 2415
70 2485
71 2556
72 2628
73 2701
74 2775
75 2850
76 2926
77 3003
78 3081
79 3160
80 3240
81 3321
82 3403
83 3486
84 3570
85 3655
86 3741
87 3828
88 3916
89 4005
90 4095
91 4186
92 4278
93 4371
94 4465
95 4560
96 4656
97 4753
98 4851
99 4950
100 5050
