# Lecture 3 Statements, Files

- [3.1 Statements](#section1)
    - [3.1.1 if, else, elif Statements](#section2)
    - [3.1.2 for Loops](#section3)
    - [3.1.3 while Loops](#section4)
    - [3.1.4 break, continue, pass Statements](#section5)
- [3.2 Files](#section6)
- [3.3 Appendix: Python Interpreter](#section7)
- [References](#section8)

# 3.1 Statements <a id="section1"/>

Python code can be decomposed into **modules**, **statements**, **expressions**, and **objects**, as follows:
1. Programs are composed of modules.
2. Modules contain statements.
3. Statements contain expressions.
4. Expressions create and process objects.

> ***Modules*** are Python files that contain Python statements; the files can be used by other programs. Modules are also often called scripts.

> ***Statements*** are complete sections of code that perform an action, such as assignment statements, print statements, if statements, for statements and others. They are the smallest executable unit of code. 

> ***Expressions*** are parts of statements that return a value, such as variables, operators, or function calls. 

### 3.1.1 if, else, elif Statements <a id="section2"/>

In Python, `if` statements allow us to instruct the program to perform alternative actions, based on one or several tests. This provides a means for introducing logic in our code, and it can be interpreted as "if this case happens, then perform this action".

`if` is a ***compound statement,*** since it may contain other statements in its syntax. Also, the `if` statement is referred to as a ***conditional*** statement.

The `if` statement takes the form of an `if` test, which can be followed by one or more optional `elif` (else if) tests and a final optional `else` block. Each part has an associated block of nested statements, indented under a header line.

### Basic if Test

In its simplest form, the `if` statement has the following syntax:
    
    if test1:
        statements -> perform action 1

The `if` statement is used to perform a test and control whether or not the statements block of code is executed.
- The first line is a header line: it opens with the `if` keyword and ends with a colon (omitting the colon at the end of the `if` header is one of the most common mistakes by beginner programmers in Python). The result of the `if` test is interpreted as a *Boolean* value (i.e., *True* or *False*).
- The block of code is indented under the header and contains one or more statements that are executed if the test is True.

In [5]:
x = 105

if x > 100:
    print(x, 'is high')

105 is high


In [6]:
x = 105

if x < 50:
    print(x, 'is low')

In [7]:
y = 20

if y < 50:
    print(y, 'is low')

20 is low


In [8]:
if True:
    print('It is true!')

It is true!


In [9]:
if False:
    print('It is true!')

Therefore, the statements indented under an `if` line are executed only if the test in the header evaluates to a Boolean `True` value. As we mentioned earlier, any nonzero number or nonempty object (such as a string or a list) counts as `True`, while 0 or an empty object counts as `False`.

In [10]:
if 1:
    print('It is true!')
    
if 5:
    print('It is also true!')
    
if 0:
    print('It is not true!')

It is true!
It is also true!
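
The same rule can be checked directly with the built-in `bool()` function, which returns the truth value that an `if` test would use. The following is a minimal sketch; the variable names are chosen only for illustration.

```python
# bool() shows how Python interprets an object in an 'if' test
zero_value = bool(0)            # zero counts as False
nonzero_value = bool(-3.5)      # any nonzero number counts as True
empty_string = bool('')         # an empty string counts as False
nonempty_string = bool('abc')   # a nonempty string counts as True
empty_list = bool([])           # an empty list counts as False
list_with_zero = bool([0])      # a list holding one item is nonempty, so True

print(zero_value, nonzero_value)
print(empty_string, nonempty_string)
print(empty_list, list_with_zero)
```

Note that `[0]` counts as True even though its only item is 0, because the test looks at whether the list is empty, not at what it contains.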


### if - else Tests

Let's add in additional logic by using the `else` statement. 

Check this example. Since we assigned a Boolean `False` to the variable `x`, the `if x:` line returns False, and as a result the statement indented under `if` will not be executed. In the case when the `if` test is False, the code after `else` is executed.

In [11]:
x = False

if x:
    print('This is printed when x is True!')
else:
    print('This is printed when x is False')

This is printed when x is False


Here are more examples of using `else` to execute a block of code when an `if` test is *not* true. Note that:
- `else` is always attached to `if`, and it cannot be used as a standalone test.
- `else` lets us specify an alternative block to execute when the `if` test is False.

In [12]:
num = 43

if num > 100:
    print(num, 'is high')
else:
    print(num, 'is low')

43 is low


In [13]:
num = 134

if num > 100:
    print(num, 'is high')
else:
    print(num, 'is low')

134 is high


### if - elif - else Tests

We can use `elif` to specify additional tests, when we want to provide several alternative cases, each with its own test. `elif` is short for "else if" and it is always associated with an `if`. If there is an `else` block in the code, `elif` must come before `else`.

The general form is:

    if test1:
        statements -> perform action 1
    elif test2:
        statements -> perform action 2
    else: 
        statements -> perform action 3
        
The above compound statement can be interpreted as: if the case in test 1 happens, perform action 1. Else, if the case in test 2 happens, perform action 2. Or else, if none of the above cases happen, perform action 3.

That is, Python executes the statements nested under an `elif` test only if all the tests before it are False and its own test is True, and the statements under `else` are executed only when none of the preceding tests is True.

Both the `elif` and `else` parts are optional and may be omitted, and there may be more than one statement nested in each section.

The words `if`, `elif`, and `else` line up vertically, and have the same indentation.

In [14]:
z = 68
    
if z > 100:
    print(z, 'is high')
elif z > 50:
    print(z, 'is medium')
else:
    print(z, 'is low')

68 is medium


In [15]:
z = 30
    
if z > 100:
    print(z, 'is high')
elif z > 50:
    print(z, 'is medium')
else:
    print(z, 'is low')

30 is low


In [16]:
location = 'Bank'

if location == 'Auto Shop':
    print('Welcome to the Auto Shop!')
elif location == 'Bank':
    print('Welcome to the bank!')
else:
    print('Where are you?')

Welcome to the bank!


In the next example we use the `input()` function to enter text using the keyboard (press the `Enter` key to confirm it).

In [17]:
person = input("Enter your name: ")
# E.g., enter Joe

if person == 'Joe':
    print('Welcome Joe!')
else:
    print("Welcome, Joe will be with you shortly.")

Enter your name:  Joe


Welcome Joe!


### Boolean Operators to Make Complex Statements

We can create more complex conditional statements with Boolean operators like **and** and **or**, or use comparators like < and >.

In [18]:
age = 40

if age > 65 or age < 16:
    print(age, 'is outside the labor force')
else:
    print(age, 'is in the labor force')

40 is in the labor force
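
As a complement to the `or` example, here is a small sketch of the same labor-force test written with `and`, with a chained comparison, and with `not`. The names `status` and `in_range` are illustrative assumptions, not part of the lecture code.

```python
age = 40

# 'and' requires both tests to be True
if age >= 16 and age <= 65:
    status = 'in the labor force'
else:
    status = 'outside the labor force'
print(age, 'is', status)

# The same range test written as a chained comparison
in_range = 16 <= age <= 65
print(in_range)

# 'not' inverts a Boolean value
print(not in_range)
```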


We saw in the examples above that we can use `==` to check whether two objects have equal values. Similarly, we can use `!=` (an exclamation point followed by an equals sign) to check whether two objects are not equal. 

In [19]:
person = 'Jim'

if person != 'Joe':
    print("Welcome, what's your name?")
else:
    print('Welcome Joe!')

Welcome, what's your name?


### The if - else Ternary Expression

Python also has an ***if - else ternary expression*** with the following syntax:

    a if condition else b
    
In the above expression, first the condition is evaluated, and afterward either a or b is returned based on the Boolean value of the condition.

Let's reconsider the `if-else` example that we saw earlier.

In [20]:
num = 43

if num > 100:
    print(num, 'is high')
else:
    print(num, 'is low')

43 is low


The corresponding `if-else` ternary expression is as follows. It reduces the above 4 lines of code to 1 line. If the condition `num > 100` is True then `print(num, 'is high')` is executed, and if the condition is False then `print(num, 'is low')` is executed.

In [21]:
print(num, 'is high') if num > 100 else print(num, 'is low')

43 is low
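
Because the ternary form is an expression, it returns a value that can be assigned to a variable. A minimal sketch, in which the name `label` is an illustrative assumption:

```python
num = 43

# The ternary expression evaluates to one of the two strings
label = 'high' if num > 100 else 'low'

print(num, 'is', label)
```

Using the ternary expression to pick a value, and then calling `print` once, is often easier to read than placing two `print` calls inside the expression.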


### Handling Case Switch

If you’ve used languages like C, Pascal, or MATLAB, you may wonder whether Python has a *switch* or *case* statement that selects an action based on a variable’s value. Before version 3.10, which introduced the `match` statement, it did not. Multiway branching in Python is commonly coded as a series of `if`-`elif` tests.

An example is shown below. Note again that we can use as many `elif` statements as we want, but there can be only one `else` statement.

In [22]:
choice = 'ham'

if choice == 'spam': 
    print(2.25)
elif choice == 'ham':
    print(1.75)
elif choice == 'eggs':
    print(0.75)
elif choice == 'bacon':
    print(1.10)
else:
    print('Bad choice')

1.75


However, it may be more convenient to create a dictionary to handle case switching instead of `if`-`elif`-`else`, especially when there are many cases involved. 

In [23]:
branch = {'spam': 2.25, 'ham': 1.75, 'eggs': 0.75, 'bacon': 1.10}

choice = 'eggs'

if choice in branch:
    print(branch[choice])
else:
    print('Bad choice')

0.75
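
The membership test and the lookup can also be combined with the dictionary `get` method, which returns a default value when the key is missing. This is a sketch under the assumption that a string is an acceptable default; the key `'toast'` is a hypothetical choice that is not in the dictionary.

```python
branch = {'spam': 2.25, 'ham': 1.75, 'eggs': 0.75, 'bacon': 1.10}

choice = 'toast'   # hypothetical choice that is not in the dictionary

# get() returns the value for the key, or the default if the key is missing
price = branch.get(choice, 'Bad choice')
print(price)

# An existing key returns its stored value
ham_price = branch.get('ham', 'Bad choice')
print(ham_price)
```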


### Indentation Rules

Python uses indentation of statements under a header to group the statements in a nested block. In the figure below, there are 3 blocks of code, each having a header line. Note that Block 1 is nested under Block 0, and it is indented further to the right of Block 0. Then, Block 2 is nested under Block 1, and it is indented even further to the right of Block 1.

<img style="float: left; height:270px; width:auto" src="images/pic1.jpg">

The indentation in Python is used to detect block boundaries. All statements indented the same distance to the right belong to the same block of code. The block ends either when a less-indented line or the end of the file is encountered. 

Indentation may consist of any number of spaces, but it must be the same for all the statements in a single block. Four spaces or one tab per indentation level are commonly used, but there is no absolute standard for the number of spaces in indentation. However, it is not recommended to mix spaces and tabs for indentation within a block, because such indentation may look different in other editors, the code can be more difficult to edit, and Python 3 raises a `TabError` when the mixing is inconsistent. 

Look at this example. It contains three blocks: the first block (Block 0, `if x:`) is not indented at all, the second (Block 1, `y = 2`) is indented four spaces under Block 0, and the third (Block 2, `print('Block 2')`) is indented eight spaces.

In [24]:
x = 1
if x:
    y = 2
    if y:
        print('Block 2')
    print('Block 1')
print('Block 0')

Block 2
Block 1
Block 0


Several common mistakes with code indentation are shown below, which result in errors.

In [26]:
x = 1
  if x: # Error: first line indented, this line belongs to Block 0 and it shouldn't be indented
    y = 2
      if y:  # Error: unexpected indentation, this line should have the same indentation as 'y = 2'
        print('Block 2')
   print('Block 1') # Error: inconsistent indentation, this line is indented 3 spaces, and 'y = 2' is indented 4 spaces
print('Block 0')

IndentationError: unindent does not match any outer indentation level (<tokenize>, line 6)

To indent several lines of code by one tab, select the lines and then press either the `Tab` key or the keys `Ctrl` + `]`. To unindent several lines of code by one tab, press the keys `Ctrl` + `[`.

### Statement Delimiters: Lines and Continuations

Python normally expects each statement, including the header line of an `if` statement, to be written on a single line.

The code below produces an error because the `if` header spans two lines.

In [27]:
num = 80

if num > 20 and num > 50 and 
    num < 200 and num < 100:
    print('Medium number')

SyntaxError: invalid syntax (<ipython-input-27-b96ea93ac69a>, line 3)

When a statement is too long to fit on a single line, there are two ways to make it span multiple lines.

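The first one is to enclose the expression in a pair of parentheses (), square brackets [], or curly braces {}; any code inside such a pair can span multiple lines. For a condition, use parentheses: square brackets or curly braces would turn the test into a list or a set, which counts as True whenever it is nonempty. Continuation lines do not need to be indented at any level, but it is a good practice to align the lines vertically for readability.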

Examples are shown below.

In [28]:
num = 80

if (num > 20 and num > 50 and 
    num < 200 and num < 100):
    print('Medium number')

Medium number


In [29]:
# Note that no particular indentation is required for continuation lines enclosed in a pair of parentheses
num = 80
if (num > 20 and num > 50 and
  num < 200 and num < 100):
    print('Medium number')

Medium number
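
One caution when choosing the enclosing pair for a condition: parentheses only group the test, while curly braces build a set that contains the result of the test. The sketch below uses a hypothetical value `num = 500`, for which the test is False, to show the difference; the names `condition` and `as_set` are illustrative.

```python
num = 500   # hypothetical value that is not a medium number

# Parentheses only group the expression, so the result is the Boolean itself
condition = (num > 20 and num > 50 and
             num < 200 and num < 100)
print(condition)

# Curly braces create a set holding that Boolean; a nonempty set counts as True
as_set = {num > 20 and num > 50 and
          num < 200 and num < 100}
print(as_set, bool(as_set))
```

An `if` header written with the braces would therefore run its block for every value of `num`, which is why parentheses are the safer choice for multi-line conditions.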


In [30]:
L = ["Good",
"Bad",
"Ugly"]

L

['Good', 'Bad', 'Ugly']

Also, statements can span multiple lines if each continued line ends in a backslash. This is an older feature, however, and it is not generally recommended. One reason is that any empty space after the backslash results in an error.

In [31]:
num = 80
if num > 20 and num > 50 and \
    num < 200 and \
    num < 100:
    print('Medium number')

Medium number


The above line continuation rules apply to any other statements and expressions.

In [32]:
x = 1 + 2 + 3 \
+4
x

10

In [33]:
x = (1 + 2 + 3 
+4)
x

10

Note also that Python allows writing more than one noncompound statement (i.e., statements without nested statements) on the same line, separated by semicolons.

In [34]:
x = 5; print(x)

5


Python also allows writing the body of a compound statement (like `if`) on the same line as the header, provided the body contains just simple (noncompound) statements, i.e., statements that do not have nested blocks of their own.

In [35]:
if True: print('Something')

Something


### 3.1.2 for Loops <a id="section3"/>

A <code>for</code> loop iterates in Python; it goes through the items in a *sequence* or any other iterable object. Objects that we've learned about that we can iterate over include strings, lists, and tuples; dictionaries also allow iterating over their keys or values.

The general format for a <code>for</code> loop in Python is:

    for item in object:
        statements -> perform actions    

The variable name used for the `item` is completely up to the coder, so use your best judgment to choose a name that makes sense and that you will be able to understand when revisiting your code. This `item` can then be referenced inside your loop, for example if you wanted to use <code>if</code> statements to perform checks.

In [36]:
list1 = [1,2,3,4,5,6,7,8,9,10]

In [37]:
for num in list1:
    print(num)

1
2
3
4
5
6
7
8
9
10


Add an <code>if</code> statement to check for even numbers. 

In [38]:
for num in list1:
    if num % 2 == 0:
        print(num)

2
4
6
8
10


We could have also put an <code>else</code> statement.

In [39]:
for num in list1:
    if num % 2 == 0:
        print(num)
    else:
        print('Odd number')

Odd number
2
Odd number
4
Odd number
6
Odd number
8
Odd number
10


Another common idea with a <code>for</code> loop is keeping some sort of running tally across iterations. For example, let's create a <code>for</code> loop that sums up the list.

In [40]:
# Start sum at zero
list_sum = 0 

for num in list1:
    list_sum = list_sum + num

print(list_sum)

55


We could also have used <code>+=</code> to add each number to the sum. 

In [41]:
# Start sum at zero
list_sum = 0 

for num in list1:
    list_sum += num

print(list_sum)

55
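
For this particular tally, Python also provides the built-in `sum()` function, which adds up the items of an iterable. A minimal sketch for comparison with the loops above, where the name `total` is an illustrative assumption:

```python
list1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# sum() adds all the items, giving the same result as the running tally
total = sum(list1)
print(total)
```

The explicit loop remains useful when the tally needs extra logic, such as adding only the even numbers.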


We can also use <code>for</code> loops with strings and tuples, since they are sequences, so when we iterate through them we will be accessing each item in the sequence.

In [42]:
for letter in 'This is a string.':
    print(letter)

T
h
i
s
 
i
s
 
a
 
s
t
r
i
n
g
.


In [43]:
# loop through a dictionary
d = {'k1':1,'k2':2,'k3':3}

In [44]:
for item in d:
    print(item)

k1
k2
k3


Notice how this produces only the keys. 

We can also use the dictionary methods **.keys()**, **.values()** and **.items()**. In Python each of these methods returns a *dictionary view object*. It supports operations like membership test and iteration, but its contents are not independent of the original dictionary – it is only a view. 

In [45]:
# Create a dictionary view object
d.items()

dict_items([('k1', 1), ('k2', 2), ('k3', 3)])

Since the `.items()` method supports iteration and yields `(key, value)` tuples, we can use *tuple unpacking* in the `for` header to separate keys and values.

In [46]:
# Unpack each (key, value) tuple into k and v
for k,v in d.items():
    print(k)
    print(v) 

k1
1
k2
2
k3
3


If you want to obtain a true list of keys, values, or key/value tuples, you can *cast* the view as a list:

In [47]:
list(d.keys())

['k1', 'k2', 'k3']

In [48]:
# Compare to 
d.keys()

dict_keys(['k1', 'k2', 'k3'])

In [49]:
list(d.values())

[1, 2, 3]
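
The difference between a view and a list becomes visible when the dictionary changes afterwards. In the sketch below, the names `keys_view` and `keys_list` and the added key `'k4'` are illustrative assumptions.

```python
d = {'k1': 1, 'k2': 2, 'k3': 3}

keys_view = d.keys()         # a view: stays linked to the dictionary
keys_list = list(d.keys())   # a list: an independent copy of the keys

# Add a new key after creating both objects
d['k4'] = 4

print(keys_view)   # the view reflects the new key
print(keys_list)   # the list still holds only the original keys
```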

Another commonly used function is **range**, which quickly *generates* a sequence of integers, and it is often used with `for` loops.

In [50]:
string = 'abcde'

n = len(string)
for i in range(n): # i is the index
    print('Index', i, 'Letter', string[i])

Index 0 Letter a
Index 1 Letter b
Index 2 Letter c
Index 3 Letter d
Index 4 Letter e


In general, `range` accepts up to 3 arguments: a start, a stop, and a step size. Let's see some examples:

In [51]:
# To get a list out of range, we need to cast it to a list
# Parameters: start, stop, step size
list(range(0,101,10))

[0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

In [52]:
# Default step size is 1
# Notice that 11 is not included
list(range(0,11))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

In [53]:
# Default start is 0
list(range(6))

[0, 1, 2, 3, 4, 5]
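
Two further properties of `range` are sketched below: the step size can be negative to count down, and a `range` object supports `len()` and indexing without being cast to a list. The names `countdown` and `numbers` are illustrative assumptions.

```python
# A negative step counts down; the stop value (0) is still excluded
countdown = list(range(10, 0, -2))
print(countdown)

# A range object can be measured and indexed directly
numbers = range(5)
print(numbers)
print(len(numbers), numbers[2])
```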

The **enumerate** function is another useful function to use with `for` loops. It returns both the index and the item for each element in a sequence. 

In [54]:
string = 'abcde'

for i,letter in enumerate(string):
    print('Index', i,'Letter:', letter)

Index 0 Letter: a
Index 1 Letter: b
Index 2 Letter: c
Index 3 Letter: d
Index 4 Letter: e
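
`enumerate` also accepts an optional `start` argument, which sets the number assigned to the first item. The sketch below counts from 1 instead of 0; the list `numbered` is an illustrative assumption used only to collect the pairs.

```python
string = 'abcde'
numbered = []

# start=1 makes the count begin at 1; the items themselves are unchanged
for position, letter in enumerate(string, start=1):
    print('Position', position, 'Letter:', letter)
    numbered.append((position, letter))
```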


### 3.1.3 while Loops <a id="section4"/>

The <code>while</code> statement in Python is another way to perform iteration. A <code>while</code> statement will repeatedly execute a single statement or group of statements as long as the condition is true. The reason it is called a 'loop' is because the code statements are looped through over and over again until the condition is no longer met.

The general format of a while loop is:

    while test:
        statements -> perform action 1
    else:
        statements -> perform action 2

Let’s look at a few simple <code>while</code> loops in action. 

In [55]:
x = 0

while x < 5:
    print('x is currently: ', x)
    print('x is still less than 5, adding 1 to x')
    x+=1

x is currently:  0
x is still less than 5, adding 1 to x
x is currently:  1
x is still less than 5, adding 1 to x
x is currently:  2
x is still less than 5, adding 1 to x
x is currently:  3
x is still less than 5, adding 1 to x
x is currently:  4
x is still less than 5, adding 1 to x


We can also add an <code>else</code> statement:

In [56]:
x = 0

while x < 5:
    print('x is currently: ',x)
    print(' x is still less than 5, adding 1 to x')
    x+=1
else:
    print('All Done!')

x is currently:  0
 x is still less than 5, adding 1 to x
x is currently:  1
 x is still less than 5, adding 1 to x
x is currently:  2
 x is still less than 5, adding 1 to x
x is currently:  3
 x is still less than 5, adding 1 to x
x is currently:  4
 x is still less than 5, adding 1 to x
All Done!


In [None]:
# DO NOT RUN THIS CODE!!!! 
while True:
    print("I'm stuck in an infinite loop!")

A quick note: If you *did* run the above cell, click on the Kernel menu above to restart the kernel!

### 3.1.4 break, continue, pass Statements <a id="section5"/>

We can use <code>break</code>, <code>continue</code>, and <code>pass</code> statements in our loops to add additional functionality for various cases. 

With the <code>break</code> and <code>continue</code> statements, the general format of the <code>while</code> loop looks like this:

    while test1:
        statements -> perform action 1
        if test2:
            break         # Exit the 'while' loop now, skipping the loop's 'else' block (if present)
        if test3:
            continue      # Skip the rest of the loop body and go to the top of the 'while' loop now
    else:
        statements -> perform action 2   # Run these statements only if the loop ended without a 'break'

<code>break</code> and <code>continue</code> statements can appear anywhere inside the loop’s body, but they are usually nested in an <code>if</code> statement to perform an action based on some condition.

In [57]:
for letter in "string":
    if letter == "i":
        break # exit the 'for' loop now
    print(letter)

print("The end")

s
t
r
The end


In [58]:
for letter in "string":
    if letter == "i":
        continue # go to the top of the 'for' loop now (skip the commands following 'continue')
    print(letter)

print("The end")

s
t
r
n
g
The end


One more example follows with an `else` statement.

In [59]:
x = 0

while x < 5:
    print('x is currently: ', x)
    print(' x is still less than 5, adding 1 to x')
    x+=1
    if x==3:
        print('Breaking because x==3')
        break  # terminate the 'while' loop, go to the 'print('The end')' statement
    else:
        print('continuing...')
        
print('The end')

x is currently:  0
 x is still less than 5, adding 1 to x
continuing...
x is currently:  1
 x is still less than 5, adding 1 to x
continuing...
x is currently:  2
 x is still less than 5, adding 1 to x
Breaking because x==3
The end


In [60]:
x=0

while x < 5:
    print('x is currently: ', x)
    print(' x is still less than 5, adding 1 to x')
    x+=1
    if x==3:
        print('Continuing to the next step')
        continue  # Skip the rest of the loop body, and go back to the top of the while loop
    else:
        print('continuing...')
        
print('The end')

x is currently:  0
 x is still less than 5, adding 1 to x
continuing...
x is currently:  1
 x is still less than 5, adding 1 to x
continuing...
x is currently:  2
 x is still less than 5, adding 1 to x
Continuing to the next step
x is currently:  3
 x is still less than 5, adding 1 to x
continuing...
x is currently:  4
 x is still less than 5, adding 1 to x
continuing...
The end
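
The `else` block that belongs to the loop itself (rather than to an `if`) runs only when the loop finishes without reaching a `break`. The following sketch searches a list for a value; the list `numbers`, the value `target`, and the name `found_at` are hypothetical and chosen only for illustration.

```python
numbers = [3, 7, 12, 5]
target = 12        # hypothetical value to search for
found_at = None

i = 0
while i < len(numbers):
    if numbers[i] == target:
        found_at = i
        print('Found', target, 'at index', i)
        break      # leave the loop and skip the loop's 'else' block
    i += 1
else:
    # Runs only if the loop ended because the test became False
    print(target, 'is not in the list')
```

If `target` is changed to a value that is not in the list, the loop runs to completion and the message in the `else` block is printed instead.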


`pass` is generally used as a placeholder and it does not do anything. Suppose we have a loop or a function that is not implemented yet, but we want to implement it in the future. A loop or a function cannot have an empty body, because this would give an error. So, we use the `pass` statement to construct a body that does nothing.

In [61]:
# Pass is just a placeholder for functionality to be added later
sequence = ['p', 'a', 's', 's']
for val in sequence:
    pass

In [62]:
# Pass can be used as a placeholder for a function or a class
def my_function(arguments):
    pass

class Example:
    pass

# 3.2 Files <a id="section6"/>

Python uses file objects to interact with external files on your computer. These files can be of any sort, such as a text file, Excel document, email, audio file, picture, etc. 

Python has a built-in `open` function that allows us to open files for reading and writing. The `open` function creates a Python file object, which serves as a link to the file residing on the computer. It allows transferring strings of data to and from the linked external file.

The `open` function takes two arguments: a filename and a processing mode. The `filename` is the name of the file; when reading, the file is assumed to be in the current working directory, and if that is not the case, the `filename` should also include the path to the file. The processing `mode` can be the string `'r'` to read the file (open for text input; this is the default when the mode is omitted), `'w'` to write the file (create and open for text output), or `'a'` to append text to the end of a file; adding `+` to the mode allows both reading and writing. Both the filename and mode arguments should be strings. 

```
afile = open(filename, mode)
```

### Writing to a File

For example, let's create a simple text file called `test.txt` having two lines of text. The `open` function in the example below will return an object named `myfile`, which has a `write` method for data transfer. 

The file `test.txt` will be saved in the current working directory.

In [63]:
# Open for text output: create an empty file
myfile = open('test.txt','w')

In [64]:
# Write a line of text: string
myfile.write('hello text file\n')
# Note that the write call returns the number of characters in the string

16

In [65]:
myfile.write('goodbye text file\n')

18

In [66]:
myfile.close()

Note also that we need to include the end-of-line terminator `\n` in the string, otherwise the next `write` command will continue the current line.

Now, click on the `test.txt` file in the Jupyter dashboard, to inspect if it looks as we expect.

Use caution when opening a file for writing with `w`, as it truncates the original, meaning that the existing contents of the file are erased. Let's try the following code.

In [67]:
myfile = open('test.txt','w')
myfile.write('This is a first line\n')
myfile.write('This is a second line\n')
myfile.close()

Now open the file `test.txt` and you will notice that it has been overwritten.

### Opening a File

Let's open the file `test.txt`.

In [68]:
# Open for text input: 'r' is default mode and it can be omitted
myfile = open('test.txt','r')

In [69]:
# Read the lines back one at a time
myfile.readline()

'This is a first line\n'

In [70]:
# Read the next line
myfile.readline()

'This is a second line\n'

In [71]:
# Empty string: end-of-file (EOF)
myfile.readline() 

''

In addition, using the `read` method we can read the entire file into a string all at once.

In [72]:
myfile = open('test.txt')
myfile.read()

'This is a first line\nThis is a second line\n'

Or, if we use `print`, the content will be displayed in a readable format without showing the `\n` characters.

In [73]:
myfile = open('test.txt')
print(myfile.read())

This is a first line
This is a second line



Also note that we can write the above cell as a single line:

In [74]:
print(open('test.txt').read())

This is a first line
This is a second line



One confusing thing about reading files is that if we try to read the same file object twice, we'll find out that it only gets read once:

In [75]:
myfile = open('test.txt')
myfile.read()

'This is a first line\nThis is a second line\n'

In [76]:
# What happens if we try to read the file again?
myfile.read()

''

This happens because file objects remember their position, and after we read the file the first time, the reading 'cursor' was at the end of the file, and there was nothing left to read.

We can reset the 'cursor' like this:

In [77]:
# Seek to the start of file (index 0)
myfile.seek(0)

0

In [78]:
# Now read again
myfile.read()

'This is a first line\nThis is a second line\n'

When you have finished using a file, it is always good practice to close it.

In [79]:
myfile.close()

You will also often see another code variant, where `open` is used within a `with` statement, like in the example shown below. One advantage of this approach is that the `with` statement automatically closes the file when the block ends, even if an error occurs inside the block. 

In [80]:
with open('test.txt', 'r') as myfile:
    data = myfile.read()
print(data)

This is a first line
This is a second line
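
The `with` statement works the same way for writing. The sketch below writes and then reads a file, and checks the file object's `closed` attribute after the block. To avoid touching your own files, it assumes a temporary folder created with the standard `tempfile` module; the file name `with_demo.txt` is hypothetical.

```python
import os
import tempfile

# A temporary folder keeps this sketch away from your own files
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'with_demo.txt')   # hypothetical file name

    # The file is closed automatically at the end of this block
    with open(path, 'w') as outfile:
        outfile.write('first line\n')
        outfile.write('second line\n')

    was_closed = outfile.closed
    print(was_closed)

    # Read the text back using a second 'with' block
    with open(path) as infile:
        text = infile.read()

print(text)
```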



Alternatively, to read files from other directories on your computer (instead of the current working directory), enter the entire file path.

For Windows, one option is to use double backslashes `\\` so that Python doesn't treat a backslash and the character after it as an escape sequence:

    myfile = open('C:\\Users\\YourUserName\\Desktop\\MyFolder\\test.txt')
    
For example, note that the `\n` escape sequence in the string below introduces an unwanted new line:

In [81]:
print('C:\some\name')

C:\some
ame


This is corrected by using double backslashes:

In [82]:
print('C:\\some\\name')

C:\some\name


For Mac OS and Linux, use forward slashes:

    myfile = open('/Users/YourUserName/MyFolder/test.txt')    
    
On Windows, `open` works with either forward slashes or backslashes, so either is fine.
However, a hard-coded absolute path written for a Windows machine will not work on a Unix machine, and vice versa. If you do use backslashes on Windows, a convenient alternative to doubling them is a `raw string` with single backslashes, as shown below.
    
    myfile = open(r'C:\Users\YourUserName\Desktop\MyFolder\test.txt')

The `raw string` form (use of `r` before the string) turns off escape sequences in strings. 

Note that `C:\Users\YourUserName\Desktop\MyFolder\test.txt` is an ***absolute path*** because it lists all directories on the disk `C:` to access the file `test.txt`. The path can also be a ***relative path***: for example, if the current working directory is `C:\Users\YourUserName\Desktop`, we can use `MyFolder\test.txt` as the `filename`, relative to the current working directory.
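
One way to avoid typing path separators by hand is the standard `pathlib` module, which joins path parts with the separator of the operating system the code runs on. This is a sketch of that alternative, not the approach used in the rest of this lecture; the folder and file names are the hypothetical ones from the text above, and no file is actually opened.

```python
from pathlib import Path

# Join hypothetical folder and file names with the separator of the current OS
relative_path = Path('MyFolder') / 'test.txt'
print(relative_path)

# Path.cwd() is the current working directory, so the result is an absolute path
absolute_path = Path.cwd() / relative_path
print(absolute_path.is_absolute())
print(absolute_path.name)
```

A `Path` object can be passed to `open` in place of a string, for example `open(relative_path)`.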

### Appending to a File

Passing the argument `'a'` as a processing mode opens the file and puts the pointer at the end for appending. `'a+'` allows us to both read and write to a file. If the file does not exist, one will be created.

In [83]:
myfile = open('test.txt','a+')
myfile.write('\nThis is text being appended to test.txt\n')
myfile.write('And another line here\n')

22

In [84]:
myfile.seek(0)
print(myfile.read())

This is a first line
This is a second line

This is text being appended to test.txt
And another line here



In [85]:
myfile.close()

### Iterating through a File

Calling `read()` loads the entire file into memory at once. Iterating over the file object with a `for` loop is often preferred with large files, because the file object created by `open` reads and returns one line on each loop iteration.

In [86]:
for line in open('test.txt'):
    print(line)

This is a first line

This is a second line



This is text being appended to test.txt

And another line here
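
The blank lines in the output above appear because each line read from the file still ends with `\n`, and `print` adds a newline of its own. A minimal sketch of one way to avoid this, using `rstrip('\n')` on each line; it assumes a temporary folder from the standard `tempfile` module and a hypothetical file name `iter_demo.txt`.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'iter_demo.txt')   # hypothetical file name

    # Create a small file to iterate over
    with open(path, 'w') as outfile:
        outfile.write('This is a first line\nThis is a second line\n')

    stripped_lines = []
    with open(path) as infile:
        for line in infile:
            # rstrip('\n') removes only the trailing end-of-line character
            stripped_lines.append(line.rstrip('\n'))

for line in stripped_lines:
    print(line)
```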



### Storing Python Objects in Files: Conversions

Let's next consider an example where multiple Python objects are written into a text file on multiple lines. The objects need to be converted to strings, as write methods do not do any automatic to-string formatting.

In [87]:
# Introduce numbers, string, dictionary, and list objects
S = 'Spam' 
X, Y, Z = 43, 44, 45 
D = {'a': 1, 'b': 2}
L = [1, 2, 3]

# Create output text file
F = open('datafile.txt', 'w') 
# Each line written to the file should end with \n
F.write(S + '\n') 
# Convert numbers to strings
F.write('%s,%s,%s\n' % (X, Y, Z)) 
# Convert and separate
F.write(str(L) + '\n' + str(D) + '\n') 
F.close()

Next, let's open the file and read it. 

Notice that the interactive displayed output gives the exact contents, while the print operation interprets embedded end-of-line characters to render a formatted display.

In [88]:
content = open('datafile.txt').read() 
# String display
content

"Spam\n43,44,45\n[1, 2, 3]\n{'a': 1, 'b': 2}\n"

In [89]:
# User-friendly display
print(content) 

Spam
43,44,45
[1, 2, 3]
{'a': 1, 'b': 2}



To translate the strings in the text file into Python objects, we now have to use other conversion tools.

For instance, `rstrip()` removes trailing whitespace, including the end-of-line character `\n`.

In [90]:
# Open the file again, this time using the object named F
F = open('datafile.txt') 
# Read the first line (see above)
line = F.readline() 
line

'Spam\n'

In [91]:
# Remove end-of-line
s = line.rstrip() 
s

'Spam'

The next line contains the string of numbers `'43,44,45\n'`, for which `split()` can be used to separate the numbers.

In [92]:
# Next line from file
line = F.readline() 
line

'43,44,45\n'

In [93]:
# Split on commas
parts = line.split(',')
parts

['43', '44', '45\n']

In [94]:
# int() converts to integer numbers
numbers = [int(P) for P in parts]
numbers

[43, 44, 45]

In [95]:
x, y, z = numbers
x, y, z

(43, 44, 45)

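To convert the list and dictionary we will use `eval()`, which treats a string as executable code containing a Python expression. Because `eval()` runs whatever code the string contains, it should only be used on text from a trusted source.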

In [96]:
line = F.readline()
line

'[1, 2, 3]\n'

In [97]:
line.rstrip()

'[1, 2, 3]'

In [98]:
l = eval(line)
l

[1, 2, 3]

In [99]:
type(l)

list

In [100]:
line = F.readline()
line

"{'a': 1, 'b': 2}\n"

In [101]:
d = eval(line)
d

{'a': 1, 'b': 2}

In [102]:
type(d)

dict
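
When the text contains only Python literals (numbers, strings, lists, dictionaries, and so on), the standard library function `ast.literal_eval` is a more restrictive alternative to `eval()`: it converts literals but refuses to run other code. The sketch below applies it to strings with the same contents as the lines read above; the variable names are illustrative assumptions.

```python
import ast

# Strings with the same contents as the lines read from the file
list_line = '[1, 2, 3]\n'
dict_line = "{'a': 1, 'b': 2}\n"

parsed_list = ast.literal_eval(list_line.strip())
parsed_dict = ast.literal_eval(dict_line.strip())
print(parsed_list, type(parsed_list))
print(parsed_dict, type(parsed_dict))

# Text that is not a literal is rejected instead of being executed
rejected = False
try:
    ast.literal_eval('print("hello")')
except ValueError:
    rejected = True
print('Rejected non-literal text:', rejected)
```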

Fortunately, there are simpler ways to write and read files in Python, which do not require the above conversion steps. Next, we will learn about `pickle` and `JSON`, and in another lecture we will learn how to use `pandas`.

### Storing Python Objects with pickle

The `pickle` module in Python's standard library allows storing almost any Python object directly in a file, without converting it to and from strings. To store the list `L` from above in a file, we can pickle it directly.

In [103]:
import pickle
# 'wb' opens the file for writing in binary mode, since the pickled content is not text
F = open('newdatafile.pkl', 'wb')
# Pickle any object to file
pickle.dump(L, F) 
F.close()

Then, to read the file and get the list we simply use pickle again (a.k.a. unpickling).

In [105]:
# Similarly, 'rb' opens the file for reading in binary mode
F = open('newdatafile.pkl', 'rb')
# Load any object from file
list1 = pickle.load(F) 
list1

[1, 2, 3]

In [106]:
F.close()

The `pickle` module performs what is known as *object serialization*—converting objects to and from strings of bytes.
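
As a small illustration of serialization, `pickle.dumps()` returns the byte string directly instead of writing it to a file. The sketch below uses a made-up dictionary and a temporary file name (both are assumptions for illustration) and shows a round trip with the `with` statement. Note that unpickling can execute arbitrary code, so `pickle.load()` should only be used on files from a trusted source.

```python
import os
import pickle
import tempfile

# Made-up sample object containing a string, a list, and a tuple
data = {'name': 'Spam', 'values': [1, 2, 3], 'point': (43, 44)}

# dumps() returns the serialized form as a bytes object
blob = pickle.dumps(data)
print(type(blob))

# Round trip through a file; the with statement closes the file automatically
path = os.path.join(tempfile.mkdtemp(), 'demo.pkl')
with open(path, 'wb') as f:
    pickle.dump(data, f)
with open(path, 'rb') as f:
    restored = pickle.load(f)

# The restored object is equal to the original, and the tuple is still a tuple
print(restored == data, type(restored['point']))
```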

### Storing Python Objects with JSON

`JSON` (JavaScript Object Notation) is a text-based data interchange format that allows stored data to be used across programming languages (unlike `pickle`, whose format is specific to Python). On the other hand, `JSON` does not support as broad a range of Python object types as `pickle`.

The following example translates the dictionary `D` from above into JSON format and saves it to a file, and then recreates the dictionary from the JSON format when it is loaded from the file.

In [107]:
import json
FJ = open('json_datafile.text', 'w')
# Store the object to file
json.dump(D, FJ) 
FJ.close()

In [108]:
# Load the object from file; the with statement closes the file afterwards
with open('json_datafile.text') as FJ:
    new_d = json.load(FJ)
new_d

{'a': 1, 'b': 2}
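
The narrower type support of JSON can be seen in a round trip through `json.dumps()` and `json.loads()`, which work with strings instead of files. The dictionary below is a made-up example: the tuple comes back as a list, the integer key comes back as a string, and a set cannot be stored at all.

```python
import json

# Made-up sample with a tuple value and an integer key
original = {'a': 1, 'point': (43, 44), 2: 'two'}

text = json.dumps(original)   # serialize to a JSON string
restored = json.loads(text)   # parse the JSON string back into Python objects
print(text)
print(restored)

# Sets have no JSON equivalent, so json.dumps() raises TypeError
try:
    json.dumps({'s': {1, 2, 3}})
    set_supported = True
except TypeError:
    set_supported = False
print(set_supported)
```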

# 3.3 Appendix (not required for quizzes and assignments) <a id="section7"/>

# Python Interpreter

The ***interpreter*** in Python is the program that executes other programs. When you run your programs in Python, the interpreter reads them and carries out the instructions they contain. In other words, the interpreter is the layer of software between your code and the hardware on your computer that executes it.

When you install Python on your computer, the Python interpreter will be part of the installation, either as an executable program, or as a set of linked libraries. Note that there are several different Python implementations, and depending on which one you have installed on your computer, the interpreter may be implemented as a C program, a set of Java classes, or in another programming language.

Understanding how the programs are executed in Python can be helpful for programmers. For instance, I saved the following simple file as *module1.py*.

<img style="float: left; height:180px;" src="images/pic2.jpg">

When I ran the file in the Command Prompt, Python executed it, and the output of the program was `Hello world!` and `10`.

<img style="float: left; height:240px;" src="images/pic3.jpg">

When we run programs in Python, the programs are first compiled into ***byte code***, and are afterward run by a  ***Python virtual machine (PVM)***, as shown in the figure below.

<img style="float: left; height:200px;" src="images/pic4.jpg">

***Byte code*** is a format into which the ***source code*** (the statements in the file) is compiled by the Python interpreter. Byte code is platform-independent (i.e., it can be run on Windows, Linux, MacOS), and it can be executed more quickly than source code, because the text of the statements has already been parsed and translated.
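
The compilation step can be observed from within Python itself. The sketch below is only an illustration: the two-line source string is made up (it is not the content of *module1.py*), the built-in `compile()` produces a code object, and the standard library module `dis` prints the byte code instructions in a readable form. The exact instructions shown depend on the Python version.

```python
import dis

# A made-up two-line program, given as a string of source code
source = "x = 4 + 6\nprint(x)\n"

# compile() translates the source code into a code object that holds the byte code
code_obj = compile(source, '<demo>', 'exec')
print(type(code_obj.co_code))   # the raw byte code is stored as bytes

# dis.dis() lists the byte code instructions (the output varies between Python versions)
dis.dis(code_obj)

# exec() runs the compiled code object; names it defines end up in the namespace dict
namespace = {}
exec(code_obj, namespace)
```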

The byte code is stored into a file with a *.pyc* extension, which stands for compiled .py file. The *.pyc* files are saved in a subdirectory named \_\_pycache\_\_ located in the same directory where the source file is saved.

For example, the directory where *module1.py* is saved on my computer is shown below, and the \_\_pycache\_\_ subdirectory was automatically created by Python.  

<img style="float: left; height:180px;" src="images/pic5.jpg">

Within the subdirectory is the byte code file named *module1.cpython-36*. The name indicates that the Python installation on my computer uses the CPython interpreter, and that the installed Python version is 3.6. Note that the file type is listed as PYC file, meaning that the file has a .pyc extension.

<img style="float: left; height:140px;" src="images/pic6.jpg">

Byte code is saved for speed optimization. The next time module1 is loaded, Python will skip the compilation step and directly load the saved .pyc byte code file. However, if the original source code file module1.py was modified, Python will re-compile it and update the byte code file. Similarly, if a different version of Python is installed, a new byte code file will be created that matches the current version of Python. Note that CPython saves byte code files only for modules that are imported; a file that is run directly as the main script is compiled in memory each time it is run.
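
The location and name of a byte code file can also be checked from code. The following sketch assumes a throwaway file named *module1.py* created in a temporary directory (not the file from the screenshots). It uses `importlib.util.cache_from_source()` to ask where the byte code file would be stored, and `py_compile.compile()` to create it explicitly without importing the module.

```python
import importlib.util
import os
import py_compile
import tempfile

# Create a throwaway source file in a temporary directory
src = os.path.join(tempfile.mkdtemp(), 'module1.py')
with open(src, 'w') as f:
    f.write("print('Hello world!')\n")

# Ask Python where the byte code file for this source file belongs
expected = importlib.util.cache_from_source(src)

# Compile the source file explicitly; the path of the byte code file is returned
pyc_path = py_compile.compile(src)

print(os.path.basename(os.path.dirname(pyc_path)))  # name of the cache subdirectory
print(os.path.basename(pyc_path))                   # includes the implementation and version tag
```

The version tag in the file name is what allows byte code files for several Python versions to coexist in the same subdirectory.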

The ***Python virtual machine*** (PVM) is the last part of the Python interpreter. The PVM executes the byte code instructions one by one, i.e., it is the component that runs the programs. The PVM is not a separate program, and it does not need to be installed separately: it is part of the Python installation. The PVM relies on an underlying software layer (the operating system) to allocate physical computing resources, such as processors, memory, and storage.

Python belongs to the group of **interpreted languages**, which are also called *scripting languages* (other languages in this group are Perl and Ruby). The Python interpreter reads the statements in source files and converts them into byte code, which is afterwards executed by the PVM. Conversely, C and C++ belong to the group of **compiled languages**. In these languages, a compiler converts the source files' statements into binary machine code, which is afterwards executed directly by the computer hardware. (Java falls in between: its compiler produces byte code for the Java virtual machine rather than machine code.) Note that byte code files are different from binary machine code files. Consequently, running Python programs is typically slower than running C or C++ programs, because the byte code is interpreted as it is executed. On the other hand, writing and testing Python programs is faster and easier than writing and testing programs in the compiled languages. (One last clarification: Python does compile source files, but the result is not binary machine code, and because of that Python is not considered a compiled language.)

As we mentioned earlier, there are several different implementations of the Python interpreter, such as CPython, Jython, IronPython, Stackless Python, and PyPy. CPython is the standard, original implementation of Python, Jython is a Python implementation targeted for integration with the Java programming language, IronPython was designed to allow Python programs to integrate with applications coded to work with Microsoft’s .NET Framework for Windows, etc.
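
To check which implementation and version is running on a given computer, the standard library modules `platform` and `sys` can be used. This is a small sketch, and the printed values depend on the installation (for example, `CPython` on the standard distribution).

```python
import platform
import sys

# Name of the implementation that is running this code, e.g. 'CPython' or 'PyPy'
implementation = platform.python_implementation()

# Version of the Python language, as a string such as '3.x.y'
version = platform.python_version()

# Tag used in the names of byte code files inside __pycache__
cache_tag = sys.implementation.cache_tag

print(implementation, version, cache_tag)
```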

# References <a id="section8"/>

1. Mark Lutz, "Learning Python," 5th edition, O'Reilly, 2013. ISBN: 978-1-449-35573-9.
2. Pierian Data Inc., "Complete Python 3 Bootcamp," codes available at: [https://github.com/Pierian-Data/Complete-Python-3-Bootcamp](https://github.com/Pierian-Data/Complete-Python-3-Bootcamp).