Geo Data Science with Python,
Prof. Susanna Werth, VT Geosciences

---
### Reading - Lecture 6
 
# Tuples, Sets, Files

This lesson discusses the Python object types **Tuples**, **Sets** and **Files** as well as how to **read and write** files. 

### Content

- A. <a href='#intro'> Introduction </a>
- B. <a href='#tuples'> Tuples </a>
- C. <a href='#sets'> Sets </a>
- D. <a href='#files'> Files </a>

### Sources
Part B of this lesson was inspired by the page [Understanding Tuples in Python 3](https://www.digitalocean.com/community/tutorials/understanding-tuples-in-python-3) of the [Digital Ocean Community](https://www.digitalocean.com/community). Part D of this notebook is an adaptation of the lesson [How To Handle Plain Text Files in Python 3](https://www.digitalocean.com/community/tutorials/how-to-handle-plain-text-files-in-python-3) of the [Digital Ocean Community](https://www.digitalocean.com/community).


---

<a id='intro'></a>
# A. Introduction

Table 1 summarizes the categories and literals of composite data containers in Python. It gives a first idea of how *Tuples* and *Sets* differ from *Lists* and *Dictionaries*.

Table 1. *Literals of Composite Complex Data Containers in Python*

| Container      | Category   |  Denotation  |  Feature    |  Examples  | 
| ---------:    | :---------:|  :---------: | :---------: | :--------: |
| **List**       | Sequence   | `[ ]`    | mutable     | `['a', 'b', 'c']` |
| **Dictionary** |  Mapping   | `{key: value, ...}` | mutable | `{'Alice': '38', 'Beth': '35'}` |
| **Tuple**      | Sequence   | `( )`    | immutable   | `('a', 'b', 'c')` |
| **Set**        | Set        | `set()`   |  mutable   | `set(['a', 'c', 'e'])` |

Tuples are immutable sequences that are used for grouping data. Sets are mutable, but they form a completely new category, which derives from mathematical set theory. Below, both object types are introduced separately.

---
<a id='tuples'></a>
# B. Tuples


A tuple is an immutable (unchangeable), ordered sequence of elements. A tuple is similar to a list, except that it cannot be changed. Because tuples are immutable, their values cannot be modified, and they do not come with methods that would change the object in place. 

A tuple in Python looks like this:

In [1]:
coral = ('blue coral', 'staghorn coral', 'pillar coral', 'elkhorn coral')

Tuples have values between parentheses `( )` separated by commas `,`. Each element or value that is inside of a tuple is called an item.

Empty tuples will appear as `coral = ()`, but tuples with even one value must use a comma, as in `coral = ('blue coral',)`. This is because, if a tuple were assigned with only one number and no comma, the literal could not be distinguished from an integer in parentheses:

In [2]:
a=(40)
b=(40,)
type(a), type(b)

(int, tuple)

Now, if we `print()` the `coral` tuple defined above, we’ll receive the following output, with the tuple still enclosed in parentheses:

In [3]:
print(coral)

('blue coral', 'staghorn coral', 'pillar coral', 'elkhorn coral')


When thinking about Python tuples and other data structures that are types of collections, it is useful to consider all the different collections you have on your computer: your assortment of files, your song playlists, your browser bookmarks, your emails, the collection of videos you can access on a streaming service, and more.

Tuples are similar to lists, but their values can’t be modified. Because of this, when you use tuples in your code, you are conveying to others that you don’t intend for there to be changes to that sequence of values. Additionally, because the values do not change, your code can be optimized through the use of tuples in Python, as the code will be slightly faster for tuples than for lists.

### Indexing Tuples

As an ordered sequence of elements, each item in a tuple can be called individually through indexing. Indexing is a sequence operation and works the same way as for lists. 

Each item corresponds to an index number, which is an integer value, starting with the index number `0`. Because each item in a Python tuple has a corresponding index number, we can call a discrete item of the tuple by referring to its index number:

`coral[0]` calls the item `'blue coral'`       <br>
`coral[1]` calls the item  `'staghorn coral'`  <br>
`coral[2]` calls the item `'pillar coral'`     <br>
`coral[3]` calls the item `'elkhorn coral'`

If we call the tuple `coral` with any index number greater than `3`, the index is out of range and therefore not valid. In that case, an `IndexError` is raised. Try it by uncommenting the line below:

In [4]:
# print(coral[22])
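
The following is a minimal sketch of what happens for an out-of-range index, assuming the same `coral` tuple as above. The `try`/`except` wrapper and the variable names `message` and `last_valid_index` are additions for illustration only, so that the cell runs without stopping at the error.

```python
coral = ('blue coral', 'staghorn coral', 'pillar coral', 'elkhorn coral')

try:
    print(coral[22])                 # index 22 does not exist in a 4-item tuple
except IndexError as err:
    message = str(err)               # keep the error text for inspection
    print('IndexError:', message)

last_valid_index = len(coral) - 1    # largest valid positive index
print(last_valid_index)
```

The largest valid positive index is always `len(coral) - 1`, because counting starts at `0`.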

In addition to positive index numbers, we can also access items from the tuple with a negative index number, by counting backwards from the end of the tuple, starting at `-1`. This is especially useful if we have a long tuple and we want to pinpoint an item towards the end of a tuple.

For the same tuple coral, the negative index breakdown looks like this:


`coral[-4]` calls the item `'blue coral'`       <br>
`coral[-3]` calls the item  `'staghorn coral'`  <br>
`coral[-2]` calls the item `'pillar coral'`     <br>
`coral[-1]` calls the item `'elkhorn coral'`


### Further Sequence Operations with Tuples

Since tuples are sequences, most sequence operations that are relevant for lists are also relevant for tuples, as long as they do not mutate the sequence.

#### Slicing

Slices allow us to call multiple values by creating a range of index numbers separated by a colon `[x:y]`. To just print the middle items of coral, we can do so by creating a slice. Remember, the first index number is where the slice starts (inclusive), and the second index number is where the slice ends (exclusive), which is why the following example prints the items at position 1 and 2: 

In [5]:
coral = ('blue coral', 'staghorn coral', 'pillar coral', 'elkhorn coral')

print(coral[1:3])

('staghorn coral', 'pillar coral')


If we want to include either end of the tuple, we can omit one of the numbers in the slicing syntax. We can also use negative index numbers when slicing tuples, just like positive index numbers:

In [6]:
print(coral[:-1])

('blue coral', 'staghorn coral', 'pillar coral')


One last parameter that we can use with slicing of sequences, including tuples, is called **stride**, which refers to how many items to move forward after the first item is retrieved from the tuple. So far, we have omitted the stride parameter, and Python defaults to the stride of 1, so that every item between two index numbers is retrieved.

The syntax for this construction is `tuple[x:y:z]`, with z referring to stride. Let’s make a larger tuple, then slice it, and give the stride a value of 2. The example below prints only every second item from index `1` (inclusive) until index `11` (exclusive):

In [7]:
numbers = tuple(range(0,13))
print(numbers[1:11:2])

(1, 3, 5, 7, 9)


We can omit the first two parameters and use stride alone as a parameter with the syntax `tuple[::z]`. Here is an example that prints every third tuple item:

In [8]:
print(numbers[::3])

(0, 3, 6, 9, 12)


#### Concatenating and Multiplying Tuples

Operators can be used to concatenate or multiply tuples. Concatenation is done with the `+` operator, and multiplication is done with the `*` operator. 

We can concatenate string items in a tuple with other strings using the `+` operator:

In [9]:
print('This reef is made up of ' + coral[1])

This reef is made up of staghorn coral


We were able to concatenate the string item at index number `1` with the string `'This reef is made up of '`. We can also use the `+` operator to concatenate two or more tuples together.

In [10]:
coral = ('blue coral', 'staghorn coral', 'pillar coral', 'elkhorn coral')
kelp = ('wakame', 'alaria', 'deep-sea tangle', 'macrocystis')
coral_kelp = (coral + kelp)
print(coral_kelp)

('blue coral', 'staghorn coral', 'pillar coral', 'elkhorn coral', 'wakame', 'alaria', 'deep-sea tangle', 'macrocystis')


Because the `+` operator can concatenate, it can be used to combine tuples to form a new tuple, though it cannot modify an existing tuple.

The `*` operator can be used to multiply tuples. Perhaps you need to make copies of all the files in a directory onto a server or share a playlist with friends — in these cases you would need to multiply collections of data.

Let’s multiply the `coral` tuple by `2` and the kelp tuple by `3`, and assign those to new tuples:

In [11]:
multiplied_coral = coral * 2
multiplied_kelp = kelp * 3

In [12]:
print(multiplied_coral)

('blue coral', 'staghorn coral', 'pillar coral', 'elkhorn coral', 'blue coral', 'staghorn coral', 'pillar coral', 'elkhorn coral')


In [13]:
print(multiplied_kelp)

('wakame', 'alaria', 'deep-sea tangle', 'macrocystis', 'wakame', 'alaria', 'deep-sea tangle', 'macrocystis', 'wakame', 'alaria', 'deep-sea tangle', 'macrocystis')


By using the `*` operator we can replicate our tuples by the number of times we specify, creating new tuples based on the original data sequence.

### How Tuples Differ from Lists

The primary way in which tuples are different from lists is that they cannot be modified. This means that items cannot be added to or removed from tuples, and items cannot be replaced within tuples. (You can, however, concatenate two or more tuples to form a new tuple, as we have seen in the examples above.)

Let’s consider our coral tuple. Say we want to replace the item 'blue coral' with a different item called 'black coral'. If we try to change that output the same way we do with a list, ...

In [14]:
# coral[0] = 'black coral'

... if you uncomment and execute the code line above, you will receive a `TypeError`. This is because tuples cannot be modified.
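
As a sketch of the behavior described above, the cell below catches the `TypeError` and then shows one possible workaround: building a new tuple from a one-item tuple and a slice. The name `new_coral` is an assumption chosen for this illustration.

```python
coral = ('blue coral', 'staghorn coral', 'pillar coral', 'elkhorn coral')

try:
    coral[0] = 'black coral'         # item assignment is not supported by tuples
except TypeError as err:
    print('TypeError:', err)

# Instead of changing the tuple in place, build a new tuple by concatenation
new_coral = ('black coral',) + coral[1:]
print(new_coral)
print(coral)                         # the original tuple is unchanged
```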

#### Literals & Type Conversion

If we create a tuple and decide that what we really need is a list, we can convert it with `list()`. The function returns a new list; the `coral` tuple itself remains unchanged unless we assign the result back to `coral`:

In [15]:
list(coral)

['blue coral', 'staghorn coral', 'pillar coral', 'elkhorn coral']

We can see that the tuple was converted to a list because the parentheses changed to square brackets.

Likewise, we can convert lists to tuples with `tuple()`.

In [16]:
a_list = [25, 26, 19, 39]
tuple(a_list)         # converting a list into a tuple, 

(25, 26, 19, 39)

In [17]:
list(tuple(a_list))   # ... then back into a list

[25, 26, 19, 39]
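
To illustrate that type conversion creates a new object rather than changing the original, here is a short sketch. The name `coral_list` and the added item `'brain coral'` are hypothetical and only serve this example.

```python
coral = ('blue coral', 'staghorn coral', 'pillar coral', 'elkhorn coral')

coral_list = list(coral)             # new list with the same items
coral_list.append('brain coral')     # lists are mutable, so this works

print(type(coral), len(coral))             # still a tuple with 4 items
print(type(coral_list), len(coral_list))   # a list with 5 items
```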

#### Nesting
Similarly to lists and dictionaries, tuples can be nested. For example a tuple can be nested in a tuple:

In [18]:
at=(1,2,3,(1,2,3))
print(at)

(1, 2, 3, (1, 2, 3))


#### Tuples as Keys in Dictionaries
In contrast to lists, tuples can also be used as keys in dictionaries. The following cell creates a dictionary `dict1` with three entries, where the last one has a tuple as its key. Then the dictionary is printed to the screen. Study the example to see how tuples can be used as keys:

In [19]:
dict1 = {    5   : "number",
          "to"   : "string",
         (1,"a") : "tuple"}
dict1

{5: 'number', 'to': 'string', (1, 'a'): 'tuple'}
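
A geographic use of tuple keys could look like the sketch below, in which coordinate pairs serve as dictionary keys. The station names and coordinates are made up for illustration. The second part shows that a list in the same position is rejected.

```python
# Hypothetical station names keyed by (latitude, longitude) tuples
stations = {
    (37.23, -80.41): 'Station A',
    (38.03, -78.48): 'Station B',
}
print(stations[(37.23, -80.41)])     # look up a value by its tuple key

try:
    bad = {[37.23, -80.41]: 'Station A'}   # a list is mutable, hence not a valid key
except TypeError as err:
    key_error_message = str(err)
    print('TypeError:', key_error_message)
```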

#### List comprehensions
Tuples can be processed by *list comprehensions*. Try to generate a new list that is based on the tuple `a_tuple` and contains the square of each item. What object type does the operation return, a tuple or a list? And why is that so?

In [20]:
a_tuple = (15.6, 2.7, 3.9, 61.4, 32.9, 100.2, 55.3)

In [21]:
             # add your list comprehension on a_tuple here

### Tuple Built-in Functions & Methods

There are a few built-in functions that work with tuples. In general, most built-in functions that work with sequences also work with tuples. Of the sequence methods, however, only a few are available, because most methods alter a sequence in place and tuples are not mutable. Some examples are given in Table 2:

Table 2: *Built-in Tuple Functions & Methods*

| Built-in Functions & Methods | Description |
| :-: | :-: |
| len()               | Gives the total length of the tuple. |
| min()               | Returns item from the tuple with min value. |
| max()               | Returns item from the tuple with max value. |
| sum()               | Returns the sum of all items (numeric items only). |
| tuple(seq)          | Converts a sequence (e.g., a list) into a tuple. |
| .index()            | Returns the position of the first occurrence of a value. |
| .count()            | Returns the number of times a value appears. |

Experiment with these functions and methods using the tuples `numbers`, `coral` and `kelp` in the following cells.

In [22]:
a_list = [15.6, 2.7, 3.9, 61.4, 32.9, 100.2, 55.3]
a_tuple = tuple(a_list)      # converts the list into a tuple
print(a_tuple)

(15.6, 2.7, 3.9, 61.4, 32.9, 100.2, 55.3)


In [23]:
              # try the function len()

In [24]:
              # try the function min()

In [25]:
              # try the function max()

In [52]:
              # try the method .index()

In [27]:
              # try the method .count()

### Tuple Summary

Tuples are ordered, immutable collections of arbitrary objects with a fixed length.
Because a tuple is a sequence that cannot be modified, Python can process it somewhat faster than a list. When others collaborate with you on your code, your use of tuples conveys that you don’t intend for those sequences of values to be modified. Due to their immutability, tuples can also take over the role of a *constant* declaration. In contrast to lists, they can be used as dictionary keys, because keys must be immutable. This is helpful when building larger dictionaries in which a group of values serves as a key. 

Most sequence operations that work on strings and lists, work also on tuples: 
* Values/items are indexed by integers 
* Slicing, concatenating, repetition  
* List comprehensions are applicable
* Nesting
 
The following page provides a comprehensive overview of tuple operands and functions:
https://www.tutorialspoint.com/python3/python_tuples.htm    

---
<a id='sets'></a>
# C. Sets

### Introduction

As one of the most fundamental concepts in mathematics, set theory is a branch of mathematical logic that studies sets. Therein, a set is an "*unordered collection of unique and immutable objects that supports operations corresponding to* ***mathematical set theory***". 

<img src="./Image_SetExample.png" alt="Illustrating Sets." title="A Set" width="300" />
Figure 1: *A Set of Geometric Objects*

For example, the graphic above shows a collection of geometric objects, each distinctly different from the others. Such a collection of distinct objects is considered an object in its own right. 
Sets can be derived from data lists. For example, a list of numbers may contain repeating entries. However, we can generate a set from a list, if we select each item only once from the original list:

Table 3: *Example for a Set of a List of Numbers*

| Type / Member | 1 | 2 | 3 | 4 | 5 | 6 |
| -:            |:-:|:-:|:-:|:-:|:-:|:-:|
|List:          | 0 | 7 | 2 | 7 | 0 | 4 |
|Set:           | 7 | 0 | 4 | 2 |   |   |

Table 4: *Example for a Set of a List of Animals (Strings)*

| Type / Member | 1     | 2   |  3    |  4  |   5  |   6   |
| -:            | :-:   | :-: | :-:   | :-: |  :-: |  :-:  |
| List:         | dog   | cat | mouse | cat | duck | mouse |
| Set:          | mouse | dog | cat   | duck|      |       |



### Sets in Python

As the examples illustrate, sets are unordered and contain no duplicates. Hence, they are neither mappings nor sequences, but form a completely separate category of objects. Consequently, Python provides a dedicated object type for *Sets*. 

To generate a set from any sequence, the built-in function `set()` is used:


In [28]:
x = set('abcde')
print(x)

{'e', 'd', 'c', 'a', 'b'}


In [29]:
type(x)

set

Note that sets can only contain immutable (hashable) objects, although sets themselves are mutable. Therefore, sets can contain tuples and strings (which are immutable), but they cannot contain lists or dictionaries (because these are mutable). However, lists can be used to create a set, as illustrated by the examples in Tables 3 and 4 above.

In [30]:
numberList = [0, 7, 2, 7, 0, 4]
numberSet = set(numberList)
print(numberSet)

{0, 2, 4, 7}


In [31]:
set(['dog', 'cat', 'mouse', 'cat', 'duck', 'mouse'])

{'cat', 'dog', 'duck', 'mouse'}

But the following does not work (uncomment and try):

In [32]:
# set(['dog', 'cat', 'mouse', 'cat', 'duck', 'mouse',[1,2,3] ])
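
The sketch below shows the error raised by the cell above and one possible fix, namely converting the nested list into a tuple before building the set. The variable names `animals`, `error_message` and `animal_set` are assumptions made for this illustration.

```python
animals = ['dog', 'cat', 'mouse', 'cat', 'duck', 'mouse', [1, 2, 3]]

try:
    set(animals)                     # fails because the nested list is mutable
except TypeError as err:
    error_message = str(err)
    print('TypeError:', error_message)

# Converting the nested list to a tuple makes every item hashable
animals[-1] = tuple(animals[-1])
animal_set = set(animals)
print(len(animal_set))               # four animals plus one tuple
```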

### Set Theory Concepts in Python

The purpose of sets is to construct and manipulate unordered collections of unique elements, in order to support mathematical applications or to analyze complex data structures. For the set in Figure 1, we might be interested in finding geometric objects that have four corners and are green. For such an analysis, the set theory concept of *intersection* helps (Figure 2). 

<img src="./Image_SetTheoryOperation.png" alt="Illustrating Set Theory." title="Set Theory" width="300" />

Figure 2: *A Venn Diagram Illustrating the Intersection of Two Sets.* [Source: Wikipedia](https://en.wikipedia.org/wiki/Set_theory) 

In geographical data analysis, the concepts of sets and set theory can help to find spatial objects that match two conditions, for example:
* Combining two regions' animal species to get a collection
* Receiving coordinates of certain buildings in a certain region
* Finding a French style restaurant near a park

Major concepts in set theory are explained in Table 5.

Table 5: *Set Theory Concepts* [Source: Wikipedia](https://en.wikipedia.org/wiki/Set_theory) 

| Set Theory Concept | Description |
| :-: | :- |
| Membership | Set theory begins with a fundamental binary relation between an object o and a set A. If o is a member (or element) of A, the notation o ∈ A is used. |
| Subset / Superset | A derived binary relation between two sets is the subset relation, also called set inclusion. If all the members of set A are also members of set B, then A is a subset of B, denoted A ⊆ B. For example, {1, 2} is a subset of {1, 2, 3} , and so is {2} but {1, 4} is not. B is also called superset of A.|
| Union | Union of the sets A and B, denoted A ∪ B, is the set of all objects that are a member of A, or B, or both. The union of {1, 2, 3} and {2, 3, 4} is the set {1, 2, 3, 4}. |
| Intersection | Intersection of the sets A and B, denoted A ∩ B, is the set of all objects that are members of both A and B. The intersection of {1, 2, 3} and {2, 3, 4} is the set {2, 3}. |
| Difference | Set difference of U and A, denoted U \ A, is the set of all members of U that are not members of A. The set difference {1, 2, 3} \ {2, 3, 4} is {1} , while, conversely, the set difference {2, 3, 4} \ {1, 2, 3} is {4}. |



The following operators are used to apply these set theory concepts in Python:

In [33]:
x = set('abcde')
y = set('bdxyz')

In [34]:
'e' in x        # Membership of 'e' in set x      

True

In [35]:
x < y           # Proper subset: is x a subset of y (and not equal to y)?

False

In [36]:
x > y           # Proper superset: is x a superset of y (and not equal to y)?

False

In [37]:
x | y           # Union of x and y

{'a', 'b', 'c', 'd', 'e', 'x', 'y', 'z'}

In [38]:
x & y           # Intersection of x and y

{'b', 'd'}

In [39]:
x - y           # Difference of x and y

{'a', 'c', 'e'}
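
The numeric examples from Table 5 can be checked directly in Python. The sketch below uses the curly-brace set literal `{...}`, which is an alternative to `set()` for non-empty sets; the names `A` and `B` follow the table.

```python
A = {1, 2, 3}
B = {2, 3, 4}

print(2 in A)            # membership: True
print({1, 2} <= A)       # subset: True
print({1, 4} <= A)       # subset: False, because 4 is not in A
print(A >= {2})          # superset: True

union = A | B            # all members of A, or B, or both
intersection = A & B     # members of both A and B
difference_ab = A - B    # members of A that are not in B
difference_ba = B - A    # members of B that are not in A
print(union, intersection, difference_ab, difference_ba)
```

Note that the difference is not symmetric: `A - B` and `B - A` give different results.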

#### Geographical Example

Now let's make sense of these concepts and use a geographical example, a restaurant search. We have a list of restaurants in our town; some larger restaurant chains have more than one branch:

In [40]:
# %load sampleCode_L04c.py
restaurantsInOurTown = [
    'La Madeleine',
    'Pont Blanc',
    'Olive Garden',
    'Pommes de Terre',
    'Jean"s',
    'LaBoca', 
    'Pont Blanc', 
    'La Madeleine', 
    'Berliner Kueche', 
    'La Madeleine',
    'Olive Garden']

Additional information is given in the form of two sets. One contains all French-style restaurants:

In [41]:
setFrench = set(['La Madeleine','Pont Blanc','Jean"s'])

The other one contains all restaurants near a park:

In [1]:
setPark = set(['LaBoca', 'Pont Blanc', 'La Madeleine', 'Berliner Kueche', 'Olive Garden'])

Each of the set theory concepts can solve a different question. Use set operations to answer the questions below:

In [2]:
# Is the restaurant 'La Madeleine' located near a park?


In [3]:
# Are french restaurants a category of park restaurants?


In [4]:
# Are park restaurants a category of french restaurants?


In [5]:
# We are just hungry, any restaurant is fine!


In [6]:
# Find a french restaurant near a park!


In [7]:
# Find a restaurant that is near a park but not french style!


### Sets and List Comprehensions

List comprehensions are applicable for sets.

In [49]:
[e*2 for e in x]

['ee', 'dd', 'cc', 'aa', 'bb']
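
Because sets are unordered, the order of the resulting list can differ from run to run. The sketch below shows two variations that are not part of the cell above: wrapping the comprehension in `sorted()` for a reproducible order, and using curly braces to obtain a set instead of a list. The names `doubled_list` and `doubled_set` are assumptions for this illustration.

```python
x = set('abcde')

doubled_list = sorted([e * 2 for e in x])   # sorted() gives a reproducible order
doubled_set = {e * 2 for e in x}            # curly braces yield a set instead of a list

print(doubled_list)
print(type(doubled_set))
```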

### Set Methods

Table 6 lists built-in set operations and methods. Study the list and experiment with the methods in the cells below.

Table 6: *Set Operations and Methods*

| Operation | Equivalent | Result |
| :- | :-: | :- |
|`len(s)`            |           | Number of elements in set s (cardinality) |
|`.issubset(t)`      |  `s <= t` | Test whether every element in s is in t   |
|`.issuperset(t)`    | `s >= t`  | Test whether every element in t is in s   |
|`.union(t)`         | `s \| t`  | New set with elements from both s and t   |
|`.intersection(t)`  | `s & t`   | New set with elements common to s and t   |
|`.difference(t)`    | `s - t`   | New set with elements in s but not in t   |
|`.symmetric_difference(t)` |`s ^ t` | New set with elements in either s or t but not both |
|`.copy()`    | | New set with a shallow copy of s   |
|`.add(x)`    | | Add element x to set s   |
|`.remove(x)` | | Remove x from set s; raises `KeyError` if x is not in s   |
|`.pop()`     | | Remove and return an arbitrary element from s   |
|`.clear()`   | | Remove all elements from set s   |
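
A short sketch of the mutating methods from Table 6 follows. The set contents and the variable names `s`, `t` and `popped` are chosen only for illustration.

```python
s = set('abc')

s.add('d')            # s now holds 'a', 'b', 'c', 'd'
s.add('a')            # adding an existing element changes nothing
s.remove('b')         # would raise KeyError if 'b' were not in s
popped = s.pop()      # removes and returns an arbitrary element
t = s.copy()          # independent shallow copy of the remaining elements
s.clear()             # s is now empty, t is unaffected

print(len(s), len(t), popped)
```

Since sets are unordered, which element `.pop()` returns is not predictable; only the resulting sizes are.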

### `frozenset()`

Frozenset is a class with the characteristics of a set, but once its elements have been assigned, they cannot be changed. Tuples can be seen as immutable lists, while frozensets can be seen as immutable sets.

Sets are mutable and unhashable, which means we cannot use them as dictionary keys. Frozensets are hashable and we can use them as dictionary keys.

To create a frozenset, we use the built-in `frozenset()` function. Let us create a frozenset `z`:

In [50]:
z = frozenset('bdxyz')
print(z)

frozenset({'z', 'd', 'x', 'y', 'b'})


In [51]:
# this did not work earlier:
# set(['dog', 'cat', 'mouse', 'cat', 'duck', 'mouse',[1,2,3] ])

# compare to:
set(['dog', 'cat', 'mouse', 'cat', 'duck', 'mouse', frozenset([1,2,3]) ])

{'cat', 'dog', 'duck', frozenset({1, 2, 3}), 'mouse'}

The frozensets support the use of Python set methods like `copy()`, `difference()`, `symmetric_difference()`, `isdisjoint()`, `issubset()`, `intersection()`, `issuperset()`, and `union()`.
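
The sketch below illustrates the two properties mentioned above: frozensets can serve as dictionary keys, and they offer no methods that change them in place. The dictionary `observations` and its counts are hypothetical example data.

```python
# Hypothetical counts of how often two species were observed together
observations = {
    frozenset(['dog', 'cat']): 3,
    frozenset(['cat', 'mouse']): 5,
}

# The order of the elements does not matter when looking up a frozenset key
count = observations[frozenset(['cat', 'dog'])]
print(count)

z = frozenset('bdxyz')
print(hasattr(z, 'add'))     # False: frozensets have no add() method
combined = z | {'a'}         # set operations return a new frozenset
print(type(combined))
```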

---
<a id='files'></a>
# D. File Objects, File Input and Output

## Introduction

Python is a great tool for processing data. It is likely that any program you write will involve reading, writing, or manipulating data. For this reason, it’s especially useful to know how to handle different file formats, which store different types of data.

For example, consider a Python program that checks a list of users for access control. Your list of users will likely be stored and saved in a text file. Perhaps you are not working with text, but instead have a program that does financial analysis. In order to do some number crunching, you will likely have to input those numbers from a saved spreadsheet. Regardless of your application, it is almost guaranteed that inputting or outputting data will be involved.

This tutorial will briefly describe some of the format types Python is able to handle. After a brief introduction to file formats, we’ll go through how to open, read, and write a text file in Python 3.

When you’re finished with this tutorial, you’ll be able to handle any text file in Python.

## Background

Python is super accommodating and can, with relative ease, handle a number of different file formats, including but not limited to the following:


| File type |	Description |
| :-: | :- | 
| `txt`  |	Plain text file stores data that represents only characters (or strings) and excludes any structured metadata |
| `csv`  |	Comma-separated values file uses commas (or other delimiters) to structure stored data, allowing data to be saved in a table format |
| `html` |	HyperText Markup Language file stores structured data and is commonly used with most websites    |
| `json` |	JavaScript Object Notation is a simple and efficient format, making it one of the most commonly used formats to store and transfer data |
| `jpg` | JPEG is a commonly used method of lossy compression for digital images, particularly for those images produced by digital photography. |

This tutorial will focus on the `txt` file format. Technically `csv` files are also textfiles.

<div class="alert alert-info">

**Note**

The JupyterHub includes a texteditor. To create, open and edit a textfile: go to your dashboard and click on "New" (button to the top right) > "Text File". This will create and open a textfile with the name "untitled.txt". Rename the file as you want, by clicking on the filename at the top of the texteditor. Once you are done with editing the file content, save and close it. 

</div>

## Step 1 — Creating a Text File

Before we can begin working in Python, we need to make sure we have a file to work with. To do this, open up the texteditor of the JupyterHub and create a new textfile, let’s call it `days.txt`. 

<div class="alert alert-success">
    
**`~/$home/days.txt`**

Monday <br> Tuesday <br> Wednesday <br> Thursday <br> Friday <br> Saturday <br> Sunday <br>

</div>

Next, save your file. Make sure that the file is located in the folder from which you want to read it. If necessary, move it to the correct folder using the JupyterLab file explorer.
In our example, our user sammy saved the file here: `~/geosf21_material/Lessons/days.txt`. The path will be relevant when we open the file in Python.

Now that we have a txt file to process, we can begin our code!

## Step 2 — Opening a File

To open a file in Python, we first need some way to associate the file on disk with a variable in Python. This process is called opening a file. We begin by telling Python where the file is. The location of your file is often referred to as the file path. In order for Python to open your file, it requires the path. The path to our `days.txt` file is: `/home/jupyter-YourGithubUsername/days.txt`. In Python, we will create a string variable to store this information.

In [2]:
path = './days.txt'  # update your file path if necessary

We will then use Python’s `open()` function to open our `days.txt` file. The `open()` function requires as its first argument the file path. The function also allows for many other parameters. However, most important is the optional *mode* parameter. Mode is an optional string that specifies the mode in which the file is opened. The mode you choose will depend on what you wish to do with the file. Here are some of our mode options:

* `'r'` : use for reading
* `'w'` : use for writing
* `'x'` : use for creating and writing to a new file
* `'a'` : use for appending to a file
* `'r+'` : use for reading and writing to the same file

In this example, we only want to read from the file, so we will use the `'r'` mode. We will use the `open()` function to open the `days.txt` file and assign it to the variable `days_file`.

In [2]:
days_file = open(path,'r')

FileNotFoundError: [Errno 2] No such file or directory: '/home/jupyter-YourGithubUsername/days.txt'

If you get a `FileNotFoundError`, make sure that you have entered the path correctly, including your GitHub username. After we have opened the file, we can then read from it, which we will do in the next step.
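
If you prefer your notebook to report a missing file instead of stopping, one option is to catch the error, as in the sketch below. It is not part of the original lesson code: the temporary folder from `tempfile.mkdtemp()` and the names `missing_path` and `opened` are assumptions that merely guarantee a path that does not exist.

```python
import os
import tempfile

# Build a path to a file that is guaranteed not to exist (empty temporary folder)
missing_path = os.path.join(tempfile.mkdtemp(), 'days.txt')

try:
    days_file = open(missing_path, 'r')
except FileNotFoundError as err:
    opened = False
    print('Could not open file:', err)
    # Relative paths such as './days.txt' are resolved against this folder
    print('Current working directory:', os.getcwd())
else:
    opened = True
    days_file.close()
```

Printing the current working directory is often the quickest way to find out why a relative path does not point to the expected file.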

## Step 3 — Reading a File

Since our file has been opened, we can now manipulate it (i.e. read from it) through the variable we assigned to it. Python provides three related operations for reading information from a file. We’ll show how to use all three operations as examples that you can try out to get an understanding of how they work.

The first operation `<file>.read()` returns the entire contents of the file as a single string.

In [None]:
days_file = open(path,'r')
days_file.read()

Remember `\n` from string formatting?! This escape sequence encodes a line break in strings, and here it represents the end of each line in the file.

The second operation `<file>.readline()` returns the next line of the file, returning the text up to and including the next newline character. More simply put, this operation will read a file line-by-line. 

In [None]:
days_file.readline()

Oh, what happened ?!!

You have received an empty string! When reading a file, a file pointer moves through the file and always stops at the position after the last entry read. Therefore, after executing the operation `<file>.read()`, the file pointer has reached the end of the file (`EOF`). If we continue reading after `EOF` was reached, we receive an empty string object.

<div class="alert alert-info">

**Note**

You have to be very careful with the order of file handling operations in a Jupyter Notebook, since it is possible to execute cells in any order. Meanwhile, the Python interpreter has to first receive a command to open the file, then it can read through it step by step from beginning to end. Therefore, if you want to read a file from the beginning, we advise to always re-open a file before reading it in one Jupyter Notebook code cell.  

</div>

So, let's first reopen the file and then read it with the `<file>.readline()` method.

In [None]:
days_file = open(path,'r')
days_file.readline()

At this point, we will not close the file yet, because we want to continue reading the file. If you read a line with the readline operation it will pass to the next line. So if you were to call this operation again, it would return the next line in the file, as shown.

In [None]:
days_file.readline()

Now let's try another option. The last operation, `<file>.readlines()` returns a list of the lines in the file, where each item of the list represents a single line.

In [None]:
days_file = open(path,'r')
days_file.readlines()

Yet another option allows us to apply list comprehensions to read a file's content line by line into a list:

In [None]:
days_file = open(path,'r')
[line for line in days_file]   # iterate over the open file object line by line

Again, keep in mind that once a file has been read to its end with one of the read operations, further read operations on the same file object return nothing. For example, if you were to first run `days_file.read()` followed by `days_file.readlines()`, the second operation would return an empty list. Therefore, anytime you wish to read a file from the beginning, you will first have to open it again. Now that we have read from a file, let’s learn how to write to a new file.

Python also provides methods to control the position of the file pointer. We do not discuss them in detail here; if you are interested, look up the file object methods `.tell()` and `.seek()`.
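
For curious readers, here is a minimal sketch of `.seek()` as an alternative to re-opening a file. It writes a small throwaway file into a temporary folder first, so it does not depend on `days.txt`; the names `demo_path`, `first`, `second` and `third` are assumptions for this example.

```python
import os
import tempfile

demo_path = os.path.join(tempfile.mkdtemp(), 'demo.txt')   # throwaway file

f = open(demo_path, 'w')
f.write('Monday\nTuesday\n')
f.close()

f = open(demo_path, 'r')
first = f.read()         # reads everything, pointer is now at EOF
second = f.read()        # nothing left to read: empty string
f.seek(0)                # move the file pointer back to the beginning
third = f.readline()     # reading starts again at the first line
f.close()

print(repr(first), repr(second), repr(third))
```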

## Step 4 — Writing a File

In this step, we are going to write a new file that includes the title *Days of the Week* followed by the days of the week. First, let’s create our `title` variable.

In [None]:
title = 'Days of the Week\n'

We also need to store the days of the week in a string variable, which we’ll call `days`. To make it easier to follow, we include the code from the steps above. We open the file in read mode, read the file, and store the returned output from the read operation in our new variable `days`.

In [None]:
path = './days.txt'  # update your file path if necessary
days_file = open(path,'r')
days = days_file.read()

Now that we have variables for the title and the days of the week, we can begin writing to our new file. First, we need to specify the location of the file. We will write to a new file, `new_days.txt`, located in the same folder as `days.txt`. Make sure to enter the path you intend to use. 

Now we can open our new file in write mode, using the `open()` function with the `'w'` mode specified.

In [4]:
new_path = './new_days.txt' # update your file path if necessary
new_days = open(new_path,'w')

It is important to note that if `new_days.txt` already exists when it is opened in `'w'` mode, its old contents are erased immediately, so be careful when using the `'w'` mode.
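The effect of the `'w'` mode on an existing file can be checked with a short experiment. The sketch below assumes a throwaway file `demo_new_days.txt` (a hypothetical name) in a temporary folder, so none of your own files are overwritten.

```python
import os
import tempfile

# Hypothetical demo file in a temporary folder
demo_path = os.path.join(tempfile.mkdtemp(), 'demo_new_days.txt')

# Create the file with some content
demo_file = open(demo_path, 'w')
demo_file.write('old content\n')
demo_file.close()

# Opening the same file in 'w' mode again empties it immediately,
# even though nothing is written this time
demo_file = open(demo_path, 'w')
demo_file.close()

demo_file = open(demo_path, 'r')
content = demo_file.read()
demo_file.close()
print(repr(content))   # the old content is gone
```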

Once our new file is opened, we can put data into the file, using the write operation, `<file>.write()`. The write operation takes a single parameter, which must be a string, and writes that string to the file. If you want to start a new line in the file, you must explicitly provide the newline character. First, we write the title to the file followed by the days of the week. Let’s also add in some print statements of what we are writing out, which is often good practice for tracking your scripts’ progress.

In [None]:
new_days.write(title)
print(title)

new_days.write(days)
print(days)

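In analogy to `readlines()`, a list of strings can be written to a file at once using the file object method `<file>.writelines()`. Note that `writelines()` does not add newline characters, so each string has to end with `\n` if it should appear on its own line.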
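As a minimal sketch of `writelines()`, the example below writes a short list of days to a throwaway file. The file name `demo_days.txt` and the list `day_list` are hypothetical and only serve as an illustration.

```python
import os
import tempfile

# Hypothetical demo file in a temporary folder
demo_path = os.path.join(tempfile.mkdtemp(), 'demo_days.txt')

day_list = ['Monday', 'Tuesday', 'Wednesday']

demo_file = open(demo_path, 'w')
# Append the newline character to each item before writing the list
demo_file.writelines([day + '\n' for day in day_list])
demo_file.close()

# Read the file back to check what was written
demo_file = open(demo_path, 'r')
written = demo_file.read()
demo_file.close()
print(written)
```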

Lastly, whenever we are finished with a file, we need to make sure to close it. We show this in our final step.


## Step 5 — Closing a File

Closing a file ends the connection between the file on disk and the file variable. It also ensures that the data you have written actually reaches the disk and that other programs are able to access the file. So, always make sure to close your files. Now, let’s close all our files using the `<file>.close()` method.

In [6]:
days_file.close()
new_days.close()
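As an alternative to calling `close()` by hand, Python's `with` statement closes a file automatically when the indented block ends, even if an error occurs inside the block. The following is a minimal sketch, assuming a throwaway file `demo_new_days.txt` (a hypothetical name) in a temporary folder.

```python
import os
import tempfile

# Hypothetical demo file in a temporary folder
demo_path = os.path.join(tempfile.mkdtemp(), 'demo_new_days.txt')

# The file is closed automatically at the end of the with block
with open(demo_path, 'w') as demo_file:
    demo_file.write('Days of the Week\n')

print(demo_file.closed)   # the attribute .closed tells whether a file is closed

with open(demo_path, 'r') as demo_file:
    content = demo_file.read()
print(content)
```

This lesson keeps the explicit `open()` and `close()` calls so that each step stays visible.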

We’re now finished processing files in Python and can move on to looking over our code.

In [5]:
path = './days.txt'  # update your file path if necessary
days_file = open(path,'r')
days = days_file.read()

new_path = './new_days.txt' # update your file path if necessary
new_days = open(new_path,'w')

title = 'Days of the Week\n'
new_days.write(title)
print(title)

new_days.write(days)
print(days)

days_file.close()
new_days.close()

Days of the Week

Monday
Tuesday
Wednesday
Thursday
Friday
Saturday
Sunday



With that, we have covered the built-in functions to handle text files in Python. More sophisticated file operations, including renaming, deleting, or moving files and folders, require additional modules that have to be imported. We will get back to that later.

## Receiving User Input

Another very useful tool in Python, not to actually handle files but to receive user input from the keyboard, is the function `input()` (note in Python2, this would be the function `raw_input()`).

You can request user input in a Jupyter Notebook or in a Python script like the one above. Python 2 and Python 3 use different functions for this:
* `text = raw_input("prompt")`  # Python 2
* `text = input("prompt")`  # Python 3

Since the kernel on the JupyterHub is running Python 3, you have to use the function `input()`. If you are unsure which Python version you are running on your computer, just execute the following command in your terminal or command window:

`python --version`
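The version can also be checked from inside Python itself, for example in a notebook cell. The sketch below uses the standard module `sys`. The variable `reader_name` is a hypothetical helper name used only for this illustration.

```python
import sys

# Full version string of the running Python interpreter
print(sys.version)

# Pick the name of the matching input function based on the major version
if sys.version_info.major >= 3:
    reader_name = 'input'
else:
    reader_name = 'raw_input'

print('Use ' + reader_name + '() to read keyboard input.')
```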

<div class="alert alert-warning">
    
**Note**

If you run the `input()` function in a Jupyter Notebook, a prompt field opens. After entering text, you have to hit `<Enter>`. Do not execute the cell again before hitting a plain `<Enter>`. If you execute the cell again (e.g. by hitting `<Shift>+<Enter>` or the `Run` button at the top) without actually filling in the prompt field, the kernel gets stuck waiting for input (effectively an infinite loop) that you cannot interrupt through the cell controls, since the original prompt is hidden. Therefore, once the prompt field is open, enter something into it and hit a plain `<Enter>` on your keyboard before executing the cell again. If you do get stuck, you can force the kernel to stop by clicking on the following menu item above: `Kernel` > `Interrupt`.

</div>


Now, let's try to get some user input and assign it to a variable. For that, execute the first cell below. The cell will prompt you to enter any string. Enter your text and hit a plain `<Enter>`. Then execute the second cell to see your text displayed on the screen!

In [None]:
print('Write something, you like to see printed below!!!')
yourText = input()


In [None]:
print("You entered: ", yourText)

Add the following code to the end of your Python script you generated in the exercise above. Then execute the script again and try the user input!

In [None]:
print('Which is your favorite day?!!!')
yourText = input()
print('')
print('Great choice, your favorite day : ')
print(yourText)
print('')

## Conclusions: File Objects

File objects are the main interface to external files on your computer, which makes files a core object type as well. There is no specific literal syntax for creating or reading files. Instead, built-in functions are used for handling files: file objects are created using the `open()` function. A summary of File I/O in Python, mentioning a few more useful methods and functions, can be accessed here: https://www.tutorialspoint.com/python/python_files_io.htm. 
Now you can open, read, write, and close text files in Python!

You should also be able to read user input from the keyboard. And you can even write a small program (script) and execute it on the terminal of the JupyterHub. If you install Python on your own computer, this way you can run any Python script from the terminal (on Apple desktops) or from the command window (on Windows desktops). 
