# Module 2 Part 3: Python Tuples, Dictionaries, Reading Data from a File, Formatting Print Output

This module is designed as an introduction to the Python programming language. It covers the basic syntax of Python, the main data types, and the most commonly used data collections, with examples.

This module consists of 3 parts:

- **Part 1** - Introduction to Python.
- **Part 2** - Python Strings and Lists.
- **Part 3** - Python Tuples, Dictionaries, Reading data from a file, Formatting print output.

This notebook contains **Part 3** of Module 2.

## Tuples

__Tuples__ are similar to lists, but are immutable. A tuple is declared as a comma-separated sequence of values (optionally enclosed in parentheses) or using the `tuple()` function. Below, two tuples are created:
- `new_tuple` is created from a comma-separated sequence of strings
- `vowels` is created by passing a string to `tuple()`, which splits it into individual characters.

In [7]:
new_tuple = "apple", "banana", "orange"

In [8]:
new_tuple

('apple', 'banana', 'orange')

In [9]:
type(new_tuple)

tuple

In [None]:
'''The tuple of English vowels:'''

vowels = tuple('yeiouAEIOU')
vowels

In [None]:
'''Since a tuple is immutable, it cannot be changed in place.
Instead, a new tuple is created; here the first element 'y'
is replaced with 'a'.'''

vowels_corrected = ('a',) + vowels[1:]
t_single_elem = ("one",)
empty_tuple = ()

print(t_single_elem)
print(empty_tuple)
vowels_corrected
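
To make the immutability concrete, here is a minimal sketch of what happens when code tries to change a tuple element in place. The variable `message` is introduced only for this illustration; the rest reuses the names from the cell above.

```python
# A tuple rejects item assignment; a new tuple must be built instead.
vowels = tuple('yeiouAEIOU')

try:
    vowels[0] = 'a'              # attempt to modify the tuple in place
except TypeError as error:
    message = str(error)         # illustrative name, not from the notebook
    print("TypeError:", message)

# Build a corrected tuple from a one-element tuple plus a slice of the old one.
vowels_corrected = ('a',) + vowels[1:]
print(vowels_corrected)
```

The original `vowels` tuple is left untouched; `vowels_corrected` is a separate object.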

### Tuple Assignments

Tuples can be used to easily swap the values of two variables. Python allows this to be implemented without the need for a temporary variable, in one line of code:

In [None]:
a = 1
b = 5

a,b = b,a

print(a)
print(b)

An example below demonstrates how to extract values from a tuple, or to unpack a tuple. 

**NOTE:** The number of variables on the left should be the same as the number of elements in the tuple.

In [None]:
'''Validating the tuple `new_tuple`'''

new_tuple

In [None]:
'''Need to define three variables to unpack a tuple'''

x, y, z = new_tuple
print(x)
print(y)
print(z)

In [None]:
'''If the number of variables does not equal the number of values in the tuple,
we will get an error:'''

d, e, f, g = new_tuple
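
The mismatch can also be observed without stopping the notebook by catching the exception. This is only a sketch: the name `unpack_error` is hypothetical and used just to hold the error text.

```python
new_tuple = "apple", "banana", "orange"

try:
    d, e, f, g = new_tuple       # four names on the left, three values on the right
except ValueError as error:
    unpack_error = str(error)    # illustrative name, not from the notebook
    print("ValueError:", unpack_error)
```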

Quite often, a function returns values as tuples. Technically, a Python function can return only one value, 
so if we need a function to return several values they can be returned as a tuple.

The return expression `return result_1, result_2, result_3` in a function will produce a tuple `(result_1, result_2, result_3)`. 
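
As an illustration of a function returning several values, consider the short sketch below. The function `min_max_mean` and its sample input are made up for this example.

```python
def min_max_mean(values):
    """Return the smallest value, the largest value, and the mean."""
    # The comma-separated return expression is packed into one tuple.
    return min(values), max(values), sum(values) / len(values)

result = min_max_mean([2, 4, 9])
print(type(result))   # <class 'tuple'>
print(result)         # (2, 9, 5.0)

# The returned tuple can be unpacked directly into separate variables.
lowest, highest, average = min_max_mean([2, 4, 9])
print(lowest, highest, average)
```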

### Create tuples with the `zip()` function

The `zip()` function is an example of a function that returns tuples. This function pairs up the elements from multiple sequences, starting with the first values, then the second, etc. 

In [1]:
list_num = [1,2,3,4,5]
list_alpha =['a','b','c','d','e']
zipped = zip(list_num, list_alpha)

Note that in Python 3 the `zip()` function returns a **zip** object, which is an iterator:

In [2]:
type(zipped)

zip

An **iterator** in Python is an object that produces its values one at a time and can be looped over using (for example) a `for` statement. An iterator is created behind the scenes when we loop over Python lists, tuples, or dictionaries (dictionaries are reviewed in the next section).
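
What a `for` loop does with an iterator can be sketched by hand with the built-in functions `iter()` and `next()`. The list `fruits` and the other names below are chosen only for this illustration.

```python
fruits = ["apple", "banana"]
fruit_iterator = iter(fruits)    # a for loop requests an iterator like this one

first = next(fruit_iterator)     # 'apple'
second = next(fruit_iterator)    # 'banana'

# Once all values have been produced, the iterator is exhausted.
try:
    next(fruit_iterator)
    exhausted = False
except StopIteration:
    exhausted = True

print(first, second, exhausted)
```

The list itself is unchanged and can be looped over again; only the iterator object is used up.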

Let's loop over `zipped`:

In [3]:
for i in zipped:
    print(i)

(1, 'a')
(2, 'b')
(3, 'c')
(4, 'd')
(5, 'e')


Let's try to convert the `zipped` object to a list:

In [4]:
zipped_list = list(zipped)
zipped_list

[]

__NOTE:__ Conversion to a list was unsuccessful in the cell above; `zipped_list` is empty. This is because an iterator can be traversed only once, and the `for` loop above already consumed `zipped`. For this reason it is often convenient to convert an iterator to a list first and then work with the list.

In [5]:
zip_again = zip(list_num, list_alpha)
zipped_list_2 = list(zip_again)
zipped_list_2

[(1, 'a'), (2, 'b'), (3, 'c'), (4, 'd'), (5, 'e')]

The function `zip()` used with the `*` operator can be used to unzip a list:

In [None]:
list(zip(*zipped_list_2))

In [None]:
first_list = list(list(zip(*zipped_list_2))[0])
second_list = list(list(zip(*zipped_list_2))[1])

In [None]:
first_list

In [None]:
second_list
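
The same unzipping can be written more compactly with tuple unpacking. The sketch below uses a small made-up list, `pairs`, in place of `zipped_list_2`.

```python
pairs = [(1, 'a'), (2, 'b'), (3, 'c')]

# zip(*pairs) regroups the first elements together and the second elements together;
# unpacking assigns each group (a tuple) to its own variable.
numbers, letters = zip(*pairs)

numbers_list = list(numbers)
letters_list = list(letters)
print(numbers_list)
print(letters_list)
```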

## Dictionaries

A dictionary is similar to a list where the indices are not limited to only integers. The dictionary is a set of key-value pairs where the key is the index to its associated value. The general form of a dictionary is

    {key_1: value_1, key_2: value_2, ...}

A dictionary can be created by enclosing a sequence of key:value pairs in curly brackets.

Dictionaries are __mutable__; one can build a dictionary by adding items (key-value pairs) to an empty dictionary. Items are not indexed with integers; instead, keys are used to look up values. (Since Python 3.7 a dictionary preserves the order in which items were inserted, but items are still accessed by key, not by position.)

__NOTE:__ The operator `in` also works on dictionaries, but it scans only keys, not values.  

In [1]:
my_dictionary = {}
my_dictionary["one"] = 1
my_dictionary['two'] = 2
print(my_dictionary)
list(my_dictionary.values())

{'one': 1, 'two': 2}


[1, 2]

In [3]:
'''Traversing a dictionary is very similar to traversing a list,
but keys are used instead of indices.
Compare these two loops: the first one prints keys, not values
(despite the name of its loop variable).'''

for value in my_dictionary:
    print(value)

for key in my_dictionary:
    print(my_dictionary[key])


one
two
1
2
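
The earlier note about the `in` operator can be checked with a short sketch. The result names (`key_found` and so on) are illustrative only.

```python
my_dictionary = {"one": 1, "two": 2}

key_found = "one" in my_dictionary           # True: "one" is a key
value_as_key = 1 in my_dictionary            # False: 1 is a value, not a key
value_found = 1 in my_dictionary.values()    # True: search the values explicitly

print(key_found, value_as_key, value_found)
```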


### Dictionary Comprehension

In the previous section, we learned about list comprehensions. A dictionary comprehension works very similarly to a list comprehension, but the end result is a dictionary. The structure of a dict comprehension can be described as follows:

    {key: value for (key, value) in iterable}

Let's see how it works. One of the applications would be to take two lists and create a dictionary using dict comprehension. 

**NOTE:** You will find in the documentation that the term *dictionary comprehension* is often shortened to *dict comprehension*.

In [4]:
'''Create a dictionary where the key is an integer from 0 to 9
and the value is the same integer to the power of three.''' 

{x: x**3 for x in range(10)}

{0: 0, 1: 1, 2: 8, 3: 27, 4: 64, 5: 125, 6: 216, 7: 343, 8: 512, 9: 729}

In [5]:
'''Creating a dictionary where the key is a letter of the alphabet:'''

import string
{x: y**3 for (x, y) in zip(string.ascii_lowercase, range(10))}

{'a': 0,
 'b': 1,
 'c': 8,
 'd': 27,
 'e': 64,
 'f': 125,
 'g': 216,
 'h': 343,
 'i': 512,
 'j': 729}

Let's break down the line of code above. The expression `zip(string.ascii_lowercase, range(10))` produces pairs (tuples) in which the first element is a lowercase letter of the English alphabet and the second element is an integer from 0 to 9. Because `range(10)` has only ten elements, `zip()` stops after the first ten letters. Converting the result to a list shows the pairs:

In [6]:
list(zip(string.ascii_lowercase, range(10)))

[('a', 0),
 ('b', 1),
 ('c', 2),
 ('d', 3),
 ('e', 4),
 ('f', 5),
 ('g', 6),
 ('h', 7),
 ('i', 8),
 ('j', 9)]

**NOTE:** For a description of the `string` module, please refer to the Python documentation: [Common string operations](https://docs.python.org/3/library/string.html) (Python Software Foundation, 2018).
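
The application mentioned earlier, building a dictionary from two lists, might look like the sketch below. The lists `countries` and `capitals` are made-up sample data.

```python
countries = ["France", "Japan", "Peru"]
capitals = ["Paris", "Tokyo", "Lima"]

# Pair the two lists with zip() and build the dictionary with a dict comprehension.
capital_of = {country: capital for (country, capital) in zip(countries, capitals)}
print(capital_of)

# When no transformation of keys or values is needed, dict() accepts the pairs directly.
same_result = dict(zip(countries, capitals))
print(same_result == capital_of)
```

The comprehension form is the more flexible of the two, because the key or the value can be modified on the way in (for example, `capital.upper()`).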

### __EXERCISE 4:__ Word count

Imagine that you are given a long sentence and need to count how many times each word appears in the sentence.    

1). First, split the sentence into a list of words. We will use a sentence consisting of 251 words from "Barnaby Rudge", by Charles Dickens.

In [None]:
long_sentence = 'To none of these interrogatories, whereof every one was more pathetically delivered than the last, did Mrs Varden answer one word: but Miggs, not at all abashed by this circumstance, turned to the small boy in attendance—her eldest nephew—son of her own married sister—born in Golden Lion Court, number twenty-sivin, and bred in the very shadow of the second bell-handle on the right- hand door-post—and with a plentiful use of her pocket- handkerchief, addressed herself to him: requesting that on his return home he would console his parents for the loss of her, his aunt, by delivering to them a faithful statement of his having left her in the bosom of that family, with which, as his aforesaid parents well knew, her best affections were incorporated; that he would remind them that nothing less than her imperious sense of duty, and devoted attachment to her old master and missis, likewise Miss Dolly and young Mr Joe, should ever have induced her to decline that pressing invitation which they, his parents, had, as he could testify, given her, to lodge and board with them, free of all cost and charge, for evermore; lastly, that he would help her with her box upstairs, and then repair straight home, bearing her blessing and her strong injunctions to mingle in his prayers a supplication that he might in course of time grow up a locksmith, or a Mr Joe, and have Mrs Vardens and Miss Dollys for his relations and friends.'

In [None]:
# Type your code here


2). Create a dictionary from that list where the keys are the unique words from the list (e.g. if "the" appears in the sentence 3 times, there will be only one key for "the"), and each value is the number of times that word occurs.

In [None]:
# Type your code here

## Reading data from a file

Often, we need to read data from a file. The file may contain simple text, an Excel spreadsheet, an XML document, or be of any other format. Python offers multiple tools for file reading.

The simplest way is to use the built-in function __`open()`__. This function opens a file and returns a file object, also called a _file handle_. The file handle is not the actual data contained in the file; instead, it is used to read the data. Typing `?open` in an IPython or Jupyter cell displays the docstring of this function.

### Open and read a text file

The general syntax of the `open()` function is:

    open(file, mode='r', buffering=-1, encoding=None, ...) -> file object

This shows that the first argument is the path to a file (if the file is in the same directory only the filename can be provided; if not, the path to the file must be provided). The second argument states the mode. Here are the most used modes: 

* 'r' - to open for reading,
* 'w' - to open for writing (old data erased),
* 'a' - to open for appending to what is already in the file,
* 'r+' - to open for both reading and writing.

A text file is a sequence of lines. The special character called the __newline__ character represents the end of each line. In Python, the *backslash-n* represents a new line. __NOTE:__ `"\n"` is actually one character even though it looks like two.

There are four different ways to read from a file: the methods `read()`, `readlines()`, and `readline()`, and a `for` loop over the file object:
 * `read()` reads the whole file at once and returns it as a single string (not recommended for big files)
 * `readlines()` returns the content of a file as a list of strings; each line can be accessed by index
 * `readline()` reads the file one line at a time; the first call returns the first line, the second call returns the second line, and so on. Inside a `while` loop it can be used to read the file until a certain place is reached.
 * `for line in file_object: <...>` processes every line in a file one at a time:
       
       file_object = open('example_file_for_reading.txt', 'r')
       for line in file_object:
           print(line, end='')  # each line already ends with '\n'
       file_object.close()
        
Also, the `file_object` can be converted to a list using the `list()` constructor, i.e. `list(file_object)`. This returns the same list of lines as the `readlines()` method.
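
The four ways of reading can be compared side by side. The sketch below first writes a small sample file into a temporary directory so that it runs anywhere; the file content and all variable names are assumptions made for this illustration.

```python
import os
import tempfile

# Create a small sample file in a temporary directory (made-up content).
path = os.path.join(tempfile.mkdtemp(), 'example_file_for_reading.txt')
file_object = open(path, 'w')
file_object.write('first line\nsecond line\nthird line\n')
file_object.close()

# read(): the whole file as one string
file_object = open(path, 'r')
whole_text = file_object.read()
file_object.close()

# readlines(): a list with one string per line (newline characters are kept)
file_object = open(path, 'r')
all_lines = file_object.readlines()
file_object.close()

# readline(): one line per call; an empty string signals the end of the file.
# Here the while loop stops when the line starting with 'third' is reached.
file_object = open(path, 'r')
lines_before_third = []
line = file_object.readline()
while line != '' and not line.startswith('third'):
    lines_before_third.append(line)
    line = file_object.readline()
file_object.close()

# for loop: process each line in turn, removing the trailing newline
file_object = open(path, 'r')
stripped_lines = []
for line in file_object:
    stripped_lines.append(line.rstrip('\n'))
file_object.close()

print(all_lines)
print(stripped_lines)
```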

__NOTE:__ When you are done working with the file, it's important to close it. If you don't, and the file was written to, it may end up empty, incomplete, or corrupted. Once the `close()` method has been called, any further attempt to read from or write to the file object raises an error:

    file_object.close()
    file_object.read()
    # ValueError: I/O operation on closed file.
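
One common way to make sure a file is always closed is Python's `with` statement, which this module does not otherwise cover. The sketch below is offered only as an illustration: the file name `with_example.txt` and the temporary directory are assumptions.

```python
import os
import tempfile

# Hypothetical file location, chosen so that the example runs anywhere.
path = os.path.join(tempfile.mkdtemp(), 'with_example.txt')

# The file is closed automatically when the indented block ends,
# even if an error occurs inside the block.
with open(path, 'w') as file_object:
    file_object.write('text goes here\n')

print(file_object.closed)    # True

with open(path, 'r') as file_object:
    content = file_object.read()

print(content)
```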

### Write to a file

The `write()` method works like the `print()` function, but it accepts only a string and does not add the newline character `'\n'`. Again, open the file first:

    file_object = open('file_to_write_to.txt', 'w')
    file_object.write('text goes here\n')
    file_object.close()

In [None]:
two_rows_string = "This is the first line,\nand this is the second line."

In [None]:
two_rows_string

In [None]:
print(two_rows_string)

In [None]:
file_object = open('file_to_write_to.txt', 'w')
file_object.write(two_rows_string + '\n')
file_object.close()

In [None]:
file_object = open('file_to_write_to.txt', 'a')
file_object.write('Oh, one more thing\n')
file_object.close()
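
To check the effect of the `'w'` and `'a'` modes, the file can be read back after writing. The sketch below repeats the two cells above against a temporary directory (an assumption, so that the example does not depend on the current folder) and then reads the result.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'file_to_write_to.txt')
two_rows_string = "This is the first line,\nand this is the second line."

# 'w' creates the file (or erases an existing one) before writing.
file_object = open(path, 'w')
file_object.write(two_rows_string + '\n')
file_object.close()

# 'a' keeps the existing content and adds to the end of the file.
file_object = open(path, 'a')
file_object.write('Oh, one more thing\n')
file_object.close()

# Read the file back to see all three lines.
file_object = open(path, 'r')
lines = file_object.readlines()
file_object.close()

print(lines)
```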

## Formatting print output

In this short section we will quickly review how to format the output of the `print()` function. Here are a few examples:

In [None]:
'''Formatting strings with format specifiers'''

'''Pad with spaces if < 20 chars long'''
print("{0:20} is the best of them all".format("Stella Artois"))

'''Take the width from the second parameter'''
print("{0:{1}} is the best of them all".format("Stella Artois", 10))

'''Take named arguments'''
print("{beer:{width}} is the best of them all".format(beer="Stella Artois", width=10))

'''Force right justification'''
print("{0:>20} is the best of them all".format("Stella Artois"))

'''Force right justification and pad with &'''
print("{0:&>20} is the best of them all".format("Stella Artois"))

In [None]:
'''String interpolation'''
personA = "Mary"
personB = "Jane"
print("%s and %s went up the hill" % (personA, personB))
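
The two styles shown above can produce identical results, and format specifiers also apply to numbers. The following sketch illustrates both points; the label `"Total"`, the number, and the variable names are made up for the example.

```python
personA = "Mary"
personB = "Jane"

# The % operator and str.format() build the same string here.
old_style = "%s and %s went up the hill" % (personA, personB)
new_style = "{} and {} went up the hill".format(personA, personB)
print(old_style == new_style)

# '<10' left-justifies in a field of 10 characters;
# '>8.2f' right-justifies a number in 8 characters with 2 decimal places.
price_line = "{0:<10}|{1:>8.2f}".format("Total", 3.14159)
print(price_line)
```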

----

You have reached the end of this module. 

---

## References

Python Software Foundation (2018). Common string operations. https://docs.python.org/3/library/string.html