# Week 7: Nested Structures, User-Defined Functions, Files

Nested structures are not new data types. They are just more complex combinations of the structures we have already used. By nested structures we usually mean lists of lists, dictionaries whose values are lists or sets, and so on.

## 1. Nested Lists

Remember that any object can be an element of a list. So we can have a list of lists, a list of tuples, a list of sets, a list of dictionaries... Let's look at an example. Imagine a list that consists of two elements: the first is a list of student names, the second is a list of those students' grades.

In [None]:
students = [['Mark', 'Alice'], [8, 9]]
print(len(students)) # checking the number of elements
print(students[0]) # accessing the first list
print(students[1]) # accessing the second list

See, the number of elements is 2, not 4, and indexing does not give us the strings `'Mark'` and `'Alice'`, but rather the first list and the second list. That happens because Python sees this structure as a kind of Russian doll: there are smaller elements within the bigger ones.

However, such structures are really convenient. Think of our example list as a table where the first column contains the students' names and the second contains their grades.

With such structures we can use double (triple, quadruple...) indexing. Let's access the information about the first student.

In [None]:
print(students[0][0]) # the 1st student name
print(students[1][0]) # the 1st student grade

First, we picked the bigger element we are interested in (the list of names at index `0`) and then the first student's name (the element of that list at index `0`). Then we did the same for the second list.
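
Because the two inner lists are aligned by position, the built-in `zip()` function can pair each name with its grade. This is only a small sketch that reuses the `students` list from above; the variable names `names`, `grades` and `pairs` are introduced here for illustration.

```python
students = [['Mark', 'Alice'], [8, 9]]

names, grades = students          # unpack the two inner lists
pairs = list(zip(names, grades))  # pair the items that share the same index

for name, grade in pairs:
    print(name, grade)
```

This gives the same information as `students[0][i]` and `students[1][i]`, just without writing the indices by hand.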

Let's try another example. Now each nested list describes a student by three features: name, year of birth and major. Let's also read and add information about a new student.

In [None]:
students = [['Ivan Ivanov', 2005, 'POLSCI'],
            ['Oleg Sidorov', 2006, 'JOURN']]

new_student = input("New stu.:").split(',') # reading info items separated by comma
new_student[1] = int(new_student[1]) # converting year of birth into integer
students.append(new_student) # adding info about new student to a list
print(students)

That doesn't look too hard, does it?

If you need to read the data in a `while` loop, do not forget to check the stop criterion, and read the information about the next student only after you have appended the previous one to the list of lists. Look at the example below:

In [None]:
students = []
info = input("New stu.: ")
while info != 'END':
    info_list = info.split(',')
    info_list[1] = int(info_list[1])
    students.append(info_list)
    info = input("New stu.: ")
print(students)
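
The loop above assumes that every line looks exactly like `name,year,major`, with no spaces around the commas. Below is a sketch of a slightly more forgiving version that strips such spaces. A list of made-up sample lines stands in for the keyboard input, so the values are assumptions and not real data.

```python
# Hypothetical sample lines standing in for what a user might type
lines = ['Ivan Ivanov, 2005, POLSCI', 'Oleg Sidorov,2006,JOURN', 'END']

students = []
for info in lines:
    if info == 'END':  # the same stop criterion as in the while loop
        break
    info_list = [part.strip() for part in info.split(',')]  # drop spaces around each item
    info_list[1] = int(info_list[1])  # year of birth as an integer
    students.append(info_list)

print(students)
```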

Let's go back to our nested list. Now we know how to read data into such a list. But how can we retrieve information from it? Don't fret: just treat a nested list like a usual list. Let's start with a `for i in range()` loop.

In [None]:
students = [['Ivan Ivanov', 2005, 'POLSCI'],
            ['Oleg Sidorov', 2006, 'JOURN']]

for i in range(len(students)):
    print(students[i]) # prints nested list under the index `i`

See, each nested list was printed when its index was assigned to the variable `i`. Let's now output the information about each student in a fancier way.

In [None]:
for i in range(len(students)):
    print(f'Info about student #{i+1}')  # printing student's number
    print('Name:', students[i][0])       # printing name
    print('Age:', 2024 - students[i][1]) # calculating and printing age
    print('Major:', students[i][2])      # printing major

If you don't need a counter variable, you can use a simple `for` loop and access the nested lists directly, without an index variable. Compare the code below to the one above.

In [None]:
for item in students: # each nested list would be assigned to an `item` variable
    print('Name:', item[0])
    print('Age:', 2024 - item[1])
    print('Major:', item[2])
    print('*' * 20) # printing a divider to make our output prettier and more organized

Sometimes it is fine to use a nested `for` loop to go through the elements of the nested lists, e.g. if we don't need to print the elements in any fancy way.

In [None]:
for item in students: # each nested list would be assigned to `item`
    for item_info in item: # each nested list element would be assigned to `item_info`
        print(item_info)
    print('*' * 20)
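
When every nested list has the same number of elements, the loop variable can also be unpacked into several names at once. This is a sketch of that idea with the same `students` list; the names `name`, `year`, `major` and `majors` are chosen here for illustration.

```python
students = [['Ivan Ivanov', 2005, 'POLSCI'],
            ['Oleg Sidorov', 2006, 'JOURN']]

majors = []
for name, year, major in students:  # each inner list is unpacked into three variables
    print(name, 'was born in', year, 'and studies', major)
    majors.append(major)
```

Note that this only works if each inner list has exactly three elements; otherwise Python raises a `ValueError`.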

Let's look at another example. Now our nested lists hold the grades of different students, and each student can have a different number of grades. Let's count how many grades across all students are lower than 4.

In [None]:
marks = [[2, 5, 5], [10, 8, 3, 9], [10, 8, 4, 5]]

cnt = 0 # creating a counter variable
for item in marks: # looping through the outer list
    for mark in item: # now looping through the nested list currently stored in `item`
        if mark < 4:
            cnt += 1

print(cnt)

Now let's calculate the average grade (GPA) for each of those students.

In [None]:
marks = [[2, 5, 5], [10, 8, 3, 9], [10, 8, 4, 5]]

for item in marks:
    print(sum(item)/len(item))

To make our output fancier, let's use `for i in range()`:

In [None]:
marks = [[2, 5, 5], [10, 8, 3, 9], [10, 8, 4, 5]]

for i in range(len(marks)):
    print(f'Student #{i+1} GPA is: {sum(marks[i])/len(marks[i])}')
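
As an alternative sketch, the built-in `enumerate()` gives both a counter and the nested list itself, and `round()` keeps the average readable. The variable names `number`, `student_marks` and `averages` are introduced here for illustration.

```python
marks = [[2, 5, 5], [10, 8, 3, 9], [10, 8, 4, 5]]

averages = []
for number, student_marks in enumerate(marks, start=1):  # counting starts at 1
    average = round(sum(student_marks) / len(student_marks), 2)
    averages.append(average)
    print(f'Student #{number} average grade is: {average}')
```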

## Nested dictionary
Now let's look at an example with a nested dictionary. In a dictionary only the **values** can be dictionaries, sets or lists, **not the keys**: keys must be immutable (hashable) objects such as strings, numbers or tuples.
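
To see the restriction on keys in action, here is a small sketch. It uses `try`/`except` only to catch the error so that the cell keeps running; the dictionary `grades` and its contents are made up for the example.

```python
grades = {}
grades[('Ivan', 'Ivanov')] = [8, 9]  # a tuple is immutable, so it works as a key

try:
    grades[['Oleg', 'Sidorov']] = [7, 10]  # a list is mutable, so this fails
except TypeError as error:
    message = str(error)
    print('Cannot use a list as a key:', message)

print(grades)
```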

Let's start with an example where keys are strings (dates) and the values are the lists of night and day temperatures for the given day.

As with lists, we can simply loop through such an object and retrieve the information we need. We can also use double indexing: first pass the key, then the index of the item in the list stored under that key.

For each day, let's print the night temperature, the day temperature and the average.

In [None]:
temp = {'1st APR':[5, 11], '2nd APR':[4, 12]}

for key in temp:
    print(key) # printing the date
    print('Nighttime (max):', temp[key][0], 'degrees')
    print('Daytime (max):', temp[key][1], 'degrees')
    print('Average:', sum(temp[key])/len(temp[key]), 'degrees')
    print('*'*10)
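
The same loop can be written with the dictionary method `.items()`, which yields the key and the value together. This is a sketch under the assumption that every value is a list of exactly two temperatures; the names `night`, `day` and `averages` are introduced here.

```python
temp = {'1st APR': [5, 11], '2nd APR': [4, 12]}

averages = {}
for date, (night, day) in temp.items():  # unpack the key and the two-item list at once
    averages[date] = (night + day) / 2
    print(date, 'night:', night, 'day:', day, 'average:', averages[date])
```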

Below is an example of how to read data into such a dictionary.

In [None]:
temp = {}
info = input("Data: ") # expected format of input here is '1st May: 10, 18'
while info != 'END':
    date = info.split(': ')[0]
    temp_values = list(map(int, info.split(': ')[1].split(', ')))
    temp[date] = temp_values
    info = input("Data: ")
    
print(temp)

Now let's solve a problem. Imagine that Ilya wants to watch an anime and asks his friends for recommendations. Ilya will watch the anime that was recommended by the largest number of friends.

**INPUT FORMAT:**
* An unknown number of lines in the format `<ANIME TITLE>: <FRIEND NAME>`, followed by a line `END`.
* One friend can recommend several titles.

**OUTPUT FORMAT:**
* Title of the anime that Ilya will watch.

This is not the easiest problem, so let's first read the data into a dictionary. Since one anime can be recommended by several friends, a convenient format is a dictionary where the title is the key and the list of friends who recommended it is the value.

In [None]:
anime = {}
advice = input("Data: ")    # reading the line like "Tenki no ko: Pasha"
while advice != "END":
    anime_title = advice.split(': ')[0]   # saving title to a variable
    friend = advice.split(': ')[1]    # retrieving a friend's name
    if anime_title not in anime:      # if such anime was not yet recommended
        anime[anime_title] = [friend] # then create such key and assign a list consisting of one name to it
    else:     # if such anime was already recommended
        anime[anime_title].append(friend)    # then append a new friend's name to an existing list assigned to that key
    advice = input("Data: ")    # read a new input

print(anime)    # checking our dictionary

Great! We got quite a complex structure, but it will help us solve the problem nicely. Now for the second part: we have to find how many friends recommended each anime and which one got the maximum number of recommendations. We can compute the lengths of all the values (the lists of recommenders' names) with the `map()` function and then find the maximum.

In [None]:
print(list(map(len, anime.values()))) # computing number of recommendations for each anime
max_recs = max(map(len, anime.values())) # finding the highest number of recommendations
print(max_recs)

Now that we have the maximum number of recommendations, let's find the key whose value has exactly that length. If several titles share the maximum, all of them are printed.

In [None]:
for key in anime:
    if len(anime[key]) == max_recs:
        print(key)

If we had not come up with the `map()` trick, we could do it in a lengthier way.

In [None]:
max_recommend = 0 # the highest number of recommendations seen so far
for key in anime:
    if len(anime[key]) > max_recommend: # if the current anime beats the current maximum
        title_recommend = key # then update the title of the anime that Ilya will watch
        max_recommend = len(anime[key]) # and update the current maximum number of recommendations

print(title_recommend)

Now let's get the code for the problem in one place:

In [None]:
anime = {}
advice = input() # reading the line like "Tenki no ko: Pasha"
while advice != "END":
    anime_title = advice.split(': ')[0] # saving title to a variable
    friend = advice.split(': ')[1] # retrieving a friend's name
    if anime_title not in anime: # if such anime was not yet recommended
        anime[anime_title] = [friend] # then create such key and assign a list consisting of one name to it
    else: # if such anime was already recommended
        anime[anime_title].append(friend) # then append a new friend's name to an existing list assigned to that key
    advice = input() # read a new input

max_recs = max(map(len, anime.values())) # finding the maximum

for key in anime:
    if len(anime[key]) == max_recs:
        print(key) # printing the title with the maximum number of recommendations
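
For comparison, here is a more compact sketch of the same idea. It is not the only way to solve the problem: it relies on the dictionary method `setdefault()` and on the `key` argument of `max()`, and it uses a made-up list of lines instead of `input()`, so the recommendations below are assumptions for the example.

```python
# Hypothetical recommendations in the "<ANIME TITLE>: <FRIEND NAME>" format
lines = ['Tenki no ko: Pasha', 'Akira: Masha', 'Tenki no ko: Dasha', 'END']

anime = {}
for advice in lines:
    if advice == 'END':
        break
    title, friend = advice.split(': ')
    anime.setdefault(title, []).append(friend)  # create the list on first use, then append

# max() compares the titles by the length of their lists of friends;
# on a tie it returns the first such title in insertion order
winner = max(anime, key=lambda title: len(anime[title]))
print(winner)
```

Unlike the loop above, this version prints only one title even when several titles share the maximum.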

# 2. User-defined functions in Python

We use a lot of built-in functions when programming. But it is often useful to define our own functions. Let's find out how to do that.

``` python
def name_of_the_function(parameters):
  *some instructions*
  return result
```

Above is the general scheme for creating a function.
* You need the keyword `def` to define your own function.
* You need to name it following the same restrictions that apply to variable names: you cannot use spaces within the name; the name cannot start with a digit; it should not be the same as the name of an existing Python function.
* Then in parentheses you specify the parameters (arguments) that your function will take, i.e. the names of the local variables to which the passed values will be assigned.
* Then there is an indented block after the colon where you can specify any instructions you want.
* The line with the keyword `return` is not mandatory, but you will have to write it if you want to use the result of the function later (e.g. to assign it to a variable or to use it in a conditional statement). A function without `return` returns `None`.
* We cannot call a function before defining it! That would lead to an error.
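
The points above can be illustrated with a tiny sketch. The function `rectangle_area` and its parameters are invented for this example only.

```python
def rectangle_area(width, height):  # `def`, a name, and two parameters
    area = width * height           # indented body with the instructions
    return area                     # hand the result back to the caller

result = rectangle_area(3, 4)  # the call comes after the definition
print(result)
```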

Let's define our very first function.

Our function will be named `currency_converter`. It will take a number as an argument and assign it to the variable `rub`. It will convert rubles to dollars at an exchange rate of 74.73 rubles per dollar.

In [None]:
def currency_converter(rub): # defining a function
    dollars = rub / 74.73

my_rubs = 200

print(currency_converter(my_rubs)) # calling a function

Hm, the code above printed only `None`. Why? Because we did not make our function return any result. Maybe we have to print the `dollars` variable to get it?

In [None]:
print(dollars)

An error (`NameError`). The thing is that all variables defined within a function are **local** variables: they don't exist outside of it. So the only way to get something out of a function is to `return` it. Let's add a new line to our function.

In [None]:
def currency_converter(rub):
    dollars = rub / 74.73
    return round(dollars, 2) # return the variable value rounded to 2 digits after the dot

my_rubs = 200
print(currency_converter(my_rubs))

Amazing! We can also assign the result returned by a function to a variable if we need to.
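
For instance, the returned value can be stored and then used in a condition. This is a short sketch; the variable name `my_dollars` and the threshold of 2 dollars are arbitrary choices for illustration.

```python
def currency_converter(rub):
    dollars = rub / 74.73
    return round(dollars, 2)

my_dollars = currency_converter(200)  # store the returned value in a variable
if my_dollars > 2:                    # and use it like any other number
    print('More than two dollars:', my_dollars)
```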

Now let's talk about parameters. We've specified that our function `currency_converter` takes one argument. What will happen if we try to pass no arguments or two arguments?

In [None]:
print(currency_converter()) # error saying that 1 argument is required

In [None]:
print(currency_converter(200, 10)) # error saying that there are too many arguments

So when designing a function you specify the number of arguments it will take. You can specify more than one! You can even allow an indefinite number of arguments; you can read about it [here](https://www.geeksforgeeks.org/args-kwargs-python/#:~:text=The%20special%20syntax%20*args%20in,used%20with%20the%20word%20args.). But once the parameters are specified, you cannot pass a different number of arguments to your function: too few or too many will lead to an error.
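
As a rough illustration of the indefinite number of arguments mentioned above, a parameter written with a star collects all positional arguments into a tuple. The function `total` is a made-up example, not something used elsewhere in this notebook.

```python
def total(*numbers):  # `numbers` collects all positional arguments into a tuple
    return sum(numbers)

print(total())            # no arguments: the tuple is empty
print(total(1, 2))
print(total(1, 2, 3, 4))
```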

But how does Python know that our argument should be a number? It actually does not. You can try to pass a string to `currency_converter`, and Python will raise an error only when it reaches the calculation, saying that it cannot divide a string by a float.

In [None]:
print(currency_converter('100'))

However, in some cases a function will be able to perform its instructions on the 'wrong' data type, so watch out for this. Basically, the data types of the arguments are restricted only by the instructions inside the function.

Assume that you wrote a function to add two numbers together. It will also work with two strings and even with two lists, since the `+` operator can be used with those data types as well.

In [None]:
def sum_a_b(a, b):
    return a + b

print(sum_a_b(10, 4)) # sums two integers
print(sum_a_b('10', '4')) # concatenates two strings
print(sum_a_b([2, 4], ['cat', 'dog'])) # concatenates two lists

By the way, we've just defined a function that takes two arguments! `sum_a_b` will raise an error if you try to pass any number of arguments other than two.

We can also specify a **default value** for an argument. The default value is used if that argument is not passed when the function is called.

In the example below, `currency_converter_2` takes two arguments: the amount of rubles and the exchange rate. If the rate is not passed, the default rate (91.54) is used.

In [None]:
def currency_converter_2(rub, rate=91.54): # specifying a default value
    return round(rub / rate, 2)

print(currency_converter_2(100, 70))
print(currency_converter_2(100, 30))
print(currency_converter_2(200)) # converting with a default rate
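
Arguments can also be passed by name, in which case their order no longer matters. A small sketch with the same function follows; the variable names `by_position` and `by_keyword` are introduced here for illustration.

```python
def currency_converter_2(rub, rate=91.54):
    return round(rub / rate, 2)

by_position = currency_converter_2(100, 70)          # arguments matched by position
by_keyword = currency_converter_2(rate=70, rub=100)  # arguments matched by name
print(by_position, by_keyword)
```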

There can also be functions with no arguments and no `return` statement. Such functions are less common and are usually used to debug code or report progress while a program is running.

In [None]:
def info():
    print(f'The file was downloaded {cnt} times today')

cnt = 17 # some global variable that would be accessed within a function
info() # not using print() because function returns nothing and prints info by itself

When should you define your own function? Sometimes it is just neat to pack lengthy instructions that will be called several times throughout a project under a short name. In other cases we need to define a function in order to pass it to other functions.

E.g. let's write our own function to use with `map()`. Imagine that we have a list of the ages of questionnaire respondents, but some of them wrote their year of birth by mistake. Let's write a function that checks whether an age or a year of birth (YoB) was entered and converts the latter into an age by subtracting it from the current year.

In [None]:
def get_age(number):
    if number > 1000: # checking that number is indeed YoB and not age
        return 2024 - number # if yes then calculate the age
    return number # if no then return number (age) unchanged

answers = [26, 2005, 31, 15, 2003]
print(list(map(get_age, answers)))

In the example above we didn't use `else`. We could, but it would be redundant. When a function hits `return`, it exits, and no other code within that function is executed. That is why we can skip `else` in this case.
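
To check that claim, here is a sketch that puts the two variants side by side. The name `get_age_with_else` is hypothetical and exists only for this comparison.

```python
def get_age(number):
    if number > 1000:
        return 2024 - number
    return number

def get_age_with_else(number):  # same logic, written with an explicit else
    if number > 1000:
        return 2024 - number
    else:
        return number

answers = [26, 2005, 31, 15, 2003]
same_result = list(map(get_age, answers)) == list(map(get_age_with_else, answers))
print(same_result)
```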

We can also call a function within a function. Let's make our example a bit more complicated. Say we are not interested in the age per se, but in whether the respondent is a minor. Let's define a second function that calls `get_age()` inside itself.

In [None]:
def get_age(number): # defining the first function
    if number > 1000:
        return 2024 - number
    return number

def is_minor(age): # defining the second function that does the `minor check`
    if get_age(age) >= 18: # get_age is called first, then its result is compared with 18
        return 'Not minor'
    return 'Minor'

answers = [26, 2007, 31, 15, 2003]
print(list(map(is_minor, answers)))

## Modules

In the future we will use not only built-in Python functions and our own functions, but we will also import `modules`: collections of functions and variables for solving particular problems.

To import a module we use the `import` keyword followed by the module name. Some modules have to be installed first, but not the ones below.

To call a function or use a variable from a module, we type the module's name, then a dot, then the name of the function or variable. All the functions and variables available in a module are listed in its documentation. Below are a few examples.

### Module math

A collection of the most basic math functions and constants. Documentation is [here](https://docs.python.org/3/library/math.html).

In [None]:
import math # importing math

print(math.log(10)) # calling the natural logarithm function from math
print(math.sqrt(10)) # calling square root function
print(math.pi) # calling pi variable

### Module calendar

A collection of basic calendar- and date-related functions and variables. More [here](https://docs.python.org/3/library/calendar.html?highlight=calendar#module-calendar).

In [None]:
import calendar
print(calendar.weekday(2024, 2, 29)) # returns the index of the day of the week for a given date
# indexing starts at 0: Monday is 0, Sunday is 6
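
The index returned by `calendar.weekday()` counts from Monday as 0. If a name is easier to read than a number, the same module offers `calendar.day_name`, which can be indexed with that number. This is a sketch; the printed name depends on the locale settings of the machine, and the variable `day_index` is introduced here.

```python
import calendar

day_index = calendar.weekday(2024, 2, 29)  # 0 is Monday, 6 is Sunday
print(day_index)
print(calendar.day_name[day_index])  # name of that weekday in the current locale
```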

### Module string

String formatting operations and useful string constants. More [here](https://docs.python.org/3/library/string.html?highlight=string#module-string).

Let's do a small example and clean a text of punctuation symbols using the imported variable that contains all of them.

In [None]:
import string
print(string.punctuation) # the punctuation variable contains all basic punctuation symbols

text = "hi, it's me!" # our text to clean

clean_text = '' # initiating an empty string to store a clean text
for symbol in text:
    if symbol not in string.punctuation: # if the symbol is not a punctuation then add it to the clean_text
        clean_text += symbol

print(clean_text) # print the text without punctuation
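
The same cleaning can be sketched in one line with the string method `join()` and a generator expression. This is just an alternative way of writing the loop above, not a replacement for it.

```python
import string

text = "hi, it's me!"

# keep only the symbols that are not punctuation and glue them back together
clean_text = ''.join(symbol for symbol in text if symbol not in string.punctuation)
print(clean_text)
```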

# 3. Working with Files in Python

Reading and writing data to files in Python is pretty straightforward. To do this, you must first open the file in the appropriate mode.

`open()` takes a filename (e.g. some `.txt` file) and a mode as its arguments. `r` opens the file in read-only mode. To write data to a file, pass `w` instead (this overwrites an existing file); to append new text to the end, use `a`. With this syntax, when you have finished working with the file, you need to close it with the `close()` method.

In [None]:
fh = open('test_file.txt', mode='w') # w - write
fh.write('Hi!')
fh.close()

In [None]:
fh = open('test_file.txt', mode='a') # a - append
fh.write('How are you?')
fh.close()

Let's write several lines into a file using `\n`:

In [None]:
# fh stands for "file handle"
fh = open('new_test_file.txt', mode='w')
fh.write('Hi!\n')
fh.write('How are you?\n')
fh.write('Fine! Thank you!')
fh.close()

If you already have some data stored, for example, in a list, you can write it to the file using a `for` loop.

In [None]:
students = ['Anna', 'Maria', 'Alexandra']
fh = open('students.txt', mode='w')
for name in students:
    fh.write(name + '\n')
fh.close()

It is usually better to specify the encoding used for writing/reading the file to avoid encoding problems. `UTF-8` is one of the most commonly used encodings, and Python often uses it by default, although the default depends on the platform. UTF stands for “Unicode Transformation Format”, and the '8' means that 8-bit values are used in the encoding. (There are also `UTF-16` and `UTF-32` encodings, but they are less frequently used than `UTF-8`.)

In [None]:
fh = open('students.txt', mode='r', encoding='utf8') # r - reading
x = fh.read()
fh.close()

In [None]:
fh = open('students.txt', mode='r', encoding='utf8')
for item in fh:
    print(item, end='')
    print('*'*10)
fh.close()

Here’s an example of how to use Python’s `with open(…) as …` pattern to open a text file and read its contents. This is more convenient because you don't need to close the file: the `with open(…) as …` context manager does it automatically.

In [None]:
with open('students.txt') as fh:
    print(fh.read())

In [None]:
with open('students.txt') as fh:
    st = fh.readlines()

for item in st:
    print('Name:', item.strip())
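
To tie files back to nested lists, here is a sketch that writes a list of lists to a text file and reads it back into the same structure. The file name `students_info.txt` is hypothetical, and the file is created in a temporary folder (via the standard `tempfile` module) so that nothing is left on disk afterwards.

```python
import os
import tempfile

students = [['Ivan Ivanov', 2005, 'POLSCI'],
            ['Oleg Sidorov', 2006, 'JOURN']]

with tempfile.TemporaryDirectory() as folder:  # temporary folder, deleted at the end
    path = os.path.join(folder, 'students_info.txt')  # hypothetical file name

    with open(path, mode='w', encoding='utf-8') as fh:
        for name, year, major in students:
            fh.write(f'{name},{year},{major}\n')  # one student per line

    loaded = []
    with open(path, mode='r', encoding='utf-8') as fh:
        for line in fh:
            info_list = line.strip().split(',')  # strip() removes the trailing '\n'
            info_list[1] = int(info_list[1])     # year of birth back to an integer
            loaded.append(info_list)

print(loaded)
```

This simple format breaks if a name itself contains a comma; for real data the standard `csv` module is a safer choice.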