# Week 3: Data Structures and Functions

In today's tutorial, we'll cover:
- Data structures, like lists, dictionaries, and tuples; and
- Functions.

A set of exercises for testing what you've learned in this tutorial will also be made available.

## Lists

So far, we've used variable names to store a single value. However, we sometimes have a _collection_ of related values that we want to store together. Python has a number of built-in collection types, including lists. Lists are _sequences_ of values, possibly of different types, that are stored in order. Python lists are also _mutable_ and _dynamic_, and allow us to access their elements using indices.

### Instantiating a list

Square brackets (`[` `]`) are used to instantiate lists:

In [63]:
no_elements = []
one_element = [1]
lots_of_elements = [1, 2, 3]
lots_of_elements_with_different_types = [1, "two", 3.0]
nested_lists = [1, 2, lots_of_elements_with_different_types]

print(no_elements)
print(one_element)
print(lots_of_elements)
print(lots_of_elements_with_different_types)
print(nested_lists)

[]
[1]
[1, 2, 3]
[1, 'two', 3.0]
[1, 2, [1, 'two', 3.0]]


As shown, we can create an empty list with square brackets that don't have anything between them. Sometimes this is useful for creating a list that we'll add values to later on.

In addition, we can instantiate a list with values. The initial values of the list go inside the square brackets, and are separated by commas. For example, `lots_of_elements` has three elements, or values: `1`, `2`, and `3`. We can also store values of different types in a single list. `lots_of_elements_with_different_types` contains an integer, a string, and a float. Finally, as shown by `nested_lists`, list elements can be lists themselves, creating a _nested_ list.

### Getting the length of a list

We can get the length of a list using the `len` function:

In [None]:
print(len(no_elements))
print(len(one_element))
print(len(lots_of_elements))
print(len(lots_of_elements_with_different_types))
print(len(nested_lists))

While most of these are self-explanatory, it is worth noting how `len` handles nested lists. `nested_lists` contains only three elements, and `len` returns `3`. The last element in `nested_lists` is itself a list: but it still counts as a single element.

### Indexing and slicing

We can access a specific element of a list:

In [None]:
print(lots_of_elements[0])
print(lots_of_elements[2])

The number inside the square brackets is called the _index_. In Python, each element of the list is given an index, starting with `0` for the first element, and incrementing by one for each subsequent element in order.

We can try to access an element that doesn't exist:

In [65]:
print(lots_of_elements[3])

IndexError: list index out of range


As shown, we'll get an error if we try to do that. We can, however, access elements using a negative index:

In [None]:
print(lots_of_elements[-3])

A negative index tells the Python interpreter to count from the _last element_ of the list, where the last element can be thought of as having index `-1`, and the first element as having index `-len(list)`. As with positive indices, we can't access elements that don't exist, and need to stay within the range:

In [None]:
print(lots_of_elements[-4])

While we can access individual elements of a list, it is sometimes useful to access multiple values at once. In Python, we can do this with _slicing_:

In [None]:
some_of_lots_of_elements = lots_of_elements[1:3]
print(lots_of_elements)
print(some_of_lots_of_elements)

In this example, we specify two indices, separated by a colon (`:`), in the square brackets on the first line, to take a _slice_ of the `lots_of_elements` list. The first number specifies the index of the first element that we want to extract, and the second number specifies where the slice stops: the slice will _not_ include this final element.

We can think of the indices being to the left of each element:

![List indices](images/list_indices.png "List indices")

And so, when we take a slice (say `lots_of_elements[1:3]`), it can be illustrated as:

![List indices (slice example)](images/have_a_slice.png "List indices (slice example)")

As shown in the above example, slicing a list will create a new list. This is true even when the slice contains a single element:

In [None]:
print(lots_of_elements[2:3])

When slicing, we can omit either or both of the start and end indices:

In [None]:
print(lots_of_elements[:2])
print(lots_of_elements[1:])
print(lots_of_elements[:])

Omitting the start index means that the slice will start at the beginning of the list, while omitting the end index means that the slice will continue until the end of the list. It follows that omitting both values is the same as taking a slice of the whole list (i.e., `lots_of_elements[:] == lots_of_elements`).
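To make the slicing rules above concrete, here is a small sketch. The `letters` list and the other variable names are made up for illustration, and the `is` operator (which checks whether two names refer to the very same object) has not been introduced in this tutorial.

```python
letters = ["a", "b", "c", "d", "e"]

# The slice starts at index 1 and stops before index 3.
middle = letters[1:3]
print(middle)       # ['b', 'c']

# Omitting an index falls back to the start or the end of the list,
# so these two slices split the list into two parts with no overlap.
first_two = letters[:2]
rest = letters[2:]
print(first_two)    # ['a', 'b']
print(rest)         # ['c', 'd', 'e']

# A full slice is a new list: equal in value, but a separate object.
copy_of_letters = letters[:]
print(copy_of_letters == letters)   # True
print(copy_of_letters is letters)   # False
```

Because the end index is excluded, `letters[:n]` and `letters[n:]` always fit together without repeating or skipping an element.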

We can also specify a _step_:

In [None]:
many_numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(many_numbers[1:8:2])

The slice still starts with the element indicated by the first index, and stops before the second index. The third number specifies the step.

Both the start and end indices, and the step, can be negative:

In [None]:
print(many_numbers[-5:])
print(many_numbers[-5:-1])
print(many_numbers[-5:1:-1])
print(many_numbers[::-1])

Negative start and end indices behave as described for accessing individual elements, where counting begins at the end of the list. A negative step means that the slice will step through the list _in reverse_. As shown, the slice `[::-1]` is shorthand for reversing the whole list.
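As a rough illustration of steps and negative indices, the sketch below uses a hypothetical `digits` list in which each value equals its own index, which makes it easier to see which positions a slice picks out.

```python
digits = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# Start at index 1, stop before index 8, take every second element.
every_other = digits[1:8:2]
print(every_other)   # [1, 3, 5, 7]

# A negative start counts from the end: the last three elements.
last_three = digits[-3:]
print(last_three)    # [7, 8, 9]

# With a negative step we walk backwards, so the start index must be
# larger than the stop index. The stop index is still excluded.
countdown = digits[5:1:-1]
print(countdown)     # [5, 4, 3, 2]

# Omitting both indices with a step of -1 reverses the whole list.
backwards = digits[::-1]
print(backwards)     # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
```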

Finally, when accessing a nested list, we chain together the notation:

In [None]:
print(nested_lists[2][1])

In the example above, we first access the 3rd element of the `nested_lists` list, which is itself a list. We then access the 2nd element of that list.

### Mutating lists

Lists are _mutable_: that is, we can change the values that are stored in the list:

In [None]:
a_few_numbers = [1, 2, 3, 4, 5]
print(a_few_numbers)
a_few_numbers[3] = 12
print(a_few_numbers)

We can also change multiple values at once, using the slice notation described in the last section:

In [None]:
a_few_numbers[2:4] = [30, 31]
print(a_few_numbers)

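All of the notation from the previous section can be used here. When the slice on the left-hand side uses a step other than 1 (e.g., `a_few_numbers[::2]`), the number of new values on the right-hand side _must_ match the number of elements selected by the slice. For a plain slice without a step, the lengths may differ: the selected elements are replaced by the new values, and the list grows or shrinks accordingly.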
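The sketch below shows how slice assignment treats lengths in practice. The `numbers` list is made up for illustration, and the `try`/`except` block, which has not been covered in this tutorial, is only used so that the example keeps running after the error.

```python
numbers = [1, 2, 3, 4, 5]

# A plain slice can be replaced by a list of a different length:
# two elements are replaced by three, so the list grows by one.
numbers[1:3] = [20, 30, 40]
print(numbers)   # [1, 20, 30, 40, 4, 5]

# With a step, the slice selects indices 0, 2 and 4, so exactly
# three new values are needed.
numbers[::2] = [0, 0, 0]
print(numbers)   # [0, 20, 0, 40, 0, 5]

# Supplying the wrong number of values for a stepped slice fails.
mismatch_rejected = False
try:
    numbers[::2] = [9, 9]
except ValueError as error:
    mismatch_rejected = True
    print(f"ValueError: {error}")
```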

Lists in Python are also _dynamic_: we can add and remove values, changing the size of the list:

In [None]:
a_few_more_numbers = [7, 8, 9]
print(a_few_more_numbers)
a_few_more_numbers.append(10)
print(a_few_more_numbers)

Here, we use the `append` function of lists to add an individual element. We can also use `extend` to add a list of elements to another:

In [None]:
a_few_more_numbers.extend([10, 11, 12, 13])
print(a_few_more_numbers)

We can also remove elements from a list:

In [None]:
a_few_more_numbers.remove(10)
print(a_few_more_numbers)
del a_few_more_numbers[3:5]
print(a_few_more_numbers)

`remove` will delete the first occurrence of the specified value, while `del` will remove the specified element or slice. It is important to remember that deleting elements from a list will shift the indices of all the elements that come after the deleted elements.

### List operators and operations

There are a couple of useful operators for lists:

In [None]:
two_lists = [1, 2, 3] + [4, 5, 6]
multiply_lists = [1, 2, 3] * 10

print(two_lists)
print(multiply_lists)

`+` is used to concatenate, or join, two lists together. It follows that `*` will join together a list with copies of itself, for the specified number of times. These operators are useful for adding elements to a list, or for instantiating a list with a known number of elements.

We could use `+` to add elements to a list like this:

In [None]:
a_few_more_numbers = a_few_more_numbers + [14, 15, 16]
print(a_few_more_numbers)

While this appears to be the same as using `extend`, there is a subtle difference that is sometimes important. In this example, using `+`, we're _creating a new list_, and assigning the new list to the `a_few_more_numbers` variable. When we used `extend`, we changed the existing list. We'll explore the impact of this with exercises later on.
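One way to see the difference is to give the same list a second name before changing it. This is only a sketch with made-up names (`shopping`, `same_list`); it assumes nothing beyond the `extend` and `+` behaviour described above.

```python
shopping = ["milk", "eggs"]
same_list = shopping            # a second name for the same list

# extend changes the existing list, so both names see the new item.
shopping.extend(["bread"])
print(same_list)                # ['milk', 'eggs', 'bread']

# + builds a new list, and the assignment points `shopping` at it.
# `same_list` still refers to the old list, which is unchanged.
shopping = shopping + ["jam"]
print(shopping)                 # ['milk', 'eggs', 'bread', 'jam']
print(same_list)                # ['milk', 'eggs', 'bread']
```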

While there are many library functions that work on lists, we'll introduce another two here. The first sorts the list:

In [None]:
some_words = ["the", "quick", "brown", "fox", "jumped", "over", "the", "lazy", "dog"]
print(some_words)
some_words.sort()
print(some_words)

The `sort` function will sort the specified list in place. By default, the list will be sorted in ascending lexicographical order (i.e., dictionary order). You can also sort the list in reverse order:

In [None]:
some_words.sort(reverse=True)
print(some_words)

Finally, we can find the index of the first occurrence of a given value in a list:

In [None]:
print(some_words.index("fox"))

An error will occur if the list doesn't contain the specified value. To check if a given value occurs in the list, we can check using the boolean `in` operator:

In [None]:
print("cat" in some_words)

### Strings as lists

In Python, there is some overlap between lists and strings. You can think of strings as being lists of individual characters. All of the methods for accessing individual elements -- or characters, in the case of strings -- and slices will work with strings as they do for lists.

However, it is important to note that, in contrast to lists, strings are _immutable_: they cannot be changed. That means that none of the functions described in the `Mutating lists` section above apply to strings.
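A short sketch of both points, using a hypothetical `greeting` string. The `try`/`except` block has not been introduced in this tutorial; it is only there so that the example can show the error and carry on.

```python
greeting = "hello, world"

# Indexing and slicing work exactly as they do for lists.
first_character = greeting[0]
last_word = greeting[-5:]
reversed_greeting = greeting[::-1]
print(first_character)      # h
print(last_word)            # world
print(reversed_greeting)    # dlrow ,olleh

# Assigning to an index fails, because strings are immutable.
assignment_rejected = False
try:
    greeting[0] = "H"
except TypeError as error:
    assignment_rejected = True
    print(f"TypeError: {error}")

# To "change" a string, we build a new one from pieces of the old.
capitalised = "H" + greeting[1:]
print(capitalised)          # Hello, world
```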

### Iterating over lists

It is quite common to want to loop over a list, and perform some computation with each element in turn. Python's for loop makes this straightforward:

In [None]:
for word in some_words:
    print(f"The word is: {word}")

We can do this for lists containing any type, including numbers:

In [None]:
for i in many_numbers:
    print(i)

We can see some similarities with this approach - of looping over a list of numbers - with the `range` function introduced earlier in the week. While there is some difference, we can think of `range` as being a function that produces a list with the specified values and length. 
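The difference can be seen by printing a `range` directly. In the sketch below, the built-in `list` function (not introduced above) is used to turn the range into an actual list; the variable names are illustrative only.

```python
numbers = range(1, 6)

# A range is its own type: it remembers its start, stop and step
# rather than storing every value.
print(numbers)              # range(1, 6)

# Converting it with list() produces the values it stands for.
as_list = list(numbers)
print(as_list)              # [1, 2, 3, 4, 5]

# Looping over the range visits the same values in the same order.
total = 0
for number in numbers:
    total = total + number
print(total)                # 15
```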

## Tuples

While we've seen that lists are _mutable_ -- that is, they can be changed after they've been created -- Python provides a similar type that is _immutable_, and so cannot be changed once it has been instantiated. These are called _tuples_ and are instantiated and accessed using syntax that is similar to that for lists:

In [None]:
some_things = ('a', 12, "hello")
print(some_things)
print(len(some_things))
print(some_things[2])

The difference here is that tuples are instantiated using round brackets (`(` and `)`), as opposed to the square brackets used for lists.

Tuples can be accessed using the same syntax and operations described for lists, including using slices, iteration, getting the length of a tuple, and testing if some value is in a tuple. However, since tuples are immutable, the operations and methods for changing, adding, or removing elements (described in the "Mutating lists" section) do _not_ apply to tuples.

## Dictionaries

Lists and tuples are ideal for storing data that is both sequential - that is, needs to be stored in some order - and related. This is because we can use numerical indices and slices to access the data.

However, sometimes we want to store data that does not have a sequential structure. For example, we might have a phonebook, where each person's name is associated with a telephone number. Lists wouldn't be suited to storing this data: we'd have to search through the telephone numbers in order, rather than being able to look up the number for a particular name.

To allow us to store this kind of data, Python has _dictionaries_. In essence, dictionaries store a mapping between a _key_ (something unique that we want to associate with a value - in our example above, this would be the person's name) and a _value_ (anything that we want to associate with the key - again, in our example, this is the person's telephone number). 

We can create a dictionary using this syntax:

In [1]:
some_telephone_numbers = {"Bob": "0123456789", "Alice": "98765434210"}
print(some_telephone_numbers)

{'Bob': '0123456789', 'Alice': '98765434210'}


Here, we've created a dictionary with two keys - "Bob" and "Alice" - with associated telephone numbers.

While we can use anything as a value, there are a few rules that restrict the types that we can use as keys. Importantly, keys need to be unique. For example, we couldn't have two keys called "Alice" in our phonebook. It follows, then, that we must be able to check if something is unique. Python does this using a process called _hashing_. We won't explore this further in this tutorial, but it is important to note that keys must be _hashable_. Finally, given that we want to enforce uniqueness, it is also the case that keys shouldn't be able to be changed: this would prevent us from checking that they are unique. As a result, keys must be _immutable_. That means that strings, integers, and tuples (provided the tuple's own elements are hashable) can be used as keys, but lists and dictionaries (which, as we'll see shortly, are mutable) cannot.
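These rules can be tried out directly. The sketch below uses made-up dictionaries (`grid`, `phonebook`), and a `try`/`except` block, which is not covered in this tutorial, to show the error without stopping the program.

```python
# Tuples are immutable, so they can be used as keys.
grid = {(0, 0): "origin", (1, 0): "east"}
print(grid[(1, 0)])     # east

# Lists are mutable, so using one as a key raises a TypeError.
list_key_rejected = False
try:
    bad_grid = {[0, 0]: "origin"}
except TypeError as error:
    list_key_rejected = True
    print(f"TypeError: {error}")

# Keys are unique: assigning to an existing key replaces its value
# instead of adding a second entry.
phonebook = {"Alice": "111"}
phonebook["Alice"] = "222"
print(phonebook)        # {'Alice': '222'}
print(len(phonebook))   # 1
```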

### Indexing

Just as with lists, we access a specific item in a dictionary by specifying the key in square brackets:

In [2]:
print(some_telephone_numbers["Alice"])

98765434210


If we try to index using a key that isn't in the dictionary, we'll get an error:

In [3]:
print(some_telephone_numbers["Maude"])

KeyError: 'Maude'

We can avoid this in two ways. First, we can check if a key is in a dictionary using the boolean `in` operator:

In [4]:
print("Maude" in some_telephone_numbers)

False


Or, we can use `get`, and specify a default value:

In [7]:
print(some_telephone_numbers.get("Maude", "Call the switchboard"))

Call the switchboard


While this syntax is similar to that for lists, we can't use slicing. This is because dictionaries are used to represent data that isn't sequential: slices wouldn't make sense.
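A few details of `in` and `get` are worth seeing in action. The `stock` dictionary below is a hypothetical example, not part of the phonebook used above.

```python
stock = {"apples": 3, "pears": 0}

# `in` checks the keys, not the values: "pears" is present even
# though its value is 0.
has_pears = "pears" in stock
has_plums = "plums" in stock
print(has_pears)    # True
print(has_plums)    # False

# `get` returns the stored value when the key exists...
apples = stock.get("apples", 0)
# ...the default when it doesn't...
plums = stock.get("plums", 0)
# ...and None when no default is given.
missing = stock.get("plums")
print(apples, plums, missing)   # 3 0 None

# `get` only looks things up: it never adds the key.
print("plums" in stock)         # False
```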

### Mutating dictionaries

Since dictionaries are mutable, we can change the values that are associated with keys, delete a key (and its value) altogether, and add new keys:

In [None]:
some_telephone_numbers["Alice"] = "0123456789"
del some_telephone_numbers["Bob"]
some_telephone_numbers["Maude"] = "9876556789"
print(some_telephone_numbers)

We use the assignment syntax (`=`) to add and modify values, and the `del` operation to remove a key.

### Iterating over dictionaries

There are three functions that are useful for iterating through a dictionary:

In [16]:
print(some_telephone_numbers.keys())
print(some_telephone_numbers.values())
print(some_telephone_numbers.items())

dict_keys(['Bob', 'Alice'])
dict_values(['0123456789', '98765434210'])
dict_items([('Bob', '0123456789'), ('Alice', '98765434210')])


`keys` provides a list-like view of the keys that are present in the dictionary; `values` gives a view of the values; and `items` gives a view of tuples, where the first element is the key, and the second is the associated value. These are all ordered by the insertion order of the keys: so, in the output above, "Bob" was added to the phonebook before "Alice", and so the dictionary, and the views, are in that order.

When we want to iterate over a dictionary, we can use the for loop:

In [17]:
for name in some_telephone_numbers:
    print(f"You can phone {name} on {some_telephone_numbers[name]}.")

You can phone Bob on 0123456789.
You can phone Alice on 98765434210.


As you can see, by default, the for loop operates over the list of keys. We can then use each key to access the related value in turn.

The same approach works for any dictionary:

In [18]:
csc_dict = {"CSC104": "Intro. to Number Theory", "CSC103": "Programming Concept I", "CSC105": "Logic Gates"}

In [20]:
for t in csc_dict:
    print(csc_dict[t])

Intro. to Number Theory
Programming Concept I
Logic Gates


We could also iterate over the results of the other functions, including `items`:

In [None]:
for name, phone_number in some_telephone_numbers.items():
    print(f"You can phone {name} on {phone_number}.")

This example also shows us Python's special syntax for iterating over lists of tuples. We can split the tuple automatically: here, the first element of the tuple is assigned to the first variable in the loop (`name`), and the second element is assigned to the second variable (`phone_number`).
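Splitting a tuple into separate variables is not specific to dictionaries or loops. The sketch below uses made-up contact data to show the same idea in an ordinary assignment and in a loop over a list of tuples. The `try`/`except` block, not covered in this tutorial, is only there to show what happens when the number of variables doesn't match.

```python
# Unpacking in a plain assignment: one variable per tuple element.
contact = ("Bob", "0123456789")
name, phone_number = contact
print(name)             # Bob
print(phone_number)     # 0123456789

# Unpacking in a loop over a list of tuples.
contacts = [("Bob", "0123456789"), ("Alice", "98765434210")]
lines = []
for name, phone_number in contacts:
    lines.append(f"{name}: {phone_number}")
print(lines)            # ['Bob: 0123456789', 'Alice: 98765434210']

# The number of variables must match the number of elements.
unpacking_rejected = False
try:
    name, phone_number = ("Maude", "555", "extra")
except ValueError as error:
    unpacking_rejected = True
    print(f"ValueError: {error}")
```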

## Functions

In Python, a function is a named block of code that can be _called_ anywhere in your program. When a function is _called_, the named block of statements is executed sequentially. That means that the flow of execution of the program is passed to the function when it is called, and then returned to the place the function was called once the function is finished. We can define, and call, a function like this:

In [None]:
def say_hello():
    print("Hello!")
    
say_hello()

In this example, we define a function called `say_hello` that contains a block of statements (there is only one statement in the block, `print("Hello!")`), and we then call the `say_hello` function.

There are a number of important points to note. First, the definition of the function does not alter the flow of the program. The Python interpreter still interprets the code line-by-line, in order. When it sees the function definition (beginning `def ..`), it does not evaluate the lines contained in the definition, but rather saves them, labelled with the name that we've given the function.

Second, we can see that the function is given a name. The naming rules for functions are the same as for variable names: they can contain letters, numbers, and underscores (`_`), but must begin with a letter or underscore.

Finally, we can see that the function is called by giving the function's name (`say_hello`), followed by a set of brackets (`()`). These brackets indicate that the name is a function, rather than a variable or other keyword. In addition, we'll see that we can _parameterise_ functions later on, and we'll pass arguments to the function by putting them in these brackets.



### Parameterising functions

Just as `if` statements allowed our code to branch and vary depending on the value of data in our program, functions can also be made more useful by being expressed in terms of a set of _parameters_. This means that we can define our function in terms of a set of variable names that are given values when the function is called. For example:

In [None]:
def say_hello(name):
    print(f"Hello, {name}!")
    
say_hello("Bob")
say_hello("Alice")

In this example, we've defined the `say_hello` function in terms of the `name` parameter. We use the variable name `name` in the definition of `say_hello`, but the value of this variable isn't known until the function is called. We can see the function being called twice with different names, and the appropriate message being printed given the argument that is passed to the function. It is important to note the different terms used here: a _parameter_ is the variable name we use when we define the function, while an _argument_ is the value we set those parameters to when we call the function.

A function can have multiple parameters:

In [None]:
def divide(x, y):
    return x / y

print(divide(4, 2))

In this example, there are two parameters, `x` and `y`, that are set to `4` and `2` respectively, when the function is called. Each parameter is set to the value given in the function call in the order that is given in the function definition. These are called _positional arguments_, given that they are assigned their value based on their position in the definition. That means that the order in which we provide the arguments matters:

In [None]:
print(divide(2, 4))

Sometimes, trying to remember the order in which arguments need to be provided can be confusing. If we get the order wrong, as in the above example, we can end up with unexpected results. To overcome this, we can use _named arguments_:

In [None]:
def divide(dividend=1, divisor=1):
    return dividend / divisor

print(divide(4, 2))
print(divide(dividend=4, divisor=2))
print(divide(divisor=2, dividend=4))
print(divide())

Here, we redefine our `divide` function. This time, we've given the parameters more descriptive names: the first is called `dividend`, while the second is called `divisor`. We've also given each parameter a default value using the assignment syntax: if we don't pass an argument for a parameter, then its default value is used. As the example shows, we can call the function in the same way as we did before, in which case the arguments are set in the order they are given. However, we can also see that, by naming the arguments in the call, we can switch the order and still get the results that we expect (Python's documentation calls these _keyword arguments_). Finally, we see what happens when we rely on the default values.

Named arguments make calls easier to read, and default values are useful where parameters are optional and can sensibly have defaults. When defining a function, parameters with default values need to come last in the list of parameters, after any parameters without defaults.
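As a small sketch of a function that mixes a required parameter with an optional one, consider the hypothetical `power` function below (it is not part of the tutorial's earlier examples).

```python
def power(base, exponent=2):
    # `base` has no default, so it must always be supplied;
    # `exponent` is optional and defaults to 2 (squaring).
    return base ** exponent

squared = power(3)                   # exponent falls back to 2
cubed = power(3, 3)                  # both set by position
by_name = power(exponent=3, base=2)  # named, so order doesn't matter
mixed = power(5, exponent=1)         # positional first, then named

print(squared)   # 9
print(cubed)     # 27
print(by_name)   # 8
print(mixed)     # 5
```

In a call, any positional arguments have to come before the named ones, as in the last example.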

### `return`

In our example functions so far, our functions have "done something" (i.e., printed a message). However, function calls in Python are also expressions, so they evaluate to a value:

In [None]:
print(say_hello("Bob"))

When we execute this example, we'll see two lines: the first is the message printed by the `say_hello` function, while the next line will read `None`. That is because we're printing out the value that the function call evaluates to. We can `return` a value from our function, and that is the value that the call will evaluate to. However, by default, functions return `None` if no other value is given. For example, instead of printing the message, our function might return it:

In [None]:
def say_hello(name):
    return f"Hello, {name}!"

say_hello_bob = say_hello("Bob")
say_hello_alice = say_hello("Alice")

print(say_hello_alice)

In this example, we create two variables `say_hello_bob` and `say_hello_alice` that contain a string generated by the `say_hello` function. Rather than printing a message, the function constructs a string and `return`s it. This means that `say_hello_bob` contains the string `Hello, Bob!`, and `say_hello_alice` contains the string `Hello, Alice!`. We can then go on to print these variables if we want, as we do with `say_hello_alice`.

This can also be illustrated using functions that work with numbers:

In [None]:
def plus_one(x):
    return x + 1

print(1 + plus_one(3))

In this example, we define `plus_one`, that takes a number, `x`, and returns that number plus 1. We then print the expression `1 + plus_one(3)`, which contains a call to the `plus_one` function. `plus_one(3)` evaluates to `4`, and `1 + 4` evaluates to `5`, which is the value that is printed.

Flow returns to the place where the function was called as soon as the `return` statement is executed, wherever it is placed in the function definition. For example:

In [None]:
def return_in_the_middle():
    print("hello")
    return
    print("world")
    
return_in_the_middle()

In this example, when the `return_in_the_middle` function is called, `hello` is printed before the function returns. The second print statement (`print("world")`) is never executed, because the `return` statement is before it.
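Returning early is useful in practice, for example to stop searching as soon as an answer is found. The `first_negative` function below is a hypothetical illustration of that pattern, combining `return` with the for loop and `if` statement seen earlier.

```python
def first_negative(numbers):
    for number in numbers:
        if number < 0:
            # Leave the function immediately: the rest of the list
            # is never looked at.
            return number
    # Only reached if the loop finished without finding anything.
    return None

found = first_negative([3, -1, 8, -5])
not_found = first_negative([1, 2, 3])

print(found)        # -1
print(not_found)    # None
```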

## Scoping rules

The _scope_ of a name (e.g., for a variable or a function) in our program is the area in the program where we can access that name. Once we define a name, it'll only be accessible within the scope in which it is defined. Scoping rules are needed so that we can reason about which name we're accessing and which names are accessible at a given time.

Throughout this track, we'll use two scopes: the _global scope_, where names that are defined are available everywhere in the code, and _local scope_, where names that are defined are only available within that scope.

Scoping rules can be difficult to understand, so we'll illustrate them with some examples. First, consider variables created inside a function definition:

In [None]:
def say_hello(name):
    message = f"Hello, {name}!"
    return message

say_hello("Bob")
print(message)

This will result in an error: `message` is defined inside the scope of the `say_hello` function, and is therefore not accessible outside it. This is useful if we want, for example, to use the same name in multiple scopes:

In [None]:
def say_hello(name):
    message = f"Hello, {name}!"
    return message

def say_goodbye(name):
    message = f"Goodbye, {name}!"
    return message

print(say_hello("Alice"))
print(say_goodbye("Alice"))

In this example, we have two sets of variables that have the same name: `name` and `message` are both used in the definitions of `say_hello` and `say_goodbye`. Python's scoping rules allow us to define these functions in this way, because there is no ambiguity about which name refers to which value. A fresh local scope is created for each function call. For example:

In [None]:
print(say_hello("Bob"))
print(say_hello("Alice"))

In this example, we call the `say_hello` function twice, each time with a different argument. This creates a new instance of both the `name` and `message` variables, within their own scope, each time the function is called.

Finally, we can also create variables in the global scope. For example:

In [None]:
time_of_day = "morning"

def say_hello(name):
    message = f"Good {time_of_day}, {name}!"
    return message

print(say_hello("Alice"))

Here, we have defined a variable, `time_of_day`, in the global scope. That means that this variable is available throughout the code, anywhere after it has been defined. This is useful for sharing data between different functions. However, it is easy to misuse the global scope: you should consider whether or not data might be better passed as an argument to the function.

We might define variables with the same name in different scopes:

In [None]:
message = "hello, bob!"
time_of_day = "morning"

def say_hello(name):
    message = f"Good {time_of_day}, {name}!"
    return message

print(message)
print(say_hello("Alice"))

In this example, we've defined two `message` variables: the first is in the global scope, while the second, set in the `say_hello` function, is in its local scope. Python's scoping rules mean that statements and expressions will refer to the name in the closest scope, preferring the local and then global scope. However, while these rules allow your code to run unambiguously, you should generally avoid reusing variable names in different scopes: it can make your code more difficult to read and debug.
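One consequence that often surprises people is that assigning to a name inside a function creates a local variable, even when a global variable with the same name exists. The sketch below uses made-up names (`counter`, `reset_counter`) to show that the global value is left untouched.

```python
counter = 0

def reset_counter():
    # This assignment creates a new local variable called `counter`;
    # it does not change the global one.
    counter = 100
    return counter

inside = reset_counter()
print(inside)    # 100: the local value returned by the function
print(counter)   # 0: the global variable is unchanged
```

If the function is meant to produce a new value for the global variable, returning it and assigning the result (e.g., `counter = reset_counter()`) keeps the flow of data explicit.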

## Using functions to improve your code

Now that we've been introduced to functions and the scoping rules that apply to them, we can see that there might be a number of reasons to make use of functions in our code:
- To split our code into blocks that are easier to write, and to debug later;
- To make it easier to reuse code, within the same program, across different programs, and across different projects;
- To make use of the scoping rules that give us a new _local scope_ to work within.

In general, as you write longer programs, you should be thinking about dividing your code into functions. Broadly, we want to write functions that are small (on the order of tens of lines of code) and that do one thing. In the problem set for this tutorial, we'll explore when and how to write useful functions.




## Summary

In this tutorial, we've recapped a number of Python concepts, including:
- Data types;
- Formatted strings;
- Getting input from the keyboard;
- If statements;
- Loops and iteration;
- Data structures, like lists, dictionaries, and tuples; and
- Functions.

In [4]:
import random
A_Z = [chr(65+i) for i in range(26)]
a_z = [chr(97+i) for i in range(26)]
special_characters = [chr(33+i) for i in range(15)]

# print(f"{A_Z}")
# print(f"{a_z}")
# print(f"{special_characters}")

def generate_random_string():
    random_string = "".join([random.choice(A_Z + a_z + special_characters) for i in range(16)])
    return random_string


print(generate_random_string())

niq!zUc"GkY(K$Pb


In [7]:
sunaye = ["Hajara ","Umaru ", "Ibrahim "]
print("".join(sunaye))

Hajara Umaru Ibrahim 

In [13]:
random.choice(sunaye)

'Ibrahim '
