# MTH4000 Programming in Python - Lecture 2
Module Organisers: Dr Matthew Lewis and Prof. Thomas Prellberg

## Modules and Packages

Last week, we defined a *module* to be a file that contains code that is readable by Python.  We further explored this concept by detailing examples of several key modules that we shall be using throughout this course, such as `math` and `numpy`.  

Modules are useful because they allow us to quickly define variables, functions and data types without having to rewrite their code each time.  The `math` module, for instance, contains an implementation of the cosine function, and accessing this function is far more straightforward than writing new code from scratch.  It can be done instantly with the `import` keyword.

In [None]:
import math

math.cos(0.0)

More specifically, modules are text files that end with the file extension `.py`.  The text inside these files is Python code, and importing a file simply tells the interpreter to run this code in the active code environment.  Below, we can see the contents of the file `keyword.py` that we imported last week.

<div>
<img src="attachment:Python%20modules.png" align="left" width="450"/>
</div>

<div>
<img src="attachment:Python%20keyword%20module.png" align="right" width="450"/>
</div>
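As a quick check of this idea, the sketch below imports `keyword` and inspects it.  It assumes a standard Python installation, where `keyword` is stored as an ordinary `.py` file; the name `module_path` is only illustrative.

```python
import keyword

# Location of the file whose code was run when the module was imported.
module_path = keyword.__file__
print(module_path)

# kwlist and iskeyword are both defined inside keyword.py.
print(len(keyword.kwlist))           # how many reserved words Python has
print(keyword.iskeyword("import"))   # "import" is one of them
```

The exact path printed will differ from one computer to another.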

Python also contains folders called *packages*, which are collections of useful modules all bundled together.  The modules inside packages are normally relevant to each other, and often import each other's content.  Below is a filespace containing several packages, some of which should appear familiar.

<div>
<img src="attachment:Python%20packages.png" align="center" width="500"/>
</div>

*Note that even though `numpy` and `matplotlib` are officially packages, we will still mostly refer to them as "modules" for convenience.*

Packages can either be imported all at once:

In [None]:
import numpy

Or a single module can be imported from a package:

In [None]:
import matplotlib.pyplot

Here we have imported the `pyplot` module from the `matplotlib` package.

<div>
<img src="attachment:Python%20matplotlib%20modules.png" align="center" width="500"/>
</div>

Note that it may soon prove frustrating to type out the full dotted name every time we wish to use a function inside the `matplotlib.pyplot` module:

In [None]:
matplotlib.pyplot.plot([-2,-1,0,1,2],[4,1,0,1,4])
matplotlib.pyplot.show()

Fortunately, Python allows us to assign new names to imported modules.  This is done via the `as` keyword.

In [None]:
import numpy as np
import matplotlib.pyplot as plt

x=np.linspace(-2,2,5)
plt.plot(x,x**2)
plt.show()

In fact, we can completely bypass the need to prefix each variable with the relevant module name by importing specific variables from a module.  This can be done with the `from` keyword.

In [None]:
from math import pi

pi

We can even import several variables at once.

In [None]:
from math import e, floor

floor(e)

The `as` keyword can also be used here, allowing us to rename the variable `pi` to something a bit more contentious.

In [None]:
from math import pi as sqrt2
# Unacceptable (but technically permissible)

sqrt2

Python even allows us to use this feature to import *everything* at once.  This is done by using an asterisk instead of a variable name, like so:

In [None]:
# from math import *

This is, however, generally considered a bad idea.  There may be functions from very different modules that happen to have the same name, and you would never be entirely certain which one you are calling.  (It would have to be the last function to be imported, but it's extremely easy to lose track of the order in which modules have been imported).
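The same kind of clash can be reproduced on a small scale with explicit imports.  The sketch below assumes the standard-library module `cmath` (not otherwise used in this lecture), which, like `math`, defines a function called `sqrt`.

```python
from math import sqrt    # square root that returns a float
from cmath import sqrt   # square root that returns a complex number

# The second import silently replaced the first, so this calls cmath.sqrt.
result = sqrt(4)
print(result)
print(type(result))
```

Nothing warns us that the name `sqrt` was overwritten; with `import *` this can happen for dozens of names at once.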

Consider, for example, the cosine function.  The `math` module contains an implementation of this function, but it only works for numerical inputs, not lists or arrays.

In [None]:
# This does not work!
# print(math.cos([0,pi/2,pi]))

The `numpy` module also contains an implementation of cosine, but this version *does* allow for the input to be a list.

In [None]:
# This does work!
print(np.cos([0,pi/2,pi]))

For this reason, importing variables en masse should be done sparingly.

Note that we could also retain the `math` version and use the code:

In [None]:
# This also works, but is a bit more laboured.
print([math.cos(0), math.cos(pi/2), math.cos(pi)])

But you see that even the output looks different. There is a reason for that, and we will come back to this later. For now, I would like to recommend that you do ***not*** use import in a way that the reference to the module gets lost.

This section shows how versatile Python is when it comes to importing modules.  But good code discipline remains important, and just because `from math import pi as sqrt2` is valid code, it does not mean that it's convenient or desirable.  In general, it's far safer to stick to the conventional (and fairly standard) way of importing these modules:

In [None]:
import math
import numpy as np
import matplotlib.pyplot as plt

## Input/Output

### Default cell output versus `print()`

As mentioned last week (and used above), the `print` function can be used to write information directly underneath a code box.  Unlike calling a variable, a `print` statement does not have to be the final piece of code to be executed in order for the value to be displayed; the `print` function can be called anywhere inside the code.

In [None]:
n=5
# This is some value.

print(n)
# I can print this value.

2*n
# Even if I ask for something else to be computed, the value above is still printed.

We have already noted the difference between a value being printed and a value being returned.  The value $2n$ has been called in the code box above, and since this call is the last piece of code to be executed in the box labelled "<span style="color:blue">In [7]:</span>", the result is given the serialised read-out label "<span style="color:red">Out [7]:</span> 10".  This is in contrast to the value $n$ which was only printed, and so does not have a read-out label.

Compare the four following boxes, which all behave differently. Notice the differences and make sure you understand the reason why they are different.

In [None]:
pi
2*pi

In [None]:
print(pi)
2*pi

In [None]:
pi
print(2*pi)

In [None]:
print(pi)
print(2*pi)
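One way to make the distinction between printing and returning concrete is to store whatever `print` hands back.  This is only a small sketch; the names `returned_by_print` and `doubled` are illustrative.

```python
n = 5

# print displays the value, but the value it returns is None.
returned_by_print = print(n)
print(returned_by_print)

# An expression, by contrast, produces a value that can be stored and reused.
doubled = 2 * n
print(doubled)
```

This is why a box ending in a `print` call has no "Out" label: there is no returned value to show.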

### The `input` Function

It is sometimes necessary for a script to prompt the user for further information while the code is still being run.  For example, suppose we want to find the sum of two integers, `a` and `b`.  We could use the following code:

In [None]:
a=10
b=5
print(a+b)

This works perfectly well, but if we want to change the values of `a` and `b`, then we would need direct access to this script.  This is not a problem here, since we can simply edit the box above.  Consider, however, a situation where this script was being run from another file, possibly as the result of an import.  How then could we let the user decide on the values of `a` and `b`?

Such functionality can be achieved with the `input` function.  The `input` function prompts the user to submit a text string, and then returns this text string back to the interpreter.  If the result of the `input` function is assigned to a variable, then this variable can be used for further computations, like so:

In [None]:
txt=input("How are you today?\n")

print("\nThe user has said they are: ",txt)

Note that there is an optional argument allowing you to print a text string to guide the user in how they should respond to this prompt.

For our example above, we would need the value returned to be an integer, not a string.  We can use the function `int` to perform this conversion.

In [None]:
a=int(input("Enter the  first number: "))
b=int(input("Enter the second number: "))
a+b

Note that the values were returned to the interpreter as strings, but the `int` function converted them into integers, allowing them to be summed.
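To see why the conversion matters, here is a sketch that does not wait for user input.  The strings `first` and `second` are stand-ins for what `input` would have returned.

```python
# Two strings, standing in for the values returned by input().
first = "10"
second = "5"

# Adding two strings joins them together.
print(first + second)

# Converting to integers first gives the arithmetic sum.
total = int(first) + int(second)
print(total)
```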

In [None]:
s=input()
print("You have entered",s,"which is of type",type(s))
a=int(s)
print("This is converted to",a,"which is of type",type(a))

Can you guess which function you would have to replace `int` with in order to get a conversion to a floating point number?

## Sequences

Last week, we discussed numerical data types and introduced the use of variables in Python. We will now learn how to combine data in sequences and how to use these sequences.

Built into Python are [sequence types](https://docs.python.org/3/library/stdtypes.html#sequence-types-list-tuple-range): list, tuple, and range. The main differences between these sequence types are in how they are generated and stored by Python.

* Lists (denoted `list`) e.g. [1,2,-4], [-3.2,1.0,2.9], ['hello','world']
* Tuples (denoted `tuple`) e.g. (1,2,-4), (17.1, 'spam', 93)
* Ranges (denoted `range`) e.g. `range(5)`, `range(2,7)`, `range(8,1,-1)`
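As a brief sketch, we can build one object of each kind and ask Python what it is.  The variable names are illustrative, and the built-in `len` function (which returns the number of entries in a sequence) is assumed here without further introduction.

```python
a_list = [1, 2, -4]
a_tuple = (17.1, 'spam', 93)
a_range = range(2, 7)

# Each object reports its own type, and len counts its entries.
print(type(a_list), len(a_list))
print(type(a_tuple), len(a_tuple))
print(type(a_range), len(a_range))
```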

## Lists

Lists are objects that are typically used to store collections of homogeneous items (i.e. every element should have the same data type).  A list is created with square brackets `[ ... ]`, with items in the list separated by commas. A list of the squares of the first six positive integers is created by the following code. 

In [None]:
[1,4,9,16,25,36]

As with all other data types, lists can be assigned to a variable and called.

In [None]:
squares=[1,4,9,16,25,36]
squares

The `type` function recognises lists.

In [None]:
type(squares)

Lists can contain other data types, including other lists.

In [None]:
points=[[0,1],[2,3],[5,-1],[-2,-7]]
print(points)

There is a good reason why I have used the `print` function above. As we saw in the last lecture, an assignment does not produce any output.

In [None]:
more_squares=[1,4,9,16,25,36,49,64]

Just because the box above gave no output, it does not mean that the code was not run.  Indeed, simply calling the variable `more_squares` returns to us our list.

In [None]:
more_squares

Alternatively, we could display our list using the `print` function. Remember, this displays the value, but it does not return it, so no "Out" label is produced. 

In [None]:
print(more_squares)

Moreover, `print` often formats values more neatly than default cell output. Which of the two formattings below would you prefer?

In [None]:
long_list=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,\
           21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,\
           41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60,\
           61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80,\
           81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100]
print(long_list)
long_list

### Indexing Lists

The main functionality of sequence data types is that their individual elements can be accessed by referring to their index.  The elements of a list, for instance, can be returned by writing the name of the list and following it with an index value contained between a pair of square brackets.

The only subtlety here is that **Python indexes all sequences starting from zero**.  So if you wish to obtain the first entry of a list, you must ask for the **zeroth entry**, not the first!

![Screenshot%202023-09-27%20153732.png](attachment:Screenshot%202023-09-27%20153732.png)

In [None]:
primes=[2,3,5,7,11,13,17,19,23,29,31]
print(primes[0])

In [None]:
print(primes[1])
print(primes[6])
print(primes[10])

A peculiarity of Python (when compared to other programming languages) is that *negative* indices allow you to access elements starting from the *end* of the list, with the last entry of the list having index -1.

In [None]:
print(primes[-1])
print(primes[-2])
print(primes[-3])
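Negative indices can be related back to ordinary ones.  In the sketch below, which assumes the built-in `len` function for the number of entries, index `-k` picks out the same entry as index `n-k`, where `n` is the length of the list.

```python
primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
n = len(primes)   # number of entries in the list

# Index -k refers to the same entry as index n-k.
print(primes[-1], primes[n - 1])
print(primes[-3], primes[n - 3])
```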

### Operations on List Entries

Just as we can perform arithmetic operations on variables, so too can we perform these computations by calling entries in a list.  For instance:

In [None]:
print(primes[0]*primes[1])

In [None]:
print(primes[-1]-primes[0])

In [None]:
print(3*squares[3]*primes[4])

In fact, the entries of the list `primes` are integers, and so anything that we can do with integers, we can equivalently do with the entries of this list.

In [None]:
type(primes[7])

A similar property holds for lists of lists.  Consider our list `points` for example:

In [None]:
points

If we ask for the type of an entry:

In [None]:
type(points[1])

We are correctly informed that `points[1]` is a list.  We can therefore perform any operation on `points[1]` that we could perform on any other list, including indexing.

In [None]:
points[1][0]

We can print off each step of this computation to clarify what has happened here:

In [None]:
print(points) # This calls the overall list.
print(points[1]) # This calls the entry at index 1, which is another list.
print(points[1][0]) # This calls the entry at index 1, which is another list, and then calls the entry of that list at index 0.

### Mutability

Lists are mutable objects.  This means that we can assign new values to individual entries in a list, leaving the others unchanged.  This is done by calling the list for a certain index value, and then using the standard assignment operator `=` as if we were defining a standard variable.

In [None]:
primes[1]=4
print(primes)

Note that the list `primes` has been permanently changed.  The original `primes` is no longer in the system memory.

### Concatenate

Lists can be easily concatenated (joined together) by using the addition operator `+`.

In [None]:
fibonacci=[1,1,2,3,5,8,13,21,34,55,89,144,233]

print(fibonacci+squares)
print(squares+fibonacci)
print(squares+squares)

Note, however, that in each of these cases, a new list has been returned to us.  The original lists, `fibonacci` and `squares`, have remained unchanged.

In [None]:
print(fibonacci)
print(squares)

### Append

We can also append entries to a list (that is, add new entries to the end of the list).  This is done with the `append` method.

In [None]:
print(squares)
squares.append(49)
print(squares)

The `append` method modifies the list `squares` by permanently assigning it this new entry.  Note that this is different to evaluating the list `squares+[49]`, which would have returned the same list to us, but would have left the original list `squares` unchanged.

In [None]:
squares=[1,4,9,16,25,36]
print(squares+[49])
print(squares)

The list `squares+[49]` is created as a completely new list in the box above, and since it was not assigned to any variable, it was immediately forgotten by the interpreter.
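The two approaches can be placed side by side.  This is a minimal sketch; the names `longer` and `result` are illustrative.

```python
squares = [1, 4, 9, 16, 25, 36]

# Concatenation builds a new list; assigning it to a name keeps it.
longer = squares + [49]
print(longer)
print(squares)    # the original list is unchanged

# append changes the existing list in place and returns None.
result = squares.append(49)
print(result)
print(squares)    # the original list now has the extra entry
```

Because `append` returns `None`, writing `squares=squares.append(49)` would lose the list altogether.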

The use of `squares.append()` is an example of an object method. Any variable of type `list` has a variety of functions (called *methods*) associated with it, that can be used to modify the data contained in the variable. There are many more [list methods](https://docs.python.org/3/tutorial/datastructures.html#more-on-lists), and you will encounter some of these in the tutorial. We will come back to this later.

## Tuples

Tuples are objects that are typically used to store collections of heterogeneous data (i.e. elements may have different data types).  As opposed to lists, which are created with square brackets `[ ... ]`, tuples are created with normal parentheses `( ... )`.

In [None]:
today=(2023,'January',30)
print(today)

### Indexing Tuples

Indexing for tuples is analogous to list indexing.

In [None]:
print(today[0])
print(today[1])
print(today[2])

### Operations on Tuple Entries

As with lists, if two entries in a tuple are of a data type that allows for a certain operation to be performed on them, then indexing allows us to perform computations directly on tuple entries.

In [None]:
today[0]+today[2]

### Immutability

The key difference between lists and tuples is that tuples are immutable.  That is, the entries of a tuple cannot be overwritten by assigning a new value to an index call.

In [None]:
# This does not work!
# today[2]=31
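If we want to see the refusal without halting the notebook, one option is to catch the error.  The sketch below uses a `try`/`except` block, which has not been covered in this course so far and is assumed here purely for illustration.

```python
today = (2023, 'January', 30)

try:
    today[2] = 31              # attempt to overwrite an entry of the tuple
except TypeError as error:
    message = str(error)
    print("Python refused:", message)

print(today)                   # the tuple is unchanged
```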

## Ranges

Ranges are objects that represent sequences of integers with a constant step-size (i.e. arithmetic progressions).  Unlike lists and tuples, they cannot be manually constructed by writing out each of their elements.  They can only be constructed using the built-in `range` function.

The `range` function accepts up to three arguments; `range(a,b,step)`.  The value `a` is the start value; the first integer to be returned by this object.  The value `b` is the stop value; for a positive step-size, any value greater than or equal to `b` is not returned by this object.  The value `step` is the step-size; the difference between consecutive values returned by this object.  Each argument, `a`, `b`, `step`, must be an integer.

We can attempt to print an example of such an object:

In [None]:
digits_range=range(0,10,1)
print(digits_range)

But we find that Python is surprisingly unhelpful: it simply echoes back the definition of the range.  This is because of how the `range` object is computed.  Python does not compute *every* value inside the object at the point of creation.  In fact, Python doesn't compute *any* of the values inside the object until they are explicitly requested.  The only values it stores are the start, stop and step values specified by the user.

The advantage of this is memory.  Python only needs to remember three values: the start value, the stop value and the step-size.  This is true regardless of whether the full progression should have 3 entries or 30000000.  The price we pay for this efficiency is that the entries themselves are never stored, and so printing a `range` object does not display them.
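The memory claim can be checked directly.  The sketch below assumes the standard CPython interpreter and uses `sys.getsizeof` from the standard library, which reports the size of an object in bytes; the names `small` and `large` are illustrative.

```python
import sys

small = range(3)
large = range(30000000)

# Both objects store only start, stop and step, so they occupy the same space.
print(sys.getsizeof(small))
print(sys.getsizeof(large))

# The length is worked out from start, stop and step, without listing the entries.
print(len(large))
```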

We can, however, convert them to lists using the built-in `list` function, and *then* print them out.

In [None]:
list(digits_range)

The fact that there is a built-in `list` function is an important reason why you should never assign your lists the variable name `list`.
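To illustrate what goes wrong, the sketch below reuses the name `list` inside a small function, so that the damage stays contained.  The function `shadowing_demo` is hypothetical, and both `def` and `try`/`except` are assumed here for illustration only.

```python
def shadowing_demo():
    # Deliberately bad practice: inside this function, 'list' now means our data.
    list = [1, 4, 9]
    try:
        list(range(3))             # no longer the built-in conversion function
    except TypeError as error:
        return str(error)

message = shadowing_demo()
print(message)

# Outside the function the built-in is untouched and still works.
print(list(range(3)))
```

In a notebook, assigning to `list` at the top level of a cell would break the conversion for every later cell until the variable is deleted or the kernel is restarted.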

Note that if we do not specify a step-size, Python provides a default step-size of $1$.  Similarly, if no start value is provided, a default value of $0$ is assumed.  For this reason, we could have written the above as:

In [None]:
digits_range=range(10)
list(digits_range)

We can use `range` objects to generate all even numbers from 0 up to, but not including, 100.

In [None]:
even_range=range(0,100,2)
even_list=list(even_range)
print(even_list)

Note that since $100$ is not less than $100$, it was not included in the range.

We can also create range objects with negative step-size.

In [None]:
print(list(range(20,0,-1)))

Even though printing a range does not display its entries, they can still be called via indexing.

In [None]:
range(10,20,3)[2]

(This does, however, mean that the entries have to be recomputed every time).
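The recomputation is cheap, since the entry at index `k` of an arithmetic progression is simply the start value plus `k` times the step-size.  A short sketch of this, with illustrative variable names:

```python
start = 10
stop = 20
step = 3
r = range(start, stop, step)

k = 2
print(r[k])               # entry at index k
print(start + k * step)   # the same value from the arithmetic-progression formula
print(list(r))            # all entries, for comparison
```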

Just like tuples, ranges are immutable.

In [None]:
# This does not work.
# range(10,20,3)[2]=17

## Unpacking

A nice feature of Python is that one can assign all entries of a sequence to variables in a single operation. This is known as *unpacking* the sequence. For example, for the tuple `today` defined above, we can assign the contents to `year`, `month`, and `day` as follows:

In [None]:
year,month,day=today
print(year)
print(month)
print(day)
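Unpacking is not restricted to tuples.  As a sketch with illustrative names, the same syntax works for lists and ranges, provided the number of variables on the left matches the number of entries in the sequence.

```python
point = [2, 3]
x, y = point                               # a list with two entries
print(x, y)

first, second, third = range(10, 25, 5)    # a range with three entries
print(first, second, third)
```

If the counts do not match, Python raises an error rather than guessing.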

## List Comprehensions

So far, we have seen that we can create lists in Python by writing them out by hand, which is inefficient, or by using ranges, which is efficient but restricted to equally-spaced integers. What if we wanted to generate a list of the first 71 square numbers? There is a nice construction in Python called [list comprehensions](http://docs.python.org/3/tutorial/datastructures.html#list-comprehensions) that allows us to do just this:

In [None]:
print([n**2 for n in range(1,72)])

We have just generated a list using the syntax "\[*expression* **for** *item* **in** *iterable*\]", where 

* *iterable* is a `range`, `list`, `tuple`, or any other kind of sequence object
* *item* is a variable name which sequentially takes each value in the iterable
* *expression* is a Python expression which is evaluated for each value of *item*
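It may help to see what a list comprehension is shorthand for.  The sketch below builds the same list twice, once with a comprehension and once step by step using a `for` loop together with `append`; loops have not been covered in this course so far and are assumed here only for illustration.

```python
# List comprehension form.
squares_a = [n**2 for n in range(1, 6)]

# Equivalent step-by-step form: start with an empty list and append each value.
squares_b = []
for n in range(1, 6):
    squares_b.append(n**2)

print(squares_a)
print(squares_b)
```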

Let's compare the following three ways of generating a list of the first 20 positive integers. We can generate this list by hand, create the range object and convert it to a list, or use a list comprehension.

In [None]:
list1=[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]
print(list1)
list2=list(range(1,21))
print(list2)
list3=[n for n in range(1,21)]
print(list3)

However, the list comprehension syntax is the most useful if we can explicitly express the $n$-th element of a list. Here are some more examples.

In [None]:
print([n%3 for n in range(21)])
print([k**k for k in range(1,10)])
print([math.factorial(i) for i in range(3,20,2)])
print([1 for n in range(30)])

Remember that the *iterable* does not need to be a range object, but can be a list.

In [None]:
list0=[n**2 for n in range(11)]
print(list0)
list1=[n**0.5 for n in list0]
print(list1)

#### Local (dummy) variables

Note that in the above creation of lists, I have chosen different names for the variable. These variables are called *local* (or *dummy*) variables.  They are only defined within the scope of the list comprehension, and are forgotten the moment the list has been computed. Their name does not matter at all; it is just a placeholder.

In [None]:
[a for a in range(3)],[_2 for _2 in range(3)],\
[some_name_I_dont_care_about for some_name_I_dont_care_about in range(3)]

None of `a`, `_2`, or `some_name_I_dont_care_about` is known after the previous code has been run.

In [None]:
# After the list comprehensions were computed, the variable _2 was forgotten, so this call will be invalid.
# _2
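A related consequence is that a list comprehension does not disturb a variable of the same name defined outside it.  A small sketch, with illustrative names:

```python
n = 100
doubles = [2 * n for n in range(3)]   # this n exists only inside the brackets

print(doubles)
print(n)    # the outer n still has its original value
```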

## Conclusion and Outlook

In this lecture we have discussed modules and packages, input and output, and sequences. Next week we will talk a bit more about sequences (specifically, a technique called "slicing" that allows us to extract sublists from lists) and continue with functions.