# Welcome to the Intermediate Python Workshop

## Loops beyond for and while

This notebook gives you an intermediate introduction to loops in Python.
Here is a [beginner's guide to for and while loops](https://www.youtube.com/watch?v=6iF8Xb7Z3wQ). We won't cover those basics here!

Eoghan O'Connell, Guck Division, MPL, 2023

In [1]:
# notebook metadata you can ignore!
info = {"topic": ["list comprehensions", "numpy", "tqdm"],
        "version" : "0.0.1"}

### How to use this notebook

- Click on a cell (each box is called a cell) and press "shift+enter" to run it.
- You can run the cells in any order, but later cells often rely on variables defined in earlier ones, so running from top to bottom is safest.
- The output of runnable code is printed below the cell.
- Check out this [Jupyter Notebook Tutorial video](https://www.youtube.com/watch?v=HW29067qVWk).

See the help tab above for more information!


# What is in this Workshop?
In this notebook we cover:

- List comprehensions
   - Generator expressions
   - Dict comprehensions
- When not to use loops
- TQDM progress bar

-----------
## List comprehensions

In Python we can loop over an iterable with a `for` or `while` loop.

We can also loop over lists with list comprehensions.

- A similar syntax is used for generator expressions. Python has no tuple comprehension, but a generator expression can be passed to `tuple()` or `list()`.
- There are also dict comprehensions!

List comprehensions can be quite fast and look compact! We will see their speed below.

### Example List comprehension

Imagine we have a list of numbers and want to square each value. We would usually use a `for` loop...

In [26]:
my_lst = [2, 3, 5]

my_sq_lst = []
for i in my_lst:
    my_sq_lst.append(i**2)

print(my_sq_lst)

[4, 9, 25]


In [3]:
# let's do this with a list comprehension

my_sq_lst = [i**2 for i in my_lst]  # break this down!
print(my_sq_lst)

[4, 9, 25]


You can read the list comprehension starting with the `for i in my_lst` part, then apply `i**2` to every `i`.

Of course, we can add extra stuff which makes it slightly more complicated...

In [4]:
# using if else statements in a list comprehension

my_sq_lst = [i**2 if i>2 else i-5 for i in my_lst]  # break this down!
print(my_sq_lst)

[-3, 9, 25]


Let's break that down... How do we make sense of it???

Read it like this:
- `for i in my_lst`: take each `i` in turn
- `i**2 if i>2`: square `i` when it is greater than 2
- `else i-5`: otherwise subtract 5 from it, just like a normal if else clause
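
To make the reading order concrete, here is a small sketch that writes the same if else comprehension as an ordinary `for` loop. It also shows the other place an `if` can go: after the `for`, where it filters items out instead of choosing between two results. The names `with_else`, `with_else_loop` and `filtered` are made up for this illustration.

```python
my_lst = [2, 3, 5]

# Conditional expression: every item is kept, but transformed in one of two ways
with_else = [i**2 if i > 2 else i - 5 for i in my_lst]

# The same logic written as an ordinary for loop
with_else_loop = []
for i in my_lst:
    if i > 2:
        with_else_loop.append(i**2)
    else:
        with_else_loop.append(i - 5)

# Filter: an `if` placed after the `for` drops items that fail the test
filtered = [i**2 for i in my_lst if i > 2]

print(with_else)       # [-3, 9, 25]
print(with_else_loop)  # [-3, 9, 25]
print(filtered)        # [9, 25]
```

Note that the filter form has no `else`: items that fail the test simply do not appear in the result.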

It can take a while to get used to this syntax. But it is super fun, compact and can be very fast.

Let's look at how fast it is...

In [5]:
# speed of for loop vs. list comprehension
my_lst = list(range(10000))
my_sq_lst = []

%timeit for i in my_lst: my_sq_lst.append(i**2)

616 µs ± 32.4 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)


In [6]:
my_lst = list(range(10000))
my_sq_lst = []

%timeit my_sq_lst = [i**2 for i in my_lst]

267 µs ± 4.78 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)


Wow, more than twice as fast (616 µs vs. 267 µs)! That's great!

We can also use similar syntax for generator expressions (which we can convert to tuples or lists easily if need be).


### Generator expressions

A generator is an iterable that produces its items lazily. It does not hold all of its values in memory at once; it "generates" each one on the fly when you ask for it. This makes a generator cheap to create and light on memory, but it can only be iterated over once.
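
As a rough illustration of "on the fly", the sketch below compares the size of a list with the size of a generator object and pulls a few values out by hand. The names `squares_list` and `squares_gen` are made up for this example, and `sys.getsizeof` only reports the size of the container object itself, so treat the numbers as an indication rather than a full memory measurement.

```python
import sys

squares_list = [i**2 for i in range(10_000)]  # all 10,000 values exist right away
squares_gen = (i**2 for i in range(10_000))   # nothing has been computed yet

# The list object grows with its length; the generator object stays small
print(sys.getsizeof(squares_list))
print(sys.getsizeof(squares_gen))

# Values are produced one at a time, only when requested
print(next(squares_gen))  # 0
print(next(squares_gen))  # 1

# Converting to a tuple consumes whatever is left in the generator
remaining = tuple(squares_gen)
print(len(remaining))     # 9998
```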

Let's compare syntax and speed of a generator expression to a list comprehension.

In [7]:
# create generator and list

my_generator = (i for i in range(10000))
my_lst = [i for i in range(10000)]

In [8]:
# loop over the generator and list in different ways...

print("\nloop with list comprehension")
my_sq_lst1 = []
%timeit my_sq_lst1 = [i**2 for i in my_lst]  # loop with list comprehension

print("\nloop with generator within list compr")
my_sq_lst2 = []
%timeit my_sq_lst2 = [i**2 for i in my_generator]  # loop with generator within list compr

print("\nloop with generator within generator expression")
my_sq_lst3 = []
%timeit my_sq_lst3 = (i**2 for i in my_generator)  # loop with generator within generator expression

print("\nloop with generator in a for loop")
my_sq_lst4 = []
%timeit for i in my_generator: my_sq_lst4.append(i**2)  # loop with generator in a for loop

assert my_sq_lst1 == my_sq_lst2 == my_sq_lst3 == my_sq_lst4  # make sure they are all the same result


loop with list comprehension
279 µs ± 4.03 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

loop with generator within list compr
104 ns ± 3.14 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

loop with generator within generator expression
180 ns ± 3.93 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

loop with generator in a for loop
18.3 ns ± 0.0317 ns per loop (mean ± std. dev. of 7 runs, 100,000,000 loops each)


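Careful: these nanosecond timings are misleading. A generator can only be consumed once, so `my_generator` is empty after the first pass and every later run loops over nothing. In addition, `(i**2 for i in my_generator)` only creates a new generator object and does not compute any squares until something iterates over it. The `assert` passes only because all four lists are still empty.

Lesson learned: a generator saves memory because it produces items on demand, but it is not automatically faster than a list. Use one when you only need to iterate over the items once.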
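
One thing worth checking for yourself about the benchmark above is that a generator can only be consumed once. The sketch below shows this, and then times a list against a generator that is created fresh for every run, using the standard library `timeit` module instead of the `%timeit` magic. The variable names and the repeat count of 100 are arbitrary choices for this example, and the timings will differ from machine to machine.

```python
import timeit

small_generator = (i for i in range(5))

first_pass = [i**2 for i in small_generator]
second_pass = [i**2 for i in small_generator]  # the generator is already exhausted

print(first_pass)   # [0, 1, 4, 9, 16]
print(second_pass)  # []

# Fairer comparison: the generator is rebuilt inside the timed statement,
# so every run really loops over 10,000 items
t_list = timeit.timeit(
    "[i**2 for i in data]",
    setup="data = list(range(10_000))",
    number=100,
)
t_gen = timeit.timeit(
    "[i**2 for i in (j for j in range(10_000))]",
    number=100,
)

print(f"list source:      {t_list / 100 * 1e6:.0f} µs per run")
print(f"generator source: {t_gen / 100 * 1e6:.0f} µs per run")
```

If the generator timing comes out in microseconds rather than nanoseconds, that is the expected sign that the loop is really doing the work.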

### Dict comprehensions

The syntax for a dict comprehension is slightly more complicated than for a list comprehension because we have both a key and a value!
Here is the syntax: `{key: value for (key, value) in iterable}`

**HOWEVER**, you might be able to use the `dict` constructor directly for your use-case. Let's look at them all!


In [9]:
# I have two lists and want to make a dictionary. We can do this with a for loop or a dict comprehension

my_keys = ["name", "age", "appearance"]
my_values = ["Earth", 4.54e9, ["tiny", "blue", "dot"]]

# use a for loop
my_dict = {}
for key, value in zip(my_keys, my_values):
    my_dict[key] = value
print(f"for loop:           {my_dict}")


# dict comprehension
my_dict = {key:value for (key, value) in zip(my_keys, my_values)}
print(f"dict comprehension: {my_dict}")


# dict constructor called directly
my_dict = dict(zip(my_keys, my_values))
print(f"dict constructor:   {my_dict}")


for loop:           {'name': 'Earth', 'age': 4540000000.0, 'appearance': ['tiny', 'blue', 'dot']}
dict comprehension: {'name': 'Earth', 'age': 4540000000.0, 'appearance': ['tiny', 'blue', 'dot']}
dict constructor:   {'name': 'Earth', 'age': 4540000000.0, 'appearance': ['tiny', 'blue', 'dot']}


In [10]:
# and the speeds...

print("\nfor loop")
my_dict = {}
%timeit for key, value in zip(my_keys, my_values): my_dict[key] = value

print("\ndict comprehension")
my_dict = {}
%timeit my_dict = {key:value for (key, value) in zip(my_keys, my_values)}

print("\ndict constructor")
my_dict = {}
%timeit my_dict = dict(zip(my_keys, my_values))



for loop
250 ns ± 2.4 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

dict comprehension
357 ns ± 11.2 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

dict constructor
284 ns ± 8.45 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)


All three take a few hundred nanoseconds for this tiny three-item example, with the for loop and the dict constructor slightly ahead.
Inserting into and looking up in a dict is very fast in Python.

The for loop and the dict comprehension add extra options. For example, you can manipulate the keys or values during dict creation.
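
As a small sketch of what manipulating keys or values can look like, the example below reuses the two lists from above. The names `upper_dict` and `simple_dict` are made up for this illustration.

```python
my_keys = ["name", "age", "appearance"]
my_values = ["Earth", 4.54e9, ["tiny", "blue", "dot"]]

# Change the keys while building the dict
upper_dict = {key.upper(): value for key, value in zip(my_keys, my_values)}
print(upper_dict)

# Filter while building the dict: skip entries whose value is a list
simple_dict = {
    key: value
    for key, value in zip(my_keys, my_values)
    if not isinstance(value, list)
}
print(simple_dict)  # {'name': 'Earth', 'age': 4540000000.0}
```

The plain `dict(zip(...))` call cannot do either of these on its own, which is where the comprehension earns its extra syntax.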

## When not to use loops

Sometimes you don't need to use loops, or you might find that your code is slow because of many loops.

If you are working with lists of numbers, arrays or matrices, you can use numpy to do very fast calculations without writing the loop yourself.



In [11]:
# example when to use numpy instead of looping over items
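
One possible example, assuming numpy is installed (`pip install numpy`): squaring 10,000 numbers with a list comprehension and with a numpy array. The variable names are made up for this sketch. With numpy, the operation is applied to the whole array at once and the loop runs inside numpy's compiled code.

```python
import numpy as np

values = list(range(10_000))

# Pure Python: loop over every item
squares_loop = [v**2 for v in values]

# numpy: no explicit loop, the operation applies to the whole array
arr = np.arange(10_000)
squares_np = arr**2

print(squares_np[:5])  # [ 0  1  4  9 16]

# Both approaches give the same numbers
print(squares_np.tolist() == squares_loop)  # True
```

In a notebook you could compare `%timeit [v**2 for v in values]` with `%timeit arr**2` to see the difference on your own machine.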

## TQDM progress bar

You can add a nifty progress bar to any iterable in Python with the TQDM package. Just install it with `pip install tqdm` and use it as shown below...

See documentation here: https://tqdm.github.io/

In [17]:
# Normally we do: from tqdm import tqdm
# but in notebooks we do
from tqdm.notebook import tqdm

In [37]:
# using tqdm with for loop

for i in tqdm(range(1_000_000)):
    _ = i**6


  0%|          | 0/1000000 [00:00<?, ?it/s]

In [38]:
# use tqdm with list comprehensions

_ = [i**6 for i in tqdm(range(1_000_000))]

  0%|          | 0/1000000 [00:00<?, ?it/s]

In [40]:
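# tqdm works with generators, but a generator does not know its length,
# so tqdm can only count items: it cannot show a percentage or the remaining time.
# A generator only ever yields the "next" item...
# Note: the outer generator expression below is lazy and is never consumed,
# which is why the bar stays at 0it.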

my_generator = (i for i in range(1_000_000))

_ = (i**6 for i in tqdm(my_generator))

0it [00:00, ?it/s]
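
If you know how many items a generator will produce, one way around this is to pass that number to tqdm through its `total` argument. The sketch below assumes tqdm is installed and uses the plain `tqdm` import so that it also runs outside a notebook; `n_items` and `results` are names made up for this example, and the size is kept small on purpose.

```python
from tqdm import tqdm  # in a notebook, `from tqdm.notebook import tqdm` is used the same way

n_items = 1_000  # hypothetical size, chosen small for this example
my_generator = (i for i in range(n_items))

# total= tells tqdm how many items to expect, so it can show a percentage.
# The list comprehension consumes the generator, so the bar actually advances.
results = [i**6 for i in tqdm(my_generator, total=n_items)]

print(len(results))  # 1000
```

Without `total`, tqdm still counts the items as they pass through, it just cannot tell you how far along you are.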