# Python Lists Extended: List and Dictionary Comprehension

In [None]:
#### library import cell
import numpy as np
from random import randint
from music21 import *

**Author:** Sebastian Klassmann, sklassm1@uni-koeln.de  
  
Assumed Reading: Guttag (2016), chapter 5: structured types, mutability and higher-order functions

Date: October 20$^{th}$, 2022  
Libraries used: numpy, music21, random  
Python Version: 3.7  
Other dependencies: None

-----

# A short introduction to list comprehension and dictionaries in Python 3

List comprehension...
* can reduce the number of for loops used in your code
* is, according to some, more efficient than an equivalent loop structure and, in tandem with recursive functions, can express anything that can be expressed with loops
* can express and achieve things in a tidy and short fashion
* will usually **build a list**.
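
The efficiency claim above can be checked on your own machine. The following is only a sketch using the standard-library `timeit` module; the helper names `build_with_loop` and `build_with_comprehension` are made up for this illustration, and the measured times will differ between machines and Python versions.

```python
import timeit

def build_with_loop(n):
    # classic for loop with append
    result = []
    for value in range(n):
        result.append(value * 2)
    return result

def build_with_comprehension(n):
    # the same list, built by a list comprehension
    return [value * 2 for value in range(n)]

# both approaches build exactly the same list
print(build_with_loop(5) == build_with_comprehension(5))

# time 200 repetitions of each version; smaller numbers mean faster code
loop_time = timeit.timeit(lambda: build_with_loop(10000), number=200)
comp_time = timeit.timeit(lambda: build_with_comprehension(10000), number=200)
print("for loop:     ", loop_time)
print("comprehension:", comp_time)
```

No particular result is promised here: run the cell a few times and compare the two numbers yourself.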

In [None]:
list_something = ["a","b","c","d"]
for (i,e) in enumerate(list_something):
    print(i,e)
list_b = [(i,e) for i, e in enumerate(list_something)]
print(list_b)

In [None]:
list_c = [[i,e] for i, e in enumerate(list_something)]
print(list_b)
print(type(list_b[0]))
print(list_c)
print(type(list_c[0]))
list_d = list_b.copy()
list_d.append(list_c)
print(list_d)

Basic structure:  
```[expr for val in collection]```
Where expr is any Python expression, val is the name that is bound to each item of the collection in turn, and collection is any Python iterable.

In [None]:
listA = []
for value in range(25):
    listA.append(value)
print(listA)

In [None]:
listB = [value for value in range(25)]
print(listB)

**Expression vs. Statement?** (have a look at Guttag (2016), chapter 2 - pp. 9f.)

* Objects and operators can be combined to form **expressions**, each of which evaluates to an object of some type. We will refer to this as the value of the expression.
* A **command**, often called a **statement**, instructs the interpreter to do something.
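
One way to make the difference tangible is to ask Python itself. The sketch below relies on the built-in functions `eval`, `compile` and `exec`; the variable names (`value`, `is_expression`, `namespace`) are chosen for this illustration only.

```python
# an expression evaluates to a value, so eval() can compute it
value = eval("3 + 4 * 2")
print(value)

# an assignment is a statement: it cannot be compiled as an expression
try:
    compile("a = 1", "<example>", "eval")
    is_expression = True
except SyntaxError:
    is_expression = False
print(is_expression)

# the same assignment is accepted when it is compiled as a statement
code = compile("a = 1", "<example>", "exec")
namespace = {}
exec(code, namespace)
print(namespace["a"])
```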

Let's check out how it works by starting from a list.

In [None]:
list1 = [1,2,3,4,5]
print(list1)

Now, if we want to add 2 to every item in this list, we can do so using for loops:

In [None]:
for i in range(len(list1)):
    list1[i] += 2
print(list1)

However, this can also easily be achieved by using list comprehension:

In [None]:
list2 = [(item+2) for item in [val for val in range(89)]]
print(list2)

In [None]:
def square_e(x):
    
    epsilon = 0.01
    step = epsilon**2
    numGuesses = 0
    ans = 0.0
    while abs(ans**2 - x) >= epsilon and ans <= x:
        ans += step
        numGuesses += 1
    #print('numGuesses =', numGuesses)
    if abs(ans**2 - x) >= epsilon:
        #print('Failed on square root of', x)
        return None
    else:
        #print(ans, 'is close to square root of', x)
        return ans

In [None]:
squarelist = [i**2 for i in range(2,12)]
print(squarelist)

In [None]:
rootlist = [round(square_e(item),0) for item in squarelist]
print(rootlist)

In [None]:
list2 = []
for item in range(89):
    list2.append(item+2)
print(list2)

Why is the distinction between statements and expressions important?

There are some constructs in Python that you would not intuitively take to be statements. Yet, they are statements and therefore do not work in list comprehensions:

In [None]:
list1 = [1,2,3,4,5]

In [None]:
print(list1)

In [None]:
# note: this cell raises a SyntaxError on purpose, because a+=1 is a statement
list2b = [a+=1 for a in list1]
print(list2b)

In [None]:
list2c = [a+1 for a in list1]
print(list2c)

$\rightarrow$ In Python, variable assignments are **assignment statements** and therefore cannot be used directly in list comprehensions.
* Think about the implications for a second.
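
One implication is that a list comprehension never changes the list it iterates over; it always builds a new one. The following sketch (the names `numbers`, `alias` and `plus_one` are made up for this example) shows this, along with slice assignment as one possible way to write the new values back into the existing list object.

```python
numbers = [1, 2, 3, 4, 5]
alias = numbers  # a second name for the same list object

# the comprehension builds a new list; `numbers` itself is unchanged
plus_one = [n + 1 for n in numbers]
print(numbers)
print(plus_one)

# slice assignment replaces the contents of the existing list object,
# so the change is also visible through `alias`
numbers[:] = [n + 1 for n in numbers]
print(alias)
```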

### Exercise!  
  
1. Define a list that contains the following integers: ```[2,4,7,3]```  
2. Using listname.copy(), make a copy of the list.
3. Using a for loop, square every item in the list.
4. Using the copy of our initial list, print a list containing the squares of every integer in the list. You are not allowed to use loops in this one, make use of list comprehension!

In [None]:
i_list = [2,4,7,3]
c_list = i_list.copy()
print(i_list)
print(c_list)

In [None]:
# first attempt: this only rebinds the loop variable `item`; i_list itself stays unchanged
for item in i_list:
    item = item **2
print(i_list)

In [None]:
for i, _ in enumerate(i_list):
    i_list[i] = i_list[i]**2
print(i_list)

In [None]:
c_list = [item**2 for item in c_list]
print(c_list)
print(i_list)

----

Here's another example solved in both ways:

In [None]:
list3 = []
for i in range(1,11):
    list3.append(i**2)
print(list3)

In [None]:
list4 = [i**2 for i in range(1,11)]
print(list4)

Can you use functions inside list comprehensions? Sure: a function call is itself an expression, as it evaluates to the object that the function returns:

In [None]:
## modified code from Guttag (2016), figure 3.4:

def rootit(x):
    epsilon = 0.01
    numGuesses = 0
    low = 0.0
    high = max(1.0, x)
    ans = (high + low)/2.0
    while abs(ans**2 - x) >= epsilon:
        # print('low =', low, 'high =', high, 'ans =', ans)
        numGuesses += 1
        if ans**2 < x:
            low = ans
        else:
            high = ans
        ans = (high + low)/2.0
    # print('numGuesses =', numGuesses)
    # print(ans, 'is close to square root of', x)
    return ans

In [None]:
rootablelist = [i**2 for i in range(2,64,2)]
print(rootablelist)
# print(rootablelist)

In [None]:
rootlist = [round(rootit(element),0) for element in rootablelist]
print(rootlist)

-----

Let us now consider a list of names:

In [None]:
list5 = ["Adam", "Sven", "Andrea", "Jim", "Tim", "John", "Anette", "Julia"]
print(list5)

What if we want to create another list that only contains those names that start with "A" from the above?

In [None]:
list6 = []
for item in list5:
    if item.startswith("A"):
        list6.append(item)
print(list6)

That seems to be working. However, list comprehension can also contain if-conditions:

Basic structure:  
```[expr for val in collection if <condition>]```


If we apply that to our list5 from above and try to achieve the same goal:

In [None]:
list7 = [name for name in list5 if name.startswith("A")]
print(list7)

We can even specify different expressions in case the first condition is not met:

In [None]:
list8 = [name if name.startswith("A") else "name didn't start with A" for name in list5]
print(list8)

In [None]:
list8 = [name if name.startswith("A") else name if name.startswith("S") else "name didn't start with A or S" for name in list5]
print(list8)

Please note the different syntax depending on whether or not an alternative expression (else) is specified:

In [None]:
list7 = [name for name in list5 if name.startswith("A")]
list8 = [name if name.startswith("A") else "name didn't start with A" for name in list5]
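
Both forms can also appear in the same comprehension. The sketch below repeats the names from above in a list called `names` (a name chosen for this example): the trailing `if` decides *whether* an item is kept, while the leading `if ... else` decides *which value* is stored for it.

```python
names = ["Adam", "Sven", "Andrea", "Jim", "Tim", "John", "Anette", "Julia"]

# trailing `if`: a filter, names that fail the test are dropped
# leading `if ... else`: a conditional expression, it chooses the stored value
labelled = [name.upper() if name.startswith("A") else name.lower()
            for name in names
            if len(name) > 3]
print(labelled)
```

Here "Jim" and "Tim" are filtered out, names starting with "A" are stored in upper case and all others in lower case.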

Let us do one more! How about starting from a list of integers?

In [None]:
from random import randint
list9 = [randint(0,250) for i in range(20)]
print(list9)

Okay, now: what if we want to make all numbers in the list even by subtracting 1 from every odd number?

It can surely be achieved by for loops:

In [None]:
list10 = []
for element in list9:
    if element % 2 == 0:
        list10.append(element)
    else:
        element -= 1
        list10.append(element)
print(list10)

Using list comprehension, the necessary code becomes considerably shorter:

In [None]:
list11 = [element if element%2==0 else element-1 for element in list9]
print(list11)

----

### Exercise!  
  
1. Define a list that contains 250 random integers between 0 and 10000. 
2. Using a for loop, create a list that only contains items from the list that can be divided by 3 or 5 (no remainder).
3. Solve the same task, but only use a single list comprehension statement.
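
One possible solution is sketched below; try the exercise yourself first. The fixed seed and all variable names (`numbers`, `divisible_loop`, `divisible_comp`) are assumptions made for this sketch and are not part of the task.

```python
from random import randint, seed

seed(0)  # assumption: fixed seed so that repeated runs give the same numbers
numbers = [randint(0, 10000) for _ in range(250)]

# step 2: for loop
divisible_loop = []
for number in numbers:
    if number % 3 == 0 or number % 5 == 0:
        divisible_loop.append(number)

# step 3: a single list comprehension
divisible_comp = [number for number in numbers if number % 3 == 0 or number % 5 == 0]

print(divisible_loop == divisible_comp)
print(len(divisible_comp), "of", len(numbers), "numbers are divisible by 3 or 5")
```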

----

## Dealing with ordered collections of items

Imagine that the following list represents the current stock of spices in a warehouse.

In [None]:
stock = [('sugar', 100),('salt', 60),('pepper', 15),('thyme', 5),('chili flakes', 140)]
print(stock)
print(type(stock[0]))

As you can see, every item in the list is now a tuple consisting of a string that is paired with a number representing the current stock.  
  
Imagine that we want to iterate over the entire list and generate a *shopping list* of spices that need restocking as soon as the current stock is lower than 20 units. When restocking, we always aim for a stock of 100 units.
  
We can of course solve this using for loops:

In [None]:
shoppinglist = []
for element in stock:
    if element[1] < 20:
        shoppinglist.append((element[0], 100-element[1]))
    else:
        pass
print(shoppinglist)

As you may have guessed, this is also possible using list comprehension:

In [None]:
stock = [('sugar', 100),('salt', 60),('pepper', 15),('thyme', 5),('chili flakes', 140)]
print(stock)
print(type(stock[0]))

In [None]:
slist = [(i, 100 - j) for i, j in stock if j<20]
print("shopping list:", slist)
print("we're still good on:", [(i, j) for i, j in stock if j>=20])

As you can see above, list comprehensions can also be used inside other methods and functions.

Here are two more examples:  

In [None]:
import numpy as np
### example 1: np.arrays
a = np.array([np.zeros(4) for i in range(4)])
print(type(a))
print(a)

In [None]:
## example 2: letter pairs:

letterlist = "abcbacbacbbacbacaabcab"
pairlist = [(i,j) for i,j in zip(letterlist, letterlist[1:])]
# note the zip function above! It returns an iterator that steps through
# both iterables in parallel and yields tuples of their respective elements.

print(pairlist)
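
To see what `zip` and the slice `[1:]` contribute individually, it can help to look at a very short string. This is a minimal sketch; the names `letters`, `pairs` and `triples` are chosen for this illustration.

```python
letters = "abcd"

# letters[1:] is the same string without its first character
print(letters[1:])

# zip stops at the end of the shorter iterable,
# so 4 letters give 3 pairs of neighbours
pairs = list(zip(letters, letters[1:]))
print(pairs)

# zip also accepts more than two iterables, e.g. for triples of neighbours
triples = list(zip(letters, letters[1:], letters[2:]))
print(triples)
```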

How about when we only want the unique letter pairs and want to count them at the same time?

In [None]:
## function for letter pairs:
def letterpairs(string):
    pairlist = [(i,j) for i,j in zip(string, string[1:])]
    return pairlist

In [None]:
# for loop solution
def wordcount(d, l):
    for el in l:
        if el in d.keys():
            d[el] += 1
        else:
            d[el] = 1
    return

In [None]:
# recursive solution, no for loops for iterating over list!
def recwordcount(l, d=None, n=0):
    # a default of d={} would be created only once and shared between calls
    if d is None:
        d = {}
    if l[n] in d.keys():
        d[l[n]] += 1
    else:
        d[l[n]] = 1
    # base case:
    if n == len(l)-1:
        return d
    # recursive function call:
    else:
        return recwordcount(l,d,n+1)

In [None]:
freqd = recwordcount(letterpairs("abcbacbacbbacbacaabcab"), {}, 0)
print(freqd)

In [None]:
dict2 = {}
for element in pairlist:
    if element not in dict2.keys():
        dict2[element] = 1
    else:
        dict2[element] += 1
print(dict2)

Finally, dictionary comprehension looks a lot like list comprehension:

In [None]:
dict2 = {element:0 for element in pairlist} ## dictionary comprehension saving us one condition
for element in pairlist:
    dict2[element] +=1
print(dict2)
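
The counting itself can also be moved into the comprehension. The sketch below is one possible variant, not necessarily the most efficient one (`list.count` scans the whole list for every unique pair); it is compared against `collections.Counter` from the standard library. The names `letterstring`, `pairs`, `counts` and `counter` are made up for this example.

```python
from collections import Counter

letterstring = "abcbacbacbbacbacaabcab"
pairs = list(zip(letterstring, letterstring[1:]))

# dictionary comprehension: one entry per unique pair,
# the value is the number of times that pair occurs
counts = {pair: pairs.count(pair) for pair in set(pairs)}
print(counts)

# the standard library offers a dictionary subclass made for counting
counter = Counter(pairs)
print(dict(counter) == counts)
```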

But wait, what is that thing with curly brackets we have been using in the second example?

---

## A (far too) brief look at dictionaries in Python

* More information can be found in Guttag (2016), pp. 79-83!
* In Python, a dictionary is a collection of key:value pairs with unique keys.
* Keys and values do not have to share the same data type.
* It is defined using curly brackets:

In [None]:
exampledict = {}
print(exampledict)

In [None]:
exampledict["test"] = 1
exampledict["test"]

In [None]:
print(exampledict)

As with items in lists, individual key:value pairs in dictionaries are separated by ",":

In [None]:
exdict2 = {"apples":2, "oranges":3, "chocolate bars":0, "sad academics":1}

Given a defined dictionary, a value can be assigned to a key and queried again as follows:

In [None]:
exdict2["bananas"]=17

In [None]:
exdict2["bananas"]

As you can see, in a way, values in a dictionary are indexed by their respective keys. Therefore you can change them in a fashion that is very similar to the kind of statements you have already been using on lists.  
  
Imagine that one of the apples has been eaten.

In [None]:
exdict2["apples"]-=1
exdict2

You can simply add a new key:value pair to a given dictionary as follows.  
Suppose that we have bought a chocolate bar and want to add it to our dictionary.
Also suppose that this makes at least one academic happy.

In [None]:
exdict2["mars bars"] = 1
exdict2["sad academics"] -= 1
exdict2

Please note that the "-= 1" statement used above (just like "+= 1") can only modify values of keys that already exist in the dictionary. For a missing key, Python raises a KeyError:

In [None]:
exdict2["peanuts"]+=1
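
If a key may or may not exist yet, the error can be avoided. The following sketch shows two common options, the dictionary method `get` with a default value and a membership test with `in`; the dictionary `basket` is a made-up example.

```python
basket = {"apples": 2, "oranges": 3}

# "peanuts" is not a key yet, so += 1 raises a KeyError
try:
    basket["peanuts"] += 1
except KeyError as error:
    print("missing key:", error)

# get() returns the given default (here 0) when the key is missing
basket["peanuts"] = basket.get("peanuts", 0) + 1
print(basket)

# a membership test with `in` works as well
if "bananas" not in basket:
    basket["bananas"] = 0
basket["bananas"] += 1
print(basket)
```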

----

## Exercise!
  
Please have a look at the following table representing the amount of available silverware in a given drawer. Please create a dictionary that maps every item (as key) to the amount available in the imagined drawer.  
  
| forks | spoons | knives |
|-------|--------|--------|
| 15    | 10     | 17     |

In [None]:
cutlery = {"forks":15, "spoons":10, "knives":17}
print(cutlery)

----

## Towards a transition matrix

Have a look at the whiteboard.

Imagine that "a", "b", and "c" refer to three possible states and that you would now like to count the transitions from any of these states in a matrix.  
  
In our case, the matrix rows might represent the original state, while every column counts the number of occurrences of each target state given this original state.

This is easily encoded using a dictionary, list comprehension and numpy:

In [None]:
tpdict={"a":0, "b":1, "c":2}
lettertp = np.array([np.zeros(3) for i in range(3)])
print(lettertp)

In [None]:
# note that something similar to list comprehension exists for dictionaries:
letters, integers = ["a","b","c","d","e"], [i for i in range(0,6)]
print(letters,integers)
tpdict2 = {k:v for k, v in zip(letters, integers)}
tpdict2

In [None]:
print(pairlist)
print(tpdict)

Let us first convert our letter pair tuples to integers via our dictionary from above:

In [None]:
intpairs = [(tpdict[i], tpdict[j]) for i,j in pairlist] 
# note the behaviour when using two iterating variables for ordered collections (tuples)
print(intpairs)

In [None]:
for e1, e2 in pairlist:
    lettertp[tpdict[e1]][tpdict[e2]]+=1 # note that this part might be difficult using list comprehension.
print(lettertp)
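
If you would like to avoid the explicit for loop here, NumPy offers `np.add.at`, which adds a value at a list of index positions and handles repeated positions correctly. The following is a sketch of that alternative, not the approach used in the rest of this notebook; the names `states`, `sequence`, `rows`, `cols` and `matrix` are made up for this example.

```python
import numpy as np

states = {"a": 0, "b": 1, "c": 2}
sequence = "abcbacbacbbacbacaabcab"

# list comprehensions collect the row and column index of every transition
rows = [states[first] for first, second in zip(sequence, sequence[1:])]
cols = [states[second] for first, second in zip(sequence, sequence[1:])]

# np.add.at adds 1 at every (row, column) position and
# counts repeated positions correctly
matrix = np.zeros((3, 3))
np.add.at(matrix, (rows, cols), 1)
print(matrix)
```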

One more step. The same approach can be used to generate the transition matrix directly from a given string, as strings in Python are also iterable:

In [None]:
str1 = "abbacacbacbac"
for element in str1:
    print(element)

In [None]:
# therefore, we can create a transition matrix from a given string as follows:  

tpdict={"a":0, "b":1, "c":2}

lettertp = np.array([np.zeros(3) for i in range(3)])

for e1,e2 in [(i,j) for i,j in zip(str1, str1[1:])]:
    lettertp[tpdict[e1]][tpdict[e2]]+=1

print(lettertp)

$\rightarrow$ But - why have we been using for loops in tandem with list comprehension in some of the examples above?

Keep in mind that:  

* list comprehension in Python is used to build lists
* the basic syntax for list comprehension in Python is ```[expression for val in collection]```
* therefore, list comprehension is based on building expressions based on variable values iterating over collections.  
  
If you want to know more, you can for example refer to sections 2.4 and 2.5 of [this source](http://greenteapress.com/thinkpython/thinkCSpy/html/chap02.html).
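
The points above can be illustrated with a small sketch (the names `collected` and `leftover` are made up for this example): when a comprehension is used only to *change* something, it still builds a list, in this case a list of the `None` values returned by `append`.

```python
collected = []

# append() is called for its side effect and returns None,
# so the comprehension builds a list of None values that nobody needs
leftover = [collected.append(letter) for letter in "abc"]
print(leftover)
print(collected)

# a plain for loop states the intent (modify something) directly
collected = []
for letter in "abc":
    collected.append(letter)
print(collected)
```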

-----

## Exercise!
1. Define a dictionary representing the following states (as the integer in brackets): "a"(0), "b"(1), "c"(2), "d"(3), "e"(4).

2. Using either a list of letter pairs (as tuples) or directly iterating over the string, create a transition matrix (5*5) for the individual letters in the following string, based on the dictionary from step 1.

In [None]:
exstring1 = "acdeabedcbaedbcaebdadabdaedccbaadaecbedddeabacbad"

In [None]:
tpdict={"a":0, "b":1, "c":2, "d":3, "e":4}

lettertp = np.array([np.zeros(len(tpdict.keys())) for i in range(len(tpdict.keys()))])

for e1,e2 in [(i,j) for i,j in zip(exstring1, exstring1[1:])]:
    lettertp[tpdict[e1]][tpdict[e2]]+=1

print(lettertp)

---

## Musical application

Transition matrices can be readily applied to musical material.
  
For example, they could be used to account for the number of times that a certain musical event is followed by other possible musical events.  
(Please remember our *bag of (pairs of notes)* model!)
  
Let us once more turn to the blackboard.

A transition matrix for bigrams of chord states could for example be generated using the following function:

In [None]:
def transprop(piece):
    
    # take a given piece from the music21 corpus and convert it to a stream:
    c = corpus.parse(piece)

    # we now want to create a list of chord roots to represent the temporal structure of root progression in the piece.
    # we will be doing this by using list comprehension and a few filters and methods from music21 that you should be familiar with.
    rootlist = [element.root().pitchClass for element in c.chordify().recurse().getElementsByClass(chord.Chord)]
    
    # as above, we will create a list of ordered root note pairs: 
    rootpairlist = [(i, j) for i, j in zip(rootlist, rootlist[1:])] # zip once more allows us to iterate over two iterables (lists) at the same time.
    # we use list slicing to "look ahead" one item in our rootlist.
    
    # create a 12*12 array of 0s
    trmat = np.array([np.zeros(12)] * 12)
    
    # now we iterate over our root pair list. Exploiting list indexing and the fact that we are dealing with pitch classes from 0 to 11,
    # we can modify our array accordingly for every state transition that is observed.
    for element in rootpairlist:
        trmat[element[0]][element[1]] += 1
    # return our final array:    
    return trmat

Let us apply this function to a given piece from the music21 corpus:

In [None]:
freqarray = transprop("/bach/bwv25")
freqarray

Right now, this array is not of much use. However, we can take one final step.  
  
If we take into account that we now have an array representing observations of state transitions in a given musical context, we can convert the absolute counts of states following each previous state to conditional probabilities.  
  
This can be achieved by dividing every value in a given row of our matrix by the sum of all values in that row. (After all, given the occurrence of the previous state represented by a row, the conditional probabilities should intuitively sum to 1.)

In [None]:
proparray = np.array([[(i / sum(j)) if i !=0 else i for i in j] for j in freqarray])
proparray
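
The same row-wise normalisation can also be written with NumPy's array operations instead of a nested list comprehension. The sketch below uses a small, made-up `counts` array rather than the Bach example, so the values are purely illustrative; `row_sums` and `probabilities` are likewise names chosen for this example.

```python
import numpy as np

# hypothetical transition counts; the last state was never observed as a start state
counts = np.array([[2.0, 1.0, 1.0],
                   [0.0, 3.0, 1.0],
                   [0.0, 0.0, 0.0]])

# sum of every row, kept as a column so that it lines up with the rows of `counts`
row_sums = counts.sum(axis=1, keepdims=True)

# divide only where the row sum is not 0; all-zero rows stay 0
probabilities = np.divide(counts, row_sums,
                          out=np.zeros_like(counts),
                          where=row_sums != 0)
print(probabilities)
print(probabilities.sum(axis=1))
```

Every row that contains at least one observation sums to 1, while rows without observations remain 0 instead of causing a division by zero.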

## That's all for today.

<img src='https://cdn.pixabay.com/photo/2016/03/04/19/36/beach-1236581_1280.jpg' width = 700>

------

### References

First of all, here are a few articles focussing on list comprehension:
  
[Jarrell, E. (2019). *List Comprehension in Python.* Hackernoon.](https://hackernoon.com/list-comprehension-in-python-c762ba1f523f)  
[Yordanov, V. (2018). *Python Basics: List Comprehensions*. Towardsdatascience.](https://towardsdatascience.com/python-basics-list-comprehensions-631278f22c40) 
  
As well as [this rather entertaining video](https://www.youtube.com/watch?v=AhSvKGTh28Q).
  
**General References for learning Python**  
 
Guttag, J. (2016). *Introduction to computation and programming using Python: With application to understanding data.* MIT Press.
 
Shaw, Z. A. (2017). *Learn python 3 the hard way: A very simple introduction to the terrifyingly beautiful world of computers and code.* Addison-Wesley Professional.

Shaw, Z. A. (2017). *Learn More Python 3 the Hard Way: The Next Step for New Python Programmers.* Addison-Wesley Professional.