# Locations in music21 and Python dictionaries
In this notebook we will expand our knowledge of both music21 and Python. Regarding music21, we will pay special attention to locating objects in their containing streams, but we will also have a look at how music21 handles lyrics and ties. Regarding Python, we will learn about a very powerful class, the dictionary, which will help us organize information with a greater degree of specificity.

In this notebook, we will continue working with the score `lsxp-WoBenShi-KongChengJi.xml`. So let's load it. And since we are going to work with the music of the parts in this score, let's save each part in a corresponding variable.

In [None]:
from music21 import *

In [None]:
# Path of the folder that contains the score to be loaded
path = './lsxp-WoBenShi-KongChengJi/'

# Name of the score
file_name = 'lsxp-WoBenShi-KongChengJi.xml'

# Join the path of the folder with the file name to get the full path
fn = path + file_name

# Load the score
s = converter.parse(fn)

# Retrieve parts
p_instr = s.parts[0] # instrumental part
p_vocal = s.parts[1] # vocal part

## Locations in music21
The key concept for locating objects in music21 streams is `offset`. For an explanation of this concept, please watch [the "Basic concepts in music21" section of the introductory video to music21](https://youtu.be/wrREb68FwNM?t=4764). We already had a glimpse of offsets when we retrieved measures from a part. Let's do this again for the vocal part: retrieve all the measures and just print them out.

In [None]:
mm_vocal = p_vocal.getElementsByClass('Measure').stream()

for m in mm_vocal.elements:
    print(m)

As you see, in music21 measure streams have both a number and an offset. To retrieve this information, we can conveniently call the attributes `.number` and `.offset` on a measure stream.

Let's take measure number 10. We can retrieve it by calling the method `.measure()` on the variable where we stored the measures of the vocal part, passing the number `10` as a parameter. Then we will read the attributes `.number` and `.offset` of the retrieved measure.

In [None]:
m_vocal_10 = mm_vocal.measure(10)

print("Measure number {} of the vocal part starts at offset {}.".format(m_vocal_10.number, m_vocal_10.offset))

Let's now see what this measure contains.

In [None]:
for e in m_vocal_10.elements:
    print(e)

Measure 10 only contains notes. And these notes are located within this measure at specific offsets. We can retrieve the offset of each note by calling the `.offset` attribute on it.

In [None]:
for n in m_vocal_10.notes.stream():
    print("The note {} starts at offset {}.".format(n.nameWithOctave, n.offset))

Notice that the offsets of the notes are relative to the start of the measure stream that contains them.

Let's now have a look at the offsets of the elements contained in the first measure of the score, which, in this case, has the number `0` because it is an anacrusis.

In [None]:
# Retrieve the measure with the method .measure()
m_vocal_0 = mm_vocal.measure(0)

# Iterate over the elements of the measure
for e in m_vocal_0.elements:
    print("At offset {} there is a {}.".format(e.offset, e))

As you can see, this measure contains many elements, but none of them is a note. However, all of them are located in the measure stream at a specific offset (in this case, the first offset of the measure), and therefore we can call the `.offset` attribute on them.

However, we probably won't want to work measure by measure. Instead, we will probably prefer to retrieve all the notes from a part using the `.notes` attribute, applied to the "flattened" part. Let's retrieve all the notes from the vocal part, and to be sure that we did it correctly, let's open them in our score editor.

In [None]:
nn_vocal = p_vocal.flat.notes.stream()

nn_vocal.show()

Now that we have all the notes of the vocal part in the stream stored in the variable `nn_vocal`, let's retrieve from it the notes of measure number 10, which, in `nn_vocal`, are the notes at indexes 10 to 19. To verify that these are the notes we want, let's print them out, and also show them in our score editor.

In [None]:
nn_vocal_m10 = nn_vocal[10:20]

for n in nn_vocal_m10:
    print(n)

nn_vocal_m10.show()

If you showed those notes in your score editor, you will have noticed that the score editor creates a few empty measures before them. Why is this? Well, it has to do with the note offsets. So let's check them:

In [None]:
for n in nn_vocal_m10:
    print("The note {} starts at offset {}.".format(n.nameWithOctave, n.offset))

As you can see, the first note starts at offset `38.0`, so the score editor has to leave empty space equivalent to 38 quarter notes before the first note. But if these are the same notes that we analysed before, why is the offset different now? Because their container is different. Previously, we retrieved these notes from their measure, so the offsets were relative to that measure. Now we are retrieving them from the stream saved in the variable `nn_vocal`, which was created by "flattening" the vocal part saved in the variable `p_vocal`. Therefore, the offsets of the notes contained in `nn_vocal` are no longer relative to the measure, but to the entire part.
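
The relation between the two kinds of offset can be sketched with plain Python arithmetic, without music21. The measure lengths below are an assumption made for illustration (a 2-quarter anacrusis followed by 4/4 measures), and the function names are hypothetical; this is not how music21 computes offsets internally, only a model of the idea.

```python
# Assumed layout: measure 0 is a 2-quarter anacrusis, measures 1 to 10 are in 4/4.
# All lengths are expressed in quarter notes.
measure_lengths = [2.0] + [4.0] * 10


def measure_start_offset(measure_number, lengths):
    """Offset of a measure within the part: the summed length of all earlier measures."""
    return sum(lengths[:measure_number])


def offset_in_part(measure_number, offset_in_measure, lengths):
    """Convert an offset relative to a measure into an offset relative to the whole part."""
    return measure_start_offset(measure_number, lengths) + offset_in_measure


m10_start = measure_start_offset(10, measure_lengths)
print(m10_start)                                   # 38.0
print(offset_in_part(10, 1.5, measure_lengths))    # 39.5
```

Under these assumptions, a note at offset `0.0` inside measure 10 lands at offset `38.0` in the flattened part, which matches the value printed above.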

However, having the notes in this new stream doesn't mean that we lost the measure information. We can still retrieve it by calling the `.measureNumber` attribute on a note. Besides, we can even know on which beat of the measure a note starts, by calling the `.beat` attribute. Let's retrieve the measure and beat information from the notes of measure 10 that we saved in the variable `nn_vocal_m10`.

In [None]:
for n in nn_vocal_m10:
    print("The note {} is in measure {}, beat {}.".format(n.nameWithOctave, n.measureNumber, n.beat))

As we saw previously, everything that is contained in a measure is located at a specific position, which is indicated by an offset, but also by a beat. Therefore, all the objects contained in the first measure of the part also have measure and beat information. Let's call the `.measureNumber` and `.beat` attributes on the objects of the first measure.

In [None]:
for e in m_vocal_0.elements:
    print("The element {} is in measure {}, beat {}.".format(e, e.measureNumber, e.beat))

As you can see, clefs, key signatures, time signatures and other objects also have information about the measure where they are contained, and in which specific beat.

By the way, notice that all the objects in this anacrusis measure, which, remember, were located at offset `0.0`, are placed on beat `3.0`. This shows that music21 is aware that this first measure is an anacrusis measure, with a time signature of 4/4, in which the first two beats are missing. Great!
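
The beat value reported for the anacrusis can be reproduced with a small piece of arithmetic. This is only a sketch consistent with the output described above, not music21's own implementation; the function and its parameters (`padding_left`, `beat_length`) are hypothetical names chosen for this example.

```python
def beat_in_measure(offset_in_measure, padding_left=0.0, beat_length=1.0):
    """Return the 1-based beat position of an element.

    padding_left is the duration (in quarter notes) missing at the start
    of an incomplete measure; beat_length is the length of one beat.
    """
    return (padding_left + offset_in_measure) / beat_length + 1


# An element at offset 0.0 in a 4/4 anacrusis whose first two beats are missing
print(beat_in_measure(0.0, padding_left=2.0))   # 3.0

# The same offset in a complete measure falls on the first beat
print(beat_in_measure(0.0))                     # 1.0
```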

So, as we just saw, we can retrieve information from these non-note objects. Let's look at that in a bit more depth, starting with key signatures. First of all, let's retrieve all the key signatures present in the vocal part by calling the `.getElementsByClass()` method on the "flattened" part. And remember, music21 likes you to keep your retrieved objects in streams.

⇒ **Note**: since key signatures are stored within measures, if we do not use the `.flat` attribute we can only retrieve key signature objects from a measure stream. So, if we want to retrieve all the key signatures from a part, we need to use the `.flat` attribute.

In [None]:
kss = p_vocal.flat.getElementsByClass('KeySignature').stream()

print('The vocal part contains {} key signatures.'.format(len(kss)))

The vocal part of this score only contains 1 key signature. So, let's save it into a variable to better work with it.

In [None]:
ks = kss[0]

print(ks)

Now that we have the key signature saved in the variable `ks`, the next cell shows some of the information that can be retrieved from it. Pay attention to all the different attributes.

In [None]:
print('Offset:', ks.offset)
print('Measure number:', ks.measureNumber)
print('Beat:', ks.beat)
print('Number of altered pitches:', ks.sharps)
print('List of altered pitches:')
for p in ks.alteredPitches:    # the attribute .alteredPitches retrieves a list of pitches
    print('-', p.name)
print('Major key:', ks.asKey('major'))
print('Minor key:', ks.asKey('minor'))
print('Dorian key:', ks.asKey('dorian'))

Let's now do the same with time signatures. First, we retrieve all the time signatures in the part.

In [None]:
tss = p_vocal.flat.getElementsByClass('TimeSignature').stream()

print('The vocal part contains {} time signatures.'.format(len(tss)))

In this case, the score has two time signatures. This might be useful to know for our analysis. Luckily, even though we retrieved them from the "flattened" part, we can still know in which measure, and even on which beat, each time signature is located.

In [None]:
print('The first time signature is in offset {}, measure {}, beat {}.'.format(tss[0].offset, tss[0].measureNumber, tss[0].beat))
print('The second time signature is in offset {}, measure {}, beat {}.'.format(tss[1].offset, tss[1].measureNumber, tss[1].beat))

The most relevant information we can retrieve from a time signature is the number of beats per measure and the duration value for each beat. The former can be known from the `.numerator` of the time signature, and the latter from its `.denominator`, if we take the time signature as a fraction.

In [None]:
print('The first time signature is a {}/{}.'.format(tss[0].numerator, tss[0].denominator))
print('The second time signature is a {}/{}.'.format(tss[1].numerator, tss[1].denominator))
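
Reading the time signature as a fraction also tells us how long a full measure lasts. The sketch below uses plain Python and hypothetical helper names; it treats the denominator as the note value of each counted unit, which is a simplification for compound metres such as 6/8.

```python
def beat_quarter_length(denominator):
    """Length, in quarter notes, of the note value named by the denominator."""
    return 4 / denominator


def bar_quarter_length(numerator, denominator):
    """Length, in quarter notes, of a full measure in the given time signature."""
    return numerator * beat_quarter_length(denominator)


print(bar_quarter_length(4, 4))   # 4.0
print(bar_quarter_length(2, 4))   # 2.0
print(bar_quarter_length(6, 8))   # 3.0
```

With `tss[0].numerator` and `tss[0].denominator` as arguments, the same function would give the length of a measure under the first time signature of the score.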

⇒ **Note**: Of course, much more information than this can be retrieved from a time signature. If you want to explore it, save one time signature in a variable and run the cell. Then write the name of the variable followed by a period and press the Tab key. A pop-up window will appear with all the options. You can select one and write a question mark `?` directly after it. A new window will appear with the docstring of that attribute or method.

Finally, let's do the same for the clefs. First, we retrieve them.

In [None]:
clefs = p_vocal.flat.getElementsByClass('Clef').stream()

print('The vocal part contains {} clef.'.format(len(clefs)))

To better work with the only clef of the score, we save it in a variable. Pay attention to the attributes called in the next cell.

In [None]:
clef = clefs[0]

print('Offset:', clef.offset)
print('Measure number:', clef.measureNumber)
print('Beat:', clef.beat)
print('Name:', clef.name)
print('Sign:', clef.sign)
print('Line:', clef.line)

## Lyrics and ties
Before moving to Python dictionaries, let's briefly look at two important elements in a score: lyrics and ties.

### Lyrics
Lyrics are specific objects in music21, and like `pitch` and `duration` objects, they are contained in `note` objects. So to explore lyrics, let's take the first note of measure 10 and save it in a variable.

In [None]:
n_m10_0 = nn_vocal_m10[0]

We can access the lyrics of this note by calling the attribute `.lyrics`.

In [None]:
ll = n_m10_0.lyrics

print(ll)

Notice that what the `.lyrics` attribute retrieves is a list, as indicated by the square brackets `[ ]`. Why would a note contain a list of lyrics? This is intended for scores of melodies sung with different lyrics, as is the case with many folk songs. In these cases, the score gives the different stanzas of lyrics under one line of notation. As a result, one note can have more than one line of lyrics.

The list retrieved in the cell above contains only one lyric object. This object has three attributes. `number` indicates the number of the line, for those notes with more than one line of lyrics. `syllabic` indicates the position of the syllable contained in that note within the word to which it belongs. If we sing the word "Despacito", we will sing each syllable with one note. So, in the note containing the lyric "Des-", the value of `syllabic` will be `begin`. In the note with the lyric "-to", the value of `syllabic` will be `end`. And for "-pa-" and "-ci-" the value of `syllabic` will be `middle`. In the case of monosyllabic words, as is the case for all Chinese characters, the `syllabic` value is `single`. And finally, `text` indicates the specific string of that particular lyric.
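
The four `syllabic` values described above can be illustrated with a small plain-Python sketch. The function name `syllabic_values` is hypothetical and is not part of music21; it only reproduces the labelling rule for a word that has already been split into syllables.

```python
def syllabic_values(syllables):
    """Return the syllabic label for each syllable of a single word."""
    if not syllables:
        return []
    if len(syllables) == 1:
        return ['single']
    # First syllable, any number of inner syllables, last syllable
    return ['begin'] + ['middle'] * (len(syllables) - 2) + ['end']


word = ['Des', 'pa', 'ci', 'to']
for syllable, label in zip(word, syllabic_values(word)):
    print(syllable, '->', label)
```

A monosyllabic word, such as a single Chinese character, would receive the label `single`.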

We can access each item of information by calling the corresponding attribute on the lyric object. So let's save it first in a variable, and then call all these three attributes.

In [None]:
l = ll[0]

print('Text:', l.text)
print('Line:', l.number)
print('Syllable with respect to word:', l.syllabic)

However, most of the time we will just be interested in the string of the lyric. We can retrieve it directly from the note with the attribute `.lyric`.

In [None]:
print('The first note of measure 10 in the vocal part has the lyric {}.'.format(n_m10_0.lyric))

⇒ **Note**: if a note has several lines of lyrics, the `.lyric` attribute will return the lyrics from all the lines in a single string, separated by the new line escape sequence (`\n`).

So, now that we know how to retrieve the lyrics from notes, let's print all the lyrics of the notes from measure 10. Remember that they were stored in variable `nn_vocal_m10`.

In [None]:
for n in nn_vocal_m10:
    print(n.lyric)

Well, not all notes have lyrics. So, maybe we could check whether a note has a lyric before trying to print it out.

In [None]:
for n in nn_vocal_m10:
    if n.lyric is not None:
        print(n.lyric)

### Ties
The duration object gives us information about how long a note is performed. However, what happens when two notes are tied? Note objects in music21 have an attribute called `.tie` that holds the information about whether the note is tied or not.

In our score, the last two notes of the vocal part are tied. So, let's save each of them in a variable to better work with them. And just to compare them with an untied note, let's also save the antepenultimate note in a variable.

In [None]:
nx = nn_vocal[-3]
ny = nn_vocal[-2]
nz = nn_vocal[-1]

# Print pitch and duration names to verify that we accessed the right notes
print('The antepenultimate note is a {} {} note.'.format(nx.nameWithOctave, nx.duration.fullName))
print('The penultimate note is a {} {} note.'.format(ny.nameWithOctave, ny.duration.fullName))
print('The last note is a {} {} note.'.format(nz.nameWithOctave, nz.duration.fullName))

So, to check if a note is part of a tie, we can call the attribute `.tie` on it. If it is NOT part of a tie, the value of this attribute will be `None`.

In [None]:
print('Is the antepenultimate note part of a tie?', nx.tie is not None)
print('Is the penultimate note part of a tie?', ny.tie is not None)
print('Is the last note part of a tie?', nz.tie is not None)

Now that we have verified that the penultimate and last notes are part of a tie, we can even know which part of the tie each of them is, either the `start` or the `stop`.

In [None]:
print('The penultimate note is the {} of a tie.'.format(ny.tie.type))
print('The last note is the {} of a tie.'.format(nz.tie.type))

⇒ **Note**: if more than two notes are tied, the value of the `.tie.type` attribute for the notes in the middle will be `continue`.
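
One practical use of the tie type is adding up the durations of tied notes so that they count as a single sounding note. The following is a minimal sketch in plain Python, assuming each note is represented as a hypothetical tuple `(name, quarter_length, tie_type)` with invented values; it does not use music21 objects.

```python
def merge_tied(notes):
    """Merge tied notes into (name, total_quarter_length) pairs.

    A note whose tie type is 'continue' or 'stop' extends the previous note.
    """
    merged = []
    for name, quarter_length, tie_type in notes:
        if tie_type in ('continue', 'stop') and merged:
            previous_name, previous_length = merged[-1]
            merged[-1] = (previous_name, previous_length + quarter_length)
        else:
            merged.append((name, quarter_length))
    return merged


# Hypothetical data: an untied note followed by three tied notes
example_notes = [
    ('E4', 1.0, None),
    ('B4', 2.0, 'start'),
    ('B4', 4.0, 'continue'),
    ('B4', 1.0, 'stop'),
]
print(merge_tied(example_notes))   # [('E4', 1.0), ('B4', 7.0)]
```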

## Python dictionaries
So far, we have been storing our collections of data in lists. However, Python has a more powerful way for storing data: dictionaries.

Each item of data stored in a dictionary contains two elements, a `key` and a `value`. We will see in the next cell what this means, but for the moment, let's just look at the format of a dictionary. While lists are indicated by square brackets (`[ ]`), dictionaries are defined by curly brackets (`{ }`). Items within the dictionary are separated by commas (`,`). And the two elements of each item, that is, the `key` and the `value`, are separated by a colon (`:`). The `key` always comes first.

Let's see an example. We will create a dictionary that contains personal information about a person. We will add that information as `value`s, and we will label each of them with a corresponding `key`.

In [None]:
person_01 = {'name': 'John Smith', 'age': 35, 'height': 1.78, 'married': False}

print(type(person_01))

As you can see, we created a dictionary using the curly brackets, and we saved it in the variable `person_01`.

The dictionary contains four items:

1. In the first one, the `key` is `'name'` and the `value` is `'John Smith'`.
2. In the second one, the `key` is `'age'` and the `value` is `35`.
3. In the third one, the `key` is `'height'` and the `value` is `1.78`.
4. In the fourth one, the `key` is `'married'` and the `value` is `False`.

If we print the type of data that is contained in the variable `person_01`, we get indeed a `dict` (short form of `dictionary`).

To understand the benefits of a dictionary, let's compare it with a list. We could have stored all the information about this person in a list in this way:

    person_01 = ['John Smith', 35, 1.78, False]
    
So, if we want to retrieve any particular item of information, we will need to remember its index. However, with a dictionary, each item of information is "labeled" with a `key`, and we can retrieve that information, that is, that `value`, by calling its corresponding `key`. Let's see an example.

In [None]:
person_01['height']

In the previous cell, we retrieved the information about height by calling the `key` `'height'` within square brackets right next to the name of the variable in which we saved the dictionary. The format looks very similar to indexing. But with dictionaries, we don't need to remember the index of each item of information. We can conveniently call the `key`.

In fact, dictionaries do not accept indexing. If we try to retrieve the same information using the index `2` we will get an error:

In [None]:
person_01[2]

As you may have noticed, the type of error we got is a `KeyError`. This means that Python didn't treat `2` as an index, but as a `key`. Since the variable `person_01` contains a dictionary, if we use square brackets after it, Python understands that we are calling a `key`, not an index.
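
When we are not sure whether a key exists, there are a few standard ways to avoid stopping the program with a `KeyError`. The sketch below uses a copy of the example dictionary under the illustrative name `person`; the `in` operator and the `.get()` method are built-in features of Python dictionaries.

```python
person = {'name': 'John Smith', 'age': 35, 'height': 1.78, 'married': False}

# Option 1: catch the error
try:
    person[2]
except KeyError as err:
    print('There is no such key:', err)

# Option 2: test for the key before using it
print(2 in person)          # False
print('height' in person)   # True

# Option 3: .get() returns None, or a default of our choice, for a missing key
missing = person.get(2)
fallback = person.get('weight', 'unknown')
print(missing)              # None
print(fallback)             # unknown
```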

By the way, pay attention to the data types of the values in our dictionary. The first `value` (`'John Smith'`) is a string. The second (`35`) is an integer. The third (`1.78`) is a floating point number and the fourth (`False`) is a boolean. Dictionary values can be of any data type. Regarding keys, in our dictionary all of them are strings, but they can be of other types too, as long as the type is immutable (technically, hashable): strings, numbers and tuples all work, while lists do not. For example, here's a dictionary in which the keys are integers:

In [None]:
students_per_year = {2015: 23, 2016: 25, 2017: 21, 2018: 21, 2019: 21}
print(students_per_year)

⇒ **Note**: even though in the dictionary saved in the variable `person_01` and in the one in `students_per_year` all the keys are of the same data type, this does not necessarily have to be so. Keys of one dictionary can be of different data types.
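
As a quick illustration, the sketch below builds a dictionary whose keys are a string, an integer and a tuple. The names and values are invented for the example. It also shows the one restriction on keys: they must be hashable, so a list cannot be used as a key.

```python
mixed_keys = {
    'name': 'John Smith',     # string key
    2019: 21,                 # integer key
    (4, 4): 'common time',    # tuple key
}

print(mixed_keys['name'])
print(mixed_keys[2019])
print(mixed_keys[(4, 4)])

# A list is mutable, so Python refuses to use it as a key
list_key_failed = False
try:
    bad = {['a', 'list']: 'not allowed'}
except TypeError as err:
    list_key_failed = True
    print('TypeError:', err)
```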

⇒ **Note**: a dictionary can have items of information with identical values, as you can see in the previous dictionary with the values of the keys `2017`, `2018` and `2019`. However, keys cannot be repeated; they have to be unique. And this makes logical sense: if a dictionary contained two identical keys, how would Python know which one you are referring to when you call it?
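
What actually happens if we write the same key twice? Python does not raise an error; it simply keeps a single item for that key, holding the last value assigned. The dictionary below uses invented numbers to show this.

```python
# The key 2017 appears twice; only the last value survives
students = {2017: 21, 2018: 21, 2017: 30}

print(students)        # {2017: 30, 2018: 21}
print(len(students))   # 2
```

The same thing happens when we assign a value to an existing key with `=`: the old value is replaced rather than a second item being created.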

The information of this second dictionary can be retrieved in the same way as in the first one:

In [None]:
print('How many students were there in 2018?')
print(students_per_year[2018])

So, let's use all the information stored in the first dictionary:

In [None]:
print('The first person is called {}.'.format(person_01['name']))
print('He is {} years old.'.format(person_01['age']))
print('He is {} metres tall.'.format(person_01['height']))
# Since the value of the key 'married' is a boolean, let's print a specific message accordingly
if person_01['married']:
    print('He is married.')
else:
    print('He is not married.')

The `value` of a specific `key` can be modified, or, better said, re-defined by using the operator `=`:

In [None]:
person_01['married'] = True

if person_01['married']:
    print('He is married.')
else:
    print('He is not married.')

Now, the `value` of the `key` `'married'` is no longer `False`, but `True`.

In [None]:
print(person_01)

In fact, we can modify the values of specific keys using all the operators, attributes and methods applicable to the data type of those values. For example, the `value` of the key `age` is an integer. So we can add `2` to that integer using the operator `+=`:

In [None]:
person_01['age'] += 2

print(person_01)

We can expand our dictionaries by adding new items of information. In order to do that, we call the new `key` in square brackets right after the name of the variable that stores the dictionary, and we assign a `value` to it using the `=` operator.

Let's add the hobbies of this person to our dictionary. For that, we call the key `'hobbies'` on the dictionary saved in the variable `person_01`. That `key` doesn't exist yet in the dictionary, so we will create it by assigning a `value` to it using the operator `=`.

Usually, people have more than one hobby. So, since values in a dictionary can be of any data type, let's assign a list to the `key` `'hobbies'` containing all the hobbies of this person.

In [None]:
person_01['hobbies'] = ['painting', 'fencing', 'gardening']

print(person_01)

Now, our dictionary contains a new `key`, `'hobbies'`, with a list as `value`.

As just mentioned, values can be modified using the operators, attributes and methods that are applicable to the data type of those values. Since the `value` of the `key` `'hobbies'` is a list, we can modify it using any of the list methods. For example, let's add a new hobby to the list of hobbies. The way of adding an item to a list is using the `.append()` method. So, let's call this method when calling the `key` `'hobbies'` on the `person_01` dictionary.

In [None]:
person_01['hobbies'].append('knitting')

print(person_01)

Let's now create two new dictionaries to hold information about two new people. However, this time we will create these dictionaries starting from an empty one, and we will add the pairs of `key` and `value` one by one using the `=` operator.

In [None]:
# Dictionary for person_02
## Define empty dictionary
person_02 = {}

## Add information
person_02['name'] = 'Carol Smith'
person_02['age'] = 5
person_02['height'] = 0.95

# Dictionary for person_03
## Define empty dictionary
person_03 = {}

## Add information
person_03['name'] = 'Louise Smith'
person_03['age'] = 2
person_03['height'] = 0.51

# Print the dictionaries
print(person_02)
print(person_03)

Dictionaries can be as complex as you want, since they are very versatile. For example, imagine that the dictionaries we just created correspond to the daughters of the first person. We can add that information to the `person_01` dictionary, as a list containing the `person_02` and `person_03` dictionaries.

In [None]:
person_01['offspring'] = [person_02, person_03]

print(person_01)

And now we can very conveniently retrieve all the different items of information saved in that dictionary.

In [None]:
print('The first person has {} daughters:'.format(len(person_01['offspring'])))

for daughter in person_01['offspring']:
    print('- {}, {} years old, {} metres tall.'.format(daughter['name'], daughter['age'], daughter['height']))

I hope you are starting to get an idea of the potential of dictionaries for storing and organizing information.

Dictionaries, like all Python objects, have their own methods. Arguably the most used ones are the `.keys()` and `.values()` methods, which return, respectively, all the keys and all the values of a dictionary.

Let's see first all the keys of our dictionary `person_01`:

In [None]:
print(person_01.keys())

And now, all its values:

In [None]:
print(person_01.values())

Excellent! Dictionaries are great! But how can they help us in our musicological research?

A common use is gathering information for a series of elements that are unknown a priori. For example, in exercise 2.6 from notebook 7 we could count the aggregated duration of all the pitches of the score. However, that code requires that we know beforehand which pitches the score contains. Then we need to create a variable for each of them, check for each note whether its pitch corresponds to any of those, and so on. The result is quite lengthy code. With dictionaries we can do the same task much more efficiently. We create an empty dictionary and then iterate over the notes. If the note's pitch is not among the keys of our dictionary, we create that key with the duration of the note as its value. If it already is among the keys, we update its value by adding the new duration.

Better seen than explained:

In [None]:
# Create empty dictionary
pitch_durations = {}

# Iterate over all notes
for n in nn_vocal:
    # Retrieve pitch and duration
    n_pitch = n.name
    n_dur = n.quarterLength
    # Check if pitch is already a key in the dictionary
    if n_pitch not in pitch_durations.keys():
        # Since it is not, we create the key and assign the duration as value
        pitch_durations[n_pitch] = n_dur
    else:
        # Since it already is, we update its value by adding the new duration
        pitch_durations[n_pitch] += n_dur

# Print the dictionary
print(pitch_durations)

Besides saving coding space, now we can retrieve detailed information at our convenience. For example, for how long is the pitch E performed?

In [None]:
print('Pitch E is performed for an aggregated duration of {} quarter notes.'.format(pitch_durations['E']))

What about the 4th and 7th degrees, in this case, A and D#?

In [None]:
print('Pitch A is performed for an aggregated duration of {} quarter notes.'.format(pitch_durations['A']))
print('Pitch D# is performed for an aggregated duration of {} quarter notes.'.format(pitch_durations['D#']))

Or we can even print all the information by iterating over the keys of the dictionary:

In [None]:
for p in pitch_durations.keys():
    print('- {} is sung for {} quarter notes.'.format(p, pitch_durations[p]))
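
Once the durations are in a dictionary, a few built-in tools make further questions easy to answer. The sketch below works on a small dictionary with invented values, stored under the hypothetical name `example_durations` so that it does not overwrite `pitch_durations`; it uses the `.items()` method, which returns key and value pairs together, as well as the built-in functions `sum()`, `max()` and `sorted()`.

```python
# Hypothetical aggregated durations, in quarter notes
example_durations = {'E': 12.5, 'B': 9.0, 'A': 3.0, 'D#': 0.5}

# Iterate over keys and values at the same time
for pitch, duration in example_durations.items():
    print('- {} is sung for {} quarter notes.'.format(pitch, duration))

# Total duration of all pitches
total = sum(example_durations.values())

# Pitch with the longest aggregated duration
longest = max(example_durations, key=example_durations.get)

# Pitches ranked from longest to shortest aggregated duration
ranked = sorted(example_durations.items(), key=lambda item: item[1], reverse=True)

print('Total:', total)
print('Most sung pitch:', longest)
print('Ranking:', ranked)
```

Replacing `example_durations` with `pitch_durations` would apply the same steps to the values computed from the score.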