# Music21: offsets, intervals, rests | Intro to Pyplot
This notebook presents new functionalities of music21: using offsets to retrieve objects, creating and working with interval objects, and retrieving information about rests. It also introduces **pyplot**, a collection of functions from the **Matplotlib** library for creating plots.

In this notebook, we will work with the vocal part of our `lsxp-WoBenShi-KongChengJi.xml` score.

In [None]:
from music21 import *

In [None]:
# Path of the folder that contains the score to be loaded
path = './lsxp-WoBenShi-KongChengJi/'

# Name of the score
file_name = 'lsxp-WoBenShi-KongChengJi.xml'

# Join the path of the folder with the file name to get the full path
fn = path + file_name

# Load the score
s = converter.parse(fn)

# Retrieve the vocal part
p_vocal = s.parts[1]

## Using offsets
In notebook 8 we learnt how to retrieve the offset of music21 objects. Now, we will learn the opposite operation, that is, retrieving the object(s) located at a specific offset.

To practice this function let's focus on the first line of the vocal part. It is the line with the lyrics "我本是卧龙岗散淡的人，" which starts in measure 8 and finishes in measure 16. The aim is to retrieve this line according to its starting and ending offsets. But first, let's find out what these offsets are.

Since the start and end of this line do not coincide with full measures, we would need to find the offsets of the first and last notes of the line. So, first of all, let's have all the notes of the vocal part in one variable.

In [None]:
nn_vocal = p_vocal.flat.notes.stream()

The first note of the vocal part is precisely the first note of the first line. So the first note of the first line is the note in index `0` in the stream of all the notes of the vocal part. Let's retrieve it, and print its measure number and beat anyway to double check that it is the note we want.

In [None]:
# First note of the vocal part
line_01_first = nn_vocal[0]

# Print measure and beat to verify that we retrieved the correct note
print("Phrase 1 starts at measure {}, beat {}.".format(line_01_first.measureNumber, line_01_first.beat))

The last note of the first line is the one with index `56` (I counted it). Let's retrieve it too and also print its measure number and beat to verify that we got the correct one.

In [None]:
# Last note of the first line of the vocal part
line_01_last = nn_vocal[56]

# Print measure and beat to verify that we retrieved the correct note
print("Phrase 1 ends at measure {}, beat {}.".format(line_01_last.measureNumber, line_01_last.beat))

Now that we have found the first and last notes of the first line, we can find out at which offsets the first line starts and ends.

In [None]:
print("Phrase 1 starts at offset {} and ends at offset {}.".format(line_01_first.offset, line_01_last.offset))

So, now we know that the first line starts at offset `32.0` and ends at offset `63.0`.

It is common in computational research to work with annotated data. In music-related tasks, annotated datasets usually consist of audio files or symbolic data (the latter are basically midi files or machine-readable scores), accompanied by annotations in different text file formats (in the case of symbolic data, the annotations might be included in the files themselves). In the particular case of the dataset from which our score is taken, the [Jingju Music Scores Collection](https://zenodo.org/record/1464653), the machine-readable scores are accompanied by a spreadsheet containing the starting and ending offsets of all the lines and line sections for all the scores of the dataset, including descriptions of the musical features of each line and section. Therefore, you can automatically retrieve from that annotations file all the melodic lines that correspond to the particular musical features you are interested in. So, knowing how to use these annotations is a very useful skill.

In order to retrieve the object(s) located at a specific offset, music21 provides the `.getElementsByOffset()` method, which can be called on the stream from which you want to retrieve objects. Let's use it first with only one parameter, the starting offset of the first line in our score, that is, `32`. We will call it on the `nn_vocal` variable, which contains a stream with all the notes from the vocal part. As always, the retrieved object(s) will be saved in a stream, which will be assigned to a variable. Finally, let's print the elements contained in the retrieved stream.

In [None]:
# Retrieve objects at offset 32 from the stream of notes of the vocal part
offset_32 = nn_vocal.getElementsByOffset(32).stream()

print(offset_32.elements)

The resulting stream contains three notes, which means that there are three notes at offset `32`. How is that possible? If you look at the score, the first note of the first line has two grace notes. And if you remember, grace notes are located at the same offset as the following main note.

Just to double-check that the method worked, let's print the measure number and beat of the last of these three notes, which is the main note.

In [None]:
print("The main note in offset 32 is in measure {}, beat {}.".format(offset_32[-1].measureNumber, offset_32[-1].beat))

Now, let's do the same for the last note of the first line, in offset `63`.

In [None]:
# Retrieve objects at offset 63 from the stream of notes of the vocal part
offset_63 = nn_vocal.getElementsByOffset(63).stream()

print(offset_63.elements)

Now, we retrieved two notes, because the last note has a grace note. But let's print the measure number and beat of the main note (the second one) to double-check the result.

In [None]:
print("The main note in offset 63 is in measure {}, beat {}.".format(offset_63[-1].measureNumber, offset_63[-1].beat))

So, as you can see, if we input one offset to the `.getElementsByOffset()` method, we can retrieve all the objects positioned at that offset in the stream on which we call the method.

However, we can also input two offsets to the `.getElementsByOffset()` method. In this case, it retrieves all the objects positioned between the first given offset and the second one from the stream on which we call the method. So, if we know that the first line in the vocal part starts at offset `32` and ends at offset `63`, we can call the `.getElementsByOffset()` on the stream where we saved all the notes of the vocal part, giving the starting and ending offsets as parameters. As always, the output will be saved as a stream. To check what we retrieved, let's open it in our score editor.

In [None]:
# Retrieve objects between offsets 32 and 63 from the stream of notes of the vocal part
nn_line_01 = nn_vocal.getElementsByOffset(32, 63).stream()

# Print how many notes have been retrieved
print("The first line contains {} notes.".format(len(nn_line_01.elements)))

# Open the line in a score editor
nn_line_01.show()

Since we called the `.getElementsByOffset()` method on a stream of notes, what we retrieved are only notes. However, if we are interested in retrieving the measures that contain the first line, we can also call the `.getElementsByOffset()` method on the vocal part, which is the stream containing the measures.

In [None]:
# Retrieve measures between offsets 32 and 63 from the vocal part
mm_line_01 = p_vocal.getElementsByOffset(32, 63).stream()

# Open the line in a score editor
mm_line_01.show()

If you noticed, something didn't work properly. The first measure is missing! Why is that? Let's check the docstrings of this method to see how it works.

In [None]:
p_vocal.getElementsByOffset?

As you can see, the `.getElementsByOffset()` method, like most methods in music21 (and in other modules), has a list of parameters with default values. Luckily, the names of these parameters are quite self-explanatory. But if they are not clear enough, the docstrings give a good explanation.

In our case, it seems that the problem is the `mustBeginInSpan` parameter. According to the docstrings description, this parameter "determines whether notes or other objects that do not begin in the region but are still sounding at the beginning of the region are excluded." That is, the first line starts at offset `32`, which happens to be in measure 8. But measure 8 does not start at offset `32`, but at offset `30` (two quarter notes earlier). Therefore, since the `mustBeginInSpan` parameter is set to `True` by default, this measure is not retrieved. The solution is simple: call the `.getElementsByOffset()` method with the `mustBeginInSpan` parameter set to `False`.

In [None]:
# Retrieve measures between offsets 32 and 63 from the vocal part
mm_line_01 = p_vocal.getElementsByOffset(32, 63, mustBeginInSpan=False).stream()

# Open the line in a score editor
mm_line_01.show()

And now it's solved! So bear in mind that most music21 methods have a series of parameters with default values. If something doesn't work quite as you expected, just take a look at the docstrings, by calling the method on an appropriate object without parentheses and followed by a question mark `?`.

## Intervals
So far we have been working with objects already present in the score. However, in our analytical work, we might want to work with elements that are not explicitly present in the score, but that we have to create ourselves. A good example of this are intervals. An interval is the distance between two pitches. We can find the pitches in the score, but this distance is something that we measure. In order to work with intervals, music21 has an interval object, that can be easily created from two given pitches or notes (remember that in music21 pitch is an independent object contained in a note).

So, let's create an interval object using the first two notes of the vocal part. First, let's retrieve these notes and print their name with octave to know which interval to expect.

In [None]:
n1 = nn_vocal[0] # First note of the vocal part
n2 = nn_vocal[1] # Second note of the vocal part

# Print notes' name with octave
print("Note 1:", n1.nameWithOctave)
print("Note 2:", n2.nameWithOctave)

Now, we can create an interval by calling `interval.Interval()` and giving the previous two notes as parameters. We will save the interval object in a variable and just print it out.

In [None]:
itvl_1 = interval.Interval(n1, n2)

print(itvl_1)

An interval object is nothing explicit in the score. So, if we try to open it in our score editor, nothing will be shown.

In [None]:
itvl_1.show()

Now that we have created this interval object, we can retrieve a lot of information about it. The next cell shows just some information that can be retrieved from an interval object. Pay attention to the used attributes.

In [None]:
print("Interval name:", itvl_1.name)
print("Interval 'nice' name:", itvl_1.niceName)
print("Interval class:", itvl_1.intervalClass)
print("Interval directed name:", itvl_1.directedName)
print("Interval directed 'nice' name:", itvl_1.directedNiceName)
print("Interval direction:", itvl_1.direction.name)
print("Interval semitones:", itvl_1.semitones)
print("Interval cents:", itvl_1.cents)

Now pay attention to the interval's name. It has two parts, a letter and a number. The number refers to the interval's number, and the letter refers to the interval's quality: `m` for minor, `M` for major, `P` for perfect, `A` for augmented, and `d` for diminished. In this case, `m3` represents a minor third.

We can even retrieve the notes that form the interval that we created, by calling the `.noteStart` and `.noteEnd` attributes.

In [None]:
itvl_1_n1 = itvl_1.noteStart  # First note of the interval
itvl_1_n2 = itvl_1.noteEnd    # Second note of the interval

# Print information about the notes
print("This interval starts with {} in measure {}, beat {}.".format(itvl_1_n1.nameWithOctave, itvl_1_n1.measureNumber, itvl_1_n1.beat))
print("This interval ends with {} in measure {}, beat {}.".format(itvl_1_n2.nameWithOctave, itvl_1_n2.measureNumber, itvl_1_n2.beat))

Let's now create an interval object for the second and third notes of the vocal part. Since we already have the second note, let's retrieve the third one.

In [None]:
n3 = nn_vocal[2] # Third note of the vocal part

# Print notes' name with octave
print("Note 2:", n2.nameWithOctave)
print("Note 3:", n3.nameWithOctave)

Now, let's create the interval object and retrieve information about it.

In [None]:
# Create the interval object
itvl_2 = interval.Interval(n2, n3)

# Print information
print("Interval name:", itvl_2.name)
print("Interval 'nice' name:", itvl_2.niceName)
print("Interval class:", itvl_2.intervalClass)
print("Interval directed name:", itvl_2.directedName)
print("Interval directed 'nice' name:", itvl_2.directedNiceName)
print("Interval direction:", itvl_2.direction.name)
print("Interval semitones:", itvl_2.semitones)
print("Interval cents:", itvl_2.cents)

This second interval is also a minor third, but differently from the first one, this is a descending interval. Notice how you can find the direction information from the `.directedName` attribute with the inclusion of a minus sign `-`. Consequently, if the direction of the interval is important for us, we should call the `.directedName` attribute. If not, the `.name` attribute will retrieve the same name for intervals with the same number and quality, regardless of their direction.

Now we are ready to analyse our score in terms of intervals. For example, let's count how many perfect fourths are present in the vocal line of this score, regardless of their direction. To do that, we will create an interval between each note and the following one. So, we will iterate over the indexes of all the notes, and create an interval between the note in the current index and the note in the following index. Therefore, we need to iterate over all the indexes up to the penultimate one. Since we are not interested in the interval's direction, we will retrieve the name of the interval using the `.name` attribute. If it is a perfect fourth, `P4`, we will update a previously created counter.

To verify that the code is working properly, we will change the color of the notes that form the found perfect fourth intervals, using the `.style.color` attribute, so that we can easily find them in the score when we open it in our score editor.

In [None]:
### Perfect fourths analysis ###

# Initiate a counter
p4_counter = 0

# Iterate over the indexes of the notes until the penultimate one
for i in range(len(nn_vocal)-1):
    # Note in the current index
    n_start = nn_vocal[i]
    # Note in the following index
    n_end = nn_vocal[i+1]
    # Create interval between the previous two notes
    itvl = interval.Interval(n_start, n_end)
    # Check if the name of the interval is a perfect fourth 'P4'
    if itvl.name == 'P4':
        # Update the counter
        p4_counter += 1
        # Change the color of the starting note to green
        itvl.noteStart.style.color = 'green'
        # Change the color of the ending note to red
        itvl.noteEnd.style.color = 'red'

# Print the result
print('The vocal part contains {} perfect fourths.'.format(p4_counter))

# Open the whole vocal part in a score editor
p_vocal.show()

All the found intervals, as can be seen in the score, are indeed perfect fourths. However, if you look closely, our code has detected perfect fourths in dubious cases, like the one starting with the last note of measure 11, `E4`, and ending with the first note of measure 13, `B3`. The interval between these two notes is indeed a perfect fourth. However, there is a whole empty measure between these two notes (a time during which the instrumental accompaniment plays a melodic filling). So it seems dubious that this case could be considered a valid interval from a perceptual point of view. Therefore, we could discard all cases in which there is a rest between the two candidate notes to form an interval. Consequently, we need to work with rests.

## Rests
Rests are specific objects in music21, and can be handled as all the other music21 objects. For retrieving rests together with the notes of a stream, instead of the attribute `.notes`, we can use the attribute `.notesAndRests` in the same way as `.notes`. This attribute retrieves all the notes and rests, and only notes and rests, from the stream on which it is called.

So, let's call the `.notesAndRests` attribute on the vocal part, the exact same way as the attribute `.notes` is used. Let's compare how many elements are retrieved from both of these attributes.

In [None]:
# Retrieve notes and rests from vocal part
nr_vocal = p_vocal.flat.notesAndRests.stream()

# Compare length of notes and notes-and-rests
print("The vocal part has {} notes.".format(len(nn_vocal.elements)))
print("The vocal part has {} notes and rests.".format(len(nr_vocal.elements)))

Now that we have both notes and rests in the same stream, we might face a problem. If we do our usual iteration over all the elements of a stream with notes and rests, and our loop retrieves the pitch of each element, it will raise an error with the first rest (because rests, obviously, do not contain a pitch object), and the code will stop. Luckily, both note and rest objects have the `.isNote` and `.isRest` attributes, which return `True` if the element on which they are called is, respectively, a note or a rest.

So, let's take the first element of the stream that contains all the notes and rests of the vocal part. Since we don't know if it is a note or a rest, we will save it in an `x` variable. And we will check the `.isNote` and `.isRest` attributes on it.

In [None]:
# First element of the stream with notes and rests
x = nr_vocal[0]

# Check if it is a note
if x.isNote:
    print("The first element of 'nr_vocal' is a note.")
else:
    print("The first element of 'nr_vocal' is NOT a note.")

# Check if it is a rest
if x.isRest:
    print("The first element of 'nr_vocal' is a rest.")
else:
    print("The first element of 'nr_vocal' is NOT a rest.")

So, the first element of vocal part is a rest. Let's see now what kind of information we can retrieve from a rest.

In [None]:
print('Name:', x.name)
print('Full name:', x.fullName)
print('Duration as quarter length:', x.quarterLength)
print('Dots:', x.duration.dots)

Let's go back now to our previous task, the analysis of perfect fourths. We will iterate over the indexes of the stream with all notes and rests, and we will create an interval object only if the element in the current index and the one in the next index are both notes. The rest of the code stays the same.

In [None]:
### Perfect fourths analysis 2.0 ###

# Initiate a counter
p4_counter = 0

# Iterate over the indexes of the notes and rests until the penultimate element
for i in range(len(nr_vocal)-1):
    # Element in the current index
    n_start = nr_vocal[i]
    # Element in the following index
    n_end = nr_vocal[i+1]
    if n_start.isNote and n_end.isNote:
        # Create interval between the previous two notes
        itvl = interval.Interval(n_start, n_end)
        # Check if the name of the interval is a perfect fourth 'P4'
        if itvl.name == 'P4':
            # Update the counter
            p4_counter += 1
            # Change the color of the starting note to green
            itvl.noteStart.style.color = 'green'
            # Change the color of the ending note to red
            itvl.noteEnd.style.color = 'red'

# Print the result
print('The vocal part contains {} perfect fourths.'.format(p4_counter))

# Open the whole vocal part in a score editor
p_vocal.show()

The first version of our code found 32 perfect fourths. This new version finds 27, so it seems that it returns a more precise result. However, if we look at the score, the last note of measure 11 and the first note of measure 13 are colored in green and red, respectively, indicating that they are the starting and ending notes of an interval. So, why is that? Well, these notes were colored the first time we ran the code, and since we didn't modify their color, they are still colored.

We have to keep in mind the permanent changes that our code makes, in case we want to reverse them. We could do two things: coloring all notes back to black, or reload the score. None of the changes we make with our code affect the original file (unless we explicitly program that). Therefore, if we reload a score in the same variable as before, we will get a new "clean" version. So let's do that. Of course, we will need to retrieve the corresponding part and the stream of notes and rests again (otherwise, we will be still using the previous ones).

In [None]:
# Reload the score
s = converter.parse(fn)

# Retrieve vocal part
p_vocal = s.parts[1]

# Retrieve notes and rests
nr_vocal = p_vocal.flat.notesAndRests.stream()

If you now rerun the cell with the "Perfect fourths analysis 2.0" code, those two notes will no longer be colored.

## Introduction to Matplotlib
Doing science does not only consist in obtaining sound results. An important aspect of doing science is communicating these results effectively. When working with quantitative methods, such as the ones that the programming we are learning allows us to apply, plots and charts are a very effective resource for communicating results. [**Matplotlib**](https://matplotlib.org), as it defines itself, "is a comprehensive library for creating static, animated, and interactive visualizations in Python." You can see a series of examples of plots that can be created with **Matplotlib** on the following page: [https://matplotlib.org/gallery/index.html](https://matplotlib.org/gallery/index.html). Within **Matplotlib**, **pyplot** is arguably the most commonly used collection of functions for the creation of plots. The following cells offer an introductory glimpse of **pyplot**.

However, to produce a plot, we need results to be plotted. Let's do a pitch analysis of the vocal part of our score. We will count the aggregated duration of the pitches present in our score and we will plot a bar chart to better interpret the results. Since we don't know *a priori* which pitches are present in the vocal part, we will count them using a dictionary, as it was exemplified in notebook 8. Therefore, the following code is a replica of what was explained there.

In [None]:
# Empty dictionary
pitch_count = {}

# Iterate over all notes and rests
for n in nr_vocal:
    # Check if current element is a note
    if n.isNote:
        # Retrieve pitch name with octave
        n_pitch = n.nameWithOctave
        # Retrieve duration
        n_dur = n.quarterLength
        # Check if the pitch of the current note is NOT yet among the keys of our dictionary
        if n_pitch not in pitch_count.keys():
            # Add this pitch as key with an initial value of the duration of the current note
            pitch_count[n_pitch] = n_dur
        else:
            # Update the value of the current pitch by adding the duration of the current note
            pitch_count[n_pitch] += n_dur

# Print results
# Iterate over the dictionary's keys:
for k in pitch_count.keys():
    # Print the key and its value
    print("- {}: {}".format(k, pitch_count[k]))

These are our results. So now let's create a bar chart using **pyplot** to better interpret them.

In order to do that, the first thing that we need to do is import **pyplot**, which is a part of **Matplotlib**. A common convention, suggested on the **Matplotlib** webpage itself, is giving this module the abbreviated name `plt`. So, the conventional way of importing **pyplot** is like this:

In [None]:
import matplotlib.pyplot as plt

⇒ **Note**: if you don't have **Matplotlib** installed you will get an error. In that case, you just need to install it in the same Anaconda environment from where you are running this notebook. To do that, close the notebook and quit Jupyter, and run, in the corresponding environment, the following command

    conda install matplotlib

Most plots consist of a representation of a series of values over a two-dimensional space, which **pyplot** treats as a Cartesian plane. So, most **pyplot** graphs require information about the positions in the horizontal dimension (x axis) and their corresponding values in the vertical dimension (y axis). The horizontal positions and the vertical values can be given as lists to the **pyplot** functions (of course, these two lists have to be of the same length). All **pyplot** graphs are initiated with a first command that indicates the type of graph, and are closed when they are displayed using the `.show()` function (or saved as an image file).

In our case, we want to create a bar chart. So we need to use the `.bar()` function. This will initiate the creation of our plot. To this function, we have to input a list of horizontal positions and a list with their corresponding values to be displayed in the vertical dimension. Luckily, we already have all this information in our `pitch_count` dictionary. The horizontal positions are the found pitches, which are the keys of our dictionary. And these can be retrieved with the `.keys()` method. The values to be displayed in the vertical dimension are the values of the dictionary, which can be retrieved using the `.values()` method. To close and visualize our plot, we just need to call the `.show()` function.

⇒ **Note**: all **pyplot** commands should be preceded by the module's abbreviation `plt`.

In [None]:
# Initiate the bar chart
plt.bar(pitch_count.keys(), pitch_count.values())
# Close and display the plot
plt.show()

Great! And just with two lines of code! However, the plot is not very intuitive, because the pitches are not organized along the horizontal dimension (x axis). So, let's try to improve our plot.

We can order the keys of our dictionary using the `sorted()` function.

In [None]:
sorted_keys = sorted(pitch_count.keys())

print(sorted_keys)

Yes, now the pitches are ordered, but since they are strings, they are ordered alphabetically. But we would like to order them in terms of pitch height.

A possible solution for that is using music21 to retrieve a numeric pitch height value for these pitch names. With music21 we can retrieve frequencies in Hertz or midi values. Since midi values are just integers, they simplify the task a bit. So, given a string with a pitch name, how do we retrieve its midi value?

To do that, we have to create a pitch object in music21 using that string (this is one example of using pitch as independent from notes). Then, we can retrieve the midi value from this object. Let's test this process first with an example. We know that the midi value of C4 is 60. So let's use this for this example.

In [None]:
# Create a pitch object
myPitch = pitch.Pitch('C4')

# Print the midi value
print("The midi value of C4 is {}.".format(myPitch.midi))

Great! So now, let's do this for all the pitch names in our `pitch_count` dictionary, that is, for all its keys.

We will use the midi values to order the pitch names. But we do not want to display the midi values in our plot, but the pitch names. So, in order to go back to these names, we will create a dictionary which will have the midi values as keys and the pitch names as values.

In [None]:
# Create empty dictionary
pitch_midi = {}

# Iterate over the keys of the pitch_count dictionary
for pitch_name in pitch_count.keys():
    # Create a pitch object using the current pitch name
    pitch_object = pitch.Pitch(pitch_name)
    # Retrieve the midi value
    midi_value = pitch_object.midi
    # Add the midi value as key in the new dictionary, with the pitch name as value
    pitch_midi[midi_value] = pitch_name

# Print the obtained dictionary    
print(pitch_midi)

Since the midi values, that is, the keys of our `pitch_midi` dictionary, are integers, they can be sorted in an increasing way.

In [None]:
# Sort the midi values, which are the keys of the pitch_midi dictionary
sorted_midi = sorted(pitch_midi.keys())

# Print the results
print(sorted_midi)

However, we do not want midi values in our plot, but the pitch names. So let's create a list of ordered pitch names using the now sorted midi values and the `pitch_midi` dictionary.

In [None]:
# Empty list for storing the ordered pitch names
sorted_pitch = []

# Iterate over the ordered midi values
for m in sorted_midi:
    # Retrieve the corresponding pitch name from the pitch_midi dictionary, and
    # append it to the sorted_pitch list
    sorted_pitch.append(pitch_midi[m])

# Print the obtained list    
print(sorted_pitch)

Finally! Now we have the order of pitch names we wanted. Now we are just missing the corresponding ordered list of duration values, which we can retrieve from the first `pitch_count` dictionary, by iterating over the list of ordered pitch names, `sorted_pitch`.

In [None]:
# Empty list for storing the ordered duration values
sorted_values = []

# Iterate over the ordered pitch names
for p in sorted_pitch:
    # Retrieve the corresponding duration value from the pitch_count dictionary, and
    # append it to the sorted_values list
    sorted_values.append(pitch_count[p])

# Print the obtained list
print(sorted_values)

So, now we can create our bar chart again, this time using the list `sorted_pitch` as the parameter for the horizontal dimension (x axis) and the list `sorted_values` as the parameter for the vertical dimension (y axis).

In [None]:
# Initiate the bar chart
plt.bar(sorted_pitch, sorted_values)
# Close and display the plot
plt.show()

This looks much better, and intuitively makes much more sense!

However, we can improve even more this graph, so that pitches are separated according to a distance of tone and semitone between them (that is, the distance between E4 and F#4, which is a tone, should be the double of the distance between D#4 and E4, which is a semitone). The problem is that `sorted_pitch` is just a list of strings, with no numerical information, and **pyplot** displays them just one after the other. However, we do have a sorted list with numerical information: `sorted_midi`. So, let's use this list as the parameter for the horizontal dimension (x axis) and see what happens.

In [None]:
# Initiate the bar chart
plt.bar(sorted_midi, sorted_values)
# Close and display the plot
plt.show()

Now the position of the bars is more meaningful. However, the information in the x axis is midi values, and this is not what we wanted. Even more, **pyplot** treated the x axis as a continuous numeric scale (midi values are numbers, even though they are integers) and automatically decided to show labels only at regular steps. Luckily, **pyplot** allows us to keep this distribution over the x axis, and, at the same time, decide for ourselves which labels we want to display along that axis. And the labels we want to display are the ordered pitch names that we have in the `sorted_pitch` list.

To add this modification to the previous plot, we have to understand how **pyplot** creates plots. As previously mentioned, plots are initiated with a command that defines the type of plot and are closed when the `.show()` function is called (or when they are saved to an image file). This means that after closing the plot, no further changes can be made. Therefore, all the changes we want to apply to a plot should be called between the initiation and closing statements. In this case, we will use the `.xticks()` function to specify the labels (called "ticks") of the x axis. This function takes two parameters: a list with the positions of the ticks and a list with the ticks to be displayed. Since in our plot the x axis is the scale of midi values, we want to display the pitch names in the midi positions to which they correspond, which are saved in the `sorted_midi` list. And the ticks to be displayed are the pitch names themselves, saved in the `sorted_pitch` list.

In [None]:
# Initiate the bar chart
plt.bar(sorted_midi, sorted_values)
# Define ticks for the x axis
plt.xticks(sorted_midi, sorted_pitch)
# Close and display the plot
plt.show()
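As mentioned earlier, a plot can also be closed by saving it as an image instead of displaying it. The sketch below illustrates this with invented values (not our results): it writes the image to an in-memory buffer, an assumption made so that the example creates no file; passing a file name such as `'plot.png'` to `plt.savefig()` would write it to disk instead.

```python
import io
import matplotlib.pyplot as plt

# Invented values (NOT the results of our analysis)
midi_values = [59, 64, 66, 69]
pitch_names = ["B3", "E4", "F#4", "A4"]
durations = [2.0, 5.5, 1.0, 3.0]

# Initiate the bar chart
bars = plt.bar(midi_values, durations)

# Define ticks for the x axis; the function returns the positions and the labels
locs, labels = plt.xticks(midi_values, pitch_names)
tick_texts = [label.get_text() for label in labels]

# Save the plot as a PNG image into a buffer, then close the figure
buffer = io.BytesIO()
plt.savefig(buffer, format="png")
plt.close()

print("Saved {} bytes.".format(buffer.getbuffer().nbytes))
```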

Finally! This plot shows our results in a very intuitive and meaningful way.

The power of **pyplot** is the enormous versatility that it offers for customizing your plots. The options are endless. The next cell shows just a few of them, improving the visual aspect of the plot for its use in a publication. You can find many more options in the [**pyplot** documentation](https://matplotlib.org/contents.html). Besides, its webpage offers a very useful [**pyplot** tutorial](https://matplotlib.org/tutorials/introductory/pyplot.html#sphx-glr-tutorials-introductory-pyplot-py).

In [None]:
# Initiate the bar chart
plt.bar(sorted_midi, sorted_values, color='gray') # Gray bars for black and white printing

# Define ticks for the x axis
plt.xticks(sorted_midi, sorted_pitch, rotation=45) # Give a 45º rotation to the ticks for clearer display

# Define the title of the plot
plt.title("Pitch histogram", size=20) # Fontsize of 20

# Define a label for the x axis
plt.xlabel("pitch", size=15) # Fontsize of 15

# Define a label for the y axis
plt.ylabel("duration (quarter notes)", size=15)  # Fontsize of 15

# Close and display the plot
plt.show()