# Learning to Program: Objects + Libraries

*"Some of the best programming is done on paper, really. Putting it into the computer is just a minor detail."*<br>*- Max Kanat-Alexander*

## Objects

Recall the types of values and data structures we have come across so far:
1. Strings
2. Numbers
3. Lists
4. Dictionaries

However, we have not yet seen a way to define our own type of value that bundles several related values together. For instance, let's say we had a list of response times of infants naming objects:

In [None]:
response_times = [4.5, 3.4, 4.7, 4.0, 3.2, 3.9, 3.4, 4.2]

We could then store the maximum and minimum times using a couple of variables:

In [None]:
min_response_time = 3.2
max_response_time = 4.7

But what if we had another list of response times? Then we would need to add two more variables to store the minimum and maximum times of this list:

In [None]:
response_times_2 = [3.3, 3.5, 4.2, 3.8, 4.3, 3.7, 4.3, 4.2]
min_response_time_2 = 3.3
max_response_time_2 = 4.3

And if we had another list? As the number of lists grows, so will the total number of variables used. Our code will become unreadable!

Luckily, Python offers a solution to this problem. This solution is **objects**. 

An object essentially "bundles" a group of values and treats them as one value. In Python, we can create objects through a **class**, which acts as a blueprint for these objects. 

To make things clear, let's look at an example of a class for our response times problem:

In [None]:
# Class framework
class ResponseTimes:
    # Constructor: parameters are self + values to bundle
    def __init__(self, rt, min_time, max_time):
        # Syntax for each field: self.<field_name> = <field_name_value>
        self.response_times = rt
        self.min_response_time = min_time
        self.max_response_time = max_time
        # No return statement here: __init__ only fills in the new object; the call ResponseTimes(...) gives the object back.

Let's look at this in pieces:
1. **class ResponseTimes:** This is where we specify the name of the class, after the **class** keyword. 
2. **def __init__(self, rt, min_time, max_time):** This is a special function that all classes have. It is the function that allows us to create **instances** of this **blueprint** - in other words, to create **objects** of this **class**. In computer lingo it is called the **constructor** of the class. Don't worry about the meaning of the *self* parameter for now. The parameters rt, min_time and max_time represent the values that we are going to bundle together into our object. 
3. **self.response_times = rt:** This is where we start filling in the object. When we bundle our values in an object, we need a way to distinguish them if we want to access them later. **Field names** allow us to do this. We specify one field name for each value. Here the field name for the response times list is response_times. To make sure that it is associated with our object, we have to prepend it with "self.". Once we have specified the field name, we associate it with its respective value (in this case, rt) through a variable assignment.
4. **self.min_response_time = min_time:** Just like we did above, we now create a field in the object called min_response_time to hold the value of min_time. 
5. **self.max_response_time = max_time:** We now create a field in the object called max_response_time to hold the value of max_time. 
6. Notice that there is no return statement at the end of the __init__ function. That is because __init__ does not build and return the object itself: Python creates the object, __init__ fills in its three fields, and the call to the class then hands the finished object back to us. 

That's a lot, I know. But it will all come together now when we create an object:

In [None]:
response_times = [4.5, 3.4, 4.7, 4.0, 3.2, 3.9, 3.4, 4.2]
min_response_time = 3.2
max_response_time = 4.7

# Create an object using the ResponseTimes blueprint, passing in the appropriate parameters.
responseTimesObject = ResponseTimes(response_times, min_response_time, max_response_time)

Notice here that creating an object following the ResponseTimes blueprint is much like calling a function. In fact, when we do **variable_name = < ClassName >(parameter1, parameter2, ...)**, we are actually calling the < ClassName >'s __init__ function; that is, we are calling its constructor. So here when we do **ResponseTimes(response_times, min_response_time, max_response_time)**, Python calls the __init__ function we created above to set up an object that bundles the three values we gave as arguments.

You may have noticed that we passed in 3 arguments to the __init__ function instead of 4. This is because the 'self' parameter is an implicit parameter: Python takes care of this for us, so we don't have to give it a value. 

We can now retrieve each value accordingly by using the field names we created in the __init__ function:

In [None]:
print("Value of response_times field:", responseTimesObject.response_times)
print("Value of min_response_time field:", responseTimesObject.min_response_time)
print("Value of max_response_time field:", responseTimesObject.max_response_time)

Note here that to retrieve a value in an object we use the syntax < name_of_object >.< field name >. Therefore, responseTimesObject.response_times gives us the list of response times, responseTimesObject.min_response_time gives us the minimum value of the list of response times and responseTimesObject.max_response_time gives us the maximum value of the list of response times. 
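
As a small sketch of what "bundling" means in practice, the built-in `vars()` function shows an object's fields and their values as a dictionary. The class is repeated here so the snippet runs on its own, and the variable names `rt_object`, `fields` and `number_of_trials` are just for illustration.

```python
# The ResponseTimes class from above, repeated so this snippet is self-contained
class ResponseTimes:
    def __init__(self, rt, min_time, max_time):
        self.response_times = rt
        self.min_response_time = min_time
        self.max_response_time = max_time

rt_object = ResponseTimes([4.5, 3.4, 4.7, 4.0, 3.2, 3.9, 3.4, 4.2], 3.2, 4.7)

# vars() returns the object's fields as a dictionary: field name -> value
fields = vars(rt_object)
for field_name, value in fields.items():
    print(field_name, "=", value)

# A field can be used like any other value of its type (here, a list)
number_of_trials = len(rt_object.response_times)
print("Number of response times:", number_of_trials)
```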

We can now repeat the same procedure for the second list of response times:

In [None]:
response_times_2 = [3.3, 3.5, 4.2, 3.8, 4.3, 3.7, 4.3, 4.2]
min_response_time_2 = 3.3
max_response_time_2 = 4.3

responseTimesObject2 = ResponseTimes(response_times_2, min_response_time_2, max_response_time_2)

print("Value of response_times field:", responseTimesObject2.response_times)
print("Value of min_response_time field:", responseTimesObject2.min_response_time)
print("Value of max_response_time field:", responseTimesObject2.max_response_time)
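
Typing the minimum and maximum by hand is error-prone. As a hedged sketch, the built-in `min()` and `max()` functions can compute them for us. The helper name `make_response_times` is hypothetical (it is not part of the class above), and the class is repeated so the snippet runs on its own.

```python
# The ResponseTimes class from above, repeated so this snippet is self-contained
class ResponseTimes:
    def __init__(self, rt, min_time, max_time):
        self.response_times = rt
        self.min_response_time = min_time
        self.max_response_time = max_time

# Hypothetical helper: builds an object and computes the min and max for us
def make_response_times(rt):
    return ResponseTimes(rt, min(rt), max(rt))

all_lists = [
    [4.5, 3.4, 4.7, 4.0, 3.2, 3.9, 3.4, 4.2],
    [3.3, 3.5, 4.2, 3.8, 4.3, 3.7, 4.3, 4.2],
]

# One object per list, instead of three separate variables per list
rt_objects = []
for rt in all_lists:
    rt_objects.append(make_response_times(rt))

for rt_object in rt_objects:
    print("min:", rt_object.min_response_time, "max:", rt_object.max_response_time)
```

With this approach, adding a third list of response times only means adding one more entry to `all_lists`.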

**Exercise:** Let's say you have a list of tagged words:

In [1]:
words_with_tags = ['jump_VERB','television_NOUN','eat_VERB','book_NOUN','cook_VERB']

You have counted the number of verbs and the number of nouns in the list:

In [2]:
number_verbs = 3
number_nouns = 2

Create a class called TaggedWords that bundles a list of tagged words with its number of verbs and number of nouns. Create an object of this class. 

In [4]:
# Solution
class TaggedWords:
    
    def __init__(self, tagged_words_list, number_verbs, number_nouns):
        self.tagged_words_list = tagged_words_list
        self.number_verbs = number_verbs
        self.number_nouns = number_nouns

tagged_words = TaggedWords(words_with_tags, number_verbs, number_nouns)
print("Tagged words list:", tagged_words.tagged_words_list)
print("Number of verbs:",tagged_words.number_verbs)
print("Number of nouns:",tagged_words.number_nouns)

Tagged words list: ['jump_VERB', 'television_NOUN', 'eat_VERB', 'book_NOUN', 'cook_VERB']
Number of verbs: 3
Number of nouns: 2
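
The counts in this exercise were entered by hand. As a minimal sketch, they could also be computed from the list itself, assuming every entry has the form 'word_TAG' with the tag after the last underscore (an assumption based on the five words shown, not a general rule for tagged data).

```python
words_with_tags = ['jump_VERB', 'television_NOUN', 'eat_VERB', 'book_NOUN', 'cook_VERB']

number_verbs = 0
number_nouns = 0
for tagged_word in words_with_tags:
    # Split 'jump_VERB' into the word ('jump') and its tag ('VERB')
    word, tag = tagged_word.rsplit('_', 1)
    if tag == 'VERB':
        number_verbs += 1
    elif tag == 'NOUN':
        number_nouns += 1

print("Number of verbs:", number_verbs)
print("Number of nouns:", number_nouns)
```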


*Brief aside:* We can also add functions to our classes. Recall our ResponseTimes class:

In [None]:
class ResponseTimes:

    def __init__(self, rt, min_time, max_time):
        self.response_times = rt
        self.min_response_time = min_time
        self.max_response_time = max_time
        

If we want a function that calculates the range of the list (the difference between the max_time and the min_time), we can simply add it to our class:

In [None]:
class ResponseTimes:

    def __init__(self, rt, min_time, max_time):
        self.response_times = rt
        self.min_response_time = min_time
        self.max_response_time = max_time
        
    # Function added to calculate the range of the response times list 
    def list_range(self):
        return self.max_response_time - self.min_response_time

And then we can call it on an object:

In [None]:
responseTimesObject = ResponseTimes(response_times, min_response_time, max_response_time)

rt_list_range = responseTimesObject.list_range()
print("List range of response times object:", rt_list_range)

Notice here how we passed in zero arguments instead of one. Again, recall that the 'self' parameter is implicit and Python takes care of it. 
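
To make the implicit 'self' a little less mysterious, here is a minimal sketch: calling a function on an object is equivalent to calling it through the class and passing the object in explicitly as 'self'. The class is repeated so the snippet runs on its own; the variable names `rt_object`, `implicit_result` and `explicit_result` are just for illustration.

```python
# The ResponseTimes class from above, repeated so this snippet is self-contained
class ResponseTimes:
    def __init__(self, rt, min_time, max_time):
        self.response_times = rt
        self.min_response_time = min_time
        self.max_response_time = max_time

    def list_range(self):
        return self.max_response_time - self.min_response_time

rt_object = ResponseTimes([4.5, 3.4, 4.7, 4.0, 3.2, 3.9, 3.4, 4.2], 3.2, 4.7)

# Usual form: Python passes rt_object in as 'self' for us
implicit_result = rt_object.list_range()

# Equivalent explicit form: we pass the object in as 'self' ourselves
explicit_result = ResponseTimes.list_range(rt_object)

print("Same result:", implicit_result == explicit_result)
```

In everyday code the first form is the one to use; the second is shown only to reveal where the value of 'self' comes from.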

## Libraries

In our previous class we mainly talked about functions in Python, which allow us to reuse snippets of code over and over again without the need to copy the code every single time. 

So far, we have only dealt with functions that we have created ourselves. But the code for many of the programming tasks we will be performing day to day has already been written by other programmers. So how can we access this code and avoid reinventing the wheel? 

We can do this through **libraries** in Python. 

A library in Python is a collection of functions that a programmer has written for a specific purpose. For instance, recall this little snippet of code from Class 2:

In [6]:
# Run this before the next snippet of code
title_list = ['One Hundred Years of Solitude', 'Chronicle of a Death Foretold', 'Love in the Time of Cholera']

In [7]:
# Code to select a random title from title_list 
import random
random_title = random.choice(title_list)

In the first line we **imported** the **random** library. When we import a library, we're essentially telling Python that we want to have *all* of the functions from that library (in this case, the random library) at our disposal. 

In the second line, now that we have access to the functions from the random library, we **select** the **choice** function from the library and **call** it, passing in the argument title_list. The end result is a random title chosen from title_list. 

But wait - how did we know that the random library has a function called 'choice'? How do we know what functions the random library has to offer, anyway? 

It turns out that each library in Python offers descriptions of its functions in a document called an **Application Programming Interface (API)**. These APIs usually begin with a general description of the library, followed by a description of each of its functions and their arguments. For instance, the API for the random library, located at https://docs.python.org/2/library/random.html, begins with a brief high-level description of the library (libraries are also called **modules** in some programming circles):


![Screen%20Shot%202019-06-22%20at%201.38.46%20PM.png](attachment:Screen%20Shot%202019-06-22%20at%201.38.46%20PM.png)

And the description for the choice function appears a little bit below:

![Screen%20Shot%202019-06-22%20at%203.12.36%20PM.png](attachment:Screen%20Shot%202019-06-22%20at%203.12.36%20PM.png)
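
The same descriptions can also be reached from inside Python. As a small sketch, the built-in `dir()` lists the names a library offers, and each function carries its description in its `__doc__` attribute. The variable name `public_names` is just for illustration.

```python
import random

# dir() lists every name defined in the library;
# names starting with '_' are internal, so we skip them
public_names = []
for name in dir(random):
    if not name.startswith('_'):
        public_names.append(name)
print(public_names)

# Each function stores its own short description
print(random.choice.__doc__)
```

In an interactive session, `help(random.choice)` shows the same description in a more readable layout.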

**Exercise:** Suppose we have a pool of 3-grams:

In [8]:
three_gram_pool = ['there she was','to be continued','the unicorn book','pillow cases forever','sleeping in shade']

In order to validate a theory we have via a simulated experiment, we need to generate samples of three 3-grams each. Use the random library to create a random sample of size 3 using the three gram pool described above. 

In [9]:
random_sample = random.sample(three_gram_pool, 3)
print(random_sample)

['there she was', 'sleeping in shade', 'the unicorn book']
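
Since the simulated experiment calls for several samples, one possible sketch draws them in a loop. Calling `random.seed()` first makes the "random" results repeat on every run, which helps when a simulation needs to be reproducible. The seed value 42 and the number of samples (4) are arbitrary choices for illustration.

```python
import random

three_gram_pool = ['there she was', 'to be continued', 'the unicorn book',
                   'pillow cases forever', 'sleeping in shade']

# Fix the starting point of the random number generator (42 is arbitrary)
random.seed(42)

samples = []
for i in range(4):
    # random.sample picks 3 different items from the pool, without repeats
    samples.append(random.sample(three_gram_pool, 3))

for sample in samples:
    print(sample)
```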


**Pro tip**: The easiest way to find the API for a library is by googling it. For the above example, if you google 'random api python', you can find the API by clicking on the first result, the one that begins with 'https://docs.python.org'. 

We can also import only specific functions from a library, rather than the entire library, by using the **from** keyword. In that case we don't need to prepend the name of the library (random) when calling the function, as we did before:

In [None]:
from random import choice,shuffle

print("Original title list:",title_list)
random_title = choice(title_list)
print("Random title from original title list:",random_title)
shuffle(title_list)
print("Title list shuffled:",title_list)
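
One detail worth a sketch: `shuffle` rearranges the list it is given in place and returns nothing (`None`), so the original order is lost. If the original order is still needed, one option is to shuffle a copy. The variable names `shuffled_titles` and `result` are just for illustration.

```python
from random import shuffle

title_list = ['One Hundred Years of Solitude', 'Chronicle of a Death Foretold',
              'Love in the Time of Cholera']

# list(...) makes a copy, so title_list itself is left untouched
shuffled_titles = list(title_list)
result = shuffle(shuffled_titles)

print("Return value of shuffle:", result)
print("Original title list:", title_list)
print("Shuffled copy:", shuffled_titles)
```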

We can also rename libraries - especially useful if the library name is long:

In [None]:
import random as rd
random_title = rd.choice(title_list)
print(random_title)

The random library we just worked with is part of a collection of libraries called the *Python Standard Library*. This collection of libraries comes already installed with Python, so all we did with the **import** statement was make it 'visible'.   

However, not all libraries that we'll use are part of the Python Standard Library. They are **external**: they have to be downloaded and installed through special commands. There are many ways to fetch and install external libraries in Python. One of the most popular ways is through the **pip** command. 

To install a library using the **pip** command, use the following syntax:

In [None]:
### do not run this code
!pip install <name-of-library>

For example, one library that is not part of the Python Standard Library and that can be installed through the pip command is Parselmouth, a library for phonetics processing. We can install Parselmouth by running the command:

In [None]:
!pip install praat-parselmouth

If this runs without errors, you should see the line 'Successfully installed praat-parselmouth-0.3.3' or something similar among the last lines of output. 

*Note:* The name that we use to install a library through pip doesn't necessarily have to be the same as the name of the library itself. For instance, even though we used the name 'praat-parselmouth' for the pip command, the name of the library in Python is parselmouth.
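
As a hedged sketch, we can check whether a library is available under its import name before trying to import it, using `find_spec` from the standard library's `importlib.util`. The helper name `is_installed` is hypothetical.

```python
from importlib.util import find_spec

# Hypothetical helper: True if Python can find a library with this import name
def is_installed(import_name):
    # find_spec returns None when no library with this name is found
    return find_spec(import_name) is not None

# random is part of the Python Standard Library, so it is always available
print("random available:", is_installed("random"))

# Use the import name (parselmouth), not the pip name (praat-parselmouth)
print("parselmouth available:", is_installed("parselmouth"))
```

The second line prints True only if the pip command above has been run successfully in the same Python environment.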

Next class we'll take a look at some examples of using this library. 