### Workshop Week 4
Functions/methods, classes, and libraries - what are they and how are they useful?

##### Functions
Let's start with the smallest unit, functions. Functions are reusable blocks of code that operate on data in some way and are called by name within a line of code. Methods are similar, but they are technically associated with an object (though you might find people using these terms fairly interchangeably).
You've likely seen these in class before, and have certainly used them. Python uses the def keyword to start a function.

In [21]:
def my_function(x, y):
    x = x + 5
    return (x+ y)

Note that the above block of code isn't doing anything: that's because we just declared the function and didn't call it. Let's call it real quick:

In [22]:
x = 2
y = 3

result = my_function(x, y)

print(f"x: {x}, y: {y}, result of my_func: {result}")

x: 2, y: 3, result of my_func: 10


You might be surprised by the results of the above code. my_function includes the line x = x + 5, and it does return the result of x + 5 + y properly, but x is still only 2 when we print its value.  

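Immutable objects in Python (like strings or integers) behave (basically) as if they were passed **by value**  
Mutable objects (like lists or dictionaries) behave (basically) as if they were passed **by reference**

In practice, this means that the immutable integers we're passing to my_function **will not be changed** by what you're doing in your function. (Strictly speaking, Python always hands the function a reference to the same object; an integer just can't be modified in place, so x = x + 5 only points the function's local x at a new integer.) A mutable object like a list, on the other hand, can be changed in place: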

In [23]:
def my_function2(x, y):
    x.append(y)

x = ["hello", "world"]
y = "!"

my_function2(x, y)

print(x)

['hello', 'world', '!']
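
To make the distinction a bit more concrete, here is a minimal sketch. The two functions (rebind and mutate are made-up names, not part of the workshop files) receive the same list, but one assigns a new list to its local name while the other modifies the list in place. Only the in-place change should be visible to the caller.

```python
def rebind(items):
    # Assignment points the local name at a brand-new list;
    # the caller's list is left untouched.
    items = items + ["!"]
    return items


def mutate(items):
    # append() changes the existing list object in place,
    # so the caller sees the change.
    items.append("!")


words = ["hello", "world"]

rebound = rebind(words)
after_rebind = list(words)   # snapshot of words after rebind()

mutate(words)
after_mutate = list(words)   # snapshot of words after mutate()

print(after_rebind)
print(after_mutate)
```

So even a mutable object is only changed when the function modifies it in place; reassigning the parameter name inside the function never affects the caller's variable.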


Functions are sort of the building blocks of classes. Classes generally combine functions and/or data attributes, and can (sometimes) be instantiated to create an object. Since we probably won't be creating classes or custom objects in this class, I'm not going to go super in-depth into that for this workshop. We will take a quick look at EnglishTextAnalyzer.py to see a small example of what a class could look like.

In [26]:
from EnglishTextAnalyzer import EnglishTextAnalyzer

#creates an EnglishTextAnalyzer object called "text_analyzer"
text_analyzer = EnglishTextAnalyzer("For never was a story of more woe\nThan this of Juliet and her Romeo")

#uses the get_top_word() method that's built into the EnglishTextAnalyzer class
text_analyzer.get_top_word()

('of', 2)
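
The inside of EnglishTextAnalyzer isn't shown here, so the following is only a rough sketch of how a class like it could be put together. The name SimpleTextAnalyzer and everything inside it are made up for illustration and are not the actual contents of EnglishTextAnalyzer.py. It assumes that lowercasing the text and splitting on whitespace is good enough, which the real class may or may not do.

```python
from collections import Counter


class SimpleTextAnalyzer:
    """Hypothetical stand-in, NOT the real EnglishTextAnalyzer."""

    def __init__(self, text):
        # Data attribute: each object stores its own text.
        self.text = text

    def get_words(self):
        # Assumption: lowercase and split on whitespace, no punctuation handling.
        return self.text.lower().split()

    def get_top_word(self):
        # most_common(1) returns a list holding one (word, count) tuple.
        return Counter(self.get_words()).most_common(1)[0]


analyzer = SimpleTextAnalyzer(
    "For never was a story of more woe\nThan this of Juliet and her Romeo"
)
top_word = analyzer.get_top_word()
print(top_word)
```

The class bundles one piece of data (text) with two methods that work on it, which is the combination of data attributes and functions described above.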

Libraries are collections of classes, functions, etc. that provide reusable tools for doing certain tasks within your program. In Python, there are built-in libraries (like math) as well as external libraries that you have to install using pip (like Spacy and Pandas).

In [27]:
#there are multiple ways to import a library
import pandas
import pandas as pd
from pandas import DataFrame
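
Each import style changes the name you use to reach the library's contents. As a small sketch, here is the same function called three ways using the built-in math library (chosen only because it needs no installation):

```python
import math                 # use the full library name as a prefix
import math as m            # use a shorter alias as a prefix
from math import sqrt       # pull one name in directly, no prefix needed

a = math.sqrt(16)
b = m.sqrt(16)
c = sqrt(16)

print(a, b, c)
```

The same pattern applies to the Pandas imports: pandas.DataFrame, pd.DataFrame, and DataFrame all refer to the same class.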

### What is Pandas?

We've done a lot of work with Pandas and Spacy already, but for the next workshop we are going to go over what you can do with Pandas and Spacy, and a general overview of the data types involved so that we can know exactly what it *is* that we're working with. 

Pandas is a library that helps you analyze and manipulate data. 
A **Pandas DataFrame** is a two-dimensional data structure. Two-dimensional means that it has rows and columns, like a spreadsheet or a table. Each row and column can hold different data and be labeled for easier access. 

In [29]:
import pandas as pd

# Let's start by creating a simple DataFrame with some sample text data.
data = {
    'id': [1, 2, 3, 4, 5, 6, 7],
    'text': [
        "Arma virumque canō, Trōiae quī prīmus ab ōrīs",
        "Ītaliam, fātō profugus, Lāvīniaque vēnit",
        "lītora, multum ille et terrīs iactātus et altō.",
        "vī superum saevae memorem Iūnōnis ob īram;",
        "multa quoque et bellō passus, dum conderet urbem,",
        "inferretque deōs Latiō, genus unde Latīnum,",
        "Albānīque patrēs, atque altae moenia Rōmae."
    ]
}

df = pd.DataFrame(data)

#head() displays the first five rows of your dataframe by default
df.head()


|   | id | text |
|---|---|---|
| 0 | 1 | Arma virumque canō, Trōiae quī prīmus ab ōrīs |
| 1 | 2 | Ītaliam, fātō profugus, Lāvīniaque vēnit |
| 2 | 3 | lītora, multum ille et terrīs iactātus et altō. |
| 3 | 4 | vī superum saevae memorem Iūnōnis ob īram; |
| 4 | 5 | multa quoque et bellō passus, dum conderet urbem, |

In [30]:
df.shape

(7, 2)

In [31]:
#apply() calls a function on every value in a column; here, abs on each id
df["new_column"] = df['id'].apply(abs)
df

|   | id | text | new_column |
|---|---|---|---|
| 0 | 1 | Arma virumque canō, Trōiae quī prīmus ab ōrīs | 1 |
| 1 | 2 | Ītaliam, fātō profugus, Lāvīniaque vēnit | 2 |
| 2 | 3 | lītora, multum ille et terrīs iactātus et altō. | 3 |
| 3 | 4 | vī superum saevae memorem Iūnōnis ob īram; | 4 |
| 4 | 5 | multa quoque et bellō passus, dum conderet urbem, | 5 |
| 5 | 6 | inferretque deōs Latiō, genus unde Latīnum, | 6 |
| 6 | 7 | Albānīque patrēs, atque altae moenia Rōmae. | 7 |


In [32]:
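#assigning to an existing column overwrites it; here, len gives the number of characters in each line
df["new_column"] = df['text'].apply(len)
df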

|   | id | text | new_column |
|---|---|---|---|
| 0 | 1 | Arma virumque canō, Trōiae quī prīmus ab ōrīs | 45 |
| 1 | 2 | Ītaliam, fātō profugus, Lāvīniaque vēnit | 40 |
| 2 | 3 | lītora, multum ille et terrīs iactātus et altō. | 47 |
| 3 | 4 | vī superum saevae memorem Iūnōnis ob īram; | 42 |
| 4 | 5 | multa quoque et bellō passus, dum conderet urbem, | 49 |
| 5 | 6 | inferretque deōs Latiō, genus unde Latīnum, | 43 |
| 6 | 7 | Albānīque patrēs, atque altae moenia Rōmae. | 43 |
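
apply isn't limited to built-in functions like abs and len; any function that takes one value and returns one value should work, including ones you write yourself. As a sketch, count_words below is a made-up helper (not part of Pandas), and the two-row dataframe is a shortened copy of the one used in this section:

```python
import pandas as pd


def count_words(line):
    # Hypothetical helper: counts whitespace-separated words in one string.
    return len(line.split())


small_df = pd.DataFrame({
    "id": [1, 2],
    "text": [
        "Arma virumque canō, Trōiae quī prīmus ab ōrīs",
        "Ītaliam, fātō profugus, Lāvīniaque vēnit",
    ],
})

# apply() calls count_words once for each value in the text column.
small_df["word_count"] = small_df["text"].apply(count_words)
word_counts = small_df["word_count"].tolist()
print(word_counts)
```

Note that the function is passed by name, without parentheses: apply(count_words), not apply(count_words()).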


In [33]:
#each column of your dataframe is a Pandas Series (Pandas converted our list of strings into one), 
#and you can slice it much like how you'd slice a list

print(type(df["text"]))
df["text"][3:6]


<class 'pandas.core.series.Series'>


3           vī superum saevae memorem Iūnōnis ob īram;
4    multa quoque et bellō passus, dum conderet urbem,
5          inferretque deōs Latiō, genus unde Latīnum,
Name: text, dtype: object
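
With the default 0, 1, 2, ... index, positions and labels line up, so plain slicing like this is unambiguous. As a sketch of the more explicit alternatives, .iloc slices by position (end excluded, like a list) and .loc slices by index label (end included). The short Series below is a stand-in with placeholder strings, not the workshop data:

```python
import pandas as pd

lines = pd.Series(["a", "b", "c", "d", "e", "f", "g"], name="text")

by_position = lines.iloc[3:6]   # positions 3, 4, 5 (6 is excluded)
by_label = lines.loc[3:5]       # labels 3, 4, 5 (5 is included)

print(by_position.tolist())
print(by_label.tolist())
```

Both slices select the same three rows here; they would differ once the index is no longer the default one, for example after filtering or sorting.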