What is a function?
-------------------

A function is a piece of code that performs a particular task. Defining one lets you create new "words" that do the job for you, encapsulating a complex task in a simple command.


A function definition has to have several elements:
=============

1. The keyword "def", short for "define".
2. The name of the function. This can be any word you choose, but it is best to pick one that describes or helps you remember the task the function performs. It should not be a word that is already in use, either as part of Python or elsewhere in your program.
3. Parentheses after the name.
4. A colon after the parentheses.
5. A body, indented underneath the line that names the function, that does something.



In [1]:
def doSomething():
    print("I did something")
    
doSomething()

I did something


This was not a particularly exciting function, but it is a function that does something.

1. It has the keyword **def**.
2. It has a name that reminds me what the function does.  **doSomething**
3. It has parentheses after the name.
4. It has a colon after the parentheses.
5. Indented underneath the line that names the function, it does something. **it prints**


The final line of the cell above is not part of the function; it is the line that **calls** the function. You call a function by writing its name followed by parentheses. The program then does what the function tells it to.
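
To see the difference between defining and calling, here is a minimal sketch (the name `say_hello` is made up for this illustration). Defining the function prints nothing; the body runs once for each call.

```python
def say_hello():
    print("Hello!")

# Nothing has been printed so far: the def statement only creates the function.

say_hello()  # first call, the body runs
say_hello()  # second call, the body runs again
```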

Make a function that does something more useful
---------------------------------------------

Make a function that squares a number. It should *print* the squared value of the input number.

In [2]:
def square(number):
    result = number*number
    print(result)
    
square(15)

225


Notice that this function for squaring still has all the required elements:

1. It has the keyword **def**.
2. It has a name that reminds me what the function does.  **square**
3. It has parentheses after the name.
4. It has a colon after the parentheses.
5. Indented underneath the line that names the function, it does something. **it computes a value and prints it**

What's new is that there is a word *inside* the parentheses, *number*, and when you call the function, there is an actual number inside the parentheses.

When you call the function and put the number inside the parentheses, you are sending that number into the function. Inside the function, Python puts that number into the variable in the matching spot in the parentheses, so the word "number" is now a variable that holds a particular value, in this case 15.

Then the function squares the number by multiplying it by itself, stores the result in the variable "result", and then prints it.

Try calling the function a few more times with different numbers sent or "passed" into the function.

In [3]:
square(5)
square(0)
square(-2)

25
0
4


What can you do in a function?
-----------------------------

So what can you do inside a function? Anything you can do in the rest of the program, but it's a good idea to limit every function to just one task. That makes it easier to debug if something goes wrong.

There's one additional thing that functions can do that we need to discuss at this point: when they are finished, they can "return" a value back to the main program. When you call the function you can "pass" values to it; when it finishes, it can "return" one thing back. This allows you to use that returned value in the rest of the program.



In [4]:
def square(number):
    result = number*number
    return result
    
sq15 = square(15)

It doesn't look like that code cell did anything, because it didn't print anything -- but it did do something.  It has now stored a value in the variable sq15.  What value?  Let's print it:


In [5]:
print(sq15)

225


So the value that came from inside the function is now available for use outside the function.
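
The difference between printing and returning is easy to miss, so here is a small sketch comparing the two. The names `square_and_print` and `square_and_return` are made up for this illustration. A function with no `return` statement hands back the special value `None`, so only the returning version can be used in further calculations.

```python
def square_and_print(number):
    # Shows the result on screen, but hands nothing back to the caller.
    print(number * number)

def square_and_return(number):
    # Hands the result back so the caller can keep using it.
    return number * number

# Returned values can be combined in further calculations.
total = square_and_return(3) + square_and_return(4)
print(total)  # 25

# The printing version displays 9, but the value it hands back is None.
nothing = square_and_print(3)
print(nothing)  # None
```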

Let's make another function that takes two numbers, multiplies them together, then returns the result.



In [6]:
def multiply(a,b):
    result = a*b
    return result

multiply(6,4)

24

When you pass values into the function like this, they are assigned in order to the placeholder variables inside the function. So in the call to multiply above, the variable **a** ends up with the value **6** and **b** with the value **4**.

The order matters more when the function does different things with each variable, as in a division function:

In [7]:
def divide(a,b):
    result = a/b
    return result

print("6 divided by 3:  ", divide(6,3))
print("3 divided by 6:  ", divide(3,6))

6 divided by 3:   2.0
3 divided by 6:   0.5


Task 1
-----

Make a function that prints a number every time you call it. It doesn't matter what the number is.

In [2]:
def printthirteen():
    print(13)
    
printthirteen() 

13


In [3]:
printthirteen() 

13


Task 2
------
Make a function that takes an input number and prints the result of dividing it by 3.



In [4]:
def divide(number):
    result = number/3
    print(result)

divide(15) 

5.0


In [5]:
divide(120) 

40.0


Task 3
------

Make a function that takes as input two numbers, then **returns** the sum of all positive integers *between* those two numbers.

For example, if the input was 2 and 5, the returned value should be **7**, because 3+4 = 7.  Notice that 2 and 5 are not included because they are not *between* 2 and 5. 

This function combines everything covered so far: multiple input values, a function body that does something substantial, and a returned value. It will also be an example of the accumulator pattern.

In [6]:
def sumbetween(a,b):
    middle_sum = 0
    # range(a+1, b) yields only the integers strictly between a and b
    for middle_value in range(a+1,b):
        middle_sum = middle_sum + middle_value
    return middle_sum

sumbetween(1,4)

5

In [7]:
sumbetween(10,14) 

36
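
One way to sanity-check an accumulator like this is to compare it with Python's built-in `sum` applied to the same range. The sketch below repeats the function so that it runs on its own; the list of test pairs is just the examples already used in this task.

```python
def sumbetween(a, b):
    middle_sum = 0
    # Accumulate every integer strictly between a and b.
    for middle_value in range(a + 1, b):
        middle_sum = middle_sum + middle_value
    return middle_sum

# Compare the hand-written accumulator with the built-in sum for each pair.
for a, b in [(2, 5), (1, 4), (10, 14)]:
    expected = sum(range(a + 1, b))
    print(a, b, sumbetween(a, b), sumbetween(a, b) == expected)
```

Each printed line should end in `True` if the accumulator agrees with the built-in.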

Task 4
------


I'll get you some new data to look at. In this case it is a data set of 144 thousand tweets harvested from Twitter in 2009. The dataset was developed to aid in training algorithms to recognize the sentiment in human-generated text. The columns of the dataset are:

 * sentiment: a numerical score indicating the emotion inferred from the text of the tweet (0 = negative, 2 = neutral, 4 = positive). These scores were assigned by people reading the tweets.

 * id: a unique numeric identifier for the particular tweet

 * datetime:  the date and time the tweet was sent

 * query: a column you can ignore for now

 * user: the username of the account that sent the tweet

 * text:  the actual text of the sent tweet

In [8]:
import pandas as pd

In [9]:
df = pd.read_csv('../data/twitter.10.csv', header=None, encoding="ISO-8859-1")
df.columns = ['sentiment', 'id', 'datetime', 'query', 'user', 'text']

In [10]:
len(df.user.unique())

105907

There are 105907 unique users in this dataset. Here are the users who sent the first ten tweets:


In [11]:
df['user'].head(10)

0         jacqqq_xo
1           mykerob
2           TukeGuy
3          kevinlcc
4           OStephy
5       singingbell
6    megsapatsfan12
7            Danlol
8          Wildaris
9       bagthoughts
Name: user, dtype: object

Make a function that takes as input the dataframe and the name of a single user.  The function should return the number of tweets sent by that user.

Then use it to find out how many tweets were sent by the first 5 names from the list just above.  Also find out how many tweets were sent by the account "tweetpet".

In [12]:
df.loc[0]['user']=='jacqqq_xo'

True

There were 297 tweets sent by the username "tweetpet".

In [None]:
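# Count the tweets sent by any of the first 5 names from the list.
# Note: this is one combined running total, not a separate count per user.

number_tweets = 0
first_five_users = list(df['user'].head(5))
for row_label in df.index:
    # df.loc[row, column] looks up a single value without chained indexing
    if df.loc[row_label, 'user'] in first_five_users:
        number_tweets = number_tweets + 1
        print(number_tweets)  # running total so far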

1
2
3
4
5
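
One possible shape for the function the task asks for is sketched below. The name `count_tweets_by_user` is my own choice, and `sample_df` is a tiny made-up DataFrame standing in for the real data so that the example runs on its own; its usernames and tweets are invented for the illustration.

```python
import pandas as pd

def count_tweets_by_user(dataframe, username):
    """Return how many rows of dataframe have username in the 'user' column."""
    number_tweets = 0
    # Accumulator pattern: walk through every user entry and count the matches.
    for user in dataframe['user']:
        if user == username:
            number_tweets = number_tweets + 1
    return number_tweets

# Made-up stand-in data with the same 'user' and 'text' columns as the tweet dataset.
sample_df = pd.DataFrame({
    'user': ['alice', 'bob', 'alice', 'carol', 'alice'],
    'text': ['hi', 'hello', 'again', 'hey', 'once more'],
})

print(count_tweets_by_user(sample_df, 'alice'))  # 3
print(count_tweets_by_user(sample_df, 'dave'))   # 0

# The same function can be called once per name, here for the first 3 rows.
for name in sample_df['user'].head(3):
    print(name, count_tweets_by_user(sample_df, name))
```

With the real data you would pass `df` in place of `sample_df` and loop over `df['user'].head(5)` for the names.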


The function you produce here is probably going to be really slow. There are more elegant ways of accomplishing this task, but for now I want you to write your own function to practice that skill. If your function doesn't seem to be working, please check in with me quickly so I can help determine whether it is working but slow, or in fact not working.
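
For reference, one of the more compact approaches looks like the sketch below. It uses a made-up `sample_df` in place of the real data so that it can run by itself. Comparing a column to a value gives a column of True/False values, and summing it counts the True entries; `value_counts` tallies every user at once.

```python
import pandas as pd

# Made-up stand-in data; the real dataset has the same 'user' column.
sample_df = pd.DataFrame({'user': ['alice', 'bob', 'alice', 'carol', 'alice']})

# True where the user matches; each True counts as 1 when summed.
alice_count = (sample_df['user'] == 'alice').sum()
print(alice_count)  # 3

# Tally the tweets for every user in one step.
counts = sample_df['user'].value_counts()
print(counts['alice'])  # 3
print(counts['bob'])    # 1
```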

Once you get the function to work, you can continue exploring the data set in ways that seem interesting to you, or you can submit the assignment as is.