# Python Data Science Toolbox Part 1

## Chapter 1 - Writing Your Own Functions

### Lecture - User Defined Functions

Sometimes you will need functionality that is specific to your needs and that you want to reuse. That is when you create your own user-defined function. The first step is writing the function header, which consists of the keyword def, the function name, a pair of parentheses, and a colon. The example below takes no parameters, so the parentheses are empty.

In [1]:
def square() :     # <- function header
    new_value = 4 ** 2  # <- function body
    print(new_value)
    
square()

16


To allow for user defined parameters, you add a variable name to the function header between the parentheses.

A quick word on parameters and arguments: when you define a function, you write parameters in the function header. When you call a function, you pass arguments into it.

In [2]:
def square(value) :     
    new_value = value ** 2  
    print(new_value)
    
square(4)

16
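
To make the parameter/argument distinction concrete, here is a minimal sketch that reuses the square function from the cell above. The variable side is hypothetical and exists only for this illustration.

```python
def square(value):           # `value` is the parameter named in the header
    new_value = value ** 2
    print(new_value)

square(4)      # 4 is the argument; inside the function, value is 4

side = 5       # hypothetical variable, used only for this illustration
square(side)   # an argument can also be a variable; value is 5 here
```

Each call binds the parameter value to whatever argument was passed, so the same function body works for any number.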


If you don't want the function to print the value directly, but instead want it to return the squared value so that you can assign it to a variable, use the return keyword instead of printing the result.

In [3]:
def square(value) :     
    new_value = value ** 2  
    return new_value
    
num = square(4)
num

16
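
The difference between printing and returning is easiest to see side by side. The sketch below uses two hypothetical function names, square_print and square_return, that are not part of the course code.

```python
def square_print(value):
    '''Print the square of a value (nothing is returned explicitly)'''
    print(value ** 2)

def square_return(value):
    '''Return the square of a value'''
    return value ** 2

printed = square_print(4)     # prints 16, but the call itself evaluates to None
returned = square_return(4)   # prints nothing; the call evaluates to 16

print(printed)        # None
print(returned + 1)   # 17
```

A function without a return statement hands back None, so only the returning version can be used in later calculations.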

Docstrings are another important part of writing a function. A docstring describes what your function does and serves as documentation for anyone reading it, so that they do not have to trace through all the code in the function definition. A function docstring is placed on the line immediately after the function header, between triple quotation marks (''' or """).

In [None]:
def square(value) :
    '''Returns the square of a value'''
    new_value = value ** 2  
    return new_value
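
As a small sketch of why the docstring is useful, Python stores it on the function object in the __doc__ attribute, so it can be read without opening the source.

```python
def square(value):
    '''Returns the square of a value'''
    new_value = value ** 2
    return new_value

# The docstring is available at runtime through the __doc__ attribute
print(square.__doc__)   # Returns the square of a value
```

Calling the built-in help(square) in an interactive session shows the same text together with the function signature.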

### Exercise 1 - User Defined Functions


In [4]:
# Write a simple function

def shout() : 
    '''Print a string with three exclamation marks'''
    shout_word = "congratulations" + "!!!"
    print(shout_word)

shout()

congratulations!!!


In [6]:
#Single parameter function

def shout(word) :
    '''Print a string with 3 exclamation marks'''
    shout_word = word + "!!!"
    print(shout_word)

shout('congratulations')

congratulations!!!


In [7]:
#Functions that return a single value

def shout(word):
    '''Return a string with 3 exclamation marks'''
    shout_word = word + "!!!"
    return shout_word

yell = shout("congratulations")
print(yell)

congratulations!!!


### Lecture - Multiple Parameters and Return Values

Using our square function, we will learn how to pass multiple arguments by allowing the user to define the power as well as the value to be raised to that power. 

In [8]:
def raise_to_power(value1, value2):
    '''Raise value1 to the power of value2'''
    new_value = value1 ** value2
    return new_value

result = raise_to_power(2,3)
print(result)

8
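
With more than one parameter, the order of the arguments matters. The sketch below calls the same function with the arguments swapped. The last call names the parameters explicitly, which is standard Python but goes slightly beyond what the lecture covers.

```python
def raise_to_power(value1, value2):
    '''Raise value1 to the power of value2'''
    new_value = value1 ** value2
    return new_value

print(raise_to_power(2, 3))   # 8, because 2 ** 3
print(raise_to_power(3, 2))   # 9, because 3 ** 2

# Naming the parameters makes the call independent of argument order
print(raise_to_power(value2=3, value1=2))   # 8
```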


#### Tuples
You can also make a function return multiple values by packing them into an object known as a tuple. A tuple is like a list in that it can contain multiple values, but unlike a list, a tuple is immutable: its values cannot be changed once it has been constructed. While lists are defined using square brackets, tuples are defined using parentheses.

Tuples can be unpacked into several variables on a single line. In the example below, you assign the variables a, b and c the values in the tuple in the order they appear in the tuple.

In [9]:
even_nums = (2,4,6)
a,b,c = even_nums
print(a)
print(b)
print(c)

2
4
6
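
Unpacking only works when the number of variables matches the number of values in the tuple. The following sketch shows what happens otherwise. The flag unpacked is a hypothetical name used only to record the outcome.

```python
even_nums = (2, 4, 6)

try:
    a, b = even_nums        # three values, but only two variable names
    unpacked = True
except ValueError as err:
    unpacked = False
    print("Unpacking failed:", err)

print(unpacked)   # False
```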


In addition, you can access individual tuple elements like you do with lists, using square brackets to identify the index of the value you want to select.

In [10]:
even_nums = (2,4,6)
print(even_nums[1])

4


We can modify raise_to_power to return two values, value1 raised to the power of value2 and value2 raised to the power of value1, by packing them into a tuple.

In [11]:
def raise_both(value1, value2) :
    '''Raise value1 to the power of value2 and vice versa'''
    new_value1 = value1 ** value2
    new_value2 = value2 ** value1
    new_tuple = (new_value1, new_value2)
    return new_tuple

result = raise_both(2,3)
print(result)

(8, 9)


### Exercise 2 - Functions with Multiple Parameters

In [12]:
#Functions with multiple parameters

def shout(word1, word2) :
    '''Concatenate strings with 3 exclamation marks'''
    shout1 = word1 + "!!!"
    shout2 = word2 + "!!!"
    new_shout = shout1 + shout2
    return new_shout

yell = shout('congratulations', 'you')
print(yell)

congratulations!!!you!!!


In [13]:
#A brief introduction to tuples

nums = (3,4,6)
nums[0] = 2



TypeError: 'tuple' object does not support item assignment
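
Because a tuple cannot be modified in place, the usual workaround is to build a new tuple. The sketch below catches the error and then constructs a replacement. The name new_nums is hypothetical.

```python
nums = (3, 4, 6)

try:
    nums[0] = 2             # item assignment is not allowed on a tuple
except TypeError as err:
    print("TypeError:", err)

# Build a new tuple instead of changing the existing one
new_nums = (2,) + nums[1:]

print(new_nums)   # (2, 4, 6)
print(nums)       # (3, 4, 6), the original is unchanged
```

Note the trailing comma in (2,): without it, Python would treat the parentheses as ordinary grouping around the integer 2.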

In [17]:
nums = (3,4,6)
num1, num2, num3 = nums

even_nums = (2,4,6)

In [18]:
#Functions that return multiple values
def shout_all(word1, word2):
    '''Return a tuple of both strings, each with 3 exclamation marks'''
    shout1 = word1 + "!!!"
    shout2 = word2 + "!!!"
    shout_words = (shout1, shout2)
    return shout_words

yell1, yell2 = shout_all("congratulations", 'you')

print(yell1)
print(yell2)

congratulations!!!
you!!!


### Lecture - Bringing it all together

Basic ingredients of a function
* Function header, which starts with the keyword def and ends with a colon:
    def raise_both(value1, value2):
* Function body, which begins with a docstring in triple quotation marks that documents what the function does. The rest of the body performs the computation and closes with the keyword return, followed by the value or values the function returns:
    '''Raise value1 to the power of value2 and vice versa'''
    new_value1 = value1 ** value2
    new_value2 = value2 ** value1
    new_tuple = (new_value1, new_value2)
    return new_tuple

### Exercise 3 - Bringing it all together

In [1]:
#Bringing it all together(1)
import os
os.chdir('c:\\datacamp\\data\\')
import pandas as pd
df = pd.read_csv('tweets.csv')

# Initialize an empty dictionary: langs_count
langs_count = {}

# Extract column from DataFrame: col
col = df['lang']

# Iterate over lang column in DataFrame
for entry in col:

    # If the language is in langs_count, add 1 
    if entry in langs_count.keys():
        langs_count[entry] = langs_count[entry] + 1
    # Else add the language to langs_count, set the value to 1
    else:
        langs_count[str(entry)] = 1

# Print the populated dictionary
print(langs_count)

{'en': 97, 'et': 1, 'und': 2}
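
The counting pattern used above does not depend on pandas. As a minimal sketch, the same tally can be written with dict.get, and the standard library's collections.Counter gives the same result. The list langs is hypothetical stand-in data, not the contents of tweets.csv.

```python
from collections import Counter

# Hypothetical stand-in for the 'lang' column; not the real tweets.csv data
langs = ['en', 'en', 'et', 'und', 'en', 'und']

# Same counting loop, using dict.get with a default of 0 instead of if/else
langs_count = {}
for entry in langs:
    langs_count[entry] = langs_count.get(entry, 0) + 1

print(langs_count)   # {'en': 3, 'et': 1, 'und': 2}

# collections.Counter performs the same tally in one call
counter_result = dict(Counter(langs))
print(counter_result == langs_count)   # True
```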


In [2]:
'''Bringing it all together (2)
Wrap the counting code in a function, count_entries(), that takes a DataFrame
and a column name and returns the counts for that column'''

import pandas as pd
tweets_df = pd.read_csv('tweets.csv')

def count_entries(df, col_name):
    """Return a dictionary with counts of 
    occurrences as value for each key."""

    # Initialize an empty dictionary: langs_count
    langs_count = {}
    
    # Extract column from DataFrame: col
    col = df[col_name]

    # Iterate over lang column in DataFrame
    for entry in col:

        # If the language is in langs_count, add 1
        if entry in langs_count.keys():
            langs_count[entry] = langs_count[entry] + 1
        # Else add the language to langs_count, set the value to 1
        else:
            langs_count[str(entry)] = 1

    # Return the langs_count dictionary
    return langs_count

# Call count_entries(): result
result = count_entries(tweets_df,'lang')

# Print the result
print(result)

{'en': 97, 'et': 1, 'und': 2}
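
One point worth checking in a function like count_entries is that its body reads from its own parameters rather than from a variable defined outside the function. Otherwise, passing a different dataset has no effect. The sketch below illustrates this with a pandas-free stand-in: the parameter rows, the lists tweets and others, and their contents are all hypothetical.

```python
def count_entries(rows, col_name):
    '''Return a dictionary with counts of occurrences as value for each key'''
    counts = {}
    for row in rows:                  # reads the `rows` parameter, not a global
        entry = row[col_name]
        counts[entry] = counts.get(entry, 0) + 1
    return counts

# Hypothetical data, not taken from tweets.csv
tweets = [
    {'lang': 'en', 'source': 'web'},
    {'lang': 'et', 'source': 'web'},
    {'lang': 'en', 'source': 'app'},
]
others = [{'lang': 'fr'}]

print(count_entries(tweets, 'lang'))     # {'en': 2, 'et': 1}
print(count_entries(tweets, 'source'))   # {'web': 2, 'app': 1}
print(count_entries(others, 'lang'))     # {'fr': 1}
```

Because the function depends only on what is passed in, the same code works for any column and any dataset.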
