## Python Data Science Toolbox (Part 1)

## Chapter 2 - Scope, Default Arguments and Variable Length Arguments

### Lecture - Scope and User Defined Functions

So far in these courses, variables have been created and used without any issues. However, not every name that is defined is accessible everywhere in a program. The scope of a name is the part of the program where that name can be accessed. There are three types of scope:
* Global scope - names defined in the main body of a program/script
* Local scope - names defined inside a function. Once the function has finished executing, any name in its local scope ceases to exist.
* Built-in scope - names in the pre-defined built-ins module that Python provides

#### Global vs Local Scope (1)
In the square function from Chapter 1, the function body uses a variable called new_val. Once the function has finished executing, new_val can no longer be accessed. This is because new_val was defined only within the local scope of the function, not globally.

In [1]:
def square(value):
    """Returns the square of a number"""
    new_val = value ** 2
    return new_val
square(3)

9

In [2]:
new_val

NameError: name 'new_val' is not defined

#### Global vs Local Scope (2)
Now suppose we define new_val in the global scope before we define and call square. Whenever we reference the name in the global scope, we get the globally defined value. Whenever we reference the name inside the function, Python looks first in the local scope, and only if it cannot find the name there does it look in the global scope.

In [3]:
new_val = 10
def square(value):
    """Returns the square of a number"""
    new_val = value ** 2
    return new_val
square(3)

9

In [4]:
new_val

10

For example, in the code below the function references new_val, which is defined only in the global scope. The global value is looked up when the function is called, not when the function is defined. Thus, if new_val is reassigned, the next call returns the new value squared.

In [7]:
new_val = 10
def square(value):
    """Returns the square of a number"""
    new_val2 = new_val ** 2
    return new_val2
square(3)

100

In [8]:
new_val = 20
square(3)

400

As a recap, when we reference a name, first the local scope is searched, then the global. If the name is in neither, then the built-in scope is searched. 
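
As a minimal sketch of that lookup order, the function below uses one local name, one global name, and one built-in name. The names `scale` and `rescale` are made up for this illustration and are not part of the course code.

```python
scale = 10  # global name (hypothetical, for illustration only)

def rescale(values):
    """Return the sum of values multiplied by the global scale"""
    total = sum(values)   # total is local; sum is found in the built-in scope
    return total * scale  # scale is not local, so the global value is used

print(rescale([1, 2, 3]))  # 60
```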

#### Global vs Local Scope (3)
What if we want to alter the value of a global name within a function call? This is where the keyword global is used. Within the function definition, we write the keyword global followed by the name of the variable. In the example below, we declare new_val as global inside the function, square it, and return the result. When we check new_val after running the function, we see that the global value has been updated by the call.

In [10]:
new_val = 10
def square(value):
    """Returns the square of a number"""
    global new_val
    new_val = new_val ** 2
    return new_val
square(3)

100

In [11]:
new_val

100
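
For contrast, here is a sketch of what happens when the global declaration is left out. The function name `square_without_global` is hypothetical. Because the body assigns to new_val, Python treats the name as local for the whole function, so reading it on the right-hand side fails.

```python
new_val = 10

def square_without_global():
    """Try to update new_val without declaring it global"""
    # Assigning to new_val anywhere in the body makes it local to the function,
    # so the read on the right-hand side happens before it has a local value.
    new_val = new_val ** 2
    return new_val

try:
    square_without_global()
except UnboundLocalError as error:
    print("UnboundLocalError:", error)

print(new_val)  # 10, the global value is unchanged
```

The exact wording of the error message depends on the Python version, so it is not reproduced here.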

### Exercise 1

In [13]:
#The Keyword Global

team = "team titans"

def change_team():
    '''Change the value of the global variable team.'''
    global team
    team = 'justice league'

print(team)
change_team()
print(team)

team titans
justice league


### Lecture - Nested Functions

#### Nested Functions - Scope Search
Scope becomes important when functions are defined inside other functions. In the example below, Python first searches the scope of the inner function for x, then the scope of the outer function. The outer function is referred to as the enclosing function of the inner function. Only if Python cannot find x in the scope of the enclosing function does it search the global scope and then the built-in scope.

def outer(....):
  """..."""
  x = ...
  
  def inner(...):
     """..."""
     y = x ** 2
  
  return....
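
The template above can be made concrete with a small sketch. All names here (`outer`, `inner_local`, `inner_enclosing`, `no_enclosing`) are hypothetical and chosen only to show which scope supplies x in each case.

```python
x = "global x"  # hypothetical names, for illustration only

def outer():
    """Call two inner functions that resolve x in different scopes"""
    x = "enclosing x"

    def inner_local():
        x = "local x"   # found in the local scope of inner_local
        return x

    def inner_enclosing():
        return x        # not local, so the enclosing scope of outer is used

    return inner_local(), inner_enclosing()

def no_enclosing():
    return x            # not local and no enclosing function: global scope is used

print(outer())          # ('local x', 'enclosing x')
print(no_enclosing())   # global x
```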
  
#### Nested Functions - Value of
Nested functions are useful when the same process has to be applied several times within a function. In the example below we have written out the computation for each argument separately, which is repetitive and does not scale.

In [None]:
def mod2plus5(x1, x2, x3):
    '''Returns the remainder plus 5 of three values'''
    newx1 = x1 % 2 + 5
    newx2 = x2 % 2 + 5
    newx3 = x3 % 2 + 5
    
    return(newx1, newx2, newx3)

Instead, we can create an inner function, as shown below, and call it when necessary. The inner function is called a nested function, and its syntax is exactly the same as that of any other function.

In [1]:
def mod2plus5(x1, x2, x3):
    '''Returns the remainder plus 5 of 3 values'''
    def inner(x):
        '''Returns the remainder plus 5 of a value'''
        return x % 2 + 5
    return(inner(x1), inner(x2), inner(x3))

print(mod2plus5(1,2,3))

(6, 5, 6)


#### Returning Functions
Another valuable use case for a nested function is the ability to return a function. In the example below, raise_val returns the nested function inner.

Passing the number 2 to raise_val() creates a function that squares any number. Passing the number 3 to raise_val() creates a function that cubes any number. One interesting detail is that when we call square, it remembers the value n = 2 from when square was created, even though the enclosing scope defined by raise_val(), to which n = 2 is local, has finished executing. This subtlety is referred to as a closure in computer science, and it is worth being aware of in case you encounter it in the real world.

In [2]:
def raise_val(n):
    '''Return the inner function'''
    
    def inner(x):
        "Raise x to the power of n"
        raised = x ** n
        return raised
    
    return(inner)

square = raise_val(2)
cube = raise_val(3)
print(square(5), cube(8))

25 512
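
To see that each returned function really carries its own value of n, the sketch below inspects the closure directly. It redefines raise_val so that it is self-contained, and uses the standard function attributes `__closure__` and `__code__.co_freevars`; inspecting them is only for illustration and is not something the course requires.

```python
def raise_val(n):
    """Return a function that raises its argument to the power of n"""
    def inner(x):
        return x ** n
    return inner

square = raise_val(2)
cube = raise_val(3)

# inner refers to one name from its enclosing scope: n
print(square.__code__.co_freevars)          # ('n',)

# Each returned function keeps the n it was created with
print(square.__closure__[0].cell_contents)  # 2
print(cube.__closure__[0].cell_contents)    # 3
```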


#### Using Nonlocal Variables
Just as the keyword global lets a function create and change global names, the keyword nonlocal lets a nested function change a name in its enclosing function (the name must already exist there). In the example below, inner() rebinds n using the nonlocal keyword, so when outer() is called both print statements show 2, the value assigned in inner(), rather than the value n = 1 first assigned in the enclosing function.

In [4]:
def outer():
    """Prints the value of n"""
    n = 1
    
    def inner():
        nonlocal n
        n = 2
        print(n)
    
    inner()
    print(n)

outer()

2
2
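
A common place where nonlocal is useful is a function that keeps state between calls. The sketch below is one hypothetical example (the names `make_counter` and `increment` are not from the course): the nested function updates a count that lives in the enclosing scope.

```python
def make_counter():
    """Return a function that counts how many times it has been called"""
    count = 0

    def increment():
        nonlocal count   # rebind count in make_counter's scope, not a new local
        count += 1
        return count

    return increment

counter = make_counter()
print(counter(), counter(), counter())  # 1 2 3
```

Without the nonlocal line, `count += 1` would be treated as an assignment to a new local name and the call would fail.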


#### Summary
Scope search prioritization is known as the LEGB rule:
* Local scope
* Enclosing function
* Global scope
* Built-in scope

Names assigned inside a function are created in its local scope unless they are first declared with the keyword global or nonlocal.

### Exercise 2

In [9]:
# Nested Functions (1)

#Define three_shouts
def three_shouts(word1, word2, word3):
    """Returns a tuple of strings concatenated with '!!!'"""
    
    #Define inner()
    def inner(word):
        """Returns a string concatenated with '!!!'"""
        return word + "!!!"
    
    #Return a tuple of strings
    return (inner(word1), inner(word2), inner(word3))

print(three_shouts('a','b','c'))

('a!!!', 'b!!!', 'c!!!')


In [10]:
# Nested Functions (2)

def echo(n):
    "Return the inner_echo function"
    
    #define inner_echo
    def inner_echo(word1):
        "concatenates n copies of word1"
        echo_word = word1 * n
        return echo_word
    
    return inner_echo

twice = echo(2)
thrice = echo(3)

print(twice("hello"), thrice('world'))

hellohello worldworldworld


In [17]:
#Keyword nonlocal and nested functions

def echo_shout(word):
    "Change the value of a nonlocal variable"
    
    #Concatenate word with itself and set to variable echo_word
    echo_word = word + word
    print(echo_word)
    
    def shout():
        "Alter enclosing scope"
        #Use echo_word in a nonlocal scope
        nonlocal echo_word 
        echo_word = echo_word + "!!!"
    
    shout()
    print(echo_word)

echo_shout('hello')

hellohello
hellohello!!!


### Lecture - Default and Flexible Arguments
When a function takes multiple parameters and some of them usually have the same value, we can set default values for those parameters so that the function can be called without specifying a value for every parameter. Flexible arguments allow a function to accept any number of arguments.

#### Adding a Default Argument
To define a function with a default value for one of the parameters, in the function header we follow the parameter of interest with an equal sign and a default value.

In [18]:
def power(number, pow = 1):
    '''Raise number to the power of pow'''
    new_value = number ** pow
    return new_value

print(power(9,2))
print(power(9))

81
9
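
As a short sketch of the different ways such a function can be called, the block below repeats power() and overrides the default both by position and by name.

```python
def power(number, pow=1):
    """Raise number to the power of pow"""
    return number ** pow

print(power(9))                 # 9: pow falls back to its default of 1
print(power(9, 2))              # 81: the second positional argument overrides the default
print(power(9, pow=2))          # 81: the default can also be overridden by name
print(power(pow=3, number=2))   # 8: keyword arguments can be given in any order
```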


#### Flexible Arguments: *args (1)

When a function needs to accept any number of arguments, placing a star in front of a parameter name, conventionally args, collects all the positional values passed in the call into a tuple called args. This tuple can then be used in the body of the function.

In [20]:
def sum_all(*args):
    """Sum all values in *args together"""
    total = 0  # named total so it does not shadow the function name sum_all
    for num in args:
        total += num
    return total

sum_all(5,6,7,8,9,10)

45
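
The sketch below illustrates two related points under a few assumptions: `show_args` and the list `numbers` are hypothetical names. It shows that the collected values arrive as a tuple, and that a star at the call site does the reverse, unpacking an existing list into separate arguments.

```python
def sum_all(*args):
    """Sum all values in *args together"""
    total = 0
    for num in args:
        total += num
    return total

def show_args(*args):
    """Return args unchanged so its type can be inspected"""
    return args

print(show_args(1, "a"))        # (1, 'a'): the values are collected into a tuple
print(sum_all())                # 0: with no arguments, args is an empty tuple

numbers = [5, 6, 7, 8, 9, 10]   # hypothetical list, for illustration
print(sum_all(*numbers))        # 45: same as sum_all(5, 6, 7, 8, 9, 10)
```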

#### Flexible Arguments: **kwargs
A double star is used to pass an arbitrary number of keyword arguments (kwargs), that is, arguments preceded by identifiers. Using the double star with the parameter kwargs collects the identifier-value pairs into a dictionary named kwargs that is available in the function body.

In [23]:
def print_all(**kwargs):
    """Print out the key-value pairs in **kwargs"""
    for key, value in kwargs.items():
        print(key + ": " + value)

print_all(name = 'Janine', job = 'Mom')

name: Janine
job: Mom


Note that it is not the names args and kwargs that are meaningful, but the use of * or **. A single * collects the extra positional arguments into a tuple, and ** collects the keyword arguments into a dictionary; in each case the tuple or dictionary takes whatever name follows the stars.

def sum_all(* sa_tup):
def print_all(** pa_dict):

Both work the same as using args and kwargs.

In [29]:
def sum_all(*sa_tup):
    print(sa_tup)
    sum_num = 0
    for num in sa_tup:
        sum_num += num
    print(sum_num)

sum_all(5,6,7,8)

def print_all(**pa_dict):
    print(pa_dict)
    for key, value in pa_dict.items():
        print(key + ": " + value)

print_all(name = "Janine", job = "Mom")

(5, 6, 7, 8)
26
{'name': 'Janine', 'job': 'Mom'}
name: Janine
job: Mom
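
The two forms can also be combined in a single function header. The sketch below uses a hypothetical function, `describe_call`, that takes one regular parameter followed by both kinds of flexible argument, and returns what each one received.

```python
def describe_call(first, *args, **kwargs):
    """Return how the arguments of a call were collected"""
    return first, args, kwargs

print(describe_call(1, 2, 3, unit="cm", label="x"))
# (1, (2, 3), {'unit': 'cm', 'label': 'x'})
```

Extra positional values go to the tuple and named values go to the dictionary, so the order in the header is regular parameters, then the starred name, then the double-starred name.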


### Exercise 3

In [30]:
# Functions with one default argument

def shout_echo(word1, echo = 1):
    echo_word = word1 * echo
    shout_word = echo_word + "!!!"
    return shout_word

no_echo = shout_echo("Hey")
with_echo = shout_echo("Hey", 5)
print(no_echo)
print(with_echo)

Hey!!!
HeyHeyHeyHeyHey!!!


In [33]:
# Functions with multiple default arguments

def shout_echo(word1, echo = 1, intense = False):
    echo_word = word1 * echo
    if intense is True:
        echo_word_new = echo_word.upper() + "!!!"
    else:
        echo_word_new = echo_word + "!!!"
    return echo_word_new

with_big_echo = shout_echo("Hey", 5, True)
big_no_echo = shout_echo("Hey", intense = True) #note you have to include the argument name intense for this to work properly

print(with_big_echo)
print(big_no_echo)

HEYHEYHEYHEYHEY!!!
HEY!!!


In [34]:
#Functions with variable length arguments

def gibberish(*args):
    """Concatenate strings in *args together"""
    hodgepodge = ''
    for word in args:
        hodgepodge += word
    return hodgepodge

one_word = gibberish('luke')
many_words = gibberish('luke', 'leia','han', 'obi','darth')

print(one_word)
print(many_words)

luke
lukeleiahanobidarth


In [38]:
#Functions with variable length keyword arguments

def report_status(**kwargs):
    """Print out the status of a movie character"""
    print("\nBEGIN: REPORT\n")
    for key, value in kwargs.items():
        print(key + ": " + value)
    print("\n END REPORT")

report_status(name = 'luke', affiliation = 'jedi', status = 'missing')
report_status(name = 'anakin', affiliation = 'sith lord', status = 'deceased')


BEGIN: REPORT

name: luke
affiliation: jedi
status: missing

 END REPORT

BEGIN: REPORT

name: anakin
affiliation: sith lord
status: deceased

 END REPORT


### Exercise 4

In [1]:
#Bringing it all together using the count_entries example from Chapter 1
import os
os.chdir('c:\\datacamp\\data\\')
import pandas as pd
tweets_df = pd.read_csv('tweets.csv')

def count_entries(df, col_name = 'lang'):
    """Returns a dictionary with counts of occurrences as value for each key"""
    col_count = {}
    col = df[col_name]
    for entry in col:
        if entry in col_count.keys():
            col_count[entry] += 1
        else:
            col_count[entry] = 1
    return col_count

result1 = count_entries(tweets_df)
result2 = count_entries(tweets_df, 'source')

print(result1)
print(result2)

{'en': 97, 'et': 1, 'und': 2}
{'<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>': 24, '<a href="http://www.facebook.com/twitter" rel="nofollow">Facebook</a>': 1, '<a href="http://twitter.com/download/android" rel="nofollow">Twitter for Android</a>': 26, '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>': 33, '<a href="http://www.twitter.com" rel="nofollow">Twitter for BlackBerry</a>': 2, '<a href="http://www.google.com/" rel="nofollow">Google</a>': 2, '<a href="http://twitter.com/#!/download/ipad" rel="nofollow">Twitter for iPad</a>': 6, '<a href="http://linkis.com" rel="nofollow">Linkis.com</a>': 2, '<a href="http://rutracker.org/forum/viewforum.php?f=93" rel="nofollow">newzlasz</a>': 2, '<a href="http://ifttt.com" rel="nofollow">IFTTT</a>': 1, '<a href="http://www.myplume.com/" rel="nofollow">Plume\xa0for\xa0Android</a>': 1}


In [1]:
'''Bringing it all together (2)
generalize this function one step further by allowing the user to pass it a flexible argument, that is, in this case, 
as many column names as the user would like'''

import pandas as pd
tweets_df = pd.read_csv('C:\\datacamp\\data\\tweets.csv')

def count_entries(df, *args):
    '''Return a dictionary with counts of occurrences as value for each key'''
    cols_count = {}
    
    #Iterate over column names in args
    for col_name in args:
        col = df[col_name]
        for entry in col:
            if entry in cols_count.keys():
                cols_count[entry] += 1
            else:
                cols_count[entry] = 1
    #Return the counts once every column in args has been processed
    return cols_count

result1 = count_entries(tweets_df, "lang")
result2 = count_entries(tweets_df, "source")

print(result1)
print(result2)

{'en': 97, 'et': 1, 'und': 2}
{'<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>': 24, '<a href="http://www.facebook.com/twitter" rel="nofollow">Facebook</a>': 1, '<a href="http://twitter.com/download/android" rel="nofollow">Twitter for Android</a>': 26, '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>': 33, '<a href="http://www.twitter.com" rel="nofollow">Twitter for BlackBerry</a>': 2, '<a href="http://www.google.com/" rel="nofollow">Google</a>': 2, '<a href="http://twitter.com/#!/download/ipad" rel="nofollow">Twitter for iPad</a>': 6, '<a href="http://linkis.com" rel="nofollow">Linkis.com</a>': 2, '<a href="http://rutracker.org/forum/viewforum.php?f=93" rel="nofollow">newzlasz</a>': 2, '<a href="http://ifttt.com" rel="nofollow">IFTTT</a>': 1, '<a href="http://www.myplume.com/" rel="nofollow">Plume\xa0for\xa0Android</a>': 1}
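
Since tweets.csv is not included here, the sketch below checks the multi-column behaviour on a small stand-in. The dictionary `toy_df` and its values are hypothetical: a plain dict of lists supports `df[col_name]` and iteration in the same way this function uses a DataFrame column, which is enough to show that the return statement must sit after the outer loop for every requested column to be counted.

```python
def count_entries(df, *args):
    """Return a dictionary with counts of occurrences as value for each key"""
    cols_count = {}
    for col_name in args:
        col = df[col_name]
        for entry in col:
            if entry in cols_count:
                cols_count[entry] += 1
            else:
                cols_count[entry] = 1
    # Return only after every requested column has been processed
    return cols_count

# Hypothetical stand-in for tweets_df, for illustration only
toy_df = {
    "lang": ["en", "en", "et"],
    "source": ["web", "iphone", "web"],
}

print(count_entries(toy_df, "lang"))            # {'en': 2, 'et': 1}
print(count_entries(toy_df, "lang", "source"))  # {'en': 2, 'et': 1, 'web': 2, 'iphone': 1}
```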
