### grp

# Course: _Python Data Science Toolbox (Part 1)_:
1.  user defined functions
2.  arguments and scope
3.  lambda functions and error-handling

## _1. Writing your own Functions_:
-  functions w/o parameters
-  functions w/ 1 parameter
-  functions that return 1 value
-  docstrings (***describe what your function does***)
-  functions w/ multiple parameters (***# of arguments = # of parameters***)
-  tuples (immutable [***can't modify values***] objects):
    -  uses parentheses
    -  unpack tuple into several variables
    -  return multiple function values
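A minimal sketch of what "immutable" means in practice. The names `point`, `error_message`, and `updated` are illustrative only and do not come from the course.

```python
# Tuples support indexing but not item assignment.
point = (3, 4, 6)

try:
    point[0] = 10  # attempting to modify a tuple element
except TypeError as err:
    error_message = str(err)
    print(error_message)

# To "change" a tuple, build a new one instead.
updated = (10,) + point[1:]
print(updated)
```

The original tuple is left untouched; only a new tuple is created.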

### assigning the result of a function that only prints (and does not return) a value gives _None_ (type NoneType)! ... better to _return_ a value than to _print()_ it
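A small sketch of the difference, using two hypothetical functions (`shout_print` and `shout_return`) that are not part of the course code.

```python
def shout_print(word):
    """Print the shouted word; implicitly returns None."""
    print(word + '!!!')

def shout_return(word):
    """Return the shouted word so the caller can reuse it."""
    return word + '!!!'

printed = shout_print('hello')    # prints hello!!! but hands back None
returned = shout_return('hello')  # hands back the string itself

print(printed is None)
print(returned)
```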

#### string manipulation

In [1]:
object1 = "data" + "analysis" + "visualization"
object2 = 1 * 3
object3 = "1" * 3
print(object1)
print(object2)
print(object3)

dataanalysisvisualization
3
111


#### built-in function types

In [2]:
x = 7.77
y1 = str(x)
y2 = print(x)
print(type(x))
print(type(y1))
print(type(y2))

7.77
<class 'float'>
<class 'str'>
<class 'NoneType'>


#### function w/o parameter

In [3]:
def shout():
    """Print a string with three exclamation marks"""
    shout_word = 'congratulations' + '!!!'
    print(shout_word)

shout()

congratulations!!!


#### function w/ single parameter

In [4]:
def shout(word): # word = parameter
    """Print a string with three exclamation marks"""
    shout_word = word + '!!!'
    print(shout_word)

shout('congratulations')

congratulations!!!


#### function returns single value

In [5]:
def shout(word):
    """Return a string with three exclamation marks"""
    shout_word = word + '!!!'
    return shout_word # returns variable

yell = shout('congratulations')

print(yell)
print(type(yell))

congratulations!!!
<class 'str'>


#### function w/ multiple parameters

In [6]:
def shout(word1, word2): # 2 parameters
    """Concatenate strings with three exclamation marks"""
    shout1 = word1 + '!!!'
    shout2 = word2 + '!!!'
    new_shout = shout1 + shout2
    return new_shout

yell = shout('congratulations', 'you')

print(yell)

congratulations!!!you!!!


#### unpack tuple

In [7]:
nums = (3, 4, 6)
num1, num2, num3 = nums

print(num1)
print(num2)
print(num3)

even_nums = (2,4,6)
print(even_nums)
print(even_nums[0]) # tuples are indexed like lists (zero-based)

3
4
6
(2, 4, 6)
2


#### function returns multiple values

In [8]:
def shout_all(word1, word2):
    """Return a tuple of strings"""
    shout1 = word1 + '!!!'
    shout2 = word2 + '!!!'
    shout_words = (shout1, shout2)
    return shout_words

yell1, yell2 = shout_all('congratulations', 'you') # unpack returned tuple

print(yell1)
print(yell2)

congratulations!!!
you!!!


#### extract data via logic to populate twitter language count dictionary

In [9]:
import pandas as pd

tweets = '/users/grp/Documents/BIGDATA/DATACAMP/3 - pythondatasciencetoolboxpart1/tweets.csv'
tweets_df = pd.read_csv(tweets)
tweets_df[0:3]

Unnamed: 0,contributors,coordinates,created_at,entities,extended_entities,favorite_count,favorited,filter_level,geo,id,...,quoted_status_id,quoted_status_id_str,retweet_count,retweeted,retweeted_status,source,text,timestamp_ms,truncated,user
0,,,Tue Mar 29 23:40:17 +0000 2016,"{'hashtags': [], 'user_mentions': [{'screen_na...","{'media': [{'sizes': {'large': {'w': 1024, 'h'...",0,False,low,,714960401759387648,...,,,0,False,"{'retweeted': False, 'text': "".@krollbondratin...","<a href=""http://twitter.com"" rel=""nofollow"">Tw...",RT @bpolitics: .@krollbondrating's Christopher...,1459294817758,False,"{'utc_offset': 3600, 'profile_image_url_https'..."
1,,,Tue Mar 29 23:40:17 +0000 2016,"{'hashtags': [{'text': 'cruzsexscandal', 'indi...","{'media': [{'sizes': {'large': {'w': 500, 'h':...",0,False,low,,714960401977319424,...,,,0,False,"{'retweeted': False, 'text': '@dmartosko Cruz ...","<a href=""http://twitter.com"" rel=""nofollow"">Tw...",RT @HeidiAlpine: @dmartosko Cruz video found.....,1459294817810,False,"{'utc_offset': None, 'profile_image_url_https'..."
2,,,Tue Mar 29 23:40:17 +0000 2016,"{'hashtags': [], 'user_mentions': [], 'symbols...",,0,False,low,,714960402426236928,...,,,0,False,,"<a href=""http://www.facebook.com/twitter"" rel=...",Njihuni me Zonjën Trump !!! | Ekskluzive https...,1459294817917,False,"{'utc_offset': 7200, 'profile_image_url_https'..."


In [10]:
langs_count = {} # empty dict to store data

col = tweets_df['lang'] # df column
print(col[:3])
print("="*10)

for entry in col: # iterate over df lang column

    if entry in langs_count.keys(): # language already counted
        langs_count[entry] += 1
    else:
        langs_count[entry] = 1 # first occurrence: start its count at 1

print(langs_count)

0    en
1    en
2    et
Name: lang, dtype: object
==========
{'en': 97, 'et': 1, 'und': 2}


#### convert twitter language count logic into function

In [11]:
def count_entries(df, col_name):
    """Return a dictionary with counts of 
    occurrences as value for each key."""
    langs_count = {}
    col = df[col_name]

    for entry in col:
        if entry in langs_count.keys():
            langs_count[entry] += 1
        else:
            langs_count[entry] = 1
    return langs_count # return dictionary

result = count_entries(tweets_df, 'lang') # tweets_df = df; 'lang' = df column

print(result)

{'en': 97, 'et': 1, 'und': 2}
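The same tally can be written more compactly. The sketch below swaps in `dict.get` and `collections.Counter` from the standard library as alternatives; it is not the course's method. The list `langs` is a made-up stand-in for `tweets_df['lang']`.

```python
from collections import Counter

# Hypothetical stand-in for tweets_df['lang']; any iterable of hashable values works.
langs = ['en', 'en', 'et', 'und', 'en']

# Manual counting with dict.get, which avoids the if/else branch.
manual_count = {}
for entry in langs:
    manual_count[entry] = manual_count.get(entry, 0) + 1

# Counter performs the same tally in one call.
counter_count = Counter(langs)

print(manual_count)
print(dict(counter_count))
```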


## _2. Default Arguments, Variable-Length Arguments and Scope_:
-  scope (***part of the program where an object or name may be accessible***):
    1.  (L) local => names defined inside a function; not accessible outside the function definition
    2.  (E) enclosing => names defined in an enclosing (outer) function, visible to nested functions
    3.  (G) global => names defined in the main body of a script
    4.  (B) built-in => names predefined in the builtins module (ex: print(), sum(), max())
-  nested functions
-  default (***used when argument not specified***) and flexible arguments (***pass any # of arguments to function***):
    -  *args => extra positional arguments, collected into a tuple
    -  **kwargs => extra keyword arguments, collected into a dictionary
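A rough illustration of the four scopes, where the same name `label` is bound at three levels. All names here are hypothetical and chosen only for the demonstration.

```python
label = 'global'  # (G) defined at module level

def outer():
    label = 'enclosing'  # (E) local to outer, enclosing for inner

    def inner():
        label = 'local'  # (L) local to inner
        return label

    return inner(), label

inner_label, outer_label = outer()
print(inner_label, outer_label, label)
print(len('abc'))  # (B) len is found in the built-in scope
```

Each assignment creates a separate binding, so the three values stay independent of each other.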

#### scope

In [12]:
num = 5

def func1():
    num = 3 # local variable
    print(num)

def func2():
    global num
    double_num = num * 2 # local variable, computed from the global num (5)
    num = 6 # rebinds the global num, not a local variable
    print(double_num)

func1()
func2()
print(num)

3
10
6
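For contrast, a sketch of what happens when the `global` statement is left out. The function name `double_without_global` is hypothetical.

```python
num = 5

def double_without_global():
    """Assigning to num makes it local, so reading it first fails."""
    double_num = num * 2  # raises UnboundLocalError: num is local here
    num = 6
    return double_num

try:
    double_without_global()
    outcome = 'no error'
except UnboundLocalError:
    outcome = 'UnboundLocalError'

print(outcome)
print(num)  # the global is untouched
```

Because the function assigns to `num` somewhere in its body, Python treats `num` as local throughout that function unless it is declared `global`.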


#### global

In [13]:
team = "destiny 1" # global variable

def change_team():
    """Change the value of the global variable team."""
    global team
    team = "destiny 2" # rebinds the global variable

print(team)
change_team()
print(team)

destiny 1
destiny 2


#### built-in

In [14]:
import builtins
print(dir(builtins))



#### nested function

In [15]:
def three_shouts(word1, word2, word3):
    """Returns a tuple of strings
    concatenated with '!!!'."""

    def inner(word):
        """Returns a string concatenated with '!!!'."""
        return word + '!!!'

    return (inner(word1), inner(word2), inner(word3))

print(three_shouts('a', 'b', 'c'))

('a!!!', 'b!!!', 'c!!!')


In [16]:
def echo(n):
    """Return the inner_echo function."""

    def inner_echo(word1):
        """Concatenate n copies of word1."""
        echo_word = word1 * n
        return echo_word

    return inner_echo

twice = echo(2)
thrice = echo(3)
print(twice('hello'), thrice('hello'))

hellohello hellohellohello
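The returned function remembers the `n` it was created with. A short sketch of where that value lives, using the standard `__closure__` attribute of function objects; `captured_n` is an illustrative name.

```python
def echo(n):
    """Return a function that repeats its argument n times."""
    def inner_echo(word1):
        return word1 * n
    return inner_echo

twice = echo(2)

# The value of n is stored in a closure cell attached to the returned function.
captured_n = twice.__closure__[0].cell_contents
print(captured_n)
print(twice('hi'))
```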


#### nonlocal

In [17]:
def outer():
    """Prints the value of n."""
    n = 1
    def inner():
        nonlocal n
        n = 3
        print(n)
    inner()
    print(n)
outer()

3
3


In [18]:
def echo_shout(word):
    """Change the value of a nonlocal variable"""
    echo_word = word*2
    print(echo_word)
    
    def shout():
        """Alter a variable in the enclosing scope"""    
        nonlocal echo_word
        echo_word = echo_word + '!!!'
    
    shout()
    print(echo_word)

echo_shout('hello')

hellohello
hellohello!!!


#### 1 default argument

In [19]:
def shout_echo(word1, echo=1): # echo = default argument set to 1
    """Concatenate echo copies of word1 and three
     exclamation marks at the end of the string."""
    echo_word = echo * word1
    shout_word = echo_word + '!!!'
    return shout_word

no_echo = shout_echo("Hey")
with_echo = shout_echo("Hey", 5)
print(no_echo)
print(with_echo)

Hey!!!
HeyHeyHeyHeyHey!!!


#### multiple default arguments

In [20]:
def shout_echo(word1, echo=1, intense=False):
    """Concatenate echo copies of word1 and three
    exclamation marks at the end of the string."""
    echo_word = word1 * echo
    if intense is True:
        echo_word_new = echo_word.upper() + '!!!'
    else:
        echo_word_new = echo_word + '!!!'
    return echo_word_new

with_big_echo = shout_echo("Hey", 5, True)
big_no_echo = shout_echo("Hey") # pulls in default arguments set
print(with_big_echo)
print(big_no_echo)

HEYHEYHEYHEYHEY!!!
Hey!!!
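Default arguments can also be overridden by name, which lets a call skip `echo` while still setting `intense`. A minimal sketch with a condensed copy of `shout_echo`; the variable names `loud_once` and `loud_twice` are illustrative.

```python
def shout_echo(word1, echo=1, intense=False):
    """Condensed copy of the function above."""
    echo_word = word1 * echo
    if intense:
        return echo_word.upper() + '!!!'
    return echo_word + '!!!'

# Pass intense by keyword so echo keeps its default of 1.
loud_once = shout_echo("Hey", intense=True)
# Keyword arguments may appear in any order.
loud_twice = shout_echo("Hey", intense=True, echo=2)

print(loud_once)
print(loud_twice)
```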


#### *args

In [21]:
def gibberish(*args):
    """Concatenate strings in *args together."""
    hodgepodge = ''
    for word in args:
        hodgepodge += word
    return hodgepodge

one_word = gibberish("luke")
many_words = gibberish("luke", "leia", "han", "obi", "darth")
print(one_word)
print(many_words)

luke
lukeleiahanobidarth


#### **kwargs

In [22]:
def report_status(**kwargs):
    """Print out the status of a movie character."""
    print("\nBEGIN: REPORT\n")
    for k, v in kwargs.items():
        print(k + ": " + v)
    print("\nEND REPORT")

report_status(name="luke", affiliation="jedi", status="missing")
report_status(name="anakin", affiliation="sith lord", status="deceased")


BEGIN: REPORT

name: luke
affiliation: jedi
status: missing

END REPORT

BEGIN: REPORT

name: anakin
affiliation: sith lord
status: deceased

END REPORT
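Both flexible forms can appear in one signature, after any regular parameters. The function `describe` below is a hypothetical example, not course code.

```python
def describe(title, *args, **kwargs):
    """Build a one-line summary from positional and keyword extras."""
    parts = [title]
    parts.extend(args)  # args is a tuple of the extra positional arguments
    for key, value in kwargs.items():  # kwargs is a dict of the keyword arguments
        parts.append(key + "=" + value)
    return " ".join(parts)

summary = describe("report", "luke", "leia", status="missing")
print(summary)
```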


#### convert twitter language count logic into function w/ default argument

In [23]:
def count_entries(df, col_name = 'lang'):
    """Return a dictionary with counts of
    occurrences as value for each key."""
    cols_count = {}
    col = df[col_name]
    for entry in col:
        if entry in cols_count.keys():
            cols_count[entry] += 1
        else:
            cols_count[entry] = 1
    return cols_count

result1 = count_entries(tweets_df)
result2 = count_entries(tweets_df, 'source')
print(result1)
print("="*10)
for k, v in result2.items():
    print(str(k) + ": " + str(v))

{'en': 97, 'et': 1, 'und': 2}
==========
<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>: 24
<a href="http://www.facebook.com/twitter" rel="nofollow">Facebook</a>: 1
<a href="http://twitter.com/download/android" rel="nofollow">Twitter for Android</a>: 26
<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>: 33
<a href="http://www.twitter.com" rel="nofollow">Twitter for BlackBerry</a>: 2
<a href="http://www.google.com/" rel="nofollow">Google</a>: 2
<a href="http://twitter.com/#!/download/ipad" rel="nofollow">Twitter for iPad</a>: 6
<a href="http://linkis.com" rel="nofollow">Linkis.com</a>: 2
<a href="http://rutracker.org/forum/viewforum.php?f=93" rel="nofollow">newzlasz</a>: 2
<a href="http://ifttt.com" rel="nofollow">IFTTT</a>: 1
<a href="http://www.myplume.com/" rel="nofollow">Plume for Android</a>: 1


#### convert twitter language count logic into function w/ flexible argument(s)

In [24]:
def count_entries(df, *args):
    """Return a dictionary with counts of
    occurrences as value for each key."""
    cols_count = {}
    for col_name in args:
        col = df[col_name]
        for entry in col:
            if entry in cols_count.keys():
                cols_count[entry] += 1
            else:
                cols_count[entry] = 1
    return cols_count

result2 = count_entries(tweets_df, 'lang', 'source')
for k, v in result2.items():
    print(str(k) + ": " + str(v))

en: 97
et: 1
und: 2
<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>: 24
<a href="http://www.facebook.com/twitter" rel="nofollow">Facebook</a>: 1
<a href="http://twitter.com/download/android" rel="nofollow">Twitter for Android</a>: 26
<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>: 33
<a href="http://www.twitter.com" rel="nofollow">Twitter for BlackBerry</a>: 2
<a href="http://www.google.com/" rel="nofollow">Google</a>: 2
<a href="http://twitter.com/#!/download/ipad" rel="nofollow">Twitter for iPad</a>: 6
<a href="http://linkis.com" rel="nofollow">Linkis.com</a>: 2
<a href="http://rutracker.org/forum/viewforum.php?f=93" rel="nofollow">newzlasz</a>: 2
<a href="http://ifttt.com" rel="nofollow">IFTTT</a>: 1
<a href="http://www.myplume.com/" rel="nofollow">Plume for Android</a>: 1


## _3. Lambda Functions and Error-Handling_:
-  lambda functions
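-  map() function [***applies a function to each element of an iterable*** (ex: list)]
-  filter() function [***keeps the elements for which a function returns True***]
-  reduce() function [***combines elements pairwise into a single value***; imported from functools]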

-  error-handling:
    -  exceptions => errors detected during execution:
        -  try-except clause
        -  raise clause

#### lambda function

In [25]:
add_bangs = (lambda a: a + '!!!')
add_bangs('hello')

'hello!!!'

In [26]:
echo_word = (lambda word1, echo: word1 * echo)
result = echo_word('hey', 5)
print(result)

heyheyheyheyhey


#### map function

In [27]:
help(map)

Help on class map in module builtins:

class map(object)
 |  map(func, *iterables) --> map object
 |  
 |  Make an iterator that computes the function using arguments from
 |  each of the iterables.  Stops when the shortest iterable is exhausted.
 |  
 |  Methods defined here:
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __iter__(self, /)
 |      Implement iter(self).
 |  
 |  __new__(*args, **kwargs) from builtins.type
 |      Create and return a new object.  See help(type) for accurate signature.
 |  
 |  __next__(self, /)
 |      Implement next(self).
 |  
 |  __reduce__(...)
 |      Return state information for pickling.



In [28]:
spells = ['protego', 'accio', 'expecto patronum', 'legilimens']

shout_spells = map(lambda item: item + '!!!', spells)
shout_spells_list = list(shout_spells)
print(shout_spells_list)

['protego!!!', 'accio!!!', 'expecto patronum!!!', 'legilimens!!!']
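The call to `list()` is needed because `map()` returns a lazy iterator rather than a list. A small sketch showing that the iterator can only be consumed once; the variable names are illustrative.

```python
shout = map(lambda s: s + '!', ['a', 'b'])

first_pass = list(shout)   # consumes the iterator
second_pass = list(shout)  # nothing left to yield

print(first_pass)
print(second_pass)
```

Store the result of `list()` when the values are needed more than once.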


#### filter function

In [29]:
fellowship = ['frodo', 'samwise', 'merry', 'pippin', 'aragorn', 'boromir', 'legolas', 'gimli', 'gandalf']

result = filter(lambda member: len(member) > 6, fellowship)
result_list = list(result)
print(result_list)

['samwise', 'aragorn', 'boromir', 'legolas', 'gandalf']
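For comparison, a sketch of the same `map()` and `filter()` operations written as list comprehensions, a common alternative and not the course's approach. The shortened lists are made up for the example.

```python
spells = ['protego', 'accio']
fellowship = ['frodo', 'samwise', 'gandalf']

# Equivalent of map(lambda item: item + '!!!', spells)
shouted = [item + '!!!' for item in spells]

# Equivalent of filter(lambda member: len(member) > 6, fellowship)
long_names = [member for member in fellowship if len(member) > 6]

print(shouted)
print(long_names)
```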


#### reduce function

In [30]:
from functools import reduce

stark = ['robb', 'sansa', 'arya', 'brandon', 'rickon']

result = reduce(lambda item1, item2: item1+item2, stark)
print(result)

robbsansaaryabrandonrickon
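A sketch of what `reduce()` does step by step, next to an explicit loop. The shortened list and the names `joined`, `accumulated`, and `empty_result` are illustrative.

```python
from functools import reduce

stark = ['robb', 'sansa', 'arya']

# reduce folds from the left: (('robb' + 'sansa') + 'arya')
joined = reduce(lambda item1, item2: item1 + item2, stark)

# Equivalent explicit loop.
accumulated = stark[0]
for name in stark[1:]:
    accumulated = accumulated + name

# An optional third argument supplies a starting value, which also
# makes reduce safe to call on an empty list.
empty_result = reduce(lambda item1, item2: item1 + item2, [], '')

print(joined)
print(accumulated)
```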


#### try-except error handling

In [31]:
def shout_echo(word1, echo=1): # default argument is an int
    """Concatenate echo copies of word1 and three
    exclamation marks at the end of the string."""
    echo_word = ''
    shout_words = ''
    try:
        echo_word = echo * word1
        shout_words = echo_word + '!!!'
    except TypeError: # wrong argument types raise TypeError
        print("word1 must be a string and echo must be an integer.")
    return shout_words

shout_echo("particle", echo="accelerator")
shout_echo("particle")

word1 must be a string and echo must be an integer.


'particle!!!'

#### raise error handling

In [32]:
def shout_echo(word1, echo=1):
    """Concatenate echo copies of word1 and three
    exclamation marks at the end of the string."""
    if echo < 0:
        raise ValueError('echo must be greater than or equal to 0')
    echo_word = word1 * echo
    shout_word = echo_word + '!!!'
    return shout_word

print(shout_echo("particle", echo=3))
print("="*10)
shout_echo("particle", echo=-1) # force error

particleparticleparticle!!!
==========

ValueError: echo must be greater than or equal to 0
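A raised exception does not have to stop the program: the caller can catch it with try-except. A minimal sketch with a condensed copy of `shout_echo`; the message wording and the names `result` and `message` belong to this sketch.

```python
def shout_echo(word1, echo=1):
    """Condensed copy of the function above."""
    if echo < 0:
        raise ValueError('echo must be greater than or equal to 0')
    return word1 * echo + '!!!'

try:
    result = shout_echo("particle", echo=-1)
except ValueError as err:
    result = None        # fall back to a sentinel value
    message = str(err)   # the text passed to ValueError
    print("Caught:", message)
```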

#### lambda & error handling exercise

In [33]:
# filter for tweets w/ first 2 characters as 'RT' in df's text column
result = filter(lambda x: x[0:2]=='RT', tweets_df['text'])
res_list = list(result)
res_list[:5] # list first 5 elements (tweets)

["RT @bpolitics: .@krollbondrating's Christopher Whalen says Clinton is the weakest Dem candidate in 50 years https://t.co/pLk7rvoRSn https:/…",
 'RT @HeidiAlpine: @dmartosko Cruz video found.....racing from the scene.... #cruzsexscandal https://t.co/zuAPZfQDk3',
 'RT @AlanLohner: The anti-American D.C. elites despise Trump for his America-first foreign policy. Trump threatens their gravy train. https:…',
 'RT @BIackPplTweets: Young Donald trump meets his neighbor  https://t.co/RFlu17Z1eE',
 'RT @trumpresearch: @WaitingInBagdad @thehill Trump supporters have selective amnisia.']

In [34]:
def count_entries(df, col_name='lang'):
    """Return a dictionary with counts of
    occurrences as value for each key."""
    cols_count = {}
    try:
        col = df[col_name]
        for entry in col:
            if entry in cols_count.keys():
                cols_count[entry] += 1
            else:
                cols_count[entry] = 1
        return cols_count
    except KeyError: # raised by df[col_name] when the column is missing
        print('The DataFrame does not have a ' + col_name + ' column.')

result1 = count_entries(tweets_df, 'lang')
print(result1)
result2 = count_entries(tweets_df, 'lang1')

{'en': 97, 'et': 1, 'und': 2}
The DataFrame does not have a lang1 column.
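One consequence of the try-except version is that a failed call returns None without stopping, so `result2` above ends up as None. A rough sketch of that behavior, assuming a plain dict of lists (`fake_df`, hypothetical) as a stand-in for the DataFrame, since a missing dict key also raises KeyError.

```python
def count_entries(df, col_name='lang'):
    """Condensed variant that makes the None return explicit."""
    cols_count = {}
    try:
        col = df[col_name]
    except KeyError:
        print('The DataFrame does not have a ' + col_name + ' column.')
        return None  # explicit here; the version above returns None implicitly
    for entry in col:
        cols_count[entry] = cols_count.get(entry, 0) + 1
    return cols_count

# Hypothetical stand-in for tweets_df.
fake_df = {'lang': ['en', 'en', 'et']}

good = count_entries(fake_df, 'lang')
missing = count_entries(fake_df, 'lang1')

print(good)
print(missing)
```

Callers of such a function need to check for None before using the result.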


In [35]:
def count_entries(df, col_name='lang'):
    """Return a dictionary with counts of
    occurrences as value for each key."""
    if col_name not in df.columns:
        raise ValueError('The DataFrame does not have a ' + col_name + ' column.')
    cols_count = {}
    col = df[col_name]
    for entry in col:
        if entry in cols_count.keys():
            cols_count[entry] += 1
        else:
            cols_count[entry] = 1
    return cols_count

result1 = count_entries(tweets_df, 'lang')
print(result1)
print("="*10)
count_entries(tweets_df, 'lang1') # force error

{'en': 97, 'et': 1, 'und': 2}
==========

ValueError: The DataFrame does not have a lang1 column.
