# Python Data Science Toolbox (Part 1)

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from jupyterthemes import jtplot
import warnings

jtplot.style(theme='chesterish')
warnings.filterwarnings('ignore')
%config InlineBackend.figure_format='retina'

## Chapter 1. Writing your own functions

## 1. User-defined functions

### Built-in functions

* `str()`

In [2]:
x = str(5)
print(x)

5


In [3]:
print(type(x))

<class 'str'>


### Defining a function

In [4]:
def square():            # function header
    new_value = 4 ** 2   # function body (indented)
    print(new_value)
    
square()

16


### Function parameters

In [5]:
def square(value):
    new_value = value ** 2
    print(new_value)
    
square(4)

16


In [6]:
square(5)

25


### Return values from functions

* Return a value from a function using `return`

In [7]:
def square(value):
    new_value = value ** 2
    return new_value

num = square(4)
print(num)

16


### Docstrings

* Docstrings describe what your function does
* Serve as documentation for your function
* Placed in the immediate line after the function header
* Enclosed in triple quotes, either `"""` or `'''`

In [8]:
def square(value):
    '''Return the square of a value.'''
    new_value = value ** 2
    return new_value
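
As a small illustration of why docstrings count as documentation, the sketch below reads the docstring back from the function object. It reuses the `square()` definition from the cell above and relies only on the standard `__doc__` attribute.

```python
def square(value):
    '''Return the square of a value.'''
    new_value = value ** 2
    return new_value

# The docstring is stored on the function object itself
print(square.__doc__)
```

Calling `help(square)` in an interactive session displays the same text.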

### №1 Strings in Python

Execute the following code in the shell:

```python
object1 = 'data' + 'analysis' + 'visualization'
object2 = 1 * 3
object3 = '1' * 3
```

What are the values in `object1`, `object2`, and `object3`, respectively?

* `object1` contains `'data + analysis + visualization'`, `object2` contains `'1*3'`, `object3` contains `13`
* `object1` contains `'data+analysis+visualization'`, `object2` contains `3`, `object3` contains `'13'`
* *`object1` contains `'dataanalysisvisualization'`, `object2` contains `3`, `object3` contains `'111'`*
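
The highlighted answer can be checked by running the three assignments and printing each object with `repr()`, which keeps the quotes visible so strings and integers are easy to tell apart.

```python
object1 = 'data' + 'analysis' + 'visualization'  # + joins strings end to end
object2 = 1 * 3                                   # integer multiplication
object3 = '1' * 3                                 # * repeats a string

print(repr(object1), repr(object2), repr(object3))
```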

### №2 Recapping built-in functions

1. Assign `4.89` to a variable `x`: `x = 4.89`
1. Assign `str(x)` to a variable `y1`: `y1 = str(x)`
1. Assign `print(x)` to a variable `y2`: `y2 = print(x)`
1. Check the types of the variables `x`, `y1`, and `y2`

What are the types of `x`, `y1`, and `y2`?

* They are all `str` types
* `x` is a `float`, `y1` is a `float`, and `y2` is a `str`
* *`x` is a `float`, `y1` is a `str`, and `y2` is a `NoneType`*
* They are all `NoneType` types
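
A quick way to confirm the highlighted answer is to follow the four steps literally. The key point is that `print()` displays its argument but returns `None`, so `y2` ends up holding `None`.

```python
x = 4.89
y1 = str(x)
y2 = print(x)   # print() displays 4.89 but returns None

print(type(x), type(y1), type(y2))
```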

### №3 Write a simple function

* Complete the function header by adding the appropriate function name, `shout`
* In the function body, concatenate the string, `'congratulations'` with another string, `'!!!'`. Assign the result to `shout_word`
* Print the value of `shout_word`
* Call the `shout` function

In [9]:
def shout():
    '''Print a string with three exclamation marks'''
    shout_word = 'congratulations' + '!!!'
    print(shout_word)

shout()

congratulations!!!


### №4 Single-parameter functions

* Complete the function header by adding the parameter name, `word`
* Assign the result of concatenating `word` with `'!!!'` to `shout_word`
* Print the value of `shout_word`
* Call the `shout()` function, passing to it the string, `'congratulations'`

In [10]:
def shout(word):
    '''Print a string with three exclamation marks'''
    shout_word = word + '!!!'
    print(shout_word)

shout('congratulations')

congratulations!!!


### №5 Functions that return single values

* In the function body, concatenate the string in `word` with `'!!!'` and assign to `shout_word`
* Replace the `print()` statement with the appropriate `return` statement
* Call the `shout()` function, passing to it the string, `'congratulations'`, and assigning the call to the variable, `yell`
* To check if `yell` contains the value returned by `shout()`, print the value of `yell`

In [11]:
def shout(word):
    '''Return a string with three exclamation marks'''
    shout_word = word + '!!!'
    return shout_word

yell = shout('congratulations')
print(yell)

congratulations!!!


## 2. Multiple parameters and return values 

### Multiple function parameters

* Accept more than 1 parameter:

In [12]:
def raise_to_power(value1, value2):
    '''Raise value1 to the power of value2.'''
    new_value = value1 ** value2
    return new_value

* Call function: # of arguments = # of parameters

In [13]:
result = raise_to_power(2, 3)
print(result)

8


### A quick jump into tuples

* Make functions return multiple values: Tuples!
* Tuples:
    * Like a list - can contain multiple values
    * Immutable - can’t modify values!
    * Constructed using parentheses ()
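
To make the "immutable" point concrete, here is a minimal sketch: assigning to a tuple element raises a `TypeError`, so the usual workaround is to build a new tuple. The name `new_nums` is only illustrative.

```python
even_nums = (2, 4, 6)

try:
    even_nums[0] = 8          # item assignment is not supported by tuples
except TypeError as err:
    print('TypeError:', err)

# Build a new tuple instead of modifying the old one
new_nums = (8,) + even_nums[1:]
print(new_nums)
```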

In [14]:
even_nums = (2, 4, 6)
print(type(even_nums))

<class 'tuple'>


### Unpacking tuples

* Unpack a tuple into several variables:

In [15]:
even_nums = (2, 4, 6)
a, b, c = even_nums

print(a)
print(b)
print(c)

2
4
6


### Accessing tuple elements

* Access tuple elements like you do with lists:

In [16]:
even_nums = (2, 4, 6)
print(even_nums[1])

4


In [17]:
second_num = even_nums[1]
print(second_num) 

4


* Uses zero-indexing

### Returning multiple values

In [18]:
def raise_both(value1, value2):
    '''Raise value1 to the power of value2 and vice versa.'''
    new_value1 = value1 ** value2
    new_value2 = value2 ** value1
    new_tuple = (new_value1, new_value2)
    return new_tuple

result = raise_both(2, 3)
print(result)

(8, 9)


### №6 Functions with multiple parameters

* Modify the function header such that it accepts two parameters, `word1` and `word2`, in that order
* Concatenate each of `word1` and `word2` with `'!!!'` and assign to `shout1` and `shout2`, respectively
* Concatenate `shout1` and `shout2` together, in that order, and assign to `new_shout`
* Pass the strings `'congratulations'` and `'you'`, in that order, to a call to `shout()`. Assign the return value to `yell`

In [19]:
def shout(word1, word2):
    '''Concatenate strings with three exclamation marks'''
    shout1 = word1 + '!!!'
    shout2 = word2 + '!!!'
    new_shout = shout1 + shout2
    return new_shout

yell = shout('congratulations', 'you')
print(yell)

congratulations!!!you!!!


### №7 A brief introduction to tuples

* Unpack `nums` to the variables `num1`, `num2`, and `num3`
* Construct a new tuple, `even_nums` composed of the same elements in `nums`, but with the 1st element replaced with the value, 2

In [20]:
nums = (3, 4, 6)
num1, num2, num3 = nums            # unpack the tuple directly
even_nums = (2, num2, num3)

### №8 Functions that return multiple values

* Modify the function header such that the function name is now `shout_all`, and it accepts two parameters, `word1` and `word2`, in that order
* Concatenate the string `'!!!'` to each of `word1` and `word2` and assign to `shout1` and `shout2`, respectively
* Construct a tuple `shout_words`, composed of `shout1` and `shout2`
* Call `shout_all()` with the strings `'congratulations'` and `'you'` and assign the result to `yell1` and `yell2` (remember, `shout_all()` returns 2 variables!)

In [21]:
def shout_all(word1, word2):
    shout1 = word1 + '!!!'
    shout2 = word2 + '!!!'
    shout_words = shout1, shout2
    return shout_words

yell1, yell2 = shout_all('congratulations', 'you')

print(yell1)
print(yell2)

congratulations!!!
you!!!


### №9 Bringing it all together (1)

* Import the pandas package with the alias `pd`
* Import the file `'tweets.csv'` using the pandas function `read_csv()`. Assign the resulting DataFrame to `df`
* Complete the `for` loop by iterating over `col`, the `'lang'` column in the DataFrame `df`
* Complete the bodies of the `if-else` statements in the `for` loop: if the key is in the dictionary `langs_count`, add `1` to the value corresponding to this key in the dictionary, else add the key to `langs_count` and set the corresponding value to `1`. Use the loop variable `entry` in your code

In [22]:
import pandas as pd

df = pd.read_csv('Python_Data_Science_Toolbox_Part1/tweets.csv')
langs_count = {}
col = df['lang']

for entry in col:
    if entry in langs_count.keys():
        langs_count[entry] += 1
    else:
        langs_count[entry] = 1

print(langs_count)

{'en': 97, 'et': 1, 'und': 2}


### №10 Bringing it all together (2)

* Define the function `count_entries()`, which has two parameters. The first parameter is `df` for the DataFrame and the second is `col_name` for the column name
* Complete the bodies of the `if-else` statements in the `for` loop: if the key is in the dictionary `langs_count`, add `1` to its current value, else add the key to `langs_count` and set its value to `1`. Use the loop variable `entry` in your code
* Return the `langs_count` dictionary from inside the `count_entries()` function
* Call the `count_entries()` function by passing to it `tweets_df` and the name of the column, `'lang'`. Assign the result of the call to the variable `result`

In [23]:
tweets_df = pd.read_csv('Python_Data_Science_Toolbox_Part1/tweets.csv')

def count_entries(df, col_name):
    '''Return a dictionary with counts of occurrences as value for each key.'''
    langs_count = {}
    col = df[col_name]
    
    for entry in col:
        if entry in langs_count.keys():
            langs_count[entry] += 1
        else:
            langs_count[entry] = 1

    return langs_count

result = count_entries(tweets_df, 'lang')
print(result)

{'en': 97, 'et': 1, 'und': 2}
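
For comparison only, the same counting logic can be written with `collections.Counter` from the standard library. This is an alternative technique, not the exercise's intended solution. The list `langs` and the function name `count_entries_list` are hypothetical stand-ins so the sketch runs without `tweets.csv`.

```python
from collections import Counter

# Hypothetical stand-in for the values of a 'lang' column
langs = ['en', 'en', 'und', 'en', 'et']

def count_entries_list(entries):
    '''Return a dictionary with counts of occurrences as value for each key.'''
    # Counter does the "if key exists, add 1, else set to 1" bookkeeping
    return dict(Counter(entries))

print(count_entries_list(langs))
```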


## Chapter 2. Default arguments, variable-length arguments and scope

## 3. Scope and user-defined functions

### Crash course on scope in functions

* Not all objects are accessible everywhere in a script
* **Scope** - part of the program where an object or name may be accessible
    * *Global scope* - defined in the main body of a script
    * *Local scope* - defined inside a function
    * *Built-in scope* - names in the pre-defined built-ins module

### Global vs. local scope (1)

```python
In [1]: def square(value):
            """Returns the square of a number."""
            new_val = value ** 2
            return new_val
    
In [2]: square(3)
Out[2]: 9
    
In [3]: new_val
```
```
        -------------------------------------------------------------------
        NameError                       Traceback (most recent call last)
        <ipython-input-3-3cc6c6de5c5c> in <module>()
        ----> 1 new_val
        NameError: name 'new_val' is not defined
```

### Global vs. local scope (2)

In [24]:
new_val = 10
def square(value):
    '''Returns the square of a number.'''
    new_val = value ** 2
    return new_val

square(3)

9

In [25]:
new_val

10

### Global vs. local scope (3)

In [26]:
new_val = 10
def square(value):
    '''Returns the square of a number.'''
    new_value2 = new_val ** 2
    return new_value2

square(3)

100

In [27]:
new_val = 20
square(3)

400

### Global vs. local scope (4)

In [28]:
new_val = 10
def square(value):
    '''Returns the square of a number.'''
    global new_val
    new_val = new_val ** 2
    return new_val

square(3)

100

In [29]:
new_val 

100

### №11 Pop quiz on understanding scope

The variable num has been predefined as 5, alongside the following function definitions:
```python
def func1():
    num = 3
    print(num)

def func2():
    global num
    double_num = num * 2
    num = 6
    print(double_num)
```

Try calling `func1()` and `func2()` in the shell, then answer the following questions:
1. What are the values printed out when you call `func1()` and `func2()`?
1. What is the value of `num` in the global scope after calling `func1()` and `func2()`?
  
  
* `func1()` prints out `3`, `func2()` prints out `6`, and the value of `num` in the global scope is `3`
* `func1()` prints out `3`, `func2()` prints out `3`, and the value of `num` in the global scope is `3`
* `func1()` prints out `3`, `func2()` prints out `10`, and the value of `num` in the global scope is `10`
* *`func1()` prints out `3`, `func2()` prints out `10`, and the value of `num` in the global scope is `6`*

In [30]:
num = 5

def func1():
    num = 3
    print(num)

def func2():
    global num
    double_num = num * 2
    num = 6
    print(double_num)

func1()
func2()
print(num)

3
10
6


### №12 The keyword global

* Use the keyword `global` to alter the object `team` in the global scope
* Change the value of `team` in the global scope to the string `'justice league'`. Assign the result to `team`
* The defined function `change_team()` changes the value of the name `team`

In [31]:
team = 'teen titans'

def change_team():
    '''Change the value of the global variable team.'''
    global team
    team = 'justice league'
    
print(team)

change_team()
print(team)

teen titans
justice league


### №13 Python's built-in scope

After executing `import builtins` in the IPython Shell, execute `dir(builtins)` to print a list of all the names in the module `builtins`. Have a look and you'll see a bunch of names that you'll recognize! Which of the following names is NOT in the module builtins?

* `'sum'`
* `'range'`
* *`'array'`*
* `'tuple'`
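
Rather than scanning the full `dir(builtins)` listing by eye, the four candidate names can be checked directly with `hasattr()`. The dictionary name `in_builtins` is just for illustration.

```python
import builtins

names = ['sum', 'range', 'array', 'tuple']

# Map each candidate name to whether the builtins module defines it
in_builtins = {name: hasattr(builtins, name) for name in names}
print(in_builtins)
```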

In [32]:
import builtins

dir(builtins)

['ArithmeticError',
 'AssertionError',
 'AttributeError',
 'BaseException',
 'BlockingIOError',
 'BrokenPipeError',
 'BufferError',
 'ChildProcessError',
 'ConnectionAbortedError',
 'ConnectionError',
 'ConnectionRefusedError',
 'ConnectionResetError',
 'EOFError',
 'Ellipsis',
 'EnvironmentError',
 'Exception',
 'False',
 'FileExistsError',
 'FileNotFoundError',
 'FloatingPointError',
 'GeneratorExit',
 'IOError',
 'ImportError',
 'IndentationError',
 'IndexError',
 'InterruptedError',
 'IsADirectoryError',
 'KeyError',
 'KeyboardInterrupt',
 'LookupError',
 'MemoryError',
 'ModuleNotFoundError',
 'NameError',
 'None',
 'NotADirectoryError',
 'NotImplemented',
 'NotImplementedError',
 'OSError',
 'OverflowError',
 'PermissionError',
 'ProcessLookupError',
 'RecursionError',
 'ReferenceError',
 'RuntimeError',
 'StopAsyncIteration',
 'StopIteration',
 'SyntaxError',
 'SystemError',
 'SystemExit',
 'TabError',
 'TimeoutError',
 'True',
 'TypeError',
 'UnboundLocalError',
 'UnicodeDecode

## 4. Nested functions

### Nested functions (1)

```python
def outer( … ):
    ''' … '''
    x = …

    def inner( … ):
        ''' … '''
        y = x ** 2
    return …
```

### Nested functions (2)

In [33]:
def raise_both(value1, value2):
    '''Raise value1 to the power of value2 and vice versa.'''
    new_value1 = value1 ** value2
    new_value2 = value2 ** value1
    new_tuple = (new_value1, new_value2)
    return new_tuple

### Nested functions (3)

In [34]:
def mod2plus5(x1, x2, x3):
    '''Returns the remainder plus 5 of three values.'''
    def inner(x):
        '''Returns the remainder plus 5 of a value.'''
        return x % 2 + 5
    return (inner(x1), inner(x2), inner(x3))

print(mod2plus5(1, 2, 3))

(6, 5, 6)


### Returning functions

In [35]:
def raise_val(n):
    '''Return the inner function.'''
    def inner(x):
        '''Raise x to the power of n.'''
        raised = x ** n
        return raised
    return inner

In [36]:
square = raise_val(2)
cube = raise_val(3)
print(square(2), cube(4))

4 64


### Using `nonlocal`

In [37]:
def outer():
    '''Prints the value of n.'''
    n = 1
    def inner():
        nonlocal n
        n = 2
        print(n)
    inner()
    print(n)
    
outer()

2
2


### Scopes searched

* Local scope
* Enclosing functions
* Global
* Built-in
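
The search order listed above can be sketched with one name, `x`, defined at several levels. All function names here are illustrative; each function returns whichever `x` (or `len`) Python finds first.

```python
x = 'global'

def outer():
    x = 'enclosing'

    def inner():
        x = 'local'
        return x               # found in the local scope first

    def inner_no_local():
        return x               # no local x, so the enclosing one is used

    return inner(), inner_no_local()

def uses_global():
    return x                   # no local or enclosing x, so the global is used

def uses_builtin():
    return len('abc')          # len is found only in the built-in scope

print(outer())
print(uses_global())
print(uses_builtin())
```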

### №14 Nested Functions I

* Complete the function header of the nested function with the function name `inner()` and a single parameter `word`
* Complete the return value: each element of the tuple should be a call to `inner()`, passing in the parameters from `three_shouts()` as arguments to each call

In [38]:
def three_shouts(word1, word2, word3):
    '''Returns a tuple of strings concatenated with '!!!'.'''

    def inner(word):
        '''Returns a string concatenated with '!!!'.'''
        return word + '!!!'

    return (inner(word1), inner(word2), inner(word3))

print(three_shouts('a', 'b', 'c'))

('a!!!', 'b!!!', 'c!!!')


### №15 Nested Functions II

* Complete the function header of the `inner` function with the function name `inner_echo()` and a single parameter `word1`
* Complete the function `echo()` so that it returns `inner_echo`
* We have called `echo()`, passing `2` as an argument, and assigned the resulting function to `twice`. Your job is to call `echo()`, passing `3` as an argument. Assign the resulting function to `thrice`
* Call `twice()` and `thrice()` and print the results

In [39]:
def echo(n):
    
    def inner_echo(word1):
        echo_word = word1 * n
        return echo_word
        
    return inner_echo

twice = echo(2)
thrice = echo(3)

print(twice('hello'), thrice('hello'))

hellohello hellohellohello


### №16 The keyword nonlocal and nested functions

* Assign to `echo_word` the string `word`, concatenated with itself
* Use the keyword `nonlocal` to alter the value of `echo_word` in the enclosing scope
* Alter `echo_word` to `echo_word` concatenated with `'!!!'`
* Call the function `echo_shout()`, passing it a single argument `'hello'`

In [40]:
def echo_shout(word):
    '''Change the value of a nonlocal variable'''
    echo_word = word + word
    print(echo_word)
    
    def shout():
        '''Alter a variable in the enclosing scope'''
        nonlocal echo_word
        echo_word = echo_word + '!!!'
    
    shout()
    print(echo_word)

echo_shout('hello')

hellohello
hellohello!!!


## 5. Default and flexible arguments

In [41]:
def power(number, pow=1):
    '''Raise number to the power of pow.'''
    new_value = number ** pow
    return new_value

In [42]:
power(9, 2)

81

In [43]:
power(9, 1)

9

In [44]:
power(9)

9

### Flexible arguments: `*args` (1)

In [45]:
def add_all(*args):
    '''Sum all values in *args together.'''
    
    # Initialize sum
    sum_all = 0
    
    # Accumulate the sum
    for num in args:
        sum_all += num
    
    return sum_all

### Flexible arguments: `*args` (2)

In [46]:
add_all(1)

1

In [47]:
add_all(1, 2)

3

In [48]:
add_all(5, 10, 15, 20)

50

### Flexible arguments: `**kwargs`

In [49]:
def print_all(**kwargs):
    '''Print out key-value pairs in **kwargs.'''
    
    # Print out the key-value pairs
    for key, value in kwargs.items():
        print(key + ': ' + value)

print_all(name='Hugo Bowne-Anderson', employer='DataCamp')

name: Hugo Bowne-Anderson
employer: DataCamp
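
One caveat with `key + ': ' + value` is that it raises a `TypeError` when a value is not a string. A minimal sketch of one way around that, using an f-string, follows. The function name `format_all` and the `courses` keyword are hypothetical, and the function returns the lines instead of printing them so the result is easy to inspect.

```python
def format_all(**kwargs):
    '''Return "key: value" lines for the pairs in **kwargs.'''
    lines = []
    for key, value in kwargs.items():
        # An f-string converts each value to text, so non-strings work too
        lines.append(f'{key}: {value}')
    return lines

for line in format_all(name='Hugo Bowne-Anderson', courses=3):
    print(line)
```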


### №17 Functions with one default argument

* Complete the function header with the function name `shout_echo`. It accepts an argument `word1` and a default argument `echo` with default value `1`, in that order
* Use the `*` operator to concatenate echo copies of `word1`. Assign the result to `echo_word`
* Call `shout_echo()` with just the string, `'Hey'`. Assign the result to `no_echo`
* Call `shout_echo()` with the string `'Hey'` and the value 5 for the default argument, `echo`. Assign the result to `with_echo`

In [50]:
def shout_echo(word1, echo=1):
    '''Concatenate echo copies of word1 and three 
    exclamation marks at the end of the string.'''
    echo_word = word1 * echo
    shout_word = echo_word + '!!!'
    return shout_word

no_echo = shout_echo('Hey')
print(no_echo)

with_echo = shout_echo('Hey', echo=5)
print(with_echo)

Hey!!!
HeyHeyHeyHeyHey!!!


### №18 Functions with multiple default arguments

* Complete the function header with the function name `shout_echo`. It accepts an argument `word1`, a default argument `echo` with default value `1` and a default argument `intense` with default value `False`, in that order
* In the body of the if statement, make the string object `echo_word` upper case by applying the method `.upper()` on it
* Call `shout_echo()` with the string, `'Hey'`, the value `5` for `echo` and the value `True` for `intense`. Assign the result to `with_big_echo`
* Call `shout_echo()` with the string `'Hey'` and the value `True` for `intense`. Assign the result to `big_no_echo`

In [51]:
def shout_echo(word1, echo=1, intense=False):
    '''Concatenate echo copies of word1 and three
    exclamation marks at the end of the string.'''
    echo_word = word1 * echo

    if intense is True:
        echo_word_new = echo_word.upper() + '!!!'
    else:
        echo_word_new = echo_word + '!!!'

    return echo_word_new

with_big_echo = shout_echo('Hey', 5, True)
print(with_big_echo)

big_no_echo = shout_echo('Hey', intense=True)
print(big_no_echo)

HEYHEYHEYHEYHEY!!!
HEY!!!


### №19 Functions with variable-length arguments (*args)

* Complete the function header with the function name `gibberish`. It accepts a single flexible argument `*args`
* Initialize a variable `hodgepodge` to an empty string
* Return the variable `hodgepodge` at the end of the function body
* Call `gibberish()` with the single string, `'luke'`. Assign the result to `one_word`
* Call `gibberish()` with multiple arguments and print the value to the Shell

In [52]:
def gibberish(*args):
    '''Concatenate strings in *args together.'''
    hodgepodge = ''
    for word in args:
        hodgepodge += word
    return hodgepodge

one_word = gibberish('luke')
print(one_word)

many_words = gibberish('luke', 'leia', 'han', 'obi', 'darth')
print(many_words)

luke
lukeleiahanobidarth


### №20 Functions with variable-length keyword arguments (**kwargs)

* Complete the function header with the function name `report_status`. It accepts a single flexible argument `**kwargs`
* Iterate over the key-value pairs of kwargs to print out the keys and values, separated by a colon `':'`
* In the first call to `report_status()`, pass the following keyword-value pairs: `name='luke'`, `affiliation='jedi'` and `status='missing'`
* In the second call to `report_status()`, pass the following keyword-value pairs: `name='anakin'`, `affiliation='sith lord'` and `status='deceased'`

In [53]:
def report_status(**kwargs):
    '''Print out the status of a movie character.'''
    print('\nBEGIN: REPORT\n')

    for key, value in kwargs.items():
        print(key + ': ' + value)
    print('\nEND REPORT')

report_status(name='luke', affiliation='jedi', status='missing')
report_status(name='anakin', affiliation='sith lord', status='deceased')


BEGIN: REPORT

name: luke
affiliation: jedi
status: missing

END REPORT

BEGIN: REPORT

name: anakin
affiliation: sith lord
status: deceased

END REPORT


## 6. Bringing it all together

### Next exercises:

* Generalized functions:
    * Count occurrences for any column
    * Count occurrences for an arbitrary number of columns

### Add a default argument

In [54]:
def power(number, pow=1):
    '''Raise number to the power of pow.'''
    new_value = number ** pow
    return new_value

power(9, 2)

81

In [55]:
power(9)

9

### Flexible arguments: `*args` (1)

In [56]:
def add_all(*args):
    '''Sum all values in *args together.'''
    
    # Initialize sum
    sum_all = 0
    
    # Accumulate the sum
    for num in args:
        sum_all = sum_all + num
    
    return sum_all

### №21 Bringing it all together (1)

* Complete the function header by supplying the parameter for a DataFrame `df` and the parameter `col_name` with a default value of `'lang'` for the DataFrame column name
* Call `count_entries()` by passing the `tweets_df` DataFrame and the column name `'lang'`. Assign the result to `result1`. Note that since `'lang'` is the default value of the `col_name` parameter, you don't have to specify it here
* Call `count_entries()` by passing the `tweets_df` DataFrame and the column name `'source'`. Assign the result to `result2`

In [57]:
def count_entries(df, col_name = 'lang'):
    '''Return a dictionary with counts of
    occurrences as value for each key.'''
    cols_count = {}
    col = df[col_name]
    
    for entry in col:
        if entry in cols_count.keys():
            cols_count[entry] += 1
        else:
            cols_count[entry] = 1

    return cols_count

result1 = count_entries(tweets_df)
print(result1)

result2 = count_entries(tweets_df, 'source')
print(result2)

{'en': 97, 'et': 1, 'und': 2}
{'<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>': 24, '<a href="http://www.facebook.com/twitter" rel="nofollow">Facebook</a>': 1, '<a href="http://twitter.com/download/android" rel="nofollow">Twitter for Android</a>': 26, '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>': 33, '<a href="http://www.twitter.com" rel="nofollow">Twitter for BlackBerry</a>': 2, '<a href="http://www.google.com/" rel="nofollow">Google</a>': 2, '<a href="http://twitter.com/#!/download/ipad" rel="nofollow">Twitter for iPad</a>': 6, '<a href="http://linkis.com" rel="nofollow">Linkis.com</a>': 2, '<a href="http://rutracker.org/forum/viewforum.php?f=93" rel="nofollow">newzlasz</a>': 2, '<a href="http://ifttt.com" rel="nofollow">IFTTT</a>': 1, '<a href="http://www.myplume.com/" rel="nofollow">Plume\xa0for\xa0Android</a>': 1}


### №22 Bringing it all together (2)

* Complete the function header by supplying the parameter for the dataframe `df` and the flexible argument `*args`
* Complete the `for` loop within the function definition so that the loop occurs over the tuple `args`
* Call `count_entries()` by passing the `tweets_df` DataFrame and the column name `'lang'`. Assign the result to `result1`
* Call `count_entries()` by passing the `tweets_df` DataFrame and the column names `'lang'` and `'source'`. Assign the result to `result2`

In [58]:
def count_entries(df, *args):
    '''Return a dictionary with counts of
    occurrences as value for each key.'''
    cols_count = {}
    
    for col_name in args:
        col = df[col_name]
        for entry in col:
            if entry in cols_count.keys():
                cols_count[entry] += 1
            else:
                cols_count[entry] = 1

    return cols_count

result1 = count_entries(tweets_df, 'lang')
print(result1)

result2 = count_entries(tweets_df, 'lang', 'source')
print(result2)

{'en': 97, 'et': 1, 'und': 2}
{'en': 97, 'et': 1, 'und': 2, '<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>': 24, '<a href="http://www.facebook.com/twitter" rel="nofollow">Facebook</a>': 1, '<a href="http://twitter.com/download/android" rel="nofollow">Twitter for Android</a>': 26, '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>': 33, '<a href="http://www.twitter.com" rel="nofollow">Twitter for BlackBerry</a>': 2, '<a href="http://www.google.com/" rel="nofollow">Google</a>': 2, '<a href="http://twitter.com/#!/download/ipad" rel="nofollow">Twitter for iPad</a>': 6, '<a href="http://linkis.com" rel="nofollow">Linkis.com</a>': 2, '<a href="http://rutracker.org/forum/viewforum.php?f=93" rel="nofollow">newzlasz</a>': 2, '<a href="http://ifttt.com" rel="nofollow">IFTTT</a>': 1, '<a href="http://www.myplume.com/" rel="nofollow">Plume\xa0for\xa0Android</a>': 1}


## Chapter 3. Lambda functions and error-handling

## 7. **lambda** functions

### **lambda** functions

In [59]:
raise_to_power = lambda x, y: x ** y
raise_to_power(2, 3)

8

### Anonymous functions

* The function `map()` takes two arguments: `map(func, seq)`
* `map()` applies the function to ALL elements in the sequence

In [60]:
nums = [48, 6, 9, 21, 1]
square_all = map(lambda num: num ** 2, nums)
print(square_all)

<map object at 0x1a1a3bfdd8>


In [61]:
print(list(square_all))

[2304, 36, 81, 441, 1]
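
A detail worth knowing about the map object printed above: in Python 3 it is an iterator, so it can be consumed only once. The sketch below converts the same map object to a list twice. The names `first_pass` and `second_pass` are illustrative.

```python
nums = [48, 6, 9, 21, 1]
square_all = map(lambda num: num ** 2, nums)

first_pass = list(square_all)    # consumes the iterator
second_pass = list(square_all)   # nothing left to yield

print(first_pass)
print(second_pass)
```

If the squared values are needed more than once, storing the list (as `first_pass` does here) avoids the empty second result.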


### №23 Pop quiz on lambda functions

How would you write a lambda function `add_bangs` that adds three exclamation points `'!!!'` to the end of a string `a`? How would you call `add_bangs` with the argument `'hello'`?
  
  
* The lambda function definition is: `add_bangs = (a + '!!!')`, and the function call is: `add_bangs('hello')`
* *The lambda function definition is: `add_bangs = (lambda a: a + '!!!')`, and the function call is: `add_bangs('hello')`*
* The lambda function definition is: `(lambda a: a + '!!!') = add_bangs`, and the function call is: `add_bangs('hello')`
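
The working form can be tried directly. Of the other two, `add_bangs = (a + '!!!')` evaluates an expression immediately instead of defining a function, and placing the lambda on the left of `=` is a `SyntaxError`.

```python
add_bangs = (lambda a: a + '!!!')

result = add_bangs('hello')
print(result)
```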

### №24 Writing a lambda function you already know

* Define the lambda function `echo_word` using the variables `word1` and `echo`. Replicate what the original function definition for `echo_word()` does above
* Call `echo_word()` with the string argument `'hey'` and the value `5`, in that order. Assign the call to `result`

In [62]:
echo_word = (lambda word1, echo: word1 * echo)

result = echo_word('hey', 5)
print(result)

heyheyheyheyhey


### №25 Map() and lambda functions

* In the `map()` call, pass a lambda function that concatenates the string `'!!!'` to a string `item`; also pass the list of strings, `spells`. Assign the resulting map object to `shout_spells`
* Convert `shout_spells` to a list and print out the list

In [63]:
spells = ['protego', 'accio', 'expecto patronum', 'legilimens']
shout_spells = map(lambda a: a + '!!!', spells)

shout_spells_list = list(shout_spells)
print(shout_spells_list)

['protego!!!', 'accio!!!', 'expecto patronum!!!', 'legilimens!!!']


### №26 Filter() and lambda functions

* In the `filter()` call, pass a lambda function and the list of strings, `fellowship`. The lambda function should check if the number of characters in a string `member` is greater than 6; use the `len()` function to do this. Assign the resulting filter object to `result`
* Convert `result` to a list and print out the list

In [64]:
fellowship = ['frodo', 'samwise', 'merry', 'pippin', 'aragorn',
              'boromir', 'legolas', 'gimli', 'gandalf']
result = filter(lambda member: len(member) > 6, fellowship)

result_list = list(result)
print(result_list)

['samwise', 'aragorn', 'boromir', 'legolas', 'gandalf']


### №27 Reduce() and lambda functions

* Import the `reduce` function from the `functools` module
* In the `reduce()` call, pass a lambda function that takes two string arguments `item1` and `item2` and concatenates them; also pass the list of strings, `stark`. Assign the result to `result`. The first argument to `reduce()` should be the lambda function and the second argument is the list `stark`

In [65]:
from functools import reduce

stark = ['robb', 'sansa', 'arya', 'brandon', 'rickon']

result = reduce(lambda item1, item2: item1 + item2, stark)
print(result)

robbsansaaryabrandonrickon
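
To see what `reduce()` is doing here, the sketch below spells out an equivalent loop: start from the first item, then combine the running result with each remaining item in turn. The variable names `accumulated` and `reduced` are illustrative.

```python
from functools import reduce

stark = ['robb', 'sansa', 'arya', 'brandon', 'rickon']

# Equivalent loop: start from the first item, then fold in the rest
accumulated = stark[0]
for item in stark[1:]:
    accumulated = accumulated + item

reduced = reduce(lambda item1, item2: item1 + item2, stark)
print(accumulated == reduced)
```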


## 8. Introduction to error handling

### Passing an incorrect argument

In [66]:
float(2)

2.0

In [67]:
float(2.3)

2.3

```python
In [66]: float('hello')
```
```
------------------------------------------------------------------
ValueError                       Traceback (most recent call last)
<ipython-input-3-d0ce8bccc8b2> in <module>()
----> 1 float('hello')
ValueError: could not convert string to float: 'hello'
```

### Passing valid arguments

In [68]:
def sqrt(x):
    '''Returns the square root of a number.'''
    return x ** (0.5)

sqrt(4)

2.0

In [69]:
sqrt(10) 

3.1622776601683795

### Passing invalid arguments

```python
In [69]: sqrt('hello')
Out[69]:
```
```
------------------------------------------------------------------
TypeError                        Traceback (most recent call last)
<ipython-input-4-cfb99c64761f> in <module>()
----> 1 sqrt('hello')
<ipython-input-1-939b1a60b413> in sqrt(x)
      1 def sqrt(x):
----> 2     return x**(0.5)
TypeError: unsupported operand type(s) for ** or pow(): 'str' and 'float'
```

### Errors and exceptions

* Exceptions - caught during execution
* Catch exceptions with **try-except** clause
    * Runs the code following `try`
    * If there’s an exception, run the code following `except`

### Errors and exceptions (1)

In [70]:
def sqrt(x):
    '''Returns the square root of a number.'''
    try:
        return x ** 0.5
    except:
        print('x must be an int or float')

sqrt(4)

2.0

In [71]:
sqrt(10.0)

3.1622776601683795

In [72]:
sqrt('hi') 

x must be an int or float


### Errors and exceptions (2)

In [73]:
def sqrt(x):
    '''Returns the square root of a number.'''
    try:
        return x ** 0.5
    except TypeError:
        print('x must be an int or float')
        
sqrt(-9)

(1.8369701987210297e-16+3j)

In [74]:
def sqrt(x):
    '''Returns the square root of a number.'''
    if x < 0:
        raise ValueError('x must be non-negative')
    try:
        return x ** 0.5
    except TypeError:
        print('x must be an int or float')

```python
In [76]: sqrt(-2)
Out[76]:
```
```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-76-84fde6a6eea1> in <module>()
----> 1 sqrt(-2)

<ipython-input-75-0c401d007842> in sqrt(x)
      2     """Returns the square root of a number."""
      3     if x < 0:
----> 4         raise ValueError('x must be non-negative')
      5     try:
      6         return x ** 0.5

ValueError: x must be non-negative
```

### №28 Pop quiz about errors

Take a look at the following function calls to `len()`:
```python
len('There is a beast in every man and it stirs when you put a sword in his hand.')

len(['robb', 'sansa', 'arya', 'eddard', 'jon'])

len(525600)

len(('jaime', 'cersei', 'tywin', 'tyrion', 'joffrey'))
```

Which of the function calls raises an error and what type of error is raised?

* The call `len('There is a beast in every man and it stirs when you put a sword in his hand.')` raises a `TypeError`
* The call `len(['robb', 'sansa', 'arya', 'eddard', 'jon'])` raises an `IndexError`
* *The call `len(525600)` raises a `TypeError`*
* The call `len(('jaime', 'cersei', 'tywin', 'tyrion', 'joffrey'))` raises a `NameError`
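
One way to check the answer is to wrap each call in a `try-except` block and record what happens. This is a sketch for verification only; the names `calls` and `results` are illustrative.

```python
calls = [
    'There is a beast in every man and it stirs when you put a sword in his hand.',
    ['robb', 'sansa', 'arya', 'eddard', 'jon'],
    525600,
    ('jaime', 'cersei', 'tywin', 'tyrion', 'joffrey'),
]

results = []
for obj in calls:
    try:
        # Strings, lists, and tuples all have a length
        results.append(len(obj))
    except TypeError as err:
        # An int has no length, so len() raises TypeError
        results.append(type(err).__name__)
        print(err)

print(results)
```

Only the integer argument fails, with the message `object of type 'int' has no len()`.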

### №29 Error handling with try-except

* Initialize the variables `echo_word` and `shout_words` to empty strings
* Add the keywords `try` and `except` in the appropriate locations for the exception handling block
* Use the `*` operator to concatenate `echo` copies of `word1`. Assign the result to `echo_word`
* Concatenate the string `'!!!'` to `echo_word`. Assign the result to `shout_words`

In [75]:
def shout_echo(word1, echo=1):
    '''Concatenate echo copies of word1 and three
    exclamation marks at the end of the string.'''
    echo_word = ''
    shout_words = ''

    try:
        echo_word = word1 * echo
        shout_words = echo_word + '!!!'
    except:
        print("word1 must be a string and echo must be an integer.")

    return shout_words

shout_echo('particle', echo='accelerator')

word1 must be a string and echo must be an integer.


''

### №30 Error handling by raising an error

* Complete the `if` statement by checking if the value of `echo` is less than `0`
* In the body of the `if` statement, add a raise statement that raises a `ValueError` with message `'echo must be greater than 0'` when the value supplied by the user to `echo` is less than 0

In [76]:
def shout_echo(word1, echo=1):
    '''Concatenate echo copies of word1 and three
    exclamation marks at the end of the string.'''
    if echo < 0:
        raise ValueError('echo must be greater than 0')
    echo_word = word1 * echo
    shout_word = echo_word + '!!!'
    return shout_word

shout_echo('particle', echo=5)

'particleparticleparticleparticleparticle!!!'
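
The call above uses a valid value, so the `raise` statement never runs. The sketch below repeats a condensed version of the function and calls it with a negative `echo` to show the error being raised and caught at the call site. Note that `echo=0` passes the `echo < 0` check even though the message says "greater than 0"; that wording comes from the exercise.

```python
def shout_echo(word1, echo=1):
    '''Concatenate echo copies of word1 and three exclamation marks.'''
    if echo < 0:
        raise ValueError('echo must be greater than 0')
    return word1 * echo + '!!!'

try:
    shout_echo('particle', echo=-1)
except ValueError as err:
    # The message passed to ValueError is available via str(err)
    message = str(err)
    print(message)

# echo=0 is not rejected: zero copies of the word, plus '!!!'
print(shout_echo('particle', echo=0))
```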

## 9. Bringing it all together

### Errors and exceptions (3)

In [77]:
def sqrt(x):
    try:
        return x ** 0.5
    except:
        print('x must be an int or float')

sqrt(4)

2.0

In [78]:
sqrt('hi')

x must be an int or float


### Errors and exceptions (4)

In [79]:
def sqrt(x):
    if x < 0:
        raise ValueError('x must be non-negative')
    try:
        return x ** 0.5
    except TypeError:
        print('x must be an int or float')

### №31 Bringing it all together (1)

* In the `filter()` call, pass a lambda function and the sequence of tweets as strings, `tweets_df['text']`. The lambda function should check whether the first 2 characters of a tweet `x` are `'RT'`. Assign the resulting filter object to `result`. To get the first 2 characters of a tweet `x`, use `x[0:2]`. To check equality, use `==`
* Convert `result` to a list and print out the list

In [80]:
result = filter(lambda x: x[0:2] == 'RT', tweets_df['text'])
res_list = list(result)

for tweet in res_list:
    print(tweet)

RT @bpolitics: .@krollbondrating's Christopher Whalen says Clinton is the weakest Dem candidate in 50 years https://t.co/pLk7rvoRSn https:/…
RT @HeidiAlpine: @dmartosko Cruz video found.....racing from the scene.... #cruzsexscandal https://t.co/zuAPZfQDk3
RT @AlanLohner: The anti-American D.C. elites despise Trump for his America-first foreign policy. Trump threatens their gravy train. https:…
RT @BIackPplTweets: Young Donald trump meets his neighbor  https://t.co/RFlu17Z1eE
RT @trumpresearch: @WaitingInBagdad @thehill Trump supporters have selective amnisia.
RT @HouseCracka: 29,000+ PEOPLE WATCHING TRUMP LIVE ON ONE STREAM!!!

https://t.co/7QCFz9ehNe
RT @urfavandtrump: RT for Brendon Urie
Fav for Donald Trump https://t.co/PZ5vS94lOg
RT @trapgrampa: This is how I see #Trump every time he speaks. https://t.co/fYSiHNS0nT
RT @trumpresearch: @WaitingInBagdad @thehill Trump supporters have selective amnisia.
RT @Pjw20161951: NO KIDDING: #SleazyDonald just attacked Scott Walker for NOT RAISI
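
The same pattern can be tried without the tweets DataFrame. The sketch below uses a small hypothetical list of strings in place of `tweets_df['text']` and also shows an equivalent list comprehension with `str.startswith()`.

```python
# Hypothetical sample data standing in for tweets_df['text']
tweets = ['RT @user: hello', 'Just a tweet', 'RT @other: hi', 'rt lower case']

# filter() keeps the items for which the lambda returns True
result = filter(lambda x: x[0:2] == 'RT', tweets)
res_list = list(result)
print(res_list)

# A filter object is an iterator, so it is empty after one pass
leftover = list(result)

# Equivalent list comprehension
alt = [t for t in tweets if t.startswith('RT')]
```

The comparison is case-sensitive, so `'rt lower case'` is not selected.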

### №32 Bringing it all together (2)

* Add a `try` block so that when the function is called with the correct arguments, it processes the DataFrame and returns a dictionary of results
* Add an `except` block so that when the function is called incorrectly, it displays the following error message: `'The DataFrame does not have a ' + col_name + ' column.'`

In [81]:
def count_entries(df, col_name='lang'):
    '''Return a dictionary with counts of
    occurrences as value for each key.'''
    cols_count = {}
    try:
        col = df[col_name]
        for entry in col:
            if entry in cols_count.keys():
                cols_count[entry] += 1
            else:
                cols_count[entry] = 1
        return cols_count
    except:
        print('The DataFrame does not have a ' + col_name + ' column.')

result1 = count_entries(tweets_df, 'lang')
print(result1)

{'en': 97, 'et': 1, 'und': 2}


### №33 Bringing it all together (3)

* If `col_name` is not a column in the DataFrame `df`, raise a `ValueError` with the message `'The DataFrame does not have a ' + col_name + ' column.'`
* Call your new function `count_entries()` to analyze the `'lang'` column of `tweets_df`. Store the result in `result1`
* Print `result1`

In [82]:
def count_entries(df, col_name='lang'):
    '''Return a dictionary with counts of
    occurrences as value for each key.'''
    if col_name not in df.columns:
        raise ValueError('The DataFrame does not have a ' + col_name + ' column.')

    cols_count = {}
    col = df[col_name]
    
    for entry in col:
        if entry in cols_count.keys():
            cols_count[entry] += 1
        else:
            cols_count[entry] = 1
        
    return cols_count

result1 = count_entries(tweets_df, 'lang')
print(result1)

{'en': 97, 'et': 1, 'und': 2}
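
The counting loop inside `count_entries()` can also be written more compactly. The sketch below uses a plain list of hypothetical language codes in place of `tweets_df['lang']` and compares a `dict.get()` loop with `collections.Counter` from the standard library. It is an alternative for illustration, not the course's solution.

```python
from collections import Counter

# Hypothetical sample standing in for tweets_df['lang']
langs = ['en', 'en', 'et', 'und', 'en', 'und']

# dict.get() returns 0 for a key that has not been seen yet
cols_count = {}
for entry in langs:
    cols_count[entry] = cols_count.get(entry, 0) + 1

# Counter performs the same tally in one call
counter = Counter(langs)

print(cols_count)
print(dict(counter))
```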


### №34 Bringing it all together: testing your error handling skills

Try calling `count_entries(tweets_df, 'lang')` to confirm that the function behaves as it should. Then call `count_entries(tweets_df, 'lang1')`: what is the last line of the output?

* `ValueError: The DataFrame does not have the requested column.`
* *`ValueError: The DataFrame does not have a lang1 column.`*
* `TypeError: The DataFrame does not have the requested column.`

In [83]:
count_entries(tweets_df, 'lang')
count_entries(tweets_df, 'lang1')

ValueError: The DataFrame does not have a lang1 column.