# **DataCamp.Course_045_Python Data Science Toolbox (Part 1)**

### **Course Description**

It's time to push forward and develop your Python chops even further. There are tons of fantastic functions in Python and its library ecosystem. However, as a data scientist, you'll constantly need to write your own functions to solve problems that are dictated by your data. You will learn the art of function writing in this first Python Data Science Toolbox course. You'll come out of this course being able to write your very own custom functions, complete with multiple parameters and multiple return values, along with default arguments and variable-length arguments. You'll gain insight into scoping in Python and be able to write lambda functions and handle errors in your function writing practice. And you'll wrap up each chapter by using your new skills to write functions that analyze Twitter DataFrames.

## **Writing your own functions (Module 01-045)**

#### **User-defined functions**

1. You’ll learn:

- Define functions without parameters
- Define functions with one parameter
- Define functions that return a value
- Later: multiple arguments, multiple return values

2. Built-in functions

str()
x = str(5)
print(x)
`'5'`
print(type(x))
`<class 'str'>`

3. Defining a function

def square(): # <- Function header
    new_value = 4 ** 2 # <- Function body
    print(new_value)
square()
`16`

4. Function parameters

def square(value):
    new_value = value ** 2
    print(new_value)
square(4)
`16`
square(5)
`25`

5. Return values from functions

Return a value from a function using return

def square(value):
    new_value = value ** 2
    return new_value

num = square(4)
print(num)
`16`

6. Docstrings

- Docstrings describe what your function does
- Serve as documentation for your function
- Placed in the line immediately after the function header
- In between triple double quotes """

def square(value):
    """Return the square of a value."""
    new_value = value ** 2
    return new_value
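
As a small sketch of how a docstring is used once it is written: Python stores it on the function object, where it can be read through the `__doc__` attribute or displayed with the built-in `help()`. The variable name `doc` is introduced here only for illustration.

```python
def square(value):
    """Return the square of a value."""
    new_value = value ** 2
    return new_value

# The docstring is stored on the function object itself
doc = square.__doc__
print(doc)

# help() displays the function signature followed by the docstring
help(square)
```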

**Strings in Python**

In the video, you learned of another standard Python datatype, strings. Recall that these represent textual data. To assign the string '`DataCamp`' to a variable `company`, you execute:

`company = 'DataCamp'`

You've also learned to use the operations + and * with strings. Unlike with numeric types such as ints and floats, the + operator concatenates strings together, while the * operator concatenates multiple copies of a string together. In this exercise, you will use the + and * operations on strings to answer the question below. Execute the following code in the shell:

`object1 = "data" + "analysis" + "visualization"`
`object2 = 1 * 3`
`object3 = "1" * 3`

What are the values in object1, object2, and object3, respectively?

Answer:
`object1 contains "dataanalysisvisualization", object2 contains 3, object3 contains "111".`

In [7]:
object1="data"+"analysis"+"visualization"
object2=1*3
object3="1"*3
print(object1,object2,object3)

dataanalysisvisualization 3 111


**Recapping built-in functions**

In the video, Hugo briefly examined the return behavior of the built-in functions `print()` and `str()`. Here, you will use both functions and examine their return values. A variable `x` has been preloaded for this exercise. Run the code below in the console. Pay close attention to the results to answer the question that follows.

    Assign str(x) to a variable y1: y1 = str(x)
    Assign print(x) to a variable y2: y2 = print(x)
    Check the types of the variables x, y1, and y2.

What are the types of x, y1, and y2?

`x is a float, y1 is a str, and y2 is a NoneType.`
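
The steps above can be reproduced outside the exercise with a sketch like the following. The value assigned to `x` is an assumption made for illustration, since the exercise preloads its own `x`.

```python
x = 4.89  # hypothetical value; the exercise preloads its own x

y1 = str(x)    # str() returns a new string
y2 = print(x)  # print() displays x but returns None

print(type(x))
print(type(y1))
print(type(y2))
```

Because `print()` returns `None`, assigning its result to a variable is rarely useful. This is the reason later exercises switch from printing to `return`.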

**Write a simple function**

In the last video, Hugo described the basics of how to define a function. You will now write your own function!

Define a function, `shout(),` which simply prints out a string with three exclamation marks `'!!!'` at the end. The code for the `square()` function that we wrote earlier is found below. You can use it as a pattern to define `shout()`.

def square():
    new_value = 4 ** 2
    return new_value

Note that the function body is indented 4 spaces already for you. Function bodies need to be indented by a consistent number of spaces and the choice of 4 is common.

This course touches on a lot of concepts you may have forgotten, so if you ever need a quick refresher, download the Python for Data Science Cheat Sheet and keep it handy!

STEPS
    Complete the function header by adding the appropriate function name, shout.
    In the function body, concatenate the string, 'congratulations' with another string, '!!!'. Assign the result to shout_word.
    Print the value of shout_word.
    Call the shout function.

In [8]:
# Define the function shout
def shout():
    """Print a string with three exclamation marks"""
    # Concatenate the strings: shout_word
    shout_word = 'congratulations' + '!!!'
    

    # Print shout_word
    print(shout_word)

# Call shout
shout()

congratulations!!!


**Single-parameter functions**

Congratulations! You have successfully defined and called your own function! That's pretty cool.

In the previous exercise, you defined and called the function `shout()`, which printed out a string concatenated with `'!!!'`. You will now update `shout()` by adding a parameter so that it can accept and process any string argument passed to it. Also note that `shout(word)`, the part of the header that specifies the function name and parameter(s), is known as the signature of the function. You may encounter this term in the wild!

STEPS
    Complete the function header by adding the parameter name, word.
    Assign the result of concatenating word with '!!!' to shout_word.
    Print the value of shout_word.
    Call the shout() function, passing to it the string, 'congratulations'.

In [9]:
# Define shout with the parameter, word
def shout(word):
    """Print a string with three exclamation marks"""
    # Concatenate the strings: shout_word
    shout_word = word + '!!!'

    # Print shout_word
    print(shout_word)

# Call shout with the string 'congratulations'
shout('congratulations')

congratulations!!!


**Functions that return single values**

You're getting very good at this! Try your hand at another modification to the `shout()` function so that it now returns a single value instead of printing within the function. Recall that the `return` keyword lets you return values from functions. Parts of the function `shout()`, which you wrote earlier, are shown. Returning values is generally more desirable than printing them out because, as you saw earlier, a `print()` call assigned to a variable has type `NoneType`.

STEPS
    In the function body, concatenate the string in word with '!!!' and assign to shout_word.
    Replace the print() statement with the appropriate return statement.
    Call the shout() function, passing to it the string, 'congratulations', and assigning the call to the variable, yell.
    To check if yell contains the value returned by shout(), print the value of yell.

In [10]:
# Define shout with the parameter, word
def shout(word):
    """Return a string with three exclamation marks"""
    # Concatenate the strings: shout_word
    shout_word = word + '!!!'
    
    # Replace print with return
    return shout_word

# Pass 'congratulations' to shout: yell
yell = shout('congratulations')

# Print yell
print(yell)

congratulations!!!


#### **Multiple parameters and return values**

1. Multiple function parameters

- Accept more than 1 parameter:

def raise_to_power(value1, value2):
    """Raise value1 to the power of value2."""
    new_value = value1 ** value2
    return new_value

- Call function: # of arguments = # of parameters

result = raise_to_power(2, 3)
print(result)
`8`
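
The rule that the number of arguments must match the number of parameters can be checked with a short sketch. The `try`/`except` block and the `message` variable are added here only to display the error without stopping the script.

```python
def raise_to_power(value1, value2):
    """Raise value1 to the power of value2."""
    return value1 ** value2

# Positional arguments are matched to parameters in order
print(raise_to_power(2, 3))

# Keyword arguments name the parameters explicitly, so order no longer matters
print(raise_to_power(value2=3, value1=2))

# Passing too few arguments raises a TypeError
try:
    raise_to_power(2)
except TypeError as error:
    message = str(error)
    print("TypeError:", message)
```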

2. A quick jump into tuples

- Make functions return multiple values: Tuples!
- Tuples:

    - Like a list - can contain multiple values
    - Immutable - can’t modify values!
    - Constructed using parentheses ()

even_nums = (2, 4, 6)
print(type(even_nums))
`<class 'tuple'>`

3. Unpacking tuples

- Unpack a tuple into several variables:

even_nums = (2, 4, 6)
a, b, c = even_nums

print(a)
`2`
print(b)
`4`
print(c)
`6`

4. Accessing tuple elements

- Access tuple elements like you do with lists:

even_nums = (2, 4, 6)
print(even_nums[1])
`4`

second_num = even_nums[1]
print(second_num)
`4`

- Uses zero-indexing

5. Returning multiple values

def raise_both(value1, value2):
    """Raise value1 to the power of value2 and vice versa."""

    new_value1 = value1 ** value2
    new_value2 = value2 ** value1

    new_tuple = (new_value1, new_value2)

    return new_tuple
    
result = raise_both(2, 3)
print(result)
`(8, 9)`

**Functions with multiple parameters**

Hugo discussed the use of multiple parameters in defining functions in the last lecture. You are now going to use what you've learned to modify the `shout()` function further. Here, you will modify `shout()` to accept two arguments. Parts of the function `shout()`, which you wrote earlier, are shown.

STEPS
    Modify the function header such that it accepts two parameters, word1 and word2, in that order.
    Concatenate each of word1 and word2 with '!!!' and assign to shout1 and shout2, respectively.
    Concatenate shout1 and shout2 together, in that order, and assign to new_shout.
    Pass the strings 'congratulations' and 'you', in that order, to a call to shout(). Assign the return value to yell.

In [11]:
# Define shout with parameters word1 and word2
def shout(word1, word2):
    """Concatenate strings with three exclamation marks"""
    # Concatenate word1 with '!!!': shout1
    shout1 = word1 + '!!!'
    
    # Concatenate word2 with '!!!': shout2
    shout2 = word2 + '!!!'
    
    # Concatenate shout1 with shout2: new_shout
    new_shout = shout1 + shout2

    # Return new_shout
    return new_shout

# Pass 'congratulations' and 'you' to shout(): yell
yell = shout('congratulations', 'you')

# Print yell
print(yell)

congratulations!!!you!!!


**A brief introduction to tuples**

Alongside learning about functions, you've also learned about tuples! Here, you will practice what you've learned about tuples: how to construct, unpack, and access tuple elements. Recall how Hugo unpacked the tuple `even_nums` in the video:

`a, b, c = even_nums`

A three-element tuple named nums has been preloaded for this exercise. Before completing the script, perform the following:

    Print out the value of `nums` in the IPython shell. Note the elements in the tuple.
    In the IPython shell, try to change the first element of `nums` to the value 2 by doing an assignment: nums[0] = 2. What happens?


STEPS
    Unpack nums to the variables num1, num2, and num3.
    Construct a new tuple, even_nums composed of the same elements in nums, but with the 1st element replaced with the value, 2.

In [15]:
nums = (3,4,6)

print(nums)

# nums[0] = 2 TypeError: 'tuple' object does not support item assignment

# Unpack nums into num1, num2, and num3
num1, num2, num3 = nums

# Construct even_nums
even_nums = (2, num2, num3)

(3, 4, 6)


**Functions that return multiple values**

In the previous exercise, you constructed tuples, assigned tuples to variables, and unpacked tuples. Here you will return multiple values from a function using tuples. Let's now update our `shout()` function to return multiple values. Instead of returning just one string, we will return two strings with the string `!!!` concatenated to each.

Note that the return statement `return x, y` has the same result as `return (x, y)`: the former actually packs `x` and `y` into a tuple under the hood!
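
A minimal sketch of that equivalence, using two hypothetical helper functions that differ only in whether the return statement has parentheses:

```python
def with_parentheses(x, y):
    # Explicitly builds a tuple before returning it
    return (x, y)

def without_parentheses(x, y):
    # The comma alone is enough to pack x and y into a tuple
    return x, y

a = with_parentheses(1, 2)
b = without_parentheses(1, 2)

print(type(a), type(b))
print(a == b)
```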

STEPS
    Modify the function header such that the function name is now shout_all, and it accepts two parameters, word1 and word2, in that order.
    Concatenate the string '!!!' to each of word1 and word2 and assign to shout1 and shout2, respectively.
    Construct a tuple shout_words, composed of shout1 and shout2.
    Call shout_all() with the strings 'congratulations' and 'you' and assign the result to yell1 and yell2 (remember, shout_all() returns 2 variables!).

In [16]:
# Define shout_all with parameters word1 and word2
def shout_all(word1, word2):
    
    # Concatenate word1 with '!!!': shout1
    shout1 = word1 + '!!!'
    
    # Concatenate word2 with '!!!': shout2
    shout2 = word2 + '!!!'
    
    # Construct a tuple with shout1 and shout2: shout_words
    shout_words = (shout1, shout2)

    # Return shout_words
    return shout_words

# Pass 'congratulations' and 'you' to shout_all(): yell1, yell2
yell1, yell2 = shout_all('congratulations', 'you')

# Print yell1 and yell2
print(yell1)
print(yell2)

congratulations!!!
you!!!


#### **Bringing it all together**

1. You’ve learned:

- How to write functions
    Accept multiple parameters
    Return multiple values
- Up next: Functions for analyzing Twitter data

2. Basic ingredients of a function

- Function Header
def raise_both(value1, value2):

- Function body
    """Raise value1 to the power of value2
    and vice versa."""
    new_value1 = value1 ** value2
    new_value2 = value2 ** value1
    new_tuple = (new_value1, new_value2)
    return new_tuple

**Bringing it all together (1)**

You've got your first taste of writing your own functions in the previous exercises. You've learned how to add parameters to your own function definitions, return a value or multiple values with tuples, and how to call the functions you've defined.

In this and the following exercise, you will bring together all these concepts and apply them to a simple data science problem. You will load a dataset and develop functionalities to extract simple insights from the data.

For this exercise, your goal is to recall how to load a dataset into a DataFrame. The dataset contains Twitter data and you will iterate over entries in a column to build a dictionary in which the keys are the names of languages and the values are the number of tweets in the given language. The file `tweets.csv` is available in your current directory.

Be aware that this is real data from Twitter and as such there is always a risk that it may contain profanity or other offensive content (in this exercise, and any following exercises that also use real Twitter data).

STEPS
    Import the pandas package with the alias pd.
    Import the file 'tweets.csv' using the pandas function read_csv(). Assign the resulting DataFrame to df.
    Complete the for loop by iterating over col, the 'lang' column in the DataFrame df.
    Complete the bodies of the if-else statements in the for loop: if the key is in the dictionary langs_count, add 1 to the value corresponding to this key in the dictionary, else add the key to langs_count and set the corresponding value to 1. Use the loop variable entry in your code.

In [32]:
# Import pandas
import pandas as pd

# Import Twitter data as DataFrame: tweets_df
tweets_df = pd.read_csv(r'G:\My Drive\Data Science\Datacamp_Notebook\Datacamp_Notebook\datasets\tweets.csv')

# Initialize an empty dictionary: langs_count
langs_count = {}

# Extract column from DataFrame: col
col = tweets_df['lang']

# Iterate over lang column in DataFrame
for entry in col:

    # If the language is in langs_count, add 1 
    if entry in langs_count.keys():
        langs_count[entry] += 1
    # Else add the language to langs_count, set the value to 1
    else:
        langs_count[entry] = 1

# Print the populated dictionary
print(langs_count)

{'en': 97, 'et': 1, 'und': 2}


**Bringing it all together (2)**

Great job! You've now defined the functionality for iterating over entries in a column and building a dictionary whose keys are the names of languages and whose values are the number of tweets in each language.

In this exercise, you will define a function with the functionality you developed in the previous exercise, return the resulting dictionary from within the function, and call the function with the appropriate arguments.

For your convenience, the pandas package has been imported as `pd` and the `'tweets.csv'` file has been imported into the `tweets_df` variable.

In [33]:
# Define count_entries()
def count_entries(df, col_name):
    """Return a dictionary with counts of 
    occurrences as value for each key."""

    # Initialize an empty dictionary: langs_count
    langs_count = {}
    
    # Extract column from DataFrame: col
    col = df[col_name]
    
    # Iterate over lang column in DataFrame
    for entry in col:

        # If the language is in langs_count, add 1
        if entry in langs_count.keys():
            langs_count[entry] += 1
        # Else add the language to langs_count, set the value to 1
        else:
            langs_count[entry] = 1

    # Return the langs_count dictionary
    return langs_count

# Call count_entries(): result
result = count_entries(tweets_df, 'lang')

# Print the result
print(result)

{'en': 97, 'et': 1, 'und': 2}
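
As a cross-check on the counting logic, the same tally can be produced with `collections.Counter` from the standard library. This is a sketch rather than the course's solution: the `langs` list is a hypothetical stand-in for the `'lang'` column, and `count_entries_manual` is a simplified variant of `count_entries()` that takes any iterable instead of a DataFrame.

```python
from collections import Counter

# Hypothetical stand-in for the 'lang' column of the tweets DataFrame
langs = ['en', 'en', 'et', 'und', 'en', 'und']

def count_entries_manual(entries):
    """Return a dictionary mapping each entry to its number of occurrences."""
    counts = {}
    for entry in entries:
        if entry in counts:
            counts[entry] += 1
        else:
            counts[entry] = 1
    return counts

manual = count_entries_manual(langs)

# Counter performs the same tally; dict() converts it for comparison
with_counter = dict(Counter(langs))

print(manual)
print(manual == with_counter)
```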


## **Break: review of while loops and for loops**

Python For Loops

A for loop is used for iterating over a sequence (that is either a list, a tuple, a dictionary, a set, or a string).

This is less like the for keyword in other programming languages, and works more like an iterator method as found in other object-orientated programming languages.

With the for loop we can execute a set of statements, once for each item in a list, tuple, set etc.

*Basic Syntax of the Python for loop*

The basic syntax of the for loop in Python looks something similar to the one mentioned below.

for iterator_variable in sequence_name:
    Statements
    . . .
    Statements

Let me explain the syntax of the Python for loop better.

- The statement starts with the keyword `for`, which marks the beginning of the loop.
- Next comes the iterator variable, which takes each value of the sequence in turn and can be used inside the loop body.
- The `in` keyword tells Python to draw those values from the sequence that follows.
- Finally comes the sequence itself, which can be a list, a tuple, or any other kind of iterable.
- The statements in the loop body are where you work with the iterator variable.
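
The same syntax applies to dictionaries, which the cells in this section do not cover. A brief sketch, reusing the language counts printed earlier in these notes (the `keys` and `total` variables are introduced only for illustration):

```python
langs_count = {'en': 97, 'et': 1, 'und': 2}

# Iterating over a dictionary directly yields its keys
keys = []
for lang in langs_count:
    keys.append(lang)
print(keys)

# .items() yields (key, value) pairs, unpacked here into two loop variables
total = 0
for lang, count in langs_count.items():
    print(lang, count)
    total += count
print(total)
```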


In [1]:
# Example 1
fruits = ["apple", "banana", "cherry"]
for x in fruits:
  print(x)

apple
banana
cherry


In [2]:
# Looping Through a String
for x in "banana":
  print(x)

b
a
n
a
n
a


In [3]:
#The break Statement. 
# Exit the loop when x is "banana":
fruits = ["apple", "banana", "cherry"]
for x in fruits:
  print(x)
  if x == "banana":
    break

apple
banana


In [4]:
#The break Statement. 
# Exit the loop when x is "banana", but this time the break comes before the print:
fruits = ["apple", "banana", "cherry"]
for x in fruits:
  if x == "banana":
    break
  print(x)

apple


In [5]:
# The continue Statement

# With the continue statement we can stop the current iteration of the loop, and continue with the next:

# do not print: banana

fruits = ["apple", "banana", "cherry"]
for x in fruits:
  if x == "banana":
    continue
  print(x)

apple
cherry


In [10]:
# The range() Function

# To loop through a set of code a specified number of times, we can use the range() function.
# The range() function returns a sequence of numbers, starting from 0 by default, and increments 
# by 1 (by default), and ends at a specified number.

for x in range(6):
  print(x)

print("   ")

# The range() function defaults to 0 as a starting value, however it is possible to specify the 
# starting value by adding a parameter: range(2, 6), which means values from 2 to 6 (but not including 6):

for x in range(2, 6):
  print(x)

print("   ")

#The range() function defaults to increment the sequence by 1, however it is possible to specify the 
# increment value by adding a third parameter: range(2, 30, 3):

for x in range(2, 30, 3):
  print(x)


0
1
2
3
4
5
   
2
3
4
5
   
2
5
8
11
14
17
20
23
26
29
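
A compact way to inspect what `range()` produces, without writing a loop, is to convert it to a list. The sketch below repeats the three calls from the cell above and adds a negative step as one further illustration. The `stepped` and `countdown` names are assumptions introduced here.

```python
# One argument: stop (start defaults to 0, step to 1)
print(list(range(6)))

# Two arguments: start and stop (stop is excluded)
print(list(range(2, 6)))

# Three arguments: start, stop and step
stepped = list(range(2, 30, 3))
print(stepped)

# A negative step counts downwards; stop is still excluded
countdown = list(range(5, 0, -1))
print(countdown)
```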


In [19]:
# Useful application of range()

number = int(input("Enter an integer: "))

for count in range(1, 11):
    product = number * count
    print(number, "x", count, "=", product)

12 x 1 = 12
12 x 2 = 24
12 x 3 = 36
12 x 4 = 48
12 x 5 = 60
12 x 6 = 72
12 x 7 = 84
12 x 8 = 96
12 x 9 = 108
12 x 10 = 120


In [11]:
# Else in For Loop

# The else keyword in a for loop specifies a block of code to be executed when the loop is finished:

# Example: Print all numbers from 0 to 5, and print a message when the loop has ended:

for x in range(6):
  print(x)
else:
  print("Finally finished!") 

0
1
2
3
4
5
Finally finished!


In [13]:
# Example 2: Break the loop when x is 3, and see what happens with the else block:

for x in range(6):
  if x == 3: break
  print(x)
else:
  print("Finally finished!") 

# Note: The else block will NOT be executed if the loop is stopped by a break statement.

0
1
2


In [14]:
# Nested Loops

# A nested loop is a loop inside a loop.

# The "inner loop" will be executed one time for each iteration of the "outer loop":

adj = ["red", "big", "tasty"]
fruits = ["apple", "banana", "cherry"]

for x in adj:
  for y in fruits:
    print(x, y) 

red apple
red banana
red cherry
big apple
big banana
big cherry
tasty apple
tasty banana
tasty cherry


In [15]:
# The pass Statement

# for loops cannot be empty, but if you for some reason have a for loop with no content, 
# put in the pass statement to avoid getting an error.

for x in [0, 1, 2]:
  pass

In [20]:
nums = [1, 2, -3, 4, -5, 6]

sum_positives = 0

for num in nums:
    if num < 0:
        continue
    sum_positives += num

print(f'Sum of Positive Numbers: {sum_positives}')

Sum of Positive Numbers: 13


In [21]:
nums = [1, 2, 3, 4, 5, 6]

n = 2

found = False
for num in nums:
    if n == num:
        found = True
        break

print(f'List contains {n}: {found}')

# Output
# List contains 2: True

List contains 2: True
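
The flag-based search above can also be written with the `else` clause of the `for` loop described earlier, since that clause runs only when the loop finishes without hitting `break`. This is a sketch of one possible variant, and the `contains` function name is hypothetical.

```python
def contains(nums, n):
    """Return True if n is found in nums, using a for ... else loop."""
    for num in nums:
        if num == n:
            found = True
            break  # skips the else block below
    else:
        # Reached only when the loop ran to completion without a break
        found = False
    return found

nums = [1, 2, 3, 4, 5, 6]
print(f'List contains 2: {contains(nums, 2)}')
print(f'List contains 7: {contains(nums, 7)}')
```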


#### **Congratulations!**

Next chapters:
- Functions with default arguments
- Functions that accept an arbitrary number of parameters
- Nested functions
- Error-handling within functions
- More function use in data science!

## **Default arguments, variable-length arguments and scope (Module 02-045)**

#### **Scope and user-defined functions**

1. 

## **Lambda functions and error-handling (Module 03-045)**

In [1]:
print('perrenque!')

perrenque!
