# Conditionals, Loops, Functions, Re Module (cont'd)

## Functions


In Python, a function is a block of code that performs a specific task and can be called from other parts of a program. Functions provide a way to organize code into reusable modules, making code more efficient, readable, and easier to maintain.

Functions in Python are defined using the def keyword, followed by the function name and parentheses. Any parameters that the function requires are listed inside the parentheses, and the function code is indented below the function definition. Here's an example of a simple function that takes two arguments and returns their sum:

In [1]:
def add_numbers(a, b):
    total = a + b
    return total


In this example, we have defined a function called add_numbers that takes two arguments, a and b. Inside the function, we have created a variable called total that holds the sum of a and b. (We avoid naming the variable sum, because that would shadow Python's built-in sum() function.) Finally, we have used the return statement to return the value of total to the caller.

To call a function in Python, we simply use its name followed by parentheses, passing any required arguments inside the parentheses. Here's an example of how we could call the add_numbers function:

In [2]:
result = add_numbers(2, 3)
print(result)


5


In this example, we have called the add_numbers function with arguments 2 and 3. The function returns the value 5, which we have assigned to a variable called result. Finally, we have printed the value of result to the console, which should output 5.

Functions in Python can have optional parameters, default parameter values, and can return multiple values using tuples. They are an essential tool for any Python programmer and provide a way to write efficient and reusable code.
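The optional parameters, default values, and tuple returns mentioned above can be sketched as follows. The function describe_order and its parameter names are hypothetical, chosen only for illustration.

```python
def describe_order(price, quantity=1, tax_rate=0.0):
    """Return the subtotal, tax, and total for an order as a tuple."""
    subtotal = price * quantity
    tax = subtotal * tax_rate
    # Separating values with commas returns them packed in a single tuple
    return subtotal, tax, subtotal + tax


# Only the required argument is given, so the defaults are used
subtotal, tax, total = describe_order(10.0)
print(subtotal, tax, total)  # 10.0 0.0 10.0

# Defaults can be overridden by name
subtotal, tax, total = describe_order(10.0, quantity=3, tax_rate=0.5)
print(subtotal, tax, total)  # 30.0 15.0 45.0
```

The caller unpacks the returned tuple into three variables in one assignment.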

Python also comes with built-in functions. These are available as soon as Python is installed, with no import required. We have already used a couple of them in this course. Here is a list of some of the built-in functions in Python:

print() - This function is used to print out a message or value to the console.

len() - This function is used to find the length of a string, list, tuple, or any other sequence.

range() - This function is used to create a range of numbers that can be used in a loop.

type() - This function is used to determine the data type of a given value.

str() - This function is used to convert a value to a string.

int() - This function is used to convert a value to an integer.

float() - This function is used to convert a value to a floating-point number.

abs() - This function is used to find the absolute value of a number.

sum() - This function is used to find the sum of all the values in a list or other iterable.

max() and min() - These functions are used to find the maximum and minimum values in a list or other iterable.

Let us look at more examples of simple functions:

greet(): This function takes no arguments and simply prints a greeting message.

In [None]:
def greet():
    print("Hello, how are you?")


add(): This function takes two arguments - a and b - and returns the sum of the two arguments.

In [None]:
def add(a, b):
    return a + b


is_even(): This function takes one argument - a number - and checks whether the number is even or odd. It returns True if the number is even and False if it is odd.

In [None]:
def is_even(num):
    # The comparison already evaluates to True or False, so return it directly
    return num % 2 == 0


double_list(): This function takes one argument - a list of numbers - and returns a new list with each number in the original list doubled.

In [None]:
def double_list(numbers):
    doubled_numbers = []
    for num in numbers:
        doubled_numbers.append(num * 2)
    return doubled_numbers


calculate_area(): This function takes two arguments - the base and height of a triangle - and returns the area of the triangle.

In [None]:
def calculate_area(base, height):
    area = 0.5 * base * height
    return area


Now, for some examples that are more involved:

#### Example 1:

calculate_bmi(): This function takes in two arguments - weight in kilograms and height in meters - and returns the Body Mass Index (BMI) of an individual.

In [None]:
def calculate_bmi(weight, height):
    bmi = weight / (height ** 2)
    return bmi


#### Example 2:

convert_currency(): This function takes in three arguments - the amount of money to be converted, the currency of the original amount, and the currency to which the amount needs to be converted - and returns the converted amount.

In [1]:
def convert_currency(amount, original_currency, target_currency):
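    # Assume the conversion rates are stored in a dictionary,
    # expressed as units of each currency per 1 USD
    conversion_rates = {'USD': 1.0, 'EUR': 0.83, 'GBP': 0.72}
    # Convert to USD first, then from USD to the target currency
    converted_amount = amount / conversion_rates[original_currency] * conversion_rates[target_currency]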
    return converted_amount


#### Example 3:

calculate_tip(): This function takes in two arguments - the total bill amount and the percentage of the tip - and returns the amount of tip to be paid.

In [2]:
def calculate_tip(total_bill, tip_percent):
    tip_amount = total_bill * tip_percent / 100
    return tip_amount


#### Example 4:

calculate_distance(): This function takes in four arguments - the latitude and longitude of two locations - and returns the distance between the two locations in kilometers.

In [None]:
import geopy.distance

def calculate_distance(lat1, lon1, lat2, lon2):
    location1 = (lat1, lon1)
    location2 = (lat2, lon2)
    distance_km = geopy.distance.distance(location1, location2).km
    return distance_km

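If geopy is not installed, a rough alternative is the haversine formula, sketched below using only the standard library. This is a swapped-in technique, not what geopy does internally: it assumes a perfectly spherical Earth with a mean radius of 6371 km (an assumption), so its results will differ slightly from geopy's. The name haversine_km is hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Approximate great-circle distance in kilometers between two points."""
    # Convert degrees to radians before using trigonometric functions
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)

    # Haversine formula
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


# Two points on the equator, half the globe apart
print(haversine_km(0, 0, 0, 180))
```

For two opposite points on the equator the result is half the circumference of the assumed sphere, which is a quick sanity check on the formula.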

## Lambda Functions

Lambda functions, also known as anonymous functions, are small functions defined with the lambda keyword in a single expression, without a def statement or a name of their own.

Lambda functions are typically used when a small, one-time function is needed, and defining a full function is unnecessary. They can take any number of arguments, but the body must be a single expression, whose value is returned.

Here is an example of a lambda function that takes two arguments and returns their sum:

In [None]:
add = lambda x, y: x + y
result = add(2, 3)
print(result) # Output: 5


In this example, lambda is used to define a function that takes two arguments x and y, and returns their sum x + y. The function is then assigned to the variable add, and called with arguments 2 and 3. The result is printed, which is 5.
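A common "one-time" use of a lambda is as the key argument to the built-in sorted() function. The sketch below uses a made-up list of (name, age) pairs purely for illustration.

```python
# Hypothetical sample data: (name, age) pairs
people = [("Ada", 36), ("Grace", 45), ("Linus", 28)]

# The lambda tells sorted() to compare the pairs by their second item (the age)
by_age = sorted(people, key=lambda person: person[1])

print(by_age)  # [('Linus', 28), ('Ada', 36), ('Grace', 45)]
```

Here the lambda is never given a name; it exists only for the duration of the sorted() call.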

### Application of Functions in Pandas


Python functions are used extensively in Pandas for data manipulation and analysis. Pandas provides a wide range of built-in functions for common operations on data, but sometimes you may need to write your own custom functions to perform specific data manipulations or calculations.

So far, we have learned how to create our own custom functions and now we can apply this knowledge to dataframes. Here are a few examples of how Python functions can be used in Pandas:

**Apply function to a column or row:** <br>

You can use the apply() function to apply a Python function to each element of a Pandas Series or DataFrame column or row. For example, you could define a function that calculates the square of a number and apply it to a column in a DataFrame:

In [3]:
import pandas as pd

df = pd.DataFrame({'numbers': [1, 2, 3, 4, 5]})
square = lambda x: x ** 2
df['squared'] = df['numbers'].apply(square)


The above code will create a new column in the DataFrame called "squared" that contains the square of each number in the "numbers" column.

You can use Python functions with conditional statements to manipulate data in a Pandas DataFrame. For example, you could define a function that sets values in a column to "high" or "low" based on a threshold value and use it with the apply() function:

In [4]:
import pandas as pd

df = pd.DataFrame({'values': [1, 2, 3, 4, 5]})
threshold = 3
def set_threshold(x):
    if x >= threshold:
        return 'high'
    else:
        return 'low'
df['threshold'] = df['values'].apply(set_threshold)


This will create a new column in the DataFrame called "threshold" that contains "high" when the corresponding value in the "values" column is at or above the threshold, and "low" when it is below the threshold.
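The examples so far apply a function to a single column. To apply a function to each row instead, apply() can be called on the whole DataFrame with axis=1. The sketch below is a minimal illustration; the column names and the triangle_area function are hypothetical.

```python
import pandas as pd

# Hypothetical data: the base and height of three triangles
df = pd.DataFrame({"base": [3, 4, 10], "height": [2, 5, 6]})


def triangle_area(row):
    # With axis=1, each call receives one row as a Series
    return 0.5 * row["base"] * row["height"]


df["area"] = df.apply(triangle_area, axis=1)
print(df)
```

Each row is passed to the function in turn, so the function can combine values from several columns.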

**Apply function to the entire dataframe:** <br>

Here, we make use of the applymap() method to apply a Python function to every element of a DataFrame, across all rows and columns. (In pandas 2.1 and later, applymap() is deprecated in favor of DataFrame.map(), which works the same way.) Here's an example of how to use applymap in pandas:

Suppose you have a DataFrame df as follows:

In [5]:
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})


You can use applymap to apply a function to each element of the DataFrame. For example, let's say we want to square each element of the DataFrame. We can define a lambda function to do this:

In [6]:
square = lambda x: x ** 2

Then, we can use applymap to apply the square function to each element of the DataFrame:

In [7]:
df_squared = df.applymap(square)
df_squared

| | A | B |
|---|---|---|
| 0 | 1 | 16 |
| 1 | 4 | 25 |
| 2 | 9 | 36 |


### Classwork

1. **calculate_tax()**: This function takes in two arguments - the total purchase amount and the tax rate - and returns the amount of tax to be paid.

2. **generate_password()**: This function generates a random password of a specified length and complexity. You can use the "random" module in python. (This requires some research into how to use the random module)

3. Give code examples of using the built-in functions given in this notebook on ordinary numbers and also pandas tables. You can create the tables on your own or you can use the tables we have used so far in class.

4. Write a program that takes a list of integers and returns the sum of all even numbers in the list using a loop.

5. Write a Python program that takes a string as input and outputs the number of vowels in the string using a loop.

6. Write a python function that reverses a string. For example, "Olujare" becomes "erajulO"


### List Comprehension

List comprehension is a concise and elegant way to create a new list in Python by applying an expression or function to each element in an existing list. It is a powerful and efficient tool for data processing and is often used as a replacement for loops.

The basic syntax of list comprehension is as follows:

In [2]:
iterable = []

new_list = [expression for item in iterable]


Here, expression is an expression or function that is applied to each element in the iterable, item is the variable that represents each element in the iterable, and iterable is a list, tuple, set, or any other iterable object.

For example, the following code creates a new list of squared numbers using list comprehension:

In [None]:
numbers = [1, 2, 3, 4, 5]

squares = [x**2 for x in numbers]

print(squares) # Output: [1, 4, 9, 16, 25]


This code creates a new list called squares by squaring each element in the numbers list using the expression x**2. List comprehension is often more concise and readable than equivalent code that uses loops, and it can also be faster and more efficient in many cases.

List comprehension also allows for filtering of elements by including an optional if clause at the end of the expression. For example, the following code creates a new list of even numbers using list comprehension:

In [None]:
numbers = [1, 2, 3, 4, 5]

even_numbers = [x for x in numbers if x % 2 == 0]

print(even_numbers) # Output: [2, 4]


This code creates a new list called even_numbers that includes only the even numbers from the numbers list by checking the remainder when each number is divided by 2.
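To make the link between loops and list comprehension concrete, here is a small sketch that combines an expression with a filter and writes the same logic both ways. The variable names are illustrative only.

```python
numbers = [1, 2, 3, 4, 5]

# Loop version: keep the even numbers and square them
even_squares_loop = []
for x in numbers:
    if x % 2 == 0:
        even_squares_loop.append(x ** 2)

# List comprehension version of the same logic
even_squares = [x ** 2 for x in numbers if x % 2 == 0]

print(even_squares_loop)  # [4, 16]
print(even_squares)       # [4, 16]
```

Both versions produce the same list; the comprehension simply puts the expression, the loop, and the condition on one line.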

### Assignment

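1. Python also supports dictionary comprehensions, set comprehensions, and generator expressions. (There is no true tuple comprehension: parentheses create a generator expression, which can be passed to tuple() to build a tuple.) Create code examples for each.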

2. Create a program that generates a multiplication table for a given number using a nested loop in Python.

3. Create a program that takes a list of strings and sorts them alphabetically using a loop and the built-in string comparison functions in Python. (It is possible to compare strings using the > or < signs. For example "j" > "i" outputs True in python)

4. Explain the code snippets given in the cell below. Use comments to explain what the code is doing at each line.

In [None]:
max_num = int(input("Enter the maximum number: "))

for num in range(2, max_num + 1):

    is_prime = True
    
    for i in range(2, num):
        if num % i == 0:
            is_prime = False
            break
    
    if is_prime:
        print(num, end=" ")


5. Write a Python function that takes in a list of numbers and returns the sum of all the even numbers in the list.

6. Write a Python function that takes in a string and returns True if the string is a palindrome (reads the same backwards as forwards), and False otherwise.

7. Write a Python function that takes in a list of numbers and returns a new list with only the numbers that are greater than the average of all the numbers in the original list.

8. Implement question 7 with list comprehension.

## Re Module


The re module is a built-in module in Python that provides support for regular expressions, which are a powerful and flexible way to match and manipulate text. The re module allows you to use regular expressions in your Python code, providing a wide range of functions for searching, replacing, and manipulating text.

Some of the key functions provided by the re module include:

Search: You can use the re.search() function to search for a pattern in a string and return the first occurrence of the match.<br>

Match: You can use the re.match() function to match a pattern at the beginning of a string. <br>

Substitution: You can use the re.sub() function to search and replace a pattern in a string with a replacement string. <br>

Splitting: You can use the re.split() function to split a string into a list based on a specified pattern. <br>

Findall: You can use the re.findall() function to find all occurrences of a pattern in a string and return them as a list. <br>

Groups: You can use the groups() method to extract specific parts of a matched pattern. <br>

Compile: You can use the re.compile() function to compile a regular expression pattern into a regex object for better performance if you plan to use the same pattern multiple times.<br>

<br>
<br>
Regular expressions are a powerful tool for text processing and can be used in a wide range of applications, including data cleaning, natural language processing, and web scraping. The re module is an essential tool for anyone working with text data in Python, and it provides a powerful and flexible way to work with regular expressions.
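As a quick preview of the functions listed above, the sketch below runs several of them on one short string. The sample text and patterns are made up for illustration; each function is covered in more detail later.

```python
import re

text = "2022-05-15, 2023-01-02"

# re.match() only looks at the beginning of the string
print(re.match(r"\d{4}", text).group())   # 2022

# re.split() cuts the string wherever the pattern matches
print(re.split(r",\s*", text))            # ['2022-05-15', '2023-01-02']

# re.findall() returns every non-overlapping match as a list
print(re.findall(r"\d{4}", text))         # ['2022', '2023']

# Parentheses create groups; groups() returns them as a tuple
match = re.search(r"(\d{4})-(\d{2})-(\d{2})", text)
print(match.groups())                     # ('2022', '05', '15')

# re.sub() replaces every match with the replacement string
print(re.sub(r"-", "/", text))            # 2022/05/15, 2023/01/02
```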

[Geeks for Geeks re module article](https://www.geeksforgeeks.org/regular-expression-python-examples-set-1/)

[Google re module article](https://developers.google.com/edu/python/regular-expressions)

[guru99 website](https://www.guru99.com/python-regular-expressions-complete-tutorial.html)


### Official Documentation for the Re module:
[Official Re Documentation](https://docs.python.org/3/library/re.html)

### Re.search():

Here are five examples of how to use re.search() in Python:

#### Matching a specific pattern in a string:

In [None]:
import re

text = "The quick brown fox jumps over the lazy dog."
pattern = "brown"
match = re.search(pattern, text)
if match:
    print("Pattern found in the text!")
else:
    print("Pattern not found.")


#### Matching a pattern with a variable in a string:

In [None]:
import re

text = "My favorite number is 42."
number = 42
pattern = str(number)
match = re.search(pattern, text)
if match:
    print("Pattern found in the text!")
else:
    print("Pattern not found.")


#### Matching a pattern with a regular expression:

In [13]:
import re

text = "The quick brown fox jumps over the lazy dog."
pattern = r"\b[A-Z]{5}\b"
match = re.search(pattern, text)
if match:
    print("Pattern found in the text!")
else:
    print("Pattern not found.")


Pattern not found.


#### Matching a pattern with a case-insensitive search:

In [None]:
import re

text = "The quick brown fox jumps over the Lazy dog."
pattern = "lazy"
match = re.search(pattern, text, re.IGNORECASE)
if match:
    print("Pattern found in the text!")
else:
    print("Pattern not found.")


#### Matching a pattern with a multiline string:

In [None]:
import re

text = "The quick brown fox\njumps over the lazy dog."
pattern = "fox.*dog"
# re.DOTALL lets "." match newlines too; re.MULTILINE only changes how ^ and $ behave
match = re.search(pattern, text, re.DOTALL)
if match:
    print("Pattern found in the text!")
else:
    print("Pattern not found.")


In each of these examples, re.search() is used to search for a pattern in a string. The first argument to re.search() is the pattern to search for, and the second argument is the string to search in. Optional arguments can be used to specify additional search options such as case-insensitivity or multi-line search. If the pattern is found, re.search() returns a match object, which can be used to retrieve information about the match.
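The information carried by a match object can be sketched as follows, using the same sentence as the examples above. The pattern here is chosen only for illustration.

```python
import re

text = "The quick brown fox jumps over the lazy dog."

# Find the first word that starts with a lowercase "b"
match = re.search(r"b\w+", text)

if match:
    print(match.group())  # the matched text: brown
    print(match.start())  # index where the match begins: 10
    print(match.end())    # index just past the match: 15
    print(match.span())   # (start, end) as a tuple: (10, 15)
```

When there is no match, re.search() returns None, which is why the examples test the result with if before using it.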

### Common Regex Symbols

Before we continue, note that the third re.search() example above used a regex pattern, a concept we have not yet covered. We will therefore take some time to go through the most common symbols.


#### Common Regex symbols:

. - Matches any character except a newline.<br>
^ - Matches the start of a string.<br>
$ - Matches the end of a string.<br>
'\*' - Matches zero or more occurrences of the preceding character or group.<br>
'+' - Matches one or more occurrences of the preceding character or group.<br>
? - Matches zero or one occurrence of the preceding character or group.<br>
{m} - Matches exactly m occurrences of the preceding character or group.<br>
{m,n} - Matches between m and n occurrences of the preceding character or group.<br>
[] - Matches any character inside the square brackets. For example, [abc] matches any character that is either a, b, or c.<br>
() - Groups characters together. For example, (ab)* matches zero or more occurrences of the characters "ab".<br>
\ - Escapes special characters, allowing them to be treated as literal characters. For example, \. matches a period character.<br>
<br>
<br>
These are just a few of the most commonly used symbols in regex patterns. There are many more, and their meaning can vary depending on the context and the regex flavor being used. It's important to carefully study the documentation for the specific regex engine you're using to fully understand how to use these symbols effectively in your code.
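The symbols above can be tried out with a small test harness like the one below. It is only a sketch: the patterns and sample strings are made up, and each entry records whether re.search() is expected to find a match.

```python
import re

# (pattern, string to search, whether a match is expected)
examples = [
    (r"^The", "The end", True),     # ^ anchors to the start
    (r"end$", "The end", True),     # $ anchors to the end
    (r"colou?r", "color", True),    # ? makes the "u" optional
    (r"ab*c", "ac", True),          # * allows zero "b" characters
    (r"ab+c", "ac", False),         # + requires at least one "b"
    (r"\d{3}", "12", False),        # {3} needs exactly three digits
    (r"[abc]at", "cat", True),      # [] matches one listed character
    (r"(ab)+", "abab", True),       # () groups "ab" so + applies to both
    (r"3\.14", "3x14", False),      # \. matches only a literal period
    (r"3.14", "3x14", True),        # . matches any character
]

for pattern, string, expected in examples:
    found = re.search(pattern, string) is not None
    print(pattern, repr(string), found, "(as expected)" if found == expected else "(unexpected)")
```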


### Re.compile()

re.compile() turns a regular expression pattern into a reusable regular expression object. The syntax is as follows:

> re.compile(regex_pattern, flags=0)

In this syntax, "regex_pattern" is the regular expression pattern that you want to compile. Once the pattern is compiled, you can call methods such as search(), findall(), and sub() directly on the resulting regular expression object.

Here is an example that demonstrates how to use re.compile() to search for a pattern in a string:

In [None]:

pattern = re.compile("apple")
string = "I like to eat apples"

match = pattern.search(string)

if match:
    print("Match found!")
else:
    print("No match found.")


In this example, we compile the regular expression pattern "apple" using re.compile(). We then search for this pattern in the string "I like to eat apples" using the search() function, which returns a match object if the pattern is found. Finally, we check if a match was found and print a message accordingly.
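Compiling pays off mainly when the same pattern is reused many times. A minimal sketch, with made-up sample lines, of compiling once and then searching several strings:

```python
import re

# Compile the pattern once...
digits = re.compile(r"\d+")

# ...then reuse it on as many strings as needed
lines = ["order 66", "no numbers here", "room 101, floor 3"]
results = [digits.findall(line) for line in lines]

print(results)  # [['66'], [], ['101', '3']]
```

The compiled object offers the same operations as the module-level functions (search(), findall(), sub(), and so on), without passing the pattern each time.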

#### Creating compile objects

In [None]:
import re

# Matches any string that contains "apple"
pattern = re.compile(r"apple")

# Matches any string that starts with "hello"
pattern1 = re.compile(r"^hello")

# Matches any string that ends with "world"
pattern2 = re.compile(r"world$")

# Matches any string that contains one or more digits
pattern3 = re.compile(r"\d+")

# Matches any string that contains one or more whitespace characters
pattern4 = re.compile(r"\s+")

# Matches any string that starts with a capital letter and contains only letters and whitespace characters
pattern5 = re.compile(r"^[A-Z][a-zA-Z\s]*$")

# Matches any string that contains the word "cat", regardless of case
pattern6 = re.compile(r"cat", re.IGNORECASE)

# Matches any string that contains the characters "a" and "b", in that order
pattern7 = re.compile(r"a.*b")

# Matches any string that contains at least three digits in a row
pattern8 = re.compile(r"\d{3}")

# Matches any string that starts with "abc" or "def"
pattern9 = re.compile(r"^(abc|def)")

# Matches any string that ends with either "pdf" or "docx"
pattern10 = re.compile(r"(pdf|docx)$")


Let us use the remaining regex objects we have created to find pieces of text within the writeup given below:

In [None]:
text = """

The first code snippet most people type is 'hello world!'. They usually print these words and in that moment, they realize
that they have entered into a much different world than they were used to.

The abc's of coding are categorically easy concepts to grasp. The definitions of functions using the 'def' keyword is one of the
most beautiful thing.

After writing over 100,000 lines of code(I wonder who was counting), lead data scientist, Jumoke Kasali, says that 'The world 
of coding and data science and inextricably linked for life'

"""

In [None]:
pattern1.search(text)

In [None]:
pattern2.search(text)

In [None]:
pattern3.search(text)

In [None]:
pattern4.search(text)

In [None]:
pattern5.search(text)

In [None]:
pattern6.search(text)

In [None]:
pattern7.search(text)

In [None]:
pattern8.search(text)

In [None]:
pattern9.search(text)

In [None]:
pattern10.search(text)

### Phew...

We repeated code quite a bit there. Can you come up with a function or loop implementation that gives us the same results generated above?

### Re.sub():

re.sub() is a function in Python's built-in regular expression module that replaces the parts of a text that match a pattern with a replacement string. The syntax for re.sub() is as follows:

> re.sub(pattern, repl, string, count=0, flags=0)

- pattern: A regular expression pattern to be searched in the string.
- repl: A string that replaces the pattern found in string.
- string: A string in which the pattern needs to be searched and replaced.
- count (optional): An integer to specify the maximum number of substitutions to be made. Default is 0 which means all occurrences will be replaced.
- flags (optional): An integer to specify different flags such as re.IGNORECASE, re.MULTILINE, etc.

**Example 1: Replacing a string in a sentence**

In [None]:

text = "I am so happy today! Let's celebrate happiness together."
new_text = re.sub("happy", "joyful", text)

print("Original Text: ", text)
print("New Text: ", new_text)


**Example 2: Removing digits from a string**

Suppose you have a string that contains some digits, and you want to remove them. You can use the re.sub() function with a regular expression pattern as follows:


In [None]:

text = "My phone number is 123-456-7890. Call me anytime."
new_text = re.sub(r"\d", "", text)

print("Original Text: ", text)
print("New Text: ", new_text)

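The optional count and flags parameters of re.sub() described above can be sketched as follows. The sample text is made up for illustration.

```python
import re

text = "Cat, cat, CAT, cat"

# Default: every case-sensitive match is replaced
print(re.sub("cat", "dog", text))                       # Cat, dog, CAT, dog

# count=1: only the first match is replaced
print(re.sub("cat", "dog", text, count=1))              # Cat, dog, CAT, cat

# flags=re.IGNORECASE: upper and lower case both match
print(re.sub("cat", "dog", text, flags=re.IGNORECASE))  # dog, dog, dog, dog
```

Passing count and flags by name, as done here, avoids mixing up their positions.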

The next portion of this class continues online, at the link given below:

[Next Portion of the Class!](https://developers.google.com/edu/python/regular-expressions)

### Common Matching Challenges: Matching Phone Numbers, Emails, and License plate numbers

#### Phone Numbers:

In [1]:
import re

# Define the pattern for matching phone numbers
pattern = r'\d{3}-\d{3}-\d{4}'

# Example text
text = 'John Doe\'s phone number is 123-456-7890.'

# Find all occurrences of the pattern in the text
matches = re.findall(pattern, text)

# Print the matches
print(matches)


['123-456-7890']


In this example, we import the re module and define the pattern for matching phone numbers as \d{3}-\d{3}-\d{4}, which matches phone numbers in the format ###-###-####. We then define an example text string containing a phone number and use the findall() function to find all occurrences of the pattern in the text. Finally, we print the matches.
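Phone numbers are often written in more than one style. The sketch below loosens the pattern so that it also accepts parentheses around the area code and dots or spaces as separators. The set of accepted formats is an assumption made for illustration, not a complete phone number validator.

```python
import re

# Optional parentheses around the area code; "-", "." or whitespace as separators
pattern = r'\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}'

# Made-up example text with three different formats
text = 'Call 123-456-7890, (123) 456-7890 or 123.456.7890.'

matches = re.findall(pattern, text)
print(matches)  # ['123-456-7890', '(123) 456-7890', '123.456.7890']
```

The ? after each optional piece is what allows the same pattern to match all three formats.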

#### Emails:

In [None]:
import re

# Define the pattern for matching emails
pattern = r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'

# Example text
text = 'John Doe\'s email is john.doe@example.com.'

# Find all occurrences of the pattern in the text
matches = re.findall(pattern, text)

# Print the matches
print(matches)


**Line by line explanation of the above code:**
- \b: A word boundary to ensure that we match the entire email address.
- [A-Za-z0-9._%+-]+: One or more characters that can be letters (uppercase or lowercase), digits, dots, underscores, percent signs, plus signs, or hyphens.
- @: The at symbol, which separates the username and domain name in an email address.
- [A-Za-z0-9.-]+: One or more characters that can be letters (uppercase or lowercase), digits, dots, or hyphens, representing the domain name.
- \.: A literal period, which separates the domain name from the top-level domain.
- [A-Za-z]{2,}: Two or more uppercase or lowercase letters, representing the top-level domain. (A | is not needed inside square brackets; there it would match a literal | character.)
- \b: Another word boundary.
- text = 'John Doe\'s email is john.doe@example.com.': This line defines an example text that we want to search for email addresses.

- matches = re.findall(pattern, text): This line uses the findall() function from the re module to find all occurrences of the pattern in the text. The function returns a list of all matches.

- print(matches): This line prints the list of matches, which will be any email addresses found in the text.

In this example, we define the pattern for matching emails as \b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b, which matches email addresses in the format username@domain.com. We then define an example text string containing an email address and use the findall() function to find all occurrences of the pattern in the text. Finally, we print the matches.

#### License Plate Number:

In [None]:
import re

# Define the pattern for matching license plate numbers
pattern = r'[A-Z]{2}\d{2}\s[A-Z]{3}'

# Example text
text = 'The license plate number is AB12 XYZ.'

# Find all occurrences of the pattern in the text
matches = re.findall(pattern, text)

# Print the matches
print(matches)


**Line by line explanation of the code:**

- pattern = r'[A-Z]{2}\d{2}\s[A-Z]{3}': This line defines a regular expression pattern that matches license plate numbers. The pattern consists of:
> - [A-Z]{2}: two uppercase letters, which represent the state or country code of the license plate.
> - \d{2}: two digits, which represent the numerical part of the license plate.
> - \s: a whitespace character, which separates the numerical part from the alphabetical part of the license plate.
> - [A-Z]{3}: three uppercase letters, which represent the alphabetical part of the license plate.

In this example, we define the pattern for matching license plate numbers as [A-Z]{2}\d{2}\s[A-Z]{3}, which matches license plate numbers in the format AA99 AAA. We then define an example text string containing a license plate number and use the findall() function to find all occurrences of the pattern in the text. Finally, we print the matches.



These examples show how to use regular expressions in Python to match phone numbers, emails, and license plate numbers. By adjusting the regular expression pattern, you can match different formats and patterns in text.

# Assignment

1. Keyword Matching:
- Read the "re_text.txt" file using the open() function.
- Search for the following keywords in the text and record the number of times those keywords occur in the text:

lions, physical characteristics, behavior, habitat, conservation, prides, apex predators, sub-Saharan Africa, Asiatic lions, savannas, grasslands, scrubland, vulnerable species, decreasing population trend, habitat loss, poaching, human-lion conflicts, ecosystem, organizations, awareness.


- Add the following text to the re_text.txt file on a new paragraph:

"In conclusion, lions are remarkable animals that have captured the imaginations of people around the world. Their physical characteristics, behavior, and habitat make them a unique and important part of the natural world. However, they are facing numerous threats to their survival, and conservation efforts are needed to ensure that these majestic creatures continue to thrive in the wild."

2. Identity Matching:
- Read the "phone_email_license.txt" file and then locate and record the number of occurrences of phone numbers, emails, and license plate numbers.
- Replace all the phone numbers in the "phone_email_license.txt" file with your own phone number and write the result into a different .txt file


3. Write a function that takes in text and matches dates written in the format MM/DD/YYYY, for example 05/15/2022.

4. Write a function that takes in text and matches websites within the text. The website format is: https://www.example.com

# Further Practice


The link below will provide several coding problems that you can use to perfect your programming skills:

https://www.hackerrank.com/domains/python

**Enjoy!**