### What is a Function

- A function is a block of code that only runs when it's called.
- You can pass data (called parameters) into a function.
- The function can return data as a result.
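
The three points above can be sketched in a few lines. The function name `add_bonus` and the numbers are made up for illustration.

```python
# Hypothetical example: the name add_bonus and the numbers are illustrative only.
def add_bonus(salary, bonus):
    """Return the salary with a flat bonus added."""
    return salary + bonus

# Nothing inside add_bonus runs until it is called.
# 95000 and 5000 are the data passed in; the result is the data returned.
result = add_bonus(95000, 5000)
print(result)  # 100000
```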

#### Importance
Functions let us reuse code and make it more modular, which matters for complex data analysis and plotting routines.



#### Types of Functions

| Type of Function             | Example Function              | Section            |
|------------------------------|-------------------------------|--------------------|
| Built-In functions           | `max()`                       | 1. Getting Started |
| User-defined functions       | `def my_function(): pass`     | 16. Functions      |
| Lambda functions             | `lambda x: x + 1`             | 17. Lambda         |
| Standard Library functions   | `math.sqrt()`                 | 18. Modules        |
| Third-Party Library Functions| `numpy.array()`               | 19. Library        |

Note: We won't be covering Generator, Asynchronous, or Recursive Functions as they are out of scope of Data Analytics.
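
As a rough side-by-side of the first four rows in the table above, here is a minimal sketch. The name `add_one` is hypothetical, and `numpy.array()` is left out because NumPy has to be installed separately.

```python
import math  # standard library module, ships with Python

# Built-in function: available without any import.
highest = max(3, 7, 5)

# User-defined function: created with def (hypothetical name).
def add_one(x):
    return x + 1

# Lambda function: a small anonymous function, bound to a name here for comparison.
add_one_lambda = lambda x: x + 1

# Standard library function: needs the import above.
root = math.sqrt(16)

print(highest, add_one(1), add_one_lambda(1), root)  # 7 2 2 4.0
```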

#### Built-in Functions

These come standard with Python. We've already used a few:

* `print()`: Displays output
* `type()`: Checks the data type of objects
* `range()`: Generates a sequence of numbers, useful in loops
* `len()`: Counts the number of elements in a data structure

[Here are all the built-in functions in Python](https://docs.python.org/3/library/functions.html).

In [1]:
skill_list = ['Python', 'SQL', 'Excel']
print(skill_list)

['Python', 'SQL', 'Excel']


In [2]:
type(skill_list)

list

In [3]:
len(skill_list)

3

In [4]:
range(0,5)

range(0, 5)
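
The cell above only echoes `range(0, 5)` because `range()` returns a lazy range object rather than a list. A small sketch of two ways to see the numbers it generates:

```python
numbers = range(0, 5)

# Converting to a list shows the values; the stop value 5 is excluded.
print(list(numbers))  # [0, 1, 2, 3, 4]

# The more common use is directly in a loop.
for number in range(0, 5):
    print(number)
```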

In [5]:
data_salaries = [95000, 100000, 85000, 97000, 140000]

In [6]:
min(data_salaries)

85000

In [7]:
max(data_salaries)

140000

In [8]:
sum(data_salaries)

517000

In [9]:
sorted(data_salaries)

[85000, 95000, 97000, 100000, 140000]

#### User-Defined Functions

You create these yourself, with a name and body of your choosing, for example `calculate_something_special()`.

#### WARNING
Do not give your function the same name as a built-in Python function or object.

For example, this is a bad idea:

```python
def print(input):
    return "Hello" + input
```

In this case the built-in `print()` function would be shadowed: every later call to `print()` in the session would run this version instead.
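
Here is a sketch of the same problem, using `max` instead of `print` so the example can still print its results. The function body is a made-up placeholder. The original function stays reachable through the standard `builtins` module, and in a notebook, running `del max` removes the offending definition.

```python
import builtins

# Bad idea, shown on purpose: this definition hides the built-in max().
def max(values):
    return "shadowed"

shadowed_result = max([3, 1, 2])           # calls our function, not the built-in
original_result = builtins.max([3, 1, 2])  # the built-in is still available here

print(shadowed_result)  # shadowed
print(original_result)  # 3
```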

In [10]:
base_salary = 100000
bonus_rate = 0.1

total_salary = base_salary * (1 + bonus_rate)

total_salary

110000.00000000001
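
The trailing `00000000001` above is not a mistake in the formula: `0.1` has no exact binary floating point representation, so a tiny rounding error shows up in the result. A minimal sketch of tidying the value for display with the built-in `round()` or an f-string format:

```python
base_salary = 100000
bonus_rate = 0.1

total_salary = base_salary * (1 + bonus_rate)

# Round to two decimals for display; the formula itself is unchanged.
print(round(total_salary, 2))  # 110000.0
print(f"{total_salary:,.2f}")  # 110,000.00
```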

In [12]:
def calculate_salary():
    base_salary = 100000
    bonus_rate = 0.1

    total_salary = base_salary * (1 + bonus_rate)
    
    return total_salary

In [13]:
calculate_salary()

110000.00000000001

In [18]:
# define the parameters in the function signature
def calculate_salary(base_salary, bonus_rate):

    total_salary = base_salary * (1 + bonus_rate)
    
    return total_salary


In [19]:
calculate_salary(110000, 0.2)

132000.0

In [20]:
# we can give a parameter a default value, making the argument optional (e.g. a standard bonus rate)
def calculate_salary(base_salary, bonus_rate=.1):

    total_salary = base_salary * (1 + bonus_rate)
    
    return total_salary


In [21]:
calculate_salary(100000)

110000.00000000001

In [None]:
# the default can be overridden by passing a value
calculate_salary(100000, 0.2)

120000.0
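
The calls above pass arguments by position. As a small sketch, the same function can also be called with keyword arguments, which name each value and make the call easier to read:

```python
def calculate_salary(base_salary, bonus_rate=.1):
    total_salary = base_salary * (1 + bonus_rate)
    return total_salary

# Keyword arguments name each value explicitly, so their order no longer matters.
with_keywords = calculate_salary(bonus_rate=0.2, base_salary=100000)
positional = calculate_salary(100000, 0.2)

print(with_keywords == positional)  # True
```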

### Practice

Create a function job_title_contains that takes a job title and a keyword as arguments, and returns True if the job title contains the keyword, otherwise returns False. 

To confirm the function works, set the job_title to 'Data Scientist' and the keyword to 'Data'.

In [27]:
job_title = 'Data Scientist'
keyword = 'Data'

def job_title_contains(job_title, keyword):
    if keyword in job_title:
        return True
    else:
        return False

In [24]:
job_title_contains(job_title, keyword)

True

In [25]:
job_title_contains('Data Analyst', 'Analyst')

True

In [28]:
job_title_contains('Data Analyst', 'Engineer')

False

In [29]:
# simpler way
def job_title_contains(job_title, keyword):
    return keyword in job_title


In [30]:
job_title_contains(job_title, keyword)

True

In [31]:
job_title_contains('Data Analyst', 'Engineer')

False
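
One thing to keep in mind: the `in` check is case-sensitive, so the keyword `'data'` would not match `'Data Scientist'`. A possible case-insensitive variant is sketched below; the name `job_title_contains_ci` is hypothetical.

```python
def job_title_contains_ci(job_title, keyword):
    # Hypothetical variant: lowercase both sides before comparing.
    return keyword.lower() in job_title.lower()

print('data' in 'Data Scientist')                       # False
print(job_title_contains_ci('Data Scientist', 'data'))  # True
```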

Create a function average_salary that takes a list of salaries and returns the average salary. Use the salaries [95000, 120000, 105000, 90000, 130000].

In [32]:
salaries = [95000, 120000, 105000, 90000, 130000]

def average_salary(salaries):
    return sum(salaries)/len(salaries)


In [33]:
average_salary(salaries)

108000.0
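
One edge case worth knowing: with an empty list, `len(salaries)` is 0 and the division raises `ZeroDivisionError`. Below is a minimal sketch of one way to guard against it. The name `average_salary_safe` and the choice to return `None` are assumptions, not the only reasonable design.

```python
def average_salary_safe(salaries):
    # Hypothetical variant: return None instead of raising ZeroDivisionError.
    if not salaries:
        return None
    return sum(salaries) / len(salaries)

print(average_salary_safe([95000, 120000, 105000, 90000, 130000]))  # 108000.0
print(average_salary_safe([]))  # None
```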

Create a function salary_statistics that takes a list of salaries and returns a dictionary with the minimum, maximum, and average salary. The list of salaries is set to [95000, 120000, 105000, 90000, 130000].

In [34]:
def salary_statistics(salaries):
    # Return a dictionary of statistics
    statistics = {'minimum': min(salaries), 'maximum': max(salaries), 'average': sum(salaries) / len(salaries)}
    return statistics

In [35]:
salaries = [95000, 120000, 105000, 90000, 130000]
salary_statistics(salaries)

{'minimum': 90000, 'maximum': 130000, 'average': 108000.0}

In [36]:
# Different way: return the dictionary directly (note the shorter 'min' and 'max' keys)

def salary_statistics(salaries):
    return {
        'min': min(salaries),
        'max': max(salaries),
        'average': sum(salaries) / len(salaries)
    }

Create a function job_posting_summary that takes a list of job postings, where each posting is a dictionary with keys 'title', 'location', and 'salary', and returns a summary dictionary with the total number of postings, the average salary, and a list of unique locations. The job_postings is set to [{'title': 'Data Scientist', 'location': 'New York', 'salary': 95000}, {'title': 'Data Analyst', 'location': 'San Francisco', 'salary': 85000}, {'title': 'Machine Learning Engineer', 'location': 'New York', 'salary': 115000}].

In [41]:
job_postings = [
    {'title': 'Data Scientist', 'location': 'New York', 'salary': 95000},
    {'title': 'Data Analyst', 'location': 'San Francisco', 'salary': 85000},
    {'title': 'Machine Learning Engineer', 'location': 'New York', 'salary': 115000}
]

# what we need to get: {total_postings: x, average_salary: y, unique_locations: z}

def job_posting_summary(job_postings):
    total_postings = len(job_postings)
    average_salary = sum(posting['salary'] for posting in job_postings) / len(job_postings)
    unique_locations = set(posting['location'] for posting in job_postings)
    return {
        'total_postings': total_postings,
        'average_salary': average_salary,
        'unique_locations': unique_locations
    }

job_posting_summary(job_postings)



{'total_postings': 3,
 'average_salary': 98333.33333333333,
 'unique_locations': {'New York', 'San Francisco'}}

In [42]:
# other way, very similar: unique_locations is returned as a list instead of a set
def job_posting_summary(job_postings):
    total_postings = len(job_postings)
    total_salary = sum(posting['salary'] for posting in job_postings)
    average_salary = total_salary / total_postings
    unique_locations = list(set(posting['location'] for posting in job_postings))
    return {
        'total_postings': total_postings,
        'average_salary': average_salary,
        'unique_locations': unique_locations
    }

job_postings = [
    {'title': 'Data Scientist', 'location': 'New York', 'salary': 95000},
    {'title': 'Data Analyst', 'location': 'San Francisco', 'salary': 85000},
    {'title': 'Machine Learning Engineer', 'location': 'New York', 'salary': 115000}
]
job_posting_summary(job_postings)

{'total_postings': 3,
 'average_salary': 98333.33333333333,
 'unique_locations': ['San Francisco', 'New York']}
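
Sets are unordered, so the order of `list(set(...))` can differ from one run to the next. If a stable order matters, one option is to sort the unique values, as in this sketch:

```python
job_postings = [
    {'title': 'Data Scientist', 'location': 'New York', 'salary': 95000},
    {'title': 'Data Analyst', 'location': 'San Francisco', 'salary': 85000},
    {'title': 'Machine Learning Engineer', 'location': 'New York', 'salary': 115000}
]

# sorted() accepts any iterable and always returns a list in a fixed order.
unique_locations = sorted({posting['location'] for posting in job_postings})
print(unique_locations)  # ['New York', 'San Francisco']
```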

### Extra Practice
🧪 Function Practice: Level 1

1. Greet someone<br>
Write a function greet(name) that prints:<br>
`Hello, <name>!`

In [8]:
def greet(name):
    print(f'Hello, {name}!')

In [10]:
greet('Amy')

Hello, Amy!
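
This exercise prints instead of returning, and the difference matters once the result needs to be reused. In the sketch below, `greet` follows the format from the prompt and `make_greeting` is a hypothetical variant that returns the text.

```python
def greet(name):
    print(f'Hello, {name}!')

def make_greeting(name):
    # Hypothetical variant that returns the text instead of printing it.
    return f'Hello, {name}!'

printed = greet('Amy')           # prints the greeting, but returns None
returned = make_greeting('Amy')  # prints nothing, returns the string

print(printed)   # None
print(returned)  # Hello, Amy!
```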


2. Check even/odd<br>
Write a function is_even(number) that returns True if the number is even, otherwise False.

In [16]:
def is_even(number):
    if number % 2 == 0:
        result = True
    else:
        result = False
    return result
    

In [17]:
is_even(2)

True

In [18]:
is_even(3)

False
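
As with `job_title_contains` earlier, the comparison already evaluates to `True` or `False`, so a shorter version can return it directly. A sketch:

```python
def is_even(number):
    # number % 2 == 0 is itself a boolean, so no if/else is needed.
    return number % 2 == 0

print(is_even(2))  # True
print(is_even(3))  # False
```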

3. Convert Celsius to Fahrenheit<br>
Write a function c_to_f(celsius) that returns the Fahrenheit equivalent.

(F = C × 9/5 + 32)

In [19]:
def c_to_f(celsius):
    fahrenheit = celsius * 9 / 5 + 32
    return fahrenheit

In [20]:
c_to_f(36)

96.8

4. Find the longer word<br>
Write a function longer_word(word1, word2) that returns the longer of the two words. If they’re equal, return either.

In [24]:
def longer_word(word1, word2):
    if len(word1) > len(word2):
        long_word = word1
    else:
        long_word = word2
    return long_word

In [25]:
longer_word('mamma', 'mia')

'mamma'

In [26]:
longer_word('ala','portocala')

'portocala'
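
A more compact alternative, sketched here, uses the built-in `max()` with `key=len`. One difference from the version above: on a tie, `max()` returns the first argument (`word1`) rather than `word2`, which the exercise allows.

```python
def longer_word(word1, word2):
    # key=len makes max() compare the words by length;
    # on a tie it returns the first argument, word1.
    return max(word1, word2, key=len)

print(longer_word('mamma', 'mia'))      # mamma
print(longer_word('ala', 'portocala'))  # portocala
```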

5. Get initials<br>
Write a function get_initials(first_name, last_name) that returns the initials as a string, like "J.D." for "John Doe".

In [31]:
def get_initials(first_name, last_name):
    initials = first_name[0] + "." + last_name[0] + "."
    return initials

In [32]:
get_initials('Sarah', 'Connor')

'S.C.'

🔹 Function Practice: Level 2<br>
1. Count vowels<br>
Write a function count_vowels(word) that returns how many vowels (a, e, i, o, u) are in the given word.

In [33]:
def count_vowels(word):
    count = 0
    for letter in word:
        if letter in 'aeiouAEIOU':
            count += 1
    return count



In [35]:
count_vowels('Alabama')

4
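
The same count can be written without an explicit counter by combining `sum()` with a generator expression. This is a sketch of an equivalent version:

```python
def count_vowels(word):
    # Add 1 for every letter that is a vowel, upper or lower case.
    return sum(1 for letter in word if letter in 'aeiouAEIOU')

print(count_vowels('Alabama'))  # 4
```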

2. Format full name<br>
Function: format_name(first, last)

🔍 Hint: Use the .title() string method on each name to capitalize correctly.

✅ Combine them with a space in between.

In [36]:
def format_name(first, last):
    full_name = first.title() + " " + last.title()
    return full_name

In [37]:
format_name('ANNA', 'obrien')

'Anna Obrien'

3. Check for palindrome<br>
Function: is_palindrome(word)

🔍 Hint: Reverse the word using slicing (word[::-1]).

✅ Compare the original to the reversed version.



In [38]:
def is_palindrome(word):
    reverse = word[::-1]
    if word == reverse:
        palindrome = 'Yes'
    else:
        palindrome = 'No'
    return palindrome

In [39]:
is_palindrome('level')

'Yes'

In [40]:
is_palindrome('word')

'No'
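
Functions named `is_...` usually return a boolean so they can be used directly in an `if` statement. The sketch below is a possible variant that does this and also ignores case; the name `is_palindrome_bool` is hypothetical.

```python
def is_palindrome_bool(word):
    # Hypothetical variant: ignore case and return True/False instead of 'Yes'/'No'.
    normalized = word.lower()
    return normalized == normalized[::-1]

print(is_palindrome_bool('Level'))  # True
print(is_palindrome_bool('word'))   # False
```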

4. Filter long words<br>
Write a function long_words(words, length) that takes a list of words and returns a list of only the words longer than the given length.

In [None]:
def long_words(words, length):
    long_word = [word for word in words if len(word) > length]
    return long_word

In [42]:
words = ['pen', 'notebook', 'phone', 'table', 'chair']
length = 5
long_words(words,length)

['notebook']

In [43]:
long_words(["data", "science", "AI", "analytics"], 4)

['science', 'analytics']

5. Sum salaries<br>
Write a function total_salary(employees) that returns the sum of all salaries, where employees is a list of dictionaries like:

[{'name': 'Alice', 'salary': 70000}, {'name': 'Bob', 'salary': 80000}]

In [50]:
def total_salary(employees):
    salaries = sum(employee['salary'] for employee in employees)
    return salaries

In [51]:
employees = [{'name': 'Alice', 'salary': 70000}, {'name': 'Bob', 'salary': 80000}]
total_salary(employees)

150000
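
If a record has no `'salary'` key, `employee['salary']` raises `KeyError`. A possible variant, sketched below, uses `dict.get()` with a default of 0. The name `total_salary_safe` and the extra record are made up for illustration, and treating a missing salary as 0 is an assumption that may not suit every dataset.

```python
def total_salary_safe(employees):
    # Hypothetical variant: treat a missing 'salary' key as 0.
    return sum(employee.get('salary', 0) for employee in employees)

employees = [
    {'name': 'Alice', 'salary': 70000},
    {'name': 'Bob', 'salary': 80000},
    {'name': 'Cara'},  # made-up record with no salary
]
print(total_salary_safe(employees))  # 150000
```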