# Functions

In [2]:
# Starting data for examples
skill_list = ['Python', 'SQL', 'Excel']


## Types of Functions

| Type of Function             | Example Function              | Section            |
|------------------------------|-------------------------------|--------------------|
| Built-In functions           | `max()`                       | 1. Getting Started |
| User-defined functions       | `def my_function(): pass`     | 16. Functions      |
| Lambda functions             | `lambda x: x + 1`             | 17. Lambda         |
| Standard Library functions   | `math.sqrt()`                 | 18. Modules        |
| Third-Party Library Functions| `numpy.array()`               | 19. Library        |

Note: We won't be covering generator, asynchronous, or recursive functions, as they are outside the scope of data analytics.
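
As a quick orientation, the sketch below calls one function of each of the first four types in the table. The name `add_one` and the sample values are made up for illustration, and the third-party row (`numpy.array()`) is left out so the snippet runs without installing anything.

```python
import math  # standard library module

# Built-in function: available without any import
largest = max(3, 7, 5)

# User-defined function (hypothetical example name)
def add_one(x):
    return x + 1

# Lambda function: the same logic as an anonymous one-liner
add_one_lambda = lambda x: x + 1

# Standard library function: its module must be imported first
root = math.sqrt(16)

print(largest, add_one(1), add_one_lambda(1), root)  # 7 2 2 4.0
```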

## Built-in Functions

These functions come standard with Python, with no import required. We've already used a few:

* `print()`: Displays output
* `type()`: Checks the data type of objects
* `range()`: Generates a sequence of numbers, useful in loops
* `len()`: Counts the number of elements in a data structure

[Here are all the built-in functions in Python](https://docs.python.org/3/library/functions.html).

In [4]:
# Example Built-in: print function
print(skill_list)

['Python', 'SQL', 'Excel']


In [None]:
# Example Built-in: len function
len(skill_list)

In [3]:
# Roughly what len() does under the hood: count the items one by one
count = 0
for skill in skill_list:
  count += 1

print(count)

3


In [13]:
help(help)

Help on _Helper in module _sitebuiltins object:

class _Helper(builtins.object)
 |  Define the builtin 'help'.
 |
 |  This is a wrapper around pydoc.help that provides a helpful message
 |  when 'help' is typed at the Python interactive prompt.
 |
 |  Calling help() at the Python prompt starts an interactive help session.
 |  Calling help(thing) prints help for the python object 'thing'.
 |
 |  Methods defined here:
 |
 |  __call__(self, *args, **kwds)
 |      Call self as a function.
 |
 |  __repr__(self)
 |      Return repr(self).
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |
 |  __dict__
 |      dictionary for instance variables
 |
 |  __weakref__
 |      list of weak references to the object



In [17]:
type(help)

_sitebuiltins._Helper

In [18]:
range(0,5)

range(0, 5)
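
`range()` returns a lazy range object rather than a list, which is why the output above simply echoes `range(0, 5)`. A small sketch of how to see the actual numbers by wrapping the call in `list()`; the variable names are illustrative only.

```python
# range(start, stop) counts from start up to, but not including, stop
numbers = list(range(0, 5))
print(numbers)  # [0, 1, 2, 3, 4]

# An optional third argument sets the step size
evens = list(range(0, 10, 2))
print(evens)  # [0, 2, 4, 6, 8]
```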

### Full List of Built-In Functions

The [official Python documentation](https://docs.python.org/3/library/functions.html) lists every built-in function. The same names can also be printed from within Python:

In [20]:
import builtins
import types

# Print a list of the built-in functions
# Note: this code is an example of a list comprehension
print([name for name in dir(builtins) if isinstance(getattr(builtins, name), types.BuiltinFunctionType)])

['__build_class__', '__import__', 'abs', 'aiter', 'all', 'anext', 'any', 'ascii', 'bin', 'breakpoint', 'callable', 'chr', 'compile', 'delattr', 'dir', 'divmod', 'eval', 'exec', 'format', 'getattr', 'globals', 'hasattr', 'hash', 'hex', 'id', 'isinstance', 'issubclass', 'iter', 'len', 'locals', 'max', 'min', 'next', 'oct', 'open', 'ord', 'pow', 'print', 'repr', 'round', 'setattr', 'sorted', 'sum', 'vars']


## Data Analytics Functions

Some built-in functions are especially useful for data analytics:

* `sum()`, `min()`, `max()`: Basic statistical operations
* `sorted()`: Sorts data

In [21]:
# Given data
data_salaries = [95000, 100000, 85000, 97000, 140000]

In [25]:
minimum = min(data_salaries)
maximum = max(data_salaries)
total = sum(data_salaries)
sorted_list = sorted(data_salaries)

print("salaries:", data_salaries)
print("min:", minimum)
print("max:", maximum)
print("sum:", total)
print("sorted salaries", sorted_list)

salaries: [95000, 100000, 85000, 97000, 140000]
min: 85000
max: 140000
sum: 517000
sorted salaries [85000, 95000, 97000, 100000, 140000]
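
`sorted()` also accepts a `reverse` argument, and it returns a new list instead of changing the original. A minimal sketch reusing the `data_salaries` list from above; the name `highest_first` is illustrative.

```python
data_salaries = [95000, 100000, 85000, 97000, 140000]

# reverse=True sorts from highest to lowest
highest_first = sorted(data_salaries, reverse=True)
print(highest_first)  # [140000, 100000, 97000, 95000, 85000]

# sorted() returns a new list; the original order is unchanged
print(data_salaries)  # [95000, 100000, 85000, 97000, 140000]
```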


## User-Defined Functions

In [26]:
# Given data and formulas

base_salary = 100000
bonus_rate = 0.1

total_salary = base_salary * (1 + bonus_rate)

total_salary

110000.00000000001
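
The trailing `...00000000001` in the output above is floating-point rounding: values such as `0.1` and `1.1` cannot be represented exactly in binary, so the product lands a hair above 110000. One common way to tidy the result for display is the built-in `round()`. The sketch below assumes two decimal places is enough precision, and `rounded_salary` is an illustrative name.

```python
base_salary = 100000
bonus_rate = 0.1

total_salary = base_salary * (1 + bonus_rate)

# Round to 2 decimal places for display
rounded_salary = round(total_salary, 2)
print(rounded_salary)  # 110000.0
```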

In [44]:
# User-defined function for total salary calculation
def calculate_salary(base_salary, bonus_rate=0.1):
  total_salary = base_salary * (1 + bonus_rate)
  return total_salary

In [46]:
salary_1 = 100000
rate_1 = 0.2


print(calculate_salary(salary_1, rate_1))

120000.0
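
The call above passes both arguments by position. Since `bonus_rate` has a default value of `0.1`, it can also be omitted, and either argument can be passed by keyword. A short sketch that repeats the function definition so it runs on its own; `default_total` and `custom_total` are illustrative names.

```python
def calculate_salary(base_salary, bonus_rate=0.1):
    total_salary = base_salary * (1 + bonus_rate)
    return total_salary

# Omitting bonus_rate falls back to the default of 0.1
default_total = calculate_salary(100000)

# Keyword arguments make the call self-documenting
custom_total = calculate_salary(base_salary=100000, bonus_rate=0.2)

print(default_total)
print(custom_total)  # 120000.0
```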


# Problems

## Job Title Contains (1.16.1) - Problem

In [63]:
def job_title_contains(job_title, keyword):
  return keyword in job_title

In [66]:

job_title = 'Data Scientist'
keyword = 'Data'

job_title_contains(job_title=job_title, keyword=keyword)

True
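
Note that `in` is case-sensitive for strings, so the keyword `'data'` would not match `'Data Scientist'`. If that matters, one possible variant lowercases both sides first. The function name `job_title_contains_ci` is hypothetical and not part of the original problem.

```python
def job_title_contains_ci(job_title, keyword):
    # Lowercase both sides so 'data' matches 'Data'
    return keyword.lower() in job_title.lower()

print('data' in 'Data Scientist')  # False
print(job_title_contains_ci('Data Scientist', 'data'))  # True
```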

## Average Salary (1.16.2) - Problem

In [70]:
def average_salary(salary_list):
  return sum(salary_list) / len(salary_list)

In [71]:
salaries = [95000, 120000, 105000, 90000, 130000]

average_salary(salaries)

108000.0
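
One edge case worth knowing: with an empty list, `len(salary_list)` is 0 and the division raises `ZeroDivisionError`. A sketch of a guarded variant follows. Returning `None` for an empty list is an assumption made here, and `average_salary_safe` is a hypothetical name.

```python
def average_salary_safe(salary_list):
    # Return None instead of dividing by zero when the list is empty
    if not salary_list:
        return None
    return sum(salary_list) / len(salary_list)

print(average_salary_safe([95000, 120000, 105000, 90000, 130000]))  # 108000.0
print(average_salary_safe([]))  # None
```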

## Salary Statistics (1.16.3) - Problem

In [79]:
def salary_statistics(salary_list):

  minimum = min(salary_list)
  maximum = max(salary_list)
  average = average_salary(salary_list)

  statistics = {
      "min": minimum,
      "max": maximum,
      "avg": average
  }

  return statistics

In [80]:
salaries = [95000, 120000, 105000, 90000, 130000]

salary_statistics(salaries)

{'min': 90000, 'max': 130000, 'avg': 108000.0}

## Job Posting Summary (1.16.4) - Problem

In [120]:
# Takes in a list of job posting dictionaries and returns a summary with the
# count, average salary, and list of unique locations
def job_posting_summary(postings):

  total_postings = len(postings)
  total_salary = sum([posting['salary'] for posting in postings])
  average_salary = total_salary / total_postings
  unique_locations = list({posting['location'] for posting in postings})

  summary = {
      "Count":total_postings,
      "Avg Salary":average_salary,
      "Unique Locations": unique_locations
  }

  return summary

In [121]:
# Given Data
job_postings = [
    {'title': 'Data Scientist', 'location': 'New York', 'salary': 95000},
    {'title': 'Data Analyst', 'location': 'San Francisco', 'salary': 85000},
    {'title': 'Machine Learning Engineer', 'location': 'New York', 'salary': 115000}
]

# Call User-Defined Function with given data
job_posting_summary(job_postings)


{'Count': 3,
 'Avg Salary': 98333.33333333333,
 'Unique Locations': ['San Francisco', 'New York']}
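
Because the unique locations are collected with a set, their order is not guaranteed and can differ between runs. If a stable order is preferred, one option is to sort the set. The sketch below is a small variation on the solution above, not a required part of it.

```python
job_postings = [
    {'title': 'Data Scientist', 'location': 'New York', 'salary': 95000},
    {'title': 'Data Analyst', 'location': 'San Francisco', 'salary': 85000},
    {'title': 'Machine Learning Engineer', 'location': 'New York', 'salary': 115000}
]

# The set comprehension removes duplicates; sorted() returns an ordered list
unique_locations = sorted({posting['location'] for posting in job_postings})
print(unique_locations)  # ['New York', 'San Francisco']
```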