
# Functions

## What is a Function

### Notes

* A **function** is a block of code that only runs when it's called.
* You can pass data (called **parameters**) into a function.
* The function can return data as a result.
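
As a minimal sketch of the three points above, the hypothetical function `add_bonus` below runs only when it is called, receives data through its parameters, and returns a result. The name and the values are illustrative assumptions, not part of the original notes.

```python
def add_bonus(salary, bonus):
    """Return the salary plus a flat bonus (hypothetical example)."""
    return salary + bonus

# Defining the function runs nothing; the body executes only on this call.
total = add_bonus(50000, 5000)
print(total)
```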

## Importance

Functions let us reuse code and make it more modular, which is important for complex data analysis and plotting routines.


## Types of Functions

| Type of Function             | Example Function              | Section            |
|------------------------------|-------------------------------|--------------------|
| Built-in functions           | `max()`                       | 1. Getting Started |
| User-defined functions       | `def my_function(): pass`     | 16. Functions      |
| Lambda functions             | `lambda x: x + 1`             | 17. Lambda         |
| Standard library functions   | `math.sqrt()`                 | 18. Modules        |
| Third-party library functions| `numpy.array()`               | 19. Library        |

Note: We won't cover generator, asynchronous, or recursive functions, as they are outside the scope of data analytics.
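
The sketch below puts the first four rows of the table side by side. The names `salaries`, `double`, and `add_one` are hypothetical; the third-party row is left out because it would require NumPy to be installed.

```python
import math

salaries = [10000, 120000, 50000]

# Built-in function: available without any import.
highest = max(salaries)

# User-defined function: declared with `def`.
def double(value):
    return value * 2

# Lambda function: an anonymous, single-expression function.
add_one = lambda x: x + 1

# Standard library function: imported from a module that ships with Python.
root = math.sqrt(16)

print(highest, double(21), add_one(1), root)
```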


# [Built-in functions](https://docs.python.org/3/library/functions.html)




In [1]:
help(all)

Help on built-in function all in module builtins:

all(iterable, /)
    Return True if bool(x) is True for all values x in the iterable.
    
    If the iterable is empty, return True.



In [2]:
import builtins
import types

# List the names in the builtins module that refer to built-in functions.
print([name for name in dir(builtins) if isinstance(getattr(builtins, name), types.BuiltinFunctionType)])

['__build_class__', '__import__', 'abs', 'aiter', 'all', 'anext', 'any', 'ascii', 'bin', 'breakpoint', 'callable', 'chr', 'compile', 'delattr', 'dir', 'divmod', 'eval', 'exec', 'format', 'getattr', 'globals', 'hasattr', 'hash', 'hex', 'id', 'isinstance', 'issubclass', 'iter', 'len', 'locals', 'max', 'min', 'next', 'oct', 'open', 'ord', 'pow', 'print', 'repr', 'round', 'setattr', 'sorted', 'sum', 'vars']


# User-defined functions

In [3]:
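salary_list = [10000, 120000, 130000, 50000, 197692]

def calculate_salary(salary, rate=.1):
  # Total salary is the base salary plus a bonus of `rate` (10% by default).
  total_salary = salary * (1 + rate)

  return total_salary

# Apply the function to every salary in the list.
total_salary_list = [calculate_salary(salary) for salary in salary_list]

total_salary_list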

[11000.0, 132000.0, 143000.0, 55000.00000000001, 217461.2]

# Lambda

* Lambda functions are anonymous (unnamed) functions limited to a single expression.
* Syntax: `lambda x: x + 1`

In [4]:
mul_two = lambda x: x*2
mul_two(2)

4

In [5]:
(lambda x: x*2)(3)

6

In [6]:
(lambda x, y: x * 2 + y * 3)(3, 4)

18

In [7]:
(lambda *args: sum(args))(1,2,3,4,5)

15

In [8]:
(lambda **kwargs: sum(kwargs.values()))(a=1, b=2, c=3)

6

In [9]:
(lambda **kwargs: kwargs.values())(a=1, b=2, c=3)

dict_values([1, 2, 3])

In [10]:
(lambda salary, rate: salary * (1 + rate))(1000, 0.1)

1100.0

In [11]:
total_salary_list = [(lambda x: x * (1 + 0.1))(salary) for salary in salary_list]

total_salary_list

[11000.0, 132000.0, 143000.0, 55000.00000000001, 217461.2]

In [12]:
job_data = [
    {
        "job_title": "Data Scientist",
        "job_skills": ["Python", "Machine Learning", "Statistics"],
        "remote": True
    },
    {
        "job_title": "Data Scientist",
        "job_skills": ["SQL", "Data Visualization", "Data Cleaning"],
        "remote": False
    },
    {
        "job_title": "Machine Learning Engineer",
        "job_skills": ["Python", "Machine Learning", "Cloud Computing"],
        "remote": True
    },
    {
        "job_title": "Data Engineer",
        "job_skills": ["Python", "SQL", "Data Warehousing"],
        "remote": False
    },
    {
        "job_title": "Business Intelligence Analyst",
        "job_skills": ["Excel", "Power BI", "Data Analysis"],
        "remote": True
    }
]


help(filter)

Help on class filter in module builtins:

class filter(object)
 |  filter(function or None, iterable) --> filter object
 |  
 |  Return an iterator yielding those items of iterable for which function(item)
 |  is true. If function is None, return the items that are true.
 |  
 |  Methods defined here:
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __iter__(self, /)
 |      Implement iter(self).
 |  
 |  __next__(self, /)
 |      Implement next(self).
 |  
 |  __reduce__(...)
 |      Return state information for pickling.
 |  
 |  ----------------------------------------------------------------------
 |  Static methods defined here:
 |  
 |  __new__(*args, **kwargs) from builtins.type
 |      Create and return a new object.  See help(type) for accurate signature.



In [13]:
list(filter(lambda job: job['remote'], job_data))

[{'job_title': 'Data Scientist',
  'job_skills': ['Python', 'Machine Learning', 'Statistics'],
  'remote': True},
 {'job_title': 'Machine Learning Engineer',
  'job_skills': ['Python', 'Machine Learning', 'Cloud Computing'],
  'remote': True},
 {'job_title': 'Business Intelligence Analyst',
  'job_skills': ['Excel', 'Power BI', 'Data Analysis'],
  'remote': True}]

In [14]:
list(filter(lambda job: job['remote'] and 'Python' in job["job_skills"], job_data))

[{'job_title': 'Data Scientist',
  'job_skills': ['Python', 'Machine Learning', 'Statistics'],
  'remote': True},
 {'job_title': 'Machine Learning Engineer',
  'job_skills': ['Python', 'Machine Learning', 'Cloud Computing'],
  'remote': True}]
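
The `help(filter)` text says that `filter` yields the items for which the function returns a true value, and that passing `None` keeps the items that are themselves true. A minimal sketch of both behaviours, alongside the equivalent list comprehension, using a shortened, hypothetical `jobs` list rather than the full `job_data`:

```python
jobs = [
    {"job_title": "Data Scientist", "job_skills": ["Python", "Statistics"], "remote": True},
    {"job_title": "Data Engineer", "job_skills": ["Python", "SQL"], "remote": False},
    {"job_title": "BI Analyst", "job_skills": ["Excel"], "remote": True},
]

# filter() keeps the items for which the lambda returns a truthy value.
remote_python = list(filter(lambda job: job["remote"] and "Python" in job["job_skills"], jobs))

# The same selection written as a list comprehension.
remote_python_lc = [job for job in jobs if job["remote"] and "Python" in job["job_skills"]]

# Passing None as the function keeps only the items that are truthy themselves.
truthy_values = list(filter(None, [0, 1, "", "sql", None, []]))

print([job["job_title"] for job in remote_python])
print(truthy_values)
```

Whether to use `filter` with a lambda or a list comprehension is largely a matter of style; both produce the same list here.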

# Module

In [15]:
import my_module

my_module.skill_list



['python', 'sql', 'java']

In [16]:
my_module.skill('python')

'python is my favourite skill'

In [17]:
from job_analyzer import calculate_salary, calculate_bonus

calculate_salary(100)

# calculate_bonus(1100, 1000)

110.00000000000001

In [18]:
help(calculate_salary)

Help on function calculate_salary in module job_analyzer:

calculate_salary(salary, rate=0.1)
    Calculate the total salary based on the base salary and bonus
    
    Args:
    salary (Float): base Salary.
    rate (Float): The bonus rate. Default is .1
    
    Returns:
    float: The total salary
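
The source of `job_analyzer` is not shown in this notebook. As a sketch, assuming its `calculate_salary` mirrors the function defined earlier, a docstring placed directly under the `def` line is what `help()` reads to produce the text above:

```python
def calculate_salary(salary, rate=0.1):
    """Calculate the total salary based on the base salary and bonus.

    Args:
        salary (float): Base salary.
        rate (float): The bonus rate. Default is 0.1.

    Returns:
        float: The total salary.
    """
    return salary * (1 + rate)

# help() displays the signature followed by this docstring;
# the raw text is also available through the __doc__ attribute.
print(calculate_salary.__doc__)
```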



In [20]:
salary_list = [9800, 1000, 5670, 1234, 4321]

import statistics

statistics.mean(salary_list)

4405

In [21]:
help(statistics)

Help on module statistics:

NAME
    statistics - Basic statistics module.

MODULE REFERENCE
    https://docs.python.org/3.10/library/statistics.html
    
    The following documentation is automatically generated from the Python
    source files.  It may be incomplete, incorrect or include features that
    are considered implementation detail and may vary between Python
    implementations.  When in doubt, consult the module reference at the
    location listed above.

DESCRIPTION
    This module provides functions for calculating statistics of data, including
    averages, variance, and standard deviation.
    
    Calculating averages
    --------------------
    
    Function            Description
    mean                Arithmetic mean (average) of data.
    fmean               Fast, floating point arithmetic mean.
    geometric_mean      Geometric mean of data.
    harmonic_mean       Harmonic mean of data.
    median              Median (middle value) of data.
    median_low  

In [22]:
from statistics import mean, median, mode

mean(salary_list)

4405

In [23]:
median(salary_list)

4321

In [24]:
mode(salary_list)

9800
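
Every value in `salary_list` occurs exactly once, so there is no single most common value. In Python 3.8 and later, `mode` returns the first mode it encounters in that case, which is why the result is `9800`. A short sketch of this behaviour, with `statistics.multimode` for comparison (the list names are hypothetical):

```python
from statistics import mode, multimode

unique_salaries = [9800, 1000, 5670, 1234, 4321]
repeated_salaries = [9800, 1000, 1000, 1234, 4321]

# With all values unique, mode() returns the first value (Python 3.8+).
print(mode(unique_salaries))

# multimode() returns every value tied for most common, in first-seen order.
print(multimode(unique_salaries))

# With a repeated value, mode() returns that value.
print(mode(repeated_salaries))
```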

In [28]:
data_science_jobs = [
    {'job_title': 'Data Scientist', 'job_skills': "['Python', 'SQL', 'Machine Learning']", 'job_date': '2023-05-12'},
    {'job_title': 'Machine Learning Engineer', 'job_skills': "['Python', 'TensorFlow', 'Deep Learning']", 'job_date': '2023-05-15'},
    {'job_title': 'Data Analyst', 'job_skills': "['SQL', 'R', 'Tableau']", 'job_date': '2023-05-10'},
    {'job_title': 'Business Intelligence Developer', 'job_skills': "['SQL', 'PowerBI', 'Data Warehousing']", 'job_date': '2023-05-08'},
    {'job_title': 'Data Engineer', 'job_skills': "['Python', 'Spark', 'Hadoop']", 'job_date': '2023-05-18'},
    {'job_title': 'AI Specialist', 'job_skills': "['Python', 'PyTorch', 'AI Ethics']", 'job_date': '2023-05-20'}
]

In [37]:
from datetime import datetime, date

datetime.now()


datetime.datetime(2024, 6, 9, 11, 16, 57, 21021)

In [34]:
type(data_science_jobs[0]["job_date"])

str

In [36]:
print(datetime.strptime(data_science_jobs[0]["job_date"], "%Y-%m-%d"))

2023-05-12 00:00:00


In [39]:
for job in data_science_jobs:
  job["job_date"] = datetime.strptime(job["job_date"], "%Y-%m-%d")

In [40]:
data_science_jobs

[{'job_title': 'Data Scientist',
  'job_skills': "['Python', 'SQL', 'Machine Learning']",
  'job_date': datetime.datetime(2023, 5, 12, 0, 0)},
 {'job_title': 'Machine Learning Engineer',
  'job_skills': "['Python', 'TensorFlow', 'Deep Learning']",
  'job_date': datetime.datetime(2023, 5, 15, 0, 0)},
 {'job_title': 'Data Analyst',
  'job_skills': "['SQL', 'R', 'Tableau']",
  'job_date': datetime.datetime(2023, 5, 10, 0, 0)},
 {'job_title': 'Business Intelligence Developer',
  'job_skills': "['SQL', 'PowerBI', 'Data Warehousing']",
  'job_date': datetime.datetime(2023, 5, 8, 0, 0)},
 {'job_title': 'Data Engineer',
  'job_skills': "['Python', 'Spark', 'Hadoop']",
  'job_date': datetime.datetime(2023, 5, 18, 0, 0)},
 {'job_title': 'AI Specialist',
  'job_skills': "['Python', 'PyTorch', 'AI Ethics']",
  'job_date': datetime.datetime(2023, 5, 20, 0, 0)}]
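
One reason to convert the date strings is that `datetime` objects compare chronologically, so they can be sorted and filtered directly. A minimal sketch on a shortened, hypothetical `jobs` list; the cutoff date is an arbitrary assumption:

```python
from datetime import datetime

jobs = [
    {"job_title": "Data Scientist", "job_date": "2023-05-12"},
    {"job_title": "Data Analyst", "job_date": "2023-05-10"},
    {"job_title": "AI Specialist", "job_date": "2023-05-20"},
]

# Parse each date string into a datetime object.
for job in jobs:
    job["job_date"] = datetime.strptime(job["job_date"], "%Y-%m-%d")

# Sort by posting date, newest first.
newest_first = sorted(jobs, key=lambda job: job["job_date"], reverse=True)

# Keep only the jobs posted after the cutoff.
after_cutoff = [job for job in jobs if job["job_date"] > datetime(2023, 5, 11)]

# strftime() converts a datetime back into a formatted string.
print(newest_first[0]["job_title"], newest_first[0]["job_date"].strftime("%Y/%m/%d"))
```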

# Abstract Syntax Tree

In [43]:
import ast

for job in data_science_jobs:
  job['job_skills'] = ast.literal_eval(job["job_skills"])

data_science_jobs

[{'job_title': 'Data Scientist',
  'job_skills': ['Python', 'SQL', 'Machine Learning'],
  'job_date': datetime.datetime(2023, 5, 12, 0, 0)},
 {'job_title': 'Machine Learning Engineer',
  'job_skills': ['Python', 'TensorFlow', 'Deep Learning'],
  'job_date': datetime.datetime(2023, 5, 15, 0, 0)},
 {'job_title': 'Data Analyst',
  'job_skills': ['SQL', 'R', 'Tableau'],
  'job_date': datetime.datetime(2023, 5, 10, 0, 0)},
 {'job_title': 'Business Intelligence Developer',
  'job_skills': ['SQL', 'PowerBI', 'Data Warehousing'],
  'job_date': datetime.datetime(2023, 5, 8, 0, 0)},
 {'job_title': 'Data Engineer',
  'job_skills': ['Python', 'Spark', 'Hadoop'],
  'job_date': datetime.datetime(2023, 5, 18, 0, 0)},
 {'job_title': 'AI Specialist',
  'job_skills': ['Python', 'PyTorch', 'AI Ethics'],
  'job_date': datetime.datetime(2023, 5, 20, 0, 0)}]
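
`ast.literal_eval` only accepts Python literals such as strings, numbers, lists, and dicts, which makes it a safer choice than `eval` for parsing text like the `job_skills` strings. A small sketch of both the accepted and the rejected case (the variable names are hypothetical):

```python
import ast

skills_text = "['Python', 'SQL', 'Machine Learning']"

# literal_eval parses the string into a real list.
skills = ast.literal_eval(skills_text)
print(type(skills), skills[0])

# Anything other than a plain literal is rejected instead of being executed.
try:
    ast.literal_eval("__import__('os').getcwd()")
except ValueError as error:
    rejected = True
    print("Rejected:", type(error).__name__)
else:
    rejected = False
```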

# Library