# **Python Functions**

## **Lambda Function _λ_**
- A shortcut for defining a small, anonymous function
- Useful when the function is only needed briefly and fits in a single expression _(one line of code)_

``lambda parameters: expression``

In [1]:
def double(x):
    return x * 2
double(7)

14

The same function written as a lambda:

In [2]:
dob = lambda x: x * 2
dob(7)

14

In [3]:
# The comparison already returns a bool, so no if/else is needed
age = lambda ag: ag >= 18
age(19)

True
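
Since a lambda is meant for short-lived use, a common pattern is to pass it directly as an argument instead of assigning it to a name. The sketch below is only an illustration: the ``prices`` list and the variable names are made up for this example.

```python
# Hypothetical sample data: (item, price) pairs
prices = [('shirt', 20.29), ('socks', 10.90), ('jacket', 34.50)]

# The lambda exists only for this call: sort by price (index 1)
by_price = sorted(prices, key=lambda item: item[1])
print(by_price)

# Same idea with max(): pick the most expensive item
priciest = max(prices, key=lambda item: item[1])
print(priciest)
```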

## **Map Function**
- Applies a _function_ to each _item_ in an _iterable_
- The iterable can be a list, a tuple, etc.

``map(function, iterable)``

Example 1: Convert dollar prices to euros

In [4]:
store = [
 ('shirt', 20.29),
 ('pants', 22.99),
 ('socks', 10.90),
 ('jacket', 34.50)
]
# Lambda function
to_euros = lambda data: (data[0], data[1] * 0.93)
# Map function
store_euros = list(map(to_euros, store))
for i in store_euros:
    print(i)

('shirt', 18.8697)
('pants', 21.3807)
('socks', 10.137)
('jacket', 32.085)


Example 2: Convert a list of numeric strings to a set of floats

In [7]:
codi = input().split()
codi

['10', '20', '30', '40', '50']

In [8]:
a = set(map(float, codi))
a

{10.0, 20.0, 30.0, 40.0, 50.0}
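
Both examples wrap ``map`` in ``list()`` or ``set()`` because ``map`` returns a lazy iterator, not a finished collection. The following minimal sketch shows that behavior; the ``codes`` list is a hypothetical stand-in for the ``input().split()`` result.

```python
codes = ['10', '20', '30']   # hypothetical input, already split

lazy = map(float, codes)     # nothing is converted yet
first = next(lazy)           # converts only the first item
rest = list(lazy)            # consumes the remaining items
again = list(lazy)           # the iterator is exhausted, so this is empty

# A list comprehension produces the same values as list(map(...))
same = [float(c) for c in codes]

print(first, rest, again, same)
```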

## **Filter Function**
- Collects the elements of an iterable for which a function returns ``True``.

``filter(function, iterable)``

In [61]:
friend = [
    ('Jose', 19),
    ('Hannah', 23),
    ('Elie', 14),
    ('Barney', 17),
    ('Ted', 21),
    ('Robbin', 18)
]
adult = lambda data: data[1] >= 18
drink = list(filter(adult, friend))
drink

[('Jose', 19), ('Hannah', 23), ('Ted', 21), ('Robbin', 18)]
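
As a rough comparison, the same selection can be written as a list comprehension, and ``filter`` also accepts ``None`` in place of a function to keep only truthy items. The data below is a shortened, hypothetical version of the ``friend`` list.

```python
friends = [('Jose', 19), ('Elie', 14), ('Ted', 21)]  # hypothetical subset

# filter() with a lambda
adults = list(filter(lambda person: person[1] >= 18, friends))

# Equivalent list comprehension
adults_lc = [person for person in friends if person[1] >= 18]

# filter(None, ...) keeps only the truthy items
non_empty = list(filter(None, ['a', '', 'b', None, 0]))

print(adults, adults_lc, non_empty)
```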

## **Zip Function**
- Aggregates the elements of n iterables (list, tuple, dict, set, etc.) and returns an iterator of **_tuples_**, which can then be converted to a list, set, tuple, etc.

``zip(iterables)``

In [62]:
nick = ['drako', 'yogui', 'osogu', 'godotony']
passw = ['p@ssword', 'coco1', 'guest', 'o1o1']
web = ('ghub', 'reddit', 'yt')

userss = set(zip(nick, passw))
userss

{('drako', 'p@ssword'),
 ('godotony', 'o1o1'),
 ('osogu', 'guest'),
 ('yogui', 'coco1')}

- If the iterables don't have the same size, the length of the output is that of the shortest iterable.
- This can be avoided with ``zip_longest``, which pads the shorter iterables with a fill value.

In [None]:
userss = tuple(zip(nick, passw, web))
userss  # length 3, because web only has 3 elements

(('drako', 'p@ssword', 'ghub'),
 ('yogui', 'coco1', 'reddit'),
 ('osogu', 'guest', 'yt'))

In [None]:
from itertools import zip_longest
userss = tuple(zip_longest(nick, passw, web, fillvalue = 'Idk'))
userss

(('drako', 'p@ssword', 'ghub'),
 ('yogui', 'coco1', 'reddit'),
 ('osogu', 'guest', 'yt'),
 ('godotony', 'o1o1', 'Idk'))
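
Two related idioms are worth a quick sketch: "unzipping" with the ``*`` operator and building a dictionary from two iterables. This example uses shortened copies of ``nick`` and ``passw``; the names ``pairs``, ``names``, ``secrets`` and ``lookup`` are made up for illustration.

```python
nick = ['drako', 'yogui', 'osogu']
passw = ['p@ssword', 'coco1', 'guest']

pairs = list(zip(nick, passw))

# zip(*pairs) "unzips" back into one tuple per original iterable
names, secrets = zip(*pairs)

# dict() accepts an iterable of 2-tuples directly
lookup = dict(zip(nick, passw))

print(names)
print(secrets)
print(lookup['yogui'])
```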

## **Statistical Functions**

### **Min / Max**

In [9]:
import pandas as pd
import numpy as np

array_custom = np.array([
    [10, 15, 7, 22, 13],
    [27, 18, 30, 25, 11],
    [14, 23, 12, 20, 8],
    [31, 19, 26, 17, 16]    
])
col_names = ["Column1", "Column2", "Column3", "Column4", "Column5"]
df = pd.DataFrame(array_custom, columns = col_names)
df

| | Column1 | Column2 | Column3 | Column4 | Column5 |
|---|---|---|---|---|---|
| 0 | 10 | 15 | 7 | 22 | 13 |
| 1 | 27 | 18 | 30 | 25 | 11 |
| 2 | 14 | 23 | 12 | 20 | 8 |
| 3 | 31 | 19 | 26 | 17 | 16 |


In [10]:
minimum_element = np.amin(array_custom)  # smallest value in the whole array
print(
    f'np array amin() -> {minimum_element}\n\
    pd DataFrame min():\n{df.min()}\n\
    min().min() df -> {df.min().min()}'
)

np array amin() -> 7
    pd DataFrame min():
Column1    10
Column2    15
Column3     7
Column4    17
Column5     8
dtype: int32
    min().min() df -> 7


### **Axis 0 and 1**

In [11]:
print(
    f'axis = 0 (column): {np.amin(array_custom, axis=0)}\
    \naxis = 1 (row): {np.amin(array_custom, axis=1)}'
)

axis = 0 (column): [10 15  7 17  8]    
axis = 1 (row): [ 7 11  8 16]
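
To make the ``axis`` argument more concrete, here is a plain-Python sketch of what the two calls compute, under the assumption that the array is represented as a list of rows. ``axis=0`` collapses the rows and leaves one result per column; ``axis=1`` collapses the columns and leaves one result per row. The names ``rows``, ``min_axis0`` and ``min_axis1`` are illustrative only.

```python
rows = [
    [10, 15, 7, 22, 13],
    [27, 18, 30, 25, 11],
    [14, 23, 12, 20, 8],
    [31, 19, 26, 17, 16],
]

# axis=0: walk down each column, one result per column
min_axis0 = [min(col) for col in zip(*rows)]

# axis=1: walk along each row, one result per row
min_axis1 = [min(row) for row in rows]

print(min_axis0)  # [10, 15, 7, 17, 8]
print(min_axis1)  # [7, 11, 8, 16]
```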


### **Median**
Sorts the array or df in ascending order and then finds the middle value.
- If the number of elements is odd -> returns the single middle number.
- If the number of elements is even -> returns the mean of the 2 middle numbers.

In [12]:
print(np.median(array_custom))
print(np.median(array_custom,axis=1))
print(np.median(array_custom,axis=0))


17.5
[13. 25. 14. 19.]
[20.5 18.5 19.  21.  12. ]
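
The odd/even rule can be sketched by hand in plain Python. This is only an illustration of the rule, not how NumPy implements it; the ``median`` helper is hypothetical, and the sample values are the first row and first column of ``array_custom``.

```python
def median(values):
    """Median of a non-empty sequence of numbers."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single middle value
        return ordered[mid]
    # Even count: mean of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

row = [10, 15, 7, 22, 13]   # 5 values (odd)
column = [10, 27, 14, 31]   # 4 values (even)

print(median(row))     # 13
print(median(column))  # 20.5
```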


### **Variance**
First of all, the $^2$ in $s^2$ or $\sigma^2$ is part of the notation: variance is written as the square of the standard deviation ($s$ or $\sigma$), so the result is not squared again.

When the data is a sample (_``ddof = 1``_):

$$s^2 = \frac{\sum_{i=1}^{N}(x_i - \bar{x})^2}{N - 1}$$

When the data is the whole population (_``ddof = 0``_):

$$\sigma^2 = \frac{\sum_{i=1}^{N}(x_i - \mu)^2}{N}$$

Here $x_i$ is each element, $\bar{x}$ and $\mu$ are the sample and population means, and $N$ is the number of elements.

Pandas applies _Bessel's correction (N - 1)_ by default, which is appropriate when **working with a sample of the population** rather than the complete population.

By default Pandas works with ``ddof = 1``. ``ddof`` stands for _delta degrees of freedom_: the divisor is ``N - ddof``, so ``1`` means the data is treated as a sample.

In [40]:
df.var()

Column1    101.666667
Column2     10.916667
Column3    120.916667
Column4     11.333333
Column5     11.333333
dtype: float64

NumPy works with ``ddof = 0`` by default, which is the _Population Method (N)_: it assumes you have **all possible data** and not just a sample of it.

If you know all elements of the population, the population method provides the "true" variance of the population.

In [44]:
entire_ppl = np.var(array_custom, axis=0, ddof=0)  # ddof=0 is the default and can be omitted
sample = np.var(array_custom, axis=0, ddof=1)
print(f'entire population: {entire_ppl}\nsample population: {sample}')

entire population: [76.25    8.1875 90.6875  8.5     8.5   ]
sample population: [101.66666667  10.91666667 120.91666667  11.33333333  11.33333333]
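
As a sanity check, the two variance formulas can be worked through by hand. The sketch below is a plain-Python illustration, not the NumPy or Pandas implementation; the ``variance`` helper is hypothetical, and ``column1`` holds the values of ``Column1``.

```python
def variance(values, ddof=0):
    """Sum of squared deviations from the mean, divided by N - ddof."""
    n = len(values)
    mean = sum(values) / n
    squared_dev = sum((x - mean) ** 2 for x in values)
    return squared_dev / (n - ddof)

column1 = [10, 27, 14, 31]

population = variance(column1, ddof=0)  # divides by N
sample = variance(column1, ddof=1)      # divides by N - 1

print(population)  # 76.25
print(sample)
```

The two results agree with the first entries of the ``np.var`` outputs above (``76.25`` and ``101.66666667``).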


### **Standard Deviation**
Basically ``std`` is the square root of the variance:

$$\text{std} = \sqrt{\text{variance}}$$

Where:
- Variance can be $s^2$ or $\sigma^2$.
- The same rule applies here: ``ddof`` is ``1`` for a sample or ``0`` for all the data.

In [60]:
print(df.values.std())  # NumPy std over all values (ddof=0)
print(df.std(axis=1))   # Pandas std per row (ddof=1)

6.989992846920518
0    5.683309
1    7.661593
2    6.066300
3    6.457554
dtype: float64


In [59]:
print(np.std(array_custom,ddof=0))
print(np.std(array_custom, axis=1, ddof=1))

6.989992846920518
[5.6833089  7.66159252 6.06630036 6.45755372]
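
The relationship between ``std`` and variance can also be checked with Python's standard ``statistics`` module, used here instead of NumPy or Pandas so the example has no third-party dependencies. ``variance``/``stdev`` divide by ``N - 1`` (sample) and ``pvariance``/``pstdev`` divide by ``N`` (population). The ``row`` values are the first row of ``array_custom``.

```python
import math
import statistics

row = [10, 15, 7, 22, 13]

# Sample versions (equivalent to ddof=1)
sample_var = statistics.variance(row)
sample_std = statistics.stdev(row)

# Population versions (equivalent to ddof=0)
pop_var = statistics.pvariance(row)
pop_std = statistics.pstdev(row)

# In both cases the standard deviation is the square root of the variance
print(math.isclose(sample_std, math.sqrt(sample_var)))
print(math.isclose(pop_std, math.sqrt(pop_var)))
print(round(sample_std, 6))  # 5.683309
```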
