| Function   | Description                                                                      | Syntax                         | Example                                             | Output            |
|------------|----------------------------------------------------------------------------------|--------------------------------|-----------------------------------------------------|-------------------|
| `map()`    | Applies a function to each item in an iterable (e.g. a DataFrame column); returns a lazy iterator | `map(function, iterable)`      | `list(map(lambda x: x**2, df['col']))`              | `[1, 4, 9, 16]`   |
| `filter()` | Keeps only the items for which the function returns a truthy value; returns a lazy iterator | `filter(function, iterable)`   | `list(filter(lambda x: x % 2 == 0, df.loc['row']))` | `[2, 4]`          |
| `lambda`   | Anonymous function (short inline function)                                       | `lambda arguments: expression` | `lambda x: x + 10`, `lambda x, y: x + y`            | A function object |
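
The outputs in the table are lists, but in Python 3 `map()` and `filter()` return lazy iterators, so they need to be wrapped in `list()` to see the values. A minimal sketch, assuming a small hypothetical Series `s` chosen to match the outputs listed above:

```python
import pandas as pd

# Hypothetical sample data, chosen to match the outputs in the table.
s = pd.Series([1, 2, 3, 4])

squares = list(map(lambda x: x**2, s))          # materialize the map iterator
evens = list(filter(lambda x: x % 2 == 0, s))   # keep only the even values
add = lambda x, y: x + y                        # a two-argument lambda

print(squares)      # [1, 4, 9, 16]
print(evens)        # [2, 4]
print(add(3, 10))   # 13
```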



<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Document</title>
    <style>
        p {
            font-size: 22px;
            color: #00FF00;
            padding-top: 0px;
        }
        li {
            font-size: 18px;
            color: #00FF00;
            padding-top: 0px;
        }
    </style>
</head>
<body>
    <p><strong>operations</strong></p>
    <ul>
        <li><strong>conditional selection</strong></li>
        <li><strong>apply -> normal function, lambda</strong></li>
        <li><strong>drop column</strong></li>
        <li><strong>sort column</strong></li>
        <li><strong>isnull</strong></li>
        <li><strong>pivot table</strong></li>
    </ul>
</body>
</html>

In [1]:
import numpy as np
import pandas as pd

In [2]:
# Named `data` rather than `dict` so the built-in dict type is not shadowed.
data = {
    'col1': [1, 2, 3, 4],
    'col2': [444, 555, 666, 444],
    'col3': ['abc', 'def', 'ghi', 'xyz']
}
df = pd.DataFrame(data)
print(df)
print(df.head())

   col1  col2 col3
0     1   444  abc
1     2   555  def
2     3   666  ghi
3     4   444  xyz
   col1  col2 col3
0     1   444  abc
1     2   555  def
2     3   666  ghi
3     4   444  xyz


In [3]:
print(df['col2'].unique())
print(len(df['col2'].unique()))
print(df['col2'].value_counts())

[444 555 666]
3
col2
444    2
555    1
666    1
Name: count, dtype: int64
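
As a small aside, `nunique()` gives the distinct count directly, and `value_counts(normalize=True)` returns proportions instead of raw counts. A short sketch that rebuilds only `col2` from the frame above:

```python
import pandas as pd

df = pd.DataFrame({'col2': [444, 555, 666, 444]})

n_distinct = df['col2'].nunique()                  # same as len(df['col2'].unique())
shares = df['col2'].value_counts(normalize=True)   # relative frequency of each value

print(n_distinct)        # 3
print(shares.loc[444])   # 0.5
```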


<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Document</title>
    <style>
        p {
            font-size: 22px;
            color: #00FF00;
            padding-top: 0px;
        }
    </style>
</head>
<body>
    <p><strong>conditional selection</strong></p>
</body>
</html>

In [4]:
print(df)
print(df['col2'])
print(df[df['col1'] > 2])
# Wrap each selection in print(); a bare expression is shown only if it is the last line of the cell.
print(df[(df['col1'] > 2) & (df['col2'] == 444)])
print(df[(df['col3'] == 'ghi') & (df['col2'] > 0)])

   col1  col2 col3
0     1   444  abc
1     2   555  def
2     3   666  ghi
3     4   444  xyz
0    444
1    555
2    666
3    444
Name: col2, dtype: int64
   col1  col2 col3
2     3   666  ghi
3     4   444  xyz
   col1  col2 col3
3     4   444  xyz
   col1  col2 col3
2     3   666  ghi
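
The cell above combines conditions with `&` (AND). The same pattern works with `|` (OR), `~` (NOT) and `isin()` for membership tests. A minimal sketch on the same data; the variable names are only illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    'col1': [1, 2, 3, 4],
    'col2': [444, 555, 666, 444],
    'col3': ['abc', 'def', 'ghi', 'xyz'],
})

# Each condition needs its own parentheses because & | ~ bind tighter than comparisons.
either = df[(df['col1'] > 3) | (df['col2'] == 555)]   # OR
negated = df[~(df['col2'] == 444)]                    # NOT
members = df[df['col3'].isin(['abc', 'xyz'])]         # membership

print(either.index.tolist())    # [1, 3]
print(negated.index.tolist())   # [1, 2]
print(members.index.tolist())   # [0, 3]
```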


<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Document</title>
    <style>
        li {
            font-size: 18px;
            color: #00FF00;
            padding-top: 0px;
        }
    </style>
</head>
<body>
    <ul>
        <li><strong>lambda x: x**2</strong></li>
        <li><strong>map(lambda x: x**2, [10, 11, 12, 13])</strong></li>
        <li><strong>filter(lambda x: x % 2 == 0, [10, 11, 12, 13])</strong></li>
    </ul>
</body>
</html>

In [5]:
y = lambda t: t*2
print(y(2))
x = lambda s, t: s * t
print(x(5,10))

4
50


In [6]:
list_value = [10, 11, 12, 13, 14]
lambda_cal = lambda x: x**4
print(list(map(lambda_cal, list_value)))

[10000, 14641, 20736, 28561, 38416]


In [7]:
list_value = [10, 11, 12, 13, 14]
lambda_cal = lambda x: x%2 == 0
print(list(filter(lambda_cal, list_value)))

[10, 12, 14]
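
For plain Python lists, a list comprehension is a common alternative to `map()` and `filter()` with a lambda. A quick sketch showing that both forms give the same result for the list used above:

```python
list_value = [10, 11, 12, 13, 14]

# map() with a lambda versus a comprehension
fourth_map = list(map(lambda x: x**4, list_value))
fourth_comp = [x**4 for x in list_value]

# filter() with a lambda versus a comprehension with a condition
even_filter = list(filter(lambda x: x % 2 == 0, list_value))
even_comp = [x for x in list_value if x % 2 == 0]

print(fourth_comp)   # [10000, 14641, 20736, 28561, 38416]
print(even_comp)     # [10, 12, 14]
```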


<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Document</title>
    <style>
        p {
            font-size: 22px;
            color: #00FF00;
            padding-top: 0px;
        }
        li {
            font-size: 18px;
            color: #00FF00;
            padding-top: 0px;
        }
    </style>
</head>
<body>
    <p><strong>apply function</strong></p>
    <ul>
        <li><strong>function activities</strong></li>
        <li><strong>lambda, map, filter activities</strong></li>
    </ul>
</body>
</html>

In [8]:
print(df)
def col1_cal(x):
    return x * 2

col2_cal = lambda x: x*3

print(df['col1'].apply(col1_cal))
print(df['col2'].apply(col2_cal))
df

   col1  col2 col3
0     1   444  abc
1     2   555  def
2     3   666  ghi
3     4   444  xyz
0    2
1    4
2    6
3    8
Name: col1, dtype: int64
0    1332
1    1665
2    1998
3    1332
Name: col2, dtype: int64


   col1  col2 col3
0     1   444  abc
1     2   555  def
2     3   666  ghi
3     4   444  xyz
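
Note that `apply()` returns a new Series and leaves `df` unchanged, which is why the final `df` above still shows the original values. For simple arithmetic, a vectorized expression gives the same result, and `apply(..., axis=1)` runs a function once per row. A small sketch under those assumptions, using only the numeric columns:

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3, 4], 'col2': [444, 555, 666, 444]})

doubled_apply = df['col1'].apply(lambda x: x * 2)   # element by element
doubled_vec = df['col1'] * 2                        # vectorized, same result

# axis=1 passes each row to the function as a Series.
row_sum = df.apply(lambda row: row['col1'] + row['col2'], axis=1)

print(doubled_vec.tolist())   # [2, 4, 6, 8]
print(row_sum.tolist())       # [445, 557, 669, 448]
```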


<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Document</title>
    <style>
        p {
            font-size: 22px;
            color: #00FF00;
            padding-top: 0px;
        }
        li {
            font-size: 18px;
            color: #00FF00;
            padding-top: 0px;
        }
    </style>
</head>
<body>
    <p>column data update using -></p>
    <ul>
        <li><strong>map(lambda argument: expression, series)</strong></li>
        <li><strong>filter(lambda argument: expression, series)</strong></li>
    </ul>
</body>
</html>

In [9]:
print(df)
col_update = list(map(lambda x: x*2, df['col2']))
print(col_update)
df['col2'] = col_update
print(df)

   col1  col2 col3
0     1   444  abc
1     2   555  def
2     3   666  ghi
3     4   444  xyz
[888, 1110, 1332, 888]
   col1  col2 col3
0     1   888  abc
1     2  1110  def
2     3  1332  ghi
3     4   888  xyz
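
pandas also offers `Series.map()`, which applies the function and returns a Series that keeps the original index, so no `list()` conversion is needed before assigning back. A minimal sketch on a fresh copy of `col2`:

```python
import pandas as pd

df = pd.DataFrame({'col2': [444, 555, 666, 444]})

via_builtin = list(map(lambda x: x * 2, df['col2']))   # plain Python list
via_series = df['col2'].map(lambda x: x * 2)           # Series, index preserved

df['col2'] = via_series
print(via_builtin)           # [888, 1110, 1332, 888]
print(df['col2'].tolist())   # [888, 1110, 1332, 888]
```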


In [10]:
print(df)
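# filter() drops non-matching values, so its result can be shorter than the column.
# The assignment below succeeds only because every value in col2 is even.
col_update = list(filter(lambda x: x%2==0, df['col2']))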
print(col_update)
df['col2'] = col_update
print(df)

   col1  col2 col3
0     1   888  abc
1     2  1110  def
2     3  1332  ghi
3     4   888  xyz
[888, 1110, 1332, 888]
   col1  col2 col3
0     1   888  abc
1     2  1110  def
2     3  1332  ghi
3     4   888  xyz
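
The `filter()` cell works here only because every value in `col2` is even, so the filtered list still has four items. A sketch of what happens otherwise, using hypothetical data with one odd value, together with a boolean mask as an alternative that filters whole rows:

```python
import pandas as pd

# Hypothetical data: 1111 is odd, so filter() returns fewer values than there are rows.
df = pd.DataFrame({'col1': [1, 2, 3, 4], 'col2': [888, 1111, 1332, 888]})

kept = list(filter(lambda x: x % 2 == 0, df['col2']))
print(kept)   # [888, 1332, 888]

length_mismatch = False
try:
    df['col2'] = kept   # 3 values for 4 rows
except ValueError:
    length_mismatch = True
print(length_mismatch)   # True

# A boolean mask selects matching rows and keeps the index aligned.
even_rows = df[df['col2'] % 2 == 0]
print(even_rows.index.tolist())   # [0, 2, 3]
```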


<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Document</title>
    <style>
        p {
            font-size: 22px;
            color: #00FF00;
            padding-top: 0px;
        }
        li {
            font-size: 18px;
            color: #00FF00;
            padding-top: 0px;
        }
    </style>
</head>
<body>
    <p>row data update using -></p>
    <ul>
        <li><strong>map(lambda argument: expression, series)</strong></li>
        <li><strong>filter(lambda argument: expression, series)</strong></li>
    </ul>
</body>
</html>

In [11]:
print(df)
row_update = list(map(lambda x: x*2, df.loc[1]))
print(row_update)
df.loc[1] = row_update
print(df)

   col1  col2 col3
0     1   888  abc
1     2  1110  def
2     3  1332  ghi
3     4   888  xyz
[np.int64(4), np.int64(2220), 'defdef']
   col1  col2    col3
0     1   888     abc
1     4  2220  defdef
2     3  1332     ghi
3     4   888     xyz


In [12]:
df1 = pd.DataFrame({
    'col1': [1, 2, 3, 4],
    'col2': [444, 555, 666, 444]
})
print(df1)
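# Only the even value (2) passes the filter, so row_update is the one-element list [2].
row_update = list(filter(lambda x: x%2==0, df1.loc[1]))
print(row_update)
# As the output shows, assigning a one-element list sets every column of row 1 to that value.
df1.loc[1] = row_update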
print(df1)

   col1  col2
0     1   444
1     2   555
2     3   666
3     4   444
[2]
   col1  col2
0     1   444
1     2     2
2     3   666
3     4   444
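
In the output above, `filter()` returns a bare list, so the information about which column each value came from is lost, and the single surviving value ends up in both columns of row 1. A sketch of a label-preserving alternative, assuming the goal is to see which columns of the row hold even values:

```python
import pandas as pd

df1 = pd.DataFrame({'col1': [1, 2, 3, 4], 'col2': [444, 555, 666, 444]})

row = df1.loc[1]                  # Series with labels col1 and col2
even_only = row[row % 2 == 0]     # boolean mask keeps the column label

print(even_only.index.tolist())   # ['col1']
print(even_only.tolist())         # [2]
```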


<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Document</title>
    <style>
        p {
            font-size: 22px;
            color: #00FF00;
            padding-top: 0px;
        }
    </style>
</head>
<body>
    <p><strong>drop function</strong></p>
</body>
</html>

In [13]:
print(df)

   col1  col2    col3
0     1   888     abc
1     4  2220  defdef
2     3  1332     ghi
3     4   888     xyz


In [14]:
print(df.drop('col1', axis=1)) # Drop a column
print(df.drop(0, axis=0)) # Drop a row

   col2    col3
0   888     abc
1  2220  defdef
2  1332     ghi
3   888     xyz
   col1  col2    col3
1     4  2220  defdef
2     3  1332     ghi
3     4   888     xyz
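
`drop()` returns a new DataFrame and leaves `df` itself untouched, so the result has to be assigned to a name to be kept. The `columns=` and `index=` keywords are an equivalent, more explicit spelling of `axis=1` and `axis=0`. A minimal sketch on the same data:

```python
import pandas as pd

df = pd.DataFrame({
    'col1': [1, 4, 3, 4],
    'col2': [888, 2220, 1332, 888],
    'col3': ['abc', 'defdef', 'ghi', 'xyz'],
})

no_col1 = df.drop(columns='col1')   # same as df.drop('col1', axis=1)
no_row0 = df.drop(index=0)          # same as df.drop(0, axis=0)

print(list(no_col1.columns))    # ['col2', 'col3']
print(no_row0.index.tolist())   # [1, 2, 3]
print(list(df.columns))         # ['col1', 'col2', 'col3'] (original unchanged)
```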


<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Document</title>
    <style>
        p {
            font-size: 22px;
            color: #00FF00;
            padding-top: 0px;
        }
    </style>
</head>
<body>
    <p><strong>sort function</strong></p>
</body>
</html>

In [15]:
print(df)
print(df.columns)
print(df.index)

   col1  col2    col3
0     1   888     abc
1     4  2220  defdef
2     3  1332     ghi
3     4   888     xyz
Index(['col1', 'col2', 'col3'], dtype='object')
RangeIndex(start=0, stop=4, step=1)


In [16]:
print(df.sort_values('col2')) 

   col1  col2    col3
0     1   888     abc
3     4   888     xyz
2     3  1332     ghi
1     4  2220  defdef
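
`sort_values()` sorts in ascending order by default and, like `drop()`, returns a new DataFrame. A short sketch of descending and multi-column sorting on the same data:

```python
import pandas as pd

df = pd.DataFrame({
    'col1': [1, 4, 3, 4],
    'col2': [888, 2220, 1332, 888],
    'col3': ['abc', 'defdef', 'ghi', 'xyz'],
})

descending = df.sort_values('col2', ascending=False)
print(descending['col2'].tolist())   # [2220, 1332, 888, 888]

# Sort by col1 descending, breaking ties with col2 ascending.
multi = df.sort_values(['col1', 'col2'], ascending=[False, True])
print(multi.index.tolist())          # [3, 1, 2, 0]
```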


In [17]:
df.isnull()

    col1   col2   col3
0  False  False  False
1  False  False  False
2  False  False  False
3  False  False  False
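
The frame above has no missing values, so `isnull()` is `False` everywhere. A sketch with hypothetical data that does contain `NaN`, showing how the mask is typically summarized and acted on:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in each column.
df = pd.DataFrame({'col1': [1, np.nan, 3], 'col2': [np.nan, 555, 666]})

missing_per_col = df.isnull().sum()   # count of NaN values per column
filled = df.fillna(0)                 # replace NaN with 0
complete = df.dropna()                # keep only rows without NaN

print(missing_per_col.to_dict())   # {'col1': 1, 'col2': 1}
print(complete.index.tolist())     # [2]
```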


<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Document</title>
    <style>
        p {
            font-size: 22px;
            color: #00FF00;
            padding-top: 0px;
        }
    </style>
</head>
<body>
    <p><strong>pivot table</strong></p>
</body>
</html>

| Name  | Department | Month | Sales |
| ----- | ---------- | ----- | ----- |
| Alice | Sales      | Jan   | 100   |
| Alice | Sales      | Feb   | 200   |
| Bob   | Marketing  | Jan   | 300   |
| Bob   | Marketing  | Feb   | 400   |

`pivot = df.pivot_table(index='Department', columns='Month', values='Sales', aggfunc='sum')`

pandas sorts the resulting row and column labels alphabetically, so `Feb` comes before `Jan`:

| Department | Feb | Jan |
| ---------- | --- | --- |
| Marketing  | 400 | 300 |
| Sales      | 200 | 100 |
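
The example above can be reproduced as a runnable sketch. The frame name `sales` is only illustrative. Because `pivot_table` sorts the column labels alphabetically, the months are reordered explicitly at the end to get a calendar order:

```python
import pandas as pd

sales = pd.DataFrame({
    'Name': ['Alice', 'Alice', 'Bob', 'Bob'],
    'Department': ['Sales', 'Sales', 'Marketing', 'Marketing'],
    'Month': ['Jan', 'Feb', 'Jan', 'Feb'],
    'Sales': [100, 200, 300, 400],
})

pivot = sales.pivot_table(index='Department', columns='Month',
                          values='Sales', aggfunc='sum')
print(list(pivot.columns))   # ['Feb', 'Jan'] (alphabetical)

pivot = pivot[['Jan', 'Feb']]   # reorder the columns chronologically
print(pivot)
```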



In [20]:
# Named `data` rather than `dict` so the built-in dict type is not shadowed.
data = {
    'A': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'],
    'B': ['one', 'one', 'two', 'two', 'one', 'one'],
    'C': ['X', 'Y', 'X', 'Y', 'X', 'Y'],
    'D': [1, 3, 2, 5, 4, 1]
}

df = pd.DataFrame(data)
df

     A    B  C  D
0  foo  one  X  1
1  foo  one  Y  3
2  foo  two  X  2
3  bar  two  Y  5
4  bar  one  X  4
5  bar  one  Y  1


<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Document</title>
    <style>
        p {
            font-size: 22px;
            color: #00FF00;
            padding-top: 0px;
        }
    </style>
</head>
<body>
    <p><strong>pivot table: rows of data</strong></p>
</body>
</html>

In [30]:
print(df)
print('#############Average##############')
print(df.pivot_table(values='D', index='A'))
print(df.pivot_table(values='D', index='B'))
print(df.pivot_table(values='D', index='C'))
print('#############Summation##############')
print(df.pivot_table(values='D', index='A', aggfunc='sum'))
print(df.pivot_table(values='D', index='B', aggfunc='sum'))
print(df.pivot_table(values='D', index='C', aggfunc='sum'))


     A    B  C  D
0  foo  one  X  1
1  foo  one  Y  3
2  foo  two  X  2
3  bar  two  Y  5
4  bar  one  X  4
5  bar  one  Y  1
#############Average##############
            D
A            
bar  3.333333
foo  2.000000
        D
B        
one  2.25
two  3.50
          D
C          
X  2.333333
Y  3.000000
#############Summation##############
      D
A      
bar  10
foo   6
     D
B     
one  9
two  7
   D
C   
X  7
Y  9
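
When no `aggfunc` is given, `pivot_table` averages the values, which is why the first three tables show means. With a single `index` key the result matches a `groupby` on that key. A sketch comparing the two on the same data:

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'],
    'B': ['one', 'one', 'two', 'two', 'one', 'one'],
    'C': ['X', 'Y', 'X', 'Y', 'X', 'Y'],
    'D': [1, 3, 2, 5, 4, 1],
})

pivot_sum = df.pivot_table(values='D', index='A', aggfunc='sum')
group_sum = df.groupby('A')[['D']].sum()

pivot_mean = df.pivot_table(values='D', index='A')   # default aggregation is the mean
group_mean = df.groupby('A')[['D']].mean()

print(pivot_sum['D'].to_dict())   # {'bar': 10, 'foo': 6}
print(group_sum['D'].to_dict())   # {'bar': 10, 'foo': 6}
```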


<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Document</title>
    <style>
        p {
            font-size: 22px;
            color: #00FF00;
            padding-top: 0px;
        }
    </style>
</head>
<body>
    <p><strong>pivot table: columns of data</strong></p>
</body>
</html>

In [36]:
print(df)
print('\n')
print(df.pivot_table(columns='A', values='D', aggfunc='sum'))
print(df.pivot_table(columns='B', values='D', aggfunc='sum'))
print(df.pivot_table(columns='C', values='D', aggfunc='sum'))

     A    B  C  D
0  foo  one  X  1
1  foo  one  Y  3
2  foo  two  X  2
3  bar  two  Y  5
4  bar  one  X  4
5  bar  one  Y  1


A  bar  foo
D   10    6
B  one  two
D    9    7
C  X  Y
D  7  9


<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Document</title>
    <style>
        p {
            font-size: 22px;
            color: #00FF00;
            padding-top: 0px;
        }
    </style>
</head>
<body>
    <p><strong>pivot table</strong></p>
</body>
</html>

In [40]:
print(df)
print(df.pivot_table(index='A', columns='C', values='D', aggfunc='sum'))
print(df.pivot_table(index=['A', 'B'], columns='C', values='D', aggfunc='sum'))

     A    B  C  D
0  foo  one  X  1
1  foo  one  Y  3
2  foo  two  X  2
3  bar  two  Y  5
4  bar  one  X  4
5  bar  one  Y  1
C    X  Y
A        
bar  4  6
foo  3  3
C          X    Y
A   B            
bar one  4.0  1.0
    two  NaN  5.0
foo one  1.0  3.0
    two  2.0  NaN
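
The `NaN` cells in the last table mark combinations of `A`, `B` and `C` that have no rows, and their presence is also why the sums are shown as floats. A sketch of two optional arguments that address this: `fill_value` replaces the empty cells, and `margins=True` adds `All` totals:

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'],
    'B': ['one', 'one', 'two', 'two', 'one', 'one'],
    'C': ['X', 'Y', 'X', 'Y', 'X', 'Y'],
    'D': [1, 3, 2, 5, 4, 1],
})

# Empty combinations become 0 instead of NaN.
filled = df.pivot_table(index=['A', 'B'], columns='C', values='D',
                        aggfunc='sum', fill_value=0)
print(filled)

# margins=True appends an 'All' row and column holding the totals.
totals = df.pivot_table(index='A', columns='C', values='D',
                        aggfunc='sum', margins=True)
print(totals)
```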
