# A brief summary of Pandas
<br>
<div style="opacity: 0.8; font-family: Consolas, Monaco, Lucida Console, Liberation Mono, DejaVu Sans Mono, Bitstream Vera Sans Mono, Courier New; font-size: 12px; font-style: italic;">
    ────────
    for more from the author, visit
    <a href="https://github.com/hazemanwer2000">github.com/hazemanwer2000</a>.
    ────────
</div>

## Table of Contents
* [`Series`](#series)
  * [Indexing](#indexing)
  * [Operations](#operations)
    * [Custom Operations](#custom-operations)
  * [Transformations](#transformations)
  * [Manipulations](#manipulations)
    * [Deleting](#deleting)
    * [Appending](#appending)
    * [Inserting](#inserting)
* [`DataFrame`](#dataframe)
  * [Indexing](#indexing)
  * [Manipulations](#manipulations)
    * [Deleting](#deleting)
    * [Appending](#appending)
    * [Inserting](#inserting)
  * [Ordering](#ordering)
  * [Merging](#merging)


Pandas is a Python package for handling and manipulating tabular data.

In [None]:
import pandas as pd

## `Series` <a class="anchor" id="series"></a>

A `Series` is a one-dimensional array whose elements are addressed by index labels.

In [None]:
# From list, using default indexing
pd.Series([1, 2, 3], dtype=int)

0    1
1    2
2    3
dtype: int64

In [None]:
# From list, using custom indexing
pd.Series([1, 2, 3], index=['A', 'B', 'C'], dtype=int)

A    1
B    2
C    3
dtype: int64

In [9]:
# From dictionary
pd.Series({'X' : 1, 'Y' : 2, 'Z' : 3}, dtype=int)

X    1
Y    2
Z    3
dtype: int64

### Indexing <a class="anchor" id="indexing"></a>

In [35]:
# Access Element
s = pd.Series([1, 2, 3], index=['A', 'B', 'C'], dtype=int)
s.loc['B']

np.int64(2)

In [37]:
# Slicing
s = pd.Series([1, 2, 3], index=['A', 'B', 'C'], dtype=int)
s.iloc[1:]

B    2
C    3
dtype: int64

In [None]:
# Slicing (with Selection)
s = pd.Series([1, 2, 3], index=['A', 'B', 'C'], dtype=int)
s.loc[['A', 'C']]

A    1
C    3
dtype: int64

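*Note:* `loc` selects by index label (i.e., like a `dict` key), while `iloc` selects by integer position (i.e., like a `list` index). Label-based slices with `loc` include the end label; position-based slices with `iloc` exclude the end position.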
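The difference between the two accessors is easiest to see when the index labels are themselves integers. The following is a minimal sketch; `s_int` and its values are made up for illustration and do not appear in the cells above.

```python
import pandas as pd

# Hypothetical Series whose integer labels are not in positional order
s_int = pd.Series([10, 20, 30], index=[2, 0, 1])

by_label = s_int.loc[0]      # element whose label is 0
by_position = s_int.iloc[0]  # element at the first position

# Slicing: 'loc' includes the end label, 'iloc' excludes the end position
s = pd.Series([1, 2, 3], index=['A', 'B', 'C'], dtype=int)
label_slice = s.loc['A':'B']    # labels A and B
position_slice = s.iloc[0:1]    # first element only

print(by_label, by_position)
print(list(label_slice.index), list(position_slice.index))
```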

In [None]:
# Direct access to the index labels and the values
s = pd.Series([1, 2, 3], index=['A', 'B', 'C'], dtype=int)
for idx, value in zip(s.index, s.values):
    print(f"{idx}: {value}")

A: 1
B: 2
C: 3


### Operations <a class="anchor" id="operations"></a>

In [None]:
# Element-wise operations between two 'Series'
#   Note: Operation applies to elements with equal indices
s1 = pd.Series([1, 2, 3], index=['A', 'B', 'C'], dtype=int)
s2 = pd.Series([4, 5, 6], index=['B', 'A', 'C'], dtype=int)
s1 + s2

A    6
B    6
C    9
dtype: int64
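When a label exists in only one of the two `Series`, the element-wise result for that label is `NaN`. A minimal sketch, assuming a second series `s3` invented for illustration; `Series.add` with `fill_value` is one way to treat a missing element as a default instead.

```python
import pandas as pd

s1 = pd.Series([1, 2, 3], index=['A', 'B', 'C'], dtype=int)
s3 = pd.Series([10, 20], index=['A', 'D'], dtype=int)  # hypothetical, shares only 'A'

# Labels present in only one operand produce NaN (result becomes float)
total = s1 + s3

# Treat a missing element as 0 instead
total_filled = s1.add(s3, fill_value=0)

print(total)
print(total_filled)
```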

#### Custom Operations <a class="anchor" id="custom-operations"></a>

In [None]:
# Custom Element-wise operations between two 'Series'
#   Note: Operation applies to elements with equal indices
s1 = pd.Series([1, 2, 3], index=['A', 'B', 'C'], dtype=int)
s2 = pd.Series([4, 5, 6], index=['B', 'A', 'C'], dtype=int)
s1.combine(s2, lambda x, y: x + y)

A    6
B    6
C    9
dtype: int64

### Transformations <a class="anchor" id="transformations"></a>

In [25]:
# Mapping value(s)
s = pd.Series([1, 2, 3], index=['A', 'B', 'C'], dtype=int)
s.map(lambda x: x + 5)

A    6
B    7
C    8
dtype: int64
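`map` also accepts a dictionary in place of a function, which acts as a lookup table. This is a small sketch with a made-up mapping; values without an entry in the dictionary become `NaN`.

```python
import pandas as pd

s = pd.Series([1, 2, 3], index=['A', 'B', 'C'], dtype=int)

# Hypothetical lookup table; 3 has no entry on purpose
names = s.map({1: 'one', 2: 'two'})

print(names)
```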

### Manipulations <a class="anchor" id="manipulations"></a>

#### Deleting <a class="anchor" id="deleting"></a>

In [38]:
# Delete element(s) with specified indices, creates new series
s = pd.Series([1, 2, 3], index=['A', 'B', 'C'], dtype=int)
s.drop(['A'])

B    2
C    3
dtype: int64

In [43]:
# Delete element(s) with specified sequential indices, creates new series
s = pd.Series([1, 2, 3], index=['A', 'B', 'C'], dtype=int)
s.drop(s.index[1:])

A    1
dtype: int64
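`drop` removes elements by label. To remove elements by value, one common alternative is to keep only the elements that satisfy a boolean mask. A minimal sketch, with an arbitrary threshold chosen for illustration:

```python
import pandas as pd

s = pd.Series([1, 2, 3], index=['A', 'B', 'C'], dtype=int)

# Keep elements greater than 1; like 'drop', this creates a new series
kept = s[s > 1]

print(kept)
print(len(s))  # the original series is unchanged
```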

#### Appending <a class="anchor" id="appending"></a>

In [66]:
# Append an element (modifies the series in place)
s = pd.Series([1, 2, 3], index=['A', 'B', 'C'], dtype=int)
s.loc['D'] = 4
s

A    1
B    2
C    3
D    4
dtype: int64

In [None]:
# Append a series, creates new series
s1 = pd.Series([1, 2, 3], index=['A', 'B', 'C'], dtype=int)
s2 = pd.Series([4, 5, 6], index=['D', 'E', 'F'], dtype=int)
pd.concat([s1, s2])

A    1
B    2
C    3
D    4
E    5
F    6
dtype: int64

#### Inserting <a class="anchor" id="inserting"></a>

In [85]:
# Inserting is possible as a combination of slicing and concatenating
s1 = pd.Series([1, 2, 3], index=['A', 'B', 'C'], dtype=int)
s2 = pd.Series([4, 5, 6], index=['D', 'E', 'F'], dtype=int)
pd.concat([s1.iloc[:1], s2, s1.iloc[1:]])

A    1
D    4
E    5
F    6
B    2
C    3
dtype: int64
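The slice-and-concatenate pattern can be wrapped in a small helper. The function name `insert_at` is hypothetical (pandas does not provide it for `Series`); this is only a sketch of the idea.

```python
import pandas as pd

def insert_at(series, position, other):
    """Return a new Series with 'other' spliced in before 'position'."""
    return pd.concat([series.iloc[:position], other, series.iloc[position:]])

s1 = pd.Series([1, 2, 3], index=['A', 'B', 'C'], dtype=int)
s2 = pd.Series([4, 5, 6], index=['D', 'E', 'F'], dtype=int)

result = insert_at(s1, 1, s2)
print(result)
```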

## `DataFrame` <a class="anchor" id="dataframe"></a>

A `DataFrame` is a two-dimensional table with labeled rows and named columns; each column can be accessed as a `Series`.

In [None]:
# From a dictionary of lists
pd.DataFrame({
    'name' : ['Alice', 'Bob'],
    'age'  : [10, 13],
}, dtype=object)

|   | name | age |
|---|------|-----|
| 0 | Alice | 10 |
| 1 | Bob | 13 |


In [None]:
# From a list of dictionaries
pd.DataFrame([
    {'name' : 'Alice', 'age' : 10},
    {'name' : 'Bob', 'age' : 13},
], dtype=object)

|   | name | age |
|---|------|-----|
| 0 | Alice | 10 |
| 1 | Bob | 13 |


In [None]:
# From a list of lists, and column-names
pd.DataFrame([
    ['Alice', 10],
    ['Bob', 13],
], columns=['name', 'age'], dtype=object)

|   | name | age |
|---|------|-----|
| 0 | Alice | 10 |
| 1 | Bob | 13 |


In [None]:
# Shape of a 'DataFrame', as (number of rows, number of columns)
df = pd.DataFrame([
    {'name' : 'Alice', 'age' : 10},
    {'name' : 'Bob', 'age' : 13},
    {'name' : 'Simon', 'age' : 15},
], dtype=object)
df.shape

(3, 2)
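The cells here pass `dtype=object`, which stores every column as generic Python objects. As a hedged aside, when `dtype` is omitted pandas infers a type per column, so a column of integers is stored with an integer dtype. The names `df_obj` and `df_inferred` are introduced only for this sketch.

```python
import pandas as pd

data = {'name': ['Alice', 'Bob'], 'age': [10, 13]}

df_obj = pd.DataFrame(data, dtype=object)  # every column stored as 'object'
df_inferred = pd.DataFrame(data)           # per-column type inference

# Inspect the dtype of each column
print(df_obj.dtypes)
print(df_inferred.dtypes)
```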

### Indexing <a class="anchor" id="indexing"></a>

In [61]:
# Access column, as 'Series'
df = pd.DataFrame([
    {'name' : 'Alice', 'age' : 10},
    {'name' : 'Bob', 'age' : 13},
    {'name' : 'Simon', 'age' : 15},
], dtype=object)
df['name']

0    Alice
1      Bob
2    Simon
Name: name, dtype: object

In [64]:
# Access row by position, as 'Series'
df = pd.DataFrame([
    {'name' : 'Alice', 'age' : 10},
    {'name' : 'Bob', 'age' : 13},
    {'name' : 'Simon', 'age' : 15},
], dtype=object)
df.iloc[1]

name    Bob
age      13
Name: 1, dtype: object

In [65]:
# Slice 'DataFrame': select columns by name, then rows by position
df = pd.DataFrame([
    {'name' : 'Alice', 'age' : 10, 'race' : 'asian'},
    {'name' : 'Bob', 'age' : 13, 'race' : 'white'},
    {'name' : 'Simon', 'age' : 15, 'race' : 'black'},
], dtype=object)
df[['name', 'age']].iloc[[0, 2]]

|   | name | age |
|---|------|-----|
| 0 | Alice | 10 |
| 2 | Simon | 15 |
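Rows and columns can also be selected in a single step with `loc`, and rows can be filtered by a condition on a column. A minimal sketch, assuming the default integer row labels; the age threshold is arbitrary.

```python
import pandas as pd

df = pd.DataFrame([
    {'name': 'Alice', 'age': 10},
    {'name': 'Bob', 'age': 13},
    {'name': 'Simon', 'age': 15},
])

# Row labels first, column names second
subset = df.loc[[0, 2], ['name', 'age']]

# Boolean mask on rows, single column as 'Series'
older = df.loc[df['age'] > 12, 'name']

print(subset)
print(older)
```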


In [71]:
# Direct access to the columns, index labels and values of a 'DataFrame'

df = pd.DataFrame([
    {'name' : 'Alice', 'age' : 10},
    {'name' : 'Bob', 'age' : 13},
], dtype=object)

for column in df.columns:
    print(f"{column}")
    for value, index in zip(df[column].values, df.index):
        print(f"{index}: {value}")

name
0: Alice
1: Bob
age
0: 10
1: 13


In [121]:
# Loop over every row in 'DataFrame', each row as a 'Series'

df = pd.DataFrame([
    {'name' : 'Alice', 'age' : 10},
    {'name' : 'Bob', 'age' : 13},
], dtype=object)

for idx, row in df.iterrows():
    print(f"{idx}: {row['name']}({row['age']})")

0: Alice(10)
1: Bob(13)
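`iterrows` builds a `Series` for every row. An alternative worth knowing is `itertuples`, which yields named tuples and is generally the lighter option for plain read-only loops. A short sketch of the same loop; the list `lines` is introduced only to collect the output.

```python
import pandas as pd

df = pd.DataFrame([
    {'name': 'Alice', 'age': 10},
    {'name': 'Bob', 'age': 13},
], dtype=object)

lines = []
for row in df.itertuples(index=True):
    # Columns become attributes; the row label is available as 'Index'
    lines.append(f"{row.Index}: {row.name}({row.age})")

print("\n".join(lines))
```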


### Manipulations <a class="anchor" id="manipulations"></a>

#### Deleting <a class="anchor" id="deleting"></a>

In [None]:
# Delete row(s), creates new data-frame
df = pd.DataFrame([
    {'name' : 'Alice', 'age' : 10},
    {'name' : 'Bob', 'age' : 13},
], dtype=object)
df.drop([0], axis=0)

|   | name | age |
|---|------|-----|
| 1 | Bob | 13 |


In [75]:
# Delete column(s), creates new data-frame
df = pd.DataFrame([
    {'name' : 'Alice', 'age' : 10},
    {'name' : 'Bob', 'age' : 13},
], dtype=object)
df.drop(['age'], axis=1)

|   | name |
|---|------|
| 0 | Alice |
| 1 | Bob |


#### Appending <a class="anchor" id="appending"></a>

In [87]:
# Appending a new column
df = pd.DataFrame([
    {'name' : 'Alice', 'age' : 10},
    {'name' : 'Bob', 'age' : 13},
], dtype=object)

column = pd.Series([True, False])

df['passed?'] = column
df

|   | name | age | passed? |
|---|------|-----|---------|
| 0 | Alice | 10 | True |
| 1 | Bob | 13 | False |


In [93]:
# Appending a new row
df = pd.DataFrame([
    {'name' : 'Alice', 'age' : 10},
    {'name' : 'Bob', 'age' : 13},
], dtype=object)

row = pd.Series(['Sam', 17], index=['name', 'age'])

df.loc[len(df.index)] = row
df

|   | name | age |
|---|------|-----|
| 0 | Alice | 10 |
| 1 | Bob | 13 |
| 2 | Sam | 17 |


In [None]:
# Concat two 'DataFrame'(s), vertically
df = pd.DataFrame([
    {'name' : 'Alice', 'age' : 10},
    {'name' : 'Bob', 'age' : 13},
], dtype=object)
pd.concat([df, df], axis=0)

|   | name | age |
|---|------|-----|
| 0 | Alice | 10 |
| 1 | Bob | 13 |
| 0 | Alice | 10 |
| 1 | Bob | 13 |
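Vertical concatenation keeps the original row labels, which is why `0` and `1` each appear twice in the result. If fresh sequential labels are wanted, passing `ignore_index=True` to `pd.concat` is one option, sketched below.

```python
import pandas as pd

df = pd.DataFrame([
    {'name': 'Alice', 'age': 10},
    {'name': 'Bob', 'age': 13},
], dtype=object)

# Discard the original row labels and renumber from 0
stacked = pd.concat([df, df], axis=0, ignore_index=True)

print(stacked)
```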


In [None]:
# Concat two 'DataFrame'(s), horizontally
df = pd.DataFrame([
    {'name' : 'Alice', 'age' : 10},
    {'name' : 'Bob', 'age' : 13},
], dtype=object)
pd.concat([df, df], axis=1)

|   | name | age | name | age |
|---|------|-----|------|-----|
| 0 | Alice | 10 | Alice | 10 |
| 1 | Bob | 13 | Bob | 13 |


#### Inserting <a class="anchor" id="inserting"></a>

In [116]:
# Inserting is possible as a combination of slicing and concatenating;
#   here the two row slices are simply swapped to show the mechanics
df = pd.DataFrame([
    {'name' : 'Alice', 'age' : 10},
    {'name' : 'Bob', 'age' : 13},
], dtype=object)
pd.concat([df.iloc[1:], df.iloc[:1]])

|   | name | age |
|---|------|-----|
| 1 | Bob | 13 |
| 0 | Alice | 10 |
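To insert a genuinely new row at a given position, the same pattern applies: slice before the position, add the new row, then slice after it. A minimal sketch; the row for `Sam` and the name `new_row` are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame([
    {'name': 'Alice', 'age': 10},
    {'name': 'Bob', 'age': 13},
], dtype=object)

# Hypothetical row to insert, wrapped in a one-row 'DataFrame'
new_row = pd.DataFrame([{'name': 'Sam', 'age': 17}], dtype=object)

# Insert before position 1, then renumber the rows
result = pd.concat([df.iloc[:1], new_row, df.iloc[1:]], ignore_index=True)

print(result)
```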


### Ordering <a class="anchor" id="ordering"></a>

In [101]:
# Reset row indices

df = pd.DataFrame([
    {'name' : 'Alice', 'age' : 10},
    {'name' : 'Bob', 'age' : 13},
    {'name' : 'Simon', 'age' : 15},
], dtype=object)
df = pd.concat([df.iloc[2:], df.iloc[:1]])

df.reset_index(drop=True)

|   | name | age |
|---|------|-----|
| 0 | Simon | 15 |
| 1 | Alice | 10 |


In [None]:
# Sort row(s)

df = pd.DataFrame([
    {'name' : 'Alice', 'age' : 10},
    {'name' : 'Simone', 'age' : 15},
    {'name' : 'Bob', 'age' : 13},
], dtype=object)

df_sorted = df.sort_values(by='name', key=lambda col: col.map(len), ascending=False)
df_sorted.reset_index(drop=True)

|   | name | age |
|---|------|-----|
| 0 | Simone | 15 |
| 1 | Alice | 10 |
| 2 | Bob | 13 |
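`sort_values` also accepts a list of columns, with a sort direction per column, which is useful for breaking ties. A short sketch on made-up data (two rows deliberately share the same age):

```python
import pandas as pd

# Hypothetical data with a tie on 'age'
df = pd.DataFrame([
    {'name': 'Carol', 'age': 13},
    {'name': 'Bob', 'age': 10},
    {'name': 'Alice', 'age': 13},
])

# Oldest first; ties broken alphabetically by name
df_sorted = df.sort_values(by=['age', 'name'], ascending=[False, True])
df_sorted = df_sorted.reset_index(drop=True)

print(df_sorted)
```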


In [117]:
# Re-ordering column(s) via slicing

df = pd.DataFrame([
    {'name' : 'Alice', 'age' : 10},
    {'name' : 'Simone', 'age' : 15},
    {'name' : 'Bob', 'age' : 13},
], dtype=object)

df[['age', 'name']]

|   | age | name |
|---|-----|------|
| 0 | 10 | Alice |
| 1 | 15 | Simone |
| 2 | 13 | Bob |


### Merging <a class="anchor" id="merging"></a>

In [83]:
# Merge two 'DataFrame'(s), using a specific column

df_entries = pd.DataFrame([
    {'task' : 'A', 'start' : 1, 'duration': 1},
    {'task' : 'B', 'start' : 2, 'duration': 1},
    {'task' : 'A', 'start' : 3, 'duration': 1},
], dtype=object)

df_tasks = pd.DataFrame([
    {'task' : 'A', 'core': 0},
    {'task' : 'B', 'core': 1},
], dtype=object)

pd.merge(df_entries, df_tasks, on='task')

|   | task | start | duration | core |
|---|------|-------|----------|------|
| 0 | A | 1 | 1 | 0 |
| 1 | B | 2 | 1 | 1 |
| 2 | A | 3 | 1 | 0 |
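By default, `pd.merge` performs an inner join, so rows whose key has no match in the other table are dropped. A minimal sketch of the difference with `how='left'`, using a made-up task `C` that has no entry in `df_tasks`; `indicator=True` adds a `_merge` column recording where each row came from.

```python
import pandas as pd

df_entries = pd.DataFrame([
    {'task': 'A', 'start': 1},
    {'task': 'C', 'start': 2},  # hypothetical task with no match
])

df_tasks = pd.DataFrame([
    {'task': 'A', 'core': 0},
    {'task': 'B', 'core': 1},
])

# Inner join (default): only task 'A' survives
inner = pd.merge(df_entries, df_tasks, on='task')

# Left join: every entry is kept, unmatched 'core' becomes NaN
left = pd.merge(df_entries, df_tasks, on='task', how='left', indicator=True)

print(inner)
print(left)
```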
