## 📚 Essential Basic Functionality

> Pandas is a foundational Python library for working with structured data. It provides fast, flexible, and expressive tools for data analysis and manipulation. This section covers the basic functionality needed to use it effectively.

In [79]:
import pandas as pd
import numpy as np

### Head and tail

- `head()` returns the first 5 rows by default; pass `n` to change the count.
- `tail()` returns the last 5 rows by default and accepts the same `n` argument.

In [80]:
# Ten draws from a standard normal distribution; the values differ on every run
long_series = pd.Series(np.random.randn(10))
long_series

0   -0.689224
1    0.348209
2    1.399586
3    0.648451
4   -0.042166
5   -0.181030
6    0.689165
7   -1.166976
8   -0.139918
9    1.465724
dtype: float64

In [81]:
# Head
long_series.head()

0   -0.689224
1    0.348209
2    1.399586
3    0.648451
4   -0.042166
dtype: float64

In [82]:
# Tail
long_series.tail()

5   -0.181030
6    0.689165
7   -1.166976
8   -0.139918
9    1.465724
dtype: float64
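
Both methods take an optional count `n`, which the cells above leave at its default of 5. A minimal sketch, using a small deterministic Series so the result is reproducible (`s`, `first_three`, and `last_two` are illustrative names, not part of the notebook):

```python
import pandas as pd

# Deterministic data, so the result does not depend on random draws
s = pd.Series(range(10))

first_three = s.head(3)  # rows at positions 0, 1, 2
last_two = s.tail(2)     # rows at positions 8, 9

print(first_three.tolist())  # [0, 1, 2]
print(last_two.tolist())     # [8, 9]
```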

### Attributes and underlying data

- `shape`: gives the axis dimensions of the object, consistent with ndarray.
- Axis labels:
    - Series: `index` (only axis)
    - DataFrame: `index` (rows) and `columns`
- Axis labels can be assigned to directly, which relabels that axis.


In [83]:
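# Ten consecutive daily timestamps as the row index
index = pd.date_range('20230101', periods=10)
# 10x4 frame of standard-normal draws with columns A to D
df = pd.DataFrame(np.random.randn(10, 4), index=index, columns=list('ABCD'))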
df

                   A         B         C         D
2023-01-01  1.837064  1.042564 -0.761390 -0.618853
2023-01-02 -0.999651  0.484799 -1.147667  0.146210
2023-01-03 -0.379719  0.762485  1.552791 -0.581352
2023-01-04 -0.928584  0.246040  0.081626  0.104537
2023-01-05 -1.019793  0.040928  0.661453 -1.529816
2023-01-06  0.611108  0.378010 -1.732451  0.701533
2023-01-07  0.668816 -1.113682 -0.199704 -0.793241
2023-01-08  0.298669 -1.578383  0.861584  0.389650
2023-01-09 -2.111644 -1.438865 -0.376393 -2.032228
2023-01-10 -0.681041  1.640821 -0.950361 -1.429081


In [84]:
# Column labels can be reassigned directly; here they are lowercased
df.columns = [x.lower() for x in df.columns]
df

                   a         b         c         d
2023-01-01  1.837064  1.042564 -0.761390 -0.618853
2023-01-02 -0.999651  0.484799 -1.147667  0.146210
2023-01-03 -0.379719  0.762485  1.552791 -0.581352
2023-01-04 -0.928584  0.246040  0.081626  0.104537
2023-01-05 -1.019793  0.040928  0.661453 -1.529816
2023-01-06  0.611108  0.378010 -1.732451  0.701533
2023-01-07  0.668816 -1.113682 -0.199704 -0.793241
2023-01-08  0.298669 -1.578383  0.861584  0.389650
2023-01-09 -2.111644 -1.438865 -0.376393 -2.032228
2023-01-10 -0.681041  1.640821 -0.950361 -1.429081
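
The attributes listed at the start of this section can be read directly from any Series or DataFrame. A small sketch on an illustrative frame (`demo` is a name chosen here, not one used in the notebook):

```python
import pandas as pd

# Small deterministic frame with a daily DatetimeIndex
demo = pd.DataFrame(
    {"A": [1.0, 2.0, 3.0], "B": [4.0, 5.0, 6.0]},
    index=pd.date_range("20230101", periods=3),
)

print(demo.shape)          # (3, 2): three rows, two columns
print(list(demo.columns))  # ['A', 'B']
print(demo.index[0])       # first row label, a Timestamp

# A Series has a single axis, so its shape is a 1-tuple
print(demo["A"].shape)     # (3,)
```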


#### NumPy

- `to_numpy()` converts a pandas object to a NumPy array. It behaves more consistently than the older `.values` attribute, gives control over the resulting dtype, and handles extension types.

In [85]:
df_numpy = df.to_numpy()
df_numpy

array([[ 1.83706382,  1.04256356, -0.7613898 , -0.61885317],
       [-0.99965119,  0.48479915, -1.14766743,  0.14621045],
       [-0.37971922,  0.76248453,  1.55279134, -0.58135211],
       [-0.92858406,  0.24603965,  0.08162567,  0.10453674],
       [-1.01979306,  0.04092769,  0.66145259, -1.52981608],
       [ 0.6111076 ,  0.37800997, -1.7324514 ,  0.7015334 ],
       [ 0.66881616, -1.11368163, -0.19970396, -0.7932413 ],
       [ 0.29866894, -1.57838284,  0.86158422,  0.38964993],
       [-2.1116439 , -1.43886516, -0.37639324, -2.03222798],
       [-0.68104087,  1.6408209 , -0.95036057, -1.42908137]])
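
The dtype control mentioned above is easiest to see with columns of different types and with an extension dtype. The following is a minimal sketch, assuming a pandas version that supports the `dtype` and `na_value` arguments of `to_numpy()` (1.0 or later); `mixed`, `nullable`, `arr`, and `as_float` are illustrative names:

```python
import numpy as np
import pandas as pd

# Integer and float columns are upcast to one common dtype
mixed = pd.DataFrame({"a": [1, 2], "b": [0.5, 1.5]})
arr = mixed.to_numpy()
print(arr.dtype)  # float64

# Nullable integer extension dtype: pick the target dtype and the
# value that stands in for missing entries
nullable = pd.Series([1, 2, None], dtype="Int64")
as_float = nullable.to_numpy(dtype="float64", na_value=np.nan)
print(as_float.dtype)  # float64
```

Without `dtype` and `na_value`, a nullable column containing missing values would typically come back as an `object` array, which is rarely what numerical code expects.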

### Matching / broadcasting behavior

In [86]:
# "three" has an extra label "d", so "one" and "two" are NaN in that row after alignment
df_mathing_broadcasting = pd.DataFrame(
    {
        "one": pd.Series(np.random.randint(0, 10, 3), index=["a", "b", "c"]),
        "two": pd.Series(np.random.randint(0, 10, 3), index=["a", "b", "c"]),
        "three": pd.Series(np.random.randint(0, 10, 4), index=["a", "b", "c", "d"]),
    }
)
df_mathing_broadcasting

   one  two  three
a  8.0  6.0      9
b  6.0  7.0      9
c  8.0  6.0      5
d  NaN  NaN      6


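#### Sub

- `sub()` subtracts one Series or DataFrame from another, element by element. When the other operand is a Series, the `axis` argument selects which labels it is matched on: `"columns"` (the DataFrame default) or `"index"`.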

In [87]:
# Third row (label "c") as a Series indexed by the column labels
df_sub_l1 = df_mathing_broadcasting.iloc[2]
df_sub_l1

one      8.0
two      6.0
three    5.0
Name: c, dtype: float64

In [88]:
# Match the Series on column labels and subtract it from every row
df_mathing_broadcasting.sub(df_sub_l1, axis="columns")

   one  two  three
a  0.0  0.0    4.0
b -2.0  1.0    4.0
c  0.0  0.0    0.0
d  NaN  NaN    1.0
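
The `axis` argument decides which labels the Series is matched on before it is broadcast. A minimal sketch with fixed values, so the arithmetic can be checked by hand (`frame`, `row`, `col`, `by_columns`, and `by_index` are illustrative names, not from the notebook):

```python
import pandas as pd

frame = pd.DataFrame(
    {"one": [1.0, 2.0, 3.0], "two": [10.0, 20.0, 30.0]},
    index=["a", "b", "c"],
)
row = frame.iloc[0]  # Series indexed by column labels: one=1.0, two=10.0
col = frame["one"]   # Series indexed by row labels: a=1.0, b=2.0, c=3.0

# Match on column labels, then subtract the row from every row
by_columns = frame.sub(row, axis="columns")

# Match on row labels, then subtract the column from every column
by_index = frame.sub(col, axis="index")

print(by_columns["two"].tolist())  # [0.0, 10.0, 20.0]
print(by_index["two"].tolist())    # [9.0, 18.0, 27.0]
```

With a Series operand, `frame - row` gives the same result as `frame.sub(row, axis="columns")`; the method form is needed when matching on the index instead.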
