# Pandas

A Python library for data manipulation and analysis. It offers labelled data structures and operations designed for tabular data. Many machine learning libraries (including scikit-learn, Keras and TensorFlow) can work with Pandas objects, either directly or after conversion to NumPy arrays.

The [Pandas documentation](https://pandas.pydata.org/docs/user_guide/10min.html) is very user friendly and includes a short introduction for new users.

#### DataFrame

A two-dimensional, labelled data structure. It builds on NumPy arrays and adds row labels (numbered from 0 by default) and column labels. Each column can have its own data type.

In [1]:
# Import the NumPy and Pandas libraries
import numpy as np
import pandas as pd

| Feature | List | NumPy array | DataFrame |
| ------ | -----: | -----: | -----: |
| Built in | Y | N | N |
| Multiple data types | Y | N | Y (per column) |
| Multiple dimensions | N | Y | Y |
| Arithmetic operations | N | Y | Y |
| Process intensive | Y | N | N |
| Optimised for ML | N | Y | Y |
| Labels | N | N | Y |
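A minimal sketch of a few of the differences listed in the table. The variable names and sample values are made up for illustration.

```python
import numpy as np
import pandas as pd

values = [1, 2, 3]

# A list has no element-wise arithmetic: * repeats the list instead
doubled_list = values * 2

# A NumPy array applies arithmetic to every element
doubled_array = np.array(values) * 2

# A DataFrame column also works element-wise and keeps its labels;
# each column carries its own data type
frame = pd.DataFrame({"count": values, "name": ["a", "b", "c"]})
doubled_column = frame["count"] * 2

print(doubled_list)             # [1, 2, 3, 1, 2, 3]
print(doubled_array.tolist())   # [2, 4, 6]
print(doubled_column.tolist())  # [2, 4, 6]
```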

In [2]:
# Create 2 lists and combine them into a DataFrame
# (the single timestamp is repeated on every row)
list_1 = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
list_2 = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
dataframe = pd.DataFrame(
    {
        "Date & Time": pd.Timestamp.now(),
        "List 1": list_1,
        "List 2": list_2,
    }
)
print(dataframe)


                 Date & Time  List 1  List 2
0 2025-05-13 04:02:51.765039       0       9
1 2025-05-13 04:02:51.765039       1       8
2 2025-05-13 04:02:51.765039       2       7
3 2025-05-13 04:02:51.765039       3       6
4 2025-05-13 04:02:51.765039       4       5
5 2025-05-13 04:02:51.765039       5       4
6 2025-05-13 04:02:51.765039       6       3
7 2025-05-13 04:02:51.765039       7       2
8 2025-05-13 04:02:51.765039       8       1
9 2025-05-13 04:02:51.765039       9       0
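The `Date & Time` column holds the same value on every row because a single (scalar) value in the dictionary is repeated to match the length of the list columns. A small sketch of that behaviour, using a fixed timestamp chosen only for illustration so the result is repeatable:

```python
import pandas as pd

# Hypothetical fixed timestamp, used instead of pd.Timestamp.now()
stamp = pd.Timestamp("2025-05-13 04:02:51")

frame = pd.DataFrame(
    {
        "Date & Time": stamp,   # scalar: repeated on every row
        "List 1": [0, 1, 2],    # list: sets the number of rows
    }
)

# True when every row of the column equals the original scalar
all_same = bool((frame["Date & Time"] == stamp).all())
print(len(frame), all_same)  # 3 True
```

If every value in the dictionary is a scalar, Pandas cannot infer the number of rows, so an `index` argument has to be passed as well.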


#### View the data

In [3]:
# Print the data types for each column
dtypes = dataframe.dtypes
print(dtypes)

Date & Time    datetime64[us]
List 1                  int64
List 2                  int64
dtype: object
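Once the column types are known, they can be used to filter or convert columns. The sketch below assumes a small frame shaped like the one above (the timestamp is a fixed, made-up value) and uses `select_dtypes` and `astype`:

```python
import pandas as pd

frame = pd.DataFrame(
    {
        "Date & Time": pd.Timestamp("2025-05-13 04:02:51"),  # hypothetical value
        "List 1": [0, 1, 2],
        "List 2": [2, 1, 0],
    }
)

# Keep only the numeric columns
numeric = frame.select_dtypes(include="number")
numeric_columns = list(numeric.columns)
print(numeric_columns)  # ['List 1', 'List 2']

# Convert one column to floating point; astype returns a new object
as_float = frame["List 1"].astype("float64")
print(as_float.dtype)   # float64
```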


In [4]:
# Print first 5 rows of the dataframe
print(dataframe.head())

                 Date & Time  List 1  List 2
0 2025-05-13 04:02:51.765039       0       9
1 2025-05-13 04:02:51.765039       1       8
2 2025-05-13 04:02:51.765039       2       7
3 2025-05-13 04:02:51.765039       3       6
4 2025-05-13 04:02:51.765039       4       5


In [5]:
# Print last 5 rows of the dataframe
print(dataframe.tail())

                 Date & Time  List 1  List 2
5 2025-05-13 04:02:51.765039       5       4
6 2025-05-13 04:02:51.765039       6       3
7 2025-05-13 04:02:51.765039       7       2
8 2025-05-13 04:02:51.765039       8       1
9 2025-05-13 04:02:51.765039       9       0
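Both `head()` and `tail()` return 5 rows by default and accept an optional row count. A short sketch with a made-up single-column frame:

```python
import pandas as pd

frame = pd.DataFrame({"List 1": list(range(10))})

first_three = frame.head(3)  # rows with index 0, 1, 2
last_two = frame.tail(2)     # rows with index 8, 9

print(list(first_three.index))  # [0, 1, 2]
print(list(last_two.index))     # [8, 9]
```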


In [6]:
# Print quick statistics of the dataframe
print(dataframe.describe())

                      Date & Time    List 1    List 2
count                          10  10.00000  10.00000
mean   2025-05-13 04:02:51.765039   4.50000   4.50000
min    2025-05-13 04:02:51.765039   0.00000   0.00000
25%    2025-05-13 04:02:51.765039   2.25000   2.25000
50%    2025-05-13 04:02:51.765039   4.50000   4.50000
75%    2025-05-13 04:02:51.765039   6.75000   6.75000
max    2025-05-13 04:02:51.765039   9.00000   9.00000
std                           NaN   3.02765   3.02765
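The `std` value of 3.02765 for the two list columns is the sample standard deviation (divisor `n - 1`), which is the Pandas default. NumPy's `np.std` divides by `n` unless told otherwise, so the two libraries give different numbers for the same data. A sketch of the difference:

```python
import numpy as np
import pandas as pd

series = pd.Series(range(10))

sample_std = series.std()               # Pandas default: ddof=1
population_std = series.std(ddof=0)     # divide by n instead of n - 1
numpy_std = np.std(series.to_numpy())   # NumPy default: ddof=0

print(round(sample_std, 5))      # 3.02765
print(round(population_std, 5))  # 2.87228
```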


In [7]:
# Print the shape of the dataframe
shape = dataframe.shape
print(shape)

(10, 3)


In [9]:
# Print the number of dimensions of the dataframe
dimensions = dataframe.ndim
print(dimensions)

2


In [10]:
# Print the total number of elements in the dataframe
size = dataframe.size
print(size)

30


In [11]:
# Print the number of rows in the dataframe
rows = len(dataframe)
print(rows)

10


In [12]:
# Print the number of columns in the dataframe
columns = len(dataframe.columns)
print(columns)

3
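The attributes above are related: `shape` is a `(rows, columns)` tuple, `size` is the product of the two, and `ndim` is the length of `shape`. A brief sketch with a made-up 10 x 3 frame:

```python
import pandas as pd

frame = pd.DataFrame({"a": range(10), "b": range(10), "c": range(10)})

# Unpack the shape tuple into row and column counts
n_rows, n_columns = frame.shape

print(n_rows, n_columns)                  # 10 3
print(frame.size == n_rows * n_columns)   # True
print(frame.ndim == len(frame.shape))     # True
```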
