# 01 - Creating a DataFrame

In this notebook, we’ll create a simple DataFrame using Pandas and NumPy.  
A DataFrame is a two-dimensional table, much like an Excel sheet: it has labeled rows and columns.


In [1]:
import pandas as pd
import numpy as np

## Step 1: Create a NumPy Array

We'll first create a 5x4 matrix using NumPy's `arange` and `reshape`.

In [3]:
np.arange(0, 20).reshape(5, 4)

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19]])
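
As a small sketch of how `arange` and `reshape` fit together: `arange(0, 20)` produces 20 values because the stop value is excluded, and `reshape` only succeeds when the target shape holds exactly that many elements. The variable names (`flat`, `matrix`, `inferred`) are illustrative and not part of the notebook.

```python
import numpy as np

flat = np.arange(0, 20)        # 1-D array 0..19; the stop value 20 is excluded
matrix = flat.reshape(5, 4)    # 5 rows x 4 columns = 20 elements

print(flat.shape)    # (20,)
print(matrix.shape)  # (5, 4)

# Passing -1 asks NumPy to infer that dimension from the array's size.
inferred = flat.reshape(-1, 4)
print(inferred.shape)  # (5, 4)

# A shape that does not hold exactly 20 elements raises ValueError.
try:
    flat.reshape(3, 4)
except ValueError as exc:
    print("reshape failed:", exc)
```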

## Step 2: Create a DataFrame

We now convert the NumPy array into a Pandas DataFrame,  
and add custom row and column labels.

In [5]:
df = pd.DataFrame(
    data = np.arange(0, 20).reshape(5, 4),
    index = ["Row1", "Row2", "Row3", "Row4", "Row5"],
    columns = ["Col1", "Col2", "Col3", "Col4"]
)
df

      Col1  Col2  Col3  Col4
Row1     0     1     2     3
Row2     4     5     6     7
Row3     8     9    10    11
Row4    12    13    14    15
Row5    16    17    18    19
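
To see where the custom labels end up, the sketch below rebuilds the same DataFrame and inspects its `shape`, `index`, and `columns` attributes. The label-based lookup with `.loc` at the end is an extra illustration that the notebook itself does not cover.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    data=np.arange(0, 20).reshape(5, 4),
    index=["Row1", "Row2", "Row3", "Row4", "Row5"],
    columns=["Col1", "Col2", "Col3", "Col4"],
)

print(df.shape)          # (5, 4)
print(list(df.index))    # ['Row1', 'Row2', 'Row3', 'Row4', 'Row5']
print(list(df.columns))  # ['Col1', 'Col2', 'Col3', 'Col4']

# Row and column labels can be used together to look up a single value.
print(df.loc["Row2", "Col3"])  # 6
```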


## Step 3: View the Data

`head()` and `tail()` return the first and last rows of the DataFrame. Both default to 5 rows, so for this 5-row DataFrame each call returns the entire table.


In [6]:
df.head()

      Col1  Col2  Col3  Col4
Row1     0     1     2     3
Row2     4     5     6     7
Row3     8     9    10    11
Row4    12    13    14    15
Row5    16    17    18    19


In [7]:
df.tail()

      Col1  Col2  Col3  Col4
Row1     0     1     2     3
Row2     4     5     6     7
Row3     8     9    10    11
Row4    12    13    14    15
Row5    16    17    18    19
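
Because this DataFrame has only five rows, the default `head()` and `tail()` calls return the same thing. Passing an explicit row count makes the difference visible. This is a minimal sketch; `top_two` and `bottom_two` are illustrative names.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    data=np.arange(0, 20).reshape(5, 4),
    index=["Row1", "Row2", "Row3", "Row4", "Row5"],
    columns=["Col1", "Col2", "Col3", "Col4"],
)

top_two = df.head(2)      # first two rows
bottom_two = df.tail(2)   # last two rows

print(list(top_two.index))     # ['Row1', 'Row2']
print(list(bottom_two.index))  # ['Row4', 'Row5']

# With the default of 5 rows, both calls cover the whole 5-row frame.
print(df.head().equals(df.tail()))  # True
```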


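## Step 4: Check DataFrame Properties

`type()` confirms the object's class, `info()` summarizes the index, columns, dtypes, and memory use, and `describe()` reports summary statistics for each numeric column.
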
In [8]:
type(df)

pandas.core.frame.DataFrame

In [9]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 5 entries, Row1 to Row5
Data columns (total 4 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   Col1    5 non-null      int32
 1   Col2    5 non-null      int32
 2   Col3    5 non-null      int32
 3   Col4    5 non-null      int32
dtypes: int32(4)
memory usage: 120.0+ bytes
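
The `int32` dtype reported above comes from NumPy's default integer type, which depends on the platform and NumPy version. On many systems the same code reports `int64` instead. One way to get the same result everywhere, sketched here as an option rather than something the notebook does, is to pass `dtype` explicitly to `arange`. The name `df64` is illustrative.

```python
import numpy as np
import pandas as pd

# Fix the integer width explicitly instead of relying on the platform default.
df64 = pd.DataFrame(
    data=np.arange(0, 20, dtype=np.int64).reshape(5, 4),
    index=["Row1", "Row2", "Row3", "Row4", "Row5"],
    columns=["Col1", "Col2", "Col3", "Col4"],
)

# dtypes lists the dtype of every column.
print(df64.dtypes)
```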


In [10]:
df.describe()

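            Col1       Col2       Col3       Col4
count   5.000000   5.000000   5.000000   5.000000
mean    8.000000   9.000000  10.000000  11.000000
std     6.324555   6.324555   6.324555   6.324555
min     0.000000   1.000000   2.000000   3.000000
25%     4.000000   5.000000   6.000000   7.000000
50%     8.000000   9.000000  10.000000  11.000000
75%    12.000000  13.000000  14.000000  15.000000
max    16.000000  17.000000  18.000000  19.000000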
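
To show where the `describe()` numbers come from, the sketch below recomputes the mean and standard deviation of `Col1` by hand. Note that pandas reports the sample standard deviation (dividing by n - 1), which is why the value is about 6.32 rather than the population figure. The names `col` and `manual_std` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    data=np.arange(0, 20).reshape(5, 4),
    index=["Row1", "Row2", "Row3", "Row4", "Row5"],
    columns=["Col1", "Col2", "Col3", "Col4"],
)

col = df["Col1"]   # values 0, 4, 8, 12, 16
mean = col.mean()
std = col.std()    # sample standard deviation (ddof=1 by default)

# Same calculation by hand: sum of squared deviations divided by n - 1.
manual_std = np.sqrt(((col - mean) ** 2).sum() / (len(col) - 1))

print(mean)                         # 8.0
print(round(std, 6))                # 6.324555
print(np.isclose(std, manual_std))  # True
```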


### Summary

- DataFrames are two-dimensional, labeled data structures in Pandas.
- We can create them from NumPy arrays and attach our own row and column labels.
- Use `head()`, `tail()`, `info()`, and `describe()` to explore the data.


## What is a Pandas Series?

- A Series is a **one-dimensional labeled array**.
- You can think of it as a single column of values with an index of labels. If you don't supply an index, Pandas assigns the default integer labels 0, 1, 2, and so on.


In [12]:
# Create a simple Series
s = pd.Series([10, 20, 30, 40, 50])
s

0    10
1    20
2    30
3    40
4    50
dtype: int64

In [13]:
type(s)


pandas.core.series.Series
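
As a short sketch of the "labeled" part, the example below builds one Series with the default integer index and one with custom labels, then shows that a single DataFrame column is itself a Series. The labels `"a"`, `"b"`, `"c"` and the small DataFrame are illustrative values, not taken from the notebook.

```python
import pandas as pd

# Without an explicit index, pandas assigns integer labels starting at 0.
default = pd.Series([10, 20, 30])
print(list(default.index))  # [0, 1, 2]

# With custom labels, values can be looked up by label.
labeled = pd.Series([10, 20, 30], index=["a", "b", "c"])
print(labeled["b"])  # 20

# Selecting one column of a DataFrame returns a Series that keeps the row labels.
df = pd.DataFrame({"Col1": [0, 4, 8]}, index=["Row1", "Row2", "Row3"])
col = df["Col1"]
print(type(col).__name__)  # Series
print(list(col.index))     # ['Row1', 'Row2', 'Row3']
```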