In [1]:
import pandas as pd

In [2]:
# Reading CSV file
df_customer = pd.read_csv('customer.csv')

In [3]:
# Display the DataFrame
df_customer

,Unnamed: 0,ID,Name,Age,City,Purchase_Amount
0,0,1,Customer_1,44,Chicago,175.08
1,1,2,Customer_2,53,New York,301.74
2,2,3,Customer_3,25,Houston,353.85
3,3,4,Customer_4,26,Phoenix,735.5
4,4,5,Customer_5,47,Houston,219.66
5,5,6,Customer_6,69,Houston,837.42
6,6,7,Customer_7,36,Phoenix,681.78
7,7,8,Customer_8,44,Houston,926.9
8,8,9,Customer_9,52,Houston,525.23
9,9,10,Customer_10,40,New York,471.29
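
The `Unnamed: 0` column in the output above is the name pandas gives a column whose CSV header field is blank, which typically happens when a file was saved together with its index. A minimal sketch, assuming that is the case for `customer.csv`; the inline `csv_text` is a hypothetical stand-in for the real file, and `df_raw` / `df_clean` are illustrative names:

```python
import io
import pandas as pd

# Hypothetical stand-in for customer.csv: the header starts with a blank field
csv_text = (
    ",ID,Name,Age,City,Purchase_Amount\n"
    "0,1,Customer_1,44,Chicago,175.08\n"
    "1,2,Customer_2,53,New York,301.74\n"
)

# Default read: the blank header field becomes a column named 'Unnamed: 0'
df_raw = pd.read_csv(io.StringIO(csv_text))
print(df_raw.columns.tolist())

# index_col=0 uses the first column as the row index instead
df_clean = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(df_clean.columns.tolist())
```

Whether to pass `index_col=0` depends on whether that first column really is a saved index; if it carries meaningful data, it may be better kept and renamed.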


# 1. Attributes

### `.shape`
The `.shape` attribute returns the **number of rows and columns** in a DataFrame as a tuple `(rows, columns)`:

In [5]:
# Shape attribute 
df_customer.shape

(10, 6)
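
Because `.shape` is a plain tuple, it can be unpacked into separate row and column counts. A minimal sketch, using a small hand-built DataFrame (the first three rows of the table above) in place of `customer.csv`; `df_demo`, `n_rows`, and `n_cols` are illustrative names:

```python
import pandas as pd

# Stand-in for the CSV: first three rows of the customer table
df_demo = pd.DataFrame({
    "ID": [1, 2, 3],
    "Name": ["Customer_1", "Customer_2", "Customer_3"],
    "Age": [44, 53, 25],
    "City": ["Chicago", "New York", "Houston"],
    "Purchase_Amount": [175.08, 301.74, 353.85],
})

# Tuple unpacking: first element is rows, second is columns
n_rows, n_cols = df_demo.shape
print(f"{n_rows} rows, {n_cols} columns")
```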

### `.index` 
The `.index` attribute returns the **row labels** (index) of a DataFrame.

In [6]:
df_customer.index

RangeIndex(start=0, stop=10, step=1)

### `.columns`
The `.columns` attribute returns **the column labels** of a DataFrame.

In [7]:
df_customer.columns

Index(['Unnamed: 0', 'ID', 'Name', 'Age', 'City', 'Purchase_Amount'], dtype='object')
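
The returned `Index` object can be converted to a regular Python list or used for membership checks. A short sketch on a hand-built stand-in DataFrame (`df_demo` and `column_names` are illustrative names, not part of the original notebook):

```python
import pandas as pd

# Stand-in for the CSV: two rows of the customer table
df_demo = pd.DataFrame({
    "ID": [1, 2],
    "Name": ["Customer_1", "Customer_2"],
    "Age": [44, 53],
})

# Convert the Index of labels to a plain list
column_names = df_demo.columns.tolist()
print(column_names)

# Check whether a column exists before using it
print("Age" in df_demo.columns)
print("Salary" in df_demo.columns)
```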

### `.dtypes`
- The `.dtypes` attribute returns **the data type of each column** of a DataFrame
- In this output, the `object` dtype marks the columns that hold **strings**

In [10]:
df_customer.dtypes  # here, object dtype means string columns

Unnamed: 0           int64
ID                   int64
Name                object
Age                  int64
City                object
Purchase_Amount    float64
dtype: object
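
Data types can also be used to filter or convert columns. A minimal sketch on a hand-built stand-in DataFrame, using `select_dtypes` to keep only numeric columns and `astype` to convert one column; `df_demo`, `numeric_cols`, and `age_as_float` are illustrative names:

```python
import pandas as pd

# Stand-in for the CSV: first three rows of the customer table
df_demo = pd.DataFrame({
    "ID": [1, 2, 3],
    "Name": ["Customer_1", "Customer_2", "Customer_3"],
    "Age": [44, 53, 25],
    "Purchase_Amount": [175.08, 301.74, 353.85],
})

# Keep only the numeric columns (integers and floats)
numeric_cols = df_demo.select_dtypes(include="number")
print(numeric_cols.dtypes)

# Convert a single column to another dtype; the original is not modified
age_as_float = df_demo["Age"].astype("float64")
print(age_as_float.dtype)
```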

# 2. Methods

### `.info()`
The `.info()` method provides a **summary of a DataFrame**, including:
- The number of non-null entries in each column
- Column names and data types
- Memory usage
- Index range

This is especially useful for **quick inspection of the dataset's structure**.

In [11]:
df_customer.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10 entries, 0 to 9
Data columns (total 6 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   Unnamed: 0       10 non-null     int64  
 1   ID               10 non-null     int64  
 2   Name             10 non-null     object 
 3   Age              10 non-null     int64  
 4   City             10 non-null     object 
 5   Purchase_Amount  10 non-null     float64
dtypes: float64(1), int64(3), object(2)
memory usage: 612.0+ bytes
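
`.info()` prints its report and returns `None`, so the text cannot be stored by simple assignment. A hedged sketch of capturing it with the `buf` parameter, and of requesting a more accurate memory estimate with `memory_usage="deep"`; the stand-in `df_demo` and the names `buffer` and `summary_text` are illustrative:

```python
import io
import pandas as pd

# Stand-in for the CSV: first three rows of the customer table
df_demo = pd.DataFrame({
    "ID": [1, 2, 3],
    "Name": ["Customer_1", "Customer_2", "Customer_3"],
    "Purchase_Amount": [175.08, 301.74, 353.85],
})

# Redirect the report into a text buffer instead of printing it
buffer = io.StringIO()
df_demo.info(buf=buffer, memory_usage="deep")

summary_text = buffer.getvalue()
print(summary_text)
```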


### `.describe()` 

The `.describe()` method generates **descriptive statistics** for the **numerical columns** of a DataFrame by default.

It provides a quick summary of important statistics, including:

- Count (number of non-null entries)
- Mean (average)
- Standard deviation (std)
- Minimum (min)
- 25th percentile (25%)
- Median (50%)
- 75th percentile (75%)
- Maximum (max)

In [12]:
df_customer.describe()

,Unnamed: 0,ID,Age,Purchase_Amount
count,10.0,10.0,10.0,10.0
mean,4.5,5.5,43.6,522.845
std,3.02765,3.02765,13.091134,263.664841
min,0.0,1.0,25.0,175.08
25%,2.25,3.25,37.0,314.7675
50%,4.5,5.5,44.0,498.26
75%,6.75,7.75,50.75,722.07
max,9.0,10.0,69.0,926.9
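
Text columns such as `Name` and `City` are absent from the default summary. Passing `include="all"` adds them, along with rows such as `unique`, `top`, and `freq`. A minimal sketch on a hand-built stand-in DataFrame (`df_demo`, `default_summary`, and `full_summary` are illustrative names):

```python
import pandas as pd

# Stand-in for the CSV: first three rows of the customer table
df_demo = pd.DataFrame({
    "ID": [1, 2, 3],
    "Age": [44, 53, 25],
    "City": ["Chicago", "New York", "Houston"],
    "Purchase_Amount": [175.08, 301.74, 353.85],
})

# Default: numeric columns only
default_summary = df_demo.describe()
print(default_summary.columns.tolist())

# include="all": every column, with NaN where a statistic does not apply
full_summary = df_demo.describe(include="all")
print(full_summary)
```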


# 3. Functions

### `len(DataFrame)` 
The built-in `len()` returns **the length** of the DataFrame, that is, its **number of rows**.

In [14]:
len(df_customer)

10
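
`len()` counts rows only, which makes it equal to the first element of `.shape`; the total number of cells is available through `.size`. A short sketch on a hand-built stand-in DataFrame (`df_demo` is an illustrative name):

```python
import pandas as pd

# Stand-in for the CSV: three rows, three columns
df_demo = pd.DataFrame({
    "ID": [1, 2, 3],
    "Age": [44, 53, 25],
    "Purchase_Amount": [175.08, 301.74, 353.85],
})

row_count = len(df_demo)      # number of rows
same_count = df_demo.shape[0] # also the number of rows
cell_count = df_demo.size     # rows * columns

print(row_count, same_count, cell_count)
```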

### `max(DataFrame.index)`
The built-in `max()` applied to `.index` returns **the highest index label** of the DataFrame.

In [15]:
max(df_customer.index)

9

### `min(DataFrame.index)`
The built-in `min()` applied to `.index` returns **the lowest index label** of the DataFrame.

In [16]:
min(df_customer.index)

0
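
The index object also offers its own `.max()` and `.min()` methods, which give the same results as the built-in functions. A minimal sketch on a hand-built stand-in DataFrame with the default `RangeIndex` (`df_demo` is an illustrative name):

```python
import pandas as pd

# Stand-in for the CSV: three rows with the default RangeIndex 0..2
df_demo = pd.DataFrame({
    "ID": [1, 2, 3],
    "Age": [44, 53, 25],
})

# Built-in functions iterate over the index labels
highest_builtin = max(df_demo.index)
lowest_builtin = min(df_demo.index)

# Equivalent index methods
highest_method = df_demo.index.max()
lowest_method = df_demo.index.min()

print(highest_builtin, highest_method)
print(lowest_builtin, lowest_method)
```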

### `type(DataFrame)`
The built-in `type()` returns **the type** of the object, here a pandas DataFrame.

In [17]:
type(df_customer)

pandas.core.frame.DataFrame
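
In code, a type check is usually written with `isinstance()` rather than by comparing the output of `type()`. A short sketch on a hand-built stand-in DataFrame, which also shows that a single column is a `Series` rather than a `DataFrame` (`df_demo` and `age_column` are illustrative names):

```python
import pandas as pd

# Stand-in for the CSV: two rows of the customer table
df_demo = pd.DataFrame({
    "ID": [1, 2],
    "Age": [44, 53],
})

# Check the type of the whole table
print(isinstance(df_demo, pd.DataFrame))

# Selecting one column yields a Series
age_column = df_demo["Age"]
print(type(age_column))
```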

### `round(DataFrame, decimals)`
The built-in `round()` **rounds the float values** of the DataFrame to a specified number of decimal places; non-numeric columns are left unchanged.

In [19]:
round(df_customer,2)

,Unnamed: 0,ID,Name,Age,City,Purchase_Amount
0,0,1,Customer_1,44,Chicago,175.08
1,1,2,Customer_2,53,New York,301.74
2,2,3,Customer_3,25,Houston,353.85
3,3,4,Customer_4,26,Phoenix,735.5
4,4,5,Customer_5,47,Houston,219.66
5,5,6,Customer_6,69,Houston,837.42
6,6,7,Customer_7,36,Phoenix,681.78
7,7,8,Customer_8,44,Houston,926.9
8,8,9,Customer_9,52,Houston,525.23
9,9,10,Customer_10,40,New York,471.29
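
The same rounding is available as the `DataFrame.round()` method, which additionally accepts a dictionary to set the number of decimals per column. A minimal sketch on a hand-built stand-in DataFrame (`df_demo`, `by_method`, and `by_column` are illustrative names):

```python
import pandas as pd

# Stand-in for the CSV: first three rows of the customer table
df_demo = pd.DataFrame({
    "Name": ["Customer_1", "Customer_2", "Customer_3"],
    "Age": [44, 53, 25],
    "Purchase_Amount": [175.08, 301.74, 353.85],
})

# Method form: same result as round(df_demo, 2)
by_method = df_demo.round(2)

# Per-column form: round only Purchase_Amount, to whole numbers
by_column = df_demo.round({"Purchase_Amount": 0})

print(by_method)
print(by_column)
```

Both forms return a new DataFrame and leave `df_demo` itself unchanged.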
