# 🧮 Important Functions and Attributes in Pandas
**Author:** Hamna Munir  
**Repository:** Python-Libraries-for-AI-ML  
**Topic:** 03_Important_Functions_and_Attributes

This notebook covers essential **Pandas functions and attributes** to efficiently explore, manipulate, and analyze datasets for AI/ML tasks.

---

## 📘 Key Concepts
- Pandas DataFrames provide tools for both **data inspection** and **data manipulation**.
- Attributes describe the **structure and data types** of a dataset.
- Functions support filtering, aggregating, transforming, and merging datasets.
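
As a rough illustration of the four kinds of operation named in the last bullet, here is a minimal sketch. The `employees` and `locations` tables, the `Dept` and `City` columns, and the 55000 threshold are hypothetical and exist only for this example.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
employees = pd.DataFrame({
    "Name": ["Ali", "Sara", "Umar", "Hina"],
    "Dept": ["IT", "HR", "IT", "HR"],
    "Salary": [50000, 60000, 70000, 65000],
})
locations = pd.DataFrame({
    "Dept": ["IT", "HR"],
    "City": ["Lahore", "Karachi"],
})

# Filtering: keep rows where a boolean condition holds
high_paid = employees[employees["Salary"] > 55000]

# Aggregation: mean salary per department
mean_by_dept = employees.groupby("Dept")["Salary"].mean()

# Transformation: derive a new column from an existing one
employees["Salary_k"] = employees["Salary"] / 1000

# Merging: attach each department's city via the shared "Dept" column
merged = employees.merge(locations, on="Dept", how="left")

print(high_paid)
print(mean_by_dept)
print(merged)
```

A left merge keeps every row of `employees` in its original order, which is usually what you want when enriching a main table with lookup data.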

---

## Importing Pandas
Import pandas and check the version.

In [1]:
import pandas as pd

print("Pandas version:", pd.__version__)

Pandas version: 1.5.3


## 🧩 Creating Series & DataFrame
The two main data structures in Pandas are:
- **Series** → One-dimensional labeled array
- **DataFrame** → Two-dimensional table

Let's create both and explore their attributes.

In [2]:
import pandas as pd

# Series
s = pd.Series([10, 20, 30, 40])
print("Series:\n", s)

# DataFrame
data = {
    'Name': ['Ali', 'Sara', 'Umar'],
    'Age': [22, 25, 28],
    'Salary': [50000, 60000, 70000]
}
df = pd.DataFrame(data)
print("\nDataFrame:\n", df)

Series:
0    10
1    20
2    30
3    40
dtype: int64

DataFrame:
   Name  Age  Salary
0   Ali   22   50000
1  Sara   25   60000
2  Umar   28   70000
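
A Series carries attributes of its own, much like a DataFrame. The sketch below reuses the same four values; the string labels and the name `scores` are assumptions added for illustration and do not appear in the notebook.

```python
import pandas as pd

# Same values as the Series above; the labels and the name are illustrative assumptions
s = pd.Series([10, 20, 30, 40], index=["a", "b", "c", "d"], name="scores")

print(s.index)       # row labels
print(s.to_numpy())  # underlying values as a NumPy array
print(s.dtype)       # element type
print(s.name)        # optional name of the Series
print(s["c"])        # label-based lookup of a single element
```

Without an explicit `index`, Pandas falls back to integer labels starting at 0, as in the output of the cell above.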


## 🔍 Important Attributes
A DataFrame exposes attributes for inspecting its structure and types. Attributes are accessed without parentheses:

- `df.shape` → Number of rows and columns, as a tuple
- `df.ndim` → Number of dimensions (2 for a DataFrame)
- `df.size` → Total number of elements (rows × columns)
- `df.columns` → Column names
- `df.dtypes` → Data type of each column
- `df.index` → Row labels

In [3]:
print("Shape:", df.shape)
print("Dimensions:", df.ndim)
print("Size:", df.size)
print("Columns:", df.columns)
print("Data Types:\n", df.dtypes)
print("Index:", df.index)

Shape: (3, 3)
Dimensions: 2
Size: 9
Columns: Index(['Name', 'Age', 'Salary'], dtype='object')
Data Types:
Name      object
Age        int64
Salary     int64
dtype: object
Index: RangeIndex(start=0, stop=3, step=1)
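
These attributes are related to one another. The sketch below rebuilds the same three-row DataFrame so that it runs on its own, then checks those relationships with plain assertions.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Ali", "Sara", "Umar"],
    "Age": [22, 25, 28],
    "Salary": [50000, 60000, 70000],
})

n_rows, n_cols = df.shape          # shape is a (rows, columns) tuple
assert df.size == n_rows * n_cols  # size counts every cell
assert len(df) == n_rows           # len() gives the row count
assert df.ndim == len(df.shape)    # one entry in shape per dimension

# list() turns the Index objects into plain Python lists
print(list(df.columns))
print(list(df.index))
```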


## ➕ Basic Operations
Arithmetic and aggregation on a column apply to every element at once, with no explicit loop.

In [4]:
print("Average Age:", df['Age'].mean())
print("Age After Adding 5:\n", df['Age'] + 5)

print("Salary in Thousands:\n", df['Salary']/1000)

Average Age: 25.0
Age After Adding 5:
0    27
1    30
2    33
Name: Age, dtype: int64
Salary in Thousands:
0    50.0
1    60.0
2    70.0
Name: Salary, dtype: float64
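
The expressions in the cell above return new Series and leave `df` unchanged. To keep a result, assign it to a column. The sketch below shows this along with `sum()`; the column name `Salary_k` is a hypothetical choice for the example.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Ali", "Sara", "Umar"],
    "Age": [22, 25, 28],
    "Salary": [50000, 60000, 70000],
})

# Aggregation: reduce a whole column to a single value
total_salary = df["Salary"].sum()

# Element-wise arithmetic returns a new Series; df itself is untouched
in_thousands = df["Salary"] / 1000
assert "Salary_k" not in df.columns

# Assigning the result stores it as a new column (hypothetical name)
df["Salary_k"] = in_thousands

print("Total salary:", total_salary)
print(df)
```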


## 🔧 Common Functions
Useful functions to explore and analyze data:

- `head()` / `tail()` → Preview rows
- `info()` → Dataset info
- `describe()` → Statistical summary
- `unique()` / `nunique()` → Unique values
- `value_counts()` → Count unique values
- `sort_values()` → Sort data
- `apply()` → Apply custom function

In [5]:
print("First 2 rows:\n", df.head(2))
print("\nLast row:\n", df['Name'].tail(1))
print("\nDataset Info:\n")
df.info()
print("\nStatistical Summary:\n", df.describe())
print("\nUnique Ages:", df['Age'].unique())
print("Number of Unique Ages:", df['Age'].nunique())

First 2 rows:
   Name  Age  Salary
0   Ali   22   50000
1  Sara   25   60000

Last row:
2    Umar
Name: Name, dtype: object

Dataset Info:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   Name    3 non-null      object
 1   Age     3 non-null      int64
 2   Salary  3 non-null      int64
dtypes: int64(2), object(1)
memory usage: 200.0+ bytes

Statistical Summary:
             Age        Salary
count   3.000000      3.000000
mean   25.000000  60000.000000
std     3.000000  10000.000000
min    22.000000  50000.000000
25%    23.500000  55000.000000
50%    25.000000  60000.000000
75%    26.500000  65000.000000
max    28.000000  70000.000000

Unique Ages: [22 25 28]
Number of Unique Ages: 3
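
The list at the top of this section also names `value_counts()`, `sort_values()`, and `apply()`, which the cell above does not exercise. Here is a minimal sketch on the same data. The `salary_band` helper and its 60000 threshold are hypothetical, chosen only to give `apply()` something to do.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Ali", "Sara", "Umar"],
    "Age": [22, 25, 28],
    "Salary": [50000, 60000, 70000],
})

# value_counts(): how often each value occurs (every age is unique here)
age_counts = df["Age"].value_counts()

# sort_values(): order rows by a column, highest salary first
by_salary = df.sort_values("Salary", ascending=False)

def salary_band(salary):
    """Hypothetical helper: label a salary using an illustrative threshold."""
    return "high" if salary >= 60000 else "standard"

# apply(): run a custom function on each element of a column
bands = df["Salary"].apply(salary_band)

print(age_counts)
print(by_salary)
print(bands)
```

`sort_values()` returns a sorted copy and keeps the original index labels, so `df` stays in its original order unless you reassign the result.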


## 📝 Summary
- Pandas is a key library for data manipulation and analysis.
- Important attributes (`shape`, `columns`, `dtypes`, `index`) help inspect data structure.
- Functions like `head()`, `tail()`, `info()`, `describe()`, `unique()`, `nunique()` aid exploration.
- Element-wise arithmetic and aggregation methods such as `mean()` and `sum()`, along with `apply()` for custom functions, simplify calculations.

**Next:** Importing Data with `read_csv()` → `04_Importing_Data_with_read_csv.ipynb`