<a href="https://colab.research.google.com/github/Tanu-N-Prabhu/Python/blob/master/10_Must_Know_Pandas_Tricks_for_Data_Science_Beginners.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 10 Must-Know Pandas Tricks for Data Science Beginners

If you’re learning Data Science, chances are you’ve already met Pandas, the go-to Python library for handling data. But beyond the basics of `read_csv` and `df.head()`, Pandas hides many powerful tricks that can save time, reduce bugs, and make your code cleaner.

In this article, I’ll share 10 Pandas tricks that every beginner should know, with simple examples you can try in Google Colab or Jupyter Notebook.

## 1. Quickly Check Data Info

In [1]:
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, None],
    "City": ["NY", "LA", "NY"]
})

df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   Name    3 non-null      object 
 1   Age     2 non-null      float64
 2   City    3 non-null      object 
dtypes: float64(1), object(2)
memory usage: 204.0+ bytes


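> Shows column names, data types, and non-null counts per column. Missing values show up as a non-null count below the number of rows (here, `Age` has 2 non-null entries out of 3).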
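
If you want to work with that report as text, or turn the non-null counts into missing counts, something like the sketch below may help. It rebuilds the sample frame so it runs on its own; the names `report` and `missing` are just illustrative.

```python
import io

import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, None],
    "City": ["NY", "LA", "NY"],
})

# Write the report into a buffer instead of printing it directly.
buf = io.StringIO()
df.info(buf=buf, memory_usage="deep")  # "deep" also measures the stored strings
report = buf.getvalue()
print(report)

# Missing entries per column = number of rows minus the non-null count.
missing = len(df) - df.count()
print(missing)
```

`memory_usage="deep"` takes longer on large frames, but it removes the `+` from the memory line because the text columns are measured rather than estimated.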

## 2. Find Missing Values Fast

In [2]:
df.isnull().sum()

Name    0
Age     1
City    0
dtype: int64


> Tells you how many missing values each column has.
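
On bigger datasets a percentage is often easier to read than a raw count. A minimal sketch, assuming the same sample frame as above (rebuilt here so the snippet is self-contained):

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, None],
    "City": ["NY", "LA", "NY"],
})

# isna() gives True/False per cell; the mean of booleans is the share of True.
missing_pct = df.isna().mean() * 100
print(missing_pct.round(1))

# Rows that contain at least one missing value.
rows_with_gaps = df[df.isna().any(axis=1)]
print(rows_with_gaps)
```

`isna()` and `isnull()` are aliases, so use whichever reads better to you.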

## 3. Drop Duplicates

In [3]:
df.drop_duplicates(inplace=True)
df

      Name   Age City
0    Alice  25.0   NY
1      Bob  30.0   LA
2  Charlie   NaN   NY


> Removes rows that exactly repeat an earlier row. The sample data has no duplicates, so the output is unchanged here.
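
Since the sample frame has nothing to drop, here is a small sketch with a made-up frame (`people` is a hypothetical example, not part of the data above) that shows the `subset` and `keep` parameters in action:

```python
import pandas as pd

# Hypothetical data with one exact duplicate and one repeated name.
people = pd.DataFrame({
    "Name": ["Alice", "Bob", "Alice", "Bob"],
    "City": ["NY", "LA", "NY", "SF"],
})

# Full-row comparison: only the second ("Alice", "NY") row is dropped.
exact = people.drop_duplicates()

# Compare on "Name" only and keep the last occurrence of each name.
by_name = people.drop_duplicates(subset="Name", keep="last")

print(exact)
print(by_name)
```

Without `inplace=True`, `drop_duplicates` returns a new DataFrame and leaves the original untouched, which can be handy while you are still exploring.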

## 4. Rename Columns Easily

In [4]:
df.rename(columns={"Name": "Full_Name"}, inplace=True)
df

  Full_Name   Age City
0     Alice  25.0   NY
1       Bob  30.0   LA
2   Charlie   NaN   NY


> Renames only the columns you list, so there is no need to rewrite the whole column list.
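
The same method can rename several columns at once, or apply a function to every column name. A short sketch, with `Home_City` as an illustrative name of my own choosing:

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Alice"], "Age": [25], "City": ["NY"]})

# Rename several columns in one call; unlisted columns keep their names.
renamed = df.rename(columns={"Name": "Full_Name", "City": "Home_City"})

# Pass a function to transform every column name, e.g. lowercase them all.
lowered = renamed.rename(columns=str.lower)

print(list(renamed.columns))
print(list(lowered.columns))
```

Both calls return a new DataFrame, so `df` still has its original column names afterwards.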

## 5. Apply Functions to Columns

In [5]:
df["Age_Future"] = df["Age"].apply(lambda x: x + 5 if pd.notnull(x) else x)
df

  Full_Name   Age City  Age_Future
0     Alice  25.0   NY        30.0
1       Bob  30.0   LA        35.0
2   Charlie   NaN   NY         NaN


> Adds 5 years to each age and leaves missing values as they are.
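
For simple arithmetic like this, plain column math usually does the same job without `apply`, because `NaN` stays `NaN` under addition. A minimal sketch comparing the two (the variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({"Age": [25, 30, None]})

# Element-by-element version, as in the cell above.
via_apply = df["Age"].apply(lambda x: x + 5 if pd.notnull(x) else x)

# Vectorized version: missing values propagate on their own.
vectorized = df["Age"] + 5

print(vectorized)
print(via_apply.equals(vectorized))
```

`apply` is still the right tool when the logic does not map onto built-in operations, but the vectorized form tends to be faster on large columns.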

## 6. Value Counts (Frequency Check)



In [6]:
df["City"].value_counts()

City
NY    2
LA    1
Name: count, dtype: int64


> Quickly see how often each category appears.
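
Two optional parameters are worth knowing: `normalize` for proportions and `dropna` for counting missing values too. The sketch below uses a hypothetical `cities` Series with one missing entry so the difference is visible:

```python
import pandas as pd

# Hypothetical Series with one missing value.
cities = pd.Series(["NY", "LA", "NY", None], name="City")

counts = cities.value_counts()                    # missing values are skipped
shares = cities.value_counts(normalize=True)      # fractions of the non-missing values
with_missing = cities.value_counts(dropna=False)  # missing values get their own row

print(counts)
print(shares)
print(with_missing)
```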

## 7. Filter with Conditions


In [7]:
df[df["Age"] > 25]

  Full_Name   Age City  Age_Future
1       Bob  30.0   LA        35.0


> Selects only the rows where age is greater than 25. Rows with a missing age are left out, because a comparison with `NaN` is always `False`.
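
Combining conditions trips up many beginners: pandas uses `&` and `|` instead of `and` and `or`, and each condition needs its own parentheses. A small sketch, assuming the renamed sample frame (rebuilt here so it runs on its own):

```python
import pandas as pd

df = pd.DataFrame({
    "Full_Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, None],
    "City": ["NY", "LA", "NY"],
})

# Wrap each condition in parentheses; & means "and", | means "or".
ny_adults = df[(df["Age"] >= 25) & (df["City"] == "NY")]

# The same filter written as a query string.
same_result = df.query("Age >= 25 and City == 'NY'")

print(ny_adults)
```

`query` can be easier to read when there are many conditions, while boolean masks are easier to build up step by step.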

## 8. Sort by Columns

In [8]:
df.sort_values(by="Age", ascending=False)

  Full_Name   Age City  Age_Future
1       Bob  30.0   LA        35.0
0     Alice  25.0   NY        30.0
2   Charlie   NaN   NY         NaN


> Sorts the DataFrame by a column. Missing values go last by default, whether the order is ascending or descending.
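
You can also control where missing values land and sort by more than one column. A minimal sketch on the same sample data (rebuilt here; the result names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "Full_Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, None],
    "City": ["NY", "LA", "NY"],
})

# Put rows with a missing age at the top instead of the bottom.
missing_first = df.sort_values(by="Age", ascending=False, na_position="first")

# Sort by city A-Z, then by age from oldest to youngest within each city.
by_city_then_age = df.sort_values(by=["City", "Age"], ascending=[True, False])

print(missing_first)
print(by_city_then_age)
```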

## 9. Group By + Aggregation

In [9]:
df.groupby("City")["Age"].mean()

City
LA    30.0
NY    25.0
Name: Age, dtype: float64


> Calculates the average age per city. Missing ages are skipped, which is why NY averages 25.0 from Alice alone.
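
When you need more than one statistic per group, named aggregation keeps the result readable. In this sketch the output column names (`mean_age`, `people`, `known_ages`) are my own picks, not anything pandas requires:

```python
import pandas as pd

df = pd.DataFrame({
    "Full_Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, None],
    "City": ["NY", "LA", "NY"],
})

# Each keyword is an output column: (source column, aggregation).
summary = df.groupby("City").agg(
    mean_age=("Age", "mean"),
    people=("Full_Name", "count"),
    known_ages=("Age", "count"),  # count ignores missing values
)

print(summary)
```

Comparing `people` with `known_ages` shows at a glance how many values each average is really based on.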

## 10. Describe Your Data

In [10]:
df.describe()

             Age  Age_Future
count   2.000000    2.000000
mean   27.500000   32.500000
std     3.535534    3.535534
min    25.000000   30.000000
25%    26.250000   31.250000
50%    27.500000   32.500000
75%    28.750000   33.750000
max    30.000000   35.000000


> Quick summary of the numeric columns: count, mean, standard deviation, min, max, and percentiles. It's great for EDA.
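
By default `describe()` skips text columns when numeric ones are present. A short sketch, assuming the same sample data, of how to include them and how to ask for different percentiles:

```python
import pandas as pd

df = pd.DataFrame({
    "Full_Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, None],
    "City": ["NY", "LA", "NY"],
})

# include="all" adds unique / top / freq rows for the text columns.
all_summary = df.describe(include="all")
print(all_summary)

# Ask for the 10th and 90th percentiles; the median (50%) is always included.
tails = df.describe(percentiles=[0.1, 0.9])
print(tails)
```

Cells that do not apply to a column (for example `mean` for `City`) show up as `NaN` in the combined table.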

# Conclusion

Mastering Pandas isn’t just about memorizing functions; it’s about learning shortcuts and best practices that make your workflow smoother.

With these 10 tricks, you’ll spend less time debugging and more time analyzing your data like a pro.