### What is Pandas?
* Pandas is a Python library for working with structured data. It is especially useful when you're working with tables, spreadsheets, CSV files, or any other kind of tabular data.

### What is it used for?
* Reading data (from CSV, Excel, SQL, JSON, etc.)

* Cleaning data (handling missing values, fixing columns)

* Analyzing data (filtering, grouping, summarizing)

* Visualizing simple data trends (with the help of other libraries like Matplotlib)

* Manipulating data (adding/deleting columns, changing values)

### Pandas Library – Intermediate Overview
#### 1. DataFrame and Series (Recap)
* A Series is like a single column (1D).

* A DataFrame is like a full Excel sheet with rows and columns (2D).

#### 2. Indexing
Pandas lets you control row and column indexing:

* You can rename the index or set a specific column as the index.

* There are two main ways to access data:

  * Label-based: using .loc (e.g., by row name)

  * Position-based: using .iloc (e.g., by row number)
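
The indexing options above can be sketched roughly as follows. The `students` table and its values are hypothetical sample data, made up only for illustration.

```python
import pandas as pd

# Hypothetical sample data, not taken from the notebook cells below
students = pd.DataFrame(
    {
        "name": ["Pratik", "Pushkar", "Akshay"],
        "age": [21, 22, 23],
        "city": ["Pune", "Mumbai", "Nashik"],
    }
)

# Set a specific column as the row index
by_name = students.set_index("name")

# Label-based access: the row labelled "Pushkar", the column labelled "city"
city_by_label = by_name.loc["Pushkar", "city"]

# Position-based access: second row and second column (both at position 1)
city_by_position = by_name.iloc[1, 1]

# Rename the index itself
by_name = by_name.rename_axis("student")

print(city_by_label, city_by_position)
```

Both lookups return the same cell here. `.loc` finds it by label, while `.iloc` finds it by integer position.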

#### 3. Data Cleaning
Real-world data is often messy. Pandas helps with:

* Handling missing values

* Fixing data types (like string vs number)

* Removing duplicates

* Replacing or renaming values or columns
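
A minimal sketch of these cleaning steps, assuming a small hypothetical table `raw` with a duplicated row, numbers stored as strings, one missing age, and inconsistent city spelling:

```python
import pandas as pd

# Hypothetical messy data, made up for illustration
raw = pd.DataFrame(
    {
        "Name": ["Pratik", "Pushkar", "Pushkar", "Akshay"],
        "age": ["21", "22", "22", None],  # numbers stored as strings, one missing
        "city": ["Pune", "pune", "pune", "Pune"],
    }
)

clean = (
    raw.drop_duplicates()                    # remove the repeated "Pushkar" row
    .rename(columns={"Name": "name"})        # rename a column
    .replace({"city": {"pune": "Pune"}})     # replace a value in one column
)

# Fix the data type: strings become numbers, the missing value becomes NaN
clean["age"] = pd.to_numeric(clean["age"], errors="coerce")

# Handle the missing value by filling it with the column mean
clean["age"] = clean["age"].fillna(clean["age"].mean())

print(clean)
```

Filling with the mean is only one possible choice. Dropping the row with `dropna()` is another common option, depending on the data.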

#### 4. Aggregation and Grouping
Pandas allows you to group data based on a column and then perform calculations like:

* Sum, average, count, min, max

This is helpful for getting insights from categories (e.g., sales by region).
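
As a rough sketch of the "sales by region" idea, the example below groups a hypothetical `sales` table by region and computes several aggregates at once. The column names and numbers are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sales data, made up for illustration
sales = pd.DataFrame(
    {
        "region": ["North", "South", "North", "South", "East"],
        "amount": [100, 150, 200, 50, 300],
    }
)

# Group rows by region, then aggregate the "amount" column
summary = sales.groupby("region")["amount"].agg(["sum", "mean", "count", "min", "max"])

print(summary)
```

The result has one row per region and one column per aggregate.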

#### 5. Merging and Joining
You can combine multiple DataFrames using:

* Merge (like SQL joins)

* Concatenation (stacking data vertically or horizontally)

This is useful when working with data from different sources.
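
A small sketch of merging and concatenating, assuming two hypothetical tables that share a `student_id` column:

```python
import pandas as pd

# Hypothetical tables, made up for illustration
students = pd.DataFrame(
    {"student_id": [1, 2, 3], "name": ["Pratik", "Pushkar", "Akshay"]}
)
marks = pd.DataFrame({"student_id": [1, 2, 4], "marks": [85, 90, 70]})

# Inner join: keep only ids present in both tables (1 and 2)
inner = students.merge(marks, on="student_id", how="inner")

# Left join: keep every student; marks is NaN where there is no match (id 3)
left = students.merge(marks, on="student_id", how="left")

# Vertical concatenation: stack rows from another table with the same columns
more_students = pd.DataFrame({"student_id": [5], "name": ["Vedant"]})
stacked = pd.concat([students, more_students], ignore_index=True)

# Horizontal concatenation: place columns side by side, aligned on the row index
# (note that this aligns by index, not by student_id)
side_by_side = pd.concat([students, marks[["marks"]]], axis=1)

print(inner)
print(left)
```

`merge` matches rows on key values, while `concat` only lines tables up by their index, so the two are not interchangeable.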

#### 6. Data Transformation
Pandas allows you to:

* Add or remove columns

* Apply functions to rows or columns

* Reorder or sort data

* Convert between data types
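
These transformations might look like the sketch below. The `scores` table and the pass mark of 75 are hypothetical, chosen only for illustration.

```python
import pandas as pd

# Hypothetical data, made up for illustration
scores = pd.DataFrame(
    {
        "name": ["Pratik", "Pushkar", "Akshay"],
        "marks": ["72", "91", "85"],  # stored as strings on purpose
        "sem": ["VII", "VII", "VII"],
    }
)

# Convert between data types: string -> integer
scores["marks"] = scores["marks"].astype(int)

# Add a column computed from an existing one
scores["passed"] = scores["marks"] >= 75

# Apply a function to every value in a column
scores["name_length"] = scores["name"].apply(len)

# Remove a column
scores = scores.drop(columns=["sem"])

# Sort rows by marks, highest first, and renumber the index
scores = scores.sort_values("marks", ascending=False).reset_index(drop=True)

print(scores)
```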

You can also reshape your data using methods like:

* pivot or pivot_table (make long data wide by turning the values of one column into new columns; pivot_table can also aggregate)

* melt (make wide data long)
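
A brief sketch of reshaping in both directions, assuming a hypothetical long-format table with one row per student and subject:

```python
import pandas as pd

# Hypothetical long-format data, made up for illustration
long_df = pd.DataFrame(
    {
        "name": ["Pratik", "Pratik", "Pushkar", "Pushkar"],
        "subject": ["Maths", "Physics", "Maths", "Physics"],
        "marks": [80, 75, 90, 85],
    }
)

# pivot: long -> wide, one row per student and one column per subject
wide = long_df.pivot(index="name", columns="subject", values="marks")

# melt: wide -> long again, one row per (student, subject) pair
back_to_long = wide.reset_index().melt(
    id_vars="name", var_name="subject", value_name="marks"
)

# pivot_table: like pivot, but aggregates the values (here, mean marks per subject)
avg_by_subject = long_df.pivot_table(index="subject", values="marks", aggfunc="mean")

print(wide)
print(avg_by_subject)
```

`pivot` raises an error if an index/column pair appears more than once, which is where `pivot_table` with an aggregation function becomes the better fit.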

#### 7. Time Series Support
Pandas has built-in support for handling time-based data:

* Date/time parsing

* Resampling (e.g., from daily to monthly)

* Rolling averages and trends over time
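
The three features above can be sketched on a tiny hypothetical daily series. The dates and values are made up, and the `"MS"` alias (month start) is used for the monthly resample.

```python
import pandas as pd

# Date/time parsing: strings -> datetime values (hypothetical dates)
dates = pd.to_datetime(
    ["2024-01-01", "2024-01-02", "2024-01-03", "2024-02-01", "2024-02-02"]
)
daily = pd.Series([10, 20, 30, 40, 60], index=dates)

# Resampling: daily -> monthly mean, labelled by the first day of each month
monthly_mean = daily.resample("MS").mean()

# Rolling average over a window of 2 observations
rolling_mean = daily.rolling(window=2).mean()

print(monthly_mean)
print(rolling_mean)
```

The first rolling value is NaN because a window of 2 needs two observations before it can produce a result.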

#### 8. Working with Files
Pandas supports:

* Reading/writing CSV, Excel, SQL, JSON, Parquet, and more

* Loading large datasets from disk or cloud storage
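
A minimal round-trip sketch for CSV and JSON. To keep it self-contained it writes to in-memory buffers instead of real files, and the `people` table is a hypothetical example. In practice you would pass a file path instead.

```python
import io

import pandas as pd

# Hypothetical data, made up for illustration
people = pd.DataFrame({"name": ["Pratik", "Pushkar"], "age": [21, 21]})

# Write to CSV (an in-memory buffer stands in for a file on disk)
csv_buffer = io.StringIO()
people.to_csv(csv_buffer, index=False)

# Read the CSV back
csv_buffer.seek(0)
from_csv = pd.read_csv(csv_buffer)

# For large files, read_csv can yield the data in chunks instead of all at once
csv_buffer.seek(0)
total_rows = sum(len(chunk) for chunk in pd.read_csv(csv_buffer, chunksize=1))

# JSON round trip
json_text = people.to_json(orient="records")
from_json = pd.read_json(io.StringIO(json_text))

print(from_csv)
print(total_rows)
```

Excel, SQL, and Parquet follow the same `read_*` / `to_*` pattern but need extra packages installed.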



In [1]:
!pip install pandas

In [3]:
import pandas as pd 

In [7]:
# Build a one-dimensional Series from a Python list
data = [1, 2, 3, 4, 5]
print("series")
print(pd.Series(data))

series
0    1
1    2
2    3
3    4
4    5
dtype: int64


In [5]:
pd.DataFrame(data)

   0
0  1
1  2
2  3
3  4
4  5


In [31]:
# Create a DataFrame from a dictionary of lists

dic = {
    "name":["pratik","Pushkar" , "Akshay" , "Vedant" , "Mayur" , "Jayesh"],
    "age" : [21,21,21,21,21,21],
    "City" : ["Pune","Pune","Pune","Pune","Pune","Pune"],
    "Address" : ["Kamgarnagar","Kamgarnagar","Kamgarnagar","nigdi","sara","sara"],
    "Sem" : ["VII","VII","VII","VII","VII","VII"]
}

print(pd.DataFrame(dic))

      name  age  City      Address  Sem
0   pratik   21  Pune  Kamgarnagar  VII
1  Pushkar   21  Pune  Kamgarnagar  VII
2   Akshay   21  Pune  Kamgarnagar  VII
3   Vedant   21  Pune        nigdi  VII
4    Mayur   21  Pune         sara  VII
5   Jayesh   21  Pune         sara  VII


In [34]:
df = pd.DataFrame(dic)
df

      name  age  City      Address  Sem
0   pratik   21  Pune  Kamgarnagar  VII
1  Pushkar   21  Pune  Kamgarnagar  VII
2   Akshay   21  Pune  Kamgarnagar  VII
3   Vedant   21  Pune        nigdi  VII
4    Mayur   21  Pune         sara  VII
5   Jayesh   21  Pune         sara  VII


In [9]:
# A dictionary of dictionaries: each Series value is one student's record
dic1 = {
    "student1" : {"name" : "Pratik" , "age" : 21 , "city" : "Pune"},
    "student2" : {"name" : "Pushkar" , "age" : 21 , "city" : "Pune"},
    "student3" : {"name" : "Akshay" , "age" : 21 , "city" : "Pune"}
}

print(pd.Series(dic1))

student1     {'name': 'Pratik', 'age': 21, 'city': 'Pune'}
student2    {'name': 'Pushkar', 'age': 21, 'city': 'Pune'}
student3     {'name': 'Akshay', 'age': 21, 'city': 'Pune'}
dtype: object


In [25]:
df = pd.DataFrame([dic])