### Using Pandas to Create and Manipulate Series and DataFrames



#### Problem Statement:
Pandas is a Python library for manipulating labeled one-dimensional data (Series) and tabular data (DataFrames). In this lab, we create Series and DataFrames and practice indexing, filtering, sorting, handling missing values, and grouping data.



#### Objective:
- Understand the basics of Pandas Series and DataFrames.
- Perform data selection, filtering, and sorting operations.
- Handle missing values in a DataFrame.
- Perform data aggregation using grouping.
- Export and import data using CSV files.



#### Requirements:
- Python installed (version 3.x recommended).
- Pandas and NumPy libraries installed (`pip install pandas numpy`).
- Jupyter Notebook (optional but recommended for running the lab).


**Part 1: Importing Libraries and Creating a Pandas Series**

In [1]:
# Import the necessary libraries
import pandas as pd
import numpy as np

In [2]:
# Creating a Pandas Series
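A Series can be built from a list with explicit labels, from a dictionary, or from a NumPy array. The following is a minimal sketch of those three routes; the variable names `from_list`, `from_dict`, and `from_array` are illustrative and are not used elsewhere in the lab.

```python
import numpy as np
import pandas as pd

# From a list, with explicit labels as the index
from_list = pd.Series([10, 20, 30, 40, 50], index=["a", "b", "c", "d", "e"])

# From a dictionary: the keys become the index
from_dict = pd.Series({"a": 10, "b": 20, "c": 30})

# From a NumPy array: a default integer index (0, 1, 2, ...) is assigned
from_array = pd.Series(np.array([1.5, 2.5, 3.5]))

print(from_list)
print(from_dict)
print(from_array)
```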


In [3]:
# Accessing elements in Series
data = pd.Series([10, 20, 30, 40, 50], index=["a", "b", "c", "d", "e"])
result = data[["b", "d"]]
print(result)


b    20
d    40
dtype: int64


In [4]:
# Label-based slicing: the end label "d" is included in the result
index = pd.Series([10, 20, 30, 40, 50], index=["a", "b", "c", "d", "e"])
index[:"d"]

a    10
b    20
c    30
d    40
dtype: int64
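Slicing by label includes the end label, while slicing by position excludes the end position. A short sketch of the difference, using the explicit `.loc` and `.iloc` accessors (the names `s`, `by_label`, and `by_position` are illustrative):

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50], index=["a", "b", "c", "d", "e"])

# Label-based slice: the end label "d" is included
by_label = s.loc[:"d"]

# Position-based slice: the end position 3 is excluded
by_position = s.iloc[:3]

print(by_label.tolist())     # [10, 20, 30, 40]
print(by_position.tolist())  # [10, 20, 30]
```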

In [5]:
print("Accessing a single element in Series:", index["b"])
print("Accessing multiple elements:")
print(index[['b', 'd']])


Accessing a single element in Series: 20
Accessing multiple elements:
b    20
d    40
dtype: int64
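A single label returns a scalar, whereas a list of labels returns a new Series. The sketch below also shows `Series.get`, which returns a default value for a label that does not exist instead of raising `KeyError`. The label `"z"` and the default `-1` are arbitrary choices for illustration.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50], index=["a", "b", "c", "d", "e"])

single = s["b"]            # one label -> a scalar value
several = s[["b", "d"]]    # a list of labels -> a new Series
missing = s.get("z", -1)   # absent label -> the supplied default

print(single)
print(several.tolist())
print(missing)
```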


**Part 2: Creating a Pandas DataFrame**

In [6]:
# Creating a Pandas DataFrame
data = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Age": [25, 30, 35, 40],
    "City": ["New York", "Los Angeles", "Chicago", "Houston"]
})


data


      Name  Age         City
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago
3    David   40      Houston
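After building a DataFrame, it is often useful to check its size, column labels, and column types before manipulating it. A minimal sketch using the same data (the variable name `df` is illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Age": [25, 30, 35, 40],
    "City": ["New York", "Los Angeles", "Chicago", "Houston"],
})

print(df.shape)          # (number of rows, number of columns)
print(list(df.columns))  # column labels
print(df.dtypes)         # data type of each column
print(df.head(2))        # first two rows
```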


**Part 3: Manipulating DataFrames**


In [7]:
# Adding a new column, data['Salary']
data = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Age": [25, 30, 35, 40],
    "City": ["New York", "Los Angeles", "Chicago", "Houston"]
})


data['Salary'] = [50000, 60000, 70000, 80000]


data


      Name  Age         City  Salary
0    Alice   25     New York   50000
1      Bob   30  Los Angeles   60000
2  Charlie   35      Chicago   70000
3    David   40      Houston   80000
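Assigning to `data['Salary']` modifies the DataFrame in place. As an alternative sketch, `DataFrame.assign` returns a new DataFrame with the extra column and leaves the original untouched. The derived column `SalaryK` (salary in thousands) is a hypothetical example, not part of the lab.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Salary": [50000, 60000, 70000, 80000],
})

# assign() returns a copy with the new column; df itself is unchanged
with_k = df.assign(SalaryK=df["Salary"] // 1000)

print(with_k)
print("SalaryK" in df.columns)  # False
```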


In [8]:
# Selecting a single column (returns a Series)
names = data["Name"]

print(names)


0      Alice
1        Bob
2    Charlie
3      David
Name: Name, dtype: object


In [9]:
# Selecting multiple columns
columns = data[["Name", "Age"]]
columns


      Name  Age
0    Alice   25
1      Bob   30
2  Charlie   35
3    David   40
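Single brackets with one column name return a Series, while double brackets (a list of names) return a DataFrame, even when the list holds only one name. The sketch below illustrates this and also uses `.loc` to select rows and columns in one step; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Age": [25, 30, 35, 40],
    "City": ["New York", "Los Angeles", "Chicago", "Houston"],
})

one = df["Name"]     # Series
two = df[["Name"]]   # DataFrame with a single column

print(type(one).__name__)  # Series
print(type(two).__name__)  # DataFrame

# Rows where Age > 30, restricted to two columns
subset = df.loc[df["Age"] > 30, ["Name", "City"]]
print(subset)
```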


In [10]:
# Filtering rows based on condition
filtered_data = data[data["Age"] > 30]
filtered_data


      Name  Age     City  Salary
2  Charlie   35  Chicago   70000
3    David   40  Houston   80000
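Conditions can be combined with `&` (and) and `|` (or); each condition needs its own parentheses because these operators bind more tightly than comparisons. A minimal sketch, with thresholds chosen only for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Age": [25, 30, 35, 40],
    "City": ["New York", "Los Angeles", "Chicago", "Houston"],
    "Salary": [50000, 60000, 70000, 80000],
})

# Both conditions must hold
mask = (df["Age"] > 25) & (df["Salary"] < 80000)
both = df[mask]

# Membership test against a list of values
in_cities = df[df["City"].isin(["Chicago", "Houston"])]

print(both["Name"].tolist())       # ['Bob', 'Charlie']
print(in_cities["Name"].tolist())  # ['Charlie', 'David']
```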


In [11]:
# Sorting DataFrame
sorted_data = data.sort_values(by="Age", ascending=False)

sorted_data

      Name  Age         City  Salary
3    David   40      Houston   80000
2  Charlie   35      Chicago   70000
1      Bob   30  Los Angeles   60000
0    Alice   25     New York   50000
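Note that the sorted result keeps the original index labels (3, 2, 1, 0). The sketch below sorts by more than one column and then renumbers the rows with `reset_index(drop=True)`. The data here is hypothetical: cities are repeated so that the second sort key has an effect.

```python
import pandas as pd

# Hypothetical data with repeated cities so the tie-breaking key matters
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Age": [25, 30, 35, 40],
    "City": ["Chicago", "Houston", "Chicago", "Houston"],
})

# City ascending, then Age descending within each city
ordered = df.sort_values(by=["City", "Age"], ascending=[True, False])

# Replace the shuffled index labels with 0, 1, 2, ...
ordered = ordered.reset_index(drop=True)

print(ordered)
```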

