# Data Import and Manipulation Activity
In this activity, you will import a CSV file and an Excel file with pandas, then rename a column, filter rows, and sort the results.

## Set Up a Virtual Environment

In the terminal, navigate to your project folder and run the following command to create a virtual environment:

```bash
python -m venv .venv
```

Activate the virtual environment by running:

- **Windows**: `.venv\Scripts\activate`
- **macOS/Linux**: `source .venv/bin/activate`

Once activated, you should see your virtual environment’s name in the terminal prompt.
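
As an optional check from inside Python, the snippet below is a minimal sketch that reports whether the current interpreter belongs to a virtual environment. The helper name `in_virtual_environment` is an assumption made for this illustration; it relies on the standard-library behavior that `sys.prefix` and `sys.base_prefix` differ inside an environment created with `venv`.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("Interpreter:", sys.executable)
print("Inside a virtual environment:", in_virtual_environment())
```

If the second line prints `False`, the environment is probably not activated in that terminal or notebook session.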

## Select Kernel
In your notebook editor, open the kernel dropdown and choose the Python interpreter from the virtual environment you just created (e.g., `.venv`).

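## Install pandas and openpyxl

A new virtual environment starts without third-party packages. With the virtual environment activated, install pandas together with openpyxl, which pandas uses to read `.xlsx` files:

```bash
pip install pandas openpyxl
```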
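
To confirm that the active environment can see the libraries this activity uses, a small check along these lines may help. It is a sketch only: `missing_packages` is a hypothetical helper name, and it relies on `importlib.util.find_spec` returning `None` for a top-level package that is not installed.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


# pandas reads the files; openpyxl is the engine pandas uses for .xlsx files.
missing = missing_packages(["pandas", "openpyxl"])
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("pandas and openpyxl are both available.")
```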

### Part 1: Importing Data

In [1]:
import pandas as pd

# Load the CSV file
csv_data = pd.read_csv('../data/example_data.csv')
print('CSV Data:')
csv_data.head()

CSV Data:


|   | Name    | Age | Score |
|---|---------|-----|-------|
| 0 | Alice   | 23  | 88    |
| 1 | Bob     | 34  | 92    |
| 2 | Charlie | 45  | 95    |
| 3 | David   | 28  | 78    |
| 4 | Eve     | 30  | 85    |
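
After loading a file, it is often worth checking its shape, column types, and missing values before manipulating it. The sketch below uses an in-memory stand-in for `../data/example_data.csv` built from the five rows shown above; the assumption that the file holds exactly the columns `Name`, `Age`, and `Score` is made for illustration and may not match the real file.

```python
import io

import pandas as pd

# In-memory stand-in for the CSV file (assumed layout: Name, Age, Score).
csv_text = """Name,Age,Score
Alice,23,88
Bob,34,92
Charlie,45,95
David,28,78
Eve,30,85
"""

sample = pd.read_csv(io.StringIO(csv_text))

print(sample.shape)         # (number of rows, number of columns)
print(sample.dtypes)        # the type pandas inferred for each column
print(sample.isna().sum())  # count of missing values per column
```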


In [2]:
# Load the Excel file
excel_data = pd.read_excel('../data/example_data.xlsx')
print('Excel Data:')
excel_data.head()

Excel Data:


|   | Name    | Age | Score |
|---|---------|-----|-------|
| 0 | Alice   | 23  | 88    |
| 1 | Bob     | 34  | 92    |
| 2 | Charlie | 45  | 95    |
| 3 | David   | 28  | 78    |
| 4 | Eve     | 30  | 85    |
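
Both files appear to contain the same rows, and that can be checked in code rather than by eye. The following is a minimal sketch: `csv_like` and `excel_like` are hypothetical stand-ins built in memory, and in the notebook you would compare `csv_data` and `excel_data` instead.

```python
import pandas as pd

rows = {
    "Name": ["Alice", "Bob", "Charlie", "David", "Eve"],
    "Age": [23, 34, 45, 28, 30],
    "Score": [88, 92, 95, 78, 85],
}

# Stand-ins for the frames loaded from the CSV and Excel files.
csv_like = pd.DataFrame(rows)
excel_like = pd.DataFrame(rows)

# equals() is True only when shape, values, and dtypes all match.
same = csv_like.equals(excel_like)
print("Frames are identical:", same)

# assert_frame_equal raises an AssertionError describing the first mismatch.
pd.testing.assert_frame_equal(csv_like, excel_like)
```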


### Part 2: Basic Data Manipulations

In [3]:
# Rename the 'Name' column to 'Full Name'
csv_data_renamed = csv_data.rename(columns={'Name': 'Full Name'})
excel_data_renamed = excel_data.rename(columns={'Name': 'Full Name'})
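
`rename` returns a new DataFrame and leaves the original untouched, which is why the result is assigned to a new variable. The sketch below illustrates this on a small hypothetical frame named `people`. It also shows that a key that matches no column is ignored by default, while `errors="raise"` turns that case into a `KeyError`.

```python
import pandas as pd

people = pd.DataFrame({"Name": ["Alice", "Bob"], "Age": [23, 34], "Score": [88, 92]})

renamed = people.rename(columns={"Name": "Full Name"})
print(list(people.columns))   # the original frame keeps 'Name'
print(list(renamed.columns))  # the new frame has 'Full Name'

# A misspelled key ('name' instead of 'Name') is ignored unless errors="raise".
try:
    people.rename(columns={"name": "Full Name"}, errors="raise")
except KeyError as exc:
    print("Rename failed:", exc)
```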

In [4]:
# Keep only the rows where Age is greater than 25
csv_filtered = csv_data_renamed[csv_data_renamed['Age'] > 25]
excel_filtered = excel_data_renamed[excel_data_renamed['Age'] > 25]
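
The filter works by building a boolean mask, one `True` or `False` per row, and using it to select rows. The sketch below makes the mask explicit on a hypothetical frame `people` with the sample rows, and adds a second condition (`Score >= 90`, a threshold chosen only for illustration) to show how masks combine.

```python
import pandas as pd

people = pd.DataFrame(
    {
        "Full Name": ["Alice", "Bob", "Charlie", "David", "Eve"],
        "Age": [23, 34, 45, 28, 30],
        "Score": [88, 92, 95, 78, 85],
    }
)

# The comparison yields one boolean per row.
mask = people["Age"] > 25
print(mask.tolist())  # [False, True, True, True, True]

# Indexing with the mask keeps only the rows marked True.
older = people[mask]

# Combine conditions with & (and) or | (or); each needs its own parentheses.
older_high = people[(people["Age"] > 25) & (people["Score"] >= 90)]
print(older_high["Full Name"].tolist())  # ['Bob', 'Charlie']
```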

In [5]:
# Sort by Score in ascending order (the default)
csv_sorted = csv_filtered.sort_values(by='Score')
excel_sorted = excel_filtered.sort_values(by='Score')
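
`sort_values` sorts in ascending order unless told otherwise, and it keeps each row's original index label. The sketch below, using a hypothetical frame `filtered` that mirrors the filtered rows, shows a descending sort and how `reset_index(drop=True)` renumbers the rows afterwards.

```python
import pandas as pd

# Stand-in for the filtered data, keeping the original index labels 1 to 4.
filtered = pd.DataFrame(
    {
        "Full Name": ["Bob", "Charlie", "David", "Eve"],
        "Age": [34, 45, 28, 30],
        "Score": [92, 95, 78, 85],
    },
    index=[1, 2, 3, 4],
)

ascending = filtered.sort_values(by="Score")
print(ascending.index.tolist())  # [3, 4, 1, 2]

# ascending=False puts the highest score first.
descending = filtered.sort_values(by="Score", ascending=False)
print(descending.index.tolist())  # [2, 1, 4, 3]

# drop=True discards the old labels and renumbers the rows from 0.
renumbered = descending.reset_index(drop=True)
print(renumbered["Full Name"].tolist())  # ['Charlie', 'Bob', 'Eve', 'David']
```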

In [6]:
print('CSV Data (Manipulated):')
csv_sorted

CSV Data (Manipulated):


|   | Full Name | Age | Score |
|---|-----------|-----|-------|
| 3 | David     | 28  | 78    |
| 4 | Eve       | 30  | 85    |
| 1 | Bob       | 34  | 92    |
| 2 | Charlie   | 45  | 95    |


In [7]:
print('Excel Data (Manipulated):')
excel_sorted

Excel Data (Manipulated):


|   | Full Name | Age | Score |
|---|-----------|-----|-------|
| 3 | David     | 28  | 78    |
| 4 | Eve       | 30  | 85    |
| 1 | Bob       | 34  | 92    |
| 2 | Charlie   | 45  | 95    |
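
The rename, filter, and sort steps can also be written as one chained expression, which avoids the intermediate variables. This is one possible style rather than a required one; the frame `people` is a hypothetical stand-in for the loaded data.

```python
import pandas as pd

people = pd.DataFrame(
    {
        "Name": ["Alice", "Bob", "Charlie", "David", "Eve"],
        "Age": [23, 34, 45, 28, 30],
        "Score": [88, 92, 95, 78, 85],
    }
)

result = (
    people.rename(columns={"Name": "Full Name"})
    # .loc accepts a function, so the filter can refer to the renamed frame.
    .loc[lambda df: df["Age"] > 25]
    .sort_values(by="Score")
)

print(result)
```

The resulting row order (David, Eve, Bob, Charlie) matches the manipulated output shown above.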
