## Data Transformation and Manipulation in Python

### Objective:
By the end of this class, students will be able to:

- Read and write data from CSV files.
- Use basic operations with Pandas to transform and manipulate datasets.
- Apply NumPy for numerical operations.

### Requirements:

- Python installed with the Pandas and NumPy libraries.
- Text editor or Jupyter Notebook.

### Instructions:
Follow the steps below to perform basic data manipulation tasks. Write code to solve each problem and submit the notebook or script.

### Part 1: Setup and Basic File I/O
**1. Install Required Libraries:**

 - Install Pandas and NumPy using pip (if not already installed):

In [None]:
 # Install Pandas and NumPy using pip
 # pip install pandas numpy

**2. Create a Sample CSV File:**
- Write a Python script that creates and writes the following data to a CSV file called students.csv:

<div>
<img src="attachment:01.jpg" width="500"/>
</div>

![01.jpeg](attachment:01.jpeg)

**Hint:** 
#### 1. Import the csv module:

- csv is a built-in Python module that provides functionality to read from and write to CSV files.

#### 2. Create the data:

- The data can be represented as a **list of lists**.

#### 3. Open the CSV file for writing:

- The statement with open('students.csv', 'w', newline='') as file: opens the file students.csv in write mode ('w'), creating it if it does not exist and overwriting it if it does.
- The with statement ensures that the file is closed automatically once the block finishes, even if an error occurs while writing.
- newline='' prevents extra blank lines from appearing between rows on some operating systems, such as Windows.

#### 4. Create a writer object:

- csv.writer(file) creates a writer object that will allow us to write data into the CSV file.
- **Note:** csv.writer() expects rows of data (such as a list of lists), not a dictionary. However, if you want to define the data as a dictionary of lists, you must first write the header (the keys of the dictionary) and then write the data rows (built from the values of the dictionary).

#### 5. Write rows to the CSV file:

- writer.writerows(data) writes all the rows (header and data) to the file at once. The method writerows() expects a list of lists, which is exactly how our data is structured.
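
The five hint steps can be put together roughly as in the sketch below. The rows are placeholders, not the values from the assignment table, and the column names Name, Major, and Grade are assumptions; substitute the actual header and rows from the table.

```python
import csv

# Placeholder data as a list of lists. The column names and values are
# assumptions; use the header and rows shown in the assignment table.
data = [
    ["Name", "Major", "Grade"],  # header row
    ["Tom", "Physics", 75],
    ["Ana", "Biology", 88],
    ["Lee", "Physics", 69],
    ["Mia", "Biology", 81],
]

# newline='' avoids blank lines between rows on Windows
with open("students.csv", "w", newline="") as file:
    writer = csv.writer(file)
    writer.writerows(data)  # writes the header and all data rows at once
```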


In [None]:
# Import the built-in csv module


# Data for the students table

# Open the CSV file in write mode and create it


**Note:** After running this code, a file called students.csv will be created in your working directory containing the student data in CSV format.
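
The earlier note on csv.writer() mentions defining the data as a dictionary of lists. One possible way to do that is sketched here. The file name students_from_dict.csv, the column names, and the values are all hypothetical, chosen only so the example does not overwrite students.csv.

```python
import csv

# Hypothetical data as a dictionary of lists (column name -> column values)
data = {
    "Name": ["Tom", "Ana", "Lee", "Mia"],
    "Major": ["Physics", "Biology", "Physics", "Biology"],
    "Grade": [75, 88, 69, 81],
}

with open("students_from_dict.csv", "w", newline="") as file:
    writer = csv.writer(file)
    writer.writerow(data.keys())           # header: the dictionary's keys
    writer.writerows(zip(*data.values()))  # zip() turns the columns into rows
```

Here zip(*data.values()) pairs up the first item of every column, then the second, and so on, which is what converts column-oriented data into the row-oriented form that writerows() expects.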

### Part 2: Reading Data with Pandas
**1. Read the CSV File into a DataFrame:**

- Use Pandas to read the students.csv file into a DataFrame and print the data.

A **Pandas DataFrame** is a two-dimensional, size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns). In other words, the data is arranged in a table of rows and columns, and each column can hold a different data type. A DataFrame has three principal components: the **data**, the **rows**, and the **columns**.
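
As a rough illustration of reading CSV data into a DataFrame, the sketch below parses placeholder CSV text through io.StringIO so that it runs on its own. The column names and values are assumptions; with the real file you would pass the path 'students.csv' to pd.read_csv() instead.

```python
import io
import pandas as pd

# Placeholder CSV text standing in for students.csv (assumed columns and values)
csv_text = (
    "Name,Major,Grade\n"
    "Tom,Physics,75\n"
    "Ana,Biology,88\n"
    "Lee,Physics,69\n"
    "Mia,Biology,81\n"
)

# With the real file: df = pd.read_csv("students.csv")
df = pd.read_csv(io.StringIO(csv_text))

# Print the DataFrame to see its contents
print(df)
```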

In [None]:


# Read the CSV file into a Pandas DataFrame

# Print the DataFrame to see the contents of the CSV file



**2. Display Basic Information About the Data:**

- Print the following:
    - The first 3 rows of the DataFrame.
    - The columns of the DataFrame.
    - Data types of each column.
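
The three items above could be inspected along these lines. The DataFrame is built from placeholder values with assumed column names (Name, Major, Grade) so the sketch is self-contained; in the assignment, df would come from reading students.csv.

```python
import pandas as pd

# Placeholder DataFrame; in the assignment df comes from pd.read_csv("students.csv")
df = pd.DataFrame({
    "Name": ["Tom", "Ana", "Lee", "Mia"],
    "Major": ["Physics", "Biology", "Physics", "Biology"],
    "Grade": [75, 88, 69, 81],
})

print(df.head(3))   # first 3 rows (head() defaults to 5)
print(df.columns)   # column labels
print(df.dtypes)    # data type of each column
```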

In [None]:
# Display the first 3 rows of the DataFrame
# First 3 rows: By default, head() shows the first 5 rows; we pass 3 to limit to the first 3

# Print the column names of the DataFrame
# This shows the column names of the DataFrame

# Print the data types of each column
# This tells us the data type of each column (int64, object, etc.)


### Part 3: Data Transformation and Manipulation
**1. Add a New Column for Final Grades:**

- Assume that 10 extra points are awarded to each student. Create a new column called Final Grade which adds 10 points to each student's original grade.
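
A minimal sketch of adding the column, assuming the original grades are stored in a numeric column named Grade (the actual column name in the assignment table may differ). The values are placeholders.

```python
import pandas as pd

# Placeholder DataFrame with an assumed "Grade" column
df = pd.DataFrame({
    "Name": ["Tom", "Ana", "Lee", "Mia"],
    "Major": ["Physics", "Biology", "Physics", "Biology"],
    "Grade": [75, 88, 69, 81],
})

# Adding a scalar to a column applies it to every row
df["Final Grade"] = df["Grade"] + 10
print(df)
```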

**2. Filter and Sort Data:**

- Select the students whose final grade is 90 or above and display their details.
- Sort the DataFrame based on the Final Grade in descending order.
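
Both steps might look like the sketch below, which uses a boolean mask for the filter and sort_values() for the ordering. The data and the Name and Major column names are placeholders.

```python
import pandas as pd

# Placeholder DataFrame that already contains the Final Grade column
df = pd.DataFrame({
    "Name": ["Tom", "Ana", "Lee", "Mia"],
    "Major": ["Physics", "Biology", "Physics", "Biology"],
    "Final Grade": [85, 98, 79, 91],
})

# Boolean mask: keep only rows where the condition is True
high_scorers = df[df["Final Grade"] >= 90]
print(high_scorers)

# Sort from highest to lowest final grade; returns a new DataFrame
sorted_df = df.sort_values(by="Final Grade", ascending=False)
print(sorted_df)
```

Neither operation modifies df itself; each returns a new DataFrame, so assign the result to a variable if you want to keep it.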

**3. Modify a Specific Column Value:**

- Change the major of the student "Tom" to "Chemistry" in the DataFrame.
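
One common way to update a single value is .loc with a row condition and a column label, as sketched here. The column names Name and Major and the starting values are assumptions.

```python
import pandas as pd

# Placeholder DataFrame; Tom's starting major is an assumption
df = pd.DataFrame({
    "Name": ["Tom", "Ana", "Lee", "Mia"],
    "Major": ["Physics", "Biology", "Physics", "Biology"],
})

# Select the rows where Name is "Tom" and assign to the Major column
df.loc[df["Name"] == "Tom", "Major"] = "Chemistry"
print(df)
```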

**4. Group By and Aggregate:**

- Group the data by the Major column and calculate the average final grade for each major.
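
A sketch of the grouping step with groupby() followed by mean(), again on placeholder data.

```python
import pandas as pd

# Placeholder DataFrame with Major and Final Grade columns
df = pd.DataFrame({
    "Name": ["Tom", "Ana", "Lee", "Mia"],
    "Major": ["Physics", "Biology", "Physics", "Biology"],
    "Final Grade": [85, 98, 79, 91],
})

# One row per major, holding the mean of that major's final grades
avg_by_major = df.groupby("Major")["Final Grade"].mean()
print(avg_by_major)
```

The result is a Series indexed by major, so avg_by_major["Physics"] gives the average for that group.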

### Part 4: NumPy Integration

**1. NumPy Array Creation:**

- Convert the Final Grade column into a NumPy array and calculate the following statistics using NumPy:
    - Mean of final grades.
    - Standard deviation of final grades.
    - Maximum and minimum final grades.
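
The statistics listed above could be computed as in this sketch, which converts the column with to_numpy(). The grades are placeholder values.

```python
import numpy as np
import pandas as pd

# Placeholder DataFrame with a Final Grade column
df = pd.DataFrame({"Final Grade": [85, 98, 79, 91]})

# Convert the column to a NumPy array
grades = df["Final Grade"].to_numpy()

mean_grade = np.mean(grades)
std_grade = np.std(grades)   # population standard deviation (ddof=0)
max_grade = np.max(grades)
min_grade = np.min(grades)

print(mean_grade, std_grade, max_grade, min_grade)
```

Note that np.std() divides by N by default, while the Pandas std() method divides by N - 1, so the two give slightly different results on the same data.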


### Part 5: Writing Data to a New File

**1. Save the Modified DataFrame to a New CSV:**

- Save the transformed DataFrame (with the Final Grade column and updated major for Tom) to a new CSV file called students_final.csv.
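
Saving could be done with to_csv(), as in the sketch below. The DataFrame is rebuilt from placeholder values so the example stands alone; passing index=False is a design choice here that keeps the row index from being written as an extra column.

```python
import pandas as pd

# Placeholder DataFrame standing in for the transformed data
df = pd.DataFrame({
    "Name": ["Tom", "Ana", "Lee", "Mia"],
    "Major": ["Physics", "Biology", "Physics", "Biology"],
    "Grade": [75, 88, 69, 81],
})
df["Final Grade"] = df["Grade"] + 10
df.loc[df["Name"] == "Tom", "Major"] = "Chemistry"

# Write the DataFrame to a new CSV file without the index column
df.to_csv("students_final.csv", index=False)
```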

### Bonus Challenge (Optional):
- Create a Python function that:
    - Reads a CSV file.
    - Adds a column of final grades with a custom number of extra points provided by the user.
    - Returns the updated DataFrame.
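
One possible shape for such a function is sketched below. The name add_final_grade, its parameters, and the assumed Grade column are all hypothetical; the usage example reads placeholder CSV text through io.StringIO so it runs without a file on disk.

```python
import io
import pandas as pd


def add_final_grade(source, extra_points=10):
    """Read a CSV and add a Final Grade column (hypothetical helper).

    source: a file path or file-like object accepted by pd.read_csv().
    extra_points: number of points to add to each value in "Grade".
    """
    df = pd.read_csv(source)
    df["Final Grade"] = df["Grade"] + extra_points
    return df


# Usage with placeholder data; with a real file, pass "students.csv" instead
csv_text = "Name,Major,Grade\nTom,Physics,75\nAna,Biology,88\n"
result = add_final_grade(io.StringIO(csv_text), extra_points=5)
print(result)
```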


### Good Luck!