# Python for Data Analysis - Week 2
## Practice Exercises: Pandas Fundamentals I (Part 1)

### Overview
This notebook contains practice exercises to help you master the pandas concepts covered in Week 2's lecture sessions. Each exercise reinforces a specific aspect of pandas, and a solution follows each exercise.

### Instructions
1. Read each exercise carefully
2. Write your code in the provided cells
3. Run your code to check your solution
4. Compare your approach with the provided solution
5. If you're stuck, review the lecture materials or ask for help

Let's get started!

## Setup

First, let's import the necessary libraries and set up our environment.

In [None]:
# Import libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# For plotting in the notebook
%matplotlib inline

# Set display options
pd.set_option('display.max_columns', None)  # Show all columns
pd.set_option('display.max_rows', 15)       # Limit number of rows shown
pd.set_option('display.width', 1000)        # Set width of display

print("Libraries imported successfully!")
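
The `pd.set_option` calls above change the display settings for the rest of the session. As a minimal sketch, assuming you only want different settings for a single output, `pd.option_context` applies options temporarily. The `wide_df` frame and the variable names are hypothetical and not part of the course data.

```python
import numpy as np
import pandas as pd

# Hypothetical wide frame used only to illustrate display options
wide_df = pd.DataFrame(np.zeros((3, 30)), columns=[f"col_{i}" for i in range(30)])

# Remember the current setting so we can compare afterwards
previous_max_columns = pd.get_option("display.max_columns")

# Options passed to option_context apply only inside the with-block
with pd.option_context("display.max_columns", 5):
    inside_max_columns = pd.get_option("display.max_columns")
    print(wide_df)  # Printed with at most 5 columns shown

# Outside the block, the earlier setting is back in effect
restored_max_columns = pd.get_option("display.max_columns")
print(restored_max_columns == previous_max_columns)
```

This keeps a one-off display change from leaking into later cells.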

## Section 1: Creating and Exploring DataFrames

In this section, we'll practice creating DataFrames from different data sources and exploring their structure.

### Exercise 1.1: Creating a DataFrame from a Dictionary

Create a DataFrame called `students_df` with the following data:
- Columns: 'name', 'age', 'grade', 'subject'
- At least 5 students with different values

Then, display the DataFrame.

In [None]:
# Your code here


### Solution 1.1

In [None]:
# Creating a dictionary of data
data = {
    'name': ['John Smith', 'Emma Johnson', 'Michael Brown', 'Sophia Davis', 'James Wilson'],
    'age': [22, 20, 23, 21, 19],
    'grade': ['A', 'B+', 'C', 'A-', 'B'],
    'subject': ['Math', 'History', 'Science', 'English', 'Computer Science']
}

# Creating the DataFrame
students_df = pd.DataFrame(data)

# Displaying the DataFrame
students_df
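
A dictionary of lists is one common input. As an alternative sketch, `pd.DataFrame` also accepts a list of dictionaries, where each dictionary becomes one row. The records below are made up for illustration.

```python
import pandas as pd

# Hypothetical records; each dictionary becomes one row of the DataFrame
records = [
    {"name": "Ava Lee", "age": 20, "grade": "A", "subject": "Math"},
    {"name": "Noah Kim", "age": 22, "grade": "B", "subject": "History"},
    {"name": "Mia Chen", "age": 21, "grade": "A-", "subject": "Science"},
]

# Column names are taken from the dictionary keys
records_df = pd.DataFrame(records)
print(records_df)
```

The row-oriented form can be convenient when data arrives one record at a time, for example from a JSON response.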

### Exercise 1.2: Loading Data from a CSV File

Load the `numeric_data.csv` file from the `Data` directory into a DataFrame called `numeric_df`. Then display the first 5 rows of the DataFrame and its summary information (column names, non-null counts, and data types).

In [None]:
# Your code here


### Solution 1.2

In [None]:
# Load the CSV file
numeric_df = pd.read_csv('../Data/numeric_data.csv')

# Display the first 5 rows
print("First 5 rows:")
print(numeric_df.head())

# Display DataFrame information
print("\nDataFrame information:")
numeric_df.info()
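
If the CSV file is not available in your environment, the same steps can be tried on in-memory text. This is a minimal sketch, assuming a made-up three-column CSV; `csv_text` and `sample_df` are hypothetical names, and the values do not come from `numeric_data.csv`.

```python
import io

import pandas as pd

# Made-up CSV text standing in for a file on disk (last row has a missing weight)
csv_text = "id,height,weight\n1,170.5,65.2\n2,182.0,80.1\n3,165.3,\n"

# read_csv accepts file-like objects as well as file paths
sample_df = pd.read_csv(io.StringIO(csv_text))

print(sample_df.head())

# info() prints its summary directly and returns None,
# so it is called without wrapping it in print()
sample_df.info()
```

The non-null counts reported by `info()` are a quick way to spot columns with missing values.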

### Exercise 1.3: Creating a DataFrame from a NumPy Array

Create a 10×4 NumPy array with random integers between 1 and 100. Then, convert it to a DataFrame with column names 'A', 'B', 'C', and 'D', and display the DataFrame.

In [None]:
# Your code here


### Solution 1.3

In [None]:
# Create a 10×4 NumPy array of random integers from 1 to 100
np.random.seed(42)  # For reproducibility
# randint excludes the upper bound, so pass 101 to allow 100
array_data = np.random.randint(1, 101, size=(10, 4))

# Convert the array to a DataFrame
columns = ['A', 'B', 'C', 'D']
array_df = pd.DataFrame(array_data, columns=columns)

# Display the DataFrame
array_df
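
Note that `np.random.randint(low, high)` draws from the half-open range [low, high), so `high` itself is never returned. As an alternative sketch, NumPy's newer `Generator` API can make the upper bound inclusive with `endpoint=True`. The names `rng`, `values`, and `rng_df` are illustrative.

```python
import numpy as np
import pandas as pd

# Seeded generator for reproducibility
rng = np.random.default_rng(42)

# endpoint=True makes the upper bound (100) a possible value
values = rng.integers(1, 100, size=(10, 4), endpoint=True)

# Convert the array to a DataFrame with named columns
rng_df = pd.DataFrame(values, columns=["A", "B", "C", "D"])
print(rng_df)
```

A seeded `Generator` is self-contained, so it does not change NumPy's global random state the way `np.random.seed` does.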

### Exercise 1.4: Exploring DataFrame Properties

Using the `numeric_df` DataFrame from Exercise 1.2, answer the following questions:

1. How many rows and columns does the DataFrame have?
2. What are the column names?
3. What are the data types of each column?
4. Generate descriptive statistics for the DataFrame.

In [None]:
# Your code here


### Solution 1.4

In [None]:
# 1. Number of rows and columns
n_rows, n_cols = numeric_df.shape  # Avoids overwriting `columns` from Solution 1.3
print(f"The DataFrame has {n_rows} rows and {n_cols} columns.")

# 2. Column names
print(f"\nColumn names: {numeric_df.columns.tolist()}")

# 3. Data types
print("\nData types:")
print(numeric_df.dtypes)

# 4. Descriptive statistics
print("\nDescriptive statistics:")
print(numeric_df.describe())
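
By default, `describe()` summarizes only numeric columns when a DataFrame contains any. The sketch below, which uses a small made-up frame rather than `numeric_df`, shows how `include="all"` extends the summary to non-numeric columns.

```python
import pandas as pd

# Hypothetical frame with one text column and one numeric column
small_df = pd.DataFrame({"name": ["Ann", "Ben", "Cal"], "age": [20, 22, 21]})

# Default: only the numeric column appears in the summary
default_stats = small_df.describe()
print(default_stats)

# include="all": text columns are summarized too (count, unique, top, freq)
all_stats = small_df.describe(include="all")
print(all_stats)
```

Statistics that do not apply to a column's type, such as the mean of a text column, appear as `NaN` in the combined summary.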