# Common operations and methods in Pandas

Pandas provides many built-in methods and functions for data manipulation, including arithmetic operations, statistical aggregations, merging and joining datasets, and handling missing data. Knowing how these methods behave in different scenarios is essential for an efficient data analysis workflow.

# Example Usage:

In [1]:
import pandas as pd

# Sample data
data = {
    'A': [1, 2, 3, 4, 5],
    'B': [5, 4, 3, 2, 1],
    'C': [10, 20, 30, 40, 50]
}

In [2]:
# Creating a DataFrame
df = pd.DataFrame(data)

**DataFrame Creation:** We create a Pandas DataFrame named 'df' from the provided dictionary 'data'. This DataFrame contains three columns: A, B, and C, each with corresponding data values.

In [3]:
# Displaying the DataFrame
df

   A  B   C
0  1  5  10
1  2  4  20
2  3  3  30
3  4  2  40
4  5  1  50


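**Displaying the DataFrame:** We display the DataFrame to inspect its contents. The leftmost column is the index, which Pandas generates automatically (0 to 4) when none is supplied.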

In [4]:
# Addition
print(df['A'] + df['B'])  # Adds columns A and B element-wise


0    6
1    6
2    6
3    6
4    6
dtype: int64


In [5]:
# Subtraction
print(df['C'] - df['B'])  # Subtracts column B from column C element-wise


0     5
1    16
2    27
3    38
4    49
dtype: int64


In [6]:
# Multiplication
print(df['A'] * df['C'])  # Multiplies columns A and C element-wise


0     10
1     40
2     90
3    160
4    250
dtype: int64


In [7]:
# Division
print(df['C'] / df['A'])  # Divides column C by column A element-wise


0    10.0
1    10.0
2    10.0
3    10.0
4    10.0
dtype: float64


**Arithmetic operations:** We demonstrate various arithmetic operations on DataFrame columns, including addition, subtraction, multiplication, and division.
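The element-wise behavior above can be explored a little further. The following is a minimal sketch, not part of the original notebook: it rebuilds the same sample data and uses a hypothetical Series named `shifted` to illustrate that Pandas aligns operands by index label rather than by position, and that a scalar is broadcast to every element.

```python
import pandas as pd

# Same sample data as in the walkthrough
df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [5, 4, 3, 2, 1],
    'C': [10, 20, 30, 40, 50],
})

# The operator form and the method form give the same result
sum_operator = df['A'] + df['B']
sum_method = df['A'].add(df['B'])

# A scalar is broadcast to every element of the column
doubled = df['C'] * 2

# Hypothetical Series that only has index labels 3 and 4
shifted = pd.Series([100, 200], index=[3, 4])

# Operands are aligned by index label; labels missing from one side give NaN
aligned = df['A'] + shifted

# fill_value substitutes 0 for the labels missing from one operand
aligned_filled = df['A'].add(shifted, fill_value=0)

print(aligned)
print(aligned_filled)
```

The method forms (`add`, `sub`, `mul`, `div`) are mainly useful when an argument such as `fill_value` is needed; for plain column arithmetic the operators read more naturally.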

In [8]:
# Statistical aggregations
# Mean
print(df.mean())  # Calculates the mean of each column

A     3.0
B     3.0
C    30.0
dtype: float64


In [9]:
# Median
print(df.median())  # Calculates the median of each column

A     3.0
B     3.0
C    30.0
dtype: float64


**Statistical aggregations:** We calculate statistical aggregations such as mean and median for each column in the DataFrame.
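Several aggregations can also be computed in one call. The sketch below is an illustrative addition that assumes the same sample data; it uses `DataFrame.agg` to collect the mean and median into a single table and `axis=1` to aggregate across each row instead of down each column.

```python
import pandas as pd

# Same sample data as in the walkthrough
df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [5, 4, 3, 2, 1],
    'C': [10, 20, 30, 40, 50],
})

# One row per aggregation, one column per original column
summary = df.agg(['mean', 'median'])

# axis=1 aggregates across the columns of each row
row_means = df.mean(axis=1)

print(summary)
print(row_means)
```

For a broader overview (count, standard deviation, quartiles, and so on), `df.describe()` returns a similar summary table.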

In [10]:
# Merging and joining datasets

# Concatenating along columns
df_concat = pd.concat([df, df], axis=1)  # Concatenates the DataFrame with itself along columns
df_concat


   A  B   C  A  B   C
0  1  5  10  1  5  10
1  2  4  20  2  4  20
2  3  3  30  3  3  30
3  4  2  40  4  2  40
4  5  1  50  5  1  50


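**Merging and joining datasets:** We concatenate the DataFrame with itself along columns (axis=1), which places the two copies side by side. Because both copies have the same column names, the result contains duplicate column labels. Note that pd.concat() lines rows up by index; joins on the values of a key column are done with pd.merge().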
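Concatenation is only one way to combine DataFrames. As a rough sketch of the alternatives, the example below uses two small hypothetical DataFrames, `left` and `right`, that are not part of the original notebook. It shows a key-based join with `pd.merge`, row-wise stacking with `pd.concat`, and the `keys` argument as one way to avoid ambiguous duplicate column labels.

```python
import pandas as pd

# Hypothetical DataFrames sharing a 'key' column
left = pd.DataFrame({'key': [1, 2, 3], 'A': [10, 20, 30]})
right = pd.DataFrame({'key': [2, 3, 4], 'B': ['x', 'y', 'z']})

# Inner join (the default): only keys present in both frames
inner = pd.merge(left, right, on='key')

# Outer join: every key from either frame, NaN where a side has no match
outer = pd.merge(left, right, on='key', how='outer')

# Stacking rows instead of columns, with a fresh 0..n-1 index
stacked = pd.concat([left, left], axis=0, ignore_index=True)

# keys= adds an outer column level, so the two copies stay distinguishable
wide = pd.concat([left, left], axis=1, keys=['first', 'second'])

print(inner)
print(outer)
print(wide)
```

Which approach fits depends on the data: `pd.concat` suits frames that already share an index or a set of columns, while `pd.merge` suits frames related through the values of a key column.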

In [11]:
# Handling missing data

# Introducing missing values
df.loc[2, 'B'] = pd.NA  # Introduces a missing value in column B at row 2
df.loc[3, 'C'] = pd.NA  # Introduces a missing value in column C at row 3
df


   A    B     C
0  1  5.0  10.0
1  2  4.0  20.0
2  3  NaN  30.0
3  4  2.0   NaN
4  5  1.0  50.0


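**Handling missing data:** We introduce missing values in columns B and C of the DataFrame to simulate missing data scenarios. Because the default integer dtype cannot represent a missing value, both columns are converted to floating point, which is why their values are now shown with a decimal point.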
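Before filling or dropping anything, it usually helps to check where the gaps are. The following sketch is an illustrative addition: it builds the DataFrame with the missing values already in place (using `np.nan`, which is an assumption made here to keep the example self-contained) and counts them with `isna()`.

```python
import numpy as np
import pandas as pd

# Sample data with the same two gaps as in the walkthrough
df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [5, 4, np.nan, 2, 1],
    'C': [10, 20, 30, np.nan, 50],
})

# Number of missing values in each column
missing_per_column = df.isna().sum()

# Rows that contain at least one missing value
rows_with_missing = df[df.isna().any(axis=1)]

print(missing_per_column)
print(rows_with_missing)
```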

In [12]:
# Filling missing values with mean
df_filled = df.fillna(df.mean())  # Fills missing values with the mean of each column
df_filled

   A    B     C
0  1  5.0  10.0
1  2  4.0  20.0
2  3  3.0  30.0
3  4  2.0  27.5
4  5  1.0  50.0


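We fill missing values with the mean of each column using the fillna() method. Since mean() skips missing values by default, column B is filled with 3.0 (the mean of 5, 4, 2, and 1) and column C with 27.5 (the mean of 10, 20, 30, and 50).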
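The column mean is not the only possible fill value. As a hedged sketch of two alternatives, the example below rebuilds the data with the same gaps and shows a per-column fill (passing a dictionary to `fillna`) and a forward fill with `ffill()`, which copies the previous row's value. The choice of 0 and the median as fill values is purely illustrative.

```python
import numpy as np
import pandas as pd

# Sample data with the same two gaps as in the walkthrough
df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [5, 4, np.nan, 2, 1],
    'C': [10, 20, 30, np.nan, 50],
})

# A different fill value per column: a constant for B, the median for C
filled_custom = df.fillna({'B': 0, 'C': df['C'].median()})

# Forward fill: each gap takes the last valid value above it
filled_forward = df.ffill()

print(filled_custom)
print(filled_forward)
```

Both calls return new DataFrames and leave `df` unchanged, just like the `fillna(df.mean())` call above.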

In [13]:
# Dropping rows with missing values
df_dropped = df.dropna()  # Drops rows with any missing values
df_dropped

   A    B     C
0  1  5.0  10.0
1  2  4.0  20.0
4  5  1.0  50.0


We drop rows with any missing values using the dropna() method. Rows 2 and 3 are removed, and the remaining rows keep their original index labels (0, 1, and 4).
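By default `dropna()` removes every row that has at least one gap, which can discard more data than intended. The sketch below is an illustrative addition using the same data and gaps; it shows two common variations: restricting the check to certain columns with `subset`, and dropping columns instead of rows with `axis=1`.

```python
import numpy as np
import pandas as pd

# Sample data with the same two gaps as in the walkthrough
df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [5, 4, np.nan, 2, 1],
    'C': [10, 20, 30, np.nan, 50],
})

# Only consider column B when deciding which rows to drop
dropped_subset = df.dropna(subset=['B'])

# Drop columns (rather than rows) that contain any missing value
dropped_columns = df.dropna(axis=1)

# Optionally renumber the index after dropping rows
dropped_reset = df.dropna().reset_index(drop=True)

print(dropped_subset)
print(dropped_columns)
print(dropped_reset)
```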

# Assignment

In this exercise, you will practice data manipulation operations on a Pandas DataFrame. You will work with a sample dataset containing product information and apply the operations and methods covered above to gain insights into the data.

**Dataset:**
You are provided with a dictionary containing product information:

In [None]:
# Sample data
data = {
    'Product_ID': [101, 102, 103, 104, 105],
    'Product_Name': ['Laptop', 'Smartphone', 'Tablet', 'Headphones', 'Smartwatch'],
    'Category': ['Electronics', 'Electronics', 'Electronics', 'Accessories', 'Electronics'],
    'Price': [1200, 800, 400, 100, 300],
    'Quantity_Sold': [50, 100, 80, 200, 120]
}

## Tasks:

**DataFrame Creation:**
- Create a Pandas DataFrame named 'products' from the provided dictionary 'data'.

**Arithmetic Operations:**
- Perform the following arithmetic operations and display the results:
    - Calculate the total revenue (Price * Quantity_Sold) for each product.
    - Calculate the total revenue generated by all products.

**Statistical Aggregations:**
- Calculate and display the following statistical aggregations for the 'Price' and 'Quantity_Sold' columns:
    - Mean
    - Median

**Merging and Joining:**
- Concatenate the 'products' DataFrame with itself along columns and display the result.

**Handling Missing Data:**
- Introduce missing values in the 'Price' column at index 2 and in the 'Quantity_Sold' column at index 3.
- Fill the missing values with the mean of each column and display the resulting DataFrame.
- Drop rows with any missing values and display the resulting DataFrame.