# Introduction to Pandas for Excel Users

Welcome to Pandas! This Python library is built for analyzing tabular data. If you're coming from Excel, many concepts will feel familiar: tables, columns, formulas, and filters all have close counterparts here.

## What you'll learn
- What Pandas is and why to use it
- DataFrames (think Excel spreadsheets)
- Reading and writing Excel/CSV files
- Basic data operations

## Why Pandas?
- Handles datasets larger than Excel's limit of 1,048,576 rows per sheet
- Powerful data manipulation capabilities
- Automation of repetitive tasks
- Integration with other Python libraries


In [None]:
# First, let's import pandas
import pandas as pd

# Create a simple DataFrame (like an Excel sheet)
data = {
    'Product': ['Apple', 'Banana', 'Orange', 'Mango'],
    'Price': [1.20, 0.80, 1.00, 2.00],
    'Quantity': [100, 150, 120, 75]
}

df = pd.DataFrame(data)
print('Our first DataFrame (like an Excel sheet):\n')
print(df)
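
Before doing calculations, it often helps to check a DataFrame's size and column types, much as you would glance at a sheet's headers and row count. The cell below is a minimal sketch that rebuilds the same data so it runs on its own; the variable names (`n_rows`, `column_names`, and so on) are only illustrative.

```python
import pandas as pd

# Same example data as above, rebuilt so this cell is self-contained
df = pd.DataFrame({
    'Product': ['Apple', 'Banana', 'Orange', 'Mango'],
    'Price': [1.20, 0.80, 1.00, 2.00],
    'Quantity': [100, 150, 120, 75],
})

n_rows, n_cols = df.shape          # (number of rows, number of columns)
column_names = list(df.columns)    # column headers, like row 1 of a sheet
prices = df['Price']               # one column on its own is a Series
first_two = df.head(2)             # first two rows, like scrolling to the top

print('Rows:', n_rows, 'Columns:', n_cols)
print('Column names:', column_names)
print(df.dtypes)                   # data type of each column
print(first_two)
```

Selecting a column with `df['Price']` is roughly the equivalent of clicking a column header in Excel.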

## Basic Operations
Let's look at some common operations that you might do in Excel:

In [None]:
# Calculate total value per row (like the Excel formula =B2*C2 filled down a column)
df['Total Value'] = df['Price'] * df['Quantity']

print('DataFrame with calculated total values:\n')
print(df)

print('\nSummary statistics (like Excel\'s descriptive statistics):\n')
print(df.describe())

print('\nTotal inventory value:', df['Total Value'].sum())
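
Column-level functions such as `SUM`, `AVERAGE`, and `MAX`, as well as sorting a sheet by a column, have direct Pandas counterparts. A short sketch on the same fruit data, rebuilt here so the cell is self-contained (the variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    'Product': ['Apple', 'Banana', 'Orange', 'Mango'],
    'Price': [1.20, 0.80, 1.00, 2.00],
    'Quantity': [100, 150, 120, 75],
})
df['Total Value'] = df['Price'] * df['Quantity']

total_quantity = df['Quantity'].sum()     # like =SUM(C2:C5)
average_price = df['Price'].mean()        # like =AVERAGE(B2:B5)
highest_value = df['Total Value'].max()   # like =MAX(D2:D5)

# Sort rows by total value, largest first (like Excel's Sort Z to A)
sorted_df = df.sort_values('Total Value', ascending=False)

print('Total quantity:', total_quantity)
print('Average price:', average_price)
print('Highest total value:', highest_value)
print(sorted_df)
```

Note that `sort_values` returns a new DataFrame and leaves `df` unchanged.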

## Filtering Data
In Excel, you use filters. In Pandas, you filter rows with a condition inside square brackets, and the result is a new DataFrame you can keep working with:

In [None]:
# Filter products more expensive than $1 (like Excel's filter)
expensive_products = df[df['Price'] > 1.0]
print('Products more expensive than $1:\n')
print(expensive_products)

# Filter products with high inventory value
high_value = df[df['Total Value'] > 100]
print('\nProducts with total value over $100:\n')
print(high_value)
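
Conditions can also be combined, which is similar to setting filters on several Excel columns at once. The sketch below assumes the same fruit data and uses `&` (and), `|` (or), and `isin` (match a list of values). Each condition needs its own parentheses.

```python
import pandas as pd

df = pd.DataFrame({
    'Product': ['Apple', 'Banana', 'Orange', 'Mango'],
    'Price': [1.20, 0.80, 1.00, 2.00],
    'Quantity': [100, 150, 120, 75],
})

# AND: price above $1 and at least 100 units in stock
pricey_and_stocked = df[(df['Price'] > 1.0) & (df['Quantity'] >= 100)]

# OR: price below $1 or fewer than 100 units in stock
cheap_or_scarce = df[(df['Price'] < 1.0) | (df['Quantity'] < 100)]

# Match a list of values (like ticking boxes in an Excel filter dropdown)
selected = df[df['Product'].isin(['Apple', 'Orange'])]

print(pricey_and_stocked)
print(cheap_or_scarce)
print(selected)
```

Python's plain `and` / `or` keywords do not work element by element on columns, which is why `&` and `|` are used here.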

## Saving and Reading Files
Pandas makes it easy to work with Excel and CSV files:

In [None]:
# Save to CSV (like Save As > CSV in Excel); index=False leaves out the row numbers
df.to_csv('inventory.csv', index=False)

# Read it back
df_from_csv = pd.read_csv('inventory.csv')
print('Data read from CSV:\n')
print(df_from_csv)
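
Excel workbooks follow the same pattern with `to_excel` and `read_excel`. The sketch below assumes the optional `openpyxl` package is installed, which Pandas uses for `.xlsx` files; if it is missing, the cell prints a hint instead of failing. The file name and the sheet name `Inventory` are only examples, and the file is written to a temporary folder.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({
    'Product': ['Apple', 'Banana', 'Orange', 'Mango'],
    'Price': [1.20, 0.80, 1.00, 2.00],
    'Quantity': [100, 150, 120, 75],
})

# Write to a temporary folder so the example leaves no files behind in your project
path = os.path.join(tempfile.mkdtemp(), 'inventory.xlsx')

try:
    # Save to a named worksheet, then read the same worksheet back
    df.to_excel(path, sheet_name='Inventory', index=False)
    df_from_excel = pd.read_excel(path, sheet_name='Inventory')
    print(df_from_excel)
except ImportError:
    # Raised when the Excel engine (openpyxl) is not installed
    df_from_excel = None
    print('Install openpyxl to work with .xlsx files: pip install openpyxl')
```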

## Exercise
Create a DataFrame and work through these steps:
1. Store monthly sales data for different products
2. Calculate the total revenue per product
3. Find the best-selling product
4. Save the results to a CSV file

In [None]:
# Your code here
# Example solution:
sales_data = {
    'Product': ['Laptop', 'Mouse', 'Keyboard', 'Monitor'],
    'January': [5, 20, 15, 10],
    'February': [7, 25, 18, 12],
    'March': [6, 22, 16, 8]
}

# Create DataFrame
sales_df = pd.DataFrame(sales_data)

# Add price information
prices = {'Laptop': 1000, 'Mouse': 25, 'Keyboard': 50, 'Monitor': 200}
sales_df['Price'] = sales_df['Product'].map(prices)

# Calculate total units sold
sales_df['Total Units'] = sales_df['January'] + sales_df['February'] + sales_df['March']

# Calculate total revenue
sales_df['Total Revenue'] = sales_df['Total Units'] * sales_df['Price']

print('Sales Analysis:\n')
print(sales_df)

# Find best-selling product by revenue
best_seller = sales_df.loc[sales_df['Total Revenue'].idxmax()]
print('\nBest selling product:\n')
# Double quotes outside let us use single quotes inside the f-string without escaping
print(f"Product: {best_seller['Product']}")
print(f"Total Revenue: ${best_seller['Total Revenue']}")

# Save to CSV
sales_df.to_csv('sales_analysis.csv', index=False)
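
"Best-selling" can mean most units sold or most revenue earned, and the two can point to different products. The sketch below reuses the example figures from the solution and compares both readings. It also sums the month columns with `sum(axis=1)`, an alternative to adding the columns one by one. The names `month_columns`, `best_by_units`, and `best_by_revenue` are illustrative.

```python
import pandas as pd

sales_df = pd.DataFrame({
    'Product': ['Laptop', 'Mouse', 'Keyboard', 'Monitor'],
    'January': [5, 20, 15, 10],
    'February': [7, 25, 18, 12],
    'March': [6, 22, 16, 8],
})
prices = {'Laptop': 1000, 'Mouse': 25, 'Keyboard': 50, 'Monitor': 200}

# Sum across the month columns row by row (axis=1 means "across columns")
month_columns = ['January', 'February', 'March']
sales_df['Total Units'] = sales_df[month_columns].sum(axis=1)
sales_df['Total Revenue'] = sales_df['Total Units'] * sales_df['Product'].map(prices)

# idxmax returns the row label of the largest value in a column
best_by_units = sales_df.loc[sales_df['Total Units'].idxmax(), 'Product']
best_by_revenue = sales_df.loc[sales_df['Total Revenue'].idxmax(), 'Product']

print('Best seller by units:', best_by_units)
print('Best seller by revenue:', best_by_revenue)
```

With these figures the mouse sells the most units while the laptop brings in the most revenue, so it is worth stating which measure a report uses.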