# Resource 06: Financial Data Analysis Workflow
This notebook walks through a typical workflow for analyzing financial datasets. You'll learn how to import, clean, transform, and analyze financial data.

## Step 1: Import the Data
Start by loading a dataset. To keep the notebook self-contained, we build a small DataFrame of month-end stock data in memory instead of reading a CSV from disk.

In [None]:
import pandas as pd

# Simulated dataset
data = {
    # 'M' is the month-end alias; pandas 2.2+ prefers 'ME' and warns on 'M'
    'Date': pd.date_range(start='2023-01-01', periods=5, freq='M'),
    'Close': [150.0, 155.5, 160.0, 158.0, 162.0],
    'Volume': [1000000, 1100000, 1050000, 1200000, 1150000]
}
df = pd.DataFrame(data)
df
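With a real file, the same frame would usually come from `pd.read_csv`. The sketch below is only an illustration: the CSV text is a hypothetical stand-in for a file on disk (it mirrors the simulated values above), and the name `prices` is chosen here so it does not overwrite `df`.

```python
import io
import pandas as pd

# Hypothetical CSV contents standing in for a file on disk
csv_text = """Date,Close,Volume
2023-01-31,150.0,1000000
2023-02-28,155.5,1100000
2023-03-31,160.0,1050000
2023-04-30,158.0,1200000
2023-05-31,162.0,1150000
"""

# parse_dates converts the Date column to datetime64 while reading
prices = pd.read_csv(io.StringIO(csv_text), parse_dates=['Date'])
print(prices.dtypes)
```

Passing a file path instead of the `io.StringIO` object reads from disk in the same way.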

## Step 2: Understand Financial Data Structure
- **Date** is the observation date; the cell below makes it the time-series index
- **Close** is the stock closing price
- **Volume** is the number of shares traded

In [None]:
# Only the last expression in a cell is displayed, so print the dtypes explicitly
print(df.dtypes)
df = df.set_index('Date')
df.head()
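Once the dates are the index, a few quick checks can confirm the frame behaves like a time series. This is a minimal sketch that rebuilds the simulated data under the assumed name `check_df` so it runs on its own; the checks themselves are standard `Index` attributes.

```python
import pandas as pd

# Rebuild the simulated frame so this cell runs independently
idx = pd.to_datetime(
    ['2023-01-31', '2023-02-28', '2023-03-31', '2023-04-30', '2023-05-31']
)
check_df = pd.DataFrame(
    {
        'Close': [150.0, 155.5, 160.0, 158.0, 162.0],
        'Volume': [1000000, 1100000, 1050000, 1200000, 1150000],
    },
    index=idx,
)
check_df.index.name = 'Date'

is_datetime = isinstance(check_df.index, pd.DatetimeIndex)
is_sorted = check_df.index.is_monotonic_increasing
is_unique = check_df.index.is_unique
print(is_datetime, is_sorted, is_unique)

# Partial date strings slice a sorted DatetimeIndex by label
first_quarter = check_df.loc['2023-01':'2023-03']
print(first_quarter)
```

A sorted, unique date index is what makes later steps such as forward-filling and `pct_change()` line up with calendar order.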

## Step 3: Handle Missing or Incomplete Data
Real price series often contain gaps. Here we blank out one value, detect it, and forward-fill it with the last known price.

In [None]:
# Introduce a missing value for demo
df.loc['2023-03-31', 'Close'] = None

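# Detect missing values, then forward-fill with the last known price
print(df.isna())
# fillna(method='ffill') is deprecated in recent pandas; .ffill() is the equivalent
df['Close'] = df['Close'].ffill()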
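Forward-filling is one of several options, and the choice changes the numbers that feed into later steps. As a sketch, the standalone series below (named `close` here for illustration) compares forward-fill with linear interpolation on the same gap.

```python
import numpy as np
import pandas as pd

close = pd.Series(
    [150.0, 155.5, np.nan, 158.0, 162.0],
    index=pd.to_datetime(
        ['2023-01-31', '2023-02-28', '2023-03-31', '2023-04-30', '2023-05-31']
    ),
    name='Close',
)

# Count gaps before deciding how to treat them
n_missing = int(close.isna().sum())

# Forward-fill repeats the previous observation (155.5)
filled_forward = close.ffill()

# Default interpolation is linear by position: the midpoint of 155.5 and 158.0
filled_linear = close.interpolate()

print(n_missing)
print(pd.DataFrame({'ffill': filled_forward, 'linear': filled_linear}))
```

Forward-fill uses only past information, while interpolation draws on the next observation as well, which may matter if the analysis must avoid look-ahead.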

## Step 4: Create Derived Variables
Calculate period-over-period returns (monthly for this dataset) or ratios from existing columns.

In [None]:
# Simple return: Close_t / Close_(t-1) - 1; the first row has no prior price, so it is NaN
df['Return'] = df['Close'].pct_change()
df
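To make the return calculation concrete, the sketch below recomputes it by hand and adds two related quantities, log returns and the cumulative return. The series uses the forward-filled closing prices from Step 3; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Closing prices after the forward-fill in Step 3
close = pd.Series([150.0, 155.5, 155.5, 158.0, 162.0], name='Close')

# pct_change() is the simple return: P_t / P_(t-1) - 1
simple_ret = close.pct_change()
manual_ret = close / close.shift(1) - 1

# Log return: ln(P_t / P_(t-1)); log returns add up across periods
log_ret = np.log(close / close.shift(1))

# Compounding simple returns gives the total return over the window
# (prod() skips the leading NaN by default)
cumulative = (1 + simple_ret).prod() - 1

print(simple_ret)
print(log_ret.sum(), cumulative)
```

The cumulative simple return equals the last price divided by the first, minus one, and the summed log returns equal the log of that same price ratio.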

## Step 5: Basic Statistical Analysis
Run summary stats and correlation analysis.

In [None]:
print(df.describe())
print(df.corr())
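Correlations between price and volume levels can be dominated by a shared trend, so one common variation is to correlate period-over-period changes instead. The sketch below shows both on a standalone copy of the data (named `frame` here); with only five observations, the numbers are illustrative rather than meaningful.

```python
import pandas as pd

frame = pd.DataFrame({
    'Close': [150.0, 155.5, 155.5, 158.0, 162.0],
    'Volume': [1000000, 1100000, 1050000, 1200000, 1150000],
})

# Pearson correlation of the raw levels
level_corr = frame['Close'].corr(frame['Volume'])

# Percentage changes drop the first row, which has no prior value
changes = frame.pct_change().dropna()
change_corr = changes['Close'].corr(changes['Volume'])

print(f"levels:  {level_corr:.3f}")
print(f"changes: {change_corr:.3f}")
```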

## Step 6: Group Comparisons (Optional Extension)
Useful when comparing across firms, sectors, or time windows.

In [None]:
# Simulated multi-firm dataset
data2 = {
    'Firm': ['A', 'A', 'B', 'B'],
    'Quarter': ['Q1', 'Q2', 'Q1', 'Q2'],
    'Revenue': [100, 120, 90, 95]
}
df2 = pd.DataFrame(data2)
df2.groupby('Firm')['Revenue'].mean()
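The same grouped frame supports more than a single mean. As a sketch, the block below computes several statistics per firm with `agg` and then pivots to a wide layout to get quarter-over-quarter revenue growth; the names `firms`, `summary`, `wide`, and `growth` are chosen here for illustration.

```python
import pandas as pd

firms = pd.DataFrame({
    'Firm': ['A', 'A', 'B', 'B'],
    'Quarter': ['Q1', 'Q2', 'Q1', 'Q2'],
    'Revenue': [100, 120, 90, 95],
})

# Several summary statistics per firm in one call
summary = firms.groupby('Firm')['Revenue'].agg(['mean', 'min', 'max'])

# One row per firm, one column per quarter
wide = firms.pivot(index='Firm', columns='Quarter', values='Revenue')

# Q1 -> Q2 revenue growth for each firm
growth = wide['Q2'] / wide['Q1'] - 1

print(summary)
print(growth)
```

Pivoting first keeps the growth calculation explicit about which quarters are compared, which helps when firms report different numbers of periods.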

## Summary
- Load and inspect your data structure
- Address missing values with care
- Create derived variables like returns or ratios
- Use `.describe()` and `.corr()` for simple diagnostics

Next up: applying these workflows to actual course modules like executive compensation.