# Data Pipeline Analysis

This notebook shows how to import and inspect the raw and transformed data exposed by our pipeline modules.

In [4]:
# Import transformed data from pipeline modules
from pipelines.ch_campaign_pipeline import transformed_data as campaign_data, raw_data as campaign_raw
from pipelines.customer_pipeline import transformed_data as customer_data, raw_data as customer_raw
from pipelines.ch_offer_pipeline import transformed_data as offer_data, raw_data as offer_raw

print("Available datasets:")
print(f"Campaign data: {campaign_data.shape}")
print(f"Customer data: {customer_data.shape}")
print(f"Offer data: {offer_data.shape}")

# Display first few rows of transformed campaign data
print("\nCampaign Data (Transformed):")
print(campaign_data.head())

print("\nCampaign Data (Raw):")
print(campaign_raw.head())

Available datasets:
Campaign data: (55, 100)
Customer data: (190339, 651)
Offer data: (19699, 50)

Campaign Data (Transformed):
   campaign_id  campaign_code_17Q409RLCross-Sell5194  \
0    -0.216068                                   0.0   
1    -0.021229                                   0.0   
2     0.023097                                   0.0   
3    -0.218991                                   0.0   
4    -0.297414                                   0.0   

   campaign_code_17Q409RLCross-Sell5196  campaign_code_17Q409RLCross-Sell5197  \
0                                   0.0                                   0.0   
1                                   0.0                                   0.0   
2                                   0.0                                   0.0   
3                                   0.0                                   0.0   
4                                   0.0                                   0.0   

   campaign_code_17Q409RLCross-Sell5217  campaig

## Available Pipeline Modules

Each CSV file has a corresponding pipeline module that exposes three attributes:

- `transformed_data`: Preprocessed data with scaled numeric features and one-hot encoded categorical features
- `transformation_pipeline`: The fitted scikit-learn pipeline for transforming new data
- `raw_data`: Original data loaded from CSV
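
To illustrate what a fitted `transformation_pipeline` of this kind does with new data, here is a minimal, self-contained sketch. It assumes the pipeline is built from a `ColumnTransformer` combining `StandardScaler` and `OneHotEncoder`, which matches the description above but is not confirmed by the module source. The toy columns `budget` and `channel` are hypothetical and do not come from the real CSV files.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data standing in for one of the CSV files.
train = pd.DataFrame({
    "budget": [100.0, 200.0, 300.0],
    "channel": ["email", "sms", "email"],
})

# Assumed structure: scale numeric columns, one-hot encode categorical ones.
# sparse_threshold=0 forces a dense NumPy array as output.
preprocessor = ColumnTransformer(
    [
        ("num", StandardScaler(), ["budget"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["channel"]),
    ],
    sparse_threshold=0,
)
transformation_pipeline = Pipeline([("preprocess", preprocessor)])
transformation_pipeline.fit(train)

# New rows are transformed with the statistics learned during fit.
# "push" was never seen in training, so its one-hot columns are all zero.
new_rows = pd.DataFrame({"budget": [200.0], "channel": ["push"]})
transformed = transformation_pipeline.transform(new_rows)
print(transformed.shape)
```

With the real modules, the equivalent call would presumably be `transformation_pipeline.transform(new_df)` on a DataFrame that has the same columns as the corresponding `raw_data`. Whether unseen categories are ignored or raise an error depends on how each module configures its encoder.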

**Pipeline Modules:**
- `ch_campaign_pipeline` - Campaign data
- `ch_cell_pipeline` - Cell data
- `ch_offer_pipeline` - Offer data
- `contact_history_fact_pipeline` - Contact history facts
- `customer_pipeline` - Customer data
- `customer_fact_pipeline` - Customer facts
- `customer_score_pipeline` - Customer scores
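
Since every module in the list exposes the same attributes, their shapes can be checked in one loop rather than one import line per module. The sketch below assumes the modules live in a package named `pipelines` (as in the import cell) and that `raw_data` and `transformed_data` have a `.shape` attribute. The helper `summarize_modules` is hypothetical and not part of the pipeline code.

```python
import importlib

# Module names as listed in this notebook.
MODULE_NAMES = [
    "ch_campaign_pipeline",
    "ch_cell_pipeline",
    "ch_offer_pipeline",
    "contact_history_fact_pipeline",
    "customer_pipeline",
    "customer_fact_pipeline",
    "customer_score_pipeline",
]


def summarize_modules(names, package="pipelines"):
    """Return {module name: (raw shape, transformed shape)} for importable modules."""
    shapes = {}
    for name in names:
        try:
            module = importlib.import_module(f"{package}.{name}")
        except ImportError:
            # Skip modules that are not available in this environment.
            continue
        raw = getattr(module, "raw_data", None)
        transformed = getattr(module, "transformed_data", None)
        shapes[name] = (
            getattr(raw, "shape", None),
            getattr(transformed, "shape", None),
        )
    return shapes


for name, (raw_shape, transformed_shape) in summarize_modules(MODULE_NAMES).items():
    print(f"{name}: raw={raw_shape}, transformed={transformed_shape}")
```

Importing a module presumably loads its CSV and runs the transformation at import time, so the loop may be slow for the larger tables.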