# Module 4: Hourly Updates

## Purpose
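This module runs hourly to monitor performance against targets and, once the action engine is defined, adjust prices. It tracks two measures:
1. **UTH (Up-Till-Hour) Performance**: Cumulative quantity and retailer count from the start of the day up to the current hour
2. **Last Hour Performance**: Quantity and retailer count for the most recent completed hour only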

## Schedule
- Runs hourly from 12 PM to 12 AM (midnight), **except** Module 3 hours (12 PM, 3 PM, 6 PM, 9 PM)
- Also runs once at 3 AM
- Active hours: 1 PM, 2 PM, 4 PM, 5 PM, 7 PM, 8 PM, 10 PM, 11 PM, 12 AM, 3 AM
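
The schedule above can be expressed as a small guard. This is a minimal sketch and not part of the notebook: `is_module_4_hour` and `MODULE_4_EXTRA_HOURS` are hypothetical names, and hours are assumed to be 24-hour integers in Cairo local time.

```python
# Hypothetical schedule guard; hours are 24-hour integers in Cairo local time.
MODULE_3_HOURS = {12, 15, 18, 21}   # 12 PM, 3 PM, 6 PM, 9 PM (handled by Module 3)
MODULE_4_EXTRA_HOURS = {3}          # the single 3 AM run

def is_module_4_hour(hour: int) -> bool:
    """Return True if Module 4 should run at the given hour (0-23)."""
    if hour in MODULE_4_EXTRA_HOURS:
        return True
    # 12 PM through 11 PM, plus midnight (hour 0), minus the Module 3 hours
    in_window = hour == 0 or 12 <= hour <= 23
    return in_window and hour not in MODULE_3_HOURS

active_hours = [h for h in range(24) if is_module_4_hour(h)]
print(active_hours)
```

A check like this could sit at the top of the notebook so that a run triggered at a Module 3 hour exits early instead of producing an output file.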

## Data Flow
```
data_extraction.ipynb → pricing_with_discount.xlsx
                              ↓
                        Module 4 (this module)
                           ├── Load p80_qty, p70_retailers
                           ├── Query UTH performance (today)
                           ├── Query Last Hour performance (today)
                           ├── Query historical hour contributions
                           ├── Calculate targets & statuses
                           └── Generate actions (TBD)
```

## Status Outputs
- `uth_qty_status`: growing / dropping / on_track
- `uth_rets_status`: growing / dropping / on_track
- `last_hour_qty_status`: growing / dropping / on_track
- `last_hour_rets_status`: growing / dropping / on_track
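
Each status comes from comparing actual performance with a target. The sketch below assumes the ±10% band configured later in this notebook and treats the band edges as inclusive, as `get_status` does. `classify_ratio` is a hypothetical vectorized helper, not a function from the module.

```python
import numpy as np
import pandas as pd

ON_TRACK_THRESHOLD = 0.10  # assumed ±10% band around the target

def classify_ratio(ratio: pd.Series) -> pd.Series:
    """Map actual/target ratios to growing / dropping / on_track."""
    conditions = [
        ratio >= 1 + ON_TRACK_THRESHOLD,   # at or above 110% of target
        ratio <= 1 - ON_TRACK_THRESHOLD,   # at or below 90% of target
    ]
    labels = np.select(conditions, ['growing', 'dropping'], default='on_track')
    return pd.Series(labels, index=ratio.index)

ratios = pd.Series([0.21, 1.02, 1.70])
print(classify_ratio(ratios).tolist())
```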


In [1]:
# =============================================================================
# IMPORTS AND SETUP
# =============================================================================
import pandas as pd
import numpy as np
from datetime import datetime
import pytz
import sys
sys.path.append('..')

# Import queries module for Snowflake access
%run queries_module.ipynb

# Cairo timezone
CAIRO_TZ = pytz.timezone('Africa/Cairo')
CAIRO_NOW = datetime.now(CAIRO_TZ)
CURRENT_HOUR = CAIRO_NOW.hour

print("Module 4: Hourly Updates")
print(f"Current Cairo Time: {CAIRO_NOW.strftime('%Y-%m-%d %H:%M:%S')}")
print(f"Current Hour: {CURRENT_HOUR}")



/home/ec2-user/.Renviron
/home/ec2-user/service_account_key.json
Queries Module | Timezone: America/Los_Angeles
✅ UTH and Last Hour functions defined

QUERIES MODULE READY

Live Data Functions:
  • get_current_stocks()
  • get_packing_units()
  • get_current_prices()
  • get_current_wac()
  • get_current_cart_rules()

UTH Performance Functions:
  • get_uth_performance()         - UTH qty/retailers (Snowflake)
  • get_hourly_distribution()     - Historical hour contributions (Snowflake)
  • get_last_hour_performance()   - Last hour qty/retailers (DWH)

Note: Market prices use MODULE_1_INPUT data
Retailer Selection Queries defined ✓
  - get_churned_dropped_retailers()
  - get_category_not_product_retailers()
  - get_out_of_cycle_retailers()
  - get_view_no_orders_retailers()
  - get_excluded_retailers()
  - get_retailers_with_quantity_discount()
  - get_retailer_main_warehouse()
Module 4: Hourly Updates
Current Cairo Time: 2026-01-25 16:21:31
Current Hour: 16


In [2]:
# =============================================================================
# CONFIGURATION
# =============================================================================

# Input/Output files
INPUT_FILE = '../pricing_with_discount.xlsx'
OUTPUT_FILE = f'module_4_output_{CAIRO_NOW.strftime("%Y%m%d_%H%M")}.xlsx'

# Status thresholds (±10% of benchmark = On Track)
ON_TRACK_THRESHOLD = 0.10  # 10%
GROWING_THRESHOLD = 1 + ON_TRACK_THRESHOLD  # >110% = Growing
DROPPING_THRESHOLD = 1 - ON_TRACK_THRESHOLD  # <90% = Dropping

# Module 3 hours (skip these)
MODULE_3_HOURS = [12, 15, 18, 21]  # 12 PM, 3 PM, 6 PM, 9 PM

print(f"Input: {INPUT_FILE}")
print(f"Output: {OUTPUT_FILE}")
print(f"Status Thresholds: Dropping <{DROPPING_THRESHOLD*100:.0f}%, On Track ±{ON_TRACK_THRESHOLD*100:.0f}%, Growing >{GROWING_THRESHOLD*100:.0f}%")


Input: ../pricing_with_discount.xlsx
Output: module_4_output_20260125_1621.xlsx
Status Thresholds: Dropping <90%, On Track ±10%, Growing >110%


In [3]:
# =============================================================================
# LOAD DATA FROM DATA EXTRACTION
# =============================================================================
print("Loading data from data_extraction output...")
df = pd.read_excel(INPUT_FILE)
print(f"Loaded {len(df)} records")

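# Ensure required columns exist with proper types.
# Missing benchmark columns are created with their defaults first: a scalar
# default passed through pd.to_numeric() comes back as a scalar with no .fillna().
if 'p80_daily_240d' not in df.columns:
    df['p80_daily_240d'] = 0
if 'p70_daily_retailers_240d' not in df.columns:
    df['p70_daily_retailers_240d'] = 1
df['p80_daily_240d'] = pd.to_numeric(df['p80_daily_240d'], errors='coerce').fillna(0)
df['p70_daily_retailers_240d'] = pd.to_numeric(df['p70_daily_retailers_240d'], errors='coerce').fillna(1)
df['warehouse_id'] = df['warehouse_id'].astype(int)
df['product_id'] = df['product_id'].astype(int)
# Cast cohort_id only when present; assigning None would create the column
# and make the later "'cohort_id' in df.columns" check always true.
if 'cohort_id' in df.columns:
    df['cohort_id'] = df['cohort_id'].astype(int)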

# Get category for hourly distribution merge
if 'cat' not in df.columns and 'category' in df.columns:
    df['cat'] = df['category']

print(f"\nP80 Qty Stats: min={df['p80_daily_240d'].min():.1f}, max={df['p80_daily_240d'].max():.1f}, mean={df['p80_daily_240d'].mean():.1f}")
print(f"P70 Retailers Stats: min={df['p70_daily_retailers_240d'].min():.1f}, max={df['p70_daily_retailers_240d'].max():.1f}, mean={df['p70_daily_retailers_240d'].mean():.1f}")

# =============================================================================
# GET CURRENT CART RULES (Fresh from Snowflake)
# =============================================================================
df_cart_rules = get_current_cart_rules()

# Merge with main df (by cohort_id + product_id)
if 'cohort_id' in df.columns and len(df_cart_rules) > 0:
    df = df.drop(columns=['current_cart_rule'], errors='ignore')
    df = df.merge(df_cart_rules, on=['cohort_id', 'product_id'], how='left')
    df['current_cart_rule'] = df['current_cart_rule'].fillna(999)
    print(f"✅ Merged cart rules: {len(df)} records")
else:
    df['current_cart_rule'] = df.get('current_cart_rule', 999)


Loading data from data_extraction output...
Loaded 28382 records

P80 Qty Stats: min=0.0, max=1652.8, mean=9.9
P70 Retailers Stats: min=0.0, max=146.6, mean=2.6
Fetching current cart rules...
  Loaded 72968 records
✅ Merged cart rules: 28382 records
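
The cart-rule extract has more rows than the base data (72,968 versus 28,382 in the run above), so a repeated key on the right side would silently multiply base rows. The sketch below uses toy data and assumes cart rules are meant to be unique per `cohort_id` + `product_id`, which the notebook does not confirm. It uses the `validate` argument of `DataFrame.merge` to fail fast:

```python
import pandas as pd

# Toy stand-ins for the base data and the cart-rule extract
base = pd.DataFrame({'cohort_id': [700, 700, 701], 'product_id': [60, 105, 60]})
rules = pd.DataFrame({
    'cohort_id': [700, 701],
    'product_id': [60, 60],
    'current_cart_rule': [24, 30],
})

# many_to_one raises pandas.errors.MergeError if `rules` repeats a key
merged = base.merge(rules, on=['cohort_id', 'product_id'],
                    how='left', validate='many_to_one')
merged['current_cart_rule'] = merged['current_cart_rule'].fillna(999)

assert len(merged) == len(base)  # a left merge on unique right keys keeps the row count
print(merged)
```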


In [4]:
# =============================================================================
# QUERY 1: TODAY'S UTH (Up-Till-Hour) PERFORMANCE
# =============================================================================
# Gets cumulative qty and retailers from start of day until current hour
# Uses get_uth_performance() from queries_module

df_uth_today = get_uth_performance()


Fetching UTH performance from Snowflake...
  Loaded 6806 UTH records


In [5]:
# =============================================================================
# QUERY 2: TODAY'S LAST HOUR PERFORMANCE (from DWH)
# =============================================================================
# Gets qty and retailers for the PREVIOUS hour only (not cumulative)
# Uses get_last_hour_performance() from queries_module (DWH/PostgreSQL)

df_last_hour = get_last_hour_performance()


Fetching last hour performance from DWH...
  Loaded 1322 last hour records from DWH


In [6]:
# =============================================================================
# QUERY 3: HISTORICAL HOURLY DISTRIBUTION (Last 4 Months) - By Category & Warehouse
# =============================================================================
# Gets:
# - avg_uth_pct_qty/retailers: Average contribution of hours 0 to (current_hour-1) to daily total
# - avg_last_hour_pct_qty/retailers: Average contribution of (current_hour-1) alone to daily total
# Uses get_hourly_distribution() from queries_module

df_hourly_dist = get_hourly_distribution()
print(f"\nAvg UTH % (qty): {df_hourly_dist['avg_uth_pct_qty'].mean()*100:.1f}%")
print(f"Avg Last Hour % (qty): {df_hourly_dist['avg_last_hour_pct_qty'].mean()*100:.1f}%")


Fetching hourly distribution from Snowflake...
  Loaded 770 hourly distribution records

Avg UTH % (qty): 48.8%
Avg Last Hour % (qty): 6.1%


In [7]:
# =============================================================================
# MERGE DATA
# =============================================================================
print("Merging performance data with base data...")

# Merge UTH today data
if len(df_uth_today) > 0:
    df = df.merge(df_uth_today, on=['warehouse_id', 'product_id'], how='left')
else:
    df['uth_qty'] = 0
    df['uth_nmv'] = 0
    df['uth_retailers'] = 0

# Merge last hour data
if len(df_last_hour) > 0:
    df = df.merge(df_last_hour, on=['warehouse_id', 'product_id'], how='left')
else:
    df['last_hour_qty'] = 0
    df['last_hour_nmv'] = 0
    df['last_hour_retailers'] = 0

# Merge hourly distribution (by warehouse_id + cat)
if len(df_hourly_dist) > 0:
    df = df.merge(df_hourly_dist, on=['warehouse_id', 'cat'], how='left')
else:
    df['avg_uth_pct_qty'] = 0.5
    df['avg_uth_pct_retailers'] = 0.5
    df['avg_last_hour_pct_qty'] = 0.05
    df['avg_last_hour_pct_retailers'] = 0.05

# Fill NaN values
df['uth_qty'] = df['uth_qty'].fillna(0)
df['uth_nmv'] = df['uth_nmv'].fillna(0)
df['uth_retailers'] = df['uth_retailers'].fillna(0)
df['last_hour_qty'] = df['last_hour_qty'].fillna(0)
df['last_hour_nmv'] = df['last_hour_nmv'].fillna(0)
df['last_hour_retailers'] = df['last_hour_retailers'].fillna(0)
df['avg_uth_pct_qty'] = df['avg_uth_pct_qty'].fillna(0.5)
df['avg_uth_pct_retailers'] = df['avg_uth_pct_retailers'].fillna(0.5)
df['avg_last_hour_pct_qty'] = df['avg_last_hour_pct_qty'].fillna(0.05)
df['avg_last_hour_pct_retailers'] = df['avg_last_hour_pct_retailers'].fillna(0.05)

print(f"✅ Merged data: {len(df)} records")
print(f"\nUTH Qty Stats: min={df['uth_qty'].min():.0f}, max={df['uth_qty'].max():.0f}, mean={df['uth_qty'].mean():.1f}")
print(f"Last Hour Qty Stats: min={df['last_hour_qty'].min():.0f}, max={df['last_hour_qty'].max():.0f}, mean={df['last_hour_qty'].mean():.1f}")
print(f"Current Cart Rule Stats: min={df['current_cart_rule'].min():.0f}, max={df['current_cart_rule'].max():.0f}, mean={df['current_cart_rule'].mean():.1f}")


Merging performance data with base data...
✅ Merged data: 28382 records

UTH Qty Stats: min=0, max=341, mean=1.4
Last Hour Qty Stats: min=0, max=257, mean=0.1
Current Cart Rule Stats: min=1, max=10000, mean=88.1
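
The per-column `fillna` calls in the cell above could be collapsed into one dictionary of defaults. This is a minimal sketch on toy data: `FILL_DEFAULTS` is a hypothetical name, and its values mirror the defaults used in the cell.

```python
import numpy as np
import pandas as pd

# Hypothetical defaults table; values copied from the merge cell
FILL_DEFAULTS = {
    'uth_qty': 0, 'uth_nmv': 0, 'uth_retailers': 0,
    'last_hour_qty': 0, 'last_hour_nmv': 0, 'last_hour_retailers': 0,
    'avg_uth_pct_qty': 0.5, 'avg_uth_pct_retailers': 0.5,
    'avg_last_hour_pct_qty': 0.05, 'avg_last_hour_pct_retailers': 0.05,
}

toy = pd.DataFrame({
    'uth_qty': [2.0, np.nan],
    'avg_uth_pct_qty': [np.nan, 0.47],
})

# Apply defaults only for the columns that are actually present
present = {col: val for col, val in FILL_DEFAULTS.items() if col in toy.columns}
toy = toy.fillna(value=present)
print(toy)
```

Keeping the defaults in one place makes it easier to see that missing distribution rows fall back to 50% (UTH) and 5% (last hour).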


In [8]:
# =============================================================================
# CALCULATE TARGETS AND STATUSES
# =============================================================================
print("Calculating UTH and Last Hour targets and statuses...")

def get_status(ratio):
    """
    Determine status based on ratio to target (band edges are inclusive).
    - Growing: >=110% of target
    - On Track: strictly between 90% and 110% of target
    - Dropping: <=90% of target
    """
    if ratio >= GROWING_THRESHOLD:
        return 'growing'
    elif ratio <= DROPPING_THRESHOLD:
        return 'dropping'
    else:
        return 'on_track'

# Calculate UTH targets
# UTH target = p80_qty * avg_uth_pct (historical % of day that should be done by now)
df['uth_qty_target'] = df['p80_daily_240d'] * df['avg_uth_pct_qty']
df['uth_rets_target'] = df['p70_daily_retailers_240d'] * df['avg_uth_pct_retailers']

# Calculate Last Hour targets
# Last hour target = p80_qty * avg_last_hour_pct (historical % of day for this hour)
df['last_hour_qty_target'] = df['p80_daily_240d'] * df['avg_last_hour_pct_qty']
df['last_hour_rets_target'] = df['p70_daily_retailers_240d'] * df['avg_last_hour_pct_retailers']

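# Calculate ratios (actual / target)
# Note: a zero target is replaced with 1, so the ratio equals the raw actual value;
# rows with zero target and zero actual therefore get ratio 0 and status 'dropping'.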
df['uth_qty_ratio'] = df['uth_qty'] / df['uth_qty_target'].replace(0, 1)
df['uth_rets_ratio'] = df['uth_retailers'] / df['uth_rets_target'].replace(0, 1)
df['last_hour_qty_ratio'] = df['last_hour_qty'] / df['last_hour_qty_target'].replace(0, 1)
df['last_hour_rets_ratio'] = df['last_hour_retailers'] / df['last_hour_rets_target'].replace(0, 1)

# Calculate statuses
df['uth_qty_status'] = df['uth_qty_ratio'].apply(get_status)
df['uth_rets_status'] = df['uth_rets_ratio'].apply(get_status)
df['last_hour_qty_status'] = df['last_hour_qty_ratio'].apply(get_status)
df['last_hour_rets_status'] = df['last_hour_rets_ratio'].apply(get_status)

print("✅ Targets and statuses calculated")

# Summary
print(f"\n{'='*60}")
print("UTH STATUS DISTRIBUTION")
print(f"{'='*60}")
print(f"\nUTH Qty Status:")
print(df['uth_qty_status'].value_counts().to_string())
print(f"\nUTH Retailers Status:")
print(df['uth_rets_status'].value_counts().to_string())

print(f"\n{'='*60}")
print("LAST HOUR STATUS DISTRIBUTION")
print(f"{'='*60}")
print(f"\nLast Hour Qty Status:")
print(df['last_hour_qty_status'].value_counts().to_string())
print(f"\nLast Hour Retailers Status:")
print(df['last_hour_rets_status'].value_counts().to_string())


Calculating UTH and Last Hour targets and statuses...
✅ Targets and statuses calculated

UTH STATUS DISTRIBUTION

UTH Qty Status:
uth_qty_status
dropping    25392
growing      2172
on_track      818

UTH Retailers Status:
uth_rets_status
dropping    24267
growing      2699
on_track     1416

LAST HOUR STATUS DISTRIBUTION

Last Hour Qty Status:
last_hour_qty_status
dropping    27612
growing       664
on_track      106

Last Hour Retailers Status:
last_hour_rets_status
dropping    27335
growing       914
on_track      133
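
In the cell above, a zero target is replaced with 1, so the ratio equals the raw actual and a product with no target and no sales is classified as `dropping`. One possible alternative, sketched here on the assumption that such rows should be kept out of the status logic, leaves the ratio undefined and labels the row separately. The `no_target` label and the `safe_ratio` helper are hypothetical and not part of the module.

```python
import numpy as np
import pandas as pd

def safe_ratio(actual: pd.Series, target: pd.Series) -> pd.Series:
    """Return actual / target, with NaN wherever the target is zero."""
    return actual / target.replace(0, np.nan)

toy = pd.DataFrame({
    'uth_qty':        [2.0, 1.0, 0.0],
    'uth_qty_target': [9.438326, 0.0, 0.0],
})
toy['uth_qty_ratio'] = safe_ratio(toy['uth_qty'], toy['uth_qty_target'])

# The first matching condition wins, so undefined ratios are labelled first
toy['uth_qty_status'] = np.select(
    [
        toy['uth_qty_ratio'].isna(),
        toy['uth_qty_ratio'] >= 1.10,
        toy['uth_qty_ratio'] <= 0.90,
    ],
    ['no_target', 'growing', 'dropping'],
    default='on_track',
)
print(toy)
```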


In [9]:
# =============================================================================
# SAMPLE OUTPUT - Current Status
# =============================================================================
# Show sample of data with all calculated fields

sample_cols = [
    'warehouse_id', 'product_id', 'sku',
    # P80/P70 benchmarks
    'p80_daily_240d', 'p70_daily_retailers_240d',
    # Current cart rule
    'current_cart_rule',
    # UTH performance
    'uth_qty', 'uth_qty_target', 'uth_qty_ratio', 'uth_qty_status',
    'uth_retailers', 'uth_rets_target', 'uth_rets_ratio', 'uth_rets_status',
    # Last hour performance
    'last_hour_qty', 'last_hour_qty_target', 'last_hour_qty_ratio', 'last_hour_qty_status',
    'last_hour_retailers', 'last_hour_rets_target', 'last_hour_rets_ratio', 'last_hour_rets_status'
]

# Filter to columns that exist
sample_cols = [c for c in sample_cols if c in df.columns]

print(f"\n{'='*60}")
print("SAMPLE DATA (First 10 rows with UTH > 0)")
print(f"{'='*60}")
sample = df[df['uth_qty'] > 0][sample_cols].head(10)
display(sample)



SAMPLE DATA (First 10 rows with UTH > 0)


index,warehouse_id,product_id,sku,p80_daily_240d,p70_daily_retailers_240d,current_cart_rule,uth_qty,uth_qty_target,uth_qty_ratio,uth_qty_status,...,uth_rets_ratio,uth_rets_status,last_hour_qty,last_hour_qty_target,last_hour_qty_ratio,last_hour_qty_status,last_hour_retailers,last_hour_rets_target,last_hour_rets_ratio,last_hour_rets_status
1,401,972,نواعم بسكويت- 5 ج,19.0,3.399,695,2.0,9.438326,0.211902,dropping,...,0.596998,dropping,0.0,1.313408,0.0,dropping,0.0,0.235045,0.0,dropping
6,962,12003,الضحى مكرونة بصوص ماك آند تشيز - 175 جم,10.0,2.0,10,8.0,4.699908,1.702161,growing,...,2.143938,growing,0.0,0.68343,0.0,dropping,0.0,0.141416,0.0,dropping
10,236,60,كوفى ميكس بونجورنو فى الخمسينة - 6 جم,155.2,19.0,24,20.0,81.137965,0.246494,dropping,...,0.41578,dropping,8.0,9.395767,0.851447,dropping,2.0,1.184463,1.688529,growing
11,962,60,كوفى ميكس بونجورنو فى الخمسينة - 6 جم,152.0,17.0,24,2.0,77.171317,0.025916,dropping,...,0.23049,dropping,0.0,8.614587,0.0,dropping,0.0,1.042148,0.0,dropping
14,703,105,فجيتار كنور عادى - 35 جم,11.0,4.0,14,2.0,5.150579,0.388306,dropping,...,1.089137,on_track,0.0,0.687555,0.0,dropping,0.0,0.262991,0.0,dropping
18,632,204,نعناع ايزيس - 12 فتلة,0.0,0.0,25,1.0,0.0,1.0,on_track,...,1.0,on_track,0.0,0.0,0.0,dropping,0.0,0.0,0.0,dropping
21,1,12924,زيت ثمرات خليط - 700 مل,18.0,8.0,7,22.0,8.581029,2.563795,growing,...,1.565085,growing,0.0,1.120786,0.0,dropping,0.0,0.496428,0.0,dropping
27,339,858,هنادى زيت خليط- 2.1 لتر,3.2,2.0,7,1.0,1.519307,0.658195,dropping,...,1.085628,on_track,0.0,0.212883,0.0,dropping,0.0,0.139737,0.0,dropping
28,170,858,هنادى زيت خليط- 2.1 لتر,2.0,1.0,7,1.0,0.963123,1.038289,on_track,...,2.13464,growing,1.0,0.12319,8.117545,growing,1.0,0.063196,15.823782,growing
34,703,10447,هارفست فول مصفى بالليمون المعصفر خصم 15% - 400 جم,12.0,1.0,12,6.0,5.869494,1.022235,on_track,...,4.021469,growing,0.0,0.699715,0.0,dropping,0.0,0.058983,0.0,dropping
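
The first sample row above (warehouse 401, product 972) can be checked by hand. The sketch below recomputes the ratio from the displayed values and derives the historical UTH share implied by the target. The share is inferred from the table here, not read from `df_hourly_dist`.

```python
# Values copied from the first sample row
p80_daily = 19.0          # p80_daily_240d
uth_qty = 2.0             # units sold so far today
uth_qty_target = 9.438326

# target = p80 * avg_uth_pct_qty, so the implied share is target / p80
implied_uth_pct = uth_qty_target / p80_daily
uth_qty_ratio = uth_qty / uth_qty_target

# At or below 0.90 is 'dropping', matching the status shown in the table
status = 'dropping' if uth_qty_ratio <= 0.90 else 'other'
print(f"implied share={implied_uth_pct:.4f}, ratio={uth_qty_ratio:.6f}, status={status}")
```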


In [10]:
# =============================================================================
# ACTION ENGINE (TO BE DEFINED)
# =============================================================================
# This section will contain the action logic based on:
# - uth_qty_status, uth_rets_status
# - last_hour_qty_status, last_hour_rets_status
#
# Placeholder for now - actions will be defined by user

print(f"\n{'='*60}")
print("MODULE 4 - DATA PREPARATION COMPLETE")
print(f"{'='*60}")
print(f"\nReady for action definition. Available statuses:")
print(f"  - uth_qty_status: {df['uth_qty_status'].unique().tolist()}")
print(f"  - uth_rets_status: {df['uth_rets_status'].unique().tolist()}")
print(f"  - last_hour_qty_status: {df['last_hour_qty_status'].unique().tolist()}")
print(f"  - last_hour_rets_status: {df['last_hour_rets_status'].unique().tolist()}")
print(f"\nTotal records: {len(df)}")



MODULE 4 - DATA PREPARATION COMPLETE

Ready for action definition. Available statuses:
  - uth_qty_status: ['dropping', 'growing', 'on_track']
  - uth_rets_status: ['dropping', 'growing', 'on_track']
  - last_hour_qty_status: ['dropping', 'growing', 'on_track']
  - last_hour_rets_status: ['dropping', 'growing', 'on_track']

Total records: 28382
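
Before actions are defined, it may help to see how the UTH and last-hour statuses co-occur. This is a minimal sketch on toy data (the real `df` is not reproduced here) using `pd.crosstab`. It only summarizes counts and does not propose any action rules.

```python
import pandas as pd

# Toy status columns standing in for the real df
toy = pd.DataFrame({
    'uth_qty_status':       ['dropping', 'dropping', 'growing', 'on_track'],
    'last_hour_qty_status': ['dropping', 'dropping', 'dropping', 'growing'],
})

# Rows: UTH status, columns: last-hour status, cells: number of records
combo = pd.crosstab(toy['uth_qty_status'], toy['last_hour_qty_status'])
print(combo)
```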
