# Task 1: Data Enrichment Notebook
## Ethiopia Financial Inclusion Forecasting

**Student:** Biniyam Mitiku  
**Date:** January 31, 2026

### Objectives:
1. Load original dataset
2. Add new observations from found sources
3. Add new events
4. Add new impact links
5. Save enriched dataset
6. Document everything

In [1]:
# Setup and imports
import pandas as pd
import numpy as np
from datetime import datetime
import warnings
warnings.filterwarnings('ignore')

# Display settings
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', 100)
pd.set_option('display.width', 1000)

print("✅ Setup complete")

✅ Setup complete


In [5]:
print("📁 LOADING ALL DATA FILES")
print("=" * 50)

# Load all three files
sheet1_df = pd.read_csv('../data/raw/ethiopia_fi_unified_data.csv')
sheet2_df = pd.read_csv('../data/raw/ethiopia_fi_unified_data_sheet_2.csv')
ref_df = pd.read_csv('../data/raw/reference_codes.csv')

print(f"📄 Sheet 1 (Observations/Events/Targets): {sheet1_df.shape[0]} rows, {sheet1_df.shape[1]} columns")
print(f"📄 Sheet 2 (Impact Links): {sheet2_df.shape[0]} rows, {sheet2_df.shape[1]} columns")
print(f"📄 Reference Codes: {ref_df.shape[0]} rows, {ref_df.shape[1]} columns")

print("\n🔍 SHEET 1 - Record Types:")
print(sheet1_df['record_type'].value_counts())

print("\n🔍 SHEET 2 - Record Types:")
print(sheet2_df['record_type'].value_counts())

print("\n📋 Sample from Sheet 1:")
print(sheet1_df.head(3).to_string())

print("\n📋 Sample from Sheet 2:")
print(sheet2_df.head(3).to_string())

📁 LOADING ALL DATA FILES
📄 Sheet 1 (Observations/Events/Targets): 43 rows, 34 columns
📄 Sheet 2 (Impact Links): 14 rows, 35 columns
📄 Reference Codes: 71 rows, 4 columns

🔍 SHEET 1 - Record Types:
record_type
observation    30
event          10
target          3
Name: count, dtype: int64

🔍 SHEET 2 - Record Types:
record_type
impact_link    14
Name: count, dtype: int64

📋 Sample from Sheet 1:
  record_id  record_type category  pillar               indicator indicator_code indicator_direction  value_numeric value_text  value_type unit observation_date period_start period_end fiscal_year gender  location  region         source_name source_type                                             source_url confidence  related_indicator  relationship_type  impact_direction  impact_magnitude  impact_estimate  lag_months  evidence_basis comparable_country collected_by collection_date  original_text  notes
0  REC_0001  observation      NaN  ACCESS  Account Ownership Rate  ACC_OWN

In [9]:
print("Adding observations...")
new_observations = [
    {
    "record_type": "observation",
    "pillar": "ACCESS",
    "indicator": "Account Ownership Rate",
    "indicator_code": "FI.ACC.OWN.TOTL",
    "value_numeric": 46.48,
    "value_type": "percentage",
    "unit": "%",
    "observation_date": "2022-12-31",
    "source_name": "World Bank Global Findex Database",
    "source_url": "https://www.worldbank.org/en/publication/globalfindex",
    "source_type": "survey",
    "confidence": "high",
    "comparable_country": None,  # lowercase so the key matches the sheet's column name
    "collected_by": "Biniyam Mitiku",
    "collection_date": "2026-01-31",
    "original_text": None,
    "notes": "Account ownership among population ages 15+ in Ethiopia; measured via Global Findex survey",
    }
    ,
    {
    "record_type": "observation",
    "pillar": "ACCESS",
    "indicator": "Account Ownership Rate (Male)",
    "indicator_code": "FI.ACC.OWN.TOTL_M",
    "value_numeric": 55,
    "value_type": "percentage",
    "unit": "%",
    "observation_date": "2022-12-31",
    "source_name": "World Bank World Development Indicators (WDI)",
    "source_url": "https://databank.worldbank.org/source/world-development-indicators",
    "source_type": "official_statistics",
    "confidence": "high",
    "collected_by": "Biniyam Mitiku",
    "collection_date": "2026-01-31",
    "notes": "Male population ages 15+ in Ethiopia; account ownership at a financial institution or mobile money provider",
    "gender":"Male"
    }
    ,
    {
     "record_type": "observation",
    "pillar": "ACCESS",
    "indicator": "Account Ownership Rate (Male)",
    "indicator_code": "FI.ACC.OWN.TOTL_M",
    "value_numeric": 23,
    "value_type": "percentage",
    "unit": "%",
    "observation_date": "2014-12-31",
    "source_name": "World Bank World Development Indicators (WDI)",
    "source_url": "https://databank.worldbank.org/source/world-development-indicators",
    "source_type": "official_statistics",
    "confidence": "high",
    "collected_by": "Biniyam Mitiku",
    "collection_date": "2026-01-31",
    "notes": "Male population ages 15+ in Ethiopia; account ownership at a financial institution or mobile money provider",
    "gender":"Male"
    } 
    ,
    {
    "record_type": "observation",
    "pillar": "ACCESS",
    "indicator": "Account Ownership Rate (Male)",
    "indicator_code": "FI.ACC.OWN.TOTL_M",
    "value_numeric": 41,
    "value_type": "percentage",
    "unit": "%",
    "observation_date": "2014-12-31",
    "source_name": "World Bank World Development Indicators (WDI)",
    "source_url": "https://databank.worldbank.org/source/world-development-indicators",
    "source_type": "official_statistics",
    "confidence": "high",
    "collected_by": "Biniyam Mitiku",
    "collection_date": "2026-01-31",
    "notes": "Male population ages 15+ in Ethiopia; account ownership at a financial institution or mobile money provider",
    "gender":"Male"
    },
    {
    "record_type": "observation",
    "pillar": "ACCESS",
    "indicator": "International exports in digitally-deliverable services",
    "indicator_code": "ACC_Digital_Delivery",
    "value_numeric": 184,
    "value_type": "percentage",
    "unit": "%",
    "observation_date": "2010/01/31",
    "source_name": "Global Findex 2014",
    "source_url": "https://www.worldbank.org/en/publication/globalfindex",
    "source_type": "survey",
    "confidence": "high",
    "collected_by": "Biniyam Mitiku",
    "collection_date": "2026-02-01",
    "notes": "International exports in digitally-deliverable services have become a major driver of global trade.",
    "gender":"All"
    }
    ,
    {
    "record_type": "observation",
    "pillar": "ACCESS",
    "indicator": "International exports in digitally-deliverable services",
    "indicator_code": "ACC_Digital_Delivery",
    "value_numeric": 165,
    "value_type": "percentage",
    "unit": "%",
    "observation_date": "2011/1/31",
    "source_name": "Global Findex 2014",
    "source_url": "https://www.worldbank.org/en/publication/globalfindex",
    "source_type": "survey",
    "confidence": "high",
    "collected_by": "Biniyam Mitiku",
    "collection_date": "2026-02-01",
    "notes": "International exports in digitally-deliverable services have become a major driver of global trade.",
    "gender":"All"
    }
    ,
    {
    "record_type": "observation",
    "pillar": "ACCESS",
    "indicator": "International exports in digitally-deliverable services",
    "indicator_code": "ACC_Digital_Delivery",
    "value_numeric": 975,
    "value_type": "percentage",
    "unit": "%",
    "observation_date": "2023/12/31",
    "source_name": "Global Findex 2014",
    "source_url": "https://www.worldbank.org/en/publication/globalfindex",
    "source_type": "survey",
    "confidence": "high",
    "collected_by": "Biniyam Mitiku",
    "collection_date": "2026-02-01",
    "notes": "International exports in digitally-deliverable services have become a major driver of global trade.",
    "gender":"All"
    },
    {
    "record_type": "observation",
    "pillar": "ACCESS",
    "indicator": "Mobile cellular subscriptions (per 100 people)",
    "indicator_code": "Mob_Sub",
    "value_numeric": 57,
    "value_type": "percentage",
    "unit": "%",
    "observation_date": "2022/12/31",
    "source_name": "World Bank",
    "source_url": "https://data.humdata.org/dataset/world-bank-infrastructure-indicators-for-ethiopia",
    "source_type": "survey",
    "confidence": "high",
    "collected_by": "Biniyam Mitiku",
    "collection_date": "2026-02-01",
    "notes": "Mobile cellular subscriptions (per 100 people) indicate the penetration of mobile phone access within a population, reflecting connectivity and digital inclusion levels.",
    "gender":"All"
    }

]
print(f"📊 Defined {len(new_observations)} new observations")


Adding observations...
📊 Defined 8 new observations
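
Two of the records above share the same indicator code, observation date, and gender, so it may be worth checking for colliding keys before appending. The following is a minimal sketch, assuming that `indicator_code`, `observation_date`, and `gender` together identify an observation (that key is an assumption, not something the dataset schema states). The helper name `find_duplicate_keys` and the `sample_records` list are hypothetical stand-ins, not the notebook's data.

```python
import pandas as pd

def find_duplicate_keys(records, key_cols=("indicator_code", "observation_date", "gender")):
    """Return the rows whose key columns collide with at least one other row."""
    df = pd.DataFrame(records)
    wanted = list(key_cols) + ["value_numeric"]
    # Make sure every inspected column exists, even if no record sets it
    for col in wanted:
        if col not in df.columns:
            df[col] = None
    # keep=False marks every member of a duplicate group, not only the later ones
    mask = df.duplicated(subset=list(key_cols), keep=False)
    return df.loc[mask, wanted]

# Hypothetical stand-in records (illustration only)
sample_records = [
    {"indicator_code": "X_RATE_M", "observation_date": "2014-12-31", "gender": "Male", "value_numeric": 23},
    {"indicator_code": "X_RATE_M", "observation_date": "2014-12-31", "gender": "Male", "value_numeric": 41},
    {"indicator_code": "X_RATE_M", "observation_date": "2022-12-31", "gender": "Male", "value_numeric": 55},
]

duplicates = find_duplicate_keys(sample_records)
print(duplicates)
```

Rows flagged this way still need a manual look at the source, since the check cannot tell which value, date, or gender label is the intended one.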


In [10]:
# NEW EVENTS to add to Sheet 1
print("Adding new events...")
new_events = [
    {
        "record_id": None,  # Will be auto-generated
        "record_type": "event",
        "category": "partnership",  # REQUIRED for events
        "pillar": None,  # IMPORTANT: Events should NOT have pillars
        "indicator": "Promote Access to Finance",
        "indicator_code": "EVT_Finance",
        "indicator_direction": "higher_better",
        "value_numeric": None,
        "value_text" : "implemented",
        "observation_date": "2023-08-08",
        "source_name": "NBE",
        "source_type": "news",
        "source_url": "https://nbe.gov.et/nbe_news/nbe-and-ifc-sign-an-agreement-to-promote-access-to-finance-for-the-underserved/",
        "confidence": "high",
        "collected_by": "Biniyam Mitiku",
        "collection_date": "2026-01-31",
        "original_text": None,
        "notes": "Promoting access to finance expands financial inclusion, empowers individuals and businesses, and drives sustainable economic growth"
    },
    {
        "record_id": None,  # Will be auto-generated
        "record_type": "event",
        "category": "policy",  # REQUIRED for events
        "pillar": None,  # IMPORTANT: Events should NOT have pillars
        "indicator": "Interest rate",
        "indicator_code": "EVT_interestRate",
        "indicator_direction": "higher_better",
        "value_numeric": None,
        "value_text" : "implemented",
        "observation_date": "2024-07-09",
        "source_name": "NBE",
        "source_type": "report",
        "source_url": "https://nbe.gov.et/nbe_news/nbe-and-ifc-sign-an-agreement-to-promote-access-to-finance-for-the-underserved/",
        "confidence": "high",
        "collected_by": "Biniyam Mitiku",
        "collection_date": "2026-01-31",
        "original_text": None,
        "notes": None
    },
    {
        "record_id": None,  # Will be auto-generated
        "record_type": "event",
        "category": "policy",  # REQUIRED for events
        "pillar": None,  # IMPORTANT: Events should NOT have pillars
        "indicator": "Account Ownership Gender Gap",
        "indicator_code": "EVT_Gen_Gap",
        "indicator_direction": "lower_better",
        "value_numeric": None,
        "value_text" : "launched",
        "observation_date": "2020-07-08",
        "source_name": "NBE",
        "source_type": "report",
        "source_url": "https://nbe.gov.et/nbe_news/nbe-and-ifc-sign-an-agreement-to-promote-access-to-finance-for-the-underserved/",
        "confidence": "high",
        "collected_by": "Biniyam Mitiku",
        "collection_date": "2026-01-31",
        "original_text": None,
        "notes": None
    }
]
print(f"🎯 Defined {len(new_events)} new events")


Adding new events...
🎯 Defined 3 new events


In [11]:
# NEW IMPACT LINKS to add to Sheet 2
print("Adding the new impact_links")
new_impact_links = [
    {
        "record_id": None,  # Will be auto-generated
        "record_type": "impact_link",
        "category": None,  # Usually empty for impact_links
        "pillar": "USAGE",  # REQUIRED - which pillar is affected
        "indicator": None,  # Usually empty for impact_links
        "indicator_code": None,
        "indicator_direction": None,
        "value_numeric": None,
        "value_text": None,
        "value_type": None,
        "unit": None,
        "observation_date": None,  # Usually empty for impact_links
        "period_start": None,
        "period_end": None,
        "fiscal_year": None,
        "gender": None,
        "location": None,
        "region": None,
        "source_name": "Modeled Impact",
        "source_type": "model",
        "source_url": None,
        "confidence": "high",
        "related_indicator": "USG_P2P_COUNT",  # REQUIRED - which indicator is affected
        "relationship_type": "direct",
        "impact_direction": "increase",  # REQUIRED
        "impact_magnitude": "high",  # REQUIRED
        "impact_estimate": 15.0,  # Quantitative estimate
        "lag_months": 24,  # Months for effect to manifest
        "evidence_basis": "comparable_country",
        "comparable_country": "Kenya, Tanzania",
        "collected_by": "Biniyam Mitiku",
        "collection_date": "2026-01-31",
        "original_text": "Based on similar mobile money launches in East Africa. Users may reduce reliance on P2P lending platforms since loans become less attractive compared to saving or alternative financing",
        "notes": None,
        "parent_id": "EVT_0012"  # REQUIRED - which event this links to
    },
    {
        "record_id": None,  # Will be auto-generated
        "record_type": "impact_link",
        "category": None,  # Usually empty for impact_links
        "pillar": "ACCESS",  # REQUIRED - which pillar is affected
        "indicator": None,  # Usually empty for impact_links
        "indicator_code": None,
        "indicator_direction": None,
        "value_numeric": None,
        "value_text": None,
        "value_type": None,
        "unit": None,
        "observation_date": None,  # Usually empty for impact_links
        "period_start": None,
        "period_end": None,
        "fiscal_year": None,
        "gender": None,
        "location": None,
        "region": None,
        "source_name": None,
        "source_type": None,
        "source_url": None,
        "confidence": "high",
        "related_indicator": "ACC_MM_ACCOUNT",  # REQUIRED - which indicator is affected
        "relationship_type": "direct",
        "impact_direction": "increase",  # REQUIRED
        "impact_magnitude": "medium",  # REQUIRED
        "impact_estimate": 10.0,  # Quantitative estimate
        "lag_months": 6,  # Months for effect to manifest
        "evidence_basis": "empirical",
        "comparable_country": None,
        "collected_by": "Biniyam Mitiku",
        "collection_date": "2026-01-31",
        "original_text": None,
        "notes": "Expanding access to finance encourages more people, especially the unbanked, to open mobile money accounts",
        "parent_id": "EVT_0011"  # REQUIRED - which event this links to 
    },
    {
        "record_id": None,  # Will be auto-generated
        "record_type": "impact_link",
        "category": None,  # Usually empty for impact_links
        "pillar": "ACCESS",  # REQUIRED - which pillar is affected
        "indicator": None,  # Usually empty for impact_links
        "indicator_code": None,
        "indicator_direction": None,
        "value_numeric": None,
        "value_text": None,
        "value_type": None,
        "unit": None,
        "observation_date": None,  # Usually empty for impact_links
        "period_start": None,
        "period_end": None,
        "fiscal_year": None,
        "gender": None,
        "location": None,
        "region": None,
        "source_name": None,
        "source_type": None,
        "source_url": None,
        "confidence": "high",
        "related_indicator": "ACC_OWNERSHIP",  # REQUIRED - which indicator is affected
        "relationship_type": "direct",
        "impact_direction": "increase",  # REQUIRED
        "impact_magnitude": "high",  # REQUIRED
        "impact_estimate": 20.0,  # Quantitative estimate
        "lag_months": 12,  # Months for effect to manifest
        "evidence_basis": "empirical",
        "comparable_country": None,
        "collected_by": "Biniyam Mitiku",
        "collection_date": "2026-01-31",
        "original_text": None,
        "notes": "Lowering barriers such as documentation requirements and high fees encourages more people to open accounts.",
        "parent_id": "EVT_0011"  # REQUIRED - which event this links to 
    },
    {
        "record_id": None,  # Will be auto-generated
        "record_type": "impact_link",
        "category": None,  # Usually empty for impact_links
        "pillar": "USAGE",  # REQUIRED - which pillar is affected
        "indicator": None,  # Usually empty for impact_links
        "indicator_code": None,
        "indicator_direction": None,
        "value_numeric": None,
        "value_text": None,
        "value_type": None,
        "unit": None,
        "observation_date": None,  # Usually empty for impact_links
        "period_start": None,
        "period_end": None,
        "fiscal_year": None,
        "gender": None,
        "location": None,
        "region": None,
        "source_name": None,
        "source_type": None,
        "source_url": None,
        "confidence": "high",
        "related_indicator": "GEN_GAP_ACC",  # REQUIRED - which indicator is affected
        "relationship_type": "direct",
        "impact_direction": "increase",  # REQUIRED
        "impact_magnitude": "high",  # REQUIRED
        "impact_estimate": 15.0,  # Quantitative estimate
        "lag_months": 7,  # Months for effect to manifest
        "evidence_basis": "empirical",
        "comparable_country": None,
        "collected_by": "Biniyam Mitiku",
        "collection_date": "2026-01-31",
        "original_text": None,
        "notes": "Access to credit enables women entrepreneurs to expand businesses and create jobs",
        "parent_id": "EVT_0013"  # REQUIRED - which event this links to 
 
    }
]
print(f"🔗 Defined {len(new_impact_links)} new impact links")

Adding the new impact_links
🔗 Defined 4 new impact links
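
The inline comments in the cells above mark some fields as required: `category` for events, and `pillar`, `related_indicator`, `impact_direction`, `impact_magnitude`, and `parent_id` for impact links. A minimal sketch of a check for those rules follows. The `REQUIRED_FIELDS` mapping and the `missing_required` helper are hypothetical names, derived only from those comments rather than from any official schema.

```python
# Required fields per record type, taken from the "REQUIRED" comments in the cells above
REQUIRED_FIELDS = {
    "event": ["category"],
    "impact_link": ["pillar", "related_indicator", "impact_direction",
                    "impact_magnitude", "parent_id"],
}

def missing_required(record):
    """Return the required fields that are absent or None in a record dict."""
    required = REQUIRED_FIELDS.get(record.get("record_type"), [])
    return [field for field in required if record.get(field) is None]

# Hypothetical records for illustration
ok_link = {"record_type": "impact_link", "pillar": "ACCESS",
           "related_indicator": "ACC_OWNERSHIP", "impact_direction": "increase",
           "impact_magnitude": "high", "parent_id": "EVT_0011"}
bad_link = {"record_type": "impact_link", "pillar": "USAGE", "parent_id": None}

print(missing_required(ok_link))
print(missing_required(bad_link))
```

Running a check like this over `new_events` and `new_impact_links` before appending would surface incomplete records early, instead of leaving empty cells in the saved CSV.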


In [16]:
def generate_record_id(base_df, record_type):
    """Generate next available record_id."""
    prefix_map = {
        'observation': 'OBS',
        'event': 'EVT',
        'impact_link': 'IMP',
        'target': 'TGT'
    }
    
    prefix = prefix_map.get(record_type, 'REC')
    
    # Get existing IDs for this record_type
    existing_ids = base_df[base_df['record_type'] == record_type]['record_id']
    
    # Extract numbers
    existing_nums = []
    for id_str in existing_ids.dropna():
        if str(id_str).startswith(prefix):
            try:
                num = int(str(id_str).split('_')[1])
                existing_nums.append(num)
            except (IndexError, ValueError):
                # Skip IDs that do not follow the PREFIX_NNNN pattern
                pass
    
    next_num = max(existing_nums) + 1 if existing_nums else 1
    return f"{prefix}_{next_num:04d}"

def fix_date_format(date_str):
    """Convert date to YYYY-MM-DD format."""
    if pd.isna(date_str):
        return None
    
    # Try the known formats in order; for ambiguous slash dates the first match wins
    for fmt in ("%Y-%m-%d", "%m/%d/%Y", "%d/%m/%Y", "%Y/%m/%d"):
        try:
            return datetime.strptime(str(date_str), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    
    return str(date_str)  # Return unchanged if no format matches
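
Because `fix_date_format` tries `%m/%d/%Y` before `%d/%m/%Y`, a slash-separated date whose day is 12 or less is always read month-first. The sketch below reimplements the same loop in standalone form to show which inputs match more than one format. The name `parse_with_formats` is hypothetical and not part of the notebook.

```python
from datetime import datetime

FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d/%m/%Y", "%Y/%m/%d")

def parse_with_formats(date_str, formats=FORMATS):
    """Return (first_iso_date, all_matches) for a date string."""
    matches = []
    for fmt in formats:
        try:
            parsed = datetime.strptime(date_str, fmt)
        except ValueError:
            continue
        matches.append((fmt, parsed.strftime("%Y-%m-%d")))
    # The first match wins, mirroring the loop order in fix_date_format
    first = matches[0][1] if matches else None
    return first, matches

print(parse_with_formats("2011/1/31"))   # matches only the year-first slash format
print(parse_with_formats("03/04/2022"))  # matches both month-first and day-first
```

If any source uses day-first dates, one option is to normalise them to `YYYY-MM-DD` by hand when defining the records, so the helper never has to guess.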

In [18]:
print("📥 ADDING OBSERVATIONS TO SHEET 1")
print("=" * 40)

# Create copy for enrichment
sheet1_enriched = sheet1_df.copy()

# Process and add each observation
for obs in new_observations:
    # Fix date format
    obs['observation_date'] = fix_date_format(obs['observation_date'])
    obs['collection_date'] = fix_date_format(obs.get('collection_date'))
    
    # Generate record_id
    obs['record_id'] = generate_record_id(sheet1_enriched, 'observation')
    
    # Convert to DataFrame row
    obs_df = pd.DataFrame([obs])
    
    # Ensure all columns match
    for col in sheet1_enriched.columns:
        if col not in obs_df.columns:
            obs_df[col] = None
    
    # Reorder columns
    obs_df = obs_df[sheet1_enriched.columns]
    
    # Append
    sheet1_enriched = pd.concat([sheet1_enriched, obs_df], ignore_index=True)
    print(f"✅ Added observation: {obs['indicator']} ({obs['observation_date']})")

print(f"\n📊 Sheet 1 now has {len(sheet1_enriched)} total records")
print(f"   Observations: {len(sheet1_enriched[sheet1_enriched['record_type'] == 'observation'])}")

📥 ADDING OBSERVATIONS TO SHEET 1
✅ Added observation: Account Ownership Rate (2022-12-31)
✅ Added observation: Account Ownership Rate (Male) (2022-12-31)
✅ Added observation: Account Ownership Rate (Male) (2014-12-31)
✅ Added observation: Account Ownership Rate (Male) (2014-12-31)
✅ Added observation: International exports in digitally-deliverable services (2010-01-31)
✅ Added observation: International exports in digitally-deliverable services (2011-01-31)
✅ Added observation: International exports in digitally-deliverable services (2023-12-31)
✅ Added observation: Mobile cellular subscriptions (per 100 people) (2022-12-31)

📊 Sheet 1 now has 51 total records
   Observations: 38
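
The loop above concatenates one row at a time and fills missing columns by hand. An alternative sketch, assuming the same column-alignment goal, builds one frame for all new rows and aligns it with `DataFrame.reindex`. Here `base` and `new_rows` are hypothetical stand-ins for `sheet1_enriched` and `new_observations`, and the sequential `OBS_` numbering assumes no `OBS_` IDs exist yet.

```python
import pandas as pd

# Hypothetical stand-ins for the base sheet and the new records
base = pd.DataFrame({
    "record_id": ["REC_0001"],
    "record_type": ["observation"],
    "indicator": ["Account Ownership Rate"],
    "notes": ["baseline record"],
})
new_rows = [
    {"record_type": "observation", "indicator": "Example A", "extra_key": "dropped"},
    {"record_type": "observation", "indicator": "Example B", "notes": "has a note"},
]

# reindex keeps only the base columns, in base order, and fills absent ones with NaN
new_df = pd.DataFrame(new_rows).reindex(columns=base.columns)

# Assign sequential IDs in one step (assumes numbering starts at 1)
new_df["record_id"] = [f"OBS_{i:04d}" for i in range(1, len(new_df) + 1)]

enriched = pd.concat([base, new_df], ignore_index=True)
print(enriched)
```

Like the column reordering in the loop, `reindex` silently drops any key that does not match a base column, so a misspelled key is lost without an error. Comparing the record keys against the sheet's columns before appending is one way to catch that.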


In [19]:
print("\n🎯 ADDING EVENTS TO SHEET 1")
print("=" * 40)

# Process and add each event
for event in new_events:
    # Fix date format
    event['observation_date'] = fix_date_format(event['observation_date'])
    event['collection_date'] = fix_date_format(event.get('collection_date'))
    
    # Generate record_id
    event['record_id'] = generate_record_id(sheet1_enriched, 'event')
    
    # Important: Events should NOT have pillar
    if 'pillar' in event:
        event.pop('pillar')
    
    # Convert to DataFrame row
    event_df = pd.DataFrame([event])
    
    # Ensure all columns match
    for col in sheet1_enriched.columns:
        if col not in event_df.columns:
            event_df[col] = None
    
    # Reorder columns
    event_df = event_df[sheet1_enriched.columns]
    
    # Append
    sheet1_enriched = pd.concat([sheet1_enriched, event_df], ignore_index=True)
    print(f"✅ Added event: {event['indicator']} ({event['observation_date']})")

print(f"\n📊 Sheet 1 now has {len(sheet1_enriched)} total records")
print(f"   Events: {len(sheet1_enriched[sheet1_enriched['record_type'] == 'event'])}")


🎯 ADDING EVENTS TO SHEET 1
✅ Added event: Promote Access to Finance (2023-08-08)
✅ Added event: Interest rate (2024-07-09)
✅ Added event: Account Ownership Gender Gap (2020-07-08)

📊 Sheet 1 now has 54 total records
   Events: 13


In [20]:
print("\n🔗 ADDING IMPACT LINKS TO SHEET 2")
print("=" * 40)

# Create copy for enrichment
sheet2_enriched = sheet2_df.copy()

# Get event IDs for reference
event_ids = sheet1_enriched[sheet1_enriched['record_type'] == 'event']['record_id'].tolist()
print(f"Available event IDs for parent_id: {event_ids[-3:]}")  # Show last 3

# Process and add each impact link
for imp in new_impact_links:
    # Fix evidence_basis typo
    if imp.get('evidence_basis') == 'emprical':
        imp['evidence_basis'] = 'empirical'
    
    # Fix date format
    imp['collection_date'] = fix_date_format(imp.get('collection_date'))
    
    # Generate record_id
    imp['record_id'] = generate_record_id(sheet2_enriched, 'impact_link')
    
    # Convert to DataFrame row
    imp_df = pd.DataFrame([imp])
    
    # Ensure all columns match
    for col in sheet2_enriched.columns:
        if col not in imp_df.columns:
            imp_df[col] = None
    
    # Reorder columns
    imp_df = imp_df[sheet2_enriched.columns]
    
    # Append
    sheet2_enriched = pd.concat([sheet2_enriched, imp_df], ignore_index=True)
    print(f"✅ Added impact link: {imp['related_indicator']} → {imp['pillar']}")

print(f"\n📊 Sheet 2 now has {len(sheet2_enriched)} total records")


🔗 ADDING IMPACT LINKS TO SHEET 2
Available event IDs for parent_id: ['EVT_0011', 'EVT_0012', 'EVT_0013']
✅ Added impact link: USG_P2P_COUNT → USAGE
✅ Added impact link: ACC_MM_ACCOUNT → ACCESS
✅ Added impact link: ACC_OWNERSHIP → ACCESS
✅ Added impact link: GEN_GAP_ACC → USAGE

📊 Sheet 2 now has 18 total records
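
Each impact link points at an event through `parent_id`, so a quick referential check can catch links whose parent is missing from Sheet 1. The following is a sketch with hypothetical frames `events` and `links` standing in for the event rows of `sheet1_enriched` and the rows of `sheet2_enriched`.

```python
import pandas as pd

# Hypothetical stand-ins for the event rows of Sheet 1 and the rows of Sheet 2
events = pd.DataFrame({
    "record_id": ["EVT_0011", "EVT_0012"],
    "record_type": ["event", "event"],
})
links = pd.DataFrame({
    "record_id": ["IMP_0001", "IMP_0002", "IMP_0003"],
    "parent_id": ["EVT_0011", "EVT_0012", "EVT_0099"],
})

# Links whose parent_id does not match any event record_id
orphans = links[~links["parent_id"].isin(events["record_id"])]
print(orphans)
```

An empty `orphans` frame means every link resolves to an event. It does not confirm that the link describes the right event, which still needs a manual review of the `parent_id` assignments.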


In [21]:
print("\n💾 SAVING ENRICHED DATASETS")
print("=" * 40)

import os

# Create processed directory
os.makedirs('../data/processed', exist_ok=True)

# Save Sheet 1
sheet1_path = '../data/processed/ethiopia_fi_enriched_sheet1.csv'
sheet1_enriched.to_csv(sheet1_path, index=False)
print(f"✅ Saved enriched Sheet 1 to: {sheet1_path}")
print(f"   Total records: {len(sheet1_enriched)}")

# Save Sheet 2
sheet2_path = '../data/processed/ethiopia_fi_enriched_sheet2.csv'
sheet2_enriched.to_csv(sheet2_path, index=False)
print(f"✅ Saved enriched Sheet 2 to: {sheet2_path}")
print(f"   Total records: {len(sheet2_enriched)}")

# Save combined version (for analysis)
combined_df = pd.concat([sheet1_enriched, sheet2_enriched], ignore_index=True)
combined_path = '../data/processed/ethiopia_fi_enriched_combined.csv'
combined_df.to_csv(combined_path, index=False)
print(f"✅ Saved combined dataset to: {combined_path}")
print(f"   Total records: {len(combined_df)}")


💾 SAVING ENRICHED DATASETS
✅ Saved enriched Sheet 1 to: ../data/processed/ethiopia_fi_enriched_sheet1.csv
   Total records: 54
✅ Saved enriched Sheet 2 to: ../data/processed/ethiopia_fi_enriched_sheet2.csv
   Total records: 18
✅ Saved combined dataset to: ../data/processed/ethiopia_fi_enriched_combined.csv
   Total records: 72
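
After saving, it can be useful to read a file back and confirm that the row count, columns, and date strings survived the round trip. The sketch below writes to a temporary directory rather than `../data/processed`. The `sample` frame and the file name are hypothetical.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample standing in for an enriched sheet
sample = pd.DataFrame({
    "record_id": ["OBS_0001", "EVT_0011"],
    "record_type": ["observation", "event"],
    "observation_date": ["2022-12-31", "2023-08-08"],
    "value_numeric": [46.48, None],
})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "enriched_sample.csv")
    sample.to_csv(path, index=False)
    # Read the date column as text so the ISO strings come back exactly as written
    reloaded = pd.read_csv(path, dtype={"observation_date": str})

print(reloaded.shape == sample.shape)
print(list(reloaded.columns) == list(sample.columns))
```

The same comparison could be run against each saved path in this cell, using the in-memory frame as the reference.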


In [26]:
print("\n📈 ENRICHMENT SUMMARY")
print("=" * 40)

print(f"Original Sheet 1: {len(sheet1_df)} records")
print(f"Enriched Sheet 1: {len(sheet1_enriched)} records")
print(f"Added to Sheet 1: {len(sheet1_enriched) - len(sheet1_df)} records")

print(f"\nOriginal Sheet 2: {len(sheet2_df)} records")
print(f"Enriched Sheet 2: {len(sheet2_enriched)} records")
print(f"Added to Sheet 2: {len(sheet2_enriched) - len(sheet2_df)} records")

print("\n🎯 YOUR ADDITIONS:")
print(f"  • Observations: {len(new_observations)}")
print(f"  • Events: {len(new_events)}")
print(f"  • Impact Links: {len(new_impact_links)}")
print(f"  • Total New Records: {len(new_observations) + len(new_events) + len(new_impact_links)}")


📈 ENRICHMENT SUMMARY
Original Sheet 1: 43 records
Enriched Sheet 1: 54 records
Added to Sheet 1: 11 records

Original Sheet 2: 14 records
Enriched Sheet 2: 18 records
Added to Sheet 2: 4 records

🎯 YOUR ADDITIONS:
  • Observations: 8
  • Events: 3
  • Impact Links: 4
  • Total New Records: 15
