# Agricultural Aid Monitoring Workflow - Modular Version

This notebook demonstrates the complete agricultural monitoring workflow using the modularized package structure.

## Overview
The workflow has been modularized into:
- **`agricultural_monitoring.config`** - Configuration and settings
- **`agricultural_monitoring.models`** - Data models and schemas
- **`agricultural_monitoring.extractors`** - Web content extraction
- **`agricultural_monitoring.processors`** - LLM processing and filtering
- **`agricultural_monitoring.workflows`** - Complete monitoring workflows
- **`agricultural_monitoring.monitoring`** - LangSmith observability
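
As a rough mental model of how these modules might chain together, the sketch below runs named stages in order and stops at the first failure. It is an assumption, not the package's actual implementation: `run_stages`, the stage callables, and the `error` key are hypothetical. Only the `web_content` and `llm_extraction` key names and the per-stage `status` field mirror what the result dictionary exposes in section 3 of this notebook.

```python
from typing import Any, Callable, Dict, List, Tuple

# A stage is a (result_key, callable) pair; the callable receives the
# results gathered so far and returns a dict that includes a "status".
Stage = Tuple[str, Callable[[Dict[str, Any]], Dict[str, Any]]]


def run_stages(url: str, stages: List[Stage]) -> Dict[str, Any]:
    """Run hypothetical pipeline stages in order, stopping at the first error."""
    result: Dict[str, Any] = {"url": url}
    for key, stage in stages:
        try:
            result[key] = stage(result)
        except Exception as exc:  # record the failure instead of raising
            result[key] = {"status": "error", "error": str(exc)}
            break  # later stages depend on this one, so skip them
    return result
```

With this shape, a caller can tell which stage failed by looking for the first key whose `status` is not `success`; keys for stages that never ran are simply absent.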

## 1. Setup and Imports

In [1]:
# Core imports
import sys
from pathlib import Path

# Add project root to path
project_root = Path().resolve()
if str(project_root) not in sys.path:
    sys.path.insert(0, str(project_root))

# Import modular components
from agricultural_monitoring.workflows import AgriculturalMonitoringWorkflow
from agricultural_monitoring.config.settings import TARGET_URLS, setup_langsmith
from agricultural_monitoring.monitoring import run_monitored_workflow, LangSmithAnalyzer

# Setup environment
langsmith_enabled = setup_langsmith()

print("✅ Modular agricultural monitoring package loaded")
print(f"✅ LangSmith tracing: {'enabled' if langsmith_enabled else 'disabled'}")
print(f"✅ Target URLs: {len(TARGET_URLS)} configured")

# Display target URLs
for i, url in enumerate(TARGET_URLS, 1):
    print(f"   {i}. {url}")

✅ Modular agricultural monitoring package loaded
✅ LangSmith tracing: enabled
✅ Target URLs: 3 configured
   1. https://agriculture.gouv.fr/mots-cles/aides
   2. https://ain-rhone.msa.fr/lfp/soutien-exploitant
   3. https://www.franceagrimer.fr/rechercher-une-aide


* 'schema_extra' has been renamed to 'json_schema_extra'
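
The cell above only shows that `setup_langsmith()` returns a truthy flag when tracing is active. The following is a minimal sketch of the kind of check such a helper could perform, assuming it relies on the standard LangSmith environment variables (`LANGSMITH_TRACING` / `LANGSMITH_API_KEY`, or the older `LANGCHAIN_TRACING_V2` / `LANGCHAIN_API_KEY`). The function name `tracing_configured` is hypothetical, and the package's real `setup_langsmith` may behave differently.

```python
import os
from typing import Mapping, Optional


def tracing_configured(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when a tracing flag is 'true' and an API key is present.

    Hypothetical helper: it only inspects environment variables and does
    not contact LangSmith.
    """
    env = os.environ if env is None else env
    # Accept either the newer LANGSMITH_* names or the older LANGCHAIN_* ones.
    flag = env.get("LANGSMITH_TRACING") or env.get("LANGCHAIN_TRACING_V2") or ""
    api_key = env.get("LANGSMITH_API_KEY") or env.get("LANGCHAIN_API_KEY") or ""
    return flag.strip().lower() == "true" and bool(api_key.strip())
```

Passing a plain dictionary instead of `os.environ` makes the check easy to exercise without touching the real environment.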


## 2. Quick Start - Complete Workflow

In [2]:
# Create enhanced workflow with inline link processing
workflow = AgriculturalMonitoringWorkflow()

## 3. Test Single URL

In [4]:
# Test with one URL first
test_url = TARGET_URLS[2]
print(f"🧪 Testing workflow with: {test_url}")

result = workflow.monitor_url(test_url)

print("\n📊 Results:")
print(f"Web extraction: {result['web_content']['status']}")
print(f"LLM processing: {result['llm_extraction']['status']}")
print(f"Data normalization: {result['normalized_data']['metadata']['status']}")
print(f"Memory filtering: {result.get('memory_filtered_data', {}).get('status', 'N/A')}")

# Show found aids
memory_data = result.get('memory_filtered_data', {})
if memory_data.get('status') == 'success':
    aids_found = memory_data.get('aides', [])
    before_filtering = len(result['normalized_data']['aides'])
    after_filtering = len(aids_found)
    
    print("\n🔍 Filtering results:")
    print(f"   Before filtering: {before_filtering} aids")
    print(f"   After filtering: {after_filtering} aids")
    print(f"   Filtered out: {memory_data.get('filtered_count', 0)} aids")
    
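    if aids_found:
        print("\n✅ New agricultural aids found:")
        # Show at most the first three aids, with a truncated description
        for i, aide in enumerate(aids_found[:3], 1):
            print(f"   {i}. {aide['titre_aide']}")
            print(f"      {aide['description'][:100]}...")
    else:
        print("\n📝 No new aids found (all filtered by memory)")
else:
    # Reached whenever memory filtering did not report 'success'
    print("\n❌ Memory filtering did not succeed; check the stage statuses above")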

🧪 Testing workflow with: https://www.franceagrimer.fr/rechercher-une-aide
🔍 Starting monitoring workflow for: https://www.franceagrimer.fr/rechercher-une-aide
📥 Fetching content from: https://www.franceagrimer.fr/rechercher-une-aide (attempt 1)

📊 Results:
Web extraction: success
LLM processing: success
Data normalization: success
Memory filtering: success

🔍 Filtering results:
   Before filtering: 16 aids
   After filtering: 16 aids
   Filtered out: 0 aids

✅ New agricultural aids found:
   1. Aides à l'assistance technique
      Soutien aux acteurs agricoles pour l'assistance technique...
   2. Aide aux plus démunis
      Dispositif de soutien aux populations agricoles en situation de précarité...
   3. Aides à la promotion générique des produits
      Soutien à la promotion des produits agricoles...
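
The "memory filtering" stage reports how many aids are new and how many were already known. A minimal sketch of that idea follows, assuming aids are deduplicated by a normalized `titre_aide`. The matching rule and the helper names `normalize_title` and `filter_new_aids` are assumptions, not the package's actual logic; the returned keys (`status`, `aides`, `new_count`, `filtered_count`) are the ones this notebook reads from `memory_filtered_data`.

```python
from typing import Any, Dict, List, Set


def normalize_title(title: str) -> str:
    """Lowercase the title and collapse runs of whitespace."""
    return " ".join(title.lower().split())


def filter_new_aids(aides: List[Dict[str, Any]], seen_titles: Set[str]) -> Dict[str, Any]:
    """Keep only aids whose normalized title is not yet in `seen_titles`.

    `seen_titles` is updated in place, so it acts as the memory
    shared between successive runs (hypothetical design).
    """
    new_aids = []
    for aide in aides:
        key = normalize_title(aide.get("titre_aide", ""))
        if key not in seen_titles:
            seen_titles.add(key)
            new_aids.append(aide)
    return {
        "status": "success",
        "aides": new_aids,
        "new_count": len(new_aids),
        "filtered_count": len(aides) - len(new_aids),
    }
```

Calling `filter_new_aids` twice with the same batch and the same `seen_titles` set returns every aid as new at most once; on the second call all of them are counted in `filtered_count`.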


## 4. Production Monitoring - All URLs

In [None]:
# Run production monitoring with all URLs
print("🚀 Running production monitoring with all target URLs...")
print("=" * 60)

# Use the monitoring function for comprehensive tracking
results = run_monitored_workflow(workflow, TARGET_URLS)

# Detailed results analysis
successful_results = results['successful_results']
failed_results = results['failed_results']
summary = results['summary']

print("\n📋 DETAILED RESULTS:")
for i, result in enumerate(successful_results + failed_results, 1):
    url = result.get('web_content', {}).get('url', 'Unknown')
    status = result.get('normalized_data', {}).get('metadata', {}).get('status', 'Unknown')
    
    print(f"\n{i}. {url}")
    print(f"   Status: {status}")
    
    if status == 'success':
        normalized_aids = result['normalized_data']['aides']
        aids_before = len(normalized_aids)
        memory_data = result.get('memory_filtered_data', {})
        # Fall back to the unfiltered aids when memory filtering did not succeed
        memory_ok = memory_data.get('status') == 'success'
        aids_after = memory_data.get('new_count', aids_before) if memory_ok else aids_before

        print("   Web extraction: ✅")
        print(f"   Content length: {len(result['web_content']['content']):,} chars")
        print(f"   Links found: {result['web_content'].get('links_count', 0)}")
        print(f"   Aids before filtering: {aids_before}")
        print(f"   Aids after filtering: {aids_after}")

        # Show sample aids
        final_aids = memory_data.get('aides', normalized_aids) if memory_ok else normalized_aids
        if final_aids:
            print("   Sample aids:")
            for aide in final_aids[:2]:
                print(f"     • {aide['titre_aide']}")
    else:
        error = result.get('normalized_data', {}).get('metadata', {}).get('error', 'Unknown error')
        print(f"   Error: {error}")
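
The per-URL report above can also be condensed into a few totals. The sketch below shows one way to do that from a list of result dictionaries, assuming they have the shape read in the loop (`normalized_data`, `metadata`, `memory_filtered_data`, `aides`). The helper `summarize_results` and its output keys are hypothetical and are not the structure of the `summary` value returned by `run_monitored_workflow`.

```python
from typing import Any, Dict, List


def summarize_results(results: List[Dict[str, Any]]) -> Dict[str, int]:
    """Count successes, failures, and retained aids across result dicts."""
    totals = {"urls": 0, "succeeded": 0, "failed": 0, "aids_total": 0}
    for result in results:
        totals["urls"] += 1
        normalized = result.get("normalized_data", {})
        status = normalized.get("metadata", {}).get("status")
        if status != "success":
            totals["failed"] += 1
            continue
        totals["succeeded"] += 1
        unfiltered = normalized.get("aides", [])
        memory = result.get("memory_filtered_data", {})
        # Prefer the memory-filtered list when that stage succeeded
        if memory.get("status") == "success":
            aides = memory.get("aides", unfiltered)
        else:
            aides = unfiltered
        totals["aids_total"] += len(aides)
    return totals
```

For example, `summarize_results(successful_results + failed_results)` would give a single dictionary of counts for the whole run.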