# SIRENE ETL Service Showcase

This notebook demonstrates the complete ETL (Extract, Transform, Load) capabilities of the SIRENE API client. We'll extract real company data from the SIRENE API, transform it into structured Pydantic models, and then walk through coordinate conversion, validation modes, serialization, error handling, and performance.

## Features Demonstrated

- **Data Extraction**: Complete SIREN history with all establishments
- **Data Transformation**: Raw API data → Structured Pydantic models
- **Coordinate Conversion**: Lambert 93 → WGS84 coordinates
- **Validation Modes**: Strict, Lenient, and Permissive validation
- **Error Handling**: Comprehensive error management
- **Data Relationships**: Company-facility relationships and temporal data


## Setup and Imports


In [1]:
from datetime import date
import json

# ETL Service imports
from sirene_api_client import (
    AuthenticatedClient,
    ETLConfig,
    ValidationMode,
    extract_and_transform_siren,
)

# Coordinate conversion utilities
from sirene_api_client.etl.coordinators import lambert93_to_wgs84

print("✅ All imports successful!")
print(f"📅 Current date: {date.today()}")

✅ All imports successful!
📅 Current date: 2025-10-15


## Configuration

Set up the API client and ETL configuration. **Note**: You'll need a valid INSEE API key to use the actual API.


In [3]:
# API Configuration
API_BASE_URL = "https://api.insee.fr/api-sirene/3.11"

# You can get a free API key from: https://api.insee.fr/catalogue/site/themes/wso2/subthemes/insee/pages/item-info.jag?name=Insee&version=v1&provider=insee
# For demo purposes, we'll use a placeholder - replace with your actual key
import os

from dotenv import load_dotenv

# Load variables from .env file (if present)
load_dotenv()

API_TOKEN = os.environ.get("SIRENE_API_TOKEN", "your-api-key-here")

# Authentication is handled via an HTTP header: the client sends the API key
# as-is (empty prefix) in the header named by `auth_header_name`.
client = AuthenticatedClient(
    base_url=API_BASE_URL,
    token=API_TOKEN,
    prefix="",  # No prefix for API key
    auth_header_name="X-INSEE-Api-Key-Integration",  # Correct header name
)


# ETL Configuration
config = ETLConfig(
    validation_mode=ValidationMode.STRICT,
    include_personal_data=False,
    coordinate_precision="approximate",
    max_retries=3,
    timeout_seconds=30,
)

print(f"🔧 ETL Config: {config}")
print(f"🌐 API Base URL: {API_BASE_URL}")
print(
    f"🔑 API Key configured: {'Yes' if API_TOKEN != 'your-api-key-here' else 'No (using placeholder)'}"
)

🔧 ETL Config: ETLConfig(validation_mode=<ValidationMode.STRICT: 'strict'>, include_personal_data=False, coordinate_precision='approximate', max_retries=3, timeout_seconds=30)
🌐 API Base URL: https://api.insee.fr/api-sirene/3.11
🔑 API Key configured: Yes
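
The `max_retries` and `timeout_seconds` settings above suggest a retry-with-timeout policy. How the library applies them internally is not shown in this notebook, so the following is only a minimal standalone sketch of one common approach, using `asyncio` from the standard library. The names `call_with_retries` and `flaky_fetch` are hypothetical and not part of `sirene_api_client`.

```python
import asyncio


async def call_with_retries(make_call, max_retries=3, timeout_seconds=30.0, base_delay=0.01):
    """Await make_call() with a per-attempt timeout, retrying on failure.

    Hypothetical helper: one initial attempt plus up to `max_retries` retries,
    sleeping base_delay * 2**attempt between attempts (exponential backoff).
    """
    last_error = None
    for attempt in range(max_retries + 1):
        try:
            return await asyncio.wait_for(make_call(), timeout=timeout_seconds)
        except (asyncio.TimeoutError, ConnectionError) as exc:
            last_error = exc
            if attempt < max_retries:
                await asyncio.sleep(base_delay * 2**attempt)
    raise last_error


attempts = {"count": 0}


async def flaky_fetch():
    """Stand-in for a network call: fails twice, then succeeds."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("simulated connectivity issue")
    return {"siren": "552049447"}


outcome = asyncio.run(call_with_retries(flaky_fetch, max_retries=3, timeout_seconds=1.0))
print(outcome, "after", attempts["count"], "attempts")
```

Inside a notebook cell, where an event loop is already running, replace `asyncio.run(...)` with a plain `await call_with_retries(...)`.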


## Test SIREN Numbers

We'll use a real SIREN number for demonstration: that of SNCF, the French national railway company.


In [4]:
# Well-known French company for testing
TEST_SIRENS = {
    "552049447": "SNCF",
}

# For demo purposes, let's use a known valid SIREN
DEMO_SIREN = "552049447"  # SNCF

print(f"🎯 Demo SIREN: {DEMO_SIREN}")
print(f"📋 Available test SIRENs: {list(TEST_SIRENS.keys())}")

🎯 Demo SIREN: 552049447
📋 Available test SIRENs: ['552049447']
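
A SIREN is a nine-digit number whose last digit is a Luhn check digit, so many typos can be rejected locally before any API call. Whether the library performs this check is not shown here; the sketch below is a standalone illustration, and `is_valid_siren` is a hypothetical helper name.

```python
def is_valid_siren(siren: str) -> bool:
    """Return True if `siren` is 9 ASCII digits and passes the Luhn checksum."""
    if len(siren) != 9 or not (siren.isascii() and siren.isdigit()):
        return False
    total = 0
    # Walk the digits right to left, doubling every second one.
    for position, char in enumerate(reversed(siren)):
        digit = int(char)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


for candidate in ["552049447", "123456789", "abc123def", ""]:
    print(f"{candidate!r}: {is_valid_siren(candidate)}")
```

A checksum only catches malformed numbers. A value such as `000000000` passes the Luhn test trivially, so only the API can confirm whether a company exists.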


## 1. Basic ETL Extraction

Let's start with a simple extraction and transformation:


In [5]:
async def demonstrate_basic_etl():
    """Demonstrate basic ETL extraction and transformation."""
    print(f"🚀 Starting ETL extraction for SIREN: {DEMO_SIREN}")

    try:
        # Extract and transform data
        result = await extract_and_transform_siren(DEMO_SIREN, client, config)

        print("✅ ETL extraction successful!")
        print("📊 Results summary:")
        print(f"   • Company: {result.company.name}")
        print(f"   • Facilities: {len(result.facilities)}")
        print(f"   • Legal periods: {len(result.legal_unit_periods)}")
        print(f"   • Establishment periods: {len(result.establishment_periods)}")
        print(f"   • Addresses: {len(result.addresses)}")
        print(f"   • Activity classifications: {len(result.activity_classifications)}")
        print(f"   • Registry records: {len(result.registry_records)}")

        return result

    except Exception as e:
        print(f"❌ ETL extraction failed: {e}")
        return None


# Run the demonstration
result = await demonstrate_basic_etl()

🚀 Starting ETL extraction for SIREN: 552049447
✅ ETL extraction successful!
📊 Results summary:
   • Company: SOCIETE NATIONALE SNCF
   • Facilities: 9179
   • Legal periods: 10
   • Establishment periods: 43484
   • Addresses: 9179
   • Activity classifications: 17
   • Registry records: 9180
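
The cell above relies on top-level `await`, which works in Jupyter/IPython but not in a plain Python script. As a rough sketch of running the same kind of coroutine from a script, `asyncio.run` can drive it. Here `fake_extract` is a hypothetical stand-in that performs no network I/O, so the example runs without an API key.

```python
import asyncio


async def fake_extract(siren: str) -> dict:
    """Hypothetical stand-in for an async extraction call; performs no network I/O."""
    await asyncio.sleep(0)  # yield control once, as a real request would
    return {"siren": siren, "facilities": []}


async def main() -> dict:
    try:
        return await fake_extract("552049447")
    except Exception as exc:  # mirror the notebook's broad catch for the demo
        print(f"ETL extraction failed: {exc}")
        return {}


# asyncio.run creates an event loop, runs main() to completion, and closes the loop.
script_result = asyncio.run(main())
print(script_result)
```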


## 2. Detailed Data Analysis

Let's examine the transformed data in detail:


In [6]:
if result:
    print("🔍 DETAILED DATA ANALYSIS")
    print("=" * 50)

    # Company Information
    print("\n🏢 COMPANY INFORMATION")
    print(f"   Name: {result.company.name}")
    print(f"   Creation Date: {result.company.creation_date}")
    print(f"   Acronym: {result.company.acronym}")
    print(f"   Employee Band: {result.company.employee_band}")
    print(f"   Company Size Category: {result.company.company_size_category}")
    print(f"   Diffusion Status: {result.company.diffusion_status}")

    # Company Identifiers
    print("\n🆔 COMPANY IDENTIFIERS")
    for identifier in result.company.identifiers:
        print(f"   • {identifier.scheme.upper()}: {identifier.value}")
        print(f"     Normalized: {identifier.normalized_value}")
        print(f"     Verified: {identifier.is_verified}")
        print(f"     Verified At: {identifier.verified_at}")

    # Facilities Information
    print(f"\n🏭 FACILITIES ({len(result.facilities)} total)")
    for i, facility in enumerate(result.facilities[:3], 1):  # Show first 3
        print(f"   {i}. {facility.name}")
        print(
            f"      SIRET: {facility.identifiers[0].value if facility.identifiers else 'N/A'}"
        )
        print(f"      Parent SIREN: {facility.parent_siren}")
        print(f"      Is Headquarters: {facility.is_headquarters}")
        print(f"      Creation Date: {facility.creation_date}")
        print(f"      Employee Band: {facility.employee_band}")

    if len(result.facilities) > 3:
        print(f"   ... and {len(result.facilities) - 3} more facilities")
else:
    print("❌ No data available for analysis")

🔍 DETAILED DATA ANALYSIS

🏢 COMPANY INFORMATION
   Name: SOCIETE NATIONALE SNCF
   Creation Date: 1955-01-01
   Acronym: SNCF
   Employee Band: 53
   Company Size Category: GE
   Diffusion Status: O

🆔 COMPANY IDENTIFIERS
   • SIREN: 552049447
     Normalized: 552049447
     Verified: True
     Verified At: 2025-10-15 21:39:57.642156

🏭 FACILITIES (9179 total)
   1. Unknown Facility
      SIRET: 55204944700014
      Parent SIREN: 552049447
      Is Headquarters: False
      Creation Date: 1986-08-01
      Employee Band: 21
   2. Unknown Facility
      SIRET: 55204944700030
      Parent SIREN: 552049447
      Is Headquarters: False
      Creation Date: 1986-06-23
      Employee Band: NN
   3. Unknown Facility
      SIRET: 55204944700725
      Parent SIREN: 552049447
      Is Headquarters: False
      Creation Date: 1986-06-23
      Employee Band: 21
   ... and 9176 more facilities


## 3. Address and Coordinate Analysis

Let's examine the address data and coordinate conversion:


In [7]:
if result and result.addresses:
    print("📍 ADDRESS AND COORDINATE ANALYSIS")
    print("=" * 50)

    for i, address in enumerate(result.addresses[:3], 1):  # Show first 3
        print(f"\n🏠 ADDRESS {i}")
        print(f"   Country: {address.country}")
        print(f"   Locality: {address.locality}")
        print(f"   Postal Code: {address.postal_code}")
        print(f"   Street Address: {address.street_address}")
        print(f"   Coordinates: ({address.longitude}, {address.latitude})")
        print(f"   Provider: {address.provider}")
        print(f"   Geocode Precision: {address.geocode_precision}")
        print(f"   Valid From: {address.start}")
        print(f"   Valid Until: {address.end or 'Current'}")

        # Show coordinate conversion details.
        # Compare against None: a longitude of 0.0 is a valid coordinate but falsy.
        if address.longitude is not None and address.latitude is not None:
            print(
                f"   🌍 WGS84 Coordinates: {address.longitude:.6f}, {address.latitude:.6f}"
            )
        else:
            print("   ⚠️  No coordinates available")

    if len(result.addresses) > 3:
        print(f"\n... and {len(result.addresses) - 3} more addresses")
else:
    print("❌ No address data available")

📍 ADDRESS AND COORDINATE ANALYSIS

🏠 ADDRESS 1
   Country: FR
   Locality: PARIS
   Postal Code: 75009
   Street Address: 16 RUE DE BUDAPEST
   Coordinates: (2.3274850000000002, 48.87679700000001)
   Provider: sirene
   Geocode Precision: approximate
   Valid From: 1986-08-01
   Valid Until: Current
   🌍 WGS84 Coordinates: 2.327485, 48.876797

🏠 ADDRESS 2
   Country: FR
   Locality: LA CHAPELLE-SUR-ERDRE
   Postal Code: 44240
   Street Address: 14 RUE OLIVIER DE SESMAISONS
   Coordinates: (-1.5516809999999974, 47.29907699999999)
   Provider: sirene
   Geocode Precision: approximate
   Valid From: 1986-06-23
   Valid Until: Current
   🌍 WGS84 Coordinates: -1.551681, 47.299077

🏠 ADDRESS 3
   Country: FR
   Locality: METZ
   Postal Code: 57000
   Street Address: 3 PLACE DU GENERAL DE GAULLE
   Coordinates: (6.176391999999999, 49.11032699999999)
   Provider: sirene
   Geocode Precision: approximate
   Valid From: 1986-06-23
   Valid Until: Current
   🌍 WGS84 Coordinates: 6.176392, 49.110327

## 4. Activity Classifications

Let's examine the NAF activity codes:


In [8]:
if result and result.activity_classifications:
    print("🏷️  ACTIVITY CLASSIFICATIONS")
    print("=" * 50)

    for i, activity in enumerate(result.activity_classifications, 1):
        print(f"\n📋 ACTIVITY {i}")
        print(f"   Scheme: {activity.scheme}")
        print(f"   Code: {activity.code}")
        print(f"   Label: {activity.label}")
        print(f"   Valid From: {activity.start}")
        print(f"   Valid Until: {activity.end or 'Current'}")
else:
    print("❌ No activity classification data available")

🏷️  ACTIVITY CLASSIFICATIONS

📋 ACTIVITY 1
   Scheme: naf_rev2
   Code: 49.10Z
   Label: NAFRev2 49.10Z
   Valid From: 2008-01-01
   Valid Until: Current

📋 ACTIVITY 2
   Scheme: naf1993
   Code: 60.1Z
   Label: NAF1993 60.1Z
   Valid From: 2008-01-01
   Valid Until: Current

📋 ACTIVITY 3
   Scheme: naf_rev1
   Code: 60.1Z
   Label: NAFRev1 60.1Z
   Valid From: 2008-01-01
   Valid Until: Current

📋 ACTIVITY 4
   Scheme: naf_rev2
   Code: 49.20Z
   Label: NAFRev2 49.20Z
   Valid From: 2008-01-01
   Valid Until: Current

📋 ACTIVITY 5
   Scheme: naf_rev2
   Code: 41.20B
   Label: NAFRev2 41.20B
   Valid From: 2008-01-01
   Valid Until: Current

📋 ACTIVITY 6
   Scheme: naf1993
   Code: 45.2B
   Label: NAF1993 45.2B
   Valid From: 2008-01-01
   Valid Until: Current

📋 ACTIVITY 7
   Scheme: naf_rev2
   Code: 53.10Z
   Label: NAFRev2 53.10Z
   Valid From: 2008-01-01
   Valid Until: Current

📋 ACTIVITY 8
   Scheme: naf_rev2
   Code: 87.10B
   Label: NAFRev2 87.10B

## 5. Coordinate Conversion Demonstration

Let's demonstrate the Lambert 93 to WGS84 coordinate conversion:


In [9]:
print("🗺️  COORDINATE CONVERSION DEMONSTRATION")
print("=" * 50)

# Test coordinates: Lambert 93 (x, y) in metres, passed as strings
test_coordinates = [
    ("652345", "6862275", "Paris - Champs-Élysées"),
    ("500000", "6500000", "Southwest France"),
    ("800000", "7200000", "Northeast France"),
]

print("\n📍 Lambert 93 → WGS84 Conversion Examples:")
for lambert_x, lambert_y, description in test_coordinates:
    try:
        wgs84_coords = lambert93_to_wgs84(lambert_x, lambert_y)
        if wgs84_coords:
            lon, lat = wgs84_coords
            print(f"   {description}:")
            print(f"     Lambert 93: ({lambert_x}, {lambert_y})")
            print(f"     WGS84: ({lon:.6f}, {lat:.6f})")
        else:
            print(f"   {description}: Conversion failed")
    except Exception as e:
        print(f"   {description}: Error - {e}")

# Test error handling
print("\n⚠️  Error Handling Examples:")
error_cases = [
    ("invalid", "123456", "Invalid X coordinate"),
    ("123456", "invalid", "Invalid Y coordinate"),
    ("9999999", "9999999", "Out of bounds coordinates"),
]

for lambert_x, lambert_y, description in error_cases:
    try:
        wgs84_coords = lambert93_to_wgs84(lambert_x, lambert_y)
        print(f"   {description}: Unexpected success - {wgs84_coords}")
    except Exception as e:
        print(f"   {description}: Correctly caught error - {type(e).__name__}")

Invalid coordinate format: x=invalid, y=123456, error=could not convert string to float: 'invalid'
Invalid coordinate format: x=123456, y=invalid, error=could not convert string to float: 'invalid'
Coordinates out of expected Lambert 93 range: x=9999999.0, y=9999999.0
Coordinate conversion failed: x=9999999, y=9999999, error=Coordinates out of Lambert 93 range: x=9999999.0, y=9999999.0


🗺️  COORDINATE CONVERSION DEMONSTRATION

📍 Lambert 93 → WGS84 Conversion Examples:
   Paris - Champs-Élysées:
     Lambert 93: (652345, 6862275)
     WGS84: (2.350483, 48.858747)
   Southwest France:
     Lambert 93: (500000, 6500000)
     WGS84: (0.435358, 45.570269)
   Northeast France:
     Lambert 93: (800000, 7200000)
     WGS84: (4.447200, 51.883872)

⚠️  Error Handling Examples:
   Invalid X coordinate: Correctly caught error - CoordinateConversionError
   Invalid Y coordinate: Correctly caught error - CoordinateConversionError
   Out of bounds coordinates: Correctly caught error - CoordinateConversionError
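
To make the conversion less of a black box, here is a standalone sketch of the inverse Lambert conformal conic projection using the defining parameters of Lambert 93 (EPSG:2154) on the GRS80 ellipsoid. It is not the library's implementation (the notebook's performance notes credit pyproj for that), and `lambert93_to_lonlat` is a hypothetical name. It also treats RGF93 geographic coordinates as equivalent to WGS84, a common approximation.

```python
import math

# Defining parameters of Lambert 93 (EPSG:2154) on the GRS80 ellipsoid.
A = 6378137.0                                        # semi-major axis (m)
F_INV = 298.257222101                                # inverse flattening
E = math.sqrt(2 / F_INV - 1 / F_INV**2)              # first eccentricity
PHI1, PHI2 = math.radians(44.0), math.radians(49.0)  # standard parallels
PHI0, LAM0 = math.radians(46.5), math.radians(3.0)   # latitude/longitude of origin
X0, Y0 = 700000.0, 6600000.0                         # false easting/northing (m)


def _m(phi):
    return math.cos(phi) / math.sqrt(1 - (E * math.sin(phi)) ** 2)


def _t(phi):
    s = E * math.sin(phi)
    return math.tan(math.pi / 4 - phi / 2) / ((1 - s) / (1 + s)) ** (E / 2)


# Cone constant, scaling factor, and radius of the origin parallel.
N = (math.log(_m(PHI1)) - math.log(_m(PHI2))) / (math.log(_t(PHI1)) - math.log(_t(PHI2)))
F = _m(PHI1) / (N * _t(PHI1) ** N)
RHO0 = A * F * _t(PHI0) ** N


def lambert93_to_lonlat(x, y):
    """Inverse Lambert conformal conic (two standard parallels); returns degrees."""
    dx, dy = x - X0, RHO0 - (y - Y0)
    rho = math.hypot(dx, dy)
    t = (rho / (A * F)) ** (1 / N)
    lam = math.atan2(dx, dy) / N + LAM0
    phi = math.pi / 2 - 2 * math.atan(t)  # initial guess (spherical case)
    for _ in range(10):  # fixed-point iteration for the ellipsoidal latitude
        s = E * math.sin(phi)
        phi = math.pi / 2 - 2 * math.atan(t * ((1 - s) / (1 + s)) ** (E / 2))
    return math.degrees(lam), math.degrees(phi)


lon, lat = lambert93_to_lonlat(652345.0, 6862275.0)
print(f"({lon:.6f}, {lat:.6f})")
```

For the first test point, the result should land very close to the WGS84 value printed by the notebook. The projection origin (700000, 6600000) maps back to 3°E, 46.5°N by construction, which makes a convenient sanity check.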


## 6. Validation Modes Demonstration

Let's demonstrate the different validation modes:


In [10]:
print("🔍 VALIDATION MODES DEMONSTRATION")
print("=" * 50)

# Test different validation modes
validation_modes = [
    (ValidationMode.STRICT, "Strict - Fails on any validation error"),
    (ValidationMode.LENIENT, "Lenient - Handles missing data gracefully"),
    (ValidationMode.PERMISSIVE, "Permissive - Accepts any data"),
]

for mode, description in validation_modes:
    print(f"\n📋 {description}")

    # Create config with specific validation mode
    test_config = ETLConfig(validation_mode=mode)

    try:
        # Try extraction with this validation mode
        test_result = await extract_and_transform_siren(DEMO_SIREN, client, test_config)
        print(f"   ✅ Success with {mode.value} validation")
        print(f"   📊 Company: {test_result.company.name}")
        print(f"   📊 Facilities: {len(test_result.facilities)}")
    except Exception as e:
        print(f"   ❌ Failed with {mode.value} validation: {type(e).__name__}")
        print(f"   📝 Error: {str(e)[:100]}...")

print("\n💡 Note: All modes should work with valid data, but handle errors differently")

🔍 VALIDATION MODES DEMONSTRATION

📋 Strict - Fails on any validation error
   ✅ Success with strict validation
   📊 Company: SOCIETE NATIONALE SNCF
   📊 Facilities: 9179

📋 Lenient - Handles missing data gracefully
   ✅ Success with lenient validation
   📊 Company: SOCIETE NATIONALE SNCF
   📊 Facilities: 9179

📋 Permissive - Accepts any data
   ✅ Success with permissive validation
   📊 Company: SOCIETE NATIONALE SNCF
   📊 Facilities: 9179

💡 Note: All modes should work with valid data, but handle errors differently
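
Because the demo SIREN yields clean data, all three modes behave identically above. To show where they would diverge, here is a toy sketch of one plausible reading of the three descriptions, applied to a record with a missing field. The `Mode` enum, `REQUIRED_FIELDS`, and `validate` are hypothetical and do not reflect how `sirene_api_client` implements validation.

```python
from enum import Enum


class Mode(Enum):
    """Hypothetical stand-in mirroring the three modes named above."""
    STRICT = "strict"
    LENIENT = "lenient"
    PERMISSIVE = "permissive"


REQUIRED_FIELDS = ("siren", "name")  # illustrative schema, not the library's


def validate(record: dict, mode: Mode) -> dict:
    """Apply one plausible interpretation of each mode to a raw record."""
    missing = [f for f in REQUIRED_FIELDS if record.get(f) in (None, "")]
    if mode is Mode.PERMISSIVE:
        return dict(record)  # accept the data unchanged
    if missing and mode is Mode.STRICT:
        raise ValueError(f"missing required fields: {missing}")
    # LENIENT (or STRICT with nothing missing): fill gaps with None
    return {**record, **{f: None for f in missing}}


raw = {"siren": "552049447"}  # "name" is missing
for mode in Mode:
    try:
        print(mode.value, "->", validate(raw, mode))
    except ValueError as exc:
        print(mode.value, "-> rejected:", exc)
```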


## 7. Data Serialization and Export

Let's demonstrate how to serialize and export the transformed data:


In [11]:
if result:
    print("💾 DATA SERIALIZATION AND EXPORT")
    print("=" * 50)

    # Convert to dictionary
    data_dict = result.model_dump()
    print(f"✅ Converted to dictionary: {len(str(data_dict))} characters")

    # Convert to JSON
    json_data = result.model_dump_json(indent=2)
    print(f"✅ Converted to JSON: {len(json_data)} characters")

    # Show sample JSON structure
    print("\n📄 Sample JSON Structure:")
    sample_data = {
        "company": {
            "name": result.company.name,
            "identifiers": [
                {
                    "scheme": result.company.identifiers[0].scheme,
                    "value": result.company.identifiers[0].value,
                }
            ]
            if result.company.identifiers
            else [],
        },
        "facilities": [
            {"name": facility.name, "parent_siren": facility.parent_siren}
            for facility in result.facilities[:2]
        ],
        "addresses": [
            {
                "locality": address.locality,
                "coordinates": [address.longitude, address.latitude],
            }
            for address in result.addresses[:2]
            if address.longitude is not None
        ],
    }

    print(json.dumps(sample_data, indent=2, ensure_ascii=False))

    # Demonstrate data export for different use cases
    print("\n📤 Export Options:")
    print("   • JSON: For API responses and data exchange")
    print("   • Dictionary: For Python processing")
    print("   • CSV: For spreadsheet analysis (requires pandas)")
    print("   • Database: For Django model creation")

    # Show extraction metadata
    print("\n📊 Extraction Metadata:")
    for key, value in result.extraction_metadata.items():
        print(f"   • {key}: {value}")
else:
    print("❌ No data available for serialization")

💾 DATA SERIALIZATION AND EXPORT
✅ Converted to dictionary: 54099874 characters
✅ Converted to JSON: 66731480 characters

📄 Sample JSON Structure:
{
  "company": {
    "name": "SOCIETE NATIONALE SNCF",
    "identifiers": [
      {
        "scheme": "siren",
        "value": "552049447"
      }
    ]
  },
  "facilities": [
    {
      "name": "Unknown Facility",
      "parent_siren": "552049447"
    },
    {
      "name": "Unknown Facility",
      "parent_siren": "552049447"
    }
  ],
  "addresses": [
    {
      "locality": "PARIS",
      "coordinates": [
        2.3274850000000002,
        48.87679700000001
      ]
    },
    {
      "locality": "LA CHAPELLE-SUR-ERDRE",
      "coordinates": [
        -1.5516809999999974,
        47.29907699999999
      ]
    }
  ]
}

📤 Export Options:
   • JSON: For API responses and data exchange
   • Dictionary: For Python processing
   • CSV: For spreadsheet analysis (requires pandas)
   • Database: For Django model creation
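
The CSV option is listed as requiring pandas, but for flat records the standard library's `csv` module is a lighter alternative. The sketch below writes rows shaped like the address sample to CSV text. `rows_to_csv` is a hypothetical helper, and the rows are hand-copied from the sample output rather than read from `result`.

```python
import csv
import io


def rows_to_csv(rows, fieldnames):
    """Serialize a list of flat dicts to CSV text; keys not in fieldnames are ignored."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Rows shaped like the address sample printed above.
address_rows = [
    {"locality": "PARIS", "longitude": 2.327485, "latitude": 48.876797},
    {"locality": "LA CHAPELLE-SUR-ERDRE", "longitude": -1.551681, "latitude": 47.299077},
]
csv_text = rows_to_csv(address_rows, ["locality", "longitude", "latitude"])
print(csv_text)
```

Nested structures, such as a company with its list of identifiers, would need flattening first. That is where a pandas-based export becomes more convenient.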



## 8. Error Handling Demonstration

Let's demonstrate how the ETL service handles various error scenarios:


In [12]:
print("⚠️  ERROR HANDLING DEMONSTRATION")
print("=" * 50)

# Test various error scenarios
error_scenarios = [
    ("000000000", "Invalid SIREN format"),
    ("123456789", "Non-existent SIREN"),
    ("", "Empty SIREN"),
    ("abc123def", "Non-numeric SIREN"),
]

for test_siren, description in error_scenarios:
    print(f"\n🧪 Testing: {description} (SIREN: '{test_siren}')")

    try:
        error_result = await extract_and_transform_siren(test_siren, client, config)
        print(f"   ✅ Unexpected success: {error_result.company.name}")
    except Exception as e:
        print(f"   ❌ Correctly caught error: {type(e).__name__}")
        print(f"   📝 Error message: {str(e)[:100]}...")

print("\n💡 The ETL service provides comprehensive error handling for:")
print("   • Invalid SIREN formats")
print("   • Non-existent companies")
print("   • API connectivity issues")
print("   • Data validation errors")
print("   • Coordinate conversion failures")

Failed to extract company data for SIREN 000000000: No company data found for SIREN: 000000000
Failed to extract SIREN 000000000: Failed to extract company data for SIREN 000000000: No company data found for SIREN: 000000000
ETL process failed for SIREN 000000000: Failed to extract SIREN 000000000: Failed to extract company data for SIREN 000000000: No company data found for SIREN: 000000000


⚠️  ERROR HANDLING DEMONSTRATION

🧪 Testing: Invalid SIREN format (SIREN: '000000000')
   ❌ Correctly caught error: ExtractionError
   📝 Error message: Failed to extract SIREN 000000000: Failed to extract company data for SIREN 000000000: No company da...

🧪 Testing: Non-existent SIREN (SIREN: '123456789')


Failed to extract company data for SIREN 123456789: No company data found for SIREN: 123456789
Failed to extract SIREN 123456789: Failed to extract company data for SIREN 123456789: No company data found for SIREN: 123456789
ETL process failed for SIREN 123456789: Failed to extract SIREN 123456789: Failed to extract company data for SIREN 123456789: No company data found for SIREN: 123456789


   ❌ Correctly caught error: ExtractionError
   📝 Error message: Failed to extract SIREN 123456789: Failed to extract company data for SIREN 123456789: No company da...

🧪 Testing: Empty SIREN (SIREN: '')
   ❌ Correctly caught error: ValueError
   📝 Error message: Invalid SIREN format: . Must be 9 digits....

🧪 Testing: Non-numeric SIREN (SIREN: 'abc123def')
   ❌ Correctly caught error: ValueError
   📝 Error message: Invalid SIREN format: abc123def. Must be 9 digits....

💡 The ETL service provides comprehensive error handling for:
   • Invalid SIREN formats
   • Non-existent companies
   • API connectivity issues
   • Data validation errors
   • Coordinate conversion failures
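
The run above shows two failure categories: malformed input raises `ValueError`, while a well-formed but unknown SIREN raises an extraction error. When processing many SIRENs, it can help to collect failures by type instead of stopping at the first one. The sketch below illustrates that pattern with `asyncio.gather(..., return_exceptions=True)`. `fake_lookup` and `LookupFailed` are hypothetical stand-ins that mimic the two categories without calling the API.

```python
import asyncio


class LookupFailed(Exception):
    """Hypothetical error for a well-formed SIREN with no matching company."""


async def fake_lookup(siren: str) -> str:
    """Stand-in for an extraction call: checks the format, then 'looks up' the SIREN."""
    if len(siren) != 9 or not siren.isdigit():
        raise ValueError(f"Invalid SIREN format: {siren}. Must be 9 digits.")
    if siren != "552049447":
        raise LookupFailed(f"No company data found for SIREN: {siren}")
    return "SOCIETE NATIONALE SNCF"


async def run_batch(sirens):
    """Run all lookups concurrently and split successes from failures."""
    outcomes = await asyncio.gather(
        *(fake_lookup(s) for s in sirens), return_exceptions=True
    )
    ok, failed = {}, {}
    for siren, outcome in zip(sirens, outcomes):
        if isinstance(outcome, Exception):
            failed[siren] = type(outcome).__name__
        else:
            ok[siren] = outcome
    return ok, failed


ok, failed = asyncio.run(run_batch(["552049447", "123456789", "abc123def", ""]))
print(ok)
print(failed)
```

Inside a notebook cell, use `ok, failed = await run_batch([...])` instead of `asyncio.run`.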


## 9. Performance Analysis

Let's analyze the performance characteristics of the ETL service:


In [13]:
import time

print("⚡ PERFORMANCE ANALYSIS")
print("=" * 50)

if result:
    # Measure extraction time with a monotonic, high-resolution clock
    start_time = time.perf_counter()

    try:
        perf_result = await extract_and_transform_siren(DEMO_SIREN, client, config)
        end_time = time.perf_counter()

        extraction_time = end_time - start_time

        print(f"⏱️  Extraction Time: {extraction_time:.2f} seconds")
        print("📊 Data Volume:")
        print("   • Company records: 1")
        print(f"   • Facility records: {len(perf_result.facilities)}")
        print(f"   • Legal periods: {len(perf_result.legal_unit_periods)}")
        print(f"   • Establishment periods: {len(perf_result.establishment_periods)}")
        print(f"   • Address records: {len(perf_result.addresses)}")
        print(
            f"   • Activity classifications: {len(perf_result.activity_classifications)}"
        )
        print(f"   • Registry records: {len(perf_result.registry_records)}")

        # Calculate throughput.
        # Note: activity classifications and registry records are not counted in the total.
        total_records = (
            1
            + len(perf_result.facilities)
            + len(perf_result.legal_unit_periods)
            + len(perf_result.establishment_periods)
            + len(perf_result.addresses)
        )

        records_per_second = (
            total_records / extraction_time if extraction_time > 0 else 0
        )

        print("\n📈 Performance Metrics:")
        print(f"   • Total records processed: {total_records}")
        print(f"   • Records per second: {records_per_second:.1f}")
        print(
            f"   • Average time per record: {extraction_time / total_records * 1000:.1f}ms"
        )

    except Exception as e:
        print(f"❌ Performance test failed: {e}")
else:
    print("❌ No data available for performance analysis")

print("\n💡 Performance characteristics:")
print("   • Async/await for non-blocking API calls")
print("   • Efficient coordinate conversion with pyproj")
print("   • Optimized Pydantic model validation")
print("   • Configurable retry and timeout settings")

⚡ PERFORMANCE ANALYSIS
⏱️  Extraction Time: 62.89 seconds
📊 Data Volume:
   • Company records: 1
   • Facility records: 9179
   • Legal periods: 10
   • Establishment periods: 43628
   • Address records: 9179
   • Activity classifications: 16
   • Registry records: 9180

📈 Performance Metrics:
   • Total records processed: 61997
   • Records per second: 985.8
   • Average time per record: 1.0ms

💡 Performance characteristics:
   • Async/await for non-blocking API calls
   • Efficient coordinate conversion with pyproj
   • Optimized Pydantic model validation
   • Configurable retry and timeout settings
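
The throughput figures can be reproduced by hand from the printed counts. The sketch below repeats the arithmetic with a small helper, `throughput`, which is a hypothetical name and not part of the library. Like the cell above, it leaves activity classifications and registry records out of the total.

```python
def throughput(total_records: int, elapsed_seconds: float):
    """Return (records per second, milliseconds per record), guarding against zero."""
    if total_records <= 0 or elapsed_seconds <= 0:
        return 0.0, 0.0
    return total_records / elapsed_seconds, elapsed_seconds / total_records * 1000


# Counts from the run above: company + facilities + legal periods
# + establishment periods + addresses.
total = 1 + 9179 + 10 + 43628 + 9179
rate, ms_per_record = throughput(total, 62.89)
print(total, f"{rate:.1f}", f"{ms_per_record:.1f}")
```

The measured time covers both the network requests and the transformation, so these numbers describe end-to-end throughput for this run, not the speed of the transformation step alone.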
