# Capitaux 02: AZEC Capitaux Processor Testing

**Purpose**: Test AZEC capital extraction from CAPITXCU and INCENDCU data

**Tests**:
1. Read CAPITXCU and INCENDCU bronze data
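2. Extract the CAPITXCU capital columns (capx_100, capx_cua) together with the capital type indicator (smp_sre)
3. Aggregate PE (business interruption loss, "perte d'exploitation") and RD (direct risk, "risque direct") from INCENDCU (mt_baspe, mt_basdi)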

---

In [1]:
import sys
from pathlib import Path

project_root = Path.cwd().parent.parent
sys.path.insert(0, str(project_root))
print(f"Project root: {project_root}")

Project root: /workspace/new_python


In [2]:
from pyspark.sql import SparkSession
# from azfr_fsspec_utils import fspath
# import azfr_fsspec_abfs

# azfr_fsspec_abfs.use()

spark = SparkSession.builder \
    .appName("Capitaux_AZEC_Testing") \
    .getOrCreate()

print(f"✓ Spark {spark.version}")

Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
25/12/17 20:15:38 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable


✓ Spark 3.4.4


## 1. Load Configuration

In [3]:
from utils.loaders.config_loader import ConfigLoader
from src.reader import BronzeReader

config = ConfigLoader(str(project_root / "config" / "config.yml"))
bronze_reader = BronzeReader(
    spark, config,
    str(project_root / "config" / "reading_config.json")
)

VISION = "202509"
print(f"Vision: {VISION}")

Vision: 202509


## 2. Read CAPITXCU Data

In [4]:
try:
    df_capitxcu = bronze_reader.read_file_group('capitxcu_azec', VISION)
    print(f"✓ CAPITXCU: {df_capitxcu.count():,} rows")
    print(f"  Columns: {df_capitxcu.columns}")
    df_capitxcu.show(3)
except Exception as e:
    print(f"⚠ CAPITXCU not available: {e}")
    df_capitxcu = None

✓ CAPITXCU: 1,600 rows
  Columns: ['police', 'produit', 'smp_sre', 'brch_rea', 'capx_100', 'capx_cua', '_source_file']
+---------+-------+-------+--------+---------+---------+--------------------+
|   police|produit|smp_sre|brch_rea| capx_100| capx_cua|        _source_file|
+---------+-------+-------+--------+---------+---------+--------------------+
|AZ0000465|    A00|    LCI|     ID0|337013.87| 59449.65|file:///workspace...|
|AZ0000136|    A00|    SMP|     ID0|458517.31| 90660.08|file:///workspace...|
|AZ0000253|    A00|    SMP|     ID0|257900.83|289002.02|file:///workspace...|
+---------+-------+-------+--------+---------+---------+--------------------+
only showing top 3 rows
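
Test 2 in the list above (extracting the capital columns) has no dedicated cell. A minimal plain-Python sketch of one possible check follows, assuming that extraction means summing `capx_100` and `capx_cua` per (`police`, `produit`) and splitting the totals by the `smp_sre` value. That rule, the helper name `split_capitals_by_type`, and output keys such as `lci_100` are assumptions made for illustration, not the processor's confirmed behavior. The sample rows are the three shown in the output above.

```python
from collections import defaultdict


def split_capitals_by_type(rows):
    """Sum capx_100 / capx_cua per (police, produit), split by smp_sre.

    Hypothetical helper: the output key names (e.g. "lci_100") are
    illustrative and not taken from the AZEC processor.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for row in rows:
        key = (row["police"], row["produit"])
        kind = row["smp_sre"].lower()  # "lci" or "smp" in the sample data
        totals[key][f"{kind}_100"] += row["capx_100"]
        totals[key][f"{kind}_cua"] += row["capx_cua"]
    # Convert nested defaultdicts to plain dicts for easy comparison
    return {key: dict(values) for key, values in totals.items()}


# The three rows displayed by df_capitxcu.show(3)
sample = [
    {"police": "AZ0000465", "produit": "A00", "smp_sre": "LCI",
     "capx_100": 337013.87, "capx_cua": 59449.65},
    {"police": "AZ0000136", "produit": "A00", "smp_sre": "SMP",
     "capx_100": 458517.31, "capx_cua": 90660.08},
    {"police": "AZ0000253", "produit": "A00", "smp_sre": "SMP",
     "capx_100": 257900.83, "capx_cua": 289002.02},
]

result = split_capitals_by_type(sample)
for key, values in result.items():
    print(key, values)
```

In the notebook, the input could be built from a small collected sample, for example `[r.asDict() for r in df_capitxcu.limit(100).collect()]`, and the result compared with whatever the AZEC processor produces for the same policies.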



## 3. Read INCENDCU Data

In [5]:
try:
    df_incendcu = bronze_reader.read_file_group('incendcu_azec', VISION)
    print(f"✓ INCENDCU: {df_incendcu.count():,} rows")
    print(f"  Columns: {df_incendcu.columns}")
    df_incendcu.show(3)
except Exception as e:
    print(f"⚠ INCENDCU not available: {e}")
    df_incendcu = None

✓ INCENDCU: 600 rows
  Columns: ['police', 'produit', 'cod_naf', 'cod_tre', 'mt_baspe', 'mt_basdi', '_source_file']
+---------+-------+-------+-------+--------+--------+--------------------+
|   police|produit|cod_naf|cod_tre|mt_baspe|mt_basdi|        _source_file|
+---------+-------+-------+-------+--------+--------+--------------------+
|AZ0000784|    A00|  4120A|    T01| 10000.0| 20000.0|file:///workspace...|
|AZ0000443|    A00|  4120A|    T01| 10000.0| 20000.0|file:///workspace...|
|AZ0000482|    A00|  4120A|    T01| 10000.0| 20000.0|file:///workspace...|
+---------+-------+-------+-------+--------+--------+--------------------+
only showing top 3 rows
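
Both reads print their column lists, but nothing asserts on them. The following is a small sketch of a schema check, assuming the columns used elsewhere in this notebook are the required ones. `EXPECTED_COLUMNS` and `missing_columns` are hypothetical names, and the column lists are copied from the outputs above.

```python
# Columns this notebook relies on, per file group (assumed to be the required set)
EXPECTED_COLUMNS = {
    "capitxcu_azec": {"police", "produit", "smp_sre", "capx_100", "capx_cua"},
    "incendcu_azec": {"police", "produit", "mt_baspe", "mt_basdi"},
}


def missing_columns(file_group, actual_columns):
    """Return the expected columns absent from actual_columns, sorted."""
    return sorted(EXPECTED_COLUMNS[file_group] - set(actual_columns))


# Column lists as printed by the two read cells
capitxcu_cols = ["police", "produit", "smp_sre", "brch_rea",
                 "capx_100", "capx_cua", "_source_file"]
incendcu_cols = ["police", "produit", "cod_naf", "cod_tre",
                 "mt_baspe", "mt_basdi", "_source_file"]

assert missing_columns("capitxcu_azec", capitxcu_cols) == []
assert missing_columns("incendcu_azec", incendcu_cols) == []
print("Schema check passed")
```

Against the live DataFrames, the same check would take `df_capitxcu.columns` and `df_incendcu.columns` as input.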



## 4. Test Capital Aggregation (INCENDCU)

In [6]:
from pyspark.sql.functions import sum as spark_sum

if df_incendcu is not None:
    # Aggregate PE and RD per (police, produit) pair
    df_pe_rd = df_incendcu.groupBy('police', 'produit').agg(
        spark_sum('mt_baspe').alias('perte_exp_azec'),
        spark_sum('mt_basdi').alias('risque_direct_azec')
    )
    
    print(f"✓ Aggregated PE/RD by policy: {df_pe_rd.count():,} policies")
    df_pe_rd.show(5)
else:
    print("⚠ Skipping aggregation - no INCENDCU data")

✓ Aggregated PE/RD by policy: 440 policies
+---------+-------+--------------+------------------+
|   police|produit|perte_exp_azec|risque_direct_azec|
+---------+-------+--------------+------------------+
|AZ0000527|    A00|       20000.0|           40000.0|
|AZ0000772|    A00|       10000.0|           20000.0|
|AZ0000761|    A00|       20000.0|           40000.0|
|AZ0000028|    A00|       10000.0|           20000.0|
|AZ0000220|    A00|       10000.0|           20000.0|
+---------+-------+--------------+------------------+
only showing top 5 rows
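
The 600 INCENDCU rows collapse to 440 (`police`, `produit`) groups, so some policies carry more than one row. AZ0000527, for instance, totals 20000.0 / 40000.0, which is consistent with two rows of 10000.0 / 20000.0. One way to cross-check the Spark aggregation on a small collected sample is a plain-Python reference. The helper name `aggregate_pe_rd` is hypothetical, and the two AZ0000527 rows below are illustrative values chosen to reproduce that total, not rows read from the data.

```python
from collections import defaultdict


def aggregate_pe_rd(rows):
    """Sum mt_baspe and mt_basdi per (police, produit)."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for row in rows:
        key = (row["police"], row["produit"])
        totals[key][0] += row["mt_baspe"]
        totals[key][1] += row["mt_basdi"]
    # Use the same output names as the Spark aggregation in the cell above
    return {
        key: {"perte_exp_azec": pe, "risque_direct_azec": rd}
        for key, (pe, rd) in totals.items()
    }


# Illustrative rows (assumed): two for AZ0000527, one for AZ0000772
sample = [
    {"police": "AZ0000527", "produit": "A00", "mt_baspe": 10000.0, "mt_basdi": 20000.0},
    {"police": "AZ0000527", "produit": "A00", "mt_baspe": 10000.0, "mt_basdi": 20000.0},
    {"police": "AZ0000772", "produit": "A00", "mt_baspe": 10000.0, "mt_basdi": 20000.0},
]

reference = aggregate_pe_rd(sample)
for key, values in reference.items():
    print(key, values)
```

The reference result can then be compared with `{(r["police"], r["produit"]): r.asDict() for r in df_pe_rd.collect()}` restricted to the same policies.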



## Summary

In [7]:
print("="*60)
print("AZEC CAPITAUX TESTING COMPLETE")
print("="*60)
print(f"CAPITXCU: {'Available' if df_capitxcu is not None else 'Not available'}")
print(f"INCENDCU: {'Available' if df_incendcu is not None else 'Not available'}")
print("\n→ Next: Notebook 03 - Consolidation")

AZEC CAPITAUX TESTING COMPLETE
CAPITXCU: Available
INCENDCU: Available

→ Next: Notebook 03 - Consolidation
