# Computing the number of non-compliant CVM samples per commune and per year

The goal of this notebook is to start from the list of communes in cog_communes and, for each commune and each year, compute the number of samples that are non-compliant for CVM (vinyl chloride monomer).

Several aggregations are needed:

- a commune (inseecommune) can have several UDIs, i.e. distribution units (cdreseau)
- a sample (referenceprel) can be attached to several UDIs (cdreseau)
- a sample (referenceprel) can contain several parameters (cdparametresiseeaux); for CVM, however, there is a single parameter according to Pauline's categorization, which makes things simpler
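
The row fan-out created by these one-to-many relations can be sketched on toy data. This is only an illustration: it uses Python's built-in `sqlite3` module as a stand-in for DuckDB, the rows and codes (`00001`, `UDI_A`, `P1`, ...) are made up, and the tables are reduced to the columns needed here.

```python
import sqlite3

# Toy rows, made up for illustration; column names follow the notebook's tables.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE edc_communes (inseecommune TEXT, cdreseau TEXT);
    CREATE TABLE edc_prelevements (referenceprel TEXT, cdreseau TEXT);
    -- one commune served by two UDIs
    INSERT INTO edc_communes VALUES ('00001', 'UDI_A'), ('00001', 'UDI_B');
    -- sample P1 is attached to both UDIs, sample P2 to UDI_B only
    INSERT INTO edc_prelevements VALUES
        ('P1', 'UDI_A'), ('P1', 'UDI_B'), ('P2', 'UDI_B');
""")

rows = con.execute("""
    SELECT c.inseecommune, c.cdreseau, p.referenceprel
    FROM edc_communes AS c
    LEFT JOIN edc_prelevements AS p ON p.cdreseau = c.cdreseau
    ORDER BY c.cdreseau, p.referenceprel
""").fetchall()

for row in rows:
    print(row)

# Counting joined rows and counting distinct samples give different answers.
n_rows = len(rows)
n_distinct_samples = len({referenceprel for _, _, referenceprel in rows})
print(n_rows, n_distinct_samples)
```

In this toy case the join yields three rows for two distinct samples, because `P1` is seen once per UDI. Whether that matters depends on whether rows or distinct samples are being counted.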

The final result is in the last cell. All preceding cells are there to aid understanding by showing intermediate results.


In [1]:
%load_ext sql
%sql duckdb:///../../database/data.duckdb
%config SqlMagic.displaylimit = 10

RuntimeError: (duckdb.duckdb.IOException) IO Error: Could not set lock on file "/Users/jgreze/git/13_pollution_eau/analytics/notebooks/../../database/data.duckdb": Conflicting lock is held in /opt/homebrew/Cellar/python@3.12/3.12.5/Frameworks/Python.framework/Versions/3.12/Resources/Python.app/Contents/MacOS/Python (PID 49878) by user jgreze. See also https://duckdb.org/docs/connect/concurrency
(Background on this error at: https://sqlalche.me/e/20/e3q8)
If you need help solving this issue, send us a message: https://ploomber.io/community


In [None]:
%sqlcmd tables

In [None]:
%sql select * from cog_communes

In [None]:
%sql select * from edc_communes

In [None]:
%sql select * from edc_prelevements

In [None]:
%%sql
    select *
    from "edc_resultats"
    where cdparametresiseeaux = 'CLVYL' and valtraduite > 0.5

-- list all non-compliant analyses for the CVM parameter

In [None]:
%%sql

-- flag each CVM result (parameter code 'CLVYL') whose value exceeds the 0.5 threshold
with "resultats_cvm" as (
    select
      *,
      (CASE WHEN valtraduite > 0.5 THEN 1 ELSE 0 END) AS is_non_conforme
    from "edc_resultats"
    where cdparametresiseeaux = 'CLVYL'
),
"prelevements_cvm" as (
    select
        "cdreseau",
        "resultats_cvm"."de_partition",
        SUM(is_non_conforme) as "nbr_resultats_non_conformes",
        count(*) as "nbr_resultats_total",
        string_agg(CASE WHEN is_non_conforme = 1 THEN "resultats_cvm"."referenceprel" ELSE null END, ',') as "list_referenceprels_non_conformes",
        string_agg(CASE WHEN is_non_conforme = 1 THEN "valtraduite"::varchar ELSE null END, ',') as "list_valtraduite_non_conformes"
    from "resultats_cvm"
    left join "edc_prelevements" on
        "edc_prelevements"."referenceprel" = "resultats_cvm"."referenceprel"
        and
        "edc_prelevements"."de_partition" = "resultats_cvm"."de_partition"
    group by "cdreseau", "resultats_cvm"."de_partition"
)
select * from "prelevements_cvm" where "nbr_resultats_non_conformes" > 0
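
The per-UDI, per-year aggregation above can be checked by hand on a few rows. The sketch below is an assumption-laden illustration, not the project pipeline: it runs the same flag-then-aggregate logic with Python's built-in `sqlite3` instead of DuckDB, on made-up rows (`P1`, `UDI_A`, the parameter code `OTHER`) and with only the columns the query needs.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE edc_resultats (
        referenceprel TEXT, de_partition TEXT,
        cdparametresiseeaux TEXT, valtraduite REAL
    );
    CREATE TABLE edc_prelevements (
        referenceprel TEXT, de_partition TEXT, cdreseau TEXT
    );
    INSERT INTO edc_resultats VALUES
        ('P1', '2023', 'CLVYL', 0.7),   -- above 0.5: non-compliant
        ('P2', '2023', 'CLVYL', 0.2),   -- compliant
        ('P3', '2023', 'OTHER', 9.9);   -- hypothetical other parameter, ignored
    INSERT INTO edc_prelevements VALUES
        ('P1', '2023', 'UDI_A'), ('P2', '2023', 'UDI_A'), ('P3', '2023', 'UDI_A');
""")

rows = con.execute("""
    WITH resultats_cvm AS (
        SELECT *,
               CASE WHEN valtraduite > 0.5 THEN 1 ELSE 0 END AS is_non_conforme
        FROM edc_resultats
        WHERE cdparametresiseeaux = 'CLVYL'
    )
    SELECT p.cdreseau,
           r.de_partition,
           SUM(r.is_non_conforme) AS nbr_resultats_non_conformes,
           COUNT(*) AS nbr_resultats_total
    FROM resultats_cvm AS r
    LEFT JOIN edc_prelevements AS p
        ON p.referenceprel = r.referenceprel
        AND p.de_partition = r.de_partition
    GROUP BY p.cdreseau, r.de_partition
""").fetchall()

print(rows)
```

With these rows, `UDI_A` gets one non-compliant result out of two CVM results for 2023; the `OTHER` row is filtered out before aggregation.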

In [None]:
%%sql

with "resultats_cvm" as (
    select
      *,
      (CASE WHEN valtraduite > 0.5 THEN 1 ELSE 0 END) AS is_non_conforme
    from "edc_resultats"
    where cdparametresiseeaux = 'CLVYL'
),
"prelevements_cvm" as (
    select
        "cdreseau",
        "resultats_cvm"."de_partition",
        SUM(is_non_conforme) as "nbr_resultats_non_conformes",
        count(*) as "nbr_resultats_total"
    from "resultats_cvm"
    left join "edc_prelevements" on
        "edc_prelevements"."referenceprel" = "resultats_cvm"."referenceprel"
        and
        "edc_prelevements"."de_partition" = "resultats_cvm"."de_partition"
    group by "cdreseau", "resultats_cvm"."de_partition"
),
"communes_cvm" as (
    select
        "inseecommune",
        "edc_communes"."de_partition",
        coalesce(sum("nbr_resultats_non_conformes"), 0) as "nbr_resultats_non_conformes",
        coalesce(sum("nbr_resultats_total"), 0) as "nbr_resultats_total",
        case
            when sum("nbr_resultats_non_conformes") > 0 then 'non conforme'
            when sum("nbr_resultats_total") > 0 then 'conforme'
            else 'non analysé'
        end as "resultat"
    from "edc_communes"
    left join "prelevements_cvm" on
        "prelevements_cvm"."cdreseau" = "edc_communes"."cdreseau"
        and
        "prelevements_cvm"."de_partition" = "edc_communes"."de_partition"
    group by "inseecommune", "edc_communes"."de_partition"
)
select * from communes_cvm where "nbr_resultats_non_conformes" > 0
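
The order of the branches in the `case` expression of `communes_cvm` matters, and so does the behaviour of `sum` when the left join finds no matching row (the sum is then NULL). The hypothetical helper `classify` below restates that logic in plain Python, with `None` standing in for SQL NULL; it is a reading aid, not code used by the notebook.

```python
from typing import Optional


def classify(non_conformes: Optional[int], total: Optional[int]) -> str:
    """Mirror the CASE expression of communes_cvm; None stands in for SQL NULL."""
    # In SQL, `NULL > 0` is not true, so a NULL sum falls through to the next branch.
    if non_conformes is not None and non_conformes > 0:
        return "non conforme"
    if total is not None and total > 0:
        return "conforme"
    # No CVM result at all for this commune and year.
    return "non analysé"


print(classify(2, 5))        # at least one non-compliant result
print(classify(0, 5))        # results exist, none non-compliant
print(classify(None, None))  # left join found nothing
```

The `coalesce(..., 0)` calls in the query only change how the counts are displayed; the label is decided on the raw sums.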


In [None]:
%%sql

with "resultats_cvm" as (
    select
      *,
      (CASE WHEN valtraduite > 0.5 THEN 1 ELSE 0 END) AS is_non_conforme
    from "edc_resultats"
    where cdparametresiseeaux = 'CLVYL'
),
"prelevements_cvm" as (
    select
        "cdreseau",
        "resultats_cvm"."de_partition",
        SUM(is_non_conforme) as "nbr_resultats_non_conformes",
        count(*) as "nbr_resultats_total"
    from "resultats_cvm"
    left join "edc_prelevements" on
        "edc_prelevements"."referenceprel" = "resultats_cvm"."referenceprel"
        and
        "edc_prelevements"."de_partition" = "resultats_cvm"."de_partition"
    group by "cdreseau", "resultats_cvm"."de_partition"
),
"communes_cvm" as (
    select
        "inseecommune",
        "edc_communes"."de_partition",
        coalesce(sum("nbr_resultats_non_conformes"), 0) as "nbr_resultats_non_conformes",
        coalesce(sum("nbr_resultats_total"), 0) as "nbr_resultats_total",
        case
            when sum("nbr_resultats_non_conformes") > 0 then 'non conforme'
            when sum("nbr_resultats_total") > 0 then 'conforme'
            else 'non analysé'
        end as "resultat"
    from "edc_communes"
    left join "prelevements_cvm" on
        "prelevements_cvm"."cdreseau" = "edc_communes"."cdreseau"
        and
        "prelevements_cvm"."de_partition" = "edc_communes"."de_partition"
    group by "inseecommune", "edc_communes"."de_partition"
),
"annees" as (SELECT unnest(generate_series(2020, 2024)) as "annee")
select
    "cog"."COM" as "commune_code_insee",
    "cog"."LIBELLE" as "commune_nom",
    a."annee",
    coalesce("resultat", 'non analysé') as "resultat_cvm"
from "cog_communes" as "cog"
cross join 
    "annees" a
left join "communes_cvm" on
   "cog"."COM" = "communes_cvm"."inseecommune"
   and
   a."annee"::string =  "communes_cvm"."de_partition"

-- to test a commune with a non-compliant sample, add:
-- where "commune_code_insee" = '07194'
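
The last query relies on a cross join between communes and years so that every (commune, year) pair appears, even when no CVM analysis exists for it. A minimal sketch of that pattern follows, under stated assumptions: Python's built-in `sqlite3` replaces DuckDB, the year list is written as a `VALUES` clause instead of `generate_series`, and the communes and the `communes_cvm` row are made up.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE cog_communes (COM TEXT, LIBELLE TEXT);
    CREATE TABLE communes_cvm (inseecommune TEXT, de_partition TEXT, resultat TEXT);
    INSERT INTO cog_communes VALUES ('00001', 'Commune A'), ('00002', 'Commune B');
    -- only one (commune, year) pair has a result
    INSERT INTO communes_cvm VALUES ('00001', '2021', 'non conforme');
""")

rows = con.execute("""
    WITH annees(annee) AS (VALUES (2020), (2021))
    SELECT cog.COM,
           a.annee,
           COALESCE(cvm.resultat, 'non analysé') AS resultat_cvm
    FROM cog_communes AS cog
    CROSS JOIN annees AS a
    LEFT JOIN communes_cvm AS cvm
        ON cvm.inseecommune = cog.COM
        -- de_partition is stored as text, so the year is cast before comparing
        AND cvm.de_partition = CAST(a.annee AS TEXT)
    ORDER BY cog.COM, a.annee
""").fetchall()

for row in rows:
    print(row)
```

Two communes and two years give four rows; the three pairs without a match are labelled 'non analysé' by the `COALESCE`.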