# 📒 Notebook: 07_checks_tables

---

## 📝 Objective

- List the tables and the columns of each table, and test the semantic joins.
- Run quick statistics and sanity checks.
- Allow direct SQL queries on the dimensions and the fact table.
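
One way to make the semantic join test concrete is an orphan-key count: the number of fact rows whose key has no match in the dimension. The sketch below only builds the SQL text, so it runs without a Spark session. The helper name `orphan_count_query` and the alias `orphan_rows` are hypothetical. The key columns `hopital_nom` and `hopital_id` are taken from the join written later in this notebook and have not been verified against the actual schemas.

```python
import re

# Hypothetical guard: accept only plain identifiers so names can be interpolated safely.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def orphan_count_query(fact, dim, fact_key, dim_key, database="Health_lakehouse"):
    """Build a Spark SQL query counting fact rows whose non-null key
    has no matching row in the dimension table."""
    for name in (fact, dim, fact_key, dim_key, database):
        if not _IDENT.match(name):
            raise ValueError(f"unsafe SQL identifier: {name!r}")
    return (
        f"SELECT COUNT(*) AS orphan_rows "
        f"FROM {database}.{fact} f "
        f"LEFT ANTI JOIN {database}.{dim} d ON f.{fact_key} = d.{dim_key} "
        f"WHERE f.{fact_key} IS NOT NULL"
    )


# Column names below are assumptions copied from the notebook's own join.
query = orphan_count_query(
    "gold_fact_consultation", "gold_dim_hopital", "hopital_nom", "hopital_id"
)
print(query)
# In the notebook session, this could then be run as: spark.sql(query).show()
```

A count of zero would suggest every populated key resolves to a dimension row. A count close to the fact table's row count would suggest the two columns do not hold the same kind of value.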

---

## 👤 Author(s) / Contact

- SEKARI Inès - [ines.sekari@efrei.net]
- NKUIDA Malaïka - [malaika.nkuida@efrei.net]

---

## 🗓️ Versioning & Updates

| Version | Date       | Changes                               |
|---------|------------|---------------------------------------|
| 1.0     | 2025-05-05 | Created the table verification script |


In [None]:
df = spark.sql("SELECT * FROM Health_lakehouse.gold_dim_hopital LIMIT 1000")
display(df)


In [None]:
df = spark.sql("SELECT * FROM Health_lakehouse.gold_dim_maladie LIMIT 1000")
display(df)


In [None]:
df = spark.sql("SELECT * FROM Health_lakehouse.gold_dim_medicament LIMIT 1000")
display(df)


In [None]:
df = spark.sql("SELECT * FROM Health_lakehouse.gold_dim_motif_admission LIMIT 1000")
display(df)


In [None]:
df = spark.sql("SELECT * FROM Health_lakehouse.gold_dim_patient LIMIT 1000")
display(df)


In [None]:
df = spark.sql("SELECT * FROM Health_lakehouse.gold_dim_temps LIMIT 1000")
display(df)


In [None]:
df = spark.sql("SELECT * FROM Health_lakehouse.gold_fact_consultation LIMIT 1000")
display(df)

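
The seven preview cells above share one pattern, so they could be generated from a list of table names. The following is a minimal sketch, not the authors' code: the helper `preview_query` and the `TABLES` constant are hypothetical names, and the block only builds the SQL strings so that it runs without a Spark session.

```python
import re

# Table names as they appear in the preview cells above.
TABLES = [
    "gold_dim_hopital",
    "gold_dim_maladie",
    "gold_dim_medicament",
    "gold_dim_motif_admission",
    "gold_dim_patient",
    "gold_dim_temps",
    "gold_fact_consultation",
]

# Hypothetical guard: accept only plain identifiers before interpolating them.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def preview_query(table, limit=1000, database="Health_lakehouse"):
    """Return the same SELECT ... LIMIT statement used in the cells above."""
    if not _IDENT.match(table) or not _IDENT.match(database):
        raise ValueError(f"unsafe SQL identifier: {database!r}.{table!r}")
    if isinstance(limit, bool) or not isinstance(limit, int) or limit <= 0:
        raise ValueError("limit must be a positive integer")
    return f"SELECT * FROM {database}.{table} LIMIT {limit}"


queries = {table: preview_query(table) for table in TABLES}
for table, query in queries.items():
    print(query)
    # In the notebook session, each preview could be rendered with:
    # display(spark.sql(query))
```

Keeping the table list in one place also means a renamed or added table only has to be changed once.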

In [9]:
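# List every table registered in the lakehouse.
spark.sql("SHOW TABLES IN health_lakehouse").show(truncate=False)

# Gold-layer tables to check: six dimensions and the fact table.
tables = [
    "gold_dim_patient",
    "gold_dim_maladie",
    "gold_dim_hopital",
    "gold_dim_temps",
    "gold_dim_medicament",
    "gold_dim_motif_admission",
    "gold_fact_consultation",
]

# Print the schema (column names and types) of each table.
for t in tables:
    print(f"\n===== Schema of {t} =====")
    spark.sql(f"DESCRIBE TABLE Health_lakehouse.{t}").show(truncate=False)

# Join test between the fact table and the hospital dimension.
# NOTE: the condition compares f.hopital_nom with h.hopital_id as originally written;
# check the DESCRIBE output above to confirm both columns hold the same key.
spark.sql("""
SELECT f.consultation_id, f.patient_id, h.hopital_name
FROM Health_lakehouse.gold_fact_consultation f
LEFT JOIN Health_lakehouse.gold_dim_hopital h ON f.hopital_nom = h.hopital_id
LIMIT 10
""").show()

# Row count for each table, followed by a small sample.
for t in tables:
    print(f"\nTable: {t}")
    spark.sql(f"SELECT COUNT(*) FROM health_lakehouse.{t}").show()
    # Optional: show the first 5 rows
    spark.sql(f"SELECT * FROM health_lakehouse.{t} LIMIT 5").show()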



+----------------+------------------------+-----------+
|namespace       |tableName               |isTemporary|
+----------------+------------------------+-----------+
|health_lakehouse|bronze_maladie          |false      |
|health_lakehouse|bronze_icd_code         |false      |
|health_lakehouse|bronze_medicament       |false      |
|health_lakehouse|bronze_motif_admission  |false      |
|health_lakehouse|bronze_patient          |false      |
|health_lakehouse|bronze_meds_code        |false      |
|health_lakehouse|bronze_motifs_code      |false      |
|health_lakehouse|silver_icd_code         |false      |
|health_lakehouse|silver_meds_code        |false      |
|health_lakehouse|silver_motifs_code      |false      |
|health_lakehouse|silver_hopital          |false      |
|health_lakehouse|gold_dim_patient        |false      |
|health_lakehouse|gold_dim_maladie        |false      |
|health_lakehouse|gold_dim_medicament     |false      |
|health_lakehouse|gold_dim_hopital        |false