# 3. Health Data: CNES & Infrastructure

**The Problem:** The CNES database (Brazil's national registry of health establishments) is notoriously complex. It has hundreds of columns for specific equipment and bed types (e.g., "MRI machines", "Pediatric ICU beds"), and many units lack coordinates.

**The Solution:**
1.  **Semantic Groups:** AtlasBR aggregates hundreds of columns into readable concepts (e.g., `total_leitos_internacao`).
2.  **Fallback Geocoding:** If a hospital has no lat/lon, AtlasBR places it at the centroid of its CEP (postal code).
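
The idea behind semantic groups can be sketched with plain pandas. This is a minimal illustration, not AtlasBR's actual implementation: the raw column names (`leitos_clinicos`, `leitos_cirurgicos`, `leitos_uti_pediatrica`), the `SEMANTIC_GROUPS` mapping, and the `aggregate_groups` helper are all hypothetical. Only `total_leitos_internacao` comes from this tutorial.

```python
import pandas as pd

# Hypothetical raw columns; the real CNES column names differ.
raw = pd.DataFrame(
    {
        "id_estabelecimento_cnes": ["A", "B"],
        "leitos_clinicos": [10, 0],
        "leitos_cirurgicos": [5, 0],
        "leitos_uti_pediatrica": [2, 1],
    }
)

# Hypothetical mapping: one readable concept -> the raw columns it sums.
SEMANTIC_GROUPS = {
    "total_leitos_internacao": [
        "leitos_clinicos",
        "leitos_cirurgicos",
        "leitos_uti_pediatrica",
    ],
}


def aggregate_groups(df, groups):
    """Return a copy of df with one summed column per semantic group."""
    out = df.copy()
    for name, cols in groups.items():
        # Skip raw columns that are absent from this extract.
        present = [c for c in cols if c in out.columns]
        out[name] = out[present].fillna(0).sum(axis=1)
    return out


grouped = aggregate_groups(raw, SEMANTIC_GROUPS)
print(grouped[["id_estabelecimento_cnes", "total_leitos_internacao"]])
```

Keeping the mapping in a dictionary means a new concept can be added without touching the aggregation logic.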



In [6]:
import sys
import os
from pathlib import Path

# --- DEVELOPER SETUP (Optional) ---
# If running locally without 'pip install', we add the '../src' folder to path.
current_path = Path(os.getcwd())
if current_path.name == "tutorials":
    # Go up one level to root, then into 'src' (if using src-layout) or just root (flat-layout)
    root_dir = current_path.parent
    src_dir = root_dir / "src"
    
    if src_dir.exists():
        sys.path.append(str(src_dir))
    else:
        sys.path.append(str(root_dir))

import atlasbr

In [7]:
from atlasbr.app.cnes import load_cnes

atlasbr.configure_logging()
MUNICIPALITY = "Rio de Janeiro, RJ"

## 1. Load with Geocoding
We enable `geocode=True`. This tells AtlasBR: *"If you can't find the lat/lon, look up the CEP centroid."*

In [8]:
gdf_health = load_cnes(
    places=[MUNICIPALITY],
    year=2023,
    month=9,      # CNES is monthly!
    geocode=True, # <--- The magic fix for missing coordinates
    gcp_billing=os.getenv("GCLOUD_PROJECT_ID")
)

print(f"🏥 Loaded {len(gdf_health)} health establishments.")


2025-12-16 23:12:30,132 - atlasbr - INFO -     ℹ️  Resolved 'Rio de Janeiro, RJ' -> 3304557
2025-12-16 23:12:30,154 - atlasbr - INFO -     🏥 Fetching CNES 9/2023 from Base dos Dados...

Downloading: 100%|██████████|

2025-12-16 23:12:33,883 - atlasbr - INFO -     📍 Fetching CEP coordinates from Base dos Dados...

Downloading: 100%|██████████|

2025-12-16 23:12:37,992 - atlasbr - INFO -     🌍 Geocoding 597 healthcare units via CEP...

2025-12-16 23:12:38,186 - atlasbr - INFO - ✅ Loaded 597 CNES units (Geolocated).

🏥 Loaded 597 health establishments.
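
The log above shows two fetches (the CNES units, then the CEP coordinates) followed by a geocoding step. One plausible way to implement such a fallback is a left join on the postal code that fills in only the missing coordinates. The sketch below assumes that approach. The column names `cep`, `latitude`, `longitude`, `cep_lat`, `cep_lon`, the helper `fill_from_cep`, and all coordinate values are illustrative assumptions, not AtlasBR's internals.

```python
import pandas as pd

# Toy establishments table; unit "B" has no coordinates.
units = pd.DataFrame(
    {
        "id_estabelecimento_cnes": ["A", "B"],
        "cep": ["20040002", "22041001"],
        "latitude": [-22.9068, None],
        "longitude": [-43.1729, None],
    }
)

# Hypothetical CEP centroid lookup (values are illustrative only).
cep_centroids = pd.DataFrame(
    {
        "cep": ["20040002", "22041001"],
        "cep_lat": [-22.9050, -22.9711],
        "cep_lon": [-43.1760, -43.1822],
    }
)


def fill_from_cep(units, centroids):
    """Fill missing lat/lon with the CEP centroid; keep existing coordinates."""
    merged = units.merge(centroids, on="cep", how="left")
    # Flag rows whose position comes from the postal code, not the registry.
    merged["geocoded_via_cep"] = merged["latitude"].isna() & merged["cep_lat"].notna()
    merged["latitude"] = merged["latitude"].fillna(merged["cep_lat"])
    merged["longitude"] = merged["longitude"].fillna(merged["cep_lon"])
    return merged.drop(columns=["cep_lat", "cep_lon"])


located = fill_from_cep(units, cep_centroids)
print(located)
```

A flag column like `geocoded_via_cep` is worth keeping in any such scheme, because a CEP centroid is only an approximation of the building's true position.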


## 2. Decoding Complexity
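Instead of decoding raw codes, use the `complexidade` column and the aggregated totals.
The filter below keeps units that are **High Complexity** or have more than 50 inpatient beds (Major Hospitals), so high-complexity units with zero beds also appear in the result.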


In [14]:
major_hospitals = gdf_health.loc[
    (gdf_health["complexidade"] == "alta") | 
    (gdf_health["total_leitos_internacao"] > 50)
].copy()

print(f"Found {len(major_hospitals)} major facilities.")
display(major_hospitals[[
    "id_estabelecimento_cnes", 
    "total_leitos_internacao", 
    "total_salas_cirurgicas_obstetricas",
    "quantidade_trabalhadores_saude"
]].head())


Found 175 major facilities.


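|    | id_estabelecimento_cnes | total_leitos_internacao | total_salas_cirurgicas_obstetricas | quantidade_trabalhadores_saude |
|----|-------------------------|-------------------------|------------------------------------|--------------------------------|
| 2  | 6664040                 | 0                       | 0                                  | 0                              |
| 11 | 6804209                 | 0                       | 0                                  | 0                              |
| 27 | 9448047                 | 0                       | 0                                  | 0                              |
| 29 | 5510341                 | 94                      | 3                                  | 0                              |
| 30 | 9567933                 | 75                      | 9                                  | 0                              |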
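
The zero-bed rows in this output follow from the `|` (OR) in the filter. The toy example below contrasts it with a stricter `&` (AND) variant. The category labels `"media"` and `"baixa"` are assumptions for illustration; only `"alta"` appears in this tutorial.

```python
import pandas as pd

# Toy data mirroring the two columns used in the filter.
df = pd.DataFrame(
    {
        "complexidade": ["alta", "alta", "media", "baixa"],
        "total_leitos_internacao": [0, 94, 75, 10],
    }
)

is_high = df["complexidade"] == "alta"
is_large = df["total_leitos_internacao"] > 50

either = df.loc[is_high | is_large]  # same logic as the cell above
both = df.loc[is_high & is_large]    # stricter: high complexity AND large

print(len(either), len(both))
```

Which variant is appropriate depends on whether "major" should mean clinical complexity, physical capacity, or both.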


## 3. Interactive Map: Hospital Capacity
Circle size represents the number of inpatient beds, and color encodes the same value in natural-breaks classes.
This makes the city's "Health Hubs" easy to spot.

In [10]:

major_hospitals.explore(
    column="total_leitos_internacao",
    cmap="viridis",
    scheme="naturalbreaks",
    tooltip=[
        "id_estabelecimento_cnes", 
        "complexidade", 
        "total_leitos_internacao"
    ],
    style_kwds={
        "style_function": lambda x: {
            "radius": x["properties"]["total_leitos_internacao"] / 10 + 2
        }
    },
    tiles="CartoDB DarkMatter"
)
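
The radius expression in the `style_function` is a linear scale with a floor. The helper below restates it outside the map so the values can be checked. `marker_radius` and its `scale` and `base` parameters are names introduced here for illustration.

```python
def marker_radius(beds, scale=10, base=2):
    """Radius grows linearly with beds and is never smaller than `base`."""
    return beds / scale + base


# Bed counts taken from the table earlier in this section.
radii = {beds: marker_radius(beds) for beds in (0, 75, 94)}
print(radii)
```

The constant `base` keeps units with zero beds visible on the map, while `scale` stops the largest hospitals from covering their neighbours.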