# Population density per sq. km by region and year

## Data source
[Open interactive table](https://www.statistikdatabasen.scb.se/pxweb/en/ssd/START__BE__BE0101__BE0101C/BefArealTathetKon/table/tableViewLayout1/)

## What this notebook does
- Fetches population density (per sq. km) for all Swedish counties, for the latest available year, from Statistics Sweden (SCB) via the PXWeb API.
- Converts the response into a tidy `pandas` DataFrame.
- Saves the result to `data/Pop_density.csv`.


## Setup

This section imports dependencies and defines where outputs are written.


In [9]:
# Imports
from pathlib import Path  # file system paths

import pandas as pd  # data wrangling
from pyscbwrapper import SCB  # SCB PXWeb API wrapper

In [10]:
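# Paths
# Project root: two levels above the current working directory
# (this only works if the notebook is started from a folder two levels below the root)
ROOT = Path.cwd().resolve().parents[1]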

# Output directory for exported files
data_dir = ROOT / "data"
data_dir.mkdir(parents=True, exist_ok=True)
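
To make the `parents[1]` lookup concrete, here is a minimal sketch that uses a pure path, so nothing on disk is touched. The working directory `/home/user/project/notebooks/scb` is hypothetical; the real one depends on where the notebook is launched.

```python
from pathlib import PurePosixPath

# Hypothetical working directory, used only for illustration
cwd = PurePosixPath("/home/user/project/notebooks/scb")

# parents[0] is the containing folder, parents[1] is one level further up
root = cwd.parents[1]
data_dir = root / "data"

print(root)      # project root derived from the working directory
print(data_dir)  # where exported files would be written
```

If the notebook is started from a different depth, `parents[1]` points somewhere else, so it is worth printing `ROOT` once before writing files.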

## SCB table and variables

### Table ID
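The `TABLE` tuple holds the language code (`"en"`) followed by the path segments that identify the PXWeb table to query. The segments match the table path in the data source URL above.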

### Variables
`scb.info()` returns the table metadata (title and variables), and `scb.get_variables()` returns the available values for each variable: region, sex, observations, and year.
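
As a rough illustration of how that metadata can be inspected, the sketch below works on a hand-written dictionary in the shape printed by `scb.info()`, with the value list shortened to a few codes. The helper names `summarize` and `classify_region` are hypothetical. The classification assumes the usual convention that `"00"` is the national total, two-digit codes are counties, and four-digit codes are municipalities.

```python
# Hand-written metadata in the shape printed by `scb.info()`;
# the value list is shortened to a few codes from the real output.
info = {
    "title": "Population density per sq. km by region, sex, observations and year",
    "variables": [
        {"code": "Region", "text": "region",
         "values": ["00", "01", "0114", "03", "0305"]},
    ],
}


def summarize(metadata):
    """Return (code, text, number of values) for each variable."""
    return [(v["code"], v["text"], len(v["values"])) for v in metadata["variables"]]


def classify_region(code):
    """Classify a region code by its length (assumed convention)."""
    if code == "00":
        return "country"
    return "county" if len(code) == 2 else "municipality"


summary = summarize(info)
levels = {code: classify_region(code) for code in info["variables"][0]["values"]}

print(summary)
print(levels)
```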


In [11]:
# SCB PXWeb table identifier
TABLE = ("en", "BE", "BE0101", "BE0101C", "BefArealTathetKon")

In [12]:
# Initialize a client for this table
scb = SCB(*TABLE)

In [13]:
# Inspect table metadata (dimensions, available values)
scb.info()

{'title': 'Population density per sq. km by region, sex, observations and year',
 'variables': [{'code': 'Region',
   'text': 'region',
   'values': ['00',
    '01',
    '0114',
    '0115',
    '0117',
    '0120',
    '0123',
    '0125',
    '0126',
    '0127',
    '0128',
    '0136',
    '0138',
    '0139',
    '0140',
    '0160',
    '0162',
    '0163',
    '0180',
    '0181',
    '0182',
    '0183',
    '0184',
    '0186',
    '0187',
    '0188',
    '0191',
    '0192',
    '03',
    '0305',
    '0319',
    '0330',
    '0331',
    '0360',
    '0380',
    '0381',
    '0382',
    '04',
    '0428',
    '0461',
    '0480',
    '0481',
    '0482',
    '0483',
    '0484',
    '0486',
    '0488',
    '05',
    '0509',
    '0512',
    '0513',
    '0560',
    '0561',
    '0562',
    '0563',
    '0580',
    '0581',
    '0582',
    '0583',
    '0584',
    '0586',
    '06',
    '0604',
    '0617',
    '0642',
    '0643',
    '0662',
    '0665',
    '0680',
    '0682',
    '0683',
    '0684',
  

In [14]:
# Fetch variable values as a dict (used to build filters)
var_ = scb.get_variables()

## Build the query

### Filters
- Regions: counties only
- Year: latest available
- Observation: Population density per sq. km
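
The first two filters can be tried offline with a small stand-in for the dictionary returned by `scb.get_variables()`. The labels and years in `sample_vars` are illustrative, not fetched from SCB.

```python
# Illustrative stand-in for `scb.get_variables()`:
# a dict mapping each variable name to its list of labels.
sample_vars = {
    "region": ["Sweden", "Stockholm county", "Upplands Väsby", "Uppsala county"],
    "year": ["1991", "2000", "2023", "2024"],
}

# Keep labels that contain the word "county" (case-insensitive)
counties = [r for r in sample_vars["region"] if "county" in r.lower()]

# Years arrive as strings, so compare them as integers
latest_year = max(sample_vars["year"], key=int)

print(counties)
print(latest_year)
```

The name-based filter depends on the English labels, so it would need a different keyword if the table were queried in another language.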


In [15]:
# Filters
# Keep only county-level regions (exclude total, municipalities, etc.)
counties = [r for r in var_["region"] if "county" in r.lower()]

# Choose the latest available year (SCB returns years as strings)
year_obs = max(var_["year"], key=int)


In [16]:
# Build query (match exact variable names from `scb.info()` / `scb.get_variables()`)

scb.set_query(
    region=counties,
    observations=["Population density per sq. km"],
    year=year_obs,
)

In [17]:
# Execute query and extract the observations list
scb_data = scb.get_data()
scb_fetch = scb_data["data"]

In [18]:
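# Map region codes (e.g., "01") to readable county names.
# Assumes the first query entry is the region selection and that its codes
# are in the same order as `counties`.
codes = scb.get_query()["query"][0]["selection"]["values"]

counties_dict = dict(zip(codes, counties))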
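
The mapping step relies on the position of the region selection inside the query. A slightly more defensive variant, sketched here on a hand-written query in the PXWeb JSON shape, looks the entry up by its variable code instead. The `"OtherVariable"` entry and its value are hypothetical placeholders; the region codes and names are taken from the outputs shown in this notebook.

```python
# Hand-written query in the PXWeb JSON shape (not fetched from SCB).
# "OtherVariable" is a hypothetical placeholder placed first on purpose.
query = {
    "query": [
        {"code": "OtherVariable", "selection": {"filter": "item", "values": ["x"]}},
        {"code": "Region", "selection": {"filter": "item", "values": ["01", "03"]}},
    ],
    "response": {"format": "json"},
}
county_names = ["Stockholm county", "Uppsala county"]

# Find the region selection by code rather than by list position
region_entry = next(item for item in query["query"] if item["code"] == "Region")
codes = region_entry["selection"]["values"]

# zip() silently truncates, so check the lengths before pairing
if len(codes) != len(county_names):
    raise ValueError("number of region codes and county names differ")

counties_dict = dict(zip(codes, county_names))
print(counties_dict)
```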

## Transform response
Convert the PXWeb response into a tidy table with one row per county (for the selected year).

### Output columns
- `code`: region code
- `county`: county name
- `year`: observation year
- `PopDen`: population density per sq. km


In [19]:
# Convert the SCB response to a tidy DataFrame (one row per county)

records = []

for r in scb_fetch:
    # r["key"] contains [region_code, year, ...] depending on the table
    code, year = r["key"][:2]
    name = counties_dict.get(code, code)
    value = r["values"][0]
    records.append({
        "code": code,
        "county": name,
        "year": year,
        "PopDen": value
    })

df = pd.DataFrame(records)

# Optional: convert PopDen to numeric for analysis/plotting
# df["PopDen"] = pd.to_numeric(df["PopDen"], errors="coerce")
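
The same transformation, including the optional numeric conversion, can be checked on a small hand-written response. This is a sketch, not the notebook's actual output: `sample_response` mimics the `key`/`values` layout used above, and the two values are copied from the preview table in this notebook.

```python
import pandas as pd

# Hand-written response in the same key/values layout as `scb_fetch`
sample_response = {
    "data": [
        {"key": ["01", "2024"], "values": ["377.7"]},
        {"key": ["03", "2024"], "values": ["49.6"]},
    ]
}
names = {"01": "Stockholm county", "03": "Uppsala county"}

records = []
for row in sample_response["data"]:
    code, year = row["key"][:2]
    records.append({
        "code": code,                    # kept as a string to preserve "01"
        "county": names.get(code, code), # fall back to the code if unmapped
        "year": year,
        "PopDen": row["values"][0],
    })

tidy = pd.DataFrame(records)

# Values arrive as strings; non-numeric entries become NaN with errors="coerce"
tidy["PopDen"] = pd.to_numeric(tidy["PopDen"], errors="coerce")
tidy["year"] = tidy["year"].astype(int)

print(tidy)
```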


## Preview and export
Preview a few rows, then write the final CSV.


In [20]:
# Preview the final table
df.head(10)

| | code | county | year | PopDen |
|---|---|---|---|---|
| 0 | 1 | Stockholm county | 2024 | 377.7 |
| 1 | 3 | Uppsala county | 2024 | 49.6 |
| 2 | 4 | Södermanland county | 2024 | 49.5 |
| 3 | 5 | Östergötland county | 2024 | 44.6 |
| 4 | 6 | Jönköping county | 2024 | 35.3 |
| 5 | 7 | Kronoberg county | 2024 | 24.0 |
| 6 | 8 | Kalmar county | 2024 | 22.0 |
| 7 | 9 | Gotland county | 2024 | 19.4 |
| 8 | 10 | Blekinge county | 2024 | 53.4 |
| 9 | 12 | Skåne county | 2024 | 129.8 |


In [21]:
# Export to CSV
df.to_csv(data_dir / "Pop_density.csv", index=False)

## Output
The resulting file is saved as `data/Pop_density.csv`.
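
When the CSV is read back with `pandas`, region codes such as `"01"` are parsed as integers by default and lose their leading zero. The sketch below shows the effect and one way around it, using an in-memory CSV with two rows taken from the preview rather than the real file.

```python
import io

import pandas as pd

# In-memory stand-in for data/Pop_density.csv (two rows only)
csv_text = (
    "code,county,year,PopDen\n"
    "01,Stockholm county,2024,377.7\n"
    "03,Uppsala county,2024,49.6\n"
)

# Default parsing infers integers, so "01" becomes 1
default_read = pd.read_csv(io.StringIO(csv_text))

# Forcing the column to str keeps the codes exactly as written
typed_read = pd.read_csv(io.StringIO(csv_text), dtype={"code": str})

print(default_read["code"].tolist())
print(typed_read["code"].tolist())
```

Keeping `code` as a string matters if the file is later joined with other SCB tables on the region code.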
