| [![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](../LICENSE) | [![Python](https://img.shields.io/badge/Python-3.10+-black.svg)](https://www.python.org/) | [![Jupyter](https://img.shields.io/badge/Jupyter-Notebook-red.svg)](https://jupyter.org/) | [![SQLite](https://img.shields.io/badge/Database-SQLite-darkblue.svg)](https://www.sqlite.org/index.html) | [![Pandas](https://img.shields.io/badge/Data-Pandas-purple.svg)](https://pandas.pydata.org/) | [![Plotly](https://img.shields.io/badge/Plots-Plotly-darkorange.svg)](https://plotly.com/python/) | [![Requests](https://img.shields.io/badge/HTTP-Requests-darkred.svg)](https://docs.python-requests.org/) | [![JSON](https://img.shields.io/badge/Data-JSON-grey.svg)](https://www.json.org/) | [![Pathlib](https://img.shields.io/badge/FS-Pathlib-black.svg)](https://docs.python.org/3/library/pathlib.html) |
|---|---|---|---|---|---|---|---|---|


## Notebook 1 - Data Collection, Inspection and Storage  

**LuftDataQC: PM2.5 raw data from NILU API (2023)**  
Source: [https://api.nilu.no](https://api.nilu.no)  

---

### EN: Project overview  
**Goal:** Fetch 2023 hourly PM2.5 data from NILU for all available stations in Norway, inspect per-station coverage, and persist raw data.  
**Method:** HTTP requests → JSON → pandas → SQLite.  
**Tools:** Python (`requests`, `json`, `pathlib`, `pandas`, `sqlite3`, `plotly`)  

### NO: Prosjektoversikt  
**Mål:** Hente timesvise PM2.5-data (2023) fra NILU for alle tilgjengelige stasjoner i Norge, inspisere dekning per stasjon og lagre rådata.  
**Metode:** HTTP-forespørsler → JSON → pandas → SQLite.  
**Verktøy:** Python (`requests`, `json`, `pathlib`, `pandas`, `sqlite3`, `plotly`)  

---

### Reproducibility - quick reference | Reproduserbarhet - hurtigoversikt  

**Outputs (this notebook):**  
- `data/raw/nilu_pm25_<station>_2023.json` *(one file per station | én fil per stasjon)*  
- `data/processed/pm25_2023.sqlite` *(SQLite database | SQLite-database)*  
- `results/pm25_station_coverage_2023.html` *(interactive coverage chart | interaktivt dekningsdiagram)*  
- `results/pm25_station_coverage_2023.png` *(static export via kaleido | statisk eksport via kaleido)*  

**Parameters (this notebook):**  
- **Year** = 2023  
- **Pollutant** = PM2.5  
- Station count and coverage are computed dynamically; no values are hard-coded.  

In [None]:
# (EN) Reproducibility parameters and paths
# (NO) Reproduserbarhetsparametere og stier

YEAR = "2023"
COMPONENT = "PM2.5"

from pathlib import Path

# Define project paths (relative to /notebooks)
PROJECT_ROOT = Path.cwd().parent
RAW_DIR = PROJECT_ROOT / "data" / "raw"
PROCESSED_DIR = PROJECT_ROOT / "data" / "processed"
RESULT_DIR = PROJECT_ROOT / "results"

# Ensure output folders exist (run once)
for folder in [RAW_DIR, PROCESSED_DIR, RESULT_DIR]:
    folder.mkdir(parents=True, exist_ok=True)

# Standard file paths for outputs
COVERAGE_HTML = RESULT_DIR / f"pm25_station_coverage_{YEAR}.html"
COVERAGE_PNG  = RESULT_DIR / f"pm25_station_coverage_{YEAR}.png"
DB_PATH       = PROCESSED_DIR / f"pm25_{YEAR}.sqlite"

In [None]:
# (EN) Import required libraries and verify data/result folders
# (NO) Importerer nødvendige biblioteker og bekrefter mapper for data/resultater

# Standard library
import json
import sqlite3

# Third-party libraries
import pandas as pd
import plotly.express as px
import requests

# Confirm that folders exist
print(f"  • RAW_DIR exists:        {RAW_DIR.exists()}")
print(f"  • PROCESSED_DIR exists:  {PROCESSED_DIR.exists()}")
print(f"  • RESULT_DIR exists:     {RESULT_DIR.exists()}")

In [None]:
# EN: Fetch PM2.5 data for all stations in 2023 and save each file as JSON
# NO: Hent PM2.5 data for alle stasjoner i 2023 og lagre hver fil som JSON

# Define the date range for data retrieval
from_date = f"{YEAR}-01-01"  # derived from the YEAR parameter
to_date   = f"{YEAR}-12-31"

# Build the API URL for NILU's historical air quality data
url = f"https://api.nilu.no/aq/historical/{from_date}/{to_date}/all"
print(f"Requesting data from NILU API:\n→ {url}")

# If files already exist in RAW_DIR, skip downloading
if not any(RAW_DIR.glob(f"nilu_pm25_*_{from_date[:4]}.json")):
    try:
        # Send GET request with timeout for connection and read
        response = requests.get(url, timeout=(10, 60))

        # Check if the response is successful (HTTP 200)
        if response.status_code != 200:
            print(f"Failed to fetch data (HTTP {response.status_code}). / Kunne ikke hente data (HTTP {response.status_code}).")
        else:
            # Convert JSON response to Python list of dictionaries
            all_data = response.json()

            # Filter only PM2.5 measurements from the full dataset
            pm25_data = [entry for entry in all_data if entry.get("component") == COMPONENT]
            print(f"Download complete: found {len(pm25_data)} PM2.5 records. / Nedlasting fullført: fant {len(pm25_data)} PM2.5-observasjoner.")

            # Group measurements by station name
            station_groups = {}
            for entry in pm25_data:
                station = entry.get("station")
                if station:
                    station_groups.setdefault(station, []).append(entry)

            print(f"Stations with data: {len(station_groups)}")

            # Save one JSON file per station to the RAW_DIR folder
            for station, records in station_groups.items():
                # Replace problematic characters in station name for file naming
                clean_station = station.replace(" ", "_").replace("/", "_")
                filename = f"nilu_pm25_{clean_station}_{from_date[:4]}.json"
                file_path = RAW_DIR / filename

                try:
                    # Write station-specific data to JSON file with UTF-8 encoding
                    with open(file_path, "w", encoding="utf-8") as f:
                        json.dump(records, f, ensure_ascii=False, indent=2)
                    print(f"Saved: {file_path.name}")

                except Exception as e:
                    # Catch and report any file writing error
                    print(f"Failed to save {filename}: {e} / Kunne ikke lagre {filename}: {e}")

    except requests.exceptions.RequestException as e:
        print(f"HTTP request failed: {e} / HTTP-forespørsel feilet: {e}")

else:
    print("Skipping download: files already exist in data/raw. / Hopper over nedlasting: filene finnes allerede i data/raw.")


**(EN)** The download produced 58 JSON files with PM2.5 data for 2023, one per NILU monitoring station in Norway, saved in `data/raw/`.  
Each file name follows the pattern `nilu_pm25_<station>_2023.json`, which makes station-level access straightforward.  

**(NO)** Nedlastingen ga 58 JSON-filer med PM2.5-data for 2023, én per NILU-målestasjon i Norge, lagret i `data/raw/`.  
Hvert filnavn følger mønsteret `nilu_pm25_<station>_2023.json`, noe som gir enkel tilgang på stasjonsnivå.
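
The naming pattern can be sketched as a small helper. This is only an illustration: `station_filename` is a hypothetical name that does not appear in the notebook, and it mirrors the two replacements used in the download cell (spaces and slashes become underscores).

```python
def station_filename(station: str, year: str = "2023") -> str:
    """Build the per-station raw file name (hypothetical helper)."""
    # Same replacements as the download cell: spaces and slashes become underscores.
    clean_station = station.replace(" ", "_").replace("/", "_")
    return f"nilu_pm25_{clean_station}_{year}.json"


print(station_filename("Vahl skole"))
print(station_filename("Olav V gate"))
```

Other characters in station names, such as commas, pass through unchanged, exactly as in the download cell.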


### Quality Assurance (QA): Data Integrity Checks | Kvalitetssikring (KS): Kontroll av dataintegritet


In [None]:
# (EN) Check for empty or invalid PM2.5 JSON files after saving
# (NO) Sjekk for tomme eller ugyldige PM2.5-JSON-filer etter lagring

pm25_files = list(RAW_DIR.glob("nilu_pm25_*.json"))
empty_files = []

for file_path in pm25_files:
    try:
        with open(file_path, "r", encoding="utf-8") as f:
            data = json.load(f)
            if not data:
                empty_files.append(file_path.name)
    except Exception as e:
        print(f" Error reading {file_path.name}: {e}")
        empty_files.append(file_path.name)

# Result summary
if empty_files:
    print(f" Found {len(empty_files)} empty or unreadable files:")
    for fname in empty_files:
        print("-", fname)
else:
    print(" No empty PM2.5 files found. All files contain data.")

**(EN)** Data integrity check passed: all downloaded PM2.5 JSON files can be parsed and are non-empty.  
**(NO)** Dataintegritetssjekk bestått: alle nedlastede PM2.5-JSON-filer kan leses og inneholder data.


In [None]:
# (EN) List all PM2.5 JSON files in RAW_DIR (non-recursive)
# (NO) List opp alle PM2.5 JSON-filer i RAW_DIR (ikke rekursivt)

pm25_files = list(RAW_DIR.glob("nilu_pm25_*.json"))

# Show the first 10 files
print(f"Found {len(pm25_files)} PM2.5 files.")
for file in pm25_files[:10]:
    print("-", file)

**(EN)** Located all PM2.5 JSON files for 2023 in the `data/raw` folder - 58 files in total.         
**(NO)** Fant alle PM2.5-JSON-filer for 2023 i mappen `data/raw` - totalt 58 filer. 


In [None]:
# (EN) Inspect one JSON file to check if the structure is as expected
# (NO) Inspiser en JSON-fil for å kontrollere at strukturen er som forventet

for file_path in pm25_files[:1]:  # Only the first file
    print(f"\nInspecting: {file_path.name}")
    try:
        with open(file_path, "r", encoding="utf-8") as f:
            data = json.load(f)
            print(f"Type: {type(data)}, Length: {len(data)}")

            if data:
                print(f"First record keys: {list(data[0].keys())}")
                print(f"Length of 'values': {len(data[0].get('values', []))}")
            else:
                print(" File is empty (zero-length list).")
    except Exception as e:
        print(f" Error reading file: {e}")

**(EN)** Each PM2.5 JSON file is a list containing one dictionary for its station.  
Each dictionary stores metadata (e.g., `station`, `component`, `unit`) and a `"values"` field with the hourly measurements for the year.  
**(NO)** Hver PM2.5-JSON-fil er en liste som inneholder én ordbok for stasjonen.  
Hver ordbok inneholder metadata (f.eks. `station`, `component`, `unit`) og `"values"`-feltet med de timesvise målingene for året.

Inspecting: `nilu_pm25_Vahl_skole_2023.json`  
Type: `<class 'list'>`, Length: `1`  
First record keys: `['id', 'zone', 'municipality', 'area', 'station', 'type', 'eoi', 'component', 'latitude', 'longitude', 'timestep', 'isVisible', 'unit', 'values']`  
Length of `"values"`: `8393`
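
A stricter structural check than the inspection above could look like the following sketch. `REQUIRED_KEYS` and `check_station_file` are hypothetical names, not part of the notebook, and the required key set is an assumed subset of the keys printed above.

```python
# Assumed minimum set of keys; a subset of the keys shown in the inspection output.
REQUIRED_KEYS = {"station", "component", "unit", "values"}


def check_station_file(data) -> list[str]:
    """Return a list of problems found in one parsed station file (empty list = OK)."""
    problems = []
    if not isinstance(data, list) or not data:
        return ["top-level object is not a non-empty list"]
    for i, record in enumerate(data):
        if not isinstance(record, dict):
            problems.append(f"record {i} is not a dictionary")
            continue
        missing = REQUIRED_KEYS - record.keys()
        if missing:
            problems.append(f"record {i} is missing keys: {sorted(missing)}")
        elif not isinstance(record["values"], list):
            problems.append(f"record {i}: 'values' is not a list")
    return problems


# Usage with a minimal hand-written record (not real data)
print(check_station_file([{"station": "X", "component": "PM2.5", "unit": "µg/m³", "values": []}]))
```

Returning a list of problems rather than raising makes it easy to collect issues across all files before deciding how to handle them.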




In [None]:
# (EN) Check structure and metadata: station, component, unit, and number of values
# (NO) Sjekk struktur og metadata: stasjon, komponent, enhet og antall m√•linger

for file_path in RAW_DIR.glob("nilu_pm25_*.json"):
    with open(file_path, "r", encoding="utf-8") as f:
        data = json.load(f)

    if not isinstance(data, list) or len(data) < 1:
        print(f" Unexpected structure in: {file_path.name}")
        continue

    for record in data:
        station = record.get("station", "unknown")
        component = record.get("component", "unknown")
        unit = record.get("unit", "unknown")
        values = record.get("values", [])
        num_values = len(values)

        print(f"{file_path.name}")
        print(f"   Station: {station} | Component: {component} | Unit: {unit} | # values: {num_values}\n")
        break  

##### **(EN)**  File structure

The PM2.5 files are JSON lists, each containing a single dictionary for one monitoring station. Each dictionary contains:

* **Station name** (e.g., `"Vahl skole"`)
* **Component** (e.g., `"PM2.5"`, measured in `µg/m³`)
* **Metadata** (location, type, timestep, unit, coordinates)
* **Values**: a list of hourly measurements for the selected year

##### **(NO)**  Filstruktur

PM2.5-filene er JSON-lister som hver inneholder én ordbok for én målestasjon. Hver ordbok inkluderer:

* **Målestasjon** (f.eks. `"Vahl skole"`)
* **Komponent** (f.eks. `"PM2.5"`, målt i `µg/m³`)
* **Metadata** (lokasjon, type, tidssteg, enhet, koordinater)
* **Values**: en liste med timesvise målinger for det valgte året

**Example keys:**  
`['id', 'zone', 'municipality', 'area', 'station', 'type', 'eoi', 'component', 'latitude', 'longitude', 'timestep', 'isVisible', 'unit', 'values']`

**Example file:** `nilu_pm25_Olav_V_gate_2023.json`  
Station: **Olav V gate** | Component: **PM2.5** | Unit: **µg/m³** | Number of hourly values: **8,718**

_____


### Station Coverage Summary | Oversikt over stasjonsdekning

**(EN)** The number of hourly PM2.5 records per station (2023) was counted to identify stations with sufficient coverage for further analysis.

**(NO)** Antall timesvise PM2.5-målinger per stasjon (2023) ble telt for å identifisere stasjoner med tilstrekkelig datadekning for videre analyse.
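
Coverage can be expressed as the share of the year's hours that have a record. The sketch below is one way to compute it. `expected_hours` and `coverage_pct` are hypothetical helper names, and the calculation assumes one record per hour with no duplicates.

```python
import calendar


def expected_hours(year: int) -> int:
    """Number of hours in a calendar year (8760, or 8784 in a leap year)."""
    return (366 if calendar.isleap(year) else 365) * 24


def coverage_pct(n_records: int, year: int) -> float:
    """Share of the year's hours that have a record, in percent (one decimal)."""
    return round(n_records / expected_hours(year) * 100, 1)


print(expected_hours(2023))      # 8760
print(coverage_pct(8760, 2023))  # 100.0
```

Deriving the expected hours from the calendar avoids a hard-coded constant if the notebook is rerun for a leap year.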


In [None]:
# (EN) Load all PM2.5 records from JSON files in the raw data folder
# (NO) Last inn alle PM2.5-data fra JSON-filer i mappen med rådata

pm25_files = list(RAW_DIR.glob("nilu_pm25_*.json"))

if not pm25_files:
    print(" No PM2.5 files found in RAW_DIR. Please run the data download step first.")
else:
    pm25_data = []
    for file in pm25_files:
        with open(file, "r", encoding="utf-8") as f:
            records = json.load(f)
            pm25_data.extend(records)
    print(f" Loaded {len(pm25_data)} PM2.5 records from {len(pm25_files)} files.")

In [None]:
# (EN) Count the number of hourly PM2.5 values per station and display the top 10  
# (NO) Tell antall PM2.5-timer per stasjon og vis de 10 beste

# List all JSON files that start with 'nilu_pm25_'
pm25_files = list(RAW_DIR.glob("nilu_pm25_*.json"))

station_counts = []

# Loop through each file and extract station name and number of hourly values
for file_path in pm25_files:
    try:
        with open(file_path, "r", encoding="utf-8") as f:
            data = json.load(f)
            if data and isinstance(data, list):
                station = data[0].get("station", file_path.stem)
                values = data[0].get("values", [])
                station_counts.append((station, len(values)))
    except Exception as e:
        print(f" Error reading '{file_path.name}': {e}")

# Sort stations by number of values in descending order
station_counts = sorted(station_counts, key=lambda x: x[1], reverse=True)

# Print the top 10 stations with the most PM2.5 values
print(" Top 10 stations with most hourly PM2.5 values:\n" + "-"*50)
for station, count in station_counts[:10]:
    print(f"{station:<25} -> {count} records")

#### Top 10 Stations by Hourly PM2.5 Records (2023)  
*(EN) Stations with the highest number of hourly PM2.5 measurements.*  
*(NO) Stasjoner med flest timesvise PM2.5-målinger.*  

| Rank | Station            | Hourly Records |
|------|--------------------|----------------|
| 1    | Moheia Vest        | 8,731          |
| 2    | Danmarks plass     | 8,725          |
| 3    | Rolland, Åsane     | 8,725          |
| 4    | Furulund           | 8,724          |
| 5    | Nedre Langgate     | 8,723          |
| 6    | Knarrdalstranda    | 8,722          |
| 7    | Våland             | 8,722          |
| 8    | Rådal              | 8,721          |
| 9    | Klosterhaugen      | 8,720          |
| 10   | Skøyen             | 8,719          |






### Storing All PM2.5 Data in SQLite | Lagring av alle PM2.5-data i SQLite

**(EN)** To simulate a scalable and structured data architecture, all downloaded JSON files containing PM2.5 measurements for 2023 were imported into a local SQLite database. Each station's data is saved as an individual table. This structure allows the use of SQL queries in future analyses, integration with other data sources (e.g., weather or GIS), or scalable pipelines for air quality monitoring.

**(NO)** For å simulere en skalerbar og strukturert datainfrastruktur ble alle nedlastede JSON-filer med PM2.5-målinger for 2023 importert til en lokal SQLite-database. Dataene for hver stasjon er lagret som en egen tabell. Denne strukturen muliggjør bruk av SQL-spørringer i fremtidige analyser, integrasjon med andre datakilder (f.eks. vær eller GIS), eller skalerbare arbeidsflyter for luftkvalitetsovervåking.
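
As an illustration of querying a one-table-per-station layout, the sketch below builds two tiny tables in an in-memory database and summarises each one. The table names, column names (`from_time`, `value`) and numbers are made up for the example and are not taken from the NILU data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two tiny per-station tables with made-up values (illustration only).
demo_tables = {
    "pm25_station_a": [("2023-01-01T00:00", 4.0), ("2023-01-01T01:00", 6.0)],
    "pm25_station_b": [("2023-01-01T00:00", 10.0)],
}
for table, rows in demo_tables.items():
    # Table names cannot be bound as parameters, so they are quoted and interpolated.
    conn.execute(f'CREATE TABLE "{table}" (from_time TEXT, value REAL)')
    conn.executemany(f'INSERT INTO "{table}" VALUES (?, ?)', rows)

# Discover per-station tables. "_" is a LIKE wildcard, so it is escaped with "!".
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type='table' AND name LIKE 'pm25!_%' ESCAPE '!' ORDER BY name"
    )
]

# Row count and mean value per table.
summary = {
    t: conn.execute(f'SELECT COUNT(*), AVG(value) FROM "{t}"').fetchone()
    for t in tables
}
conn.close()

print(summary)
```

With one table per station, cross-station questions need a loop like this (or a `UNION ALL`), a trade-off against a single long table with a station column.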


In [None]:
# (EN) Save PM2.5 data from all stations to SQLite database
# (NO) Lagre PM2.5-data fra alle stasjoner i SQLite-databasen

# DB_PATH was defined in the parameters cell; rebuilt here from YEAR for clarity
DB_PATH = PROCESSED_DIR / f"pm25_{YEAR}.sqlite"

# Create SQLite connection
conn = sqlite3.connect(DB_PATH)

# List all PM2.5 JSON files
json_files = sorted(RAW_DIR.glob("nilu_pm25_*.json"))
print(f"Found {len(json_files)} PM2.5 files.")

# Loop through files, load into a pandas DataFrame, and save data to SQLite
for file in json_files:
    with open(file, "r", encoding="utf-8") as f:
        data = json.load(f)

    if data and isinstance(data, list) and isinstance(data[0], dict) and "values" in data[0]:
        
        # Sanitize station name for table naming
        station = data[0].get("station", file.stem).replace(",", "").replace(" ", "_")
        values = data[0]["values"]

        # Load into a pandas DataFrame
        df = pd.DataFrame(values)

        # Add station-level metadata to each row
        df["station"] = station
        df["component"] = data[0].get("component")
        df["unit"] = data[0].get("unit")
        df["timestep"] = data[0].get("timestep")

        # Save to SQLite table
        table_name = f"pm25_{station.lower()}"
        df.to_sql(table_name, conn, if_exists="replace", index=False)
        print(f"Table saved: {table_name}")

# Close connection
conn.close()

print(f"SQLite database created at: {DB_PATH}")

**(EN) Result:** Created one SQLite table per station (58 tables) from the PM2.5 JSON files. This structure supports modular, per-station data management and keeps the data available for auditing or integration.  

**(NO) Resultat:** Opprettet én SQLite-tabell per stasjon (58 tabeller) fra PM2.5-JSON-filene. Denne strukturen støtter modulær, stasjonsvis databehandling og holder dataene tilgjengelige for revisjon eller integrasjon.  

In [None]:
# (EN) Load hourly PM2.5 data from the SQLite database
# (NO) Last inn timesvise PM2.5-data fra SQLite-databasen

with sqlite3.connect(DB_PATH) as conn:
    skøyen_df   = pd.read_sql('SELECT * FROM "pm25_skøyen"', conn)
    furulund_df = pd.read_sql('SELECT * FROM "pm25_furulund"', conn)

# Show first rows to confirm
display(skøyen_df.head())
display(furulund_df.head())

In [None]:
# (EN) DB - Quality Assurance (QA): list all tables, then Top-10 stations by hourly PM2.5 values (with 2023 coverage %)
# (NO) DB - Kvalitetssikring (KS): list opp alle tabeller, deretter Topp-10 stasjoner etter timesvise PM2.5-verdier (med dekning % for 2023)

EXPECTED_HOURS = 8760  # 365 days x 24 h (2023 is not a leap year)

with sqlite3.connect(DB_PATH) as conn:
    # 1) All tables (context)
    all_tables = pd.read_sql("SELECT name FROM sqlite_master WHERE type='table' ORDER BY name;", conn)
    print(f"Total tables in database: {len(all_tables)}")
    display(all_tables)

    # 2) Per-station PM2.5 tables
    per_station = pd.read_sql("""
        SELECT name 
        FROM sqlite_master 
        WHERE type='table' AND name LIKE 'pm25_%'
        ORDER BY name;
    """, conn)
    print(f"\nPer-station PM2.5 tables: {len(per_station)}")
    display(per_station.head(8))  # preview

    # 3) Row counts per station + coverage %
    counts = []
    for tbl in per_station["name"]:
        n_rows = pd.read_sql(f'SELECT COUNT(*) AS rows FROM "{tbl}";', conn).iloc[0, 0]
        station = tbl.replace("pm25_", "", 1)
        counts.append((station, n_rows))

df_counts = (
    pd.DataFrame(counts, columns=["Station", "Rows"])
    .assign(CoveragePct=lambda d: (d["Rows"] / EXPECTED_HOURS * 100).round(1))
    .sort_values(["Rows", "Station"], ascending=[False, True])
    .reset_index(drop=True)
)

print("\nSQLite - Top 10 stations by hourly PM2.5 values:")
display(df_counts.head(10))

### SQLite Per-Station Tables Overview | Oversikt over SQLite-tabeller per stasjon

**(EN)** The local SQLite database contains **58 tables**, each corresponding to a monitoring station. Below is a preview of the structure and row counts.

**(NO)** Den lokale SQLite-databasen inneholder **58 tabeller**, én for hver målestasjon. Nedenfor vises en forhåndsvisning av strukturen og antall rader.

#### Table Preview 

| Table Name              | Rows  |
|-------------------------|-------|
| pm25_alnabru            | 8,420 |
| pm25_alvim              | 8,352 |
| pm25_backeparken        | 8,552 |
| pm25_bankplassen        | 8,226 |
| pm25_bekkestua          | 8,584 |
| pm25_bjørndalssletta    | 7,516 |
| pm25_bryn_skole         | 60    |
| pm25_brynbanen          | 7,680 |
| pm25_bygdøy_alle        | 8,351 |
| pm25_danmarks_plass     | 8,725 |


#### Top 10 Stations by Hourly PM2.5 Coverage (2023)

| Rank | Station            | Rows  | Coverage (%) |
|------|--------------------|-------|--------------|
| 1    | moheia_vest        | 8,731 | 99.7         |
| 2    | danmarks_plass     | 8,725 | 99.6         |
| 3    | rolland_åsane      | 8,725 | 99.6         |
| 4    | furulund           | 8,724 | 99.6         |
| 5    | nedre_langgate     | 8,723 | 99.6         |
| 6    | knarrdalstranda    | 8,722 | 99.6         |
| 7    | våland             | 8,722 | 99.6         |
| 8    | rådal              | 8,721 | 99.6         |
| 9    | klosterhaugen      | 8,720 | 99.5         |
| 10   | skøyen             | 8,719 | 99.5         |

_________

### Selecting Stations for Analysis | Valg av m√•lestasjoner

**(EN)** Based on the number of hourly PM2.5 records collected in 2023, two contrasting stations were selected for focused analysis: **Skøyen** (traffic-heavy, urban) and **Furulund** (quiet, residential), both located in **Oslo**.  
These stations offer near-complete yearly coverage (99.5 % and 99.6 %, respectively) and represent distinct urban contexts, which makes them well suited to time-series modeling and anomaly detection.

**(NO)** Basert på antall timesverdier for PM2.5 registrert i 2023 ble to kontrasterende stasjoner valgt for videre analyse: **Skøyen** (trafikkert, urbant) og **Furulund** (rolig, boligområde), begge i **Oslo**.  
Disse stasjonene har nesten full dekning (henholdsvis 99,5 % og 99,6 %) og representerer ulike bymiljøer, noe som gir et godt grunnlag for tidsserieanalyse og avviksdeteksjon.
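
A coverage-based shortlist can be sketched as a simple threshold filter. The row counts below are copied from the tables above. The 99 % cut-off and the name `stations_above` are assumptions made for this example, not criteria stated in the notebook.

```python
EXPECTED_HOURS = 8760  # hours in 2023

# Row counts copied from the tables above.
rows_per_station = {
    "furulund": 8724,
    "skøyen": 8719,
    "bjørndalssletta": 7516,
    "bryn_skole": 60,
}


def stations_above(counts: dict[str, int], min_pct: float) -> list[str]:
    """Return station names whose coverage is at least min_pct, best first."""
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, n in ranked if n / EXPECTED_HOURS * 100 >= min_pct]


print(stations_above(rows_per_station, 99.0))
```

Coverage only narrows the candidates. The contrast between a traffic-heavy and a residential site is a separate, qualitative criterion.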


In [None]:
# (EN) Create an interactive bar chart of PM2.5 hourly records per station (2023)
# (NO) Lag et interaktivt stolpediagram over PM2.5-timer per stasjon (2023)

import plotly.express as px

# Create a DataFrame from the station_counts list
df_counts = pd.DataFrame(station_counts, columns=["Station", "HourlyRecords"])

# Sort by number of records
df_counts = df_counts.sort_values(by="HourlyRecords", ascending=False)

# Create interactive bar chart
fig = px.bar(
    df_counts,
    x="Station",
    y="HourlyRecords",
    title=f"PM2.5 Station Coverage - {YEAR}",  
    labels={"HourlyRecords": "Number of Records", "Station": "Station"},
    hover_data={"Station": True, "HourlyRecords": True},
)

fig.update_layout(
    xaxis_tickangle=45,
    height=600,
    margin=dict(t=60, b=100),
)

fig.show()

# Save figure (static PNG export requires the kaleido package)
fig.write_html(COVERAGE_HTML)
fig.write_image(COVERAGE_PNG)

print("Chart saved to:")
print(" →", COVERAGE_HTML)
print(" →", COVERAGE_PNG)

**(EN)** Result: Chart saved as `results/pm25_station_coverage_2023.html` and `results/pm25_station_coverage_2023.png`.  
**(NO)** Resultat: Dekningsfigur lagret som `results/pm25_station_coverage_2023.html` og `results/pm25_station_coverage_2023.png`.

**[View interactive chart (HTML)](../results/pm25_station_coverage_2023.html?raw=1)**


#### Data Collection and Structuring Completed | Datainnhenting og strukturering fullf√∏rt

**EN:** Phase 1 is complete:
- Fetched 2023 hourly PM2.5 from NILU for all available stations (Norway).
- Inspected per-station coverage.
- Selected two Oslo stations (Skøyen, Furulund) for deeper analysis.
- Persisted raw/structured data to SQLite for scalable querying and reuse.

**NO:** Første fase er fullført:
- Hentet timesvise PM2.5-data (2023) fra NILU for alle tilgjengelige stasjoner.
- Inspiserte datadekning per stasjon.
- Valgte to Oslo-stasjoner (Skøyen, Furulund) for videre analyser.
- Lagret rå/strukturerte data i SQLite for skalerbare spørringer og gjenbruk.

_____

#### Next Step | Neste steg
Proceed to Notebook 2 for exploratory analysis and quality checks.

_____

**Navigation Links**
  
- [Notebook 2 - Exploratory Analysis and Quality Checks](./02_exploratory_qc.ipynb)  
- [Notebook 3 - Feature Engineering and Anomaly Detection](./03_features_anomalies.ipynb)  
- [Notebook 4 - Summary Report (EN)](./04_report.ipynb)  
- [Notebook 5 - Sammendragsrapport (NO)](./05_report_norsk.ipynb)  


In [None]:
# (EN) Notebook 1 complete
# (NO) Notebook 1 ferdig

print("Notebook 1 complete: data collected and structured into SQLite (pm25_2023.sqlite), ready for Notebook 2 (Exploratory Analysis and Quality Checks).")
print("Notebook 1 ferdig: data samlet inn og strukturert i SQLite (pm25_2023.sqlite), klar for Notebook 2 (Utforskende analyse og kvalitetskontroll).")