# Open Physicians Dataset Workbook

This workbook provides a lightweight walkthrough for loading, inspecting, and summarizing the curated physicians dataset.


In [None]:
from pathlib import Path

import pandas as pd

DATA_PATH = Path("data/cleaned/physicians_clean.csv")
df = pd.read_csv(DATA_PATH)
df.head()


## Snapshot

The cells below report the dataset's dimensions, column types, and non-null counts, which is a quick way to confirm the load worked before any deeper analysis.

In [None]:
df.shape


In [None]:
df.info()


## Coverage checks

These checks report the fraction of missing values in each column and the distribution of `license_status`, with missing entries counted as their own category.

In [None]:
# Fraction of missing values per column, sorted from most to least sparse.
missing_rate = df.isna().mean().sort_values(ascending=False)
missing_rate
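
To make the missing-rate output easier to act on, one option is to list only the columns whose missing share exceeds a cutoff. The sketch below is a minimal illustration, not part of the workbook: the helper name `flag_sparse_columns`, the 20% threshold, and the sample values are all assumptions, and the small `sample` frame stands in for `df`.

```python
import pandas as pd


def flag_sparse_columns(frame: pd.DataFrame, threshold: float = 0.2) -> pd.Series:
    """Return the missing rate of every column whose rate exceeds `threshold`."""
    rate = frame.isna().mean().sort_values(ascending=False)
    return rate[rate > threshold]


# Illustrative stand-in for `df`; the values are made up for this example.
sample = pd.DataFrame(
    {
        "license_status": ["active", None, "active", None],
        "specialty_code": ["A1", "B2", None, "C3"],
        "location_code": ["X", "Y", "Z", "X"],
    }
)

# Columns with more than 20% missing values (hypothetical cutoff).
sparse = flag_sparse_columns(sample, threshold=0.2)
print(sparse)
```

In the workbook itself, the same call would be `flag_sparse_columns(df)`; the right threshold depends on how each field is used downstream.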


In [None]:
if df.empty:
    print("The dataset is currently empty. Run the ingestion pipeline to populate it.")
else:
    # dropna=False keeps rows with a missing status as their own category.
    display(df['license_status'].value_counts(dropna=False))
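
Raw counts can be hard to compare across dataset refreshes, so shares may be more useful. The following is a minimal sketch under stated assumptions: `status_share` is a hypothetical helper, and the status values `"active"` and `"expired"` are invented for the example rather than taken from the dataset.

```python
import pandas as pd


def status_share(frame: pd.DataFrame, column: str = "license_status") -> pd.Series:
    """Share of rows per value of `column`, with missing values as their own bucket."""
    if column not in frame.columns:
        raise KeyError(f"Column {column!r} not found in the dataset.")
    # normalize=True converts counts to proportions; dropna=False includes NaN rows.
    return frame[column].value_counts(dropna=False, normalize=True)


# Illustrative stand-in for `df`; the status values are hypothetical.
sample = pd.DataFrame({"license_status": ["active", "active", "expired", None]})
share = status_share(sample)
print(share)
```

The explicit column check gives a clearer error than a bare `KeyError` from indexing if the ingestion schema changes.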


## Next steps

- Filter by `location_code` or `specialty_code` for targeted profiling.
- Join against mapping tables in `mappings/` for human-readable labels.
- Export subsets for downstream modeling or QA workflows.
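
The three steps above can be sketched end to end as follows. This is an illustration under assumptions, not the workbook's actual pipeline: the `physician_id` column, the code values, the `specialty_label` column, and the shape of the mapping table are all hypothetical, since the contents of `mappings/` are not shown here. In-memory frames stand in for `df` and a mapping file.

```python
import io

import pandas as pd

# Illustrative stand-ins; real data would come from `df` and files under `mappings/`.
physicians = pd.DataFrame(
    {
        "physician_id": [1, 2, 3],  # hypothetical identifier column
        "location_code": ["L01", "L02", "L01"],
        "specialty_code": ["S10", "S20", "S99"],
    }
)
specialty_map = pd.DataFrame(
    {
        "specialty_code": ["S10", "S20"],
        "specialty_label": ["Cardiology", "Dermatology"],  # hypothetical labels
    }
)

# 1. Filter by location_code for targeted profiling.
subset = physicians[physicians["location_code"] == "L01"]

# 2. Left join keeps every physician row; codes absent from the mapping get NaN.
#    validate="many_to_one" raises if the mapping has duplicate codes.
labeled = subset.merge(
    specialty_map, on="specialty_code", how="left", validate="many_to_one"
)

# 3. Export the subset. A text buffer is used here; pass a file path in practice.
buffer = io.StringIO()
labeled.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

A left join is used so that unmapped codes surface as missing labels instead of silently dropping rows, which is usually what a QA workflow wants to see.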