Open dataset of regulatory proceedings from the Michigan Public Service Commission (MPSC), spanning November 1987 through April 2026.
This repository holds the static data snapshot that accompanies the paper OpenMPSC: An Open Dataset of Michigan Public Service Commission Regulatory Proceedings. The same snapshot is bundled in the paper repository under data/; this repo exists as a stand-alone, citable distribution point.
For continuously updated, daily-refreshed access (including full PDFs and extracted text), use the public REST API and web interface at https://openmpsc.com.
- Snapshot date: 2026-04-30
- Coverage: 1987-11-12 to 2026-04-29
- License: CC BY 4.0
| Metric | Value |
|---|---|
| Cases | 6,136 |
| Filings | 164,865 |
| Orders | 12,685 |
| Public comments | 12,309 |
| Hearings | 2,635 |
| Party records | 12,308 |
| Commission meetings | 672 |
| Total PDF pages | 3,812,492 |
| Total PDF storage (live) | ~149 GB |
| Extracted plain text (live) | ~5.4 GB |
(PDFs and full text are not redistributed in this snapshot; they are retrievable from the API.)
All CSVs are compressed with `zstd -19`. Decompress with `zstd -d <file>.csv.zst` or stream with `zstdcat <file>.csv.zst | ...`.
| File | Rows | Description |
|---|---|---|
| `data/summary_stats.csv.zst` | 30 | Headline metrics; every count cited in the paper. |
| `data/taxonomy_categories.csv.zst` | 14 | Filing taxonomy top-level categories with counts. |
| `data/taxonomy_subcategories.csv.zst` | 205 | Subcategory codes with counts. |
| `data/case_types.csv.zst` | 41 | All MPSC case types with case counts, filing counts, and filings-per-case. |
| `data/cases.csv.zst` | 6,136 | One row per case: number, subject, industry, type, status flag, open date, lead company, child counts. |
| `data/filings.csv.zst` | 164,865 | One row per filing: filing number, case type, filing type, description, file date, filer, classification (category / subcategory / party type), num_pages, file size. |
| `data/orders.csv.zst` | 12,685 | One row per order: order number, title, order date, file size. |
| `data/comments.csv.zst` | 12,309 | Public comments: commenter name, anonymity flag, comment text, submission timestamp, has-attachment flag. |
| `data/hearings.csv.zst` | 2,635 | Hearings: hearing type, hearing date, start time, virtual flag, cancellation flag. |
| `data/parties.csv.zst` | 12,308 | Party records: party name, role, attorney firm. |
| `data/meetings.csv.zst` | 672 | Commission meeting documents: id, filename, dates, type, page count, YouTube URL. Extracted PDF text excluded for size. |
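The row counts in the table above can be checked without writing decompressed files to disk. The following is a minimal sketch, assuming the third-party `zstandard` package is installed and that each CSV has exactly one header row. The helper names `open_zst_text`, `count_data_rows`, and `verify_counts` are illustrative and not part of this repository.

```python
import csv
import io


def open_zst_text(path, encoding="utf-8"):
    """Open a .csv.zst file as a text stream (assumes the `zstandard` package)."""
    import zstandard  # third-party; imported lazily so the helpers below stay stdlib-only

    raw = open(path, "rb")
    reader = zstandard.ZstdDecompressor().stream_reader(raw)
    # newline="" lets the csv module handle line breaks inside quoted fields.
    return io.TextIOWrapper(reader, encoding=encoding, newline="")


def count_data_rows(text_stream):
    """Count CSV records after the header, honoring quoted multi-line fields."""
    rows = csv.reader(text_stream)
    next(rows, None)  # skip the header row, if any
    return sum(1 for _ in rows)


def verify_counts(expected, opener=open_zst_text):
    """Return {path: (expected, actual)} for every file whose count differs."""
    mismatches = {}
    for path, want in expected.items():
        with opener(path) as stream:
            got = count_data_rows(stream)
        if got != want:
            mismatches[path] = (want, got)
    return mismatches
```

Calling `verify_counts({"data/cases.csv.zst": 6136, "data/hearings.csv.zst": 2635})` from the repository root should return an empty dict if the files match the table. Records are counted with the `csv` module rather than by counting lines because free-text columns such as comment text may contain embedded newlines.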
This snapshot was generated by an export pipeline that flattens the database into per-table CSVs and drops the join keys and several optional columns. Specifically:
- `case_number` is empty in `filings`, `orders`, `comments`, `hearings`, and `parties`. The snapshot is sufficient for per-table aggregate analysis (counts, distributions, classifications) but does not support case-level joins.
- `orders.num_pages` is empty in this export (live data has it).
- `comments` omits `case_number`, `organization_name`, `commenter_city`, `commenter_state`, and `subject`; only `commenter_name`, `is_anonymous`, `comment_text`, `submitted_at`, and `has_attachment` are populated.
- `hearings` omits `case_number`, `title`, `end_time`, `location`, and `alj_name`; aggregate counts and the type / date / cancelled-flag analyses still hold.
- `parties` omits `case_number` and `attorney_name`; role distributions and unique-party counts still hold.
- `cases.is_closed` is `0` for all rows in this export; the live API exposes the correct status flag (96.6% of cases are closed in the live system).
- `cases.close_date` and `cases.parent_case_number` are empty.
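Before relying on a column, it can help to confirm which ones are actually blank in a given table. The sketch below is one way to do that with the standard library only; `empty_columns` is an illustrative helper name, and it assumes the caller passes an already-decompressed text stream (for example from `zstdcat` or the `zstandard` package).

```python
import csv


def empty_columns(text_stream):
    """Return the names of columns whose value is blank in every data row."""
    reader = csv.DictReader(text_stream)
    still_empty = set(reader.fieldnames or [])
    for row in reader:
        # Drop a column from the candidate set as soon as one non-blank value appears.
        still_empty -= {name for name in still_empty if (row.get(name) or "").strip()}
        if not still_empty:
            break  # every column has at least one value; no need to read further
    return sorted(still_empty)
```

Run against the decompressed `data/filings.csv.zst`, the result would be expected to include `case_number`, per the list above. Note that a file with a header and no data rows reports every column as empty.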
For analyses that require case-level joins, any of the fields listed above as empty or omitted, or the closed-case flag, use the public REST API at https://openmpsc.com/api/v1/. A future revision of this snapshot will restore the join keys.
- All dates are ISO 8601 (`YYYY-MM-DD` or full timestamp).
- Empty strings in date columns mean "not set" (treated as NULL).
- `is_anonymous` / `is_virtual` / `is_cancelled` / `has_attachment` are `0`/`1` integer flags.
- `case_number` follows the MPSC convention (e.g., `U-21990`), but is empty in this snapshot for every table other than `cases.csv.zst` and `case_types.csv.zst` (see Known limitations above).
- `filing_category` is one of 14 codes (the 12-category taxonomy plus `UNK` and `XXX` placeholders for low-confidence classifications); `filing_subcategory` mostly follows the `CAT-SUB` pattern (e.g., `TES-DIR` for direct testimony), with a small tail of malformed LLM outputs that are retained rather than silently re-mapped.
- `party_type` on filings is one of `Company`, `Intervenor`, `Staff`, `AG` (Attorney General), `Public`, or `Unknown`.
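These conventions translate into a few small parsing helpers. The following is a sketch under stated assumptions, not the pipeline's own code: the helper names are illustrative, the three-uppercase-letter shape of each code is generalized from the single documented example `TES-DIR`, and the rule that a subcategory's prefix equals its category is inferred from the `CAT-SUB` naming.

```python
import re
from datetime import date, datetime

# Assumption: both halves of a subcategory code are three uppercase letters,
# generalized from the one documented example "TES-DIR".
SUBCATEGORY_PATTERN = re.compile(r"^[A-Z]{3}-[A-Z]{3}$")
PLACEHOLDER_CATEGORIES = {"UNK", "XXX"}  # low-confidence placeholders named in this README


def parse_date(value):
    """Parse an ISO 8601 date or timestamp; an empty string means "not set"."""
    value = value.strip()
    if not value:
        return None
    if len(value) == 10:  # plain YYYY-MM-DD
        return date.fromisoformat(value)
    # Older Python versions reject a trailing "Z", so normalize it to an offset.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


def parse_flag(value):
    """Convert a 0/1 integer flag to bool; blank values become None."""
    value = value.strip()
    if not value:
        return None
    if value not in ("0", "1"):
        raise ValueError(f"unexpected flag value: {value!r}")
    return value == "1"


def classification_status(category, subcategory):
    """Label a (filing_category, filing_subcategory) pair without re-mapping it."""
    if category in PLACEHOLDER_CATEGORIES:
        return "placeholder"
    if not SUBCATEGORY_PATTERN.match(subcategory):
        return "malformed"
    # Assumption: the CAT half of CAT-SUB repeats the filing_category code.
    if not subcategory.startswith(category + "-"):
        return "malformed"
    return "ok"
```

Labeling rows instead of dropping them keeps the malformed tail visible, which matches the snapshot's choice to retain those outputs; filtering on `classification_status(...) == "ok"` is then an explicit analysis decision.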
```shell
git clone https://github.com/mjbommar/openmpsc-data
cd openmpsc-data

# decompress one table
zstd -d data/cases.csv.zst

# or stream into pandas
python -c "
import pandas as pd, zstandard as zstd, io
with open('data/filings.csv.zst', 'rb') as f:
    raw = zstd.ZstdDecompressor().decompress(f.read())
df = pd.read_csv(io.BytesIO(raw))
print(df.shape, df.columns.tolist())
"
```

If you use this dataset, please cite the accompanying paper:
```bibtex
@misc{bommarito2026openmpsc,
  title  = {{OpenMPSC}: An Open Dataset of {M}ichigan Public Service Commission Regulatory Proceedings},
  author = {Bommarito, Michael J.},
  year   = {2026},
  url    = {https://openmpsc.com}
}
```

The derived dataset in this repository (the structure of these CSVs, the LLM-generated classification labels, and the snapshot organization) is released under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.
The underlying filings, orders, and public comments were collected from publicly accessible MPSC sources. Michigan's Freedom of Information Act establishes a right of public inspection and copying for non-exempt agency records, but does not by itself convey a redistribution license; individual documents may be subject to authorial copyright or other downstream restrictions. We redistribute these records in good faith for non-commercial research and public-interest use, and downstream users should evaluate their own use independently.
LLM-generated classifications in filings.csv.zst are research aids and do not constitute legal analysis or official MPSC categorization.
For bugs in the data, schema questions, or requests, open an issue on this repo. For the live, daily-updated archive (and full PDF / text retrieval), see https://openmpsc.com.