-
Notifications
You must be signed in to change notification settings - Fork 54
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
use Jupyter to compare data dumped into computed_scenarios.json
- Loading branch information
1 parent
097b77e
commit afd2cf9
Showing
5 changed files
with
256 additions
and
1 deletion.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
repos: | ||
- repo: https://github.com/pre-commit/pre-commit-hooks | ||
rev: v2.3.0 | ||
hooks: | ||
- id: check-yaml | ||
- id: end-of-file-fixer | ||
- id: trailing-whitespace | ||
- repo: https://github.com/psf/black | ||
rev: 22.3.0 | ||
hooks: | ||
- id: black | ||
- repo: https://github.com/pre-commit/mirrors-mypy | ||
rev: 'v0.991' | ||
hooks: | ||
- id: mypy | ||
additional_dependencies: | ||
- types-python-dateutil | ||
- types-requests | ||
- repo: local | ||
hooks: | ||
- id: jupyter-nb-clear-output | ||
name: jupyter-nb-clear-output | ||
files: \.ipynb$ | ||
stages: [commit] | ||
language: system | ||
entry: python -m nbconvert --ClearOutputPreprocessor.enabled=True --inplace | ||
- repo: https://github.com/nbQA-dev/nbQA | ||
rev: 1.6.1 | ||
hooks: | ||
- id: nbqa-black | ||
additional_dependencies: [black==22.3.0] | ||
- id: nbqa-isort |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,197 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"id": "03835d1a", | ||
"metadata": {}, | ||
"source": [ | ||
"# To compare model outputs\n", | ||
"This notebook is for comparing the computed_scenarios.json outputs from different model versions of MicroCovid. For example, you might want to compare the model output before and after a pull request is merged. To do that, run the tests before and after the change, save both versions of computed_scenarios.json under different names, and tell us what they are here (\"reference_file\" for before the change and \"test_file\" for after, is the intention).\n", | ||
"\n", | ||
"If you want to see or change the \"test\" that generates the computed_scenarios.json file, look at `src/data/__tests__/scenarios.test.ts`." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "d2ec30ab", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"reference_file = \"computed_scenarios_recent.json\"\n", | ||
"test_file = \"computed_scenarios_1421.json\"" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "0a4cb0eb", | ||
"metadata": {}, | ||
"source": [ | ||
"and then run the rest of this notebook to play with the data changes." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "bbb04652", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"import json\n", | ||
"import re\n", | ||
"\n", | ||
"import matplotlib.pyplot as plt\n", | ||
"import pandas as pd\n", | ||
"from IPython.core.display import HTML\n", | ||
"from pandas import NA" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "00d271a1", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"def load(filename):\n", | ||
" with open(filename) as file:\n", | ||
" blob = json.load(file)\n", | ||
" table = pd.json_normalize(blob)\n", | ||
" return table.convert_dtypes()[sorted(table.columns)]" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "09cb12bf", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"reference = load(reference_file)\n", | ||
"test = load(test_file)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "3b0bc1c6", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"JOIN_KEYS = [\"scenario\", \"loc\", \"vaccination\"]\n", | ||
"combined = test.join(\n", | ||
" reference.set_index(JOIN_KEYS), lsuffix=\".test\", rsuffix=\".reference\", on=JOIN_KEYS, how=\"outer\"\n", | ||
")\n", | ||
"combined = combined[sorted(combined.columns)]\n", | ||
"combined[\"ratio\"] = combined[\"result.expectedValue.test\"] / combined[\"result.expectedValue.reference\"]\n", | ||
"combined.rename(columns=lambda s: re.sub(r\"([a-z])([.A-Z])\", r\"\\1\\2\", s), inplace=True)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "b5a18f33", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"def group_by_ratio(df, only_first):\n", | ||
" by_ratio = df.sort_values(\"ratio\", kind=\"stable\")\n", | ||
" if only_first:\n", | ||
" by_ratio = by_ratio.groupby(\"ratio\").first()\n", | ||
" by_ratio[\"ratio\"] = by_ratio.index\n", | ||
" by_ratio = by_ratio[[\"ratio\", *(col for col in by_ratio.columns if col != \"ratio\")]]\n", | ||
" return (\n", | ||
" by_ratio.set_index([\"loc\", \"scenario\", \"vaccination\"])\n", | ||
" .sort_index(kind=\"stable\")\n", | ||
" .sort_values(\"ratio\", kind=\"stable\")\n", | ||
" )\n", | ||
"\n", | ||
"\n", | ||
"def show_only_changes(df):\n", | ||
" df = df.copy()\n", | ||
" for col in df.columns:\n", | ||
" if col.endswith(\".test\"):\n", | ||
" test_col = col\n", | ||
" ref_col = col.replace(\".test\", \".reference\")\n", | ||
" else:\n", | ||
" continue\n", | ||
" for row in df.index:\n", | ||
" test_val = df[test_col][row]\n", | ||
" ref_val = df[ref_col][row]\n", | ||
" if test_val is not NA and ref_val is not NA and test_val == ref_val:\n", | ||
" df[test_col][row] = NA\n", | ||
" df[ref_col][row] = NA\n", | ||
" if all(value is pd.NA for col in (test_col, ref_col) for value in df[col]):\n", | ||
" df.drop(test_col, axis=1, inplace=True)\n", | ||
" df.drop(ref_col, axis=1, inplace=True)\n", | ||
" return df\n", | ||
"\n", | ||
"\n", | ||
"def blank_na(df):\n", | ||
" return HTML(df.style.to_html().replace(\"<NA>\", \"\"))" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "bd443b48", | ||
"metadata": {}, | ||
"source": [ | ||
"# The next table will show highlights of what's changed.\n", | ||
"The ratio between the test and reference \"result.expectedValue\" outputs is in the \"ratio\" column. This table shows\n", | ||
"select rows sorted by that ratio, giving an overview that suggests which kinds of scenarios changed by how much." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "bef69c42", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"pd.set_option(\"display.max_colwidth\", 40)\n", | ||
"pd.set_option(\"display.max_columns\", 100)\n", | ||
"blank_na(show_only_changes(group_by_ratio(combined, only_first=True)))" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "6c8f8879", | ||
"metadata": {}, | ||
"source": [ | ||
"# The next table will show all changes.\n", | ||
"As above, but all rows are included." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "03449e37", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"blank_na(show_only_changes(group_by_ratio(combined, only_first=False)))" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python 3 (ipykernel)", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.10.0" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 5 | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters