Generate tables of results, which are extracted from a data source.
- These can be different datatypes:
  - comparison: (the most common) two scenarios, compared by effects, costs and cost-effectiveness
  - record: the individual records, with effects and costs
- These can be presented by: region, income, appendix 3 status, author, intervention, or any combination of these
- These can be filtered by: region, income, appendix 3 status, author, intervention, scenario, or any combination of these
- These can be in various formats: csv, html, pandas dataframes, python lists.
Part of the Botech ecosystem, written by Forecast Health Australia
This repository was created because we were generating a lot of health-economic model data, often from different sources, and we wanted to compare it in lots of ways.
I was frustrated with hard-coding comparisons, and felt there was a better way to do it. There are many existing solutions to this kind of problem, but it seemed marginally easier to write a new package that we could modify as our use cases changed.
- Have some general mechanism to convert modelling results to a csv of Records
- Generate a configuration file
- Invoke this package, either by installing it, or running the main script and passing your files as arguments.
- Use the results however you'd like, e.g.
  - Render the results as html and host them on a website
- Python (built with 3.10.12)
- country-metadata
- pandas
git clone https://github.com/ForecastHealth/botech-comparisons.git
pip install -r requirements.txt
(use a virtual environment)
There are two python dataclasses and one enum which are important to understand: the Record, the Comparison and the Filter.
You can find their definitions in the datatypes module.
In particular:
- The Record is important, because we expect the underlying database to be a CSV file with the schema of a Record
- The Filter is important, because you can filter and group using these elements.
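As a rough illustration of what "a CSV file with the schema of a Record" could mean, the sketch below defines a stand-in record and filter enum and parses CSV rows into it. The class names ExampleRecord and ExampleFilter, the field names (country, region, income, scenario, effects, costs), the enum members and the sample rows are all assumptions made for illustration; the real definitions live in the datatypes module and may differ.

```python
import csv
import io
from dataclasses import dataclass
from enum import Enum


class ExampleFilter(Enum):
    """Hypothetical stand-in for the package's Filter enum."""
    REGION = "region"
    INCOME = "income"
    COUNTRY = "country"
    SCENARIO = "scenario"


@dataclass
class ExampleRecord:
    """Hypothetical stand-in for the package's Record dataclass."""
    country: str
    region: str
    income: str
    scenario: str
    effects: float
    costs: float


def read_records(csv_text: str) -> list[ExampleRecord]:
    """Parse CSV text whose header matches the ExampleRecord fields."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        ExampleRecord(
            country=row["country"],
            region=row["region"],
            income=row["income"],
            scenario=row["scenario"],
            effects=float(row["effects"]),  # numeric columns arrive as strings
            costs=float(row["costs"]),
        )
        for row in reader
    ]


# Made-up rows for illustration only.
EXAMPLE_CSV = """country,region,income,scenario,effects,costs
BRA,South America,UPPER MIDDLE INCOME,baseline,100.0,5000.0
BRA,South America,UPPER MIDDLE INCOME,scaleup,150.0,8000.0
"""

records = read_records(EXAMPLE_CSV)
```

Keeping the CSV header identical to the dataclass field names means a missing or misspelled column fails early with a KeyError rather than producing a silently incomplete record.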
Write a config.json, which defines the following:
- data_type: the type of table you want to return (explained below)
  - blueprint (note - this is probably not useful unless you already know what it is)
  - filtered_records (individual records)
  - comparisons (comparisons of records - probably what you want)
- data_format: the format of the table you want to return: csv, html, dataframe, or self.
  - dataframe is a pandas.DataFrame
  - self is a list of the data type
- scenarios is a list of exactly two elements, where each element corresponds to a scenario. These must be labelled in your dataset, e.g. baseline and scale-up
- groups is a list of lists, with each nested list being the ways you want to present the data. For instance, if you have the list ["region", "income"], this means you want the data to be presented by region by income, e.g. "North America x High Income", "Oceania x Low Income", etc.
- filters is a dictionary of Filters where the value is a list of values that you want to include, e.g.
  - "income": ["HIGH INCOME"] will only include results from high income countries
  - "country": ["BRA", "MOZ"] will only include results from Brazil and Mozambique
  - etc
{
    "data_type": "comparisons",
    "data_format": "html",
    "scenarios": ["baseline", "scaleup"],
    "groups": [
        ["region", "income"]
    ],
    "filters": {
        "income": ["HIGH INCOME"],
        "intervention": [0]
    }
}
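Before handing a configuration to the package, it can help to check it against the rules described above. The following is a minimal sketch of such a check using only the standard library. The function check_config and the two constants are hypothetical helpers, not part of this package, and treating groups and filters as optional is an assumption.

```python
import json

# Allowed values as listed in the configuration description above.
ALLOWED_DATA_TYPES = {"blueprint", "filtered_records", "comparisons"}
ALLOWED_DATA_FORMATS = {"csv", "html", "dataframe", "self"}


def check_config(config: dict) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []

    data_type = config.get("data_type")
    if not isinstance(data_type, str) or data_type not in ALLOWED_DATA_TYPES:
        problems.append("data_type must be one of: " + ", ".join(sorted(ALLOWED_DATA_TYPES)))

    data_format = config.get("data_format")
    if not isinstance(data_format, str) or data_format not in ALLOWED_DATA_FORMATS:
        problems.append("data_format must be one of: " + ", ".join(sorted(ALLOWED_DATA_FORMATS)))

    scenarios = config.get("scenarios")
    if not isinstance(scenarios, list) or len(scenarios) != 2:
        problems.append("scenarios must be a list of exactly two elements")

    # Assumption: groups and filters may be omitted, defaulting to empty.
    groups = config.get("groups", [])
    if not isinstance(groups, list) or not all(isinstance(g, list) for g in groups):
        problems.append("groups must be a list of lists")

    filters = config.get("filters", {})
    if not isinstance(filters, dict) or not all(isinstance(v, list) for v in filters.values()):
        problems.append("filters must map each filter name to a list of values")

    return problems


config = json.loads("""
{
    "data_type": "comparisons",
    "data_format": "html",
    "scenarios": ["baseline", "scaleup"],
    "groups": [["region", "income"]],
    "filters": {"income": ["HIGH INCOME"], "intervention": [0]}
}
""")

problems = check_config(config)
```

Returning a list of messages rather than raising on the first error lets you report every problem in a config file in one pass.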
Please refer to __init__.py to read the high-level API, create_tables().
The configuration can be created by parsing a JSON configuration using parse_configurations, and the data will need to be provided by the user and parsed using something like pandas.read_csv().
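To make the idea of a comparison table concrete, the sketch below filters rows, groups them, and compares two scenarios using only the standard library. It is not this package's implementation and it does not call create_tables(), whose signature is not shown here. The row keys, the helper names apply_filters and compare_scenarios, the sample rows, and the use of incremental cost divided by incremental effect as the cost-effectiveness measure are all assumptions.

```python
from collections import defaultdict


def apply_filters(rows, filters):
    """Keep rows whose value for every filter key is in that key's allowed list."""
    return [
        row for row in rows
        if all(row[key] in allowed for key, allowed in filters.items())
    ]


def compare_scenarios(rows, scenarios, group):
    """Sum effects and costs per group and scenario, then compare the two scenarios."""
    first, second = scenarios  # exactly two scenarios, as the config requires
    totals = defaultdict(
        lambda: {s: {"effects": 0.0, "costs": 0.0} for s in scenarios}
    )
    for row in rows:
        if row["scenario"] not in scenarios:
            continue
        key = tuple(row[field] for field in group)
        totals[key][row["scenario"]]["effects"] += row["effects"]
        totals[key][row["scenario"]]["costs"] += row["costs"]

    table = {}
    for key, by_scenario in totals.items():
        d_effects = by_scenario[second]["effects"] - by_scenario[first]["effects"]
        d_costs = by_scenario[second]["costs"] - by_scenario[first]["costs"]
        table[key] = {
            "effects": d_effects,
            "costs": d_costs,
            # Assumed measure: incremental cost per unit of incremental effect.
            "cost_effectiveness": d_costs / d_effects if d_effects else None,
        }
    return table


# Made-up rows for illustration only.
rows = [
    {"region": "Oceania", "income": "HIGH INCOME", "scenario": "baseline",
     "effects": 100.0, "costs": 5000.0},
    {"region": "Oceania", "income": "HIGH INCOME", "scenario": "scaleup",
     "effects": 150.0, "costs": 8000.0},
    {"region": "Oceania", "income": "LOW INCOME", "scenario": "scaleup",
     "effects": 10.0, "costs": 100.0},
]

filtered = apply_filters(rows, {"income": ["HIGH INCOME"]})
table = compare_scenarios(filtered, ["baseline", "scaleup"], ["region", "income"])
```

Each key of the resulting table is a tuple such as ("Oceania", "HIGH INCOME"), which corresponds to the "region by income" presentation that a groups entry of ["region", "income"] asks for.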
Feel free to fork, or submit an issue. If you'd like to be added as a contributor, please message me, or contact us through our website, Forecast Health Australia.
Tests are built with unittest and can be run locally using python -m unittest discover tests from the root directory.
Rory Watts, Forecast Health Australia
This repository is licensed under the Apache 2.0 License.
Please