Task/rename bad medicine #195

Merged 5 commits on May 15, 2024
12 changes: 6 additions & 6 deletions .github/workflows/testpack.yml
@@ -14,11 +14,11 @@ jobs:
run: dotnet test --nologo
- name: Package
run: |
dotnet publish BadMedicine/BadMedicine.csproj -o linux-x64 -r linux-x64 -c Release --self-contained --nologo -p:PublishSingleFile=true -p:GenerateDocumentationFile=false
dotnet publish BadMedicine/BadMedicine.csproj -o win-x64 -r win-x64 -c Release --self-contained --nologo -p:PublishSingleFile=true -p:GenerateDocumentationFile=false
dotnet pack BadMedicine.Core/BadMedicine.Core.csproj -p:DebugType=full -p:SymbolPackageFormat=snupkg -p:PackageVersion=$(fgrep AssemblyInformationalVersion SharedAssemblyInfo.cs|cut -d'"' -f2) -o . --include-source --include-symbols --nologo -c Release
tar czf badmedicine-cli-linux-x64.tgz ./linux-x64
zip -9rj badmedicine-cli-win-x64.zip win-x64
dotnet publish SynthEHR/SynthEHR.csproj -o linux-x64 -r linux-x64 -c Release --self-contained --nologo -p:PublishSingleFile=true -p:GenerateDocumentationFile=false
dotnet publish SynthEHR/SynthEHR.csproj -o win-x64 -r win-x64 -c Release --self-contained --nologo -p:PublishSingleFile=true -p:GenerateDocumentationFile=false
dotnet pack SynthEHR.Core/SynthEHR.Core.csproj -p:DebugType=full -p:SymbolPackageFormat=snupkg -p:PackageVersion=$(fgrep AssemblyInformationalVersion SharedAssemblyInfo.cs|cut -d'"' -f2) -o . --include-source --include-symbols --nologo -c Release
tar czf SynthEHR-cli-linux-x64.tgz ./linux-x64
zip -9rj SynthEHR-cli-win-x64.zip win-x64
- name: Nuget push
if: contains(github.ref,'refs/tags/')
run: nuget push HIC.*.nupkg -skipDuplicate -Source https://api.nuget.org/v3/index.json -ApiKey ${{ secrets.NUGET_KEY }}
@@ -27,7 +27,7 @@ jobs:
if: contains(github.ref, 'refs/tags/v')
with:
repo_token: ${{ secrets.GITHUB_TOKEN }}
file: badmedicine-cli-*
file: SynthEHR-cli-*
tag: ${{ github.ref }}
overwrite: true
file_glob: true
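For reviewers checking the `dotnet pack` line: `PackageVersion` is read from `SharedAssemblyInfo.cs` by matching the `AssemblyInformationalVersion` line and taking the text between the first pair of double quotes. The sketch below reproduces that pipeline against a throwaway copy of the file. The temp directory and the `version` variable are illustrative assumptions, and `grep -F` stands in for the workflow's `fgrep`.

```shell
# Hypothetical stand-in for SharedAssemblyInfo.cs, written to a temp directory.
workdir=$(mktemp -d)
cat > "$workdir/SharedAssemblyInfo.cs" <<'EOF'
[assembly: AssemblyVersion("2.0.0")]
[assembly: AssemblyFileVersion("2.0.0")]
[assembly: AssemblyInformationalVersion("2.0.0")]
EOF

# Same pipeline shape as the workflow: keep the matching line, then take the
# second double-quote-delimited field, i.e. the text between the quotes.
version=$(grep -F AssemblyInformationalVersion "$workdir/SharedAssemblyInfo.cs" | cut -d'"' -f2)
echo "$version"
rm -r "$workdir"
```

With the attribute values in this PR the pipeline yields `2.0.0`, so the packed NuGet version follows the assembly attribute rather than a value hard-coded in the workflow.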
14 changes: 7 additions & 7 deletions .gitignore
@@ -1,13 +1,13 @@
BadMedicine/obj/
BadMedicine/bin/
SynthEHR/obj/
SynthEHR/bin/
*.user
BadMedicine/.vs/
SynthEHR/.vs/
.vs/
Visual Studio 2017/
BadMedicineTests/obj/
SynthEHRTests/obj/
*.vspx
*.psess
BadMedicineTests/bin/
BadMedicine.Core/obj/
BadMedicine.Core/bin/
SynthEHRTests/bin/
SynthEHR.Core/obj/
SynthEHR.Core/bin/
.idea/
33 changes: 19 additions & 14 deletions CHANGELOG.md
@@ -4,7 +4,11 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [1.2.2] - 2024-035-16
## [2.0.0] - Unreleased

- Rename package to SynthEHR

## [1.2.2] - 2024-05-16

-Add warning about naming deprecation, see [README](./README.md#Deprecation)

@@ -16,7 +20,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

- Now targets .Net 8.0
- Some bugfixes change random data generation, cross-version consistency not preserved
- BadMedicine itself now AOT/trim clean, but dependencies are not
- SynthEHR itself now AOT/trim clean, but dependencies are not
- Improve BucketList performance
- Add Equ 2.3.0
- Bump HIC.FAnsiSql from 3.0.1 to 3.2.0
@@ -88,15 +92,16 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

- Patient birth dates now go from 1914 (Person.MinimumYearOfBirth) allowing for patients aged up to 100 years

[Unreleased]: https://github.com/HicServices/BadMedicine/compare/v1.2.2...main
[1.2.2]: https://github.com/HicServices/BadMedicine/compare/v1.2.1...v1.2.2
[1.2.1]: https://github.com/HicServices/BadMedicine/compare/v1.2.0...v1.2.1
[1.2.0]: https://github.com/HicServices/BadMedicine/compare/v1.1.2...v1.2.0
[1.1.2]: https://github.com/HicServices/BadMedicine/compare/v1.1.1...v1.1.2
[1.1.1]: https://github.com/HicServices/BadMedicine/compare/v1.1.0...v1.1.1
[1.1.0]: https://github.com/HicServices/BadMedicine/compare/v1.0.0...v1.1.0
[1.0.0]: https://github.com/HicServices/BadMedicine/compare/v0.1.6...v1.0.0
[0.1.6]: https://github.com/HicServices/BadMedicine/compare/v0.1.5...v0.1.6
[0.1.5]: https://github.com/HicServices/BadMedicine/compare/v0.1.4...v0.1.5
[0.1.4]: https://github.com/HicServices/BadMedicine/compare/v0.1.3...v0.1.4
[0.1.3]: https://github.com/HicServices/BadMedicine/compare/0.0.1.2...v0.1.3
[Unreleased]: https://github.com/HicServices/SynthEHR/compare/v2.0.0...main
[2.0.0]: https://github.com/HicServices/SynthEHR/compare/v1.2.2...v2.0.0
[1.2.2]: https://github.com/HicServices/SynthEHR/compare/v1.2.1...v1.2.2
[1.2.1]: https://github.com/HicServices/SynthEHR/compare/v1.2.0...v1.2.1
[1.2.0]: https://github.com/HicServices/SynthEHR/compare/v1.1.2...v1.2.0
[1.1.2]: https://github.com/HicServices/SynthEHR/compare/v1.1.1...v1.1.2
[1.1.1]: https://github.com/HicServices/SynthEHR/compare/v1.1.0...v1.1.1
[1.1.0]: https://github.com/HicServices/SynthEHR/compare/v1.0.0...v1.1.0
[1.0.0]: https://github.com/HicServices/SynthEHR/compare/v0.1.6...v1.0.0
[0.1.6]: https://github.com/HicServices/SynthEHR/compare/v0.1.5...v0.1.6
[0.1.5]: https://github.com/HicServices/SynthEHR/compare/v0.1.4...v0.1.5
[0.1.4]: https://github.com/HicServices/SynthEHR/compare/v0.1.3...v0.1.4
[0.1.3]: https://github.com/HicServices/SynthEHR/compare/0.0.1.2...v0.1.3
41 changes: 18 additions & 23 deletions README.md
@@ -1,21 +1,16 @@
# BadMedicine
# SynthEHR (Previously BadMedicine)

## <a name="Deprecation"></a> Deprecation Notice

BadMedicine v1.2.2 will be the final release under this name.

The project will be renamed SynthEHR in all future releases, starting at v2.0.0

The project will be able to be found on [Github](https://gituhb.com/HICServices/SynthEHR) and on [Nuget](https://www.nuget.org/packages/HIC.SynthEHR/)

[![Build Status](https://github.com/HICServices/BadMedicine/actions/workflows/testpack.yml/badge.svg?branch=develop)](https://travis-ci.org/HicServices/BadMedicine) [![NuGet Badge](https://buildstats.info/nuget/HIC.BadMedicine)](https://www.nuget.org/packages/HIC.BadMedicine/)
[![Build Status](https://github.com/HICServices/SynthEHR/actions/workflows/testpack.yml/badge.svg?branch=develop)](https://travis-ci.org/HicServices/SynthEHR) [![NuGet Badge](https://buildstats.info/nuget/HIC.SynthEHR)](https://www.nuget.org/packages/HIC.SynthEHR/)

Library and CLI for randomly generating medical data like you might get out of an Electronic Health Records (EHR) system. It is intended for generating data for demos and testing ETL / cohort generation/ data management tools.

BadMedicine differs from other random data generators e.g. Mockaroo, SQL Data Generator etc in that data generated is based on (simple) models generated from live EHR datasets collected for over 30 years in Tayside and Fife (UK). This makes the data generated recognisable (codes used, frequency of codes etc) from a clinical perspective and representative of the problems (ontology mapping etc) that data analysts would encounter working with real medical data.
SynthEHR differs from other random data generators e.g. Mockaroo, SQL Data Generator etc in that data generated is based on (simple) models generated from live EHR datasets collected for over 30 years in Tayside and Fife (UK). This makes the data generated recognisable (codes used, frequency of codes etc) from a clinical perspective and representative of the problems (ontology mapping etc) that data analysts would encounter working with real medical data.

Datasets generated are not suitable for training AI algorithms etc (See [What is Modelled?](#what-is-modelled))

## Rename
As of v2.0.0 BadMedicine was renamed to SynthEHR. Previous versions of the software can be found at [nuget.org](https://www.nuget.org/packages/HIC.BadMedicine).

## Datasets

The following synthetic datasets can be produced.
@@ -31,32 +26,32 @@ The following synthetic datasets can be produced.

## Usage:

BadMedicine is available as a [nuget package](https://www.nuget.org/packages/HIC.BadMedicine/) for linking as a library
SynthEHR is available as a [nuget package](https://www.nuget.org/packages/HIC.SynthEHR/) for linking as a library

The standalone CLI (BadMedicine.exe) is available in the [releases section of Github](https://github.com/HicServices/BadMedicine/releases)
The standalone CLI (SynthEHR.exe) is available in the [releases section of Github](https://github.com/HicServices/SynthEHR/releases)

Usage is as follows:

```
BadMedicine.exe c:\temp\
SynthEHR.exe c:\temp\
```

You can change how much data is produced (e.g. 500 patients, 10000 records per dataset):

```
BadMedicine.exe c:\temp\ 500 10000
SynthEHR.exe c:\temp\ 500 10000
```

Or run only a single dataset:

```
BadMedicine.exe c:\omg 5000 200000 -l -d CarotidArteryScan
SynthEHR.exe c:\omg 5000 200000 -l -d CarotidArteryScan
```

You can seed the generator (Guids generated will still differ)

```
BadMedicine.exe c:\omg 5000 200000 -l -d CarotidArteryScan -s 5000
SynthEHR.exe c:\omg 5000 200000 -l -d CarotidArteryScan -s 5000
```

## Building
@@ -65,16 +60,16 @@ Building requires MSBuild 15 or later (or Visual Studio 2017 or later).  You wil

You can build a OS specific binary

First build BadMedicine.csproj
First build SynthEHR.csproj
```
dotnet publish BadMedicine.csproj -r win-x64 --self-contained
dotnet publish SynthEHR.csproj -r win-x64 --self-contained
cd .\bin\Debug\netcoreapp2.2\win-x64\
```
## Direct to Database

You can generate data directly into a relational database (instead of onto disk).

To turn this mode on rename the file `BadMedicine.template.yaml` to `BadMedicine.yaml` and provide the connection strings to your database e.g.:
To turn this mode on rename the file `SynthEHR.template.yaml` to `SynthEHR.yaml` and provide the connection strings to your database e.g.:

```yaml
Database:
@@ -85,12 +80,12 @@ Database:
# Your DBMS provider ('MySql', 'PostgreSql','Oracle' or 'MicrosoftSQLServer')
DatabaseType: MicrosoftSQLServer
# Database to create/use on the server
DatabaseName: BadMedicineTestData
DatabaseName: SynthEHRTestData
```

## Library Usage

You can generate test data for your program yourself by referencing the [nuget package](https://www.nuget.org/packages/HIC.BadMedicine/):
You can generate test data for your program yourself by referencing the [nuget package](https://www.nuget.org/packages/HIC.SynthEHR/):

```csharp
//Seed the random generator if you want to always produce the same randomisation
@@ -113,7 +108,7 @@ Assert.IsNotNull(a.Condition1);

## What is Modelled?

Data generated by BadMedicine is driven by Aggregate distributions of real health data collected in Tayside (UK). This means that codes appear in data with the frequency that match real data. For example in the Hospital Admissions data we can see that ICD9 codes (denoted by dash) cease being recorded in ~1997 in favour of ICD10 codes and we can see the most common admission conditions are sensible:
Data generated by SynthEHR is driven by Aggregate distributions of real health data collected in Tayside (UK). This means that codes appear in data with the frequency that match real data. For example in the Hospital Admissions data we can see that ICD9 codes (denoted by dash) cease being recorded in ~1997 in favour of ICD10 codes and we can see the most common admission conditions are sensible:

![alt text](./Images/MainConditionDistribution.png)

8 changes: 4 additions & 4 deletions SharedAssemblyInfo.cs
@@ -1,12 +1,12 @@
using System.Reflection;

[assembly: AssemblyCompany("Health Informatics Centre, University of Dundee")]
[assembly: AssemblyProduct("Bad Medicine")]
[assembly: AssemblyProduct("SynthEHR")]
[assembly: AssemblyCopyright("Copyright (c) 2018 - 2024")]
[assembly: AssemblyTrademark("")]
[assembly: AssemblyCulture("")]

// These should be replaced with correct values by the release process
[assembly: AssemblyVersion("1.2.2")]
[assembly: AssemblyFileVersion("1.2.2")]
[assembly: AssemblyInformationalVersion("1.2.2")]
[assembly: AssemblyVersion("2.0.0")]
[assembly: AssemblyFileVersion("2.0.0")]
[assembly: AssemblyInformationalVersion("2.0.0")]
@@ -4,7 +4,7 @@
using System.Linq;
using System.Threading;

namespace BadMedicine;
namespace SynthEHR;

/// <summary>
/// Picks random object of Type T based on a specified probability for each element.
@@ -6,7 +6,7 @@

using System;

namespace BadMedicine.Datasets;
namespace SynthEHR.Datasets;

/// <summary>
/// Data class describing an appointment including a guid identifier
@@ -6,7 +6,7 @@

using System;

namespace BadMedicine.Datasets;
namespace SynthEHR.Datasets;

/// <include file='../../Datasets.doc.xml' path='Datasets/Biochemistry'/>
/// <inheritdoc/>
@@ -10,7 +10,7 @@
using System.Linq;
using MathNet.Numerics.Distributions;

namespace BadMedicine.Datasets;
namespace SynthEHR.Datasets;

/// <summary>
/// Data class representing a single row in <see cref="Biochemistry"/> (use if you want to use randomly generated data directly
@@ -7,7 +7,7 @@
using System;
using System.Linq;

namespace BadMedicine.Datasets;
namespace SynthEHR.Datasets;

/// <summary>
/// Test data based on the Scottish Vascular Labs CARSCAN database table
@@ -17,7 +17,7 @@
using CsvHelper.Configuration;
using MathNet.Numerics.Distributions;

namespace BadMedicine.Datasets;
namespace SynthEHR.Datasets;

/// <summary>
/// Base class for all randomly generated datasets. Handles generating random data types and writing
@@ -2,7 +2,7 @@
using System.Collections.Generic;
using System.Diagnostics.CodeAnalysis;

namespace BadMedicine.Datasets;
namespace SynthEHR.Datasets;

/// <summary>
/// Finds Types and Creates instances of <see cref="IDataGenerator"/> implementations
@@ -22,7 +22,7 @@ public readonly struct GeneratorType(Type type)
}

/// <summary>
/// List of generator types. Add yourself to this if outside BadMedicine.Core, to avoid reliance on reflection breaking AOT.
/// List of generator types. Add yourself to this if outside SynthEHR.Core, to avoid reliance on reflection breaking AOT.
/// </summary>
public static readonly List<GeneratorType> Generators =
[
@@ -6,7 +6,7 @@

using System;

namespace BadMedicine.Datasets;
namespace SynthEHR.Datasets;

/// <include file='../../Datasets.doc.xml' path='Datasets/Demography'/>
/// <inheritdoc/>
@@ -6,7 +6,7 @@

using System;

namespace BadMedicine.Datasets;
namespace SynthEHR.Datasets;

/// <summary>
/// Data model for a 5 line address in which some lines might be null
@@ -4,7 +4,7 @@
// RDMP is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
// You should have received a copy of the GNU General Public License along with RDMP. If not, see <https://www.gnu.org/licenses/>.

namespace BadMedicine.Datasets;
namespace SynthEHR.Datasets;

/// <summary>
/// Data model for a UK postcode
@@ -1,6 +1,6 @@
using System;

namespace BadMedicine.Datasets;
namespace SynthEHR.Datasets;

/// <summary>
/// Generates synthetic random data that is representative of patient hospital admissions data
@@ -8,10 +8,10 @@
using System.Collections.Generic;
using System.Data;

namespace BadMedicine.Datasets;
namespace SynthEHR.Datasets;

/// <summary>
/// Random record for when a <see cref="BadMedicine.Person"/> entered hospital. Basic logic is implemented here to ensure that <see cref="DischargeDate"/>
/// Random record for when a <see cref="SynthEHR.Person"/> entered hospital. Basic logic is implemented here to ensure that <see cref="DischargeDate"/>
/// is after <see cref="AdmissionDate"/> and that the person was alive at the time.
/// </summary>
public sealed class HospitalAdmissionsRecord
@@ -8,7 +8,7 @@
using System.Data;
using System.IO;

namespace BadMedicine.Datasets;
namespace SynthEHR.Datasets;

/// <summary>
/// Interface for classes which generate test data to disk.
@@ -1,6 +1,6 @@
using System;

namespace BadMedicine.Datasets;
namespace SynthEHR.Datasets;

/// <include file='../../Datasets.doc.xml' path='Datasets/Maternity'/>
/// <inheritdoc/>
@@ -1,7 +1,7 @@
using System;
using System.Data;

namespace BadMedicine.Datasets;
namespace SynthEHR.Datasets;

/// <summary>
/// Describes a single maternity event for a specific <see cref="Person"/>
@@ -57,7 +57,7 @@ public sealed class MaternityRecord
/// <summary>
/// Generates a new random biochemistry test.
/// </summary>
/// <param name="p">The person who is undergoing maternity activity. Should be Female and of a sufficient age that the operation could have taken place during their lifetime (see <see cref="Maternity.IsEligible(BadMedicine.Person)"/></param>
/// <param name="p">The person who is undergoing maternity activity. Should be Female and of a sufficient age that the operation could have taken place during their lifetime (see <see cref="Maternity.IsEligible(SynthEHR.Person)"/></param>
/// <param name="r"></param>
public MaternityRecord(Person p, Random r)
{
@@ -6,7 +6,7 @@

using System;

namespace BadMedicine.Datasets;
namespace SynthEHR.Datasets;

/// <include file='../../Datasets.doc.xml' path='Datasets/Prescribing'/>
/// <inheritdoc/>
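
Downstream code that references the old `BadMedicine` namespaces will need the same rename as the files in this diff. A minimal sketch of a mechanical migration follows, assuming a hypothetical consumer file `Example.cs` and plain `sed`. A blanket replace also touches strings and comments, so the result should be reviewed before committing.

```shell
# Hypothetical consumer source file that still uses the old namespaces.
workdir=$(mktemp -d)
cat > "$workdir/Example.cs" <<'EOF'
using BadMedicine;
using BadMedicine.Datasets;
EOF

# Replace every occurrence of the old root namespace with the new one.
# Writing to a temporary file and moving it avoids the non-portable `sed -i`.
sed 's/BadMedicine/SynthEHR/g' "$workdir/Example.cs" > "$workdir/Example.cs.new"
mv "$workdir/Example.cs.new" "$workdir/Example.cs"

migrated=$(cat "$workdir/Example.cs")
echo "$migrated"
rm -r "$workdir"
```

The same substitution covers both `namespace`/`using` lines and XML doc references such as `<see cref="BadMedicine.Person"/>`, which is the pattern of change seen in each hunk here.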