Commit

docs: #1294: Added homework instructions and updated debug client models changelog.
zabeen committed May 6, 2024
1 parent 1641ec6 commit 7b9a206
Showing 2 changed files with 50 additions and 3 deletions.
6 changes: 6 additions & 0 deletions Atlas.Debug.Client.Models/CHANGELOG_DebugClientModels.md
@@ -6,6 +6,12 @@ This package contains client models utilised by the Atlas debug endpoints.
## Changelog
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/) and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

### 2.2.0
* Created new namespace `Atlas.Debug.Client.Models.MatchPrediction` and moved various models related to debugging match prediction here.
* Extended existing model, `GenotypeImputationResponse`, with new prop, `GenotypeCount`.
* Changed type of prop `MatchedGenotypePairs` on existing model `GenotypeMatcherResponse` from `string` to `IEnumerable<string>`.
* Instead of one, potentially very long, formatted string, the matched genotype pairs are now returned as a collection of formatted strings, one for each matching patient-donor genotype pair.
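
For consumers of the debug endpoint, the type change means `MatchedGenotypePairs` should now be read as a collection rather than a single string. The following is a minimal Python sketch, assuming the response is deserialised from JSON with the property name exactly as written above. The helper name `matched_pairs` and the sample values are hypothetical, and the internal layout of the pre-2.2.0 string is not assumed, so it is kept whole:

```python
import json


def matched_pairs(response: dict) -> list[str]:
    """Return matched genotype pairs as a list, whichever model version produced them."""
    value = response.get("MatchedGenotypePairs")
    if value is None:
        return []
    if isinstance(value, str):
        # Pre-2.2.0 shape: one formatted string. Kept whole, since its layout is not assumed here.
        return [value]
    # 2.2.0 shape: a collection of formatted strings, one per matching patient-donor genotype pair.
    return list(value)


# Placeholder payloads (hypothetical); real formatted strings will look different.
old_style = json.loads('{"MatchedGenotypePairs": "pair-1 pair-2"}')
new_style = json.loads('{"MatchedGenotypePairs": ["pair-1", "pair-2"]}')
```

Callers that previously split or searched the single string would instead iterate over the returned list.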

### 2.1.0
* Creation of new library, `Atlas.Debug.Client.Models`, that contains models used in debug endpoints.
* Moved following models to new project:
47 changes: 44 additions & 3 deletions README_ManualTesting.md
@@ -145,7 +145,7 @@ Validation of the match prediction algorithm against either [exercise 3 of the W

#### Functions
After starting up the Functions app:
1. Invoke the `ImportSubjects` function, submitting the locations of the patient and donor text files.
1. Invoke the `BothExercises_ImportSubjects` function, submitting the locations of the patient and donor text files.
2. Send match prediction requests by invoking either the `Exercise3_SendMatchPredictionRequests` function or the `Exercise3_ResumeMatchPredictionRequests` function.
   - Before starting, set the request URL in the function settings and manually create a new subscription to the `match-prediction-results` topic on the service bus.
   - At present, one request is sent per patient, with a subset of donors included as a batched request, to reduce the total number of HTTP calls. Batch size is configurable via function settings.
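
The batching described in step 2 can be sketched in Python as below. This is an illustration of one reading of that step (each patient's donors are split into fixed-size batches, and each batch becomes one request), not the validation app's actual implementation. The function names and the `patientId`/`donorIds` field names are hypothetical:

```python
from typing import Iterator, Sequence


def batch_donors(donor_ids: Sequence[str], batch_size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of donor IDs, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(donor_ids), batch_size):
        yield list(donor_ids[start:start + batch_size])


def build_requests(patient_id: str, donor_ids: Sequence[str], batch_size: int) -> list[dict]:
    """Build one request body per donor batch for a single patient (hypothetical field names)."""
    return [
        {"patientId": patient_id, "donorIds": batch}
        for batch in batch_donors(donor_ids, batch_size)
    ]
```

With five donors and a batch size of two, a single patient would produce three request bodies instead of five single-donor calls.
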
@@ -180,10 +180,14 @@ The end result should be:
- The folder `\MiscTestingAndDebuggingResources\ManualTesting\MatchPredictionValidation\exercise4\` contains a script that converts the delimited HF set files to the required JSON schema.
- The script includes a step to convert a subset of HLA values to their equivalent small g group and merge any resulting haplotype duplicates.

#### Import Subject Data
- Launch the local functions app, `ManualTesting / Atlas.MatchPrediction.Test.Validation`, and call the http function, `BothExercises_ImportSubjects`, submitting the locations of the MV4 patient and donor text files.
- Check [notes in this section](#subject-files) to ensure the subject files are in the correct format.

#### Export Test Donors to Atlas
Test donors need to be exported to the Atlas instance before search requests can be run.

- Launch the local functions app and call the http function: `Exercise4_1_PrepareAtlasDonorStores`.
- Launch the local functions app, `ManualTesting / Atlas.MatchPrediction.Test.Validation`, and call the http function: `Exercise4_1_PrepareAtlasDonorStores`.
- The request will take several minutes to complete as it involves the following steps:
1. Wipe the remote donor import donor store.
2. Re-populate it with test donors.
@@ -225,12 +229,49 @@ Test donors need to be exported to the Atlas instance before search requests can
- See Swagger UI for request model.
- The function will publish success notifications for the search request IDs listed in the request, and thereby trigger the `Exercise4_FetchSearchResults` function.

### Report Results
#### Report Results
- Once all the searches have completed and results have been downloaded, use the following SQL queries in `\MiscTestingAndDebuggingResources\ManualTesting\MatchPredictionValidation\`:
  - `SQL_IncompleteOrFailedSearches.sql` - identifies failed searches or searches whose results have not yet been retrieved.
  - `SQL_ReportAllResults.sql` - will select out results for successful searches in the required format by `SearchSet` ids.
  - `SQL_Unrepresented.sql` - will select out any patients or donors that could not be explained by their assigned HF set.

#### Homework
The goal of the homework - and the MV4 exercise at large - is to better understand how and why the algorithms differ in both matching and match prediction.
PDPs of interest were disseminated after comparison of result sets generated by the different participating algorithms, including Atlas.
The PDPs were split into several CSV files, with format: `PatientId,DonorId`, without a header row.
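
As an illustration of the file format only, a headerless `PatientId,DonorId` file could be read with a short Python sketch such as the one below. The `Pdp` type, the `read_pdps` helper and the sample IDs are hypothetical and are not part of the validation project:

```python
import csv
import io
from typing import Iterable, NamedTuple


class Pdp(NamedTuple):
    patient_id: str
    donor_id: str


def read_pdps(lines: Iterable[str]) -> list[Pdp]:
    """Parse headerless `PatientId,DonorId` rows into patient-donor pairs."""
    pdps = []
    for row_number, row in enumerate(csv.reader(lines), start=1):
        if not row:
            continue  # Skip blank lines.
        if len(row) != 2:
            raise ValueError(f"Row {row_number}: expected 2 columns, got {len(row)}")
        pdps.append(Pdp(row[0].strip(), row[1].strip()))
    return pdps


# Made-up IDs; the first row is data because the files have no header row.
sample = io.StringIO("P001,D042\nP002,D007\n")
pdps = read_pdps(sample)
```

Because there is no header row, the first line is treated as a PDP, not as column names.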

##### Prepare Atlas
- A dedicated test Atlas instance is not needed to process the homework, as we do not need to import any test donors and/or run test searches. We can instead use the debug functions of any non-prod instance, e.g., DEV, UAT.
- First, check that the HMD of the chosen Atlas instance has v3.52.0 of the HLA nomenclature. If not, follow [these instructions](/README_Integration.md#hla-metadata) to recreate the HMD at this version.
- Thereafter, upload the MV4 HF set files to Atlas and monitor the `notifications`/`alerts` topics to ensure all the files were imported successfully.

##### Prepare Local
- Results from homework processing will be stored in the validation db.
- The default value for the validation db connection string is `local` on both the functions app (`Atlas.MatchPrediction.Test.Validation` > `local.settings.json` > `ConnectionStrings:MatchPredictionValidation:Sql`) and the EF Data project (`Atlas.MatchPrediction.Test.Validation.Data` > `appsettings.json` > `ConnectionStrings:Sql`).
- Run EF migrations on the `.Data` project to ensure all the Homework tables are set up.
- In `Atlas.MatchPrediction.Test.Validation` > `local.settings.json`, set `Homework:MatchingGenotypesRequestUrl` to the URL of the debug function, `<ENV>-ATLAS-MATCH-PREDICTION-FUNCTION > MatchPatientDonorGenotypes`, where `<ENV>` is the Atlas instance that contains the MV4 HF sets.
- If your local instance no longer contains MV4 patient and donor data (from running the original MV4 set), you will need to re-import it, [as described above](#import-subject-data).

##### Import Homework Files
- Launch the `Atlas.MatchPrediction.Test.Validation` app, then invoke the http function: `Exercise4_3_CreateNewHomeworkSets`. See Swagger for the request model, which includes the location of the homework files.
- A successful request returns an array of IDs, one for each imported homework set.
- Copy these IDs, as they serve as input to the next step.
- The SQL query `SQL_Homework_1_HomeworkCheck.sql`, in `\MiscTestingAndDebuggingResources\ManualTesting\MatchPredictionValidation\exercise4\`, can be used to check the import of the homework files.

##### Process the PDPs
- Launch the `Atlas.MatchPrediction.Test.Validation` app, then invoke the http function, `Exercise4_4_StartOrContinueHomeworkSets`, pasting the IDs of the homework sets you wish to process into the request body.
- It is recommended to start this second function without debugging (and then close Visual Studio) to avoid the debug console running out of memory, as the result set from `MatchPatientDonorGenotypes` may be very large.
- Use `SQL_Homework_1_HomeworkCheck` to monitor the progress of PDP processing.

##### Reporting the Results
The following SQL queries in `\MiscTestingAndDebuggingResources\ManualTesting\MatchPredictionValidation\exercise4\` can be used to report/investigate results:
- `SQL_Homework_2_SubjectsWithMissingHla.sql` - selects patients and donors that have missing required HLA typings, and so would not have been included in the results of the original MV4 exercise.
- `SQL_Homework_3_ImputationResults.sql` - selects info for a given PDP:
  - Original HLA phenotype, frequency metadata, and imputation summary: patient, donor.
  - List of potential genotypes with likelihoods: patient, donor.
  - List of patient/donor genotype combinations, with match counts and likelihoods.
  - Match probabilities for total number of mismatches, and locus mismatches.

<br>

## Match Prediction Verification using simulated data
