Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PIH dictionary import #45

Closed
paynejd opened this issue Feb 28, 2018 · 100 comments
Closed

PIH dictionary import #45

paynejd opened this issue Feb 28, 2018 · 100 comments
Assignees
Labels
enhancement New feature or request openmrs

Comments

@paynejd
Copy link
Member

paynejd commented Feb 28, 2018

No description provided.

@maurya maurya added the enhancement New feature or request label Oct 24, 2019
@lnball
Copy link

lnball commented Mar 24, 2020

FYI @karuhanga

The PIHEMR dictionary has been cleaned up and in the OpenMRS dropbox. When we are ready to try again, I can recreate the latest version. There are some minor changes.

@paynejd
Copy link
Member Author

paynejd commented Mar 24, 2020 via email

@paynejd paynejd assigned bmamlin and paynejd and unassigned rkorytkowski Apr 26, 2021
@paynejd paynejd added this to the Sprint 10 milestone May 7, 2021
@paynejd paynejd modified the milestones: Sprint 10, Sprint 11 May 21, 2021
@jamlung-ri
Copy link
Member

Has been validated against OpenMRS schema and can be imported. @bmamlin has been doing some importing on Staging. If this work is done, the ticket can be closed.

@paynejd paynejd modified the milestones: Sprint 11, Sprint 12 Jun 4, 2021
@paynejd paynejd modified the milestones: Sprint 12, Sprint 13 Jun 18, 2021
@jamlung-ri jamlung-ri modified the milestones: Sprint 13, Sprint 14 Jul 2, 2021
@jamlung-ri
Copy link
Member

Moving to the next sprint since there are discrepancies remaining

@paynejd paynejd removed their assignment Jul 13, 2021
@paynejd paynejd modified the milestones: Sprint 14, Sprint 15 Jul 13, 2021
@bmamlin
Copy link

bmamlin commented Jul 30, 2021

The PIH dictionary was imported onto staging a couple months ago. This uncovered some issues with custom validation schema (#732), an issue with encoding of URLs for codes (#807), and the inability to delete a source if any mappings exist from other sources.

PIH-temp was imported a couple of weeks ago using the OpenMRS Custom Validation. After this import completed and an export of HEAD was downloaded, a few differences were discovered:

  • One missing concept: "Dextrose 5%" (3779) on OCL.
    • This is one of five retired concepts. The other retired concepts (e.g., 802) were imported successfully.
    • I thought the percent sign (%) might be an issue, but there are 30 concepts with percent signs in their names.
  • Two mappings that didn't match between import and OCL: 3779 → "Dextrose 5%" and 3779 → 7896. Both are mappings from the concept that didn't import (3779). I think these didn't match between the import and OCL export because in these mappings on OCL the from_concept_url isn't populated (maybe because there was an error importing concept 3779?).
  • Six mappings for the concept "Time units" (466) are missing on OCL. I'm not yet sure why these failed to import.
    • 466 –Q-AND-A458
    • 466 –Q-AND-A464
    • 466 –CONCEPT-SET460
    • 466 –CONCEPT-SET461
    • 466 –CONCEPT-SET →462
    • 466 –CONCEPT-SET463
Click for full diff
{
  "missing_concepts":[
    {"retired": true, "datatype": "N/A", "type": "Concept", "concept_class": "Misc", "source": "PIH-temp", "extras": {}, "names": [{"locale": "fr", "external_id": "3e754086-55c0-4862-8a50-26e0b6f8f512", "locale_preferred": true, "name": "Dextrose 5% fluide", "name_type": "FULLY_SPECIFIED"}, {"locale": "en", "external_id": "5c702054-6a5c-11e2-b6f9-aa00f871a3e1", "locale_preferred": true, "name": "Dextrose 5%", "name_type": "FULLY_SPECIFIED"}], "owner": "PIH", "owner_type": "Organization", "external_id": "5c6b1834-6a5c-11e2-b6f9-aa00f871a3e1", "id": "3779", "descriptions": []}
  ],
  "extra_concepts": [],
  "missing_mappings": [
    {"retired": false, "from_concept_url": "/orgs/PIH/sources/PIH-temp/concepts/466/", "external_id": "0b24a340-15f5-102d-96e4-000c29c2a5d7", "source": "PIH-temp", "owner": "PIH", "map_type": "Q-AND-A", "owner_type": "Organization", "type": "Mapping", "to_concept_url": "/orgs/PIH/sources/PIH-temp/concepts/458/"},
    {"retired": false, "from_concept_url": "/orgs/PIH/sources/PIH-temp/concepts/466/", "external_id": "5013CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC", "source": "PIH-temp", "owner": "PIH", "map_type": "Q-AND-A", "owner_type": "Organization", "type": "Mapping", "to_concept_url": "/orgs/PIH/sources/PIH-temp/concepts/464/"},
    {"retired": false, "from_concept_url": "/orgs/PIH/sources/PIH-temp/concepts/466/", "external_id": "1865AEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE", "source": "PIH-temp", "owner": "PIH", "map_type": "CONCEPT-SET", "owner_type": "Organization", "type": "Mapping", "to_concept_url": "/orgs/PIH/sources/PIH-temp/concepts/461/"},
    {"retired": false, "from_concept_url": "/orgs/PIH/sources/PIH-temp/concepts/466/", "external_id": "1862AEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE", "source": "PIH-temp", "owner": "PIH", "map_type": "CONCEPT-SET", "owner_type": "Organization", "type": "Mapping", "to_concept_url": "/orgs/PIH/sources/PIH-temp/concepts/460/"},
    {"retired": false, "from_concept_url": "/orgs/PIH/sources/PIH-temp/concepts/466/", "external_id": "1860AEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE", "source": "PIH-temp", "owner": "PIH", "map_type": "CONCEPT-SET", "owner_type": "Organization", "type": "Mapping", "to_concept_url": "/orgs/PIH/sources/PIH-temp/concepts/463/"},
    {"retired": false, "from_concept_url": "/orgs/PIH/sources/PIH-temp/concepts/466/", "external_id": "1864AEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE", "source": "PIH-temp", "owner": "PIH", "map_type": "CONCEPT-SET", "owner_type": "Organization", "type": "Mapping", "to_concept_url": "/orgs/PIH/sources/PIH-temp/concepts/462/"},
    {"retired": false, "from_concept_url": "/orgs/PIH/sources/PIH-temp/concepts/3779/", "external_id": "5c703940-6a5c-11e2-b6f9-aa00f871a3e1", "source": "PIH-temp", "owner": "PIH", "map_type": "SAME-AS", "owner_type": "Organization", "type": "Mapping", "to_concept_url": "/orgs/PIH/sources/PIH-temp/concepts/Dextrose+5%25/"},
    {"retired": false, "from_concept_url": "/orgs/PIH/sources/PIH-temp/concepts/3779/", "external_id": "5c706ce4-6a5c-11e2-b6f9-aa00f871a3e1", "source": "PIH-temp", "owner": "PIH", "map_type": "SAME-AS", "owner_type": "Organization", "type": "Mapping", "to_concept_url": "/orgs/PIH/sources/PIH-temp/concepts/7896/"}
  ],
  "extra_mappings": [
    {"external_id": "5c703940-6a5c-11e2-b6f9-aa00f871a3e1", "retired": false, "map_type": "SAME-AS", "source": "PIH-temp", "owner": "PIH", "owner_type": "Organization", "from_concept_code": "3779", "from_concept_name": null, "from_concept_url": "", "to_concept_code": "Dextrose 5%", "to_concept_name": null, "to_concept_url": null, "from_source_owner": "PIH", "from_source_owner_type": "Organization", "from_source_url": "/orgs/PIH/sources/PIH-temp/", "from_source_name": "PIH-temp", "to_source_owner": "PIH", "to_source_owner_type": "Organization", "to_source_url": "/orgs/PIH/sources/PIH-temp/", "to_source_name": "PIH-temp", "url": "/orgs/PIH/sources/PIH-temp/mappings/2557764/", "version": "2557769", "id": "2557764", "versioned_object_id": 2557764, "versioned_object_url": "/orgs/PIH/sources/PIH-temp/mappings/2557764/", "is_latest_version": true, "update_comment": null, "version_url": "/orgs/PIH/sources/PIH-temp/mappings/2557764/2557769/", "uuid": "2557769", "version_created_on": "2021-07-25T15:32:26.538712Z", "from_source_version": null, "to_source_version": null, "extras": {}, "type": "Mapping", "created_on": "2021-07-25T15:32:26.538712Z", "updated_on": "2021-07-25T15:32:26.548985Z", "created_by": "ocladmin", "updated_by": "ocladmin", "internal_reference_id": "2557769"},
    {"external_id": "5c706ce4-6a5c-11e2-b6f9-aa00f871a3e1", "retired": false, "map_type": "SAME-AS", "source": "PIH-temp", "owner": "PIH", "owner_type": "Organization", "from_concept_code": "3779", "from_concept_name": null, "from_concept_url": "", "to_concept_code": "7896", "to_concept_name": null, "to_concept_url": null, "from_source_owner": "PIH", "from_source_owner_type": "Organization", "from_source_url": "/orgs/PIH/sources/PIH-temp/", "from_source_name": "PIH-temp", "to_source_owner": "PIH", "to_source_owner_type": "Organization", "to_source_url": "/orgs/PIH/sources/PIH-temp/", "to_source_name": "PIH-temp", "url": "/orgs/PIH/sources/PIH-temp/mappings/2554070/", "version": "2554073", "id": "2554070", "versioned_object_id": 2554070, "versioned_object_url": "/orgs/PIH/sources/PIH-temp/mappings/2554070/", "is_latest_version": true, "update_comment": null, "version_url": "/orgs/PIH/sources/PIH-temp/mappings/2554070/2554073/", "uuid": "2554073", "version_created_on": "2021-07-25T15:30:53.725502Z", "from_source_version": null, "to_source_version": null, "extras": {}, "type": "Mapping", "created_on": "2021-07-25T15:30:53.725502Z", "updated_on": "2021-07-25T15:30:53.739779Z", "created_by": "ocladmin", "updated_by": "ocladmin", "internal_reference_id": "2554073"}
  ]
}

@snyaggarwal
Copy link
Contributor

@bmamlin The concept 3779 failed to import because of locales -- #732 (comment)

I will check the mappings.

@bmamlin
Copy link

bmamlin commented Jul 30, 2021

@bmamlin The concept 3779 failed to import because of locales -- #732 (comment)

Thanks @snyaggarwal. That makes sense. I'd like to confirm (with Andy & Jon) that this name uniqueness constraint should apply across retired concepts (e.g., if I want to retire a concept and replace it with an improved concept with the same name, I'd be forced to rename the old retired concept before I could add the new one).

@jamlung-ri jamlung-ri modified the milestones: Sprint 15, Sprint 16 Jul 30, 2021
@lnball
Copy link

lnball commented Jul 30, 2021

@bmamlin I think the issue is that the concept "Dextrose 5% (PIH:7896)" is retired but the names are not voided. I guess this is correct behavior but OCL doesn't like it. It conflicts with the 'en' synonym of "Dextrose 5% (CIEL:161250)". There could be data for the retired concept in various deployments so I cannot delete the retired one.

I can write liquibase to migrate the data, update any forms that used the old concept, then delete the old concept. Let me know if that would be helpful.

@bmamlin
Copy link

bmamlin commented Aug 4, 2021

@snyaggarwal I confirmed we want to ignore retired concepts (and ignore retired concept names if OCL allows for names to be retired) when checking for duplicates.

@snyaggarwal
Copy link
Contributor

@bmamlin ok that make sense. I will update the validator.

@bmamlin
Copy link

bmamlin commented Aug 4, 2021

@snyaggarwal did we figure out the issue with the 6 mappings that didn't go in? Is OCL preventing multiple mappings between the same two concepts?

@lnball my hunch is "Time Units" (PIH 6412) is an oddball concept that is both coded (with answers) and has concept set members (even though it's not a concept set based on its datatype).

@mseaton
Copy link

mseaton commented Dec 15, 2021

@bmamlin - I had a chance to do some testing with this today and things are looking much, much better. There is still an issue with around 12 concepts not making it into the final collection somehow. These are mostly but not entirely the same 12 concepts as I reported earlier in this thread. When I import the OCL package zip into a clean DB, I get errors trying to import mappings that refer to these concepts, and the concepts don't exist. And all of these concepts should exist. They are as follows:

As you can see, all of these concepts were successfully imported into the PIH source. But for whatever reason they are not part of the PIH collection. So I'm not really sure this is an issue with the import tool, or a broader OCL bug, or something else - thoughts?

Note, I also find it strange that in OCL, it says that the PIH Collection contains 5942 concepts, yet the database that we provided as a source contains 5960 concepts, which is then only 5948 concepts due to the 12 that are not imported that are listed above. Any idea why OCL lists the collection as 5942 and not 5948? See version 1.0.0 that I created here: https://app.staging.openconceptlab.org/#/orgs/PIH/collections/PIH/versions

@snyaggarwal
Copy link
Contributor

@mseaton OCL 5942 are active concepts and 6 are retired concepts. If you include retired ones the count is correct 5948

@bmamlin
Copy link

bmamlin commented Dec 20, 2021

As you can see, all of these concepts were successfully imported into the PIH source. But for whatever reason they are not part of the PIH collection. So I'm not really sure this is an issue with the import tool, or a broader OCL bug, or something else - thoughts?

Interesting. I created the PIH collection using:

PUT /orgs/PIH/collections/PIH/references/
{
  "data": {
    "expressions": [
      "/org/PIH/sources/PIH/concepts/*",
      "/org/PIH/sources/PIH/mappings/*"
    ]
  }
}

which should include all concepts and mappings from the PIH source within the collection. As a first step, I'll try deleting the PIH collection and recreating it to see if the problem is reproducible.

@bmamlin
Copy link

bmamlin commented Dec 20, 2021

Well, it looks like re-creating the PIH collection brought us back to the same place... so it's recreatable. Interestingly, it looks like, while all 12 are missing from the collection, 5 of the 12 concepts exist when fetched directly via the API. I'm not seeing an obvious pattern.

Source Collection in App API
Mobility Missing Exists
Creatinine (umol/L) Missing Exists
Total cholesterol Missing Missing
ABC Missing Exists
Urinary Cast, Hyaline Missing Missing
Pregnancy Missing Exists
Green Missing Missing
Negative Missing Exists
Protein Missing Missing
Case status Missing Missing
Urinalysis Missing Missing
Glucose tolerance test Missing Missing

@snyaggarwal, any idea why the 12 concepts above aren't being seen in the PIH collection on staging? I created them using the API call mentioned above to add all concepts & mappings to the collection (twice now) and these 12 are consistently misbehaving. I can find five of them as concepts through the collection using the API, the other seven are missing, and all 12 are missing from the collection in the app.

@lnball
Copy link

lnball commented Dec 20, 2021

Not obvious to me. You have questions, drug, misc, test. Assorted mappings. Thanks for the sleuthing @bmamlin

@snyaggarwal
Copy link
Contributor

@bmamlin Below are the reported failures for PIH collection add_references task:

{
"task-id": "677ca08f-3238-47ec-95df-2f89587129b9",
"state": "SUCCESS",
"result": {
"/orgs/PIH/sources/PIH/concepts/8887/5561799/": [
"Concept preferred name must be unique for same collection and locale."
],
"/orgs/PIH/sources/PIH/concepts/302/5563616/": [
"Concept fully specified name must be unique for same collection and locale."
],
"/orgs/PIH/sources/PIH/concepts/12299/5563483/": [
"Concept fully specified name must be unique for same collection and locale."
],
"/orgs/PIH/sources/PIH/concepts/7085/5564293/": [
"Concept fully specified name must be unique for same collection and locale."
],
"/orgs/PIH/sources/PIH/concepts/12060/5572237/": [
"Concept fully specified name must be unique for same collection and locale."
],
"/orgs/PIH/sources/PIH/concepts/7896/5572611/": [
"Concept fully specified name must be unique for same collection and locale."
],
"/orgs/PIH/sources/PIH/concepts/12270/5563020/": [
"Concept fully specified name must be unique for same collection and locale."
],
"/orgs/PIH/sources/PIH/concepts/1006/5563457/": [
"Concept fully specified name must be unique for same collection and locale."
],
"/orgs/PIH/sources/PIH/concepts/3188/5572841/": [
"Concept fully specified name must be unique for same collection and locale."
],
"/orgs/PIH/sources/PIH/concepts/11967/5563402/": [
"Concept fully specified name must be unique for same collection and locale."
],
"/orgs/PIH/sources/PIH/concepts/12659/5573081/": [
"Concept fully specified name must be unique for same collection and locale."
],
"/orgs/PIH/sources/PIH/concepts/10133/5569173/": [
"Concept fully specified name must be unique for same collection and locale."
],
"/orgs/PIH/sources/PIH/concepts/11689/5563408/": [
"Concept preferred name must be unique for same collection and locale."
]
}
}

@bmamlin
Copy link

bmamlin commented Dec 21, 2021

@snyaggarwal, thanks. That's interesting, since the PIH source also has the OpenMRS custom validation schema. If these were duplicates, I wouldn't expect them to have been allowed into the source. Do collections and sources use the same custom validation logic? If so, how could entries in a collection pass validation in the source and fail in the collection?

@snyaggarwal
Copy link
Contributor

@bmamlin Ya the OpenMRS validation for collection is not exactly the same -- here is the code

@bmamlin
Copy link

bmamlin commented Dec 21, 2021

Thanks @snyaggarwal. I think we're getting closer to the issue. I took one of the examples from your list of errors: the concept "Green". The PIH source contains Green (8887) and Green color (12572). The latter has a short name "Green". This is allowed, since the duplicate constraint for fully specified names and preferred names should only be checking within fully specified names and preferred names of the same locale, respectively. In other words, if one concept has a preferred name of "Green" in the en locale, then no other concept can also have a preferred name of "Green" in the en locale; however, it's okay for another concept to have a short name or a non-preferred synonym of "Green" in the en locale.

I believe the fix for this with source validation might be in this line, scoping the search to the specific name type. Can we do the same for collections (checking for duplicate FSNs should only look across FSNs in the same locale and checking for duplicate preferred names should only look across preferred names in the same locale)? I believe this will solve our problem.

@snyaggarwal
Copy link
Contributor

Yep working on it.

snyaggarwal added a commit to OpenConceptLab/oclapi2 that referenced this issue Dec 21, 2021
snyaggarwal added a commit to OpenConceptLab/oclapi2 that referenced this issue Dec 21, 2021
snyaggarwal added a commit to OpenConceptLab/oclapi2 that referenced this issue Dec 21, 2021
@snyaggarwal
Copy link
Contributor

@bmamlin This is deployed on Staging now.

@mogoodrich
Copy link

Awesome, thanks so much @bmamlin @snyaggarwal @lnball

@bmamlin
Copy link

bmamlin commented Dec 22, 2021

Thanks @snyaggarwal. I've deleted the PIH collection on staging and am recreating it now. 🤞

@bmamlin
Copy link

bmamlin commented Dec 22, 2021

@snyaggarwal we seem to be much closer. The concept & mapping counts are now exact matches between PIH collection & source on Staging 🥳; however, some of the concepts don't show up in the UI as expected. For each of the following seven concepts, they seem to be missing from the collection (they don't show up as expected in the UI), yet they can be retrieved via the API:

Source Collection in App API
Total cholesterol Missing Exists
Urinary Cast, Hyaline Missing Exists
Green Missing Exists
Protein Missing Exists
Case status Missing Exists
Urinalysis Missing Exists
Glucose tolerance test Missing Exists

For example, the API returns /orgs/PIH/collections/PIH/concepts/8887/ with "id": "8887" but a search within the PIH collection for an exact match to 8887 comes up empty. Any idea what might explain that behavior?

BTW, I tried to get a full report of the task from /importers/bulk-import/?task=a53bc8dd-e9b7-4225-884f-d77fc82bbc27&result=json, but only get this back instead of a full list of successes & failures as expected:

{
    "queue": "default",
    "state": "SUCCESS",
    "task": "a53bc8dd-e9b7-4225-884f-d77fc82bbc27",
    "username": ""
}

@snyaggarwal
Copy link
Contributor

@bmamlin its may be because the indexing is still happening.

@snyaggarwal
Copy link
Contributor

@bmamlin I can see 8887 now. I would wait for indexing to finish. I will update here when that gets done.

@bmamlin
Copy link

bmamlin commented Dec 22, 2021

Ah. Awesome! It looks like it was an indexing delay after all. Everything is showing up as expected now!

@bmamlin
Copy link

bmamlin commented Dec 22, 2021

@mseaton @mogoodrich take another look at PIH on Staging. The dictionary (source & collection) are looking pretty good. Throw some rocks at it and see if it looks good. If it does, we can consider the import process as "working" and we can start tackling other issues (source handling, sync, etc.).

@lnball
Copy link

lnball commented Dec 22, 2021

What is this? The not-link symbol says "Target concept is not defined in OCL".

image

@bmamlin
Copy link

bmamlin commented Jan 3, 2022

What is this? The not-link symbol says "Target concept is not defined in OCL".

OCL term browser automatically creates a link for mappings to concepts that exist in OCL. Mappings to things that don't exist in OCL (like the code "Triage green") can't be linked. So, term browser is just saying "I can't take you to this concept." Showing the inability to link makes a little more sense when the mapping is to a code in a standard (like LOINC or ICD) or a concept in another implementation's dictionary that doesn't happen to be loaded into OCL rather than a mapping like this to an free text code PIH is using for convenience.

@mseaton
Copy link

mseaton commented Jan 7, 2022

@mseaton @mogoodrich take another look at PIH on Staging

Success!! Things are looking good @bmamlin - thanks! The number of concepts match exactly, and my comparisons only find a few issues, all of which are known and generally considered low priority or a non-priority. Those are that we lose concept version, concept retire reason, and voided concept names in the process of moving our concepts to OCL.

So yes, I think we can consider this "working" and that we have successfully imported the PIH dictionary into OCL and can get it back out again and imported into a new OpenMRS instance with the OCL subscription module.

Tackling the issue of source handling sounds like a good next area to focus on. Thanks again for all of the help and work on this.

@mseaton
Copy link

mseaton commented Jan 7, 2022

Just noticed we are coming up on the 4 year anniversery for this ticket - great to see it getting so close to resolution!

@bmamlin
Copy link

bmamlin commented Jan 14, 2022

Success!! Things are looking good @bmamlin - thanks!

Thanks so much, @mseaton. Given this success, I'm going to close this ticket to acknowledge the fundamental work of being able to import PIH into OCL has been successfully completed. We can track remaining steps of getting to a place where PIH can use OCL to manage their dictionaries through ongoing discussions in Talk and other tickets. Done in less than 4 years! 😄

@bmamlin bmamlin closed this as completed Jan 14, 2022
@paynejd
Copy link
Member Author

paynejd commented Jan 14, 2022

Wahooo! Congratulations everyone- it's been a long time coming ;)

@lnball
Copy link

lnball commented Jan 14, 2022

4 years! A new record???

Thanks so much, team.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request openmrs
Projects
None yet
Development

No branches or pull requests