Same patient id assigned to multiple patients in a same instance #7670
Comments
Checking the instance, I'm noticing that the two documents were created in |
So many things have changed in the code that affects how transitions work that it would be impossible to know what the problem was without knowing exactly which version the server was running at the time the duplicate id documents were created. |
Hi @dianabarsan , per this comment, upgrade from 2.13.x to 3.7 was completed on December 5, 2019. We saw this document |
It was the weekend and I was wrong; I was able to find the patient documents for both duplicates. For the initial example we have:
For the second example, we have:
In both these examples, the "later" report ended up creating the patient. |
Is it possible that the initially created contacts were accidentally deleted? |
I'm asking because I found a couple of issues that talk about deleting training data on this instance. |
Hi @dianabarsan, I can find the older documents and patients on the production instance as well, so I doubt that the previous id was reassigned because the old one got deleted. Even if there were logic to reuse ids like that, wouldn't that be a problem? |
Then we have 2 problems here:
I'm not so sure, since I found issues about training data being mixed with real data, and even being deleted by mistake. |
I think the issue you linked is for another instance, and that might have happened for that particular instance. But I don't believe the same happened for the 7 different project instances I listed here. I also searched for issues similar to the ones you linked in the comment above. |
Looking at these projects can we:
|
Hi @dianabarsan, I've shared the requested information with you. Apart from this case, most of the issues appear to have occurred on 2.x versions. |
I think we have some evidence about why this is happening. Both API and Sentinel generate a list of patient_ids at the beginning and check for uniqueness, and don't re-check when attaching the patient_id to the new contact. This means that API and Sentinel can have the same patient_id in their "cache" (because at the moment of generating this cache, the patient_id is not used) and then as contacts are created, produce collisions. Thanks @binokaryg and @1yuv for the investigation! |
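The collision described above can be reproduced with a small simulation. This is only a sketch under stated assumptions, not the actual API or Sentinel code: `CachedIdGenerator`, `count_duplicates`, and the shared `used_ids` set standing in for the database are all hypothetical names, and the ID space is deliberately tiny (2 digits, caches of 60) so that the two caches are guaranteed to overlap.

```python
import random


class CachedIdGenerator:
    """Hypothetical stand-in for a process that pre-generates a batch of IDs."""

    def __init__(self, used_ids, rng, batch_size, digits):
        self.used_ids = used_ids  # shared set standing in for the database
        self.rng = rng
        self.batch_size = batch_size
        self.digits = digits
        self.cache = []

    def refill(self):
        # Uniqueness is checked here, at cache-generation time only.
        while len(self.cache) < self.batch_size:
            candidate = str(self.rng.randrange(10 ** self.digits)).zfill(self.digits)
            if candidate not in self.used_ids and candidate not in self.cache:
                self.cache.append(candidate)

    def next_id(self):
        if not self.cache:
            self.refill()
        # No re-check against used_ids when the ID is handed out.
        return self.cache.pop(0)


def count_duplicates(seed=1, batch_size=60, digits=2):
    used_ids = set()
    api = CachedIdGenerator(used_ids, random.Random(seed), batch_size, digits)
    sentinel = CachedIdGenerator(used_ids, random.Random(seed + 1), batch_size, digits)

    # Both processes fill their caches while no patient_id is in use yet.
    api.refill()
    sentinel.refill()

    assigned = []
    for _ in range(batch_size):
        for generator in (api, sentinel):
            patient_id = generator.next_id()
            used_ids.add(patient_id)
            assigned.append(patient_id)

    # Number of IDs that were handed out more than once.
    return len(assigned) - len(set(assigned))
```

With only 100 possible IDs and two caches of 60, at least 20 IDs must appear in both caches, so `count_duplicates()` always reports at least 20 duplicates. On a real instance the ID space is far larger and overlaps are much rarer, but the mechanism is the same.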
Hi @dianabarsan, we saw a similar problem in a 4.x deployment (medic/config-moh-nepal#1034) for the reason that you mentioned here. Is it possible to make it configurable so that only one of them (API or Sentinel) generates and attaches IDs? Or are there any other options for us to avoid collisions? |
I think we should allow both API and Sentinel to generate patient IDs to allow for hybrid SMS and smartphone deployments and instead change the algorithm somehow to not generate conflicts. Some ways I can think of are...
|
@1yuv What's the priority on this? Do you have a workaround? |
Hi @garethbowen, we haven't found a workaround for this yet. We have mToT starting 12th October for the province and districts. The production date is not finalized yet, but it'll be around the end of October. We expect a workaround or fix to be available by production time, if it is not possible during the ToT. |
On the path to horizontal scalability and high availability for API and/or we're going to have to solve this properly because every API will have its own cache and conflicts will get progressively more common. A more complete solution would be to use a shared cache instead of a per-process one. This could be a doc in couchdb (which has its own set of problems) or a separate service (e.g. redis), which has a complexity cost.
I can't help but feel this cache is a premature optimisation: making the view request to find a unique shortcode should be quick enough. Just generate 100 and query the view (like we do now), but use only the first one and throw the rest away. There's still a small chance of conflict if two patients are registered at the same time, but it's now a very small window.
@dianabarsan I'd love your input here, both on a patch for backporting on a tight deadline, but also the more complete solution for when we have time to think it through... |
On throwing away the rest of the codes and using only a few numbers to avoid collisions, are we not already doing that? I checked a couple of instances and I see fewer than 10,000 patient ids that are 5 digits long, the same for 6 digits long, and now 7-digit patient IDs are being generated. Only one instance has 16,000 IDs that are 6 digits long, and 7-digit ids are being generated there now as well. Given that only about 10 out of every 100 available codes (10,000 out of 99,999) are used already, I doubt how much using 1 out of 100 will help in reducing collisions. |
When the ID generator gets too many collisions it bumps the length automatically. This is a little random so it could happen prematurely.
Remember 5 digit long numbers only have 4 digits of random and 1 checksum, so you would expect less than 10k patient IDs that are 5 digits long. Once the random collisions become too frequent, the length just gets bumped.
The change I'm suggesting is instead of finding 100 unique IDs and storing them in memory until they're all used up (which could take days), just generate one when needed and check if it's unique right away. That way we reduce the window of possible collision down from days to milliseconds. |
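A minimal sketch of the suggested change, assuming a hypothetical `is_used` callback in place of the real view query: generate one candidate at the moment it is needed, check it immediately, and bump the length after too many collisions. `generate_unique_id`, `register_patients`, `max_attempts`, and the in-memory `used_ids` set are illustrative names only, and the checksum digit is left out for brevity.

```python
import random


def generate_unique_id(is_used, rng, digits=4, max_attempts=10):
    """Generate a single ID on demand and check it for uniqueness right away."""
    while True:
        for _ in range(max_attempts):
            candidate = str(rng.randrange(10 ** digits)).zfill(digits)
            if not is_used(candidate):
                return candidate
        # Too many collisions at this length, so try longer IDs.
        digits += 1


def register_patients(count, seed=0, digits=4):
    used_ids = set()  # stands in for the database of existing patient_ids
    rng = random.Random(seed)
    assigned = []
    for _ in range(count):
        patient_id = generate_unique_id(lambda c: c in used_ids, rng, digits)
        used_ids.add(patient_id)
        assigned.append(patient_id)
    return assigned
```

For example, `register_patients(150, digits=2)` returns 150 distinct IDs even though only 100 two-digit IDs exist, because the length is bumped once collisions become frequent. In a real deployment there would still be a short window between the uniqueness check and the document write, as noted above, but it shrinks from the lifetime of a cache to a single request.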
This was introduced in 3.5.0 in this issue but has only become a problem with projects running combined smartphone and SMS deployments. |
Marking as high priority because the consequence of a duplicate ID can be serious for the patient. |
Merged and backported to |
I am reopening this issue since we've seen the same patient id assigned to multiple patients on the fix release (4.5.0) as well: medic/config-moh-nepal#1197. |
@1yuv From https://github.com/medic/config-moh-nepal/issues/1197 as Diana says it looks like one of the docs had a patient_id not generated by sentinel. Furthermore the one that replicated first was the one that ran the generate ID code. So we need to figure out how the patient_id got generated on the doc that replicated to the server second. Do you have any context that could help here? Is it possibly being generated offline? Or is the record imported from somewhere else? At this point I think the cause of the duplicate id is not the same as this original issue. |
I believe the reported issue was closed, it was training data that had duplicate ids. |
Describe the bug
I've seen that the same patient_id has been assigned to more than one patient. This has a huge impact on the service being delivered, as two patients will be treated as a single patient. It also affects impact monitoring and analytics.
To Reproduce
Steps to reproduce the behavior:
d935a8c916709b7b13a46e01b97fb9c2
64644cba05d134cef802c0202a61c855
97032
Expected behavior
Each patient should have a unique patient id generated for them.
Environment