PACER documents can belong to multiple cases #765
Comments
BTW, freelawproject/recap#174 (comment) is an extreme example, with a single order belonging to 5 civil cases.
Ok, so making some headway here. Things that need to happen for this to go off as a success:
(I'll be editing and updating this comment as I identify more things to do.)
I went through the failed documents that had this issue and reprocessed them all. Of 219, all but ten were processed successfully. This will have an even bigger effect on dockets, but I'm just going to let that take effect going forward, rather than reprocessing all the dockets we've already received (that would be a pretty big job for me and the server).
(Split off from #2185)
This is going to be a rough one to fix properly, but it's worth figuring out. The basic problem, as stated in the parent ticket, is:
I'm seeing this right now with pacer_doc_id 12707472047, which occurs as an attachment in both:
https://www.courtlistener.com/docket/4343877/in-re-state-street-bank-and-trust-co-fixed-income-funds-investment/?page=2#entry-105
And:
https://www.courtlistener.com/docket/4345781/yu-v-state-street-corp/#entry-58
Both involve "state street bank". The issue as it's hitting me today is that I can't add the document because I have a unique constraint on pacer_doc_id, and sure enough they both have the same value.
The solution here (as discussed in depth in the RECAP channel on Slack today) is to remove the unique constraint and just let the document exist in our system twice. We'll have to go through a fair bit of code to make sure this doesn't cause problems, but it's probably the right way forward.
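To make the failure and the proposed fix concrete, here is a minimal sketch using an in-memory SQLite database rather than the real schema. The table names (doc_unique, doc_shared) and the index name are hypothetical stand-ins, and the columns are cut down to the two that matter; only the pacer_doc_id and the two docket IDs come from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, heavily simplified stand-in for the current table:
# pacer_doc_id carries a unique constraint.
conn.execute(
    "CREATE TABLE doc_unique ("
    " id INTEGER PRIMARY KEY,"
    " docket_id INTEGER NOT NULL,"
    " pacer_doc_id TEXT UNIQUE)"
)

# Proposed shape: same columns, but only a plain (non-unique) index.
conn.execute(
    "CREATE TABLE doc_shared ("
    " id INTEGER PRIMARY KEY,"
    " docket_id INTEGER NOT NULL,"
    " pacer_doc_id TEXT)"
)
conn.execute("CREATE INDEX doc_shared_pacer_idx ON doc_shared (pacer_doc_id)")

# The two dockets from the example above share one pacer_doc_id.
rows = [(4343877, "12707472047"), (4345781, "12707472047")]


def insert_all(table):
    """Insert both rows into `table`; return how many were accepted."""
    accepted = 0
    for docket_id, pacer_doc_id in rows:
        try:
            conn.execute(
                f"INSERT INTO {table} (docket_id, pacer_doc_id) VALUES (?, ?)",
                (docket_id, pacer_doc_id),
            )
            accepted += 1
        except sqlite3.IntegrityError:
            # The unique constraint rejects the second docket's copy.
            pass
    return accepted


unique_count = insert_all("doc_unique")
shared_count = insert_all("doc_shared")

# Without the constraint, a lookup by pacer_doc_id can return several rows.
matches = conn.execute(
    "SELECT docket_id FROM doc_shared WHERE pacer_doc_id = ? ORDER BY docket_id",
    ("12707472047",),
).fetchall()
print(unique_count, shared_count, matches)
```

The last query is the part to watch for in existing code: anything that assumes a lookup by pacer_doc_id returns exactly one row has to be changed to handle several.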
Other solutions
The other ways forward are either:
Add an alias field joining the RECAPDocument table to itself to handle this case. That could work, but it's kind of a mess and far from ideal.
Remodel the DB to pull the document data itself apart from any PACER metadata. This could work, but it adds a fourth level of joins to the model, it adds complexity, and it's a huge change.
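For comparison, the remodel option might look roughly like the sketch below, again in throwaway SQLite. Everything here is an assumption for illustration: the table names (document_file, pacer_entry), the columns, and the example.pdf placeholder are invented, and the entry numbers are simply read off the two URLs above. The point is only to show the extra join that every document lookup would pick up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical split: one row per physical document, plus one row per
# place that document appears in PACER.
conn.executescript(
    """
    CREATE TABLE document_file (
        id INTEGER PRIMARY KEY,
        pacer_doc_id TEXT UNIQUE,
        filepath TEXT
    );
    CREATE TABLE pacer_entry (
        id INTEGER PRIMARY KEY,
        docket_id INTEGER NOT NULL,
        entry_number INTEGER NOT NULL,
        document_file_id INTEGER NOT NULL REFERENCES document_file (id)
    );
    """
)

# The document itself is stored once (placeholder file path).
cur = conn.execute(
    "INSERT INTO document_file (pacer_doc_id, filepath) VALUES (?, ?)",
    ("12707472047", "example.pdf"),
)
file_id = cur.lastrowid

# It is then linked from both dockets.
conn.executemany(
    "INSERT INTO pacer_entry (docket_id, entry_number, document_file_id)"
    " VALUES (?, ?, ?)",
    [(4343877, 105, file_id), (4345781, 58, file_id)],
)

# Getting from a docket entry to its file now needs one more join.
appearances = conn.execute(
    "SELECT e.docket_id, e.entry_number, f.filepath"
    " FROM pacer_entry AS e"
    " JOIN document_file AS f ON f.id = e.document_file_id"
    " WHERE f.pacer_doc_id = ?"
    " ORDER BY e.docket_id",
    ("12707472047",),
).fetchall()
print(appearances)
```

This keeps pacer_doc_id unique and stores the file once, but it shows why the change is large: every query that currently reads a document's metadata and file from one table would have to go through the link table.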