Create a script that monitors for discrepancies between master data and search index #34

Closed
mdorf opened this issue Aug 7, 2020 · 2 comments

mdorf commented Aug 7, 2020

When ncbo/bioportal-project#165 was implemented, we discovered that the search index is out of sync with the master data (which comes from a triple store) for multiple ontologies. We need a script that runs periodically (nightly?) and verifies that the search index contains ALL ontology data.
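
One way such a check could be structured is sketched below: for each ontology, compare the set of class IDs reported by the master data against the set held in the search index, and flag any ontology with IDs missing from the index. This is only an illustration under stated assumptions. The names `find_out_of_sync`, `master_ids`, and `indexed_ids` are hypothetical, and the two lookups are passed in as plain callables because the issue does not say how the triple store or the index would be queried.

```python
from typing import Callable, Dict, Iterable, Set


def find_out_of_sync(
    acronyms: Iterable[str],
    master_ids: Callable[[str], Set[str]],
    indexed_ids: Callable[[str], Set[str]],
) -> Dict[str, Set[str]]:
    """Return {acronym: class IDs present in master data but absent from the index}.

    Both callables are placeholders (assumptions): in a real deployment they
    would wrap a triple-store query and a search-index query respectively.
    """
    missing: Dict[str, Set[str]] = {}
    for acronym in acronyms:
        # Set difference: everything the master data has that the index lacks.
        gap = master_ids(acronym) - indexed_ids(acronym)
        if gap:
            missing[acronym] = gap
    return missing


if __name__ == "__main__":
    # Toy in-memory data standing in for the two back ends.
    master = {"ONT_A": {"c1", "c2", "c3"}, "ONT_B": {"c1"}}
    index = {"ONT_A": {"c1"}, "ONT_B": {"c1"}}
    result = find_out_of_sync(
        master, lambda a: master[a], lambda a: index.get(a, set())
    )
    for acronym, gap in sorted(result.items()):
        print(f"Ontology {acronym} is missing {len(gap)} classes from the index.")
```

Comparing ID sets rather than bare counts also shows which classes are missing, at the cost of fetching every ID. A count comparison would be cheaper for a nightly run, but it could miss cases where the totals match while the members differ.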

mdorf self-assigned this Aug 7, 2020
mdorf added a commit that referenced this issue Aug 19, 2020
mdorf added a commit that referenced this issue Aug 27, 2020
mdorf added a commit that referenced this issue Aug 31, 2020
@graybeal

Looks like these are the baddies today. I've gathered the following info from their summary pages, since the Admin page isn't updating. Looks like BMO, GENO, ADMO, CHIRO, and ORDO_PL might be worth a look in the logs before re-indexing, to see if it's obvious what went wrong.

For the rest of them (all 'ERROR INDEXED'), re-indexing might just repeat the original problem, but at least we'll know the scope of work that's left.

~/Downloads/index-synchronizer.log:2324: I, [2020-08-30T21:56:18.242463 #68167]  INFO -- : Ontology xxx is missing classes from the index. Queued for re-indexing.

BMO: 0.5 (Uploaded, Error Annotator)	11/03/2014
PDRO: unknown (Parsed, Metrics, Error Annotator, Error Indexed, Error Indexed Properties)	02/25/2020
PHAGE: 5.0 (Parsed, Metrics, Annotator, Error Indexed)	05/02/2016
GENO: unknown (Parsed, Annotator, Error Obsolete)	03/08/2020
FOODON: 0.4.5 (Parsed, Metrics, Error Indexed)	06/21/2020
ADMO: beta (Uploaded, Error Rdf Labels)	10/17/2018
CHIRO: unknown (Parsed, Indexed, Metrics, Annotator)	11/23/2015
OCHV: 1 (Parsed, Metrics, Annotator, Error Indexed)	01/21/2016
ORDO_PL: 3.0 (Uploaded, Error Rdf Labels)	07/06/2020
ABD: unknown (Parsed, Metrics, Annotator, Error Indexed, Error Diff)	09/13/2016
UPHENO: unknown (Parsed, Metrics, Annotator, Error Indexed)	01/24/2020
EO: 1.0 (Parsed, Metrics, Annotator, Error Indexed)	10/12/2015
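
The triage described above (ontologies without an 'Error Indexed' status get a log check first, the rest are candidates for re-indexing) can be sketched as a small parser over the status lines. This is an illustration only: the line layout is inferred from the list above, and `parse_status_line` and `triage` are hypothetical names, not part of any NCBO tooling.

```python
import re
from typing import Dict, List, Optional, Tuple

# Layout inferred from the list above (an assumption):
# "ACRONYM: version (Status, Status, ...)<whitespace>MM/DD/YYYY"
LINE_RE = re.compile(
    r"^(?P<acronym>\w+):\s+(?P<version>.+?)\s+"
    r"\((?P<statuses>[^)]*)\)\s+(?P<date>\d{2}/\d{2}/\d{4})$"
)


def parse_status_line(line: str) -> Optional[Tuple[str, List[str]]]:
    """Return (acronym, [statuses]) for one summary line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    statuses = [s.strip() for s in m.group("statuses").split(",")]
    return m.group("acronym"), statuses


def triage(lines: List[str]) -> Dict[str, List[str]]:
    """Split ontologies into 'reindex' (has 'Error Indexed') and 'check_logs' (everything else)."""
    buckets: Dict[str, List[str]] = {"check_logs": [], "reindex": []}
    for line in lines:
        parsed = parse_status_line(line)
        if parsed is None:
            continue  # skip lines that do not follow the assumed layout
        acronym, statuses = parsed
        # Exact match on the status name, so 'Indexed' or
        # 'Error Indexed Properties' alone does not count.
        key = "reindex" if "Error Indexed" in statuses else "check_logs"
        buckets[key].append(acronym)
    return buckets


if __name__ == "__main__":
    sample = [
        "BMO: 0.5 (Uploaded, Error Annotator)\t11/03/2014",
        "PHAGE: 5.0 (Parsed, Metrics, Annotator, Error Indexed)\t05/02/2016",
        "CHIRO: unknown (Parsed, Indexed, Metrics, Annotator)\t11/23/2015",
    ]
    print(triage(sample))
```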


mdorf commented Sep 1, 2020

Investigation report attached.
indexing_error_report.txt

mdorf closed this as completed Sep 6, 2020