[API] Discrepancy between the total annotations and the number of returned annotations
#7796
Comments
It looks like there is an entry in our Elasticsearch index which doesn't correspond to any annotation in Postgres. My guess is that at some point an annotation was deleted from Hypothesis but the deletion wasn't executed in Elasticsearch.
Comparing entries in our production DB for this URL and the Elasticsearch service (using Kibana (internal link)), I see there are 2 shared annotations in the
select * from annotation where document_id in (select document_id from document_uri where uri_normalized = 'httpx://docdrop.org/ocr')
The id of the annotation in Elasticsearch that doesn't appear in Postgres is |
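The comparison described above boils down to a set difference between the ids found in the search index and the ids found in the database. A minimal sketch, assuming both id lists have already been exported as arrays of strings; `findOrphanedIds` and the sample ids are hypothetical names for illustration, not part of the Hypothesis codebase:

```javascript
// Hypothetical helper: returns the ids present in the search index
// but missing from the database (i.e. stale index entries).
function findOrphanedIds(indexIds, dbIds) {
  const known = new Set(dbIds); // O(1) membership checks
  return indexIds.filter((id) => !known.has(id));
}

// Made-up ids, for illustration only.
const orphaned = findOrphanedIds(['a1', 'b2', 'c3'], ['a1', 'c3']);
console.log(orphaned);
```

Any id this returns is a candidate for an index entry whose deletion never reached Elasticsearch.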
The account https://hypothes.is/users/you.me.hypothesis.thisness.us has been deleted recently though - they even announced it with a Good Bye. Seems some trigger events on account deletion might be missing or something? Thanks for the lookup. |
Insight from @seanh about our Elasticsearch <-> Postgres sync:
|
Looking at former links bookmarked by the user via a Memento, deletion seems to have worked for all the links I've tested:
Yes, it might look like a DB incident, or perhaps an edge case in the code. If you have an entrypoint in the code, I could take a look to spot a potential edge case. I'll keep looking at that buggy link to check the value of |
Testing with a Greasemonkey script for all the links of the page, link deletion for the deleted account appears OK: for each link,
🐒 GM Script for Hypothesis API deletion test
// ==UserScript==
// @name Hypothesis API deletion test
// @version 1
// @grant none
// @include https://web.archive.org/web/20221223125432if_/https://hypothes.is/users/you.me.hypothesis.thisness.us
// ==/UserScript==
(async function() {
'use strict';
const links = document.querySelectorAll('.search-bucket-stats__val.search-bucket-stats__url > a.link--plain');
const brokenLinks = [];
// The NodeList is a synchronous iterable, so a plain for...of is enough.
for (const { href } of links) {
const response = await fetch(`https://hypothes.is/api/search?uri=${encodeURIComponent(href)}`);
const { total, rows: { length } } = await response.json();
if (total === length) {
console.log('URI OK', href);
} else {
brokenLinks.push(href);
console.error('URI Error', href, total, length);
}
}
console.warn(brokenLinks.length + ' broken links', brokenLinks);
})();
-> Can't figure out what caused that DB glitch. |
Erratum: I had forgotten to rewrite the resource URL in the previous version of the script; with that fixed, there are 8 discrepancies:
🐒 GM Script for Hypothesis API deletion test
// ==UserScript==
// @name Hypothesis API deletion test
// @version 2
// @grant none
// @include https://web.archive.org/web/20221223125432if_/https://hypothes.is/users/you.me.hypothesis.thisness.us
// ==/UserScript==
(async function() {
'use strict';
const links = document.querySelectorAll('.search-bucket-stats__val.search-bucket-stats__url > a.link--plain');
const brokenLinks = [];
// The NodeList is a synchronous iterable; `let` is kept because href is rewritten below.
for (let { href } of links) {
href = href.replace("https://web.archive.org/web/20221223125432/", "");
const response = await fetch(`https://hypothes.is/api/search?uri=${encodeURIComponent(href)}`);
const { total, rows: { length } } = await response.json();
if (total === length) {
console.log('URI OK', href, { total, length });
} else {
brokenLinks.push(href);
console.error('URI Error', href, { total, length });
}
}
console.warn(brokenLinks.length + ' broken links', brokenLinks);
})();
|
I encountered another instance of this in hypothesis/client#5219, which caused a major problem in the client, where the Notebook showed only a small fraction of the annotations in the group. |
Is there any update on this bug? Can we rely on the |
No, nothing has happened since my last comments.
Keep iterating through pages with |
Alright. Thanks |
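The advice above (do not trust the reported total, keep paging) can be sketched as a loop that stops on the first empty page. This is only an illustration under stated assumptions: `collectAllRows` and the `fetchPage(offset, limit)` callback are hypothetical names, and the callback is assumed to resolve to an object shaped like the search response (`{ total, rows }`). How a page is actually requested from the API is left to the callback.

```javascript
// Hypothetical helper: gathers rows page by page and stops when a page
// comes back empty, instead of relying on the reported `total`.
// `fetchPage(offset, limit)` is an assumed callback resolving to { total, rows }.
async function collectAllRows(fetchPage, limit = 50, maxPages = 1000) {
  const rows = [];
  let reportedTotal = null;
  for (let page = 0; page < maxPages; page++) {
    // Advance by `limit`, not by the number of rows received, since a page
    // may contain fewer rows than index hits.
    const { total, rows: pageRows } = await fetchPage(page * limit, limit);
    reportedTotal = total;
    if (!pageRows || pageRows.length === 0) break; // first empty page ends the loop
    rows.push(...pageRows);
  }
  return { rows, reportedTotal, mismatch: reportedTotal !== rows.length };
}
```

The `mismatch` flag makes the discrepancy visible to the caller without treating it as an error; `maxPages` is just a safety cap against an endless loop.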
Querying the API for a URI, the API returns a total of 3 but with only 2 annotations:
Archived version of the API payload