Differentiate "not found" from cache hit vs actual not found #115

Closed
ronaldtse opened this issue Aug 21, 2023 · 7 comments

@ronaldtse
Contributor

Sometimes an entry that is cached as "not found" can in fact be found, but the cache is stuck on an old "not found" result.

e.g.

Using relaton-nist 1.14.9
...
$ bundle exec relaton fetch 'NIST CSWP 6'
[relaton] (NIST CSWP 6) not found.
No matching bibliographic entry found

From @andrew2net:

The message [relaton] (NIST CSWP 6) not found. comes from the relaton gem, which acts as a cache. A response from a previous version of relaton-nist was stored in the cache. We used to have functionality that cleaned the cache when the gem's version changed, but we moved to schema version control. So relaton db clear needs to be run now.

Users will never be able to figure this out.

When the Relaton-xxx gem is updated, should the cache be wiped? At least the "not found" ones?

This information needs to be described in the output:

  • it should say "not found in cache"; if you wish to ignore the cache, run with an "ignore cache" option or wipe the cache

In any case, we must differentiate a "cache hit not found" vs the "actual not found", at the library level. And provide an "ignore cache" option.
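The distinction asked for above can be sketched roughly as follows. This is a minimal illustration, not the relaton implementation: `SketchCache`, `Result`, `message_for`, the status symbols, and the `no_cache:` keyword are all hypothetical names chosen for this example.

```ruby
# Hypothetical sketch; none of these names come from the relaton gems.
Result = Struct.new(:status, :entry)

class SketchCache
  # Marker stored for a reference that was looked up and missed.
  NOT_FOUND = :not_found

  def initialize
    @store = {}
  end

  # fetcher is any callable that returns an entry, or nil when nothing is found.
  # With no_cache: true the cached value is ignored and then overwritten.
  def fetch(ref, fetcher, no_cache: false)
    if !no_cache && @store.key?(ref)
      cached = @store[ref]
      return Result.new(:cached_not_found, nil) if cached == NOT_FOUND

      return Result.new(:cached, cached)
    end

    entry = fetcher.call(ref)
    @store[ref] = entry.nil? ? NOT_FOUND : entry
    Result.new(entry.nil? ? :not_found : :fetched, entry)
  end
end

# Builds a user-facing message that tells the two "not found" cases apart.
def message_for(ref, result)
  case result.status
  when :cached_not_found
    "(#{ref}) not found in cache; rerun ignoring the cache or clear the cache to retry."
  when :not_found
    "(#{ref}) not found at the source."
  else
    "(#{ref}) found."
  end
end
```

Returning a status alongside the entry lets the caller decide how to word the message, so a stale cached miss is never reported as if the source had just been queried.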

From relaton/relaton-nist#93

@ronaldtse ronaldtse added the bug Something isn't working label Aug 21, 2023
@andrew2net
Contributor

andrew2net commented Sep 7, 2023

When the Relaton-xxx gem is updated, should the cache be wiped? At least the "not found" ones?

@ronaldtse we used to have this behavior: a flavor's cache was wiped when the flavor gem's version changed. @opoudjis initiated the move away from the algorithm that depended on the gem version. Let's ask Nick's opinion.
Why at least "not found"? It is not rare for us to fix the content of bibitems. In such cases the cache also needs to be updated.

@opoudjis
Contributor

opoudjis commented Sep 8, 2023

I'm having trouble remembering and understanding the distinction, and I do not have the free time to explore this.

The requirement is simple: if the output now is going to be different from what it used to be, we need to refetch the record.

The problem is, the schema changes some of the time, and the gem is changing constantly. The gem was changing so frequently, that we decided not to use gem updates as the criterion for flushing the cache, but schema updates instead.

Andrej, I cannot make the call of when to do a flush, because that depends on the content being exported, which is your area. I think switching to schema update or minor relaton version change as the trigger is the right balance to make: gem updates were simply too frequent.

That said, it would also make sense never to store a Not Found record in the cache, so that it always tries to refind such a version. Not just when relaton is updated: always. Because the Not Found could well just be the site being down that day. And if the not found is happening all the time, then there is an issue that relaton needs to resolve; it can't just be ignored by the user.

(I vaguely think we used not to store Not Found in the cache, or at least ignore it. But it's your code, you tell me.)

@ronaldtse
Contributor Author

@andrew2net any updates here? Thanks!

@andrew2net
Contributor

andrew2net commented Sep 12, 2023

(I vaguely think we used not to store Not Found in the cache, or at least ignore it. But it's your code, you tell me.)

The cases when Not Found is not stored in cache are:

  • Timeout error when trying to get a page.
  • There is an issue with iso.org where the website returns a page with the wrong content. relaton-iso tries to fetch up to 10 times before it signals relaton to store Not Found (we need to check whether the problem still exists).

The Not Found is stored in cases:

  • server returns HTTP code 404.
  • the document is not found in the datasource/dataset.
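The two lists above amount to a small policy: transient failures are not cached, definitive misses are. A minimal sketch of that policy, assuming hypothetical failure symbols (`:timeout`, `:wrong_content`, `:http_404`, `:missing_from_dataset`) and a hypothetical helper `store_not_found?` that are not part of any relaton gem:

```ruby
# Hypothetical failure kinds; the names are assumptions for illustration only.
TRANSIENT_FAILURES = %i[timeout wrong_content].freeze
DEFINITIVE_FAILURES = %i[http_404 missing_from_dataset].freeze

# Returns true when a Not Found result should be written to the cache.
def store_not_found?(failure)
  return false if TRANSIENT_FAILURES.include?(failure)
  return true if DEFINITIVE_FAILURES.include?(failure)

  raise ArgumentError, "unknown failure kind: #{failure.inspect}"
end
```

Raising on an unknown kind is a deliberate choice in this sketch: a new failure mode has to be classified explicitly instead of silently being cached.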

Of course, we can stop storing Not Found in the cache, but that will slow down fetching when a Not Found reference is requested again. That may not be a big problem when relaton is used locally. Storing Not Found was meant to be useful in API mode, as a cloud cache, which I hope we will finally implement. So let's disable storing Not Found for local use.

As for triggering a cache flush, it seems that neither the schema-based nor the gem-version-based algorithm is good enough. Maybe we can create a cache version constant in each flavor gem and update it manually whenever the cache should be flushed.

@ronaldtse @opoudjis any thoughts?

@opoudjis
Contributor

As for triggering a cache flush, it seems that neither the schema nor the gem version is good enough. Maybe we can create a cache version constant in each flavor gem and update it manually whenever the cache should be flushed.

... But someone has to know when to set that constant manually and when not to. That someone would have to be you. I don't like it though: it's one more dependency that can go wrong, and it is better to flush too often than not often enough.

@andrew2net
Contributor

... But someone has to know when to set that constant manually and when not to. That someone would have to be you. I don't like it though: it's one more dependency that can go wrong, and it is better to flush too often than not often enough.

I don't like it either. So let's flush the cache more often, triggering the flush on a gem version change.
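A flush triggered by a gem version change could look roughly like the sketch below. It assumes a cache directory that holds a marker file named `version`; that file name and the helper `flush_if_version_changed` are hypothetical and do not describe how the relaton gems actually store their cache.

```ruby
require "fileutils"

# Hypothetical sketch: the "version" marker file and this method name are assumptions.
# Returns true when the cache was flushed, false when the version is unchanged.
def flush_if_version_changed(cache_dir, current_version)
  FileUtils.mkdir_p(cache_dir)
  marker = File.join(cache_dir, "version")
  stored = File.exist?(marker) ? File.read(marker).strip : nil
  return false if stored == current_version

  # Remove every cached item except the marker, then record the new version.
  Dir.children(cache_dir).each do |name|
    FileUtils.rm_rf(File.join(cache_dir, name)) unless name == "version"
  end
  File.write(marker, current_version)
  true
end
```

Comparing against a stored marker means any version change, including a downgrade, empties the cache, which matches the preference stated above for flushing too often over not often enough.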

andrew2net added a commit to relaton/relaton-bib that referenced this issue Sep 19, 2023
andrew2net added a commit to relaton/relaton-iso-bib that referenced this issue Sep 20, 2023
andrew2net added a commit to relaton/relaton-3gpp that referenced this issue Sep 20, 2023
andrew2net added a commit to relaton/relaton-bipm that referenced this issue Sep 20, 2023
andrew2net added a commit to relaton/relaton-bsi that referenced this issue Sep 20, 2023
andrew2net added a commit to relaton/relaton-calconnect that referenced this issue Sep 20, 2023
andrew2net added a commit to relaton/relaton-ccsds that referenced this issue Sep 20, 2023
andrew2net added a commit to relaton/relaton-cen that referenced this issue Sep 20, 2023
andrew2net added a commit to relaton/relaton-cie that referenced this issue Sep 20, 2023
andrew2net added a commit to relaton/relaton-ecma that referenced this issue Sep 20, 2023
andrew2net added a commit to relaton/relaton-gb that referenced this issue Sep 20, 2023
andrew2net added a commit to relaton/relaton-iana that referenced this issue Sep 20, 2023
andrew2net added a commit to relaton/relaton-iec that referenced this issue Sep 21, 2023
andrew2net added a commit to relaton/relaton-ieee that referenced this issue Sep 21, 2023
andrew2net added a commit to relaton/relaton-ietf that referenced this issue Sep 21, 2023
andrew2net added a commit to relaton/relaton-iho that referenced this issue Sep 21, 2023
andrew2net added a commit to relaton/relaton-itu that referenced this issue Sep 21, 2023
andrew2net added a commit to relaton/relaton-jis that referenced this issue Sep 21, 2023
andrew2net added a commit to relaton/relaton-nist that referenced this issue Sep 21, 2023
andrew2net added a commit to relaton/relaton-oasis that referenced this issue Sep 21, 2023
andrew2net added a commit to relaton/relaton-ogc that referenced this issue Sep 22, 2023
andrew2net added a commit to relaton/relaton-omg that referenced this issue Sep 22, 2023
andrew2net added a commit to relaton/relaton-un that referenced this issue Sep 22, 2023
andrew2net added a commit to relaton/relaton-doi that referenced this issue Sep 22, 2023
andrew2net added a commit to relaton/relaton-cli that referenced this issue Sep 25, 2023
@andrew2net
Contributor

Fixed in relaton and relaton-cli v1.16.1. To ignore the cache, use the --no-cache option:

$ relaton fetch --no-cache "NIST CSWP 6"

andrew2net added a commit to relaton/relaton-w3c that referenced this issue Oct 18, 2023