WIKI EDITS ARE BROKEN!! Users being shown & editing OUTDATED cached versions of documents #2040

Closed
seabelis opened this issue Apr 11, 2019 · 40 comments
Assignees
Labels
Module: Memcache Issues related to the configuration or use of the Memcache subsystem. [managed] Priority: 1 Do this week, receiving emails, time sensitive, . [managed] Regression Theme: Editing Issues related to the user editing/wiki editing experience. [managed]

Comments

@seabelis
Collaborator

seabelis commented Apr 11, 2019

Description

All changes: edits to editions or works, list creation, cover changes...are lost on save.

Relevant url?

all

Expectation

Changes should be saved.

Details
No error message, and all functionality seems to work at the time, but changes are lost on refresh.

Logged In
Chrome Version 73.0.3683.86 (Official Build) (64-bit)
Win 7 Pro

@seabelis seabelis changed the title Cannot add, remove, or change covers on works or editions All changes lost on save. Apr 11, 2019
@mekarpeles mekarpeles added blocker Module: Cover Service Cover Store (book covers service) Priority: 1 Do this week, receiving emails, time sensitive, . [managed] Theme: Editing Issues related to the user editing/wiki editing experience. [managed] labels Apr 11, 2019
@seabelis
Collaborator Author

For some reason I have been able to make changes to this record alone. https://openlibrary.org/books/OL4723885M/Bunnicula

@tfmorris
Contributor

I had a quick look at this and for the author page that I edited, the edits were saved, but not rendered in the HTML, even after a hard refresh, so it seems likely that this is some kind of caching issue.

@tfmorris
Contributor

For the record that @seabelis edited (https://openlibrary.org/books/OL4723885M/Bunnicula) the new cover, subtitle, etc are rendered correctly, but none of the edits are listed in the edit history at the bottom of the page, although they are listed when you click through to the detailed history.

For the record I edited (https://openlibrary.org/books/OL4723885M/Bunnicula), the bad AKA remains and the edit also is not listed in the summary edit history, but, again, is shown on the detailed edit history page.

@seabelis
Collaborator Author

And I was not able to add to a list, but was able to add to my reading log.

@mekarpeles mekarpeles added this to Features in 2019 Epics via automation Apr 12, 2019
@mekarpeles mekarpeles moved this from Features to Unbreak Now! in 2019 Epics Apr 12, 2019
@mekarpeles mekarpeles self-assigned this Apr 12, 2019
@mekarpeles
Member

I'm unable to reproduce -- I just edited https://openlibrary.org/books/OL4723885M/Bunnicula/edit and added an illustrator Alan Daniel and it shows up correctly on the html and in the history.

Is this still an issue?

@seabelis
Collaborator Author

seabelis commented Apr 12, 2019

Yes, still an issue. That example was the one record I have been able to edit. All others I've tried, for this and other works, are still uneditable.

@stefanauss

I'm also experiencing this. One example of a record I've been unable to save edits to is https://openlibrary.org/books/OL12320650M/LAN_Switching_and_Wireless_CCNA_Exploration_Companion_Guide_(2nd_Edition)_(Companion_Guide). This is also the case for the work it is linked to.

The thing is, the edits do appear in the history but are not rendered at all. Also, when showing the history, the radio buttons that select what to compare with the diff tool exclude the unrendered edit: the last and second-to-last rendered edits are pre-selected.

@seabelis
Collaborator Author

Adding new editions is possible. Data added to the edit form beyond what is collected on the "Add new" form is saved and displayed as usual. Subsequent edits after that initial save are lost but shown in the detailed edit log.

@seabelis
Collaborator Author

But I see edition count for the work is not updated nor is the new edition displayed on the edition list. This is the edition I added. https://openlibrary.org/books/OL26846444M/Educated

@mekarpeles
Member

I'm wondering if this is database or solr updater related

@seabelis
Collaborator Author

I take it back. The subsequent edits do show and the edition now shows on the list of editions. Edition count is still incorrect.

@tfmorris
Contributor

It looks to me like it's rendering old versions. When I look at @stefanauss' example, after editing it, it's unchanged, but if I click on my edit in the history and view https://openlibrary.org/books/OL12320650M/LAN_Switching_and_Wireless_CCNA_Exploration_Companion_Guide_(2nd_Edition)_(Companion_Guide)?v=11 it includes my edits.

A perhaps related weirdness is when I click through to the detailed version history, it comes up with version 5 & 6 selected for the compare, indicating that for some reason it thinks that version 6 is the latest version.

@cdrini
Collaborator

cdrini commented Apr 12, 2019

Since this is a break of previous functionality, let's see what changed recently.

> Prove provisioning a generic minimal xenial VM (e.g. of the ol-mem flavor) and add it to the ol cluster (e.g. as ol-mem4)

Could this be related, @mek ? Maybe ol-mem is misbehaving and serving outdated copies of the site? What changed exactly?

@tfmorris
Contributor

That's an interesting hypothesis. It also makes me wonder if the new ol-mem pool member could be related to the recent bursts of 5xx errors.

@mekarpeles
Member

@tfmorris that I don't feel super confident about, as it feels like the problem was occurring before introducing ol-mem4

@cdrini
Collaborator

cdrini commented Apr 12, 2019

Ok, we were discussing on slack, and @mekarpeles just closed the memcache in question. We want to bring it back in ~10min (because it slows the site); is anyone available now to confirm if it worked? https://openlibrary.org/books/OL12320650M/LAN_Switching_and_Wireless_CCNA_Exploration_Companion_Guide_(2nd_Edition)_(Companion_Guide) seems to be displaying correctly, but I forgot to check how it looked just before we took down the memcache 😭

@stefanauss

stefanauss commented Apr 12, 2019 via email

@cdrini
Collaborator

cdrini commented Apr 12, 2019

Thanks @stefanauss ! Ok, we need some harder proof that this is the true culprit, because this would be quite an annoying bug to fix. We're reinstating the memcache now. Please post links to any works/editions you see not updating properly! We'll do another takedown test later and use those to determine if that's what's causing the problem (assuming I actually remember to check them this time 😛).

@seabelis
Collaborator Author

It looks like the edits I made earlier have caught up, but new edits are lost on this edition. https://openlibrary.org/books/OL26336755M/Bunnicula
The only one I've tried so far.

@seabelis
Collaborator Author

Lost on this also. https://openlibrary.org/books/OL7729374M/Bunnicula

@cdrini
Collaborator

cdrini commented Apr 12, 2019

@cdrini
Collaborator

cdrini commented Apr 12, 2019

Ahhh, thank you @stefanauss ! (Sorry for the lag; GitHub didn't show me your comment :P) That helps a ton; we can search a bit further back for what might have caused this issue.

@cdrini cdrini removed Module: Cover Service Cover Store (book covers service) cover-service labels Apr 12, 2019
@seabelis
Collaborator Author

That's funny. I updated the title of https://openlibrary.org/works/OL15102237W/Bon%C3%ADcula and it is not reflected on the work record itself but the new title is shown on the list view:
[Screenshot: Fullscreen capture 4132019 93547 AM]

@cdrini
Collaborator

cdrini commented Apr 14, 2019

Status update:

  • This issue is reproducible on our local dev environment; this should make finding the commit that caused it easier
  • It's unrelated to the xenial changes
  • This file describes how memcache invalidates/updates when an infobase edit occurs: openlibrary/olbase/events.py
    • On prod, we saw the "setting up infobase events for Open Library" log but neither the "Edited by" nor the "invalidating" logs
    • Locally, all logs appeared at the right moment: local web log after editing an edition.txt. Note the log does display a save_many log which appears to be saving the wrong version (ctrl-f for "revision" in the log file posted above).
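
For readers unfamiliar with the pattern, here is a minimal sketch of event-driven cache invalidation. It is a toy model with hypothetical names (`CachedStore`, `on_edit`, `fire_events`), not the contents of openlibrary/olbase/events.py. It only illustrates why a missing "invalidating" step would leave readers looking at an old copy even though the edit itself was stored.

```python
import logging

logger = logging.getLogger("sketch.events")  # hypothetical logger name


class CachedStore:
    """Toy document store with a read-through cache (stand-in for infobase + memcache)."""

    def __init__(self):
        self.db = {}     # authoritative documents, e.g. keyed by "/books/OL1M"
        self.cache = {}  # cached copies that pages are rendered from

    def get(self, key):
        # Read-through: serve the cached copy if present, else load and cache it.
        if key not in self.cache:
            self.cache[key] = self.db[key]
        return self.cache[key]

    def save(self, key, doc, fire_events=True):
        # The write always reaches the database (so it shows up in the history).
        self.db[key] = doc
        if fire_events:
            self.on_edit(key)

    def on_edit(self, key):
        # The invalidation step; if this never runs, get() keeps returning the old copy.
        logger.info("invalidating %s", key)
        self.cache.pop(key, None)
```

With `fire_events=False` the new document is in `db`, but `get()` still returns the previously cached version, which resembles the symptom reported above.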

Possible next steps:

  • Try going back in git's history to determine the offending commit. See git bisect; we probably won't be able to fully automate the process, but it will help (maybe copying some curl commands from the browser (with editing cookies) might let us bisect this fully automatically?)
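
A rough sketch of what an automated check for `git bisect run` could look like. The exit codes are the ones `git bisect run` documents (0 = good, 125 = skip, other codes from 1 to 127 = bad). Everything else is an assumption: `save_edit` and `fetch_rendered` are hypothetical callables that would wrap the copied curl requests against the local dev instance.

```python
GOOD, BAD, SKIP = 0, 1, 125  # exit codes understood by `git bisect run`


def classify(save_edit, fetch_rendered, marker):
    """Return the bisect exit code for the currently checked-out commit.

    save_edit(marker)  -- hypothetical: submits an edit containing `marker`
    fetch_rendered()   -- hypothetical: returns the page HTML after the edit
    """
    try:
        save_edit(marker)
        rendered = fetch_rendered()
    except Exception:
        # The environment failed to build or respond, so this commit cannot
        # be judged; ask bisect to skip it instead of calling it bad.
        return SKIP
    # Good commit: the edit is visible right away. Bad commit: stale page.
    return GOOD if marker in rendered else BAD
```

A wrapper script would end with `sys.exit(classify(...))` and be invoked as `git bisect run python check_edit.py` (script name hypothetical). Using a fresh, unique marker per run avoids a false "good" from an earlier edit.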

@cdrini
Collaborator

cdrini commented Apr 15, 2019

@seabelis You edit pretty regularly; what was the earliest date you noticed this issue?

@cdrini
Collaborator

cdrini commented Apr 15, 2019

Alright, I've been testing backwards trying to find a commit where the error doesn't occur (so I could do a git bisect), and I've been able to find it all the way back to Dec 3, 2018 before stopping :/ I think that's too far back, so that makes me think my method is wrong. Here's my detailed methodology: https://gist.github.com/cdrini/9ee2c78da213b38262b02bbd9e35b1b4 ; @hornc does that look correct? Am I making any silly docker mistakes? @seabelis is it possible this issue could have been existing since then (or possibly even before then)?

@seabelis
Collaborator Author

seabelis commented Apr 15, 2019

@cdrini I reported it immediately. Prior to that edits were generally working as expected. The only thing I did notice, and maybe this is unrelated, is that for about the two days prior the system was taking a bit longer to process changes at the work level. I don't mean that the edits were not immediately seen on the records themselves, but that the tags took a long time to (???: I have no word for this: when you add a tag, normally it takes about +/- 15 min. to be reflected on the tag's view or if the work's cover is updated, for that to be seen on the list view). Instead of 15 minutes this process was taking a day or more.

If this existed prior to a few days ago, it was infrequent and either got by me or was passed off as my own mistake. If it had been frequent, I'm confident I'd have noticed.

@mekarpeles mekarpeles added the Module: Memcache Issues related to the configuration or use of the Memcache subsystem. [managed] label Apr 15, 2019
@mekarpeles mekarpeles changed the title All changes lost on save. WIKI EDITS ARE BROKEN!! Users being shown & editing OUTDATED cached versions of documents Apr 15, 2019
@LeadSongDog

Further example, edit on an author page today:
https://openlibrary.org/authors/OL5867300A/Thom_McGuire?b=2&a=1&_compare=Compare&m=diff
is not reflected:
[Screenshot: auth unchanged]

A similar edit on 10 April had no such problem:
https://openlibrary.org/authors/OL439448A?m=diff&b=2
and is properly shown at:
https://openlibrary.org/authors/OL439448A/Chuck_Murphy

@seabelis
Collaborator Author

@cdrini Most editions I've made significant changes to are on this list. It seems like edits through 10 April were fine. https://openlibrary.org/people/seabelis/lists/OL125878L/verified I'd estimate the slow-down of work tags/subjects populating that I mentioned above started approximately 5-8 April.

@LeadSongDog

Hmm, https://openlibrary.org/authors/OL5867300A.json?v=2 shows the edits, but https://openlibrary.org/authors/OL5867300A.json does not. What's going wrong here?
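
In case it helps others collect examples, this comparison can be scripted. A small sketch, assuming the JSON documents carry a `revision` field (an assumption here, so adjust if the field is named differently); `is_stale` and `fetch_json` are made-up helper names.

```python
import json
from urllib.request import urlopen


def fetch_json(url):
    """Fetch a URL with a plain HTTP GET and decode the JSON body."""
    with urlopen(url) as resp:
        return json.load(resp)


def is_stale(base_url, expected_version, fetch=fetch_json):
    """True if the default view differs from the explicitly requested version.

    base_url is the .json URL without a ?v= parameter; the "revision" key
    is assumed, not confirmed by this thread.
    """
    default_doc = fetch(base_url)
    pinned_doc = fetch(f"{base_url}?v={expected_version}")
    return default_doc.get("revision") != pinned_doc.get("revision")
```

For the record above, that would be `is_stale("https://openlibrary.org/authors/OL5867300A.json", 2)`.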

@cdrini
Collaborator

cdrini commented Apr 16, 2019

The (temp) fix is on dev, if folks want to help with testing! ( https://dev.openlibrary.org/ )

@seabelis
Collaborator Author

@cdrini Covers still seem to be an issue.

@seabelis
Collaborator Author

Edits to edition data are working normally. Cannot add to lists.

@seabelis
Collaborator Author

Edition edits appear in the work edits log. This was not the case before.

@mekarpeles
Member

@hornc believes this issue has been resolved by reverting openlibrary.yml and infobase.yml on production to a state before ol-mem4 was added.

It's hypothesized perhaps infobase needed to be restarted on ol-home via supervisorctl after deploy (prior to restarting ol-web3 and ol-web4) in addition to restarting ol-mem*.

I'm very tentatively marking this as resolved and hoping doing so will cause others who are still noticing a problem to speak up here.

I'm opening a new issue for us to evaluate how many records / edits were affected during this time period (and what strategy we should take, e.g. potentially reverting edits made during this time).

@tfmorris
Contributor

tfmorris commented May 1, 2019

> @hornc believes this issue has been resolved by reverting openlibrary.yml and infobase.yml on production to a state before ol-mem4 was added.

Is there more information on what the problematic change was so that we can avoid it in the future?

> It's hypothesized perhaps infobase needed to be restarted on ol-home via supervisorctl after deploy (prior to restarting ol-web3 and ol-web4) in addition to restarting ol-mem*.

Can you (or @hornc) expand on this? Is the theory that different clients were operating with different memcached server lists, giving them inconsistent world views or something else?

I can definitely see a scenario where 2 cache servers with 50% of the cached entries each could continue to serve stale data to clients who thought that there were only two servers if a new client started invalidating and updating 1/3 of its entries on a new third cache server (leaving the stale entries on the old server), but I'd like to confirm that that's what actually happened.
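
To make that scenario concrete, here is a toy simulation. It assumes naive modulo sharding over the client's server list; I don't know which hashing scheme the production memcache client actually uses, and the host names are placeholders. It models the general case where a writer with the new server list updates entries that readers with the old list look up elsewhere.

```python
import zlib

OLD_LIST = ["mem1", "mem2"]          # hypothetical: what un-restarted clients use
NEW_LIST = ["mem1", "mem2", "mem3"]  # hypothetical: list after adding a server


def pick_server(key, servers):
    """Naive modulo sharding: hash the key and index into the server list."""
    return servers[zlib.crc32(key.encode()) % len(servers)]


def simulate(keys):
    """Return the keys for which an old-list reader still sees version 1."""
    caches = {name: {} for name in NEW_LIST}

    # Every entry starts out cached as v1 by clients using the old list.
    for key in keys:
        caches[pick_server(key, OLD_LIST)][key] = "v1"

    # A writer using the new list edits every document and stores v2 on
    # whichever server *its* list maps the key to.
    for key in keys:
        caches[pick_server(key, NEW_LIST)][key] = "v2"

    # A reader still on the old list looks where *its* list points and finds
    # the untouched v1 whenever the two lists disagree about the key's home.
    return [k for k in keys if caches[pick_server(k, OLD_LIST)].get(k) == "v1"]
```

Under these assumptions, a key is stale exactly when the two server lists map it to different hosts, which would also explain why some records looked fine while others did not.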

Projects
Librarian Issues
  
Closed
2019 Epics
  
Done (or Needs Review)
7 participants