
Edit Actions don't complete #17

Closed
WardCunningham opened this issue Oct 3, 2014 · 20 comments

Comments

@WardCunningham
Member

I'm seeing edit actions silently fail. This seems to be associated with large payloads, such as the Factory plugin posting a dropped image. I believe the same root cause is in play for this report: an explicit fork of a remote page to the origin server that also silently fails. I've not seen the problem for images or forks using the ruby/sinatra server.

From the client side using Chrome's inspector the PUT hangs pending for four minutes and then eventually fails. The wiki-client code catches the failure and stores the remote page in browser local storage under the origin name. That's something, but not the desired result.

[screenshot: Chrome inspector showing the PUT pending]

From the server side the express log doesn't show a PUT for some time. Presumably the PUT is in progress until jQuery ajax times out and closes the connection. jQuery tries several times over the four minute period. None of the express logs show any status code being returned.

[screenshot: express server log]

The payload for this PUT (if I read Chrome's inspector correctly) is one url-encoded line of 24399 characters.
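
For anyone following along on the client side, the failing save is roughly this shape. This is an illustrative sketch only, not the actual wiki-client code; slug, action, and page below are placeholder values.

```javascript
// Illustrative sketch only, not the actual wiki-client code.
// slug, action, and page are placeholder values.
var slug = 'example-page';
var action = { type: 'add', id: 'abc123', item: { type: 'paragraph', text: 'example' } };
var page = { title: 'Example Page', story: [action.item], journal: [] };

$.ajax({
  type: 'PUT',
  url: '/page/' + slug + '/action',          // matches the PUT seen in the server log
  data: { action: JSON.stringify(action) },  // sent as one url-encoded line, as seen in the inspector
  success: function () {
    // the journal entry is only added once the server confirms
  },
  error: function () {
    // on failure, fall back to browser local storage under the page name
    localStorage.setItem(slug, JSON.stringify(page));
  }
});
```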

I'm running wiki with one argument, -f. Wiki -v reports:

wiki: 0.3.5
wiki-server: 0.2.1
wiki-client: 0.2.12
wiki-plugin-activity: 0.1.1
wiki-plugin-bars: 0.1.3
wiki-plugin-bytebeat: 0.1.1
wiki-plugin-calculator: 0.1.3
wiki-plugin-calendar: 0.1.2
wiki-plugin-changes: 0.1.2
wiki-plugin-chart: 0.2.1
wiki-plugin-code: 0.1.1
wiki-plugin-data: 0.1.1
wiki-plugin-efficiency: 0.1.2
wiki-plugin-factory: 0.1.1
wiki-plugin-favicon: 0.1.1
wiki-plugin-federatedwiki: 0.1.1
wiki-plugin-force: 0.1.1
wiki-plugin-future: 0.1.1
wiki-plugin-html: 0.0.2
wiki-plugin-image: 0.1.1
wiki-plugin-line: 0.1.2
wiki-plugin-linkmap: 0.1.2
wiki-plugin-logwatch: 0.1.1
wiki-plugin-map: 0.1.3
wiki-plugin-mathjax: 0.1.1
wiki-plugin-metabolism: 0.1.1
wiki-plugin-method: 0.1.5
wiki-plugin-pagefold: 0.1.1
wiki-plugin-paragraph: 0.1.1
wiki-plugin-parse: 0.1.1
wiki-plugin-pushpin: 0.1.1
wiki-plugin-radar: 0.1.1
wiki-plugin-reduce: 0.1.1
wiki-plugin-reference: 0.1.1
wiki-plugin-report: 0.1.1
wiki-plugin-rollup: 0.1.1
wiki-plugin-scatter: 0.1.2
wiki-plugin-twadio: 0.1.2
wiki-plugin-txtzyme: 0.1.4
wiki-plugin-video: 0.1.1

The server is running on CentOS release 6.5 (Final). I've seen similar behavior running on Mac OS.

This would seem to be some protocol confusion between jQuery and Node.js, which seems unlikely, so I am unsure of how to continue debugging. Suggestions welcome. Pull requests would be delightful.

@paul90
Member

paul90 commented Oct 3, 2014

I've not been able to recreate this, at least with images that are quite a bit bigger. I have noticed that posting a very large image (3.2 MB) does get a server error (that is not displayed in the browser), and interestingly the image is still displayed on the page in the browser, but not saved locally. So it looks as if the client error handling is probably lacking.

Is it possible to put some steps together to recreate this error? Maybe time to think about generative testing?

@WardCunningham
Member Author

I've only seen this at work, not in my normal development environment which includes many versions of both servers on multiple platforms. When it happens, it appears to be repeatable.

It is happening with my CentOS server here at work. I've tried Chrome, Safari and Firefox from my work laptop and Chrome from my personal (development) laptop. All fail with the same symptoms.

@michaelarthurcaulfield

What I've seen with images (and also with students in my class) is that images larger than 15k fail to upload (but display on the page as if they had, until reload). I can actually replicate this, showing how a 14K version of an image loads fine while the 16K original fails.

This is with my Amazon small (not micro) server, Ubuntu/NodeJS, across Mac, PC, Firefox, Chrome. My students rate it as one of the most frustrating aspects of fedwiki (mainly because they think the images are uploaded, then come to class and find they are not).

I'm shocked to see paul90 is getting large images up.

@michaelarthurcaulfield

Actually, I might be wrong; here is a page failing at all sizes (sorry, no log):

http://screencast.com/t/mnDerXOr3xn

Here are two pages which, when refreshed, load a local version:

http://screencast.com/t/foNnPtUuqjx

When local changes are cleared there ends up being no image uploaded to the server.

@michaelarthurcaulfield

Wait, here we go: a 9k image uploads, a 14k image does not.

http://screencast.com/t/cabcazfm61I

@paul90
Member

paul90 commented Oct 4, 2014

Thank you @michaelarthurcaulfield

The large files working is almost certainly down to using a local test server - smaller files that work locally do not on a remote site (on OpenShift). After about 2 minutes the remote site returns a 502 'Bad Gateway' error, and the content gets saved to local storage. Using tail -f to look at the server logs, the log entry for the request appears at the instant that the 502 is returned by the proxy timing out.

I think the difference in time before the fail is probably due to a reverse proxy configuration. Files of this size should not be taking so long to upload, so I suspect there is either an untrapped error somewhere, or...

@paul90
Member

paul90 commented Oct 4, 2014

Have added some logging to the code on my OpenShift test server - it logs a message at a few pertinent points, starting at the beginning of the put processing.

While these messages appear when doing a normal action, they don't appear when either adding a failing image or adding a lot of text to a paragraph. Though after a while PUT /page/t17/action 200 120078ms appears in the server log - and repeats if I don't navigate away from the page.

This seems most strange 😕
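
For context, the logging is of this general shape - a rough sketch of the idea, not the actual wiki-server code, with the handler body elided:

```javascript
// Rough sketch of logging at the start of the put processing; not the actual wiki-server code.
var express = require('express');
var app = express();

app.put('/page/:page/action', function (req, res) {
  console.log('put action started for', req.params.page, 'at', new Date().toISOString());
  // ... the existing action handling would continue here ...
  res.send('ok');  // placeholder response so the sketch is complete
});
```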

@WardCunningham
Member Author

This issue is both serious and confounding. Serious because of work loss. Confounding because it does not appear to be a direct consequence of our logic.

We overlap the screen update with the server update. Without this overlap editing would be sluggish. Regrettably, this leads to the misleading behavior that is so frustrating. The journal does not update until the server operation completes. This provides a very small indication that the server (or network) is not responding.

One might note that images could be stored more compactly but that is not a solution to this problem. We do have an ambition to handle images smartly where only a thumbnail would be included in the action and high-resolution data would be queued for background upload to an asset manager.

@WardCunningham
Member Author

@paul90 I wonder if you can duplicate the error running your new version of Express on OpenShift?

Also, I have been googling for hints and have found some.

"process.on('uncaughtException', ...) can make the a node app hang. I think one of the modules I've been using had a debug mode that enabled that event." post

"You're correct - express's body parsing turns the request stream into a regular object. The solution is either to buffer it, as I was suggesting, or to deal with the request before the parsing, as you ended up doing. Glad you figured it out!" post

@paul90
Member

paul90 commented Oct 4, 2014

It is not just images, large blocks of text can cause the same problem.

I suspect that the original error, and the one I spotted above with the 3.2MB image, are different problems. I suspect that the large image causes a problem with breaking the local storage size limit...

Trying Express 4 is an idea. I have noticed some mention of limits on bodyParser, though I don't think we are near them, at least with the non-huge files that fail.

Given that there are files that upload when using a local server, and fail with a remote one, there is also the possibility that whatever is sitting in front of express is causing problems.

@WardCunningham
Member Author

This screencast (from above) shows the failing behavior well. Notice that the edit action (pencil) shows up in the journal immediately when he drops the small file, but not with the large file.

http://screencast.com/t/cabcazfm61I

@paul90
Member

paul90 commented Oct 4, 2014

@WardCunningham good idea to try the Express 4 version. It works 😃

I would suggest comparing the Express 4 with the Express 3 - but the image only uploads to the former. This is a 35 KB image.

Still not sure why the Express 3 version works with this file using a local server (on OS/X) and doesn't on other platforms. Guess some platform differences, or timing, or some combination.

@michaelarthurcaulfield

I updated everything last night (node, fedwiki) and now I'm getting error-free uploads on all platforms (Win/Chrome, Win/Firefox, ChromeOS/Chrome, Ubuntu/Firefox). Having the ability to upload pics is nice! I'll try the same on the college server; if it works then my image upload problems are resolved.

Related discovery while testing: the maximum size of a page that can be forked seems to be platform-specific, e.g. this 2MB page

http://journal14.hapgood.net:3000/view/welcome-visitors/view/recent-changes/view/tl-521-rocking-it

Can't be forked without a "Request too large" error in Win/Firefox or Win/Chrome but Ubuntu/Firefox has no problem with it. In general I don't think we should be building 2MB pages (this was a test), but it was interesting to see.

@WardCunningham
Member Author

@michaelarthurcaulfield, I assume you are not yet using @paul90's express-4 branch, or are you?

@paul90
Member

paul90 commented Oct 19, 2014

In general I don't think we should be building 2MB pages (this was a test)...

What was the upload limit with Express 3?

In Express 4 the default limit is only 100kb; currently the code overrides this to raise it to 1024kb.

Not sure what is a sensible value to use.
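
For reference, the override is the limit option passed to the body-parsing middleware, roughly along these lines (a sketch; the exact options used in wiki-server may differ):

```javascript
// Sketch of raising the Express 4 body size limit; exact wiki-server options may differ.
var express = require('express');
var bodyParser = require('body-parser');

var app = express();
app.use(bodyParser.json({ limit: '1024kb' }));                       // default would be 100kb
app.use(bodyParser.urlencoded({ extended: true, limit: '1024kb' })); // actions arrive url-encoded
```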

@WardCunningham
Member Author

Would 10MB be out of the question?

I don't know why we would want a lower limit unless large uploads block async activity in the server. Could this be so?

Perhaps a tight limit is there to prevent abuse on publicly writeable sites. I'm less concerned about this since only the owner writes on a fedwiki site. (Of course farms that indiscriminately create new sites are fundamentally open to all kinds of abuse.)

I've suggested alternative asset handling for images that includes background upload via websockets.
http://ward.fed.wiki.org/view/image-assets

A background strategy would be more complex for forked pages since one is likely to start editing immediately after forking.

@michaelarthurcaulfield

What a sensible limit is depends on how we use non-text assets. After saying we shouldn't generally build 2MB pages, it occurred to me that you might want 2MB of data on the page as a CSV data source. Also there are certainly image outputs in science that need to be lossless and will therefore be big, e.g. histology slides (https://en.wikipedia.org/wiki/Histology) -- although an asset manager could assist.

Forking images along with pages is one of the super smooth things I point to when showing fedwiki to people, because the process we go through now to copy something from one blog to another (copy text, download all images, upload all images, manually tweak img src tags to point to new images) is ridiculous. And if you don't do that crazy process, image rot is a huge problem. I'd love to preserve forking with full images for the average page. Still, 2MB would likely be an exception.

@michaelarthurcaulfield

@WardCunningham Express4 -- no, just the last npm update. I will check the node update I did to make sure I didn't screw anything up.

@WardCunningham
Member Author

Re: Sensible Limits

There is no limit which we won't eventually want to exceed. Certainly our current approach to images makes this too frequent an occurrence. We have discussed image alternatives and should act on at least some of them soon. However, for the really huge datasets, we are going to have to look outside of wiki.

I'll call these tough cases "stationary resources". We've mocked up or even prototyped several of these immobile resources that properly deviate from our preferred sharing semantics. I've written a page with pointers to this work. I'm now thinking that learning to live within limits is an important next step for us.

http://ward.fed.wiki.org/stationary-resources.html

@paul90
Member

paul90 commented May 12, 2020

Closing this issue. It was most likely caused by the image squeeze not working; this was fixed in the latest client update, v0.20.2.

@paul90 paul90 closed this as completed May 12, 2020