drive: intermittent crashes with: stream error: stream ID ; INTERNAL_ERROR #3631

Open
ncw opened this issue Oct 16, 2019 · 9 comments

Comments

@ncw
Collaborator

@ncw ncw commented Oct 16, 2019

Issues were reported in the forum with v1.49.4.

The issues looked like this:

2019/09/30 13:36:34 INFO  : Google drive root 'crypt': Failed to get StartPageToken: Get https://www.googleapis.com/drive/v3/changes/startPageToken?alt=json&prettyPrint=false&supportsAllDrives=true: stream error: stream ID 1923; INTERNAL_ERROR
2019/09/30 13:42:04 INFO  : Google drive root 'crypt': Failed to get StartPageToken: Get https://www.googleapis.com/drive/v3/changes/startPageToken?alt=json&prettyPrint=false&supportsAllDrives=true: stream error: stream ID 1929; INTERNAL_ERROR
2019/10/02 19:36:29 ERROR : IO error: open file failed: Get https://www.googleapis.com/drive/v3/files/XXX?alt=media: stream error: stream ID 1217; INTERNAL_ERROR

This is an internal error from the Go standard library HTTP/2 code.

v1.49.4 was accidentally compiled with go1.13, which enabled HTTP/2 by default for Google Drive for reasons that are not clear from the changelog.

Rolling the compiler back to go1.12 with no other code changes fixes the problem; that build was released as v1.49.5.

A workaround that attempted to make these errors retriable was put in f9f9d50, but it did not seem effective.
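To illustrate what "making these errors retriable" could look like, here is a minimal sketch. It is not the code from f9f9d50: the helper names (`isHTTP2InternalError`, `retryOnStreamError`) are hypothetical, and matching on the error text is an assumption chosen because the standard library's bundled HTTP/2 error types are not exported. The sample error is copied from the log above.

```go
package main

import (
	"errors"
	"strings"
)

// errSample reproduces one of the errors quoted in this issue.
var errSample = errors.New("Get https://www.googleapis.com/drive/v3/files/XXX?alt=media: stream error: stream ID 1217; INTERNAL_ERROR")

// isHTTP2InternalError reports whether err looks like the HTTP/2 stream
// error seen in the logs. It matches on the message text (an assumption,
// not necessarily how rclone classifies the error).
func isHTTP2InternalError(err error) bool {
	if err == nil {
		return false
	}
	msg := err.Error()
	return strings.Contains(msg, "stream error:") && strings.Contains(msg, "INTERNAL_ERROR")
}

// retryOnStreamError calls op up to maxTries times (maxTries should be >= 1),
// retrying only while op fails with the HTTP/2 stream error above.
// It returns the number of attempts made and the last error.
func retryOnStreamError(maxTries int, op func() error) (int, error) {
	var err error
	for try := 1; try <= maxTries; try++ {
		err = op()
		if err == nil || !isHTTP2InternalError(err) {
			return try, err
		}
	}
	return maxTries, err
}
```

A caller would wrap a single request in `retryOnStreamError(10, ...)`. Any other error is returned immediately, so only this specific failure is retried.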

Problem statement

  • go1.12 runs reliably with google drive over HTTP/1
  • go1.13 runs unreliably with google drive over HTTP/2
@ncw ncw added this to the v1.50 milestone Oct 16, 2019
ncw added a commit that referenced this issue Oct 16, 2019
Before this change when rclone was compiled with go1.13 it used HTTP/2
to contact drive by default.

This causes lockups and INTERNAL_ERRORs from the HTTP/2 code.

This workaround disables HTTP/2 for drive by default, controlled by an option.

It can be re-enabled with `--drive-disable-http2=false`

See #3631
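For readers unfamiliar with how a Go client can opt out of HTTP/2, the sketch below shows the mechanism documented in `net/http`: setting `Transport.TLSNextProto` to a non-nil, empty map. This is not the commit's actual code; `newDriveTransport` and `http2Disabled` are hypothetical names, and how rclone wires the option into its transport is an assumption.

```go
package main

import (
	"crypto/tls"
	"net/http"
)

// newDriveTransport is a hypothetical helper that builds a transport and
// optionally disables HTTP/2, mirroring the idea of --drive-disable-http2.
func newDriveTransport(disableHTTP2 bool) *http.Transport {
	t := &http.Transport{Proxy: http.ProxyFromEnvironment}
	if disableHTTP2 {
		// A non-nil, empty TLSNextProto map is the documented way to stop
		// net/http from negotiating HTTP/2 on this transport.
		t.TLSNextProto = map[string]func(authority string, c *tls.Conn) http.RoundTripper{}
	}
	return t
}

// http2Disabled reports whether the transport was configured with the
// empty-map opt-out shown above.
func http2Disabled(t *http.Transport) bool {
	return t.TLSNextProto != nil && len(t.TLSNextProto) == 0
}
```

A client built as `&http.Client{Transport: newDriveTransport(true)}` should then talk HTTP/1.1 to the server, while `newDriveTransport(false)` leaves the standard library's default protocol negotiation in place.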
@ivandeex

Collaborator

@ivandeex ivandeex commented Oct 17, 2019

It could be a good idea to add a drive test involving a simple upload/download scenario with --drive-disable-http2=false (or an internal test with a similar setting), ignore it in config.yaml, and collect long-term stats.

@ncw

Collaborator Author

@ncw ncw commented Oct 17, 2019

> It could be a good idea to add a drive test involving a simple upload/download scenario with --drive-disable-http2=false (or an internal test with a similar setting), ignore it in config.yaml, and collect long-term stats.

I think setting up a reproduction test is a good idea... I tried with a simple mount which did directory listings, but I think it needs more than that. I haven't had time to set up something more complex yet.

@ivandeex

Collaborator

@ivandeex ivandeex commented Oct 21, 2019

Just FYI:
A number of CVEs were filed against HTTP/2 recently. Large providers like Google are no doubt refreshing their HTTP/2 stacks. https://blog.cloudflare.com/on-the-recent-http-2-dos-attacks/

@thestigma


@thestigma thestigma commented Oct 26, 2019

Interesting. So this whole thing could simply be a temporary issue outside of rclone then.

I will be re-enabling HTTP/2 for my normal use and seeing how it goes, to check whether it has stabilized, etc.
If I see more stalling I will make a note of it here.
(I will be testing from version 1.50.)

@ncw

Collaborator Author

@ncw ncw commented Oct 26, 2019

> Interesting. So this whole thing could simply be a temporary issue outside of rclone then.

It could be, yes.

There is probably a bug in the HTTP/2 code in the go standard library which is being triggered by a bug in the google drive HTTP/2 code - that is my guess!

> I will be re-enabling HTTP/2 for my normal use and seeing how it goes, to check whether it has stabilized, etc.
> If I see more stalling I will make a note of it here.
> (I will be testing from version 1.50.)

Great - thanks.

@ncw ncw modified the milestones: v1.50, v1.51 Oct 26, 2019
@thestigma


@thestigma thestigma commented Nov 1, 2019

While using --drive-disable-http2=false

I got this today:
ReadFileHandle.Read error: low level retry 1/10: stream error: stream ID 1345; INTERNAL_ERROR

It did not result in a full stall, however; it seems to have managed to recover.

I will set
--drive-disable-http2=true
and continue monitoring.

@ncw

Collaborator Author

@ncw ncw commented Nov 2, 2019

@thestigma I just uploaded v1.50.1 which was compiled with go1.13.4. This has an HTTP/2 fix which could plausibly be a fix for this issue.

@thestigma


@thestigma thestigma commented Nov 2, 2019

Cool!
I will instead continue testing with
--drive-disable-http2=false
on version 1.50.1

I assume that flag is still "true" by default for the time being - until we can more clearly verify it works again?

@ncw

Collaborator Author

@ncw ncw commented Nov 5, 2019

> Cool!
> I will instead continue testing with
> --drive-disable-http2=false
> on version 1.50.1

Thank you.

> I assume that flag is still "true" by default for the time being - until we can more clearly verify it works again?

I won't change the default until we are sure it is working properly.

3 participants