Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make stdcopy.StdWriter thread safe. #20706

Merged
merged 1 commit into from Feb 28, 2016

Conversation

calavera
Copy link
Contributor

Stop using global variables as prefixes to inject the writer header.
That can cause issues when two writers set the length of the buffer in
the same header concurrently.

Fixes #19950 (~~maybe?~~totally 🤘)

Signed-off-by: David Calavera david.calavera@gmail.com

@cpuguy83
Copy link
Member

Not fixed :(

@calavera
Copy link
Contributor Author

@cpuguy83 are you using @dpiddy's script to test it? I don't get that error anymore after applying this patch, but I might be missing something.

@cpuguy83
Copy link
Member

Yes I am. It took a couple of runs but I did get it... also getting a panic from writing to the buffer (slice out of bounds).

@calavera
Copy link
Contributor Author

yeah, I saw the panic, but I think it's a completely different issue, caused by using a bounded buffer.

@calavera
Copy link
Contributor Author

@cpuguy83 can you try with this other script? bytes.Buffer is not thread safe on its own: https://gist.github.com/calavera/7d52bbfd677f898f8145

Please, make sure you're importing the right package and not a precompiled one without this patch.

@calavera calavera changed the title Make stdcopy.StdWriter concurrency safe. Make stdcopy.StdWriter thread safe. Feb 26, 2016
@danp
Copy link

danp commented Feb 26, 2016

@calavera your script has a bug, it is calling Write on the buffer in Read

@danp
Copy link

danp commented Feb 26, 2016

Using your branch I still get things like:

time="2016-02-26T14:58:54-04:00" level=debug msg="framesize: 4" 
out
time="2016-02-26T14:58:54-04:00" level=debug msg="framesize: 4" 
out
time="2016-02-26T14:58:54-04:00" level=debug msg="framesize: 4" 
err
time="2016-02-26T14:58:54-04:00" level=debug msg="framesize: 4" 
time="2016-02-26T14:58:54-04:00" level=debug msg="framesize: 1701999114" 
time="2016-02-26T14:58:54-04:00" level=debug msg="Extending buffer cap by 1701966346 (was 32777)" 
time="2016-02-26T14:58:56-04:00" level=debug msg="Corrupted frame: [111 117 116 10 1 0 0 0 0 0 0 4 111 117 116 ...]"

Running with GOMAXPROCS=1 seems to make it work.

@calavera
Copy link
Contributor Author

Thanks @dpiddy it should be fixed now.

I wonder what's different in my environment, I don't get any error and I don't have GOMAXPROCS set :/

➜  stdtests  echo $GOMAXPROCS

➜  stdtests
``

@danp
Copy link

danp commented Feb 26, 2016

I had to add this back to your script to get the debug output:

logrus.StandardLogger().Level = logrus.DebugLevel

Can you try that?

@calavera
Copy link
Contributor Author

okay, I finally got that corruption error.

@danp
Copy link

danp commented Feb 26, 2016

Cool. I think it would be good to eventually fill out the focused test script a bit to include a single net.Conn for stdout & stderr. Much like you needed to lock reads and writes on the buffer I think something similar will need to be done when using a single net.Conn.

@calavera
Copy link
Contributor Author

I still think this change fixes the original input header issue, fwiw 😛

binary.BigEndian.PutUint32(w.prefix[4:], uint32(len(buf)))
n1, err = w.Writer.Write(w.prefix[:])

header := [8]byte{0: w.prefix}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be [stdWriterPrefixLen]byte{...} instead of [8]byte{...}?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe same for 0 -> stdWriterFdIndex?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, there are a few hardcoded values that we might be able to replace by constants.

@danp
Copy link

danp commented Feb 26, 2016

Ah, looking over it now I see that stdcopy's Write calls Write twice, once for the header and once for the frame. Part of the bigger issue? Two stdcopy writers could interleave their header and frame writes to the same underlying writer.

@calavera
Copy link
Contributor Author

I'm trying to change that.

@calavera calavera force-pushed the remove_concurrent_access_to_stdtypes branch from c16a782 to 987891a Compare February 26, 2016 21:23
@calavera
Copy link
Contributor Author

@dpiddy, @cpuguy83 I made some changes, do you mind take a look?

@danp
Copy link

danp commented Feb 26, 2016

I think that's done it! I was able to run this new test script which uses a tls.Conn for over a minute without weirdness.

@LK4D4
Copy link
Contributor

LK4D4 commented Feb 26, 2016

LGTM

Stop using global variables as prefixes to inject the writer header.
That can cause issues when two writers set the length of the buffer in
the same header concurrently.

Stop Writing to the internal buffer twice for each write. This could
mess up with the ordering information is written.

Signed-off-by: David Calavera <david.calavera@gmail.com>
@calavera calavera force-pushed the remove_concurrent_access_to_stdtypes branch from 987891a to 443a5c2 Compare February 26, 2016 21:52
@calavera
Copy link
Contributor Author

🙌 thanks for checking @dpiddy 🎉

@cpuguy83
Copy link
Member

LGTM

@icecrime
Copy link
Contributor

retest this

@icecrime
Copy link
Contributor

retest that

@icecrime
Copy link
Contributor

please retest this

@icecrime
Copy link
Contributor

test this please

@icecrime
Copy link
Contributor

Great process, ⭐⭐⭐⭐⭐, would test again.

@cpuguy83
Copy link
Member

Maybe sudo? :)

@icecrime
Copy link
Contributor

Aaaaaaaand it's failing. Halp @SvenDowideit!

@calavera
Copy link
Contributor Author

test this please 🙏

@cpuguy83 cpuguy83 added this to the 1.10.3 milestone Feb 27, 2016
@runcom
Copy link
Member

runcom commented Feb 27, 2016

test this please

@cpuguy83
Copy link
Member

Test is a known failure on win2lin.
Merging.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

"Unrecognized input header" interrupting docker client
7 participants