Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/syncthing: Add selectable sha256 package (fixes #3613, fixes #3614) #3617

Closed
wants to merge 5 commits into from
Closed

Conversation

calmh
Copy link
Member

@calmh calmh commented Sep 22, 2016

Purpose

This adds autodetection of the fastest hashing library on startup, thus handling the performance regression. It also adds an environment variable to control the selection, STHASHING=standard (Go standard library version, avoids SIGILL crash when the minio library has bugs on odd CPUs), STHASHING=minio (to force using the minio version) or unset for the default autodetection.

Testing

Works for me...

This adds autodetection of the fastest hashing library on startup, thus
handling the performance regression. It also adds an environment
variable to control the selection, STHASHING=standard (Go standard
library version, avoids SIGILL crash when the minio library has bugs on
odd CPUs), STHASHING=minio (to force using the minio version) or unset
for the default autodetection.
@AudriusButkevicius
Copy link
Member

We could catch the sigill in monitor, and restart with alternative implementation and potentially set it in stone/config.

@AudriusButkevicius
Copy link
Member

AudriusButkevicius commented Sep 22, 2016

Also, because we are benching both implementations, perhaps it would be nice to print both hashing package performances?

@calmh
Copy link
Member Author

calmh commented Sep 22, 2016

That's a good idea, didn't think of that, will look into it!

@scienmind
Copy link
Contributor

Regarding the startup hash bench:

The bench itself is quite short, and as noted on the forum discussion about the new hash algo. is is short enough to not bump up the CPU up to its full speed (on relatively powerful CPUs).

I tested it on few laptops - and unless either the PC is already under load or its on "Crazy Super Performance Power Plan", the clock speed won't bump up to max during the hash test, degrading the results. Ranging from 50MB/s (starting ST when PC is fully idle) up to 300+MB/s (when some CPU core is already busy before launching ST).

The point is - it may lead to wrong hash algorithm selection, giving an advantage to whichever one will run last. So it may be a good idea to either make hash test longer, or do few iterations of hash benchs, alternating the algorithm used.

@calmh
Copy link
Member Author

calmh commented Sep 22, 2016

Good point. It runs three iterations, but first one algo and then the other, and only for 75ms each. They should be interleaved.

@canton7
Copy link
Member

canton7 commented Sep 22, 2016

How about running a longer test on first start, then saving the results? You'd need a good strategy for catching people changing their CPU, although I don't imagine that'd be a common occurrence.

@scienmind
Copy link
Contributor

scienmind commented Sep 22, 2016

If by first start you mean "the first", when the ID and config are born, it's not optimal:
If anyone (I do :) runs a portable version, they'd be effectively changing CPUs...

(one of my ST instances is a a portable SyncTrayzor on a portable HDD that travels across bunch of PCs...)

@calmh
Copy link
Member Author

calmh commented Sep 22, 2016

It doesn't need to be that exhaustive, nobody dies if we get it wrong. A few hundred milliseconds at startup should be sufficient if done correctly.

@calmh
Copy link
Member Author

calmh commented Sep 23, 2016

Mmkay, so this now does the relevant benchmarking and reporting:

jb@unu:~/s/g/s/s/test $ STHASHING=standard syncthing -home h1 | grep perfo
[I6KAH] 17:58:19 INFO: Single thread hash performance is 346 MB/s using crypto/sha256 (0.00 MB/s using minio/sha256-simd).
^C
jb@unu:~/s/g/s/s/test $ STHASHING=minio syncthing -home h1 | grep perfo
[I6KAH] 17:58:28 INFO: Single thread hash performance is 357 MB/s using minio/sha256-simd (351 MB/s using crypto/sha256).
^C
jb@unu:~/s/g/s/s/test $ syncthing -home h1 | grep perfo
[I6KAH] 17:58:37 INFO: Single thread hash performance is 362 MB/s using minio/sha256-simd (355 MB/s using crypto/sha256).
^C

It also detects and acts on SIGILL, but doesn't do anything permanently about it. Also doesn't work when not running with the monitor process, but hey...

jb@unu:~/s/g/s/syncthing $ syncthing -home h1 
[monitor] 18:09:27 INFO: Starting syncthing
[H73EW] 18:09:27 INFO: syncthing v0.14.7+4-ge2cacc3-dirty-sha256 "Dysprosium Dragonfly" (go1.7.1 darwin-amd64) jb@unu.kastelo.net 2016-09-23 15:59:14 UTC
[H73EW] 18:09:27 INFO: My ID: H73EWZC-BE7F57I-2KBRW7K-YZ56BFF-GREHSBC-65GOKDE-N3Q5YH7-65N5YAK
SIGILL: illegal instruction
[monitor] 18:09:27 WARNING: 
*******************************************************************************
* Crash due to illegal instruction detected. This is most likely due to a CPU *
* incompatibility with the high performance hashing package. Switching to the *
* standard hashing package instead. Please report this issue at:              *
*                                                                             *
*   https://github.com/syncthing/syncthing/issues                             *
*                                                                             *
* Include the details of your CPU.                                            *
*******************************************************************************

[monitor] 18:09:27 INFO: Syncthing exited: exit status 2
[monitor] 18:09:28 INFO: Starting syncthing
[H73EW] 18:09:28 INFO: syncthing v0.14.7+4-ge2cacc3-dirty-sha256 "Dysprosium Dragonfly" (go1.7.1 darwin-amd64) jb@unu.kastelo.net 2016-09-23 15:59:14 UTC
[H73EW] 18:09:28 INFO: My ID: H73EWZC-BE7F57I-2KBRW7K-YZ56BFF-GREHSBC-65GOKDE-N3Q5YH7-65N5YAK
[H73EW] 18:09:28 INFO: Single thread hash performance is 339 MB/s using crypto/sha256 (0.00 MB/s using minio/sha256-simd).
[H73EW] 18:09:28 INFO: Starting deadlock detector with 20m0s timeout
[H73EW] 18:09:28 INFO: Ready to synchronize czxwh-ckkfq (readwrite)

Tested the SIGILL handling by changing some random bytes in the sha256 package assembly.

@AudriusButkevicius
Copy link
Member

AudriusButkevicius commented Sep 23, 2016

So does it switch to standard as it crashes? Asking as I see a return statement there.

@calmh
Copy link
Member Author

calmh commented Sep 23, 2016

Yes. The stdout reader sets the environment variable and returns (instead of continuing to process the rest of the sigill crash). The main loop in the monitor waits for the stdout and stderr readers to complete, then attempts another launch, at which point the STHASHING=standard variable takes effect and the start up works.

So it'll crash once on every monitor startup, with the message pointing to github. For those who notice we may get the bug report, for those who don't it should mostly just work. For those who run it under a service with -no-restart they'll come here to complain and we'll have to accept the bug report with CPU details and point towards setting STHASHING=standard.

@AudriusButkevicius
Copy link
Member

@st-review merge

@st-review
Copy link

👌 Merged as d328e0f. Thanks, @calmh!

@st-review st-review closed this Sep 23, 2016
st-review pushed a commit that referenced this pull request Sep 23, 2016
This adds autodetection of the fastest hashing library on startup, thus
handling the performance regression. It also adds an environment
variable to control the selection, STHASHING=standard (Go standard
library version, avoids SIGILL crash when the minio library has bugs on
odd CPUs), STHASHING=minio (to force using the minio version) or unset
for the default autodetection.

GitHub-Pull-Request: #3617
@calmh calmh deleted the sha256 branch September 24, 2016 07:02
@st-review st-review added the frozen-due-to-age Issues closed and untouched for a long time, together with being locked for discussion label Nov 15, 2017
@syncthing syncthing locked and limited conversation to collaborators Nov 15, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
frozen-due-to-age Issues closed and untouched for a long time, together with being locked for discussion
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants