Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

/GS for release Windows builds #18

Closed
lnicola opened this issue Mar 18, 2017 · 16 comments
Closed

/GS for release Windows builds #18

lnicola opened this issue Mar 18, 2017 · 16 comments

Comments

@lnicola
Copy link

lnicola commented Mar 18, 2017

The Windows makefile enables /GS (buffer security checks) for release builds. I won't argue in favor of security over performance or the other way around, but is this intentional or should it be /GS-?

@lemire
Copy link
Owner

lemire commented Mar 18, 2017

Please submit a pull request if you think you may improve things. I would not want buffer security checks myself.

Note that as of Visual Studio 2017, the correct approach would be to provide a CMake file.

@lnicola
Copy link
Author

lnicola commented Mar 18, 2017

I haven't checked the generated code, but I'll send a pull request if /GS- was intended.

It's nice that VS 2017 has CMake support, although I haven't checked it.

@PeterBocan
Copy link

Why wouldn't you have buffer security check? It's a good way to know, whether your stack is corrupted, and if so, it will shut down the program.

@lnicola
Copy link
Author

lnicola commented Mar 20, 2017

@PeterBocan It doesn't look like simdcomp uses too may stack buffers. As such, it's probably the caller's job to check the stack cookies.

I submitted this because I was looking into integrating simdcomp with another build system and I wanted to replicate the Makefile settings as a starting point. /GS caught my eye, as the rest of the compiler options seem geared towards aggressive optimizations.

@PeterBocan
Copy link

Ah, I see. It doesn't have to be a only stack buffer overflow, but any stack frame manipulation.

The purpose of /GS is pushing a magic value onto stack and once you try to return from that function, that given value is checked, and if the mismatch occurs, then it is shut down by the processor.

It's a (another) good layer of security measure against executing malicious code on the stack. I think it's worth to have enabled it.

@lnicola
Copy link
Author

lnicola commented Mar 20, 2017

Sure, but look at the code: https://github.com/lemire/simdcomp/blob/master/src/simdfor.c

I don't think there's much potential for stack overflows there. For comparison, here are the things /GS applies to: https://msdn.microsoft.com/en-us/library/8dbf701c.aspx#Anchor_3

@weltling
Copy link
Collaborator

weltling commented Mar 20, 2017

@lnicola probably a more convenient way were to introduce a config flag, like passing DISABLE_SECURITY_FLAGS=1 to makefile. Then you'd be easy to integrate the build elsewhere. Passing security flags is a common good practice, especially on Windows, you know. Indeed, some security cflags are even missing, that are available in newer VS versions.

@lemire maybe a step further could be to even move to cmake completely? Probably no big diff for autotools, as there are no deps to check. But we could fe automate dso name creation, etc. Otherwise, vs2017 is fine as editor by simply moving the folder into the project pane. Otherwise, as a vim user, i'd probably not care much :)

Thanks.

@lnicola
Copy link
Author

lnicola commented Mar 20, 2017

Okay, so I checked. As I suspected, /GS doesn't matter much here:

File /GS size /GS- size Difference
avxbitpacking.o 3 758 3 754 -4
simdbitpacking.o 669 183 669 175 -8
simdcomputil.o 15 685 15 679 -6
simdfor.o 618 379 618 371 -8
simdintegratedbitpacking.o 1 511 188 1 511 184 -4
simdpackedsearch.o 3 764 3 760 -4
simdpackedselect.o 3 764 3 760 -4

4 bytes of difference can be explained by the -GS<NUL> string showing up in the object files. I don't know where the other 2-4 bytes come from, but I think this is a moot point.

@lemire
Copy link
Owner

lemire commented Mar 20, 2017

@weltling

maybe a step further could be to even move to cmake completely?

Yes. But given that we already have something that works well, it is not urgent.

@lemire
Copy link
Owner

lemire commented Mar 20, 2017

@lnicola

Any impact on the performance?

@lnicola
Copy link
Author

lnicola commented Mar 20, 2017

@lemire I haven't tried, but I suspect it makes no difference to the generated code.

@lemire
Copy link
Owner

lemire commented Mar 20, 2017

@lnicola So you recommend closing this issue, then?

@PeterBocan
Copy link

I am for closing that. The performance hit is negligible. As I said, it's only a few assembly instructions.

@lnicola
Copy link
Author

lnicola commented Mar 20, 2017

Please disregard my comment about the object size, it's not valid since I was using /GL. I'll post as soon as I'm able to build the benchmark.

@lnicola
Copy link
Author

lnicola commented Mar 20, 2017

To be honest, the run to run variation seemed far too large for these numbers to mean anything. In any case, I'm fine with closing this.

/GS
benchmarking search
bit width = 0, fast search function time = 21, naive time = 39 , fast with length time = 7
bit width = 1, fast search function time = 67, naive time = 127 , fast with length time = 73
bit width = 2, fast search function time = 70, naive time = 85 , fast with length time = 106
bit width = 3, fast search function time = 54, naive time = 174 , fast with length time = 74
bit width = 4, fast search function time = 88, naive time = 160 , fast with length time = 74
bit width = 5, fast search function time = 74, naive time = 207 , fast with length time = 62
bit width = 6, fast search function time = 57, naive time = 101 , fast with length time = 62
bit width = 7, fast search function time = 100, naive time = 100 , fast with length time = 87
bit width = 8, fast search function time = 57, naive time = 95 , fast with length time = 82
bit width = 9, fast search function time = 80, naive time = 121 , fast with length time = 61
bit width = 10, fast search function time = 79, naive time = 140 , fast with length time = 81
bit width = 11, fast search function time = 95, naive time = 119 , fast with length time = 81
bit width = 12, fast search function time = 55, naive time = 123 , fast with length time = 61
bit width = 13, fast search function time = 55, naive time = 124 , fast with length time = 80
bit width = 14, fast search function time = 97, naive time = 134 , fast with length time = 60
bit width = 15, fast search function time = 75, naive time = 119 , fast with length time = 78
bit width = 16, fast search function time = 88, naive time = 113 , fast with length time = 59
bit width = 17, fast search function time = 121, naive time = 141 , fast with length time = 81
bit width = 18, fast search function time = 56, naive time = 102 , fast with length time = 101
bit width = 19, fast search function time = 99, naive time = 124 , fast with length time = 78
bit width = 20, fast search function time = 118, naive time = 143 , fast with length time = 105
bit width = 21, fast search function time = 92, naive time = 103 , fast with length time = 93
bit width = 22, fast search function time = 82, naive time = 103 , fast with length time = 99
bit width = 23, fast search function time = 58, naive time = 106 , fast with length time = 63
bit width = 24, fast search function time = 78, naive time = 173 , fast with length time = 85
bit width = 25, fast search function time = 75, naive time = 127 , fast with length time = 61
bit width = 26, fast search function time = 75, naive time = 114 , fast with length time = 64
bit width = 27, fast search function time = 58, naive time = 127 , fast with length time = 101
bit width = 28, fast search function time = 58, naive time = 123 , fast with length time = 94
bit width = 29, fast search function time = 80, naive time = 127 , fast with length time = 116
bit width = 30, fast search function time = 81, naive time = 175 , fast with length time = 92
bit width = 31, fast search function time = 58, naive time = 124 , fast with length time = 86
bit width = 32, fast search function time = 73, naive time = 108 , fast with length time = 50
benchmarking select
bit width = 0, fast select function time = 32, naive time = 15
bit width = 1, fast select function time = 62, naive time = 44
bit width = 2, fast select function time = 83, naive time = 63
bit width = 3, fast select function time = 38, naive time = 43
bit width = 4, fast select function time = 76, naive time = 59
bit width = 5, fast select function time = 63, naive time = 62
bit width = 6, fast select function time = 92, naive time = 61
bit width = 7, fast select function time = 62, naive time = 62
bit width = 8, fast select function time = 56, naive time = 58
bit width = 9, fast select function time = 59, naive time = 79
bit width = 10, fast select function time = 57, naive time = 79
bit width = 11, fast select function time = 97, naive time = 79
bit width = 12, fast select function time = 72, naive time = 45
bit width = 13, fast select function time = 32, naive time = 49
bit width = 14, fast select function time = 94, naive time = 49
bit width = 15, fast select function time = 51, naive time = 50
bit width = 16, fast select function time = 71, naive time = 57
bit width = 17, fast select function time = 56, naive time = 49
bit width = 18, fast select function time = 60, naive time = 51
bit width = 19, fast select function time = 60, naive time = 67
bit width = 20, fast select function time = 78, naive time = 81
bit width = 21, fast select function time = 80, naive time = 82
bit width = 22, fast select function time = 79, naive time = 82
bit width = 23, fast select function time = 78, naive time = 84
bit width = 24, fast select function time = 76, naive time = 63
bit width = 25, fast select function time = 79, naive time = 68
bit width = 26, fast select function time = 80, naive time = 67
bit width = 27, fast select function time = 81, naive time = 85
bit width = 28, fast select function time = 118, naive time = 71
bit width = 29, fast select function time = 46, naive time = 54
bit width = 30, fast select function time = 64, naive time = 52
bit width = 31, fast select function time = 58, naive time = 53
bit width = 32, fast select function time = 3, naive time = 60

/GS-
benchmarking search
bit width = 0, fast search function time = 5, naive time = 46 , fast with length time = 6
bit width = 1, fast search function time = 80, naive time = 95 , fast with length time = 111
bit width = 2, fast search function time = 54, naive time = 149 , fast with length time = 141
bit width = 3, fast search function time = 81, naive time = 164 , fast with length time = 75
bit width = 4, fast search function time = 83, naive time = 107 , fast with length time = 113
bit width = 5, fast search function time = 94, naive time = 174 , fast with length time = 96
bit width = 6, fast search function time = 71, naive time = 131 , fast with length time = 77
bit width = 7, fast search function time = 99, naive time = 106 , fast with length time = 81
bit width = 8, fast search function time = 73, naive time = 143 , fast with length time = 62
bit width = 9, fast search function time = 81, naive time = 140 , fast with length time = 93
bit width = 10, fast search function time = 71, naive time = 139 , fast with length time = 100
bit width = 11, fast search function time = 109, naive time = 189 , fast with length time = 115
bit width = 12, fast search function time = 74, naive time = 138 , fast with length time = 75
bit width = 13, fast search function time = 54, naive time = 137 , fast with length time = 94
bit width = 14, fast search function time = 56, naive time = 141 , fast with length time = 99
bit width = 15, fast search function time = 81, naive time = 143 , fast with length time = 103
bit width = 16, fast search function time = 76, naive time = 117 , fast with length time = 61
bit width = 17, fast search function time = 76, naive time = 107 , fast with length time = 62
bit width = 18, fast search function time = 58, naive time = 106 , fast with length time = 62
bit width = 19, fast search function time = 58, naive time = 105 , fast with length time = 91
bit width = 20, fast search function time = 57, naive time = 124 , fast with length time = 61
bit width = 21, fast search function time = 57, naive time = 121 , fast with length time = 62
bit width = 22, fast search function time = 59, naive time = 120 , fast with length time = 60
bit width = 23, fast search function time = 58, naive time = 111 , fast with length time = 61
bit width = 24, fast search function time = 58, naive time = 105 , fast with length time = 62
bit width = 25, fast search function time = 57, naive time = 107 , fast with length time = 62
bit width = 26, fast search function time = 58, naive time = 112 , fast with length time = 61
bit width = 27, fast search function time = 58, naive time = 143 , fast with length time = 61
bit width = 28, fast search function time = 57, naive time = 110 , fast with length time = 60
bit width = 29, fast search function time = 59, naive time = 113 , fast with length time = 62
bit width = 30, fast search function time = 58, naive time = 113 , fast with length time = 60
bit width = 31, fast search function time = 58, naive time = 114 , fast with length time = 61
bit width = 32, fast search function time = 47, naive time = 92 , fast with length time = 53
benchmarking select
bit width = 0, fast select function time = 5, naive time = 16
bit width = 1, fast select function time = 41, naive time = 44
bit width = 2, fast select function time = 63, naive time = 44
bit width = 3, fast select function time = 78, naive time = 50
bit width = 4, fast select function time = 85, naive time = 46
bit width = 5, fast select function time = 59, naive time = 46
bit width = 6, fast select function time = 89, naive time = 46
bit width = 7, fast select function time = 38, naive time = 46
bit width = 8, fast select function time = 70, naive time = 43
bit width = 9, fast select function time = 57, naive time = 47
bit width = 10, fast select function time = 58, naive time = 47
bit width = 11, fast select function time = 39, naive time = 48
bit width = 12, fast select function time = 81, naive time = 46
bit width = 13, fast select function time = 78, naive time = 47
bit width = 14, fast select function time = 81, naive time = 48
bit width = 15, fast select function time = 54, naive time = 49
bit width = 16, fast select function time = 52, naive time = 42
bit width = 17, fast select function time = 35, naive time = 49
bit width = 18, fast select function time = 79, naive time = 49
bit width = 19, fast select function time = 84, naive time = 50
bit width = 20, fast select function time = 40, naive time = 49
bit width = 21, fast select function time = 87, naive time = 51
bit width = 22, fast select function time = 90, naive time = 51
bit width = 23, fast select function time = 89, naive time = 52
bit width = 24, fast select function time = 57, naive time = 48
bit width = 25, fast select function time = 69, naive time = 52
bit width = 26, fast select function time = 64, naive time = 53
bit width = 27, fast select function time = 64, naive time = 53
bit width = 28, fast select function time = 61, naive time = 91
bit width = 29, fast select function time = 94, naive time = 53
bit width = 30, fast select function time = 42, naive time = 76
bit width = 31, fast select function time = 42, naive time = 54
bit width = 32, fast select function time = 4, naive time = 37

@lemire
Copy link
Owner

lemire commented Mar 20, 2017

Closing per request of the OP.

@lemire lemire closed this as completed Mar 20, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants