Make the C API compiler-agnostic #184

Open
Snaipe opened this issue Dec 8, 2016 · 5 comments

@Snaipe
Owner

Snaipe commented Dec 8, 2016

Currently the C API only works with MSVC and GNU-compatible compilers, as it relies on compiler extensions to put test data in specific sections.

It would be interesting to remove this dependency and make the API truly C99/C++11-compliant.

This would be done by scanning the data section of the executable for a specific magic number, and then validating the contents of the structure with a checksum to gain confidence that the match is not a false positive.
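A minimal sketch of that scan, under stated assumptions: the record layout, `RECORD_MAGIC`, `record_init`, `scan_records`, and the use of FNV-1a as the checksum are all hypothetical and not part of Criterion. Locating the real data section is format-specific and left out; the scan just takes a byte range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names and layout; nothing here is Criterion's actual API. */
#define RECORD_MAGIC 0x43524954u /* arbitrary marker value */

struct record {
    uint32_t magic;             /* marks a candidate record */
    uint32_t checksum;          /* checksum of payload[] */
    unsigned char payload[16];  /* stand-in for the test metadata */
};

/* FNV-1a, standing in for whichever checksum would actually be chosen. */
static uint32_t fnv1a(const unsigned char *p, size_t n)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < n; ++i) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Fill in a record so that its magic and checksum are consistent. */
static void record_init(struct record *r, const unsigned char payload[16])
{
    r->magic = RECORD_MAGIC;
    memcpy(r->payload, payload, sizeof r->payload);
    r->checksum = fnv1a(r->payload, sizeof r->payload);
}

/* Scan [base, base + size) at every byte offset. Returns the number of
 * records whose magic and checksum both match, and stores up to `max`
 * of their offsets in `offsets`. */
static size_t scan_records(const unsigned char *base, size_t size,
                           size_t *offsets, size_t max)
{
    size_t found = 0;

    assert(base != NULL);
    for (size_t off = 0; off + sizeof(struct record) <= size; ++off) {
        struct record r;
        memcpy(&r, base + off, sizeof r); /* avoids unaligned reads */
        if (r.magic != RECORD_MAGIC)
            continue;
        if (r.checksum != fnv1a(r.payload, sizeof r.payload))
            continue;
        if (found < max)
            offsets[found] = off;
        ++found;
    }
    return found;
}
```

A candidate is accepted only when both checks pass, so a stray occurrence of the magic value alone is rejected.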

@Daniel-Abrecht

As far as I know, there is no standardized way to get the start and the end of the data section, and the change could therefore not make the API truly C99/C++11-compliant. The only way I know of to make the API truly C/C++-compliant would be to introduce a new feature into the C standard, and as far as I know the next C standard won't add any new features.

Also, I don't like the idea of searching for the data and possibly getting the wrong data, even if that is really unlikely. However, I think it would be interesting to have it as a fallback when the other methods are unavailable, if that can even be done in a compiler-independent way.

@Snaipe
Owner Author

Snaipe commented Dec 8, 2016

> As far as I know, there is no standardized way to get the start and the end of the data section, and the change could therefore not make the API truly C99/C++11-compliant.

This is actually out of the scope of the language standard -- by removing nonstandard compiler extensions from the API, it would become compliant by definition (since any C99/C++11 compiler would be able to compile user programs that use the Criterion API, which isn't true at the moment). How the data is represented inside an executable is up to the ABI, not the C and C++ standards.

It's true that there isn't one standardized way of retrieving a section; however, this comes down to the executable format used by the host platform, which as far as I know remains primarily ELF, Mach-O, or PE/COFF (the rest being either very old and unused, or never having had substantial market share).

> Also, I don't like the idea of searching for the data and possibly getting the wrong data, even if that is really unlikely.

Actually, the correct way to tackle that is by using a checksum field that validates the content of the data following it -- that way, you can eliminate false positives.

The advantage of implementing this would be to move the "portability" concerns out of the API and into the runtime: the API would effectively be 100% ISO compliant, as in any C and C++ compiler implementing the ISO C99 and ISO C++11 standards would be able to compile user tests that use Criterion.

On the other hand, we don't effectively "need" to do this: relying on MSVC and GNU extensions works, and virtually everyone uses Visual Studio or a GNU-compatible compiler (GCC, ICC, Clang, MinGW to mention the most used), so it might not be completely worth it to implement such a feature.

@Daniel-Abrecht

> > Also, I don't like the idea of searching for the data and possibly getting the wrong data, even if that is really unlikely.

> Actually, the correct way to tackle that is by using a checksum field that validates the content of the data following it -- that way, you can eliminate false positives.

No, completely eliminating false positives is impossible. Assuming the magic number is 2 bytes and the checksum is 2 bytes, there would be a theoretical chance of around 1 in (2^(4*8))/data_section_size for a false positive, with the size in bytes and one candidate position per byte. For a 1 MiB data section, that would be a 1 in ~4096 chance. With a 4-byte checksum, it would be far lower, a 1 in ~268435456 chance. Sure, such a small chance doesn't matter, but no matter how big the checksum, a small chance remains.
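The two figures quoted here correspond to counting one candidate position per byte offset of the section: 2^32 / 2^20 = 4096 and 2^48 / 2^20 = 268435456. As a rough sketch of that estimate (the function name is hypothetical, and it assumes the section contents behave like uniformly random bytes, which real data does not):

```c
#include <assert.h>
#include <stddef.h>

/* Returns N such that the estimated chance of at least one false positive
 * is roughly "1 in N", for a tag (magic + checksum) of `tag_bytes` bytes
 * searched at every byte offset of a section of `section_bytes` bytes.
 * Assumes uniformly random section contents; this is only an estimate. */
static double false_positive_odds(size_t section_bytes, unsigned tag_bytes)
{
    double patterns = 1.0;

    assert(section_bytes > 0);
    for (unsigned i = 0; i < tag_bytes; ++i)
        patterns *= 256.0; /* 2^8 possible values per tag byte */
    return patterns / (double)section_bytes;
}
```

With a 1 MiB section, a 4-byte tag gives 4096 and a 6-byte tag (2-byte magic plus 4-byte checksum) gives 268435456, matching the numbers above.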

> It's true that there isn't one standardized way of retrieving a section; however, this comes down to the executable format

But wouldn't that just make it platform-dependent instead of compiler-dependent?

@Snaipe
Owner Author

Snaipe commented Dec 9, 2016

> No, completely eliminating false positives is impossible. Assuming the magic number is 2 bytes and the checksum is 2 bytes, there would be a theoretical chance of around 1 in (2^(4*8))/data_section_size for a false positive, with the size in bytes and one candidate position per byte. For a 1 MiB data section, that would be a 1 in ~4096 chance. With a 4-byte checksum, it would be far lower, a 1 in ~268435456 chance. Sure, such a small chance doesn't matter, but no matter how big the checksum, a small chance remains.

Right. I think the chance is actually lower than that considering that the checksum is working on very small data sizes (sizeof (void *) * 4 at maximum), and we can further decrease the chance of having a false positive by including a "pointer to itself" and validating it, like malloc heaps do. We can never really have a collision-proof solution, but this might be "good enough".
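To illustrate the "pointer to itself" idea, here is a minimal sketch. The `struct entry` layout, `ENTRY_MAGIC`, `ENTRY_INIT`, and `entry_is_valid` are hypothetical names, not Criterion's: each entry stores its own address, so bytes that merely look like an entry but sit at a different address fail the check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names and layout; nothing here is Criterion's actual API. */
#define ENTRY_MAGIC 0x43524954u /* arbitrary marker value */

struct entry {
    uint32_t magic;    /* marks a candidate entry */
    const void *self;  /* must equal the entry's own address */
    const char *name;  /* stand-in for the test metadata */
};

/* Static initializer that records the entry's own address in `self`. */
#define ENTRY_INIT(var, name_) { ENTRY_MAGIC, &(var), (name_) }

/* Returns nonzero if the bytes at `candidate` look like an entry whose
 * self-pointer refers back to `candidate` itself. */
static int entry_is_valid(const unsigned char *candidate)
{
    struct entry e;

    assert(candidate != NULL);
    memcpy(&e, candidate, sizeof e); /* avoids unaligned reads */
    return e.magic == ENTRY_MAGIC && e.self == (const void *)candidate;
}
```

A byte-for-byte copy of a valid entry placed elsewhere is rejected, since its `self` field still points at the original. This adds a pointer-sized check on top of the magic number and checksum.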

Now, making this optional as you mentioned could be interesting to consider.

> But wouldn't that just make it platform-dependent instead of compiler-dependent?

More exactly, "executable-format-dependent", as multiple platforms can share the same executable file format (e.g. a lot of Unices use ELF), but this is nothing new. Criterion has always depended on the ABI to retrieve section limits for test data (test data, suite data, and hook data, to be more specific) and to initialize sandboxes, so this would remove the additional compiler dependency.

@Snaipe
Owner Author

Snaipe commented Jan 20, 2017

I think that going forward, the best bet we have is to scan the dynamic symbols of the executable for special names (maybe starting with criterion_user_). This approach has the advantage of not using the nonstandard section attribute, and instead uses the ABI to dynamically find the data we need. In this case, there is no need to scan our sections for a magic number and something that could be a test.

This also shouldn't be a problem for PE targets (where the symbol table is never populated by MSVC), as tests can be dllexported by the Test macro.
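A rough sketch of the naming side of this idea. `USER_TEST`, `USER_EXPORT`, `USER_PREFIX`, and `is_user_symbol` are hypothetical names, not Criterion's macros; only the `criterion_user_` prefix comes from the suggestion above. Walking the dynamic symbol table itself is format-specific (ELF, Mach-O, PE/COFF) and is not shown; the filter below would be applied to each name obtained from it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names; nothing here is Criterion's actual API. */
#if defined(_WIN32)
# define USER_EXPORT __declspec(dllexport) /* export the symbol on PE targets */
#else
# define USER_EXPORT
#endif

#define USER_PREFIX "criterion_user_"

/* Defines an externally visible function whose name carries the prefix,
 * e.g. USER_TEST(math, addition) -> criterion_user_math_addition. */
#define USER_TEST(suite, name) \
    USER_EXPORT void criterion_user_##suite##_##name(void)

/* Returns nonzero if `sym` starts with the reserved prefix. */
static int is_user_symbol(const char *sym)
{
    assert(sym != NULL);
    return strncmp(sym, USER_PREFIX, sizeof USER_PREFIX - 1) == 0;
}

USER_TEST(math, addition)
{
    /* test body would go here */
}
```

On non-Windows targets the symbol would still need to end up in the dynamic symbol table (for example through the linker's export options) for a runtime scan to see it; that part is an assumption this sketch does not cover.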
