Multiple functions can be assigned the same integer key #8

Closed
cdaley opened this issue Oct 12, 2014 · 3 comments

Comments


cdaley commented Oct 12, 2014

Hello,

I get the error "Fatal Error: duplicate keys found for counter_seconds" when running a Byfl-instrumented version of Source Extractor (http://www.astromatic.net/software/sextractor). I am using version 604cc74 of Byfl. This error does not happen in earlier versions of Byfl such as c2a8038, bb3ed5f and 005efea.

The error is produced by the bf_record_key function. It happens because the Source Extractor package contains two source files named misc.c: src/misc.c and src/levmar/misc.c. Both files are assigned the same module identifier string, "misc.opt.bc", so the hash of that string, and therefore the initial random number seed, is the same for both. (I can avoid the abort by renaming one of the misc.c source files in the Source Extractor package, modifying the Makefiles and rebuilding.)
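
The mechanism described above can be illustrated with a minimal sketch. It is not Byfl's actual code: `seed_from_module_id` and `keys_for_module` are hypothetical names, and `std::hash` plus `std::mt19937_64` are stand-ins for whatever hash and generator Byfl really uses. The only point is that two modules with the same identifier string get the same seed and therefore the same sequence of keys.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <random>
#include <string>
#include <vector>

// Hypothetical stand-in for the seeding step: hash the module identifier.
// Identical identifier strings always produce identical seeds.
std::uint64_t seed_from_module_id(const std::string& module_id) {
  return static_cast<std::uint64_t>(std::hash<std::string>{}(module_id));
}

// Draw one pseudo-random key per function in a module (assumed scheme).
std::vector<std::uint64_t> keys_for_module(const std::string& module_id,
                                           std::size_t num_functions) {
  std::mt19937_64 prng(seed_from_module_id(module_id));
  std::vector<std::uint64_t> keys;
  keys.reserve(num_functions);
  for (std::size_t i = 0; i < num_functions; ++i) {
    keys.push_back(prng());  // i-th function in the module gets the i-th draw
  }
  return keys;
}
```

Under these assumptions, calling `keys_for_module("misc.opt.bc", n)` once for src/misc.c and once for src/levmar/misc.c returns the same n keys, so the functions of the two files clash pairwise.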

Is an integer key collision worthy of a fatal error? Could the error be changed to a warning message instead? Will the collision simply lead to incorrect performance data for the functions that clash? It seems this would not be a problem if we choose to collect performance data at program granularity only, i.e. without the -bf-by-func option. One suggestion to reduce the likelihood of duplicate random number seeds is to build the seed from the full path to the source file.

Thanks,
Chris

spakin closed this as completed in 5330f9a on Oct 15, 2014

spakin commented Oct 16, 2014

Chris,

Please confirm that 5330f9a really does fix the problem you're observing. Your bug report nearly drove me and Rob Aulwes (who wrote the Byfl function-hashing code) mad because function names are mapped to random numbers; we couldn't figure out how filenames could play a role in that. We had forgotten that the seed for the random-number generator is a hash of llvm::Module::getModuleIdentifier(), which is normally the filename. 5330f9a uses a more complicated seed function including time and process ID in hopes of eliminating key clashes (at least with very high probability).
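
As a rough illustration of the idea behind such a seed, here is a sketch that mixes the module-identifier hash with a timestamp and a process ID. This is an assumption-laden example, not the code in 5330f9a: `mix` and `make_seed` are hypothetical names, and the mixing step is the well-known Boost-style hash-combine formula, chosen only for brevity.

```cpp
#include <cstdint>
#include <functional>
#include <string>

// Boost-style hash-combine step (an assumption; not taken from Byfl).
std::uint64_t mix(std::uint64_t seed, std::uint64_t value) {
  return seed ^ (value + 0x9e3779b97f4a7c15ULL + (seed << 6) + (seed >> 2));
}

// Hypothetical seed function: start from the module-identifier hash, then
// fold in a timestamp and a process ID so that two modules with the same
// identifier are very unlikely to end up with the same seed.
std::uint64_t make_seed(const std::string& module_id,
                        std::uint64_t time_ticks,
                        std::uint64_t process_id) {
  std::uint64_t seed =
      static_cast<std::uint64_t>(std::hash<std::string>{}(module_id));
  seed = mix(seed, time_ticks);
  seed = mix(seed, process_id);
  return seed;
}
```

A caller on a POSIX system could pass `std::chrono::system_clock::now().time_since_epoch().count()` and `getpid()` as the last two arguments. One trade-off of any time-dependent seed is that the generated keys are no longer reproducible from one build to the next.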

I consider a key collision worthy of a fatal error. It implies only that data will get jumbled across functions (and not even that if neither -bf-by-func nor -bf-call-stack is used), but I'd think it would be baffling to users to suddenly get garbled performance results. With the improved seed function, my hope is that key collisions should never happen in our lifetime.
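
For readers unfamiliar with this kind of check, a duplicate-key test can be sketched as follows. This is not the real bf_record_key: `KeyRegistry` and `record` are hypothetical names, and the sketch throws an exception where a tool might instead abort with a fatal error.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>
#include <unordered_map>

// Hypothetical registry that remembers which function owns each key.
class KeyRegistry {
 public:
  // Record a (key, function) pair. Re-recording the same pair is harmless;
  // a key already owned by a different function is reported as an error.
  void record(std::uint64_t key, const std::string& function_name) {
    auto result = names_.emplace(key, function_name);
    const bool inserted = result.second;
    const std::string& owner = result.first->second;
    if (!inserted && owner != function_name) {
      throw std::runtime_error("duplicate key for " + function_name +
                               " (already used by " + owner + ")");
    }
  }

 private:
  std::unordered_map<std::uint64_t, std::string> names_;
};
```

Failing loudly at the moment the second owner appears is what keeps the per-function counters from being silently merged.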

As always, thanks for the bug report,
— Scott


cdaley commented Oct 19, 2014

Hello Scott,

5330f9a fixes the problem, thanks!

Chris


spakin commented Oct 20, 2014

Chris,

Great! Thanks for the re-testing.

— Scott
