Hey so, we've talked about this a bit, but I wanted to document why Lighthouse supporting other formats would be nice while it's still fresh.
So, drcov is useful in that there are easy, cross-platform tools to generate it, but it has some pretty significant shortcomings which I'm running into. Specifically, drcov is made up of a header which gives the module maps and then a series of tuples (module id, bb offset, bb size). The main issue here is the bb size field. If you're generating a trace with something that is aware of the bb sizes (e.g. a DBI), this is all cool. However, if you're dumping a trace from something that is not bb aware (e.g. an emulator, or collecting code coverage via sampling), you just have a list of PC values.
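To make the size problem concrete, here is a rough sketch of packing one drcov-style record. The field widths (32-bit offset, 16-bit size, 16-bit module id) and the `pack_bb_entry` helper are my assumptions for illustration, not something verified against the drcov tooling. The point is only that every record needs a size, which a bare PC trace does not carry.

```python
import struct

# Assumed record layout: little-endian 32-bit bb offset, 16-bit bb size,
# 16-bit module id. Treat the exact widths as an assumption.
BB_ENTRY = struct.Struct("<IHH")


def pack_bb_entry(module_id, bb_offset, bb_size):
    """Pack one (module id, bb offset, bb size) tuple into bytes."""
    return BB_ENTRY.pack(bb_offset, bb_size, module_id)


# A DBI knows the block size, so this is straightforward:
entry = pack_bb_entry(module_id=0, bb_offset=0x1000, bb_size=12)

# An emulator or sampler only has the PC. There is no honest value to put
# in bb_size without disassembling the code at that address.
```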
Assuming you have a module map and a list of PC values, there are a few things you could do:
- Tell Lighthouse that every instruction is a single bb, however with variable length instructions this requires a disassembler
- Use a disassembler to go from PC -> Block, then get the block size and base from that. This is what I've most commonly implemented but it's a pain in the ass: it requires IDA to have all covered code be in recognized functions, and doesn't properly handle cases where you have calls in the middle of a block (if the call does not return, the entire block is still painted)
Basically both of these require pre-processing the coverage in IDA before loading, which is doable but is a pain in the ass.
So, I'm pretty agnostic with regards to what the actual format is, but the feature request is the ability to load any coverage data format which can be generated from the module mappings and a list of PC values.
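As a rough sketch of what "generated from the module mappings and a list of PC values" could look like on the producer side: resolve each PC against the module map and emit (module, offset) pairs, with no block sizes anywhere. The `Module` type, the `resolve_pcs` helper, and the sample addresses are all made up for illustration, and this assumes modules don't overlap. It is not an existing Lighthouse format.

```python
from bisect import bisect_right
from typing import NamedTuple


class Module(NamedTuple):
    # Hypothetical module map entry: name, load address, mapped size.
    name: str
    base: int
    size: int


def resolve_pcs(modules, pcs):
    """Map raw PC values to unique (module name, offset) pairs.

    PCs that fall outside every module are dropped.
    Assumes module ranges do not overlap.
    """
    mods = sorted(modules, key=lambda m: m.base)
    bases = [m.base for m in mods]
    hits = set()
    for pc in pcs:
        # Find the last module whose base is <= pc.
        i = bisect_right(bases, pc) - 1
        if i < 0:
            continue
        m = mods[i]
        if pc < m.base + m.size:
            hits.add((m.name, pc - m.base))
    return sorted(hits)


modules = [
    Module("target.exe", 0x400000, 0x20000),
    Module("libfoo.dll", 0x7FF00000, 0x8000),
]
pcs = [0x401000, 0x401003, 0x7FF00010, 0x401000, 0x12345]
covered = resolve_pcs(modules, pcs)
```

Anything shaped like `covered` (one entry per executed address, no sizes) would be enough for my use case. The loader would then be the one to decide which instruction or block each address belongs to.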