introduce region type and load address support #25

sriemer · 2014-05-08T20:41:53Z

This patch set extends scanmem to determine the region type and the load address of libraries and the executable. The lregions command displays this additional information. The default type is misc, which is a single region. The single-region types heap and stack are also easy to determine, but code and exe are special.

code is a library or an executable, which consists of 3 or 4 regions:
0: r,x - .text; 1: r - .rodata; 2: r,w - .data; 3: r,w - .bss

Regions 1, 2 and 3 are consecutive in memory (often region 0 as well). This is how scanmem can tell that these regions belong together. The start of the .text region is used as the load address of the binary. Subtracting it from a match address within the .data or .bss section yields the address used within the binary. This helps to bypass address space layout randomization (ASLR) in combination with position independent code (PIC) or a position independent executable (PIE).

The list command is extended to show this in an additional region info column for found matches. Since the output format changes, the GUI is changed as well. The exe type is a subtype of code and covers only the executable.

The region type can be one of the following:
misc (default),
code (belongs to a library or the executable),
exe (belongs to the executable),
heap (is the heap region),
stack (is the stack region)

The types misc, heap and stack are single regions which can be determined easily. But code and exe are special. For the executable or a loaded library, there are often three or four consecutive regions belonging to it:
0: read,exec; 1: read; 2: read,write; 3: read,write (no file name)
0: .text; 1: .rodata; 2: .data; 3: .bss

Scanmem only cares about regions 2 and 3. But with address space layout randomization (ASLR) and PIC/PIE, the load address of the code (start of region 0) is randomized. Subtracting the load address from a found memory address within a code region gives the static memory address within the binary. This is why the load address is important and should be determined. For other types the region start is used. Further commits using the region type and the load address follow.
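The grouping rule described above can be sketched roughly as follows. This is a simplified illustration and not the code from maps.c: the struct layout, the enum names and the function `classify_regions` are assumptions made for this example, and the regions are passed in as an already parsed array instead of being read from /proc/<pid>/maps. The variable names code_regions, is_exe, prev_end and load_addr follow the ones mentioned in the review below.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical types for illustration, not scanmem's real definitions. */
enum region_type { REGION_MISC, REGION_CODE, REGION_EXE, REGION_HEAP, REGION_STACK };

struct region {
    unsigned long start, end;  /* [start, end) as listed in /proc/<pid>/maps */
    bool read, write, exec;    /* permission flags */
    const char *name;          /* mapped file name, "" if there is none */
    enum region_type type;     /* filled in by classify_regions() */
    unsigned long load_addr;   /* filled in by classify_regions() */
};

/* Walk the regions in address order and group the 3 or 4 regions of a binary. */
void classify_regions(struct region *r, size_t n, const char *exe_path)
{
    unsigned int code_regions = 0;  /* regions seen so far for the current binary */
    bool is_exe = false;            /* current binary is the executable */
    unsigned long prev_end = 0;     /* end address of the previous region */
    unsigned long load_addr = 0;    /* start of the current binary's .text */
    const char *bin_name = "";

    for (size_t i = 0; i < n; i++) {
        struct region *cur = &r[i];
        bool anonymous = cur->name[0] == '\0';

        if (code_regions > 0) {
            bool contiguous = cur->start == prev_end;
            bool same_file = strcmp(cur->name, bin_name) == 0;
            /* Region 1 may follow .text with a gap; later ones must be adjacent.
             * A nameless read,write region right behind .data is taken as .bss. */
            bool continues = !cur->exec && code_regions < 4 &&
                ((same_file && (contiguous || code_regions == 1)) ||
                 (anonymous && contiguous && cur->read && cur->write));
            code_regions = continues ? code_regions + 1 : 0;
        }
        if (code_regions == 0 && cur->exec && cur->name[0] == '/') {
            /* An executable, file-backed region starts a new binary (.text). */
            code_regions = 1;
            load_addr = cur->start;
            bin_name = cur->name;
            is_exe = strcmp(cur->name, exe_path) == 0;
        }

        if (code_regions > 0) {
            cur->type = is_exe ? REGION_EXE : REGION_CODE;
            cur->load_addr = load_addr;
        } else {
            if (strcmp(cur->name, "[heap]") == 0)
                cur->type = REGION_HEAP;
            else if (strcmp(cur->name, "[stack]") == 0)
                cur->type = REGION_STACK;
            else
                cur->type = REGION_MISC;
            cur->load_addr = cur->start;  /* other types use the region start */
        }
        prev_end = cur->end;
    }
}
```

The state is kept outside the loop on purpose, since it has to survive from the .text region to the following regions of the same binary.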

The pointer format should not only be used for the 'list' command but also e.g. for the 'lregions' command. Also, '%20p' is too long as, on x86_64 at least, only 6 bytes are used for the memory address. We can also get rid of the '0x' to shorten it even further. So change it to '%12lx' on 64-bit and '%8lx' on 32-bit. We've also noticed that there is no 'ULONGMAX' define. The correct name would be 'ULONG_MAX'. So rename it and add a compiler warning in case it's not defined.
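A minimal sketch of such a width-dependent pointer format, assuming a macro named `POINTER_FMT` and a helper `format_address` (both names are made up for this example and may differ from what handlers.c uses). Note that `#warning` is a GCC/Clang extension before C23.

```c
#include <limits.h>
#include <stdio.h>

#ifndef ULONG_MAX
#warning ULONG_MAX is not defined!
#endif

/* 8 hex digits cover a 32-bit address, 12 hex digits cover 6 address bytes. */
#if ULONG_MAX == 4294967295UL
#define POINTER_FMT "%8lx"
#else
#define POINTER_FMT "%12lx"
#endif

/* Write addr right-aligned and without a "0x" prefix into buf. */
int format_address(char *buf, size_t size, unsigned long addr)
{
    return snprintf(buf, size, POINTER_FMT, addr);
}
```

The width is only a minimum: an address that needs more digits is still printed in full, just without the column alignment.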

This gives us the information if a region belongs to a library or an executable and to which location these are loaded into memory.

It makes things much easier if the 'list' command shows not only the match info but the associated region info as well. Display the region ID and the region type to get a feeling for which kind of memory region a match is located in. Subtracting the load address of the executable or library from the match address gives a match offset, which is the variable address within the binary. This helps to bypass address space layout randomization (ASLR). So display it. The GUI needs to be changed in the following commits to recognize the new format.
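As a worked example of the offset arithmetic, here is a small sketch. The function names and the addresses are made up for illustration; they are not taken from scanmem.

```c
#include <stdint.h>

/* Offset of a match within the binary: match address minus load address. */
uint64_t match_offset(uint64_t match_addr, uint64_t load_addr)
{
    return match_addr - load_addr;
}

/* Inverse step for a later run: add the new load address back to the offset. */
uint64_t runtime_address(uint64_t offset, uint64_t load_addr)
{
    return load_addr + offset;
}
```

With a hypothetical PIE loaded at 0x555555554000 and a match at 0x555555755040, the offset is 0x201040. If the next run loads the same binary at 0x7f3a1c000000, the variable is expected at 0x7f3a1c201040, as long as the binary itself has not changed.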

The format of the scanmem 'list' command has changed. So the GUI has to be changed as well. The match offset from the code load address or region start as well as the region type are also useful in the ScanResult_TreeView. So add them to new columns.

coolwanglu · 2014-05-10T07:17:29Z

maps.c

some of these variables can be moved into the loop below.

Which ones?
I don't see a single one!
All these are required for multiple regions and not for a single region.
code_regions is set when the .text region is found and then incremented or reset to 0 in further regions. is_exe is also used for all 3 or 4 regions belonging to the binary and is not detected again and again. prev_end obviously holds the end address of the previous region. load_addr is also only set when the .text region is found, unless the region type is different from code or exe.

Sorry, I didn't notice the initialization here. Usually I'd move them right above the while statement, but it's OK this way.

coolwanglu · 2014-05-10T07:41:06Z

I just made the final review, please take a look at my comments.

coolwanglu · 2014-05-16T08:57:42Z

handlers.c

why 12, shouldn't it be 16?

Because I have never seen the full 8 bytes being used for virtual memory addresses. If that becomes the case on any architecture in the future, we can increase it. It is only about indentation.

Well, better to be safe here.

coolwanglu · 2014-05-16T09:04:04Z

Sorry for my lag these days, I've been very busy.

Thank you for your efforts and cooperation in this long process. I just tried to be careful, especially when I don't have my build environment available.

This could be a very useful and powerful feature.

sriemer · 2014-05-16T15:20:30Z

Thank you, too! This means much to me!
For me this is already a very useful and powerful feature! :-) I recently made some bigger changes to the dynamic memory discovery process in my ugtrain. So I had to retest the example configs, which do a lot of memory scanning, with PIE and without PIE, with static memory and with dynamic memory. It is so cool to see the region type right away for matches! With PIE and the "exe" type I just have to put the found match offset into the config. As ugtrain uses the same method to get the load address, it just has to add it back to the address from the config if PIE is detected. It's the only game trainer on Linux, or maybe anywhere, that supports PIE. :-) Now I can finally document the memory discovery with PIE and make the next release. :-) Btw.: iOS on iPhones also always uses PIE. Ubuntu has had it as a default since 13.04.

sriemer added 5 commits May 8, 2014 22:35

lregions: show region type and load address

1a4f662

This gives us the information if a region belongs to a library or an executable and to which location these are loaded into memory.

coolwanglu reviewed May 10, 2014

coolwanglu reviewed May 16, 2014

coolwanglu added a commit that referenced this pull request May 16, 2014

Merge pull request #25 from sriemer/for-wanglu

9afd46a

introduce region type and load address support

coolwanglu merged commit 9afd46a into coolwanglu:master May 16, 2014

sriemer deleted the for-wanglu branch July 22, 2014 05:58
