introduce region type and load address support #25
The region type can be one of the following:
misc (default)
code (belongs to a library or executable)
exe (belongs to the executable)
heap (is the heap region)
stack (is the stack region)

The types misc, heap and stack are single regions which can be determined easily. But code and exe are special. In the executable or when loading a library, there are often three or four consecutive regions belonging to it:
0: read,exec (.text)
1: read (.rodata)
2: read,write (.data)
3: read,write, no file name (.bss)

Scanmem only cares about regions 2 and 3. But with address space layout randomization (ASLR) and PIC/PIE the load address of the code (start of region 0) is randomized. Subtracting the load address from a found memory address within a code region gives the static memory address within the binary. This is why the load address is important and should be determined. For other types the region start is used.

Further commits using the region type and the load address follow.
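The grouping described above can be sketched roughly as follows. This is a minimal illustration, not the code from this patch set: the `region_t` struct, the `REGION_*` enum and `classify_regions()` are hypothetical names, the regions are assumed to be already parsed from the maps file in ascending order, and treating only names starting with `/` as binaries is an assumption made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical types for illustration; not scanmem's actual definitions. */
typedef enum {
    REGION_MISC, REGION_CODE, REGION_EXE, REGION_HEAP, REGION_STACK
} region_type_t;

typedef struct {
    unsigned long start, end;
    char perms[5];            /* e.g. "r-xp" */
    const char *name;         /* file name, "[heap]", "[stack]" or "" */
    region_type_t type;       /* filled in by classify_regions() */
    unsigned long load_addr;  /* filled in by classify_regions() */
} region_t;

/* Assign a type and a load address to every region, in address order. */
void classify_regions(region_t *regions, size_t count, const char *exe_path)
{
    assert(regions != NULL && exe_path != NULL);

    unsigned code_regions = 0;   /* regions seen so far of the current binary */
    bool is_exe = false;         /* current binary is the executable */
    unsigned long prev_end = 0;  /* end address of the previous region */
    unsigned long load_addr = 0; /* start of the current binary's .text */
    const char *binname = "";    /* file name of the current binary */

    for (size_t i = 0; i < count; i++) {
        region_t *r = &regions[i];
        bool exec = (r->perms[2] == 'x');
        bool named = (r->name[0] != '\0');

        r->type = REGION_MISC;

        if (code_regions > 0) {
            bool same_file = named && strcmp(r->name, binname) == 0;
            bool anon_adjacent = !named && r->start == prev_end;

            if (exec || code_regions >= 4 || !(same_file || anon_adjacent)) {
                /* A new .text region or an unrelated region ends the group. */
                code_regions = 0;
                is_exe = false;
            } else {
                code_regions++;
                r->type = is_exe ? REGION_EXE : REGION_CODE;
            }
        }

        if (code_regions == 0) {
            /* Assumption: only file-backed paths count as binaries. */
            if (exec && r->name[0] == '/') {
                code_regions = 1;
                is_exe = (strcmp(r->name, exe_path) == 0);
                binname = r->name;
                load_addr = r->start;
                r->type = is_exe ? REGION_EXE : REGION_CODE;
            } else if (strcmp(r->name, "[heap]") == 0) {
                r->type = REGION_HEAP;
            } else if (strcmp(r->name, "[stack]") == 0) {
                r->type = REGION_STACK;
            }
        }

        /* code/exe use the .text start; other types use the region start. */
        r->load_addr = (r->type == REGION_CODE || r->type == REGION_EXE)
                           ? load_addr : r->start;
        prev_end = r->end;
    }
}
```

The state variables live outside the loop on purpose: they carry information from the .text region over to the following regions of the same binary.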
The pointer format should not only be used for the 'list' command but e.g. for the 'lregions' command as well. Also '%20p' is too long as, on x86_64 at least, there are only 6 bytes used for the memory address. We can also get rid of the '0x' to shorten it even further. So change it to '%12lx' for 64 bit and '%8lx' on 32 bit. We've also noticed that there is no 'ULONGMAX' define. The correct name would be 'ULONG_MAX'. So rename it and add a compiler warning in case it's not defined.
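A shared format macro along these lines could look like the sketch below. The macro name `POINTER_FMT` and the helper `format_address()` are assumptions for illustration, and `#warning` is a compiler extension before C23 (GCC and Clang accept it).

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical macro name; chooses the column width by word size. */
#ifndef ULONG_MAX
#warning ULONG_MAX is not defined, assuming 32 bit addresses
#define POINTER_FMT "%8lx"
#elif ULONG_MAX > 4294967295UL
#define POINTER_FMT "%12lx"   /* 64 bit: 6 bytes = 12 hex digits */
#else
#define POINTER_FMT "%8lx"    /* 32 bit: 4 bytes = 8 hex digits */
#endif

/* Write addr right-aligned, without "0x", into buf; returns snprintf's result. */
int format_address(char *buf, size_t size, unsigned long addr)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, POINTER_FMT, addr);
}
```

The width is only a minimum: an address that needs more digits is still printed in full, it just breaks the column alignment.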
This gives us the information if a region belongs to a library or an executable and to which location these are loaded into memory.
It makes things much easier if the 'list' command shows not only the match info but the associated region info as well. Display the region ID and the region type to give a feeling for the kind of memory region in which a match is located. Subtracting the load address of the executable or library from the match address yields a match offset, which is the variable's address within the binary. This helps to bypass address space layout randomization (ASLR), so display it. The GUI needs to be changed in following commits to recognize the new format.
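The match offset calculation itself is a single subtraction. A minimal sketch, with `match_offset()` as a hypothetical helper name and made-up addresses:

```c
#include <assert.h>

/* Offset of a match relative to the load address of its binary
 * (or relative to the region start for non-code regions). */
unsigned long match_offset(unsigned long match_addr, unsigned long load_addr)
{
    assert(match_addr >= load_addr);
    return match_addr - load_addr;
}
```

For example, if the same variable is found at 0x08204010 in one run (load address 0x08000000) and at 0x40204010 in the next run (load address 0x40000000), both calls return 0x204010, so the offset stays stable although ASLR moved the binary.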
The format of the scanmem 'list' command has changed. So the GUI has to be changed as well. The match offset from the code load address or region start as well as the region type are also useful in the ScanResult_TreeView. So add them to new columns.
some of these variables can be moved into the loop below.
Which ones?
I don't see a single one!
All these are required for multiple regions and not for a single region. `code_regions` is set upon `.text` region and then incremented or reset to 0 in further regions. `is_exe` is also used for all 3 or 4 regions belonging to the binary and not detected again and again. `prev_end` is obviously used to hold the end address of the previous region. The `load_addr` is also only set upon `.text` region unless there is a region type different from `code` or `exe`.
Sorry, I didn't notice the initialization here. Usually I'd move them right above the `while` statement, but it's OK this way.
I just made the final review, please take a look at my comments.
why 12, shouldn't it be 16?
Because I have never seen the full 8 bytes being used for virtual memory addresses. If that should become the case in the future on any architecture, we can increase it. It is only about indentation.
Well, better to be safe here.
Sorry for my lag these days, I've been very busy. Thank you for your efforts and cooperation in this long process. I just tried to be careful, especially when I don't have my build environment available. This could be a very useful and powerful feature.
Thank you, too! This means much to me!
This patch set extends scanmem to determine the region type and the load address of libraries and the executable. This information is displayed additionally in the `lregions` command. The default type is a single region `misc`. The single region types `heap` and `stack` are also easy to determine but `code` and `exe` are special: `code` is a library or an executable which consists of 3 or 4 regions:

0: r,x - .text; 1: r - .rodata; 2: r,w - .data; 3: r,w - .bss

The regions 1, 2 and 3 are consecutive in memory (often even region 0 as well). This is how it can be determined that these regions belong together. The start of the .text region is used as the load address of the binary. Subtracting it from a match address within the .data or .bss section results in the address used within the binary. This helps bypassing address space layout randomization (ASLR) in combination with position independent code (PIC) or position independent executable (PIE). The `list` command is extended to show that in an additionally displayed region info for found matches. The output format changes. So the GUI is changed as well. The `exe` type is a subtype of `code` and is just the executable.