Bring merge sort and insertion sort cmp function semantics together #17473

anisse · 2020-08-16T13:02:43Z

Merge sort uses cmp (a, b) < 0 for its first test branch, while insertion
sort uses cmp (a, b) > 0. As a result, a return value of 0 falls into one
branch in merge sort and into the opposite branch in insertion sort.

We keep the semantics of the insertion sort, because it allows stability between
the two sort functions for equal elements.

Update tests that were broken because of wrong register ordering.

Your checklist for this pull request

I've read the guidelines for contributing to this repository
I made sure to follow the project's coding style
I've added tests that prove my fix is effective or that my feature works (if possible)
I've updated the documentation and the radare2 book with the relevant information (if needed)

Detailed description

Register order changed on arm32 after adding more registers at the end: everything was seemingly in a random order. That is because the list went over 43 elements and started using merge sort instead of insertion sort. This PR makes merge sort behave properly with compare functions that return a bool.
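The boundary difference described above can be illustrated with a minimal, self-contained sketch. It does not reproduce the code in libr/util/list.c: the array-based `insertion_sort` and `merge_runs` helpers, the `strict` flag, and the `cmp_bool` comparator are hypothetical names made up for this illustration. It only shows why a comparator that returns a bool survives a `> 0` test but not a `< 0` test, and that a `<= 0` test in the merge step (which appears to be the direction this PR takes) handles it.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*int_cmp)(int a, int b);

/* Hypothetical comparator that returns a bool instead of -1/0/1. */
static int cmp_bool(int a, int b) {
	return a > b;
}

/* Insertion sort: an element moves left only while cmp (prev, cur) > 0,
 * so a 0 result (equal, or "false") leaves the pair where it is. */
static void insertion_sort(int *v, size_t n, int_cmp cmp) {
	for (size_t i = 1; i < n; i++) {
		int cur = v[i];
		size_t j = i;
		while (j > 0 && cmp (v[j - 1], cur) > 0) {
			v[j] = v[j - 1];
			j--;
		}
		v[j] = cur;
	}
}

/* Merge two sorted runs. With strict != 0 the left element is taken only
 * when cmp (l, r) < 0; otherwise it is taken when cmp (l, r) <= 0. */
static void merge_runs(const int *l, size_t nl, const int *r, size_t nr,
		int *out, int_cmp cmp, int strict) {
	size_t i = 0, j = 0, k = 0;
	while (i < nl && j < nr) {
		int c = cmp (l[i], r[j]);
		int take_left = strict ? (c < 0) : (c <= 0);
		out[k++] = take_left ? l[i++] : r[j++];
	}
	while (i < nl) {
		out[k++] = l[i++];
	}
	while (j < nr) {
		out[k++] = r[j++];
	}
}

/* Returns 1 when v[0..n) is in non-decreasing order. */
static int is_sorted(const int *v, size_t n) {
	for (size_t i = 1; i < n; i++) {
		if (v[i - 1] > v[i]) {
			return 0;
		}
	}
	return 1;
}

static void demo_boundary(void) {
	const int l[] = { 1, 3 }, r[] = { 2, 4 };
	int out[4];
	int v[] = { 4, 1, 3, 2 };

	/* Insertion sort copes with the boolean comparator. */
	insertion_sort (v, 4, cmp_bool);
	assert (is_sorted (v, 4));

	/* Strict test: cmp_bool never returns < 0, so the right run is drained first. */
	merge_runs (l, 2, r, 2, out, cmp_bool, 1);
	assert (!is_sorted (out, 4));

	/* Non-strict test: the same comparator now merges correctly. */
	merge_runs (l, 2, r, 2, out, cmp_bool, 0);
	assert (is_sorted (out, 4));
}
```

Calling `demo_boundary ()` from a `main` function runs the three checks; none of the assertions should fire.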

Test plan

...

Closing issues

This should unblock test regression from PR #17462

shlr/sdb/src/ls.c

anisse · 2020-08-17T17:08:56Z

I rebased on master and removed sdb modifications after sending radareorg/sdb#212

XVilka · 2020-08-19T03:18:35Z

Please rebase on top of the master and sync SDB here since that PR was merged.

XVilka · 2020-08-20T05:14:53Z

Seems it has broken the reverse debugger:

[XX] db/archos/linux-x64/dbg_step_back dbg.stepback
R2_NOPLUGINS=1 radare2 -escr.utf8=0 -escr.color=0 -escr.interactive=0 -N -d -e dbg.bpsysign=true -Qc 'db main
db 0x004028fe
dc
dts+
dc
dsb
dsb
dr rbx,rcx,rdx,r12,rip
dk 9
' bins/elf/analysis/ls-linux-x86_64-zlul
-- stdout
@@ -1,5 +1,5 @@
 0x00000001
 0x00000001
-0x00000001
+0x0000a401
 0x00404870

-- stderr
Process with PID 33840 started...
= attach 33840 33840
bin.baddr 0x00400000
Using 0x400000
asm.bits 64
hit breakpoint at: 0x4028a0
Reading 4096 byte(s) from 0x0061c000...
Reading 4096 byte(s) from 0x0061d000...
Reading 135168 byte(s) from 0x02466000...
Reading 4096 byte(s) from 0x7fa88960b000...
Reading 16384 byte(s) from 0x7fa88960c000...
Reading 4096 byte(s) from 0x7fa889814000...
Reading 4096 byte(s) from 0x7fa889a18000...
Reading 4096 byte(s) from 0x7fa889c89000...
Reading 8192 byte(s) from 0x7fa88a075000...
Reading 16384 byte(s) from 0x7fa88a077000...
Reading 4096 byte(s) from 0x7fa88a282000...
Reading 4096 byte(s) from 0x7fa88a4a8000...
Reading 4096 byte(s) from 0x7fa88a282000...
Reading 4096 byte(s) from 0x7fa88a4a8000...
Reading 8192 byte(s) from 0x7fa88a4a9000...
Reading 28672 byte(s) from 0x7fa88a6b9000...
Reading 4096 byte(s) from 0x7fa88a6d3000...
Reading 4096 byte(s) from 0x7fa88a6d4000...
Reading 135168 byte(s) from 0x7ffe012d9000...
r_reg_get_value: Bit size 256 not supported
r_reg_get_value: Bit size 256 not supported
r_reg_get_value: Bit size 256 not supported
r_reg_get_value: Bit size 256 not supported
r_reg_get_value: Bit size 256 not supported
r_reg_get_value: Bit size 256 not supported
r_reg_get_value: Bit size 256 not supported
r_reg_get_value: Bit size 256 not supported
r_reg_get_value: Bit size 256 not supported
...

anisse · 2020-08-20T08:37:55Z

Yes, I didn't notice it at first because the test is also failing on my machine (Fedora 32), but differently. I saw the same result before and after the PR. I'll install Ubuntu to try to reproduce it.

anisse · 2020-08-20T16:25:08Z

I installed Ubuntu bionic in a chroot, and the test passes on this branch. I'll try something else.

anisse · 2020-08-20T16:32:30Z

I think it may be because I don't have AVX extensions on my (old) CPU, see logs of a github action on my account, with additional debug info:

[…]
r_reg_get_value: Bit size 256 not supported for reg ymm10 
r_reg_get_value: Bit size 256 not supported for reg ymm11 
r_reg_get_value: Bit size 256 not supported for reg ymm12 
r_reg_get_value: Bit size 256 not supported for reg ymm13 
r_reg_get_value: Bit size 256 not supported for reg ymm14 
[…]

https://github.com/anisse/radare2/pull/3/checks?check_run_id=1005021519

Now I need to find a machine with those or debug through github actions :-/

anisse · 2020-08-23T22:46:21Z

Apparently glibc has a function to do a faster memset using avx512: vzeroupper. That's what the trace debugger is hitting, trying to get the value of the ymm0-15 registers, and failing because this isn't implemented.

I'm pretty sure this PR accidentally fixed something (I don't know what) which is now making this trace debugger test fail. I wouldn't mind a second pair of eyes for this.

I could add a fake case for getting the value of a 256-bit register, just like the one for 128-bit registers (which isn't really implemented). What do you think?
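One possible reading of "a fake case" is sketched below. This is not radare2's `r_reg_get_value`: the `get_value` and `read_le` helpers, their signatures, and the choice to return only the low 64 bits for wide registers are all assumptions made for illustration. The idea is simply that 128-bit and 256-bit requests stop being reported as unsupported, even though the full value is not returned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reads `bytes` (at most 8) little-endian bytes starting at buf + off. */
static uint64_t read_le(const uint8_t *buf, size_t off, size_t bytes) {
	uint64_t v = 0;
	for (size_t i = 0; i < bytes; i++) {
		v |= (uint64_t)buf[off + i] << (8 * i);
	}
	return v;
}

/* Hypothetical getter: returns 1 and stores the value on success, or 0 when
 * the bit size is not handled. For 128 and 256 bits only the low 64 bits
 * are returned, which is the "fake" part of the support. */
static int get_value(const uint8_t *buf, size_t off, int bits, uint64_t *out) {
	switch (bits) {
	case 8:
	case 16:
	case 32:
	case 64:
		*out = read_le (buf, off, (size_t)bits / 8);
		return 1;
	case 128:
	case 256:
		*out = read_le (buf, off, 8);
		return 1;
	default:
		return 0;
	}
}

static void demo_wide_regs(void) {
	uint8_t buf[32];
	uint64_t v = 0;
	for (size_t i = 0; i < sizeof (buf); i++) {
		buf[i] = (uint8_t)(i + 1);
	}
	assert (get_value (buf, 0, 16, &v) && v == 0x0201);
	/* A 256-bit read succeeds but is truncated to the low 64 bits. */
	assert (get_value (buf, 0, 256, &v) && v == UINT64_C (0x0807060504030201));
	/* Sizes outside the switch are still reported as unsupported. */
	assert (!get_value (buf, 0, 512, &v));
}
```

A caller that really needs the upper bits would still have to read the register buffer directly; the sketch only avoids the error path.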

ret2libc · 2020-08-25T10:47:29Z

I'm not sure about this. By definition in r_list.h, RListComparator should return -1, 0, 1, so I think it is just wrong to pass a RListComparator that returns a bool. IMHO it is wrong to use cmp_order as in test_list.c and assume to have good results by returning a bool. @thestr4ng3r what do you think?

anisse · 2020-08-25T12:20:32Z

> I'm not sure about this. By definition in r_list.h, RListComparator should return -1, 0, 1, so I think it is just wrong to pass a RListComparator that returns a bool. IMHO it is wrong to use cmp_order as in test_list.c and assume to have good results by returning a bool. @thestr4ng3r what do you think?

I've thought about this as well, but decided against this, for the following reasons:

* there are comparators that rely on the "wrong" behaviour. And this behaviour works for insertion sort.
* both insertion sort and merge sort should have the same behaviour
* historically, there was only insertion sort, and merge sort was added later, with a different semantic at the 0 boundary. The "definition" in r_list.h was also added later, without fixing all the comparators. It was also probably added as consequence of this issue, without fixing the core issue.

I could provide a different PR that changes the comparators, but IMHO it would require changing the insertion sort as well, to stop working as it always did, since the "code is the API", and the behaviour should be the same for both functions to avoid any surprise once the list size changes; ideally this should be propagated in sdb as well, that already took this fix. What do you think ?

Right now there are lists in the code that aren't properly sorted: I've found the register lists for x86, aarch64, and the new one for arm32 in PR #17462 .

ret2libc · 2020-08-25T12:51:48Z

> Right now there are lists in the code that aren't properly sorted: I've found the register lists for x86, aarch64, and the new one for arm32 in PR #17462 .

I think they are not properly sorted because of this bool/int mess. If you convert the comparator functions to use -1/0/1, they should work.

> I'm not sure about this. By definition in r_list.h, RListComparator should return -1, 0, 1, so I think it is just wrong to pass a RListComparator that returns a bool. IMHO it is wrong to use cmp_order as in test_list.c and assume to have good results by returning a bool. @thestr4ng3r what do you think?

> I've thought about this as well, but decided against this, for the following reasons:
> * there are comparators that rely on the "wrong" behaviour. And this behaviour works for insertion sort.

The fact that it works is just a side effect, IMO. However, C comparator functions such as strcmp generally return -1/0/1 (actually a negative number, 0, or a positive number), which is why having the same semantics seems better to me.

> * both insertion sort and merge sort should have the same behaviour

I agree, and I think they do, if the right comparator function (that is, one that returns -1/0/1) is used in both cases. If merge or insertion sort don't work properly when provided with a -1/0/1 comparator function, then that is definitely a bug.

> * historically, there was only insertion sort, and merge sort was added later, with a different semantic at the 0 boundary. The "definition" in r_list.h was also added later, without fixing all the comparators. It was also probably added as consequence of this issue, without fixing the core issue.

I didn't dig into the history, but the different semantic at the 0 boundary is probably only a "small issue" and probably should affect only the relative sorting of two elements with the same value according to cmp. Anyway, RListComparator is used not only in insertion/merge_sort and the same comparator function should work well in all those functions (e.g. r_list_sort, but also r_list_find). A comparator function that returns a > b would return false both when a == b and when a < b. False is interpreted as 0 and r_list_find uses !cmp to check whether an element was found in a list. So r_list_find would not work correctly. I think this is just one example, but in general I do not see it as a good thing to have a boolean disguised as an int.

All this to say that if there are problems when moving from insertion to merge sort, it is probably just because the comparator function passed to insertion_sort abused the way the sorting is implemented, but those are actually wrong. We should change those comparator functions. You can also see that you are actually casting a boolean (a > b) to int (which is the return value of RListComparator) and this usually should raise a red flag. That said, it is also ok to ensure that the behaviour when two elements are evaluated to be the same (RListComparator returns 0) is consistent between merge and insertion sort.
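The `!cmp` problem described above can be sketched as follows. This is not the r_list_find implementation: `find_index` is a hypothetical array-based stand-in, and the order in which the needle and the element are passed to the comparator is an assumption. It shows a boolean comparator reporting a match for an element that is merely not smaller than the needle, while a three-way comparator finds the right one.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*int_cmp)(int a, int b);

/* Boolean "comparator": cannot distinguish a == b from a < b. */
static int cmp_bool(int a, int b) {
	return a > b;
}

/* Three-way comparator: negative, zero or positive, like strcmp. */
static int cmp_three_way(int a, int b) {
	return (a > b) - (a < b);
}

/* Linear search that treats a 0 result as "equal", in the spirit of the
 * !cmp test mentioned above. Returns the index of the match, or -1. */
static int find_index(const int *v, size_t n, int needle, int_cmp cmp) {
	for (size_t i = 0; i < n; i++) {
		if (!cmp (needle, v[i])) {
			return (int)i;
		}
	}
	return -1;
}

static void demo_find(void) {
	const int v[] = { 1, 9, 5 };
	assert (find_index (v, 3, 5, cmp_three_way) == 2);
	/* The boolean comparator reports 9 as a match for 5. */
	assert (find_index (v, 3, 5, cmp_bool) == 1);
	/* It also "finds" a value that is not in the array at all. */
	assert (find_index (v, 3, 7, cmp_three_way) == -1);
	assert (find_index (v, 3, 7, cmp_bool) == 1);
}
```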

anisse · 2020-08-25T20:53:21Z

> > * historically, there was only insertion sort, and merge sort was added later, with a different semantic at the 0 boundary. The "definition" in r_list.h was also added later, without fixing all the comparators. It was also probably added as consequence of this issue, without fixing the core issue.
>
> I didn't dig into the history, but the different semantic at the 0 boundary is probably only a "small issue" and probably should affect only the relative sorting of two elements with the same value according to cmp. Anyway, RListComparator is used not only in insertion/merge_sort and the same comparator function should work well in all those functions (e.g. r_list_sort, but also r_list_find). A comparator function that returns a > b would return false both when a == b and when a < b. False is interpreted as 0 and r_list_find uses !cmp to check whether an element was found in a list. So r_list_find would not work correctly. I think this is just one example, but in general I do not see it as a good thing to have a boolean disguised as an int.

It's not a small issue: once merge sort is used, lists aren't sorted at all. You can look at the test fixes to see how the order changes.

> All this to say that if there are problems when moving from insertion to merge sort, it is probably just because the comparator function passed to insertion_sort abused the way the sorting is implemented, but those are actually wrong. We should change those comparator functions. You can also see that you are actually casting a boolean (a > b) to int (which is the return value of RListComparator) and this usually should raise a red flag. That said, it is also ok to ensure that the behaviour when two elements are evaluated to be the same (RListComparator returns 0) is consistent between merge and insertion sort.

I agree it's a comparator bug, but this PR has the advantage of entirely fixing this bug class. Otherwise the bug will reappear, because casting the function once to (RListComparator) is slightly easier than casting both const void*.

Here is my proposal:

* keep this behaviour change for merge sort, but also fix the bad RListComparators. I've found about 14 entries; I'll fix them in a separate patch in the PR.
* I'll remove the test to show that it isn't an acceptable use of r_list_sort
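The trade-off between casting the function pointer and casting both `const void *` arguments can be sketched with the standard `qsort`, which takes a comparator of the same shape. The `RegItem` struct, the `reg_item_cmp` function, and the "example" entry with offset 8 are hypothetical; only the dr3 and mxcr_mask offsets are taken from this thread. The comparator does the two casts internally and returns a three-way result, so the call site needs no function pointer cast.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a register item; only the offset matters here. */
typedef struct {
	const char *name;
	int offset;
} RegItem;

/* Takes const void * directly, so no function pointer cast is needed at the
 * call site, and returns a three-way result instead of a bool. The
 * (a > b) - (a < b) form also avoids the overflow that a - b could cause. */
static int reg_item_cmp(const void *a, const void *b) {
	const RegItem *ra = (const RegItem *)a;
	const RegItem *rb = (const RegItem *)b;
	return (ra->offset > rb->offset) - (ra->offset < rb->offset);
}

static void demo_sort(void) {
	RegItem regs[] = { { "mxcr_mask", 28 }, { "example", 8 }, { "dr3", 24 } };
	qsort (regs, 3, sizeof (regs[0]), reg_item_cmp);
	assert (regs[0].offset == 8);
	assert (regs[1].offset == 24);
	assert (regs[2].offset == 28);
}
```

Note that `qsort` makes no stability guarantee, so this sketch says nothing about the order of items with equal offsets.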

anisse · 2020-08-25T21:54:33Z

I've updated the PR with my proposal and changed the main description.

libr/core/agraph.c

ret2libc · 2020-08-26T07:20:09Z

> I've updated the PR with my proposal and changed the main description.

Thanks! Much better now ;) Let's wait for CI. Please just fix that small comment (unless there is a reason to keep it like that), then it's ok for me.

test/db/anal/vars

test/db/archos/linux-x64/dbg_drt

libr/core/cconfig.c

This fixes the trace debugger test by removing the content of rdx, which changes on Fedora glibc, or recent Ubuntu with glibc AVX2 support. Ideally this test should be modified to depend less on the system libc.

trufae

Change the commit title and add ##anal. I think this is an important change and it should be in the changelog; the message should make clear that the function variable and reflines order is what is mainly affected. Do you have a screenshot of the change in reflines after this change? Is the change in sdb correct, or should it be changed again because of the cmp <=?

libr/core/cconfig.c

ret2libc · 2020-08-28T10:08:31Z

test/db/cmd/types

@@ -1334,9 +1334,9 @@ var char ** var_20h @ rbp-0x20
 var int64_t var_14h @ rbp-0x14
 var void * var_10h @ rbp-0x10
 var int64_t var_4h @ rbp-0x4
-arg int argc @ rdi


I still think there is something wrong here. I would expect variables and arguments to be shown in the "right order", not just sorted by register names. I know this depends on the ABI, etc., but it kinda made sense to have rdi, rsi, rdx, while I find this new sorting inappropriate in this particular context. @XVilka @thestr4ng3r @kazarmy WDYT?

I agree, the argument order feels reversed. I'm not sure how to address that here though.

this needs a little bit more investigation. Maybe it's just the way it is, but maybe we are missing something.

anisse · 2020-08-28T10:15:41Z

> Change the commit title and add ##anal. I think this is an important change and it should be in the changelog; the message should make clear that the function variable and reflines order is what is mainly affected.

Should it be for all commits ?

> Do you have a screenshot of the change in reflines after this change?

I'm not sure how to test, but here is a small example; before:

$ ./binr/radare2/radare2 -e "asm.describe=false" -e "scr.color=0" -Qc "pd 12 @ 0x00065f70" test/bins/mach0/Alamofire-stripped
            0x00065f70      81040f58       ldr x1, 0x84000
            0x00065f74      60070094       bl 0x67cf4
        ┌─< 0x00065f78      e00000b4       cbz x0, 0x65f94
       ┌──> 0x00065f7c      1f0013eb       cmp x0, x19
      ┌───< 0x00065f80      80000054       b.eq 0x65f90
      │╎│   0x00065f84      ff060094       bl 0x67b80
      │└──< 0x00065f88      a0ffffb5       cbnz x0, 0x65f7c
      │┌──< 0x00065f8c      02000014       b 0x65f94
      └───> 0x00065f90      e0030032       orr w0, wzr, 1
       └└─> 0x00065f94      fd7b41a9       ldp x29, x30, [sp, 0x10]
            0x00065f98      f44fc2a8       ldp x20, x19, [sp], 0x20
            0x00065f9c      c0035fd6       ret

after:

$ ./binr/radare2/radare2 -e "asm.describe=false" -e "scr.color=0" -Qc "pd 12 @ 0x00065f70" test/bins/mach0/Alamofire-stripped
            0x00065f70      81040f58       ldr x1, 0x84000
            0x00065f74      60070094       bl 0x67cf4
        ┌─< 0x00065f78      e00000b4       cbz x0, 0x65f94
       ┌──> 0x00065f7c      1f0013eb       cmp x0, x19
      ┌───< 0x00065f80      80000054       b.eq 0x65f90
      │╎│   0x00065f84      ff060094       bl 0x67b80
      │└──< 0x00065f88      a0ffffb5       cbnz x0, 0x65f7c
      │┌──< 0x00065f8c      02000014       b 0x65f94
      └───> 0x00065f90      e0030032       orr w0, wzr, 1
       └└─> 0x00065f94      fd7b41a9       ldp x29, x30, [sp, 0x10]
            0x00065f98      f44fc2a8       ldp x20, x19, [sp], 0x20
            0x00065f9c      c0035fd6       ret

(no change)

> Is the change in sdb correct, or should it be changed again because of the cmp <=?

Yes, I've made sure we keep the same behaviour with sdb.

…#anal Merge sort uses cmp (a, b) < 0 for its first test branch, and insertion sort cmp (a, b) > 0 ; which means the 0 boundary goes in one case in one branch, and in the other sort function in the other branch. It makes it possible to support compare functions that return true/false instead of -1/0/1; although this isn't an acceptable use of RListComparator, this prevents future bugs from appearing, because this works with insertion sort, but not merge sort. The main advantage of this patch is that both sort functions should sort equal elements the same way. This stability is important for zignatures, for example.

anisse · 2020-08-28T10:21:34Z

> Change the commit title and add ##anal. I think this is an important change and it should be in the changelog; the message should make clear that the function variable and reflines order is what is mainly affected.

> Should it be for all commits ?

I did it for commits that impact sort order and test fixes.

libr/reg/reg.c

ret2libc · 2020-09-02T07:05:59Z

@XVilka @thestr4ng3r @trufae someone please have another look at this and merge if you think it's ok!

XVilka

The comparison change looks good to merge. What looks wrong is the reversed order of the arguments/etc like @ret2libc pointed out.

radare · 2020-09-03T00:02:00Z

test/db/archos/linux-x64/dbg_drt

+xmm1h
+xmm1
+ds
+xmm1l


I think this order is messed up

It is, but that's because the register offsets are the same for xmm1l and ds : 184

radare2/libr/debug/p/native/linux/reg/linux-x64.h

Lines 159 to 161 in 2dfa75c

"xmm@fpu xmm1 .128 176 16\n"

"fpu xmm1h .64 176 8\n"

"fpu xmm1l .64 184 8\n"

radare2/libr/debug/p/native/linux/reg/linux-x64.h

Line 101 in 2dfa75c

"seg@gpr ds .64 184 0\n"
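The tie between ds and xmm1l can be sketched as a stable merge keyed on the offset. The `Reg` struct and the `merge_stable` helper are hypothetical and array-based, not the list code in radare2; the offsets are the ones quoted above. With a `<= 0` test, an element from the left run wins a tie, so ds (which appears earlier in the register profile) stays ahead of xmm1l even though both sit at offset 184.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
	const char *name;
	int offset;
} Reg;

/* Three-way comparison on the register offset only. */
static int reg_cmp(const Reg *a, const Reg *b) {
	return (a->offset > b->offset) - (a->offset < b->offset);
}

/* Stable merge: on a tie (cmp == 0) the element from the left run wins. */
static void merge_stable(const Reg *l, size_t nl, const Reg *r, size_t nr, Reg *out) {
	size_t i = 0, j = 0, k = 0;
	while (i < nl && j < nr) {
		out[k++] = (reg_cmp (&l[i], &r[j]) <= 0) ? l[i++] : r[j++];
	}
	while (i < nl) {
		out[k++] = l[i++];
	}
	while (j < nr) {
		out[k++] = r[j++];
	}
}

static void demo_ties(void) {
	/* ds comes first in the input, so it belongs to the left run. */
	const Reg left[] = { { "ds", 184 } };
	const Reg right[] = { { "xmm1h", 176 }, { "xmm1l", 184 } };
	Reg out[3];
	merge_stable (left, 1, right, 2, out);
	assert (!strcmp (out[0].name, "xmm1h"));
	assert (!strcmp (out[1].name, "ds"));
	assert (!strcmp (out[2].name, "xmm1l"));
}
```

If a deterministic order among equal offsets is wanted regardless of input order, a secondary key (for example the name) would have to be added to the comparator; that is a design choice this sketch does not make.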

radare · 2020-09-03T00:02:34Z

test/db/archos/linux-x64/dbg_drt

-ymm14
-ymm13
-ymm12
-ymm11


It's properly (reversed) sorted

Considering the original function wants to sort in order, this is the issue this PR fixes:
https://github.com/radareorg/radare2/blob/master/libr/reg/reg.c#L200-L203

You can verify this by always using insertion sort instead here:
https://github.com/radareorg/radare2/blob/master/libr/util/list.c#L573-L577

radare · 2020-09-03T00:03:00Z

test/db/archos/linux-x64/dbg_drt

 r12
+dr3
+mxcr_mask


random order :?

This corresponds to the offset of the register: dr3 is at 24 and mxcr_mask at 28:
dr3: https://github.com/radareorg/radare2/blob/master/libr/anal/p/anal_x86_cs.c#L3576
mxcr_mask: https://github.com/radareorg/radare2/blob/master/libr/anal/p/anal_x86_cs.c#L3604

Sorry, wrong links, this is the file:

radare2/libr/debug/p/native/linux/reg/linux-x64.h

Line 108 in 2dfa75c

"drx dr3 .64 24 0\n"

radare2/libr/debug/p/native/linux/reg/linux-x64.h

Line 136 in 2dfa75c

"fpu mxcr_mask .32 28 0\n"
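To make the offset comparison concrete, the two profile lines quoted above can be parsed with a small sketch. The `RegLine` struct and `parse_reg_line` function are hypothetical and are not how radare2 parses register profiles. The fourth field is the offset, as stated in this thread; reading the third field as the bit size is an assumption, and the meaning of the fifth field is not discussed here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct {
	char type[32];
	char name[32];
	int bits;   /* third field, assumed to be the size in bits */
	int offset; /* fourth field, the offset compared in this thread */
	int last;   /* fifth field; its meaning is not discussed in this thread */
} RegLine;

/* Parses one line such as "fpu mxcr_mask .32 28 0". Returns 1 on success. */
static int parse_reg_line(const char *line, RegLine *out) {
	return sscanf (line, "%31s %31s .%d %d %d",
		out->type, out->name, &out->bits, &out->offset, &out->last) == 5;
}

static void demo_parse(void) {
	RegLine dr3, mxcr;
	assert (parse_reg_line ("drx dr3 .64 24 0\n", &dr3));
	assert (parse_reg_line ("fpu mxcr_mask .32 28 0\n", &mxcr));
	assert (!strcmp (dr3.name, "dr3") && dr3.offset == 24);
	assert (!strcmp (mxcr.name, "mxcr_mask") && mxcr.offset == 28);
	/* Sorted by offset, dr3 comes before mxcr_mask. */
	assert (dr3.offset < mxcr.offset);
}
```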

ret2libc · 2020-09-03T07:05:23Z

@XVilka @trufae as said in previous comments, we think those "random" orders are just because they were never really sorted in the first place. If you look at those tests, the original order was not really better than the new one. It seems like somehow they seemed sorted, but there is no real sorting underneath. Probably those commands should sort the things before printing them themselves.

trufae · 2020-09-04T22:44:09Z

lgtm

ret2libc · 2020-09-06T17:53:05Z

@anisse can you make the PR ready if it's ok to merge? Just to be sure you don't intend to add anything to this. Thanks again for this!

dismissing because trufae has accepted the changes.

anisse requested review from ret2libc, thestr4ng3r and trufae as code owners August 16, 2020 13:02

anisse mentioned this pull request Aug 16, 2020

Add minimal armv7 and aarch32 VFP and NEON support ##esil #17462

Merged

4 tasks

github-actions bot added the r2r Regression tests label Aug 16, 2020

ret2libc reviewed Aug 16, 2020

anisse force-pushed the mergesort branch from b44905d to a7d18c2 Compare August 17, 2020 17:07

anisse force-pushed the mergesort branch from a7d18c2 to d42dc86 Compare August 19, 2020 14:26

anisse force-pushed the mergesort branch from d42dc86 to edc1b91 Compare August 25, 2020 10:41

anisse requested a review from XVilka as a code owner August 25, 2020 10:41

anisse force-pushed the mergesort branch from edc1b91 to 10f7726 Compare August 25, 2020 21:51

ret2libc reviewed Aug 26, 2020

anisse force-pushed the mergesort branch from 10f7726 to 7e73e43 Compare August 26, 2020 21:31

ret2libc reviewed Aug 27, 2020

ret2libc suggested changes Aug 28, 2020

ret2libc self-assigned this Aug 28, 2020

Trace debugger test: fix for distro with different glibc

c3d93d4

This fixes the trace debugger test by removing the content of rdx, which changes on Fedora glibc, or recent Ubuntu with glibc AVX2 support. Ideally this test should be modified to depend less on the system libc.

anisse added 2 commits August 28, 2020 10:15

Double regs: add fake support for 256 bits registers

c935dcd

Update SDB with merge sort and meson fixes

b92b2b9

anisse force-pushed the mergesort branch from 7e73e43 to d76be47 Compare August 28, 2020 08:15

trufae reviewed Aug 28, 2020

ret2libc reviewed Aug 28, 2020

anisse added 3 commits August 28, 2020 12:19

RListComparator: fix sort comparators that return bool to return -1, 0 or 1 ##anal

977364d

Fix tests that were impacted by wrong register ordering ##anal

53e9320

anisse force-pushed the mergesort branch from d76be47 to 53e9320 Compare August 28, 2020 10:20

ret2libc reviewed Aug 31, 2020

ret2libc approved these changes Sep 1, 2020

XVilka approved these changes Sep 2, 2020

radare previously requested changes Sep 3, 2020

XVilka marked this pull request as draft September 3, 2020 00:06

trufae approved these changes Sep 4, 2020

anisse marked this pull request as ready for review September 6, 2020 18:07

ret2libc merged commit a4c76ff into radareorg:master Sep 9, 2020

anisse deleted the mergesort branch September 24, 2020 21:41
