Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Undefined behaviour in telfhash causes different results in big-endian machines. #1874

Closed
plusvic opened this issue Feb 8, 2023 · 3 comments
Labels

Comments

@plusvic
Copy link
Member

plusvic commented Feb 8, 2023

Describe the bug

The test case below produces a different result in big-endian machines. Part of the issue was in the ELF module itself, but it was fixed in 32ae80d. However, after that fix I discovered another issue in the implementation of telfhash.

yara/tests/test-elf.c

Lines 263 to 270 in 32ae80d

assert_true_rule_file(
"import \"elf\" \
rule test { \
condition: \
elf.telfhash() == \
\"T174B012188204F00184540770331E0B111373086019509C464D0ACE88181266C09774FA\" \
}",
"tests/data/elf_with_imports");

The telfhash produced for that test case in big-endian machines is:

T174B021188204F00184540770331E0B111373086019509C464D0ACE88181266C09774FA

Notice that they are very similar, except for the third byte in the hash, where the nibbles are swapped:

T174B012..
T174B021..

The issue is related to this union

union
{
unsigned char qb;
struct
{
unsigned char q1ratio : 4;
unsigned char q2ratio : 4;
} QR;
} Q;

Apparently the order of fields q1ratio and q2ratio depend on the machine endianness, but the those fields are later read using the union's qb member, which causes a different value for qb. This looks like telfhash is relying on undefined behaviour of the C compiler.

Screenshots
If applicable, add screenshots to help explain your problem.

Please complete the following information:

  • OS: All, depend on machine endianness.
  • YARA version: 4.3.0
@plusvic plusvic added the bug label Feb 8, 2023
@plusvic
Copy link
Member Author

plusvic commented Feb 8, 2023

@HoundThe, FYI.

@plusvic
Copy link
Member Author

plusvic commented Feb 22, 2023

Fixed in b2612f5

@plusvic plusvic closed this as completed Feb 22, 2023
@plusvic
Copy link
Member Author

plusvic commented Feb 22, 2023

This issue was solved in https://github.com/trendmicro/tlsh in a different way. They simply changed the order of fields q1ratio and q2ratio in big endian systems. See:

trendmicro/tlsh@f03e509#diff-54a1a479e5038435cbb29bff5813fe612863ab07ab5bdc7e0685aa89ac31735dR123-R131

However, I think their solution is not general enough, as it only works if __SPARC or _AIX macros are defined. I'm not sure whether these macros are defined in a Linux running on the S390X platform.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant