XOR Values. #1745

Merged
merged 2 commits into from
Jul 21, 2022

@wxsBSD wxsBSD commented Jul 6, 2022

Until now, when the xor modifier is used, we have not displayed (or even kept)
the xor key that produced a match. This diff adds a -X option to the CLI that
displays the xor key. To do this, I record the xor key in _yr_scan_xor_compare()
and _yr_scan_xor_wcompare() and then store it in the YR_MATCH structure. This
way it is available to consumers of libyara to handle as they see fit.
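
To illustrate what "recording the xor key" means, here is a minimal Python sketch of the idea. It is not libyara's implementation: it brute-forces every single-byte key in a range and keeps the key next to each matching offset. The names xor_bytes and find_xor_matches, and the padded test buffer, are hypothetical and exist only for this example.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with a single-byte key."""
    return bytes(b ^ key for b in data)


def find_xor_matches(buffer: bytes, plaintext: bytes,
                     min_key: int = 0, max_key: int = 255):
    """Return sorted (offset, key) pairs where plaintext XOR key occurs in buffer."""
    matches = []
    for key in range(min_key, max_key + 1):
        needle = xor_bytes(plaintext, key)
        start = buffer.find(needle)
        while start != -1:
            # Keep the key alongside the offset instead of discarding it.
            matches.append((start, key))
            start = buffer.find(needle, start + 1)
    return sorted(matches)


plaintext = b"This program cannot"
# Hypothetical buffer: six encoded copies, each preceded by 4 padding bytes.
buffer = b"".join(b"\xff" * 4 + xor_bytes(plaintext, k) for k in range(6))

for offset, key in find_xor_matches(buffer, plaintext, 0, 5):
    print(f"{offset:#x}:{len(plaintext)}:0x{key:02x}")
```

The key range 0 to 5 mirrors the xor(0-5) modifier used in the example rule; the offsets printed depend on the made-up buffer layout, not on tests/data/xor.out.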

The yara command gets a -X argument that displays the key as a hex value (I find that easier to read). It adds an extra field to the output when an xor string is found, and nothing when a non-xor string is found. See this rule and output for an example:

wxs@mbp yara % cat rules/xor.yara
rule a {
  strings:
    $a = "This program cannot"
    $b = "This program cannot" xor(0-5)
  condition:
    any of them
}
wxs@mbp yara % ./yara -sLX rules/xor.yara tests/data/xor.out
a tests/data/xor.out
0x4:19:$a: This program cannot
0x4:19:$b:0x00: This program cannot
0x1c:19:$b:0x01: Uihr!qsnfs`l!b`oonu
0x34:19:$b:0x02: Vjkq"rpmepco"acllmv
0x4c:19:$b:0x03: Wkjp#sqldqbn#`bmmlw
0x64:19:$b:0x04: Plmw$tvkcvei$gejjkp
0x7c:19:$b:0x05: Qmlv%uwjbwdh%fdkkjq
wxs@mbp yara %

As you can see, $a does not use the xor modifier, so its fourth field is the string contents. The other lines have five fields because they are xor strings, and as such they get the xor key in the fourth field. I'm a bit torn on whether this is the right way to do it. The other option I considered was always including an xor value, even if the string is not an xor string. That would look like this:

wxs@mbp yara % ./yara -sLX rules/xor.yara tests/data/xor.out
a tests/data/xor.out
0x4:19:$a:0x00: This program cannot
0x4:19:$b:0x00: This program cannot
0x1c:19:$b:0x01: Uihr!qsnfs`l!b`oonu
0x34:19:$b:0x02: Vjkq"rpmepco"acllmv
0x4c:19:$b:0x03: Wkjp#sqldqbn#`bmmlw
0x64:19:$b:0x04: Plmw$tvkcvei$gejjkp
0x7c:19:$b:0x05: Qmlv%uwjbwdh%fdkkjq
wxs@mbp yara %

Notice that the $a string has a fourth field that is the xor key, even though it does not use the xor modifier. This makes the output consistent, but at the cost of confusing users. It doesn't make sense to me for there to be a field holding the xor value if the string is not an xor string.

I'll be adding support for exposing the xor key in yara-python if this PR is accepted.


wxsBSD commented Jul 7, 2022

Just realized I forgot to add some tests for this. Let me know what you think and I'll add tests if you like it.


plusvic commented Jul 14, 2022

What if instead of printing the plain xor key, we print something like xor(key), for example:

0x64:19:$b:xor(0x04): Plmw$tvkcvei$gejjkp

This provides more context about the meaning of the 0x04. Also, when -X is used I would print the xor key for all strings even if they don't have the xor modifier. If you are processing the output with awk or some other program, the different number of "columns" in the output makes it harder to handle.

Per suggestion from Victor, always display the xor key when -X is specified,
even if the string is not an xor string. This makes it more consistent to parse
with common tools because the number of fields will always be the same.

Also, specify that it is an xor key using "xor(0x01)" format.
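
Since the point of the fixed field count is easier machine parsing, here is a rough Python sketch of a parser for one match line. It assumes the line layout shown in the example above (offset:length:$identifier:xor(key): data) as produced with -sLX; the XorMatch type and parse_match_line function are hypothetical names for this example, not part of yara or yara-python.

```python
import re
from typing import NamedTuple


class XorMatch(NamedTuple):
    offset: int
    length: int
    identifier: str
    xor_key: int
    data: str


# Assumed layout: 0x64:19:$b:xor(0x04): Plmw$tvkcvei$gejjkp
LINE_RE = re.compile(
    r"^(0x[0-9a-fA-F]+):(\d+):(\$\w*):xor\((0x[0-9a-fA-F]+)\): (.*)$"
)


def parse_match_line(line: str) -> XorMatch:
    """Split one match line into typed fields; raise ValueError if it does not fit."""
    m = LINE_RE.match(line)
    if m is None:
        raise ValueError(f"unrecognized match line: {line!r}")
    offset, length, identifier, key, data = m.groups()
    return XorMatch(int(offset, 16), int(length), identifier, int(key, 16), data)


match = parse_match_line("0x64:19:$b:xor(0x04): Plmw$tvkcvei$gejjkp")
# Undo the xor to recover the original string. This is only meaningful when
# the printed data is plain printable ASCII, as it is in this example.
decoded = bytes(b ^ match.xor_key for b in match.data.encode("ascii"))
print(match.identifier, hex(match.xor_key), decoded.decode("ascii"))
```

Because every line has the same five fields, a non-xor string such as $a parses with the same pattern and simply reports a key of 0x00.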
@plusvic plusvic merged commit 0c1cbee into VirusTotal:master Jul 21, 2022
fengjixuchui added a commit to fengjixuchui/yara that referenced this pull request Jul 21, 2022
@wxsBSD wxsBSD deleted the xor_value branch July 21, 2022 13:07
@vthib vthib mentioned this pull request Dec 27, 2022