YARA UI & Service Updates #55

Closed
eljeffeg opened this issue Mar 16, 2023 · 17 comments

Comments

@eljeffeg

eljeffeg commented Mar 16, 2023

  1. It would be nice if the Signature Search also searched on Source. So if I search bartblaze or reversinglabs, I'd get some Yara hits for Signatures. I couldn't find the Yara (perhaps due to a 10,000 limit on Signatures) so I was searching on the Source and kept getting 0 hits. I didn't realize I needed to search Yara. I did end up finding them by clicking on the source fingerprint.

  2. Some of my YARA source updates don't appear to be processing the YARA rules. Here are a couple of public repositories that I've added that report a successful download but show no signatures.

@eljeffeg eljeffeg added assess We still haven't decided if this will be worked on or not bug Something isn't working labels Mar 16, 2023
@cccs-jp
Contributor

cccs-jp commented Mar 16, 2023

For number 1 you can search with: source:reversinglabs (case sensitive)
@cccs-sgaron will investigate indexing all the text of the rule body in the future.
@cccs-rs for #2 :)

@cccs-rs
Contributor

cccs-rs commented Mar 17, 2023

Regex seems wrong, should be .*\.yara$? I'll test it with the sources mentioned.
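
To illustrate the difference between the two patterns, here is a minimal sketch using Python's re module. The sample paths are made up, and applying the pattern with re.match against a relative file path is an assumption about how the updater uses it, not something confirmed in this thread.

```python
import re

# Hypothetical file paths, for illustration only.
paths = ["rules/apt_sample.yara", "rules/notes_yara", "rules/readme.md"]

unescaped = re.compile(r".*.yara$")   # the dot before "yara" matches any character
escaped = re.compile(r".*\.yara$")    # the dot before "yara" must be a literal "."

unescaped_hits = [p for p in paths if unescaped.match(p)]
escaped_hits = [p for p in paths if escaped.match(p)]

print(unescaped_hits)
print(escaped_hits)
```

Both patterns accept a genuine .yara path. The escaped form only rules out names such as notes_yara, so the missing backslash alone would not explain an empty signature list.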

@cccs-rs
Contributor

cccs-rs commented Mar 17, 2023

@eljeffeg
Author

Sorry, I did have the dot in front (.*.yara$). I also tried adding the backslash, but got the same non-result.

@cccs-rs
Contributor

cccs-rs commented Mar 17, 2023

Source Configurations (typo on the second but still matched the file path):
image

Which yielded signatures being imported for both sources:
image

Is there anything in the yara-updates container to suggest what the issue could be?

@eljeffeg
Author

No errors in the logs. For all purposes that I can tell, it completes successfully. There are just no signatures when I click on the fingerprint. Status: Skipped.
Screenshot 2023-03-17 at 9 41 41 AM
Screenshot 2023-03-17 at 9 41 32 AM
Screenshot 2023-03-17 at 9 41 48 AM

@cccs-rs
Contributor

cccs-rs commented Mar 20, 2023

Issue partially resolved by clearing the cached entries in Redis. A ticket has been created to add a cache-less update mechanism.

@eljeffeg
Author

eljeffeg commented Mar 24, 2023

Documenting what @cccs-rs and I have discussed in chat (and also a reminder). Clearing the Redis cache resolved 2 of my 3 issues with the yara repositories. The last repo reported the error 'utf-8' codec can't decode byte 0xbb in position 908: invalid start byte while it was reading the files.

The line causing the exception is here: https://github.com/CybercentreCanada/assemblyline-service-yara/blob/3061ac084004db4cb514d192ab88fd142ad9d09e/yara_/update_server.py#L96

The readlines() causes an exception as it finds a character it can't decode. I did verify the offending files are utf-8 files that just have unknown characters in them. Here is an example.

What I guess I'd suggest is maybe a couple things...

  1. Might make sense to include the reading of a file in the try / except statement, so even if there is an error on a file, it continues to read the other rules in the repo.

  2. Probably would be good to define errors in the open(errors=) so that it can hopefully use the rule as intended. I'm not sure if they should be set to ignore, replace, or maybe surrogateescape? https://docs.python.org/3/library/functions.html#open
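
As a rough comparison of the three options named in the second suggestion, the sketch below decodes a byte string containing a stray 0xBB with each handler. The sample bytes are made up. bytes.decode accepts the same errors= values as open(), so the outcome should carry over to reading files.

```python
# A byte string that is valid ASCII except for a stray 0xBB byte (made-up sample).
raw = b"Data\xbb"

try:
    raw.decode("utf-8")            # default handler is errors="strict"
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

ignored = raw.decode("utf-8", errors="ignore")            # drops the bad byte
replaced = raw.decode("utf-8", errors="replace")          # substitutes U+FFFD
escaped = raw.decode("utf-8", errors="surrogateescape")   # keeps it as a lone surrogate

print(strict_failed, repr(ignored), repr(replaced), repr(escaped))
```

Only surrogateescape keeps enough information to recover the original byte later; ignore and replace both lose it.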

@malvidin

malvidin commented Mar 30, 2023

Based on these comments, I recommend ignoring any encoding errors.
VirusTotal/yara#1770 (comment)
VirusTotal/yara-python#136 (comment)

I expect that the offending files are CP1252, but the invalid characters are in comments, which YARA ignores.

@eljeffeg
Author

I did have one that appeared in the rule itself. Don't know if it was intentional or not, but it looked like this.

$str_05 = "TESTING"
$str_06 = "SOFTWARE\\Clients\\Mail"
$str_07 = "8.8.8.8"
$str_08 = "<(:<\\Documents and Settings\\all users\\Application Data\\�"
$str_09 = "C:\\ProgramData\\Microsoft\\RAC\\"

@malvidin

$str_08 = "<(:<\Documents and Settings\all users\Application Data\�"

Do you have a link to the source, or can you upload that portion of the rule here? Strings pasted inline won't retain the source encoding.

@eljeffeg
Author

eljeffeg commented Mar 31, 2023

It's not public, but here is a modified file based on that rule. I have 4 yara files that fail on the reading due to a character encoding.
testgen.txt

@malvidin

The file you uploaded contains the Unicode replacement character. It appears that the original character was lost.

I can replicate the issue by converting the file to CP-1252 and adding the 0xBB byte in that position. However, successful compilation or matches from YARA on rule files in non-UTF-8 encodings should not be relied upon.

test_cp1252.txt

@eljeffeg
Author

Yeah, it was a BB hex for that one, which comes across in my hex editor as ».
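
A quick check of how the 0xBB byte behaves under the encodings mentioned in this thread, as a small Python sketch. It only uses the standard codecs and does not touch the service code.

```python
bad_byte = b"\xbb"

# In the single-byte encodings, 0xBB is the right-pointing double angle quote.
as_cp1252 = bad_byte.decode("cp1252")
as_latin1 = bad_byte.decode("iso-8859-1")

# In UTF-8, a lone 0xBB is a continuation byte and cannot start a character.
try:
    bad_byte.decode("utf-8")
    utf8_error = None
except UnicodeDecodeError as exc:
    utf8_error = exc.reason

# A real UTF-8 file would store the same character as two bytes.
as_utf8_bytes = as_cp1252.encode("utf-8")

print(ascii(as_cp1252), utf8_error, as_utf8_bytes)
```

The reason string matches the "invalid start byte" message reported earlier, which is consistent with the file having been saved in a single-byte encoding.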

@eljeffeg
Author

I'm not overly concerned if one rule in a thousand doesn't work - just wasn't sure what the best way to handle it is. It does need to be handled in some way though so it doesn't cause an exception and abort the entire repo read.
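
One way to keep a single bad file from aborting the whole repo read is to catch the decode error per file. The sketch below is only an illustration: read_rule_files is a hypothetical helper, not a function from the YARA service, and the two rule files are created on the fly.

```python
import tempfile
from pathlib import Path


def read_rule_files(paths):
    """Read each file as UTF-8; record failures instead of aborting (hypothetical helper)."""
    loaded, failed = {}, []
    for path in paths:
        try:
            loaded[path.name] = path.read_text(encoding="utf-8")
        except UnicodeDecodeError as exc:
            # Skip only this file and keep going with the rest.
            failed.append((path.name, exc.reason))
    return loaded, failed


with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "good.yara"
    bad = Path(tmp) / "bad.yara"
    good.write_bytes(b"rule good { condition: true }\n")
    bad.write_bytes(b"rule bad { condition: true } // \xbb\n")
    # The bad file comes first to show that the good one is still read.
    loaded, failed = read_rule_files([bad, good])

print(sorted(loaded), failed)
```

Collecting the failures rather than raising also gives the updater something concrete to log per source.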

@malvidin

Looks like I was behind on the status of the rule encodings. But there appears to be a consistent desire from the development team to only support UTF-8 in the future.
VirusTotal/yara@68eb237

iso-8859-1 will accept any byte, but the values in [\x80-\xFF] may be incorrect if a different encoding was used at the source.

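                try:
                    # Read as UTF-8 explicitly rather than relying on the locale default.
                    with open(file, 'r', encoding='utf-8') as f:
                        f_lines = f.readlines()
                except UnicodeDecodeError as exc:
                    # iso-8859-1 maps every byte to a character, so this read cannot fail on decoding.
                    with open(file, 'r', encoding='iso-8859-1') as f:
                        f_lines = f.readlines()
                    self.log.error(f"File could not be loaded as UTF-8: {file} ({exc})")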
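
The caveat about iso-8859-1 can be checked with a short sketch: every byte decodes and round-trips, but a byte above 0x7F may turn into a different character than the author intended if the file was really written in another encoding. CP-1252 is used as the comparison here only because it came up earlier in the thread.

```python
all_bytes = bytes(range(256))

# iso-8859-1 maps byte N to code point N, so decoding never fails and round-trips.
decoded = all_bytes.decode("iso-8859-1")
round_trip = decoded.encode("iso-8859-1")

# The same byte can mean a different character in another single-byte encoding.
byte80_cp1252 = b"\x80".decode("cp1252")       # EURO SIGN in Windows-1252
byte80_latin1 = b"\x80".decode("iso-8859-1")   # a C1 control character in ISO-8859-1

print(len(decoded), round_trip == all_bytes, ascii(byte80_cp1252), ascii(byte80_latin1))
```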

@cccs-rs
Contributor

cccs-rs commented Apr 12, 2023

v4.4.0.stable5 of the YARA service will use the surrogateescape error handler when reading the contents of YARA files with non-UTF-8 characters.
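
For readers unfamiliar with surrogateescape, the sketch below shows the general idea with plain open() calls: undecodable bytes survive a read and a write unchanged. It is not the service's actual implementation, and the file names and rule text are made up.

```python
import os
import tempfile

# Made-up rule text with one byte (0xBB) that is not valid UTF-8.
original = b"// note \xbb\nrule r { condition: true }\n"

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "in.yara")
    dst = os.path.join(tmp, "out.yara")
    with open(src, "wb") as f:
        f.write(original)

    # Undecodable bytes become lone surrogates instead of raising an exception.
    with open(src, "r", encoding="utf-8", errors="surrogateescape", newline="") as f:
        text = f.read()

    # Writing with the same handler turns the surrogates back into the original bytes.
    with open(dst, "w", encoding="utf-8", errors="surrogateescape", newline="") as f:
        f.write(text)

    with open(dst, "rb") as f:
        round_trip = f.read()

print("\udcbb" in text, round_trip == original)
```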

Services w/ updaters built from v4.4.0.stable5 of Assemblyline will now be able to instruct the Scaler about when to scale a service.

If a service depends on the bundle sent by the updater, as indicated by the wait_for_update flag (e.g. YARA), then the service is considered 'active' and ready for scaling only if the updater deems it so (i.e. it has at least 1 downloaded source).

If you notice this issue still persists, feel free to re-open the issue with more information! 😁

@cccs-rs cccs-rs closed this as completed Apr 12, 2023