Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parsing of incomplete rules #159

Merged
merged 18 commits into from
May 26, 2021
Merged

Conversation

TadeasKucera
Copy link
Contributor

@TadeasKucera TadeasKucera commented May 18, 2021

This PR solves issue #45 and allows parsing of:

  • incomplete rules (we can get TokenStream after parsing)
  • rules containing unknown identifiers (we can get text, text_formatted and condition)
  • YARA files importing non-existent files (we can get text, text_formatted and condition)

: _type(type), _name(name), _documentation(documentation), _dataType(dataType)
{
}
/// @}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I needed this constructor public, so I could construct a new Symbol instance in parser_driver.cpp on line 1635.

@@ -12,6 +12,8 @@
#include "yaramod/types/token_type.h"
#include "yaramod/utils/filesystem_operations.h"

#include <iostream>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started printing through std::cerr instead of throwing ParserError when parsing fails in IncompleteMode.This only happens if a YARA rule is not finished (missing }, etc.). It prints the error as Warning but we can still access the TokenStream as we can see in ParseIncompleteRuleNotFinished cpp parser test.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Printing from libraries is generally bad practice. It's much better if library returns the errors/warnings to the user and they can decide what to do with them. Imagine a situation that you don't want to print to stderr or simply stderr is missing because the application is daemon. How would you know about such error/warning?

const auto& parentTokenText = parentExpr->getSymbolToken()->getText();
parentExpr->setSymbol(std::make_shared<ArraySymbol>(parentTokenText, ExpressionType::Undefined));
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why so complicated? I needed the tormatted text to still look nice, which means I had to replace some Symbol base class instances with instances of particular derived classes. Also it was necessary to determine correct types of some Token instances (not everywhere).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good thing is, that the IdExpression::setSymbol method also updates the pointer stored in the Token associated with this Symbol instance.

else
{
std::cerr << "Warning: '" << e.getErrorMessage() << "'\nThis Warning would be an Error if not parsing in Incomplete mode." << std::endl;
return true;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would appreciate any reviewer's opinien on if this printout is actually wanted. Thank you in advance.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I said earlier, let the user handle the prints themselves. But I'd argue here that it's pointless to show something like this to an user. Incomplete mode is opt-in so user must know what they are doing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decided to stop printing these Warnings, thank you for the suggestion.

}

TEST_F(ParserTests,
ParseIncompleteRuleNotFinished) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is how incomplete rule can be parsed. The TokenStream instance is still accessible.

'''
self.assertEqual(expected, yara_file.text_formatted)

def test_include_undefined_file_in_incomplete_mode(self):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I already had some file including tests in this file so I added it here in Python.

@TadeasKucera TadeasKucera marked this pull request as ready for review May 20, 2021 08:49
@TadeasKucera TadeasKucera requested a review from metthal May 20, 2021 08:49
@@ -12,6 +12,8 @@
#include "yaramod/types/token_type.h"
#include "yaramod/utils/filesystem_operations.h"

#include <iostream>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Printing from libraries is generally bad practice. It's much better if library returns the errors/warnings to the user and they can decide what to do with them. Imagine a situation that you don't want to print to stderr or simply stderr is missing because the application is daemon. How would you know about such error/warning?

else
{
std::cerr << "Warning: '" << e.getErrorMessage() << "'\nThis Warning would be an Error if not parsing in Incomplete mode." << std::endl;
return true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I said earlier, let the user handle the prints themselves. But I'd argue here that it's pointless to show something like this to an user. Incomplete mode is opt-in so user must know what they are doing.

@TadeasKucera
Copy link
Contributor Author

Thank you for your comments. I removed the Warning when ParsingError for the reason you stated.

@TadeasKucera TadeasKucera requested a review from metthal May 20, 2021 12:57
@TadeasKucera TadeasKucera merged commit 5c8dd98 into master May 26, 2021
@TadeasKucera TadeasKucera deleted the parsing_of_incomplete_rules branch May 26, 2021 08:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants