Generate UIDs for AST nodes with Expression base class #218

Merged
merged 5 commits into from
Jun 9, 2022
3 changes: 2 additions & 1 deletion docs/rtd/parsing_rulesets.rst
@@ -302,7 +302,8 @@ Let's say we want to print each function that is in called in the rule condition
Expression types
****************

-There are a lot of expression types that you can visit. Here is a list of them all:
+Each expression type has its own unique id (uid). These uids are unique only within scope of a single rule,
+this allows to identify specific node in the AST for extra processing. There are a lot of expression types that you can visit. Here is a list of them all:

**String expressions**

3 changes: 3 additions & 0 deletions include/yaramod/parser/parser_driver.h
@@ -19,6 +19,7 @@
#include <pog/pog.h>

#include "yaramod/parser/file_context.h"
#include "yaramod/parser/uid_generator.h"
#include "yaramod/parser/value.h"
#include "yaramod/types/expressions.h"
#include "yaramod/types/meta.h"
@@ -204,6 +205,8 @@ class ParserDriver
bool _sectionStrings = false; ///< flag used to determine if we parse section after 'strings:'
bool _escapedContent = false; ///< flag used to determine if a currently parsed literal contains hexadecimal byte (such byte must be unescaped in getPureText())

UidGenerator _uidGen; ///< Generator of unique IDs for AST nodes

ParserMode _mode; ///< Parser mode.

Features _features; ///< Used to determine whether to include Avast-specific or VirusTotal-specific symbols or to skip them
32 changes: 32 additions & 0 deletions include/yaramod/parser/uid_generator.h
@@ -0,0 +1,32 @@
/**
* @file src/parser/uid_generator.h
* @brief Declaration of class UidGenerator.
* @copyright (c) 2022 Avast Software, licensed under the MIT license
*/

#pragma once

#include <cstdint>

namespace yaramod {

/**
* Class that deterministically generates
* up to 2^64 unique IDs for AST nodes.
*
* The IDs are unique only for a given input,
* so only the pair (input, node) identifies a node uniquely.
* This means UidGenerator has to be reset
* for every new input.
*/
class UidGenerator
{
public:
std::uint64_t next() { return _counter++; }
void reset() { _counter = 0; }

private:
std::uint64_t _counter = 0;
};

} // namespace yaramod
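
The counter semantics of `UidGenerator` can be illustrated with a small standalone sketch. The class body is copied from the diff so that the example compiles on its own; `takeUids` is a hypothetical helper that is not part of yaramod and only stands in for whatever code requests IDs while nodes are being built.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Copy of UidGenerator from this diff, repeated so the sketch is self-contained.
class UidGenerator
{
public:
	std::uint64_t next() { return _counter++; }
	void reset() { _counter = 0; }

private:
	std::uint64_t _counter = 0;
};

// Hypothetical helper (not in yaramod): requests `count` consecutive IDs,
// roughly as a parser might while creating the nodes of one input.
std::vector<std::uint64_t> takeUids(UidGenerator& gen, std::size_t count)
{
	std::vector<std::uint64_t> uids;
	uids.reserve(count);
	for (std::size_t i = 0; i < count; ++i)
		uids.push_back(gen.next());
	return uids;
}
```

Calling `takeUids(gen, 3)` on a fresh generator yields 0, 1, 2. After `gen.reset()` the sequence starts again at 0, which is why an ID alone does not identify a node across different inputs.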
4 changes: 4 additions & 0 deletions include/yaramod/types/expression.h
@@ -6,6 +6,7 @@

#pragma once

#include <cstdint>
#include <memory>
#include <sstream>
#include <string>
@@ -56,6 +57,7 @@ class Expression

/// @name Getter methods
/// @{
std::uint64_t getUid() const { return _uid; }
Expression::Type getType() const { return _type; }
std::string getTypeString() const
{
@@ -86,6 +88,7 @@ class Expression

/// @name Setter methods
/// @{
void setUid(std::uint64_t uid) { _uid = uid; }
void setType(Expression::Type type) { _type = type; }
void setTokenStream(const std::shared_ptr<TokenStream>& ts) { _tokenStream = ts; }
/// @}
@@ -157,6 +160,7 @@ class Expression

private:
Type _type; ///< Type of the expression
std::uint64_t _uid; ///< Unique ID of the expression (unique only within a single rule)
};

}
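
One possible use of `getUid()` for the "extra processing" mentioned in the documentation change is a side table keyed by uid. The sketch below is only an illustration under stated assumptions: `Expression` is a minimal stand-in reduced to the two accessors from this diff, and `NodeAnnotations` is a hypothetical class that does not exist in yaramod.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>

// Minimal stand-in for yaramod::Expression, reduced to the uid accessors from this diff.
class Expression
{
public:
	std::uint64_t getUid() const { return _uid; }
	void setUid(std::uint64_t uid) { _uid = uid; }

private:
	std::uint64_t _uid = 0; // initialised for the sketch; the diff declares it without an initialiser
};

// Hypothetical side table (not in yaramod): stores extra per-node data keyed by uid.
// Because uids are unique only within a single rule, one table should cover one rule.
class NodeAnnotations
{
public:
	void annotate(const Expression& expr, std::string note)
	{
		_notes[expr.getUid()] = std::move(note);
	}

	// Returns the stored note, or nullptr if the node has no annotation.
	const std::string* find(const Expression& expr) const
	{
		auto it = _notes.find(expr.getUid());
		return it == _notes.end() ? nullptr : &it->second;
	}

private:
	std::unordered_map<std::uint64_t, std::string> _notes;
};
```

Keying by uid rather than by node pointer keeps the annotations valid even if nodes are copied or rebuilt, provided the uid is carried over. That is an assumption about how callers would use it, not something this diff guarantees.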