Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support percent notation in grammar. #43

Merged
merged 2 commits into from
Dec 10, 2021

Conversation

wxsBSD
Copy link
Contributor

@wxsBSD wxsBSD commented Nov 4, 2021

Add support for "x% of them" notation to gyp.

While I was here I noticed that the protobuf generated code was being put into
the github.com/VirusTotal/gyp/pb directory, which was causing various problems
when trying to build things as they all expected to find the protobuf related
code in the pb/ directory. To deal with this I modified the Makefile so that the
protobuf generated code is going in the pb/ directory yet the go_package remains
as github.com/VirusTotal/gyp/pb. I'm not sure if this is going to cause problems
with builds of other things using bazel that depend upon this so it may be good
to get @Zohiartze to review this so we can avoid a repeat of #42.

Add support for "x% of them" notation to gyp.

While I was here I noticed that the protobuf generated code was being put into
the github.com/VirusTotal/gyp/pb directory, which was causing various problems
when trying to build things as they all expected to find the protobuf related
code in the pb/ directory. To deal with this I modified the Makefile so that the
protobuf generated code is going in the pb/ directory yet the go_package remains
as github.com/VirusTotal/gyp/pb. I'm not sure if this is going to cause problems
with builds of other things using bazel that depend upon this so it may be good
to get @Zohiartze to review this so we can avoid a repeat of VirusTotal#42.
parser/grammar.y Outdated
@@ -824,6 +824,13 @@ expression
Strings: $3,
}
}
| primary_expression '%' _OF_ string_set
{
$$ = &ast.PercentOf{
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing we do in the yara compiler is validate that if the percentage is defined that it is between 1 and 100 (inclusive). I'm not doing that here but I suspect I should.

@plusvic
Copy link
Member

plusvic commented Nov 4, 2021

Instead of creating a new expression PercentOf I think that a better solution is reusing the Of expression with a new type of quantifier Percentage, where Percentage is an expression itself. The advantages are twofold:

  • While using the AST you don't need to take into account that there are two kinds of nodes for the of expression, you just have an Of node that covers all cases.
  • If we ever introduce something like for x% of (...) we don't need to add yet another kind of node, we simply reuse the Percentage node and use it with the already existing ForOf.

@Zohiartze
Copy link
Contributor

@wxsBSD thank you for the fix and the heads up, as of my tests everything is all right with bazel 👍

@wxsBSD
Copy link
Contributor Author

wxsBSD commented Nov 11, 2021

Sorry for the delay, been a bit burnt out and unable to bring myself to finish this work the past week. I think I can get it done though...

Instead of creating a new expression PercentOf I think that a better solution is reusing the Of expression with a new type of quantifier Percentage, where Percentage is an expression itself.

To be clear, you're talking about making a new Of Expression in the protobuf definitions and in the modifying the Of struct in the AST?

@plusvic
Copy link
Member

plusvic commented Nov 15, 2021

No, the Of structure in the AST doesn't need to be modified. The Of structure has a Quantifier field, which right now can contain any kind of Expression.

What I mean is having a special kind of Expression (let's call it Percentage) for percentages. The x% portion in the x% of them expression will be represented by a Percentage node in the AST. When the Quantifier field in Of is of type Percentage that means that we have an expression like x% of ..... If we have x of them, the Quantifier will be some other type, like LiteralInteger.

In turn, Percentage can be just a wrapper for the expression corresponding to the x. I mean, Percentage can be declared as:

type Percentage struct {
  Expression
}

This way Percentage is simply wrapping any other expression. The wrapped expression is the AST for the x, while Percentage represents x%.

I'm not sure if my explanation is clear enough, do you get what I mean?

@plusvic
Copy link
Member

plusvic commented Nov 15, 2021

This way you don't need to change the existing Of structure AST. The way in which you express something like x% of them if by assigning a node of type Percentage to the Quantifier field.

@wxsBSD
Copy link
Contributor Author

wxsBSD commented Nov 16, 2021

I'm not sure if my explanation is clear enough, do you get what I mean?

I think I understand this now. I will take a shot at this in the next couple of days and update this PR.

@wxsBSD
Copy link
Contributor Author

wxsBSD commented Nov 16, 2021

The Of structure has a Quantifier field, which right now can contain any kind of Expression.

The Of structure looks like this:

// Of is an Expression representing a "of" operation. Example:
//   <quantifier> of <string_set>
type Of struct {
	Quantifier *Quantifier
	Strings    Node
}

The Quantifier field needs to be of type *Quantifier, which is not an Expression. It is its own type which wraps an Expression:

// Quantifier is an Expression used in for loops, it can be either a numeric
// expression or the keywords "any" or "all".
type Quantifier struct {
	Expression
}

I know you said Of does not need to be modified but if I want to use the Quantifier field wouldn't I need to change the type to Expression so that anything which satisfies the Expression implementation can be used? The idea being that both Quantifier and Percentage AST types implement it?

Turns out that trying to learn golang while hacking on abstract things like this is difficult. ;)

@plusvic
Copy link
Member

plusvic commented Nov 18, 2021

The Quantifier field in Of can keep its *Quantifier type, but the type Quantifier type can wrap any kind of Expression, so you can do things like:

quantifier := &Quantifier{KeywordAll}
quantifier := &Quantifier{&LiteralInteger{Value: 1}}

Similarly you could do:

type Percentage struct {
  Expression
}


quantifier := &Quantifier{&Percentage{&LiteralInteger{Value: 1}}}

When you say 1% of ..., the Quantifier instance, instead of containing a LiteralInteger embedded, would contain a Percentage which in turns contains a LiteralInteger.

Address comments from Victor, which I understand conceptually but may have
implemented in a weird way. ;)
@wxsBSD
Copy link
Contributor Author

wxsBSD commented Dec 9, 2021

Sorry for the delay, I didn't see your response until a few days ago but I think I finally got it. This is ready for review now.

@@ -971,6 +973,14 @@ string_enumeration_item
;


percent_expression
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

percent_expression is not needed, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed this clause in bd33cde

@plusvic plusvic merged commit 51d2ad1 into VirusTotal:master Dec 10, 2021
@wxsBSD wxsBSD deleted the new_grammar_features branch December 11, 2021 02:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants