Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow to add a description for every feature #39

Merged
merged 5 commits into from
Jul 2, 2020
Merged

Conversation

Ana06
Copy link
Member

@Ana06 Ana06 commented Jun 30, 2020

Enable associate context for all features. This was called symbol before and only enabled for number, offset and bytes. Example:

  features:
    - and:
      - or:
        - api: kernel32.LoadLibrary = I am an inline description
        - api: kernel32.GetModuleHandle
        - api: kernel32.GetModuleHandleEx
        - api: ntdll.LdrLoadDll
      - or:
        - api: kernel32.GetProcAddress
        - api: ntdll.LdrGetProcedureAddress
      - optional:
        - characteristic(indirect call): true
		  description: I am a description

Screen Shot 2020-06-30 at 14 41 45

Inline syntax is not supported for strings.

This is not enabled for strings with regular expressions, as they are not a feature. It is needed to either make RegExp a feature or implement this for RegExp as well. I haven't implement this yet because I have some things to discuss about this implementation. I'll add it once is this is clarified (it can be as part of this PR or a different one).

Closes #5

@Ana06 Ana06 added the enhancement New feature or request label Jun 30, 2020
@Ana06 Ana06 requested a review from mike-hunhoff June 30, 2020 12:52
@Ana06
Copy link
Member Author

Ana06 commented Jun 30, 2020

I am thinking that it would be a good idea to use a different color for all the addresses. This would also help making clearer which the comment is.

For example:
Screen Shot 2020-06-30 at 15 02 46

instead of:
Screen Shot 2020-06-30 at 14 52 04

@williballenthin @mr-tz @mike-hunhoff what do you think? 🤔

Copy link
Collaborator

@mr-tz mr-tz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice, can you please add tests showing all features support descriptions now as expected

README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
capa/features/__init__.py Outdated Show resolved Hide resolved
capa/render/__init__.py Show resolved Hide resolved
capa/rules.py Show resolved Hide resolved
capa/rules.py Outdated Show resolved Hide resolved
capa/rules.py Outdated Show resolved Hide resolved
@Ana06
Copy link
Member Author

Ana06 commented Jun 30, 2020

@mr-tz

nice, can you please add tests showing all features support descriptions now as expected

That's a great idea! 👍 However I am not sure how to test it. I could write a tests in test_rules.py checking that the evaluation of the rule still works if there are descriptions, but that is not really testing that the descriptions work. I think we do not have tests to test the json output, do we? Because that would be a good place to test that the description is included. 🤔 Any idea how/where to write tests for this? 🙏

@Ana06 Ana06 force-pushed the ana-description branch 2 times, most recently from 7f08d3e to 543e424 Compare July 1, 2020 13:05
@Ana06
Copy link
Member Author

Ana06 commented Jul 1, 2020

With this changes the json for characteristic looks like:
{"feature": {"characteristic": "indirect call(True)", "type": "characteristic"}, "type": "feature"}
Is this ok? It makes sense to me. It is a string in all other features, so I think having an array only for this case, just complicate using capa.

And the output with the -vv option:
characteristic(indirect call(True))

I would however change it to
characteristic: indirect call(True)

for consistency with other features. For example:
api: kernel32.TerminateProcess

README.md Outdated Show resolved Hide resolved
README.md Show resolved Hide resolved
capa/rules.py Show resolved Hide resolved
Copy link
Collaborator

@mr-tz mr-tz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

muchas gracias!

@mr-tz
Copy link
Collaborator

mr-tz commented Jul 1, 2020

@mr-tz

nice, can you please add tests showing all features support descriptions now as expected

That's a great idea! 👍 However I am not sure how to test it. I could write a tests in test_rules.py checking that the evaluation of the rule still works if there are descriptions, but that is not really testing that the descriptions work. I think we do not have tests to test the json output, do we? Because that would be a good place to test that the description is included. 🤔 Any idea how/where to write tests for this? 🙏

For now testing if the parsing for all features works should be good. This could be done in test_rules.py

@mr-tz
Copy link
Collaborator

mr-tz commented Jul 1, 2020

With this changes the json for characteristic looks like:
{"feature": {"characteristic": "indirect call(True)", "type": "characteristic"}, "type": "feature"}
Is this ok? It makes sense to me. It is a string in all other features, so I think having an array only for this case, just complicate using capa.

And the output with the -vv option:
characteristic(indirect call(True))

I would however change it to
characteristic: indirect call(True)

for consistency with other features. For example:
api: kernel32.TerminateProcess

Yes, I like the consistent format for the JSON. Didn't we try to remove the True part before when displaying? May be related to #39 (comment)?

@Ana06
Copy link
Member Author

Ana06 commented Jul 1, 2020

@mr-tz

Didn't we try to remove the True part before when displaying? May be related to #39 (comment)?

It was actually removed from the -vv option, but I understood that @williballenthin wanted back from this comment:
#39 (comment)

But I think I got it wrong. Maybe he meant the json... but shouldn't we remove it from the json if it is only used internally?

@williballenthin
Copy link
Collaborator

for -vv format, i think the best case is characteristic(indirect call) because it mirrors what would be written in the rule (actually its characteristic(indirect call): True but i think we decided True is currently redundant).

i think this one is ok: characteristic: indirect call and i do like how it mirrors other features (sidebar: should we port this syntax to rules? it relies on the assumption that characteristics are always True, which has been consistent recently).

in the json, i'd like to see the True separated from the indirect call. this is so that a renderer can decide how to format the value, without parsing the string and looking for ( characters. as discussed above, we can maybe even remove the True all together, though keeping it around is a little more explicit.

does this make sense? is it reasonable?

Copy link
Collaborator

@mr-tz mr-tz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

characteristic changes almost identical to #63! see comments, not sure how to best deal with this race-condition 😄

capa/features/__init__.py Outdated Show resolved Hide resolved
capa/rules.py Outdated Show resolved Hide resolved
Enable associate context for all features. This was called symbol before
and only enabled for `number`, `offset` and `bytes`.

This is not enabled for strings with regular expressions, as they are
not a feature.
@Ana06
Copy link
Member Author

Ana06 commented Jul 2, 2020

I finished the deletion of value for Characteristic. I took the freeze_serialize function from @williballenthin's PR and added @williballenthin as co-author.

I have also added support for inline descriptions in count and tests

@Ana06
Copy link
Member Author

Ana06 commented Jul 2, 2020

There are still a few things missing related to this PR, but I would prefer to merge this first and open issues/PRs for them afterwards:

  • Supporting descriptions for RegExp. This is not done yet as they are not a feature. It is needed to either make RegExp a feature or implement this for RegExp as well.

  • Use a different color for all the addresses. This would also help making clearer which the comment is.
    Screen Shot 2020-06-30 at 15 02 46

  • rather than using an inline string ' = ' that is prone to typo and cannot be used with "find references", use a constant like DESCRIPTION_SEPARATOR = "' = "' and use this throughout the code. (from @williballenthin's comment)

This PR is modifying/refactoring core parts of code and the merge of almost any other PR causes merge conflicts.

README.md Show resolved Hide resolved
Copy link
Collaborator

@mr-tz mr-tz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see my comments, otherwise looks good to me

Ana06 and others added 4 commits July 2, 2020 16:50
As the `__str__` method is not used anymore in the output, the
description implementation needs to be adapted.
Get rid of `True` in characteristic (rules, output and json) as it is
implicit. This way, the same syntax is used for characteristic as for
the rest of the features.

Co-authored-by: William Ballenthin <william.ballenthin@fireeye.com>
```
count(number(2 = AF_INET/SOCK_DGRAM)): 2
```
Test if the parsing of feature succeeds with every time of description.
@Ana06 Ana06 merged commit f6b54be into master Jul 2, 2020
@Ana06 Ana06 deleted the ana-description branch July 2, 2020 15:07
@williballenthin
Copy link
Collaborator

🪂

@Ana06
Copy link
Member Author

Ana06 commented Jul 2, 2020

💃

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Associate context with a string
3 participants