add capafmt utility for consistent formatting of rules #8

Closed
williballenthin opened this issue Jun 22, 2020 · 6 comments · Fixed by #22
@williballenthin
Collaborator

it would be nice to format rules with a consistent style.

this includes:

  • whitespacing, especially with lists
  • order meta before features

by default, python yaml emits keys alphabetically. as an example:

rule:
  meta:
    att&ck:
    - Defense Evasion::Obfuscated Files or Information T1027.002
    author: william.ballenthin@fireeye.com
    examples:
    - CD2CBA9E6313E8DF2C1273593E649682
    - Practical Malware Analysis Lab 01-02.exe_:0x0401000
    mbc:
    - Anti-Static Analysis::Software Packing
    name: packed with UPX
    namespace: anti-analysis/packer/upx
    scope: file
  features:
  - or:
    - section: UPX0
    - section: UPX1

this would look nicer:

rule:
  meta:
    name: packed with UPX
    namespace: anti-analysis/packer/upx
    author: william.ballenthin@fireeye.com
    att&ck:
    - Defense Evasion::Obfuscated Files or Information T1027.002
    mbc:
    - Anti-Static Analysis::Software Packing
    examples:
    - CD2CBA9E6313E8DF2C1273593E649682
    - Practical Malware Analysis Lab 01-02.exe_:0x0401000
    scope: file
  features:
  - or:
    - section: UPX0
    - section: UPX1
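
A minimal sketch of how the ordering above might be produced with PyYAML, assuming a rule is parsed into plain dicts. `META_ORDER`, `order_rule`, and `format_rule` are hypothetical names for illustration, not part of capa; the key order is simply copied from the example above.

```python
import yaml  # PyYAML

# Hypothetical preferred order for meta keys, copied from the example above.
META_ORDER = ["name", "namespace", "author", "att&ck", "mbc", "examples", "scope"]


def order_rule(doc):
    """Return a copy of a parsed rule with meta before features and meta keys ordered."""
    rule = doc["rule"]
    meta = rule["meta"]
    # Known keys first, in the preferred order; unknown keys follow alphabetically.
    known = [k for k in META_ORDER if k in meta]
    extra = sorted(k for k in meta if k not in META_ORDER)
    ordered_meta = {k: meta[k] for k in known + extra}
    return {"rule": {"meta": ordered_meta, "features": rule["features"]}}


def format_rule(text):
    """Parse rule text and re-emit it with the ordering applied."""
    doc = yaml.safe_load(text)
    # sort_keys=False (PyYAML >= 5.1) keeps dict insertion order instead of sorting keys.
    return yaml.safe_dump(order_rule(doc), sort_keys=False, default_flow_style=False)
```

Note that this goes through a plain load and dump, so any comments in the rule text are lost along the way.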
@williballenthin williballenthin added the enhancement New feature or request label Jun 22, 2020
@williballenthin williballenthin self-assigned this Jun 22, 2020
@williballenthin
Collaborator Author

williballenthin commented Jun 22, 2020

concern: a linter that reformats rules and may re-order lines probably doesn't handle block comments very well, e.g.:

or:
    # only on win10
    - string: "This program cannot be run in WinXP mode."
    # only on winxp
    - string: "This program cannot be run in Win10 mode."

we should try to avoid using implicit ordering like this in our rules. if we need to group comments, maybe we should do:

or:
    - or:
        # only on win10
        - string: "This program cannot be run in WinXP mode."
    - or:
        # only on winxp
        - string: "This program cannot be run in Win10 mode."

although in this case, the inner or blocks don't really mean "or". so maybe we can introduce another keyword, like block or commented, that contains a single statement and can be used for grouping comments?


similar problem for sequences of comments, like:

or:
    # only on win10
    # but not after 20H1
    - string: "This program cannot be run in WinXP mode."

i wonder if comments even get extracted into the AST...


YAML doesn't support block comments

@mr-tz
Collaborator

mr-tz commented Jun 23, 2020

Great point! I would even prefer additional whitespace in subpoints:

rule:
  meta:
    name: packed with UPX
    namespace: anti-analysis/packer/upx
    author: william.ballenthin@fireeye.com
    att&ck:
      - Defense Evasion::Obfuscated Files or Information T1027.002
    mbc:
      - Anti-Static Analysis::Software Packing
    examples:
      - CD2CBA9E6313E8DF2C1273593E649682
      - Practical Malware Analysis Lab 01-02.exe_:0x0401000
    scope: file
  features:
    - or:
      - section: UPX0
      - section: UPX1

I'd vote against using or to group comments. So maybe this will need to be the responsibility of the user, if they really want it.
Otherwise, the description syntax (string: "'This program cannot be run in WinXP mode.' = only on win10, but not after 20H1") or a line-comment should be used.

@williballenthin
Collaborator Author

i agree. let's punt on supporting comments like this unless they become critical for some reason. maybe it won't actually be a problem.

@williballenthin
Collaborator Author

williballenthin commented Jun 23, 2020

i also like the additional indentation in lists. need to do some research into how to tweak the pyyaml serialization.

yaml/pyyaml#234
https://stackoverflow.com/a/39681672/87207
https://github.com/adrienverge/yamllint
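
For reference, a sketch of the commonly used workaround for indenting lists with PyYAML: subclass the dumper and override `increase_indent`. `IndentedListDumper` and `dump_indented` are hypothetical names for illustration; this is one possible approach, not necessarily what capafmt ends up doing.

```python
import yaml  # PyYAML


class IndentedListDumper(yaml.SafeDumper):
    """Hypothetical helper: indent block sequences under their parent key."""

    def increase_indent(self, flow=False, indentless=False):
        # PyYAML passes indentless=True for sequences nested in mappings;
        # forcing it to False makes the "- " items indent one level deeper.
        return super().increase_indent(flow, False)


def dump_indented(data):
    """Dump data with indented lists and without alphabetical key sorting."""
    return yaml.dump(
        data,
        Dumper=IndentedListDumper,
        sort_keys=False,
        default_flow_style=False,
    )
```

With this dumper, list items under a key such as `examples:` are emitted two spaces deeper than the key, instead of at the same column as PyYAML does by default.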

@williballenthin
Collaborator Author

williballenthin commented Jun 24, 2020

pyyaml completely drops inline comments during deserialization. we currently have 175 inline comments in our rules. see mandiant/capa-rules#1
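
A quick way to see this, assuming PyYAML is installed. The snippet and the `roundtrip_pyyaml` name are made up for illustration: a load followed by a dump keeps the data but none of the comments.

```python
import yaml  # PyYAML

# Illustrative rule fragment with a full-line comment and an inline comment.
RULE_SNIPPET = """\
- or:
  # only on win10
  - string: "This program cannot be run in WinXP mode."  # inline note
"""


def roundtrip_pyyaml(text):
    """Load and re-dump YAML text with PyYAML; comments never reach the data model."""
    return yaml.safe_dump(yaml.safe_load(text), sort_keys=False)


if __name__ == "__main__":
    print(roundtrip_pyyaml(RULE_SNIPPET))
```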

we could use ruamel.yaml instead which tries to maintain comments. is this a losing battle?

@mr-tz
Collaborator

mr-tz commented Jun 24, 2020

I think it's worthwhile keeping these comments to document and further enrich rules.

I haven't looked into what this entails, but let's try to strike a good balance between implementation workload and provided benefit.
