pre-filter strings, bytes based on what's found in the file #2126

@williballenthin

To avoid searching for strings/bytes that won't ever be found at a particular scope, we could first check that each string/bytes is present somewhere in the file.
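A minimal sketch of that up-front check, assuming rule terms have already been collected as raw byte sequences. The helper names `find_present_terms` and `string_is_present` are hypothetical and not part of capa. The encodings tried for string terms (UTF-8 and UTF-16LE) are also an assumption, and the check only makes sense for features that are extracted from contiguous bytes of the file.

```python
from typing import Iterable, Set


def find_present_terms(buf: bytes, terms: Iterable[bytes]) -> Set[bytes]:
    """Return the subset of `terms` that occur anywhere in `buf`."""
    # `in` on bytes objects performs a substring search.
    return {term for term in terms if term in buf}


def string_is_present(buf: bytes, s: str) -> bool:
    """Check a string term under two encodings (assumed, not exhaustive)."""
    return s.encode("utf-8") in buf or s.encode("utf-16-le") in buf


if __name__ == "__main__":
    data = b"\x00\x01User-Agent: Mozilla/5.0\x00" + "wininet".encode("utf-16-le")
    print(find_present_terms(data, [b"Mozilla/5.0", b"curl/7.0"]))
    print(string_is_present(data, "wininet"))
```

Any term missing from the returned set can be treated as "never matches" at every scope for this file.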

If it's not, then we can partially evaluate some rule logic (like `and` statements) to see if further logic can be pruned and/or rules skipped.

For example, we have HTTP User-Agent rules that contain tons of strings that match under a single `or`. If none are present in the file, we can skip the whole rule.
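The pruning idea can be sketched on a toy rule tree. The `Term`, `Other`, `And`, and `Or` classes below are hypothetical stand-ins, not capa's actual rule engine types, and the sketch deliberately ignores `not`, `optional`, and counted statements, which would need different handling (an absent term under `not` is trivially satisfied rather than impossible).

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple, Union


@dataclass(frozen=True)
class Term:
    value: bytes  # a string/bytes feature that must appear in the file


@dataclass(frozen=True)
class Other:
    name: str  # any feature the pre-scan cannot decide (API, mnemonic, ...)


@dataclass(frozen=True)
class And:
    children: Tuple["Node", ...]


@dataclass(frozen=True)
class Or:
    children: Tuple["Node", ...]


Node = Union[Term, Other, And, Or]


def prune(node: Node, present: Set[bytes]) -> Optional[Node]:
    """Return a simplified node, or None if it can never match this file."""
    if isinstance(node, Term):
        return node if node.value in present else None
    if isinstance(node, Other):
        return node  # unknown until real matching runs, so keep it
    kids = [prune(child, present) for child in node.children]
    if isinstance(node, And):
        # One impossible child makes the whole `and` impossible.
        if any(kid is None for kid in kids):
            return None
        return And(tuple(kids))
    # `or`: drop impossible branches; impossible only if none remain.
    alive = tuple(kid for kid in kids if kid is not None)
    return Or(alive) if alive else None
```

With a User-Agent style rule such as `And((Other("api:InternetOpenA"), Or((Term(b"Mozilla/4.0"), Term(b"curl/7.0")))))`, calling `prune(rule, set())` returns `None`, so the whole rule can be skipped for that file.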

We'd want to ensure that the up-front scan to find the file matches doesn't take much time, and that it doesn't outweigh any performance improvements. Remember we may have hundreds or thousands of terms to look for. We can also use evaluation counts to show that less logic needs to be matched when some branches are pruned.
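One way to check the cost of the up-front scan is simply to time it against a realistic number of terms. The sketch below uses a naive per-term substring search and a hypothetical helper `timed_prescan`; the buffer and term list are synthetic. If the naive scan turns out to be too slow for thousands of terms, a multi-pattern matcher such as Aho-Corasick would be one alternative to evaluate.

```python
import time
from typing import Iterable, Set, Tuple


def timed_prescan(buf: bytes, terms: Iterable[bytes]) -> Tuple[Set[bytes], float]:
    """Return (terms present in buf, elapsed seconds for the scan)."""
    start = time.perf_counter()
    present = {term for term in terms if term in buf}
    return present, time.perf_counter() - start


if __name__ == "__main__":
    # Synthetic inputs: a 1 MiB buffer and 2000 terms, one of which is present.
    buf = b"\x00" * (1 << 20) + b"needle-7" + b"\x00" * 16
    terms = [b"needle-%d" % i for i in range(2000)]
    present, elapsed = timed_prescan(buf, terms)
    print(f"{len(present)} of {len(terms)} terms present, scan took {elapsed:.4f}s")
```

The elapsed time can then be compared with the matching time saved on the same sample.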

Labels: performance (related to capa's performance)
