Extend Chirps to Scan LLM APIs for Security Issues #148

Open
4 tasks
rseveymant opened this issue Aug 10, 2023 · 1 comment
Labels
backlog An issue that has been accepted and will be added to a future release epic help wanted Extra attention is needed

Comments

@rseveymant (Member)

Description:

Chirps currently provides functionality to scan next-generation AI systems and check for issues with the vector DB. We need to extend this capability to scan LLM (Large Language Model) APIs for specific security-related issues such as prompt injection, DDoS, and other potential vulnerabilities.

Requirements:

  1. Scan LLM APIs for Prompt Injection: Identify and report potential vulnerabilities that could lead to unauthorized or malicious prompt injections.
  2. Detect DDoS Vulnerabilities: Determine weak points in the API that could be exploited for distributed denial-of-service (DDoS) attacks. This capability must never be enabled by default, and safeguards must be in place to avoid bringing target systems down.
  3. Integration with Existing System: The new feature should integrate seamlessly with the existing scanning mechanisms and user interface in Chirps.
  4. Authentication and Permissions: Ensure secure connections to LLM APIs by implementing appropriate authentication methods.
  5. Customizable Scan Options: Allow users to customize the scan based on their specific requirements and preferences.
  6. Output and Reporting: Integrate the results into Chirps' existing reporting or logging system.

Tasks:

  • Design and implement a new scanner module for LLM APIs.
  • Develop tests to cover all new functionality.
  • Integrate the new feature with the existing Chirps system.
  • Write user and developer documentation.

Acceptance Criteria:

  • The system must be able to successfully scan LLM APIs for the defined vulnerabilities without impacting existing functionality.
  • All new code must be covered by unit and integration tests.
  • The feature must be documented to facilitate both use and future development.
@rseveymant added the help wanted and backlog labels on Aug 10, 2023
@zimventures (Contributor) commented Aug 14, 2023

From an architecture perspective, here is what needs to occur for us to support this new functionality - and beyond.

The current workflow is that a scan walks through a list of policies, executing each rule within the policy. Each rule is responsible for performing a query against an asset, executing a regular expression against the query result, and logging a finding if a match occurs.

Conceptually, the LLM API scanning functionality will follow the same steps:

  • For each API endpoint (asset)
    • For each policy
      • Perform some kind of rule evaluation (whatever that looks like)
      • If the rule evaluates to true - log a finding

Changed Entities

In order to support new asset, rule, result, and finding types, each of those existing models will need to be abstracted out into a base type. For example: we will now have VectorSearchAsset and APIAsset, both of which inherit from a BaseAsset type.

Asset Application Changes

The BaseAsset model currently has some fields that are specific to vector search functionality, notably the REQUIRES_EMBEDDINGS class variable and search() abstract method. Both of these should be moved out into a new VectorSearchAsset class, which will in turn inherit from BaseAsset. The existing MantiumAsset, PineconeAsset and RedisAsset classes will be updated to inherit from the new VectorSearchAsset class.
A new APIAsset class will be created for interacting with LLMs via an API.
If the new APIAssets are correctly added to forms.py, then the call to asset_from_html_name() should ensure that no changes need to be made to the views for this application.

After digging in further, this refactor is not needed. This refactor really would only be for removing the REQUIRES_EMBEDDING functionality, which can remain where it is.

Policy Application Changes

The existing Policy and PolicyVersion models will not be modified. What's changing here are the Rules that belong to a policy. A new BaseRule class will contain the name, severity and policy fields.

A new RegexRule model will be created to move regular expression specific values out of the existing Rule model. It will contain query_string, query_embedding, and regex_test fields.

Additional rule types will be crafted to support the new LLM Scanning and DDOS Vulnerability functionality.
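The proposed split can be sketched in plain Python. In Chirps these would be Django models; the `evaluate()` method and its signature are hypothetical additions for illustration:

```python
# Plain-Python sketch of the proposed rule hierarchy; in Chirps these would
# be Django models, and evaluate() is a hypothetical method for illustration.
import re
from abc import ABC, abstractmethod

class BaseRule(ABC):
    """Common fields shared by every rule type."""
    def __init__(self, name, severity, policy):
        self.name = name
        self.severity = severity
        self.policy = policy

    @abstractmethod
    def evaluate(self, response_text):
        """Return True when the rule matches (i.e. a finding should be logged)."""

class RegexRule(BaseRule):
    """Regex-specific fields pulled out of the existing Rule model."""
    def __init__(self, name, severity, policy, query_string, query_embedding, regex_test):
        super().__init__(name, severity, policy)
        self.query_string = query_string
        self.query_embedding = query_embedding
        self.regex_test = regex_test

    def evaluate(self, response_text):
        # A match against the query result means a finding should be logged
        return re.search(self.regex_test, response_text) is not None
```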

Configurable Severity

While we're doing the refactor, it would be useful to update the severity field to be its own model (instead of just an IntegerField).
The new Severity model would contain:

  • name (str)
  • value (int)
  • color (hexadecimal color value)

Out of the box, the system will define some basic severity levels (low, medium, high, critical, etc.).
A view will exist for severities to be edited, created, and archived (they cannot be deleted).
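As a dataclass sketch (in Chirps this would be a Django model; the default names, values, and colors below are illustrative, as is the `archived` flag standing in for archive-instead-of-delete behavior):

```python
# Sketch of the proposed Severity model; values and colors are illustrative.
from dataclasses import dataclass

@dataclass
class Severity:
    name: str                # e.g. "Critical"
    value: int               # numeric level, used for ordering/filtering
    color: str               # hexadecimal color value, e.g. "#ff0000"
    archived: bool = False   # severities are archived, never deleted

# Out-of-the-box severity levels
DEFAULT_SEVERITIES = [
    Severity("Low", 1, "#2ecc71"),
    Severity("Medium", 2, "#f1c40f"),
    Severity("High", 3, "#e67e22"),
    Severity("Critical", 4, "#e74c3c"),
]
```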

Scan Application Changes

Since a scan can now execute on multiple asset types as well as varying policy rule types, the result and any findings must also be abstracted to support those.

The Result model will be refactored into a BaseResult class containing only the scan_asset and rule (now a BaseRule) members. The text field will be pulled out into a RegexResult class.

The Finding model will be refactored into a BaseFinding model, containing a pointer to the result (BaseResult) that it belongs to. A new RegexFinding will contain the rest of the existing fields from the Finding model (source_id, offset, length) and the methods text(), surrounding_text(), and with_highlight().
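The result/finding split can be sketched as follows. Field names follow the comment above, but the method bodies are illustrative guesses at the behavior, not the actual Chirps implementations:

```python
# Plain-Python sketch of the proposed result/finding refactor; in Chirps these
# would be Django models, and the method bodies here are illustrative.
from dataclasses import dataclass

@dataclass
class BaseResult:
    scan_asset: object
    rule: object            # a BaseRule

@dataclass
class RegexResult(BaseResult):
    text: str               # the raw query response the regex ran against

@dataclass
class BaseFinding:
    result: BaseResult      # pointer to the result this finding belongs to

@dataclass
class RegexFinding(BaseFinding):
    source_id: str
    offset: int
    length: int

    def text(self):
        """The matched substring within the result text."""
        return self.result.text[self.offset:self.offset + self.length]

    def surrounding_text(self, context=20):
        """The match plus a window of surrounding characters."""
        start = max(0, self.offset - context)
        return self.result.text[start:self.offset + self.length + context]

    def with_highlight(self):
        """The full result text with the match wrapped in brackets."""
        t = self.result.text
        return t[:self.offset] + "[" + self.text() + "]" + t[self.offset + self.length:]
```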

New Job Types

We'd tossed around the idea of adding new Celery tasks to handle the new rule types, but that will not accommodate the case where an asset is scanned with multiple rule types. Instead, we will simply refactor the rule-specific logic (regex, DDoS, conversational pen-test, etc.) out into discrete modules that are used by the task.
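One way to picture the "discrete modules" approach is a registry that the single scan task dispatches through; every name below (the executor functions, the registry, `execute_rule`) is a hypothetical illustration, not existing Chirps code:

```python
# Hypothetical sketch: one executor module/function per rule type, dispatched
# by the existing scan task instead of adding a Celery task per rule type.

def run_regex_rule(rule, asset):
    """Placeholder for the regex evaluation module."""
    return f"regex scan of {asset}"

def run_ddos_rule(rule, asset):
    """Placeholder for the DDoS probing module (never enabled by default)."""
    return f"ddos scan of {asset}"

RULE_EXECUTORS = {
    "regex": run_regex_rule,
    "ddos": run_ddos_rule,
}

def execute_rule(rule_type, rule, asset):
    """What the single scan task would call, regardless of rule type."""
    executor = RULE_EXECUTORS.get(rule_type)
    if executor is None:
        raise ValueError(f"no executor registered for rule type {rule_type!r}")
    return executor(rule, asset)
```

This keeps the task itself rule-agnostic, so one asset can be scanned with any mix of rule types in a single run.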

New Functionality!

Finally, all of this refactoring leads us to a system whereby new asset and rule types can be added, without any impact on existing functionality.
