chore(asm): add ssrf support for urllib.request #8776

Merged

Conversation

christophe-papazian
Contributor

@christophe-papazian christophe-papazian commented Mar 26, 2024

Following #8568, this PR adds SSRF support for exploit prevention.

  1. Add support for SSRF using urllib.request in standard Python API
  2. Improve handling of parameters for exploit prevention (positioned or named)
  3. Add endpoints and new unit tests for SSRF
  4. Add preliminary support for iast in threat hatch tests

This feature is still private and disabled. Corresponding tests were run locally and on the CI before being marked skipped.
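
To illustrate points 1 and 2 above, here is a minimal sketch of reading the target URL from a `urlopen`-style call whether it was passed positionally or by name. The helper names `get_argument` and `extract_url` are hypothetical and are not taken from this PR; the only facts relied on are that `urllib.request.urlopen` takes `url` as its first parameter and also accepts a `Request` object.

```python
from urllib.request import Request


def get_argument(args, kwargs, position, name, default=None):
    """Return an argument whether it was passed positionally or by keyword.

    Hypothetical helper for illustration only.
    """
    if len(args) > position:
        return args[position]
    return kwargs.get(name, default)


def extract_url(args, kwargs):
    """Return the URL string targeted by a urlopen-style call."""
    target = get_argument(args, kwargs, 0, "url")
    # urlopen accepts either a string or a Request object.
    if isinstance(target, Request):
        return target.full_url
    return target
```

A wrapper around the instrumented function could call `extract_url(args, kwargs)` and hand the result to whatever SSRF check is configured, without caring how the caller spelled the argument.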

APPSEC-51853

Checklist

  • Change(s) are motivated and described in the PR description
  • Testing strategy is described if automated tests are not included in the PR
  • Risks are described (performance impact, potential for breakage, maintainability)
  • Change is maintainable (easy to change, telemetry, documentation)
  • Library release note guidelines are followed or label changelog/no-changelog is set
  • Documentation is included (in-code, generated user docs, public corp docs)
  • Backport labels are set (if applicable)
  • If this PR changes the public interface, I've notified @DataDog/apm-tees.
  • If change touches code that signs or publishes builds or packages, or handles credentials of any kind, I've requested a review from @DataDog/security-design-and-guidance.

Reviewer Checklist

  • Title is accurate
  • All changes are related to the pull request's stated goal
  • Description motivates each change
  • Avoids breaking API changes
  • Testing strategy adequately addresses listed risks
  • Change is maintainable (easy to change, telemetry, documentation)
  • Release note makes sense to a user of the library
  • Author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
  • Backport labels are set in a manner that is consistent with the release branch maintenance policy

@christophe-papazian christophe-papazian added changelog/no-changelog A changelog entry is not required for this PR. ASM Application Security Monitoring labels Mar 26, 2024
@datadog-dd-trace-py-rkomorn

datadog-dd-trace-py-rkomorn bot commented Mar 26, 2024

Datadog Report

Branch report: christophe-papazian/ssrf_support_for_exploit_prevention
Commit report: a8e22c5
Test service: dd-trace-py

✅ 0 Failed, 1984 Passed, 109589 Skipped, 18m 17.68s Total duration (1h 33m 3.63s time saved)

@pr-commenter

pr-commenter bot commented Mar 27, 2024

Benchmarks

Benchmark execution time: 2024-04-11 09:18:41

Comparing candidate commit a8e22c5 in PR branch christophe-papazian/ssrf_support_for_exploit_prevention with baseline commit f26cab7 in branch main.

Found 1 performance improvements and 1 performance regressions! Performance is the same for 199 metrics, 9 unstable metrics.

scenario:httppropagationextract-datadog_tracecontext_tracestate_not_propagated_on_trace_id_no_match

  • 🟩 max_rss_usage [-1034.019KB; -758.391KB] or [-4.723%; -3.464%]

scenario:httppropagationinject-ids_only

  • 🟥 max_rss_usage [+677.733KB; +738.255KB] or [+3.205%; +3.491%]

christophe-papazian and others added 11 commits March 27, 2024 15:36
Adds support for submitting coverage data using the CI Visibility manual
API.

The `CICoverageData` class is used as a container. One of the features
it will enable is adding coverage data "a piece at a time" instead of
one-shotting the entire coverage data (which is still possible).

This will be useful for cases where tests don't run in-order within
suites (which is possible in cases like `pytest-xdist`).

Two differences worth mentioning compared to the existing approach to coverage:
- paths are stored using `pathlib.Path` rather than strings
- paths are stored as absolute paths (by calling `.absolute()` on all
given `Path` objects)
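
As a rough sketch of the "a piece at a time" idea and the two path conventions listed above: the class below is a hypothetical stand-in (`CoverageAccumulator`), not the actual `CICoverageData` API, and its method names are assumptions made for illustration.

```python
from collections import defaultdict
from pathlib import Path


class CoverageAccumulator:
    """Hypothetical container; NOT the real CICoverageData class."""

    def __init__(self):
        # Maps an absolute Path to the set of covered line numbers.
        self._lines = defaultdict(set)

    def add(self, path, lines):
        """Merge covered lines for one file; may be called repeatedly."""
        self._lines[Path(path).absolute()].update(lines)

    def covered(self, path):
        """Return the sorted covered lines recorded for a file."""
        return sorted(self._lines.get(Path(path).absolute(), set()))

    def paths(self):
        """Return every file seen so far, as absolute Path objects."""
        return list(self._lines)
```

Because keys are normalized to absolute `Path` objects, data reported out of order (for example from separate `pytest-xdist` workers) for the same file ends up merged under a single key.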

## Checklist

- [x] Change(s) are motivated and described in the PR description
- [x] Testing strategy is described if automated tests are not included
in the PR
- [x] Risks are described (performance impact, potential for breakage,
maintainability)
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [x] [Library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
are followed or label `changelog/no-changelog` is set
- [x] Documentation is included (in-code, generated user docs, [public
corp docs](https://github.com/DataDog/documentation/))
- [x] Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))
- [x] If this PR changes the public interface, I've notified
`@DataDog/apm-tees`.
- [x] If change touches code that signs or publishes builds or packages,
or handles credentials of any kind, I've requested a review from
`@DataDog/security-design-and-guidance`.

## Reviewer Checklist

- [x] Title is accurate
- [x] All changes are related to the pull request's stated goal
- [x] Description motivates each change
- [x] Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes
- [x] Testing strategy adequately addresses listed risks
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [x] Release note makes sense to a user of the library
- [x] Author has acknowledged and discussed the performance implications
of this PR as reported in the benchmarks PR comment
- [x] Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)

---------

Co-authored-by: Federico Mon <federico.mon@datadoghq.com>
Co-authored-by: Teague Bick <teague.bick@datadoghq.com>
Co-authored-by: Emmett Butler <723615+emmettbutler@users.noreply.github.com>
This PR fixes the AWS bedrock integration to default tag an empty string
if an input parameter is not provided (previously we defaulted to
`"None"`), as that led to casting errors on the LLMObs integration side
when we tried to cast the span tag value to an int/float.

Now, we default tag to an empty string, which avoids the casting errors
we had previously, and only add the `max_tokens` parameter to the LLMObs
span event if it is a non-null value.
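
A small sketch of why the `"None"` default was a problem. The function names here are hypothetical and do not come from the integration; the point is only that `str(None)` yields `"None"`, which `float()` rejects, while an empty string is easy to detect and skip.

```python
def tag_value(param):
    """Return the string stored on the span tag (hypothetical helper)."""
    # An empty string replaces the old "None" default.
    return "" if param is None else str(param)


def parse_numeric_tag(value):
    """Return a float for a numeric tag, or None when the tag is empty."""
    if value == "":
        return None
    return float(value)


def old_style_cast_fails():
    """Show that casting the previous default raises ValueError."""
    try:
        float(str(None))
    except ValueError:
        return True
    return False
```

With this shape, a caller would only attach `max_tokens` to the event when `parse_numeric_tag` returns something other than `None`.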

## Checklist

- [x] Change(s) are motivated and described in the PR description
- [x] Testing strategy is described if automated tests are not included
in the PR
- [x] Risks are described (performance impact, potential for breakage,
maintainability)
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [X] [Library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
are followed or label `changelog/no-changelog` is set
- [x] Documentation is included (in-code, generated user docs, [public
corp docs](https://github.com/DataDog/documentation/))
- [X] Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))
- [x] If this PR changes the public interface, I've notified
`@DataDog/apm-tees`.
- [x] If change touches code that signs or publishes builds or packages,
or handles credentials of any kind, I've requested a review from
`@DataDog/security-design-and-guidance`.

## Reviewer Checklist

- [x] Title is accurate
- [x] All changes are related to the pull request's stated goal
- [x] Description motivates each change
- [x] Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes
- [x] Testing strategy adequately addresses listed risks
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [x] Release note makes sense to a user of the library
- [x] Author has acknowledged and discussed the performance implications
of this PR as reported in the benchmarks PR comment
- [x] Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
This PR fixes three things:
- An issue in the LLMObs' OpenAI integration which stores tool calls
(via the Chat Completions endpoint). Chat completions for tool calls
return a list of tool calls, but we had previously assumed only one tool
call would be returned.
- How we construct streamed tool chat completions. We were previously
checking the first chunk in the response to know to join the
`tool/function_call` chunk fields together, but it appears that the
first chunk in a response can actually contain no data at all. We are
now constructing the streamed response chunk-by-chunk.
- Add type checking for request messages arg in the chat completions
endpoint, as OpenAI allows users to pass in OpenAI `ChatMessage` class
types. We were previously only looking for dictionary arguments, but now
we'll correctly extract the message content based on the message type.
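
The chunk-by-chunk construction described in the second point could look roughly like the sketch below. It uses plain dictionaries with an assumed, simplified chunk shape (`tool_calls` entries carrying `index`, `name` and `arguments`); it is not the integration's code, and real OpenAI chunk objects are nested differently.

```python
def merge_tool_call_chunks(chunks):
    """Assemble streamed tool calls without trusting the first chunk.

    Each chunk is a dict in an assumed, simplified shape; chunks with
    no data at all are skipped naturally.
    """
    calls = {}
    for chunk in chunks:
        for delta in chunk.get("tool_calls") or []:
            call = calls.setdefault(delta["index"], {"name": "", "arguments": ""})
            # Names and argument fragments arrive in pieces; concatenate them.
            call["name"] += delta.get("name") or ""
            call["arguments"] += delta.get("arguments") or ""
    # Return a list, since a response may contain several tool calls.
    return [calls[index] for index in sorted(calls)]
```

Keying on `index` keeps fragments of different tool calls apart, which also covers the first point: the result is a list rather than a single call.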

No changelog is required as this only affects private beta customers for
LLMObs.

## Checklist

- [x] Change(s) are motivated and described in the PR description
- [x] Testing strategy is described if automated tests are not included
in the PR
- [x] Risks are described (performance impact, potential for breakage,
maintainability)
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [X] [Library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
are followed or label `changelog/no-changelog` is set
- [x] Documentation is included (in-code, generated user docs, [public
corp docs](https://github.com/DataDog/documentation/))
- [X] Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))
- [x] If this PR changes the public interface, I've notified
`@DataDog/apm-tees`.
- [x] If change touches code that signs or publishes builds or packages,
or handles credentials of any kind, I've requested a review from
`@DataDog/security-design-and-guidance`.

## Reviewer Checklist

- [x] Title is accurate
- [x] All changes are related to the pull request's stated goal
- [x] Description motivates each change
- [x] Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes
- [x] Testing strategy adequately addresses listed risks
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [x] Release note makes sense to a user of the library
- [x] Author has acknowledged and discussed the performance implications
of this PR as reported in the benchmarks PR comment
- [x] Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
## Overview
This PR adds the
[`dramatiq`](https://github.com/Bogdanp/dramatiq/tree/master) library as
a supported integration for `dd-trace-py`. This addresses
#5043.

### In Scope for this PR
- Instrumenting the `dramatiq.actor.Actor.send_with_options()` method.
This is the method called when a function with the `@dramatiq.actor`
decorator is called asynchronously. See example below:

```python
# app.py
import dramatiq
from flask import Flask

app = Flask(__name__)

@dramatiq.actor
def my_func():
    return "response"

@dramatiq.actor
def my_other_func(a: int, b: int) -> int:
    return a + b

@app.route('/')
def index():
    my_func.send()  # this calls send_with_options() under the hood
    my_other_func.send_with_options(args=(1, 1), options={"max_retries": 3})
    return 'hello world'

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=6060)
```

Running the above example Flask app with `ddtrace-run flask run` and
hitting the `/` endpoint would generate a trace in the UI like the
following:

<img width="1052" alt="Screenshot 2024-03-26 at 3 28 21 PM"
src="https://github.com/DataDog/dd-trace-py/assets/153395705/78c6e832-c406-4829-92ac-799821fc2e31">

The detailed span info for `my_other_func.send_with_options(...)` would
look something like this:

```python
{
   actor: {
      name: my_other_func
      options: {"max_retries": 3}
   }
   env: test
   language: python
   span: {
      kind: producer
   }
}
```

#### NOTE
- The span duration covers only the `send_with_options()` call itself;
it does not reflect the execution time of the function that runs
asynchronously.
- When calling a function asynchronously with `send()`, it will also
display on the UI as a `send_with_options()` span. This is because
`send()` calls `send_with_options()` with empty options, so tracing both
of these functions would create two spans for every instance that
`send()` is called, with the exact same span information.
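
The second note can be illustrated with a toy model. `ToyActor` and the `traced` decorator below are hypothetical stand-ins (this is neither `dramatiq` nor the integration's patching code); they only show that when `send()` delegates to `send_with_options()`, wrapping the latter alone yields exactly one span per call.

```python
import functools

# Names of the spans "created" so far, in order.
spans = []


def traced(name):
    """Record a span name each time the wrapped function is called."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            spans.append(name)
            return func(*args, **kwargs)
        return wrapper
    return decorator


class ToyActor:
    """Toy stand-in for an actor; not the dramatiq class."""

    def send(self, *args, **kwargs):
        # Delegates with empty options, so it is not traced separately.
        return self.send_with_options(args=args, kwargs=kwargs)

    @traced("send_with_options")
    def send_with_options(self, *, args=(), kwargs=None, **options):
        return {"args": args, "kwargs": kwargs or {}, "options": options}
```

Decorating `send()` as well would append two entries for every `send()` call, which is the duplication the note describes.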

### Out of Scope for this PR
- All other `dramatiq` methods.
- Supporting the actual duration of the function executed asynchronously
by `send_with_options()`

The above and additional features can be investigated and added to this
initial iteration at a later time.

## Checklist

- [x] Change(s) are motivated and described in the PR description
- [x] Testing strategy is described if automated tests are not included
in the PR
- [x] Risks are described (performance impact, potential for breakage,
maintainability)
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [x] [Library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
are followed or label `changelog/no-changelog` is set
- [x] Documentation is included (in-code, generated user docs, [public
corp docs](https://github.com/DataDog/documentation/))
- [x] Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))
- [x] If this PR changes the public interface, I've notified
`@DataDog/apm-tees`.
- [x] If change touches code that signs or publishes builds or packages,
or handles credentials of any kind, I've requested a review from
`@DataDog/security-design-and-guidance`.

## Reviewer Checklist

- [ ] Title is accurate
- [ ] All changes are related to the pull request's stated goal
- [ ] Description motivates each change
- [ ] Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes
- [ ] Testing strategy adequately addresses listed risks
- [ ] Change is maintainable (easy to change, telemetry, documentation)
- [ ] Release note makes sense to a user of the library
- [ ] Author has acknowledged and discussed the performance implications
of this PR as reported in the benchmarks PR comment
- [ ] Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
CI: Cuts system tests running time in half (~40 min to ~20).

Main drawback of this approach:
It doubles actual usage time. We can group the scenarios to reduce the
overhead, since most of them take ~1 min to run, and now need an extra
~1 min to fetch the docker images.

## Checklist

- [x] Change(s) are motivated and described in the PR description
- [x] Testing strategy is described if automated tests are not included
in the PR
- [x] Risks are described (performance impact, potential for breakage,
maintainability)
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [x] [Library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
are followed or label `changelog/no-changelog` is set
- [x] Documentation is included (in-code, generated user docs, [public
corp docs](https://github.com/DataDog/documentation/))
- [x] Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))
- [x] If this PR changes the public interface, I've notified
`@DataDog/apm-tees`.
- [x] If change touches code that signs or publishes builds or packages,
or handles credentials of any kind, I've requested a review from
`@DataDog/security-design-and-guidance`.

## Reviewer Checklist

- [x] Title is accurate
- [x] All changes are related to the pull request's stated goal
- [x] Description motivates each change
- [x] Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes
- [x] Testing strategy adequately addresses listed risks
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [x] Release note makes sense to a user of the library
- [x] Author has acknowledged and discussed the performance implications
of this PR as reported in the benchmarks PR comment
- [x] Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
This reverts commit ef4d804 from #8791,
which broke CI on the main branch.

## Checklist

- [x] Change(s) are motivated and described in the PR description
- [x] Testing strategy is described if automated tests are not included
in the PR
- [x] Risks are described (performance impact, potential for breakage,
maintainability)
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [x] [Library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
are followed or label `changelog/no-changelog` is set
- [x] Documentation is included (in-code, generated user docs, [public
corp docs](https://github.com/DataDog/documentation/))
- [x] Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))
- [x] If this PR changes the public interface, I've notified
`@DataDog/apm-tees`.
- [x] If change touches code that signs or publishes builds or packages,
or handles credentials of any kind, I've requested a review from
`@DataDog/security-design-and-guidance`.
…#8798)

This PR adds support to the botocore integration's bedrock service to
correctly tag input/output messages from Anthropic calls.

Previously, bedrock's models only supported raw text prompts and returned
text outputs. However, Anthropic's newest Claude 3 supports a chat
message API, which means we need to support that as well.

This change also switches to using `tracer.trace()` instead of
`tracer.start_span(..., activate=False)` for bedrock spans, because the
latter meant that bedrock spans would always be root spans (messing up
parenting for traces containing non-root bedrock spans).
Additionally by getting rid of the `activate=False` argument, this means
that bedrock spans will now continue to be the active span until the
stream/body is completely consumed. Previously we allowed bedrock spans
to not be active, but if other downstream operations happen in the
bedrock span then they would not correctly be child spans of the bedrock
spans.

## Checklist

- [x] Change(s) are motivated and described in the PR description
- [x] Testing strategy is described if automated tests are not included
in the PR
- [x] Risks are described (performance impact, potential for breakage,
maintainability)
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [x] [Library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
are followed or label `changelog/no-changelog` is set
- [x] Documentation is included (in-code, generated user docs, [public
corp docs](https://github.com/DataDog/documentation/))
- [x] Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))
- [x] If this PR changes the public interface, I've notified
`@DataDog/apm-tees`.
- [x] If change touches code that signs or publishes builds or packages,
or handles credentials of any kind, I've requested a review from
`@DataDog/security-design-and-guidance`.

## Reviewer Checklist

- [x] Title is accurate
- [x] All changes are related to the pull request's stated goal
- [x] Description motivates each change
- [x] Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes
- [x] Testing strategy adequately addresses listed risks
- [X] Change is maintainable (easy to change, telemetry, documentation)
- [x] Release note makes sense to a user of the library
- [x] Author has acknowledged and discussed the performance implications
of this PR as reported in the benchmarks PR comment
- [x] Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
christophe-papazian and others added 10 commits March 29, 2024 14:05
@christophe-papazian christophe-papazian marked this pull request as ready for review April 11, 2024 08:03
@christophe-papazian christophe-papazian requested a review from a team as a code owner April 11, 2024 08:03
christophe-papazian and others added 2 commits April 11, 2024 10:21
Co-authored-by: Alberto Vara <alberto.vara@datadoghq.com>
@christophe-papazian christophe-papazian enabled auto-merge (squash) April 11, 2024 09:21
@christophe-papazian christophe-papazian merged commit 9a40868 into main Apr 11, 2024
82 of 83 checks passed
@christophe-papazian christophe-papazian deleted the christophe-papazian/ssrf_support_for_exploit_prevention branch April 11, 2024 09:21
christophe-papazian added a commit that referenced this pull request Apr 15, 2024
Following #8776, this PR adds
support for the third-party `requests` library for SSRF monitoring in
exploit prevention.

This feature is still private and disabled. Corresponding tests were run
locally and on the CI before being marked skipped.

Also:

- add "request" to unpatch in contrib/requests (this was previously
missing, CI/test only feature)

APPSEC-51853

## Checklist

- [x] Change(s) are motivated and described in the PR description
- [x] Testing strategy is described if automated tests are not included
in the PR
- [x] Risks are described (performance impact, potential for breakage,
maintainability)
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [x] [Library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
are followed or label `changelog/no-changelog` is set
- [x] Documentation is included (in-code, generated user docs, [public
corp docs](https://github.com/DataDog/documentation/))
- [x] Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))
- [x] If this PR changes the public interface, I've notified
`@DataDog/apm-tees`.

## Reviewer Checklist

- [ ] Title is accurate
- [ ] All changes are related to the pull request's stated goal
- [ ] Description motivates each change
- [ ] Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes
- [ ] Testing strategy adequately addresses listed risks
- [ ] Change is maintainable (easy to change, telemetry, documentation)
- [ ] Release note makes sense to a user of the library
- [ ] Author has acknowledged and discussed the performance implications
of this PR as reported in the benchmarks PR comment
- [ ] Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
Labels
ASM Application Security Monitoring changelog/no-changelog A changelog entry is not required for this PR.

8 participants