test: ci system tests - test failure #8815

gnufede · 2024-04-01T13:48:36Z

Checklist

Change(s) are motivated and described in the PR description
Testing strategy is described if automated tests are not included in the PR
Risks are described (performance impact, potential for breakage, maintainability)
Change is maintainable (easy to change, telemetry, documentation)
Library release note guidelines are followed or label changelog/no-changelog is set
Documentation is included (in-code, generated user docs, public corp docs)
Backport labels are set (if applicable)
If this PR changes the public interface, I've notified @DataDog/apm-tees.
If change touches code that signs or publishes builds or packages, or handles credentials of any kind, I've requested a review from @DataDog/security-design-and-guidance.

Reviewer Checklist

Title is accurate
All changes are related to the pull request's stated goal
Description motivates each change
Avoids breaking API changes
Testing strategy adequately addresses listed risks
Change is maintainable (easy to change, telemetry, documentation)
Release note makes sense to a user of the library
Author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
Backport labels are set in a manner that is consistent with the release branch maintenance policy

…_prevention

Adds support for submitting coverage data using the CI Visibility manual API. The `CICoverageData` class is used as a container. One of the features it will enable is adding coverage data "a piece at a time" instead of one-shotting the entire coverage data (which is still possible). This will be useful for cases where tests don't run in order within suites (which is possible in cases like `pytest-xdist`).

Two worthwhile mentions compared to the existing approach to coverage:

- paths are stored using `pathlib.Path` rather than strings
- paths are stored as absolute paths (by calling `.absolute()` on all given `Path` objects)

## Checklist

- [x] Change(s) are motivated and described in the PR description
- [x] Testing strategy is described if automated tests are not included in the PR
- [x] Risks are described (performance impact, potential for breakage, maintainability)
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [x] [Library release note guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html) are followed or label `changelog/no-changelog` is set
- [x] Documentation is included (in-code, generated user docs, [public corp docs](https://github.com/DataDog/documentation/))
- [x] Backport labels are set (if [applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))
- [x] If this PR changes the public interface, I've notified `@DataDog/apm-tees`.
- [x] If change touches code that signs or publishes builds or packages, or handles credentials of any kind, I've requested a review from `@DataDog/security-design-and-guidance`.

## Reviewer Checklist

- [x] Title is accurate
- [x] All changes are related to the pull request's stated goal
- [x] Description motivates each change
- [x] Avoids breaking [API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces) changes
- [x] Testing strategy adequately addresses listed risks
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [x] Release note makes sense to a user of the library
- [x] Author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
- [x] Backport labels are set in a manner that is consistent with the [release branch maintenance policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)

---------

Co-authored-by: Federico Mon <federico.mon@datadoghq.com>
Co-authored-by: Teague Bick <teague.bick@datadoghq.com>
Co-authored-by: Emmett Butler <723615+emmettbutler@users.noreply.github.com>
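The "a piece at a time" idea and the absolute-path storage described in the coverage commit can be illustrated with a minimal sketch. `CoverageAccumulator` and its methods are hypothetical names chosen for this example; they are not the `CICoverageData` API, whose actual interface is not shown in this PR page.

```python
from pathlib import Path
from typing import Dict, Iterable, Set


class CoverageAccumulator:
    """Hypothetical stand-in for a coverage container (not ddtrace's CICoverageData)."""

    def __init__(self) -> None:
        # Keys are absolute pathlib.Path objects; values are covered line numbers.
        self._lines: Dict[Path, Set[int]] = {}

    def add(self, path: Path, lines: Iterable[int]) -> None:
        # Normalize to an absolute path so different spellings of the same
        # file accumulate into a single entry.
        self._lines.setdefault(path.absolute(), set()).update(lines)

    def covered(self, path: Path) -> Set[int]:
        # Return a copy so callers cannot mutate the internal state.
        return set(self._lines.get(path.absolute(), set()))


acc = CoverageAccumulator()
acc.add(Path("pkg/mod.py"), [1, 2, 3])   # first piece, e.g. from one test
acc.add(Path("./pkg/mod.py"), [3, 10])   # later piece, possibly out of order
```

Because each call merges into a set keyed by the absolute path, the order in which pieces arrive does not matter, which is the property that helps when tests within a suite run out of order.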

This PR fixes the AWS Bedrock integration to tag an empty string by default when an input parameter is not provided. Previously we defaulted to `"None"`, which led to casting errors on the LLMObs integration side when we tried to cast the span tag value to an int/float. Defaulting to an empty string avoids those casting errors, and the `max_tokens` parameter is now only added to the LLMObs span event if it is a non-null value.

## Checklist

All author checklist items are checked.

## Reviewer Checklist

All reviewer checklist items are checked.
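The casting problem described in the Bedrock fix can be reproduced in plain Python. The sketch below is an illustration only: `build_parameters` and the `"max_tokens"` dictionary key are hypothetical and do not reflect the integration's real tag names or code.

```python
from typing import Dict


def build_parameters(tags: Dict[str, str]) -> Dict[str, int]:
    """Hypothetical helper: include max_tokens only when a value was tagged."""
    params: Dict[str, int] = {}
    raw = tags.get("max_tokens", "")
    if raw != "":
        # An empty string means "not provided", so the cast is skipped entirely.
        params["max_tokens"] = int(raw)
    return params


# The old default: the literal string "None" cannot be cast to a number.
try:
    int("None")
    old_default_failed = False
except ValueError:
    old_default_failed = True

missing = build_parameters({"max_tokens": ""})
present = build_parameters({"max_tokens": "128"})
```

The design choice is that an empty string is an unambiguous "absent" marker that can be checked before casting, whereas `"None"` looks like a value and fails only at cast time.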

This PR fixes three things:

- An issue in the LLMObs OpenAI integration, which stores tool calls (via the Chat Completions endpoint). Chat completions for tool calls return a list of tool calls, but we had previously assumed only one tool call would be returned.
- How we construct streamed tool chat completions. We were previously checking the first chunk in the response to know to join the `tool/function_call` chunk fields together, but it appears that the first chunk in a response can actually contain no data at all. We are now constructing the streamed response chunk-by-chunk.
- Add type checking for the request messages arg in the chat completions endpoint, as OpenAI allows users to pass in OpenAI `ChatMessage` class types. We were previously only looking for dictionary arguments, but now we'll correctly extract the message content based on the message type.

No changelog is required as this only affects private beta customers for LLMObs.

## Checklist

All author checklist items are checked.

## Reviewer Checklist

All reviewer checklist items are checked.
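The chunk-by-chunk construction mentioned in the second fix can be sketched as follows. The chunk shape used here (plain dictionaries with `index`, `name`, and `arguments` keys) is a simplified assumption for illustration and is not the OpenAI SDK's response type; `assemble_tool_calls` is a hypothetical name.

```python
from typing import Any, Dict, List


def assemble_tool_calls(chunks: List[Dict[str, Any]]) -> List[Dict[str, str]]:
    """Accumulate streamed tool-call fragments without inspecting the first chunk."""
    calls: Dict[int, Dict[str, str]] = {}
    for chunk in chunks:
        # A chunk may carry no data at all; treat that as an empty delta list.
        for delta in chunk.get("tool_calls") or []:
            call = calls.setdefault(delta["index"], {"name": "", "arguments": ""})
            call["name"] += delta.get("name") or ""
            call["arguments"] += delta.get("arguments") or ""
    # Several tool calls may be returned, so the result is a list ordered by index.
    return [calls[i] for i in sorted(calls)]


chunks = [
    {},  # empty first chunk
    {"tool_calls": [{"index": 0, "name": "get_weather", "arguments": ""}]},
    {"tool_calls": [{"index": 0, "arguments": '{"city": '}]},
    {"tool_calls": [{"index": 1, "name": "get_time", "arguments": "{}"}]},
    {"tool_calls": [{"index": 0, "arguments": '"Paris"}'}]},
]
tool_calls = assemble_tool_calls(chunks)
```

Processing every chunk the same way means an empty first chunk is harmless, and keying by index keeps fragments of different tool calls separate.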

## Overview

This PR adds the [`dramatiq`](https://github.com/Bogdanp/dramatiq/tree/master) library as a supported integration for `dd-trace-py`. This addresses #5043.

### In Scope for this PR

- Instrumenting the `dramatiq.actor.Actor.send_with_options()` method. This is the method called when a function with the `@dramatiq.actor` decorator is called asynchronously. See example below:

```python
# app.py
import dramatiq
from flask import Flask

app = Flask(__name__)


@dramatiq.actor
def my_func():
    return "response"


@dramatiq.actor
def my_other_func(a: int, b: int) -> int:
    return a + b


@app.route('/')
def index():
    my_func.send()  # this calls send_with_options() under the hood
    my_other_func.send_with_options(args=(1, 1), options={"max_retries": 3})
    return 'hello world'


if __name__ == '__main__':
    app.run(host='0.0.0.0', port=6060)
```

Running the above example Flask app with `ddtrace-run flask run` and hitting the `/` endpoint would generate a trace in the UI like the following:

<img width="1052" alt="Screenshot 2024-03-26 at 3 28 21 PM" src="https://github.com/DataDog/dd-trace-py/assets/153395705/78c6e832-c406-4829-92ac-799821fc2e31">

The detailed span info for `my_other_func.send_with_options(...)` would look something like this:

```python
{
    actor: {
        name: my_other_func
        options: {"max_retries": 3}
    }
    env: test
    language: python
    span: {
        kind: producer
    }
}
```

#### NOTE

- The duration of the span is of the `send_with_options()` call itself, and not reflective of the execution duration of the function being asynchronously completed.
- When calling a function asynchronously with `send()`, it will also display on the UI as a `send_with_options()` span. This is because `send()` calls `send_with_options()` with empty options, so tracing both of these functions would create two spans for every instance that `send()` is called, with the exact same span information.

### Out of Scope for this PR

- All other `dramatiq` methods.
- Supporting the actual duration of the function executed asynchronously by `send_with_options()`

The above and additional features can be investigated and added to this initial iteration at a later time.

## Checklist

All author checklist items are checked.

## Reviewer Checklist

No reviewer checklist items are checked.
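The note about `send()` delegating to `send_with_options()` explains why only one method is instrumented. A toy model makes the point concrete. `ToyActor`, `traced`, and `recorded_spans` are hypothetical stand-ins written for this sketch; they are neither dramatiq's `Actor` class nor the integration's actual patching code.

```python
import functools
from typing import Any, Callable, Dict, List

recorded_spans: List[Dict[str, str]] = []


class ToyActor:
    """Stand-in that mimics send() delegating to send_with_options()."""

    def __init__(self, name: str) -> None:
        self.name = name

    def send(self, *args: Any, **kwargs: Any) -> Dict[str, Any]:
        return self.send_with_options(args=args, kwargs=kwargs)

    def send_with_options(self, *, args=(), kwargs=None, **options: Any) -> Dict[str, Any]:
        return {"actor": self.name, "args": args, "options": options}


def traced(method: Callable) -> Callable:
    """Record one producer 'span' per call of the wrapped method."""

    @functools.wraps(method)
    def wrapper(self, *args: Any, **kwargs: Any):
        recorded_spans.append({"actor.name": self.name, "span.kind": "producer"})
        return method(self, *args, **kwargs)

    return wrapper


# Wrap only the method that every send path goes through.
ToyActor.send_with_options = traced(ToyActor.send_with_options)

actor = ToyActor("my_func")
actor.send()                                         # one span, via delegation
actor.send_with_options(args=(1, 1), max_retries=3)  # one span, direct call
```

Wrapping `send()` as well would record two entries for the first call, which is the duplication the note describes.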

CI: Cuts system tests running time in half (~40 min to ~20). The main drawback of this approach is that it doubles actual usage time. We can group the scenarios to reduce the overhead, since most of them take ~1 min to run, and now an extra ~1 min to fetch the docker images.

## Checklist

All author checklist items are checked.

## Reviewer Checklist

All reviewer checklist items are checked.
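The trade-off between per-job image fetch overhead and grouping can be worked through with a small model. Only the ~1 min run time and ~1 min fetch time come from the description above. The scenario count of 20, the group size of 5, and the assumption that all jobs run fully in parallel are hypothetical and chosen only to make the arithmetic visible.

```python
import math
from typing import Tuple


def usage_and_wall_minutes(
    n_scenarios: int,
    group_size: int,
    run_min: float = 1.0,    # ~1 min per scenario (from the description)
    fetch_min: float = 1.0,  # ~1 min image fetch per job (from the description)
) -> Tuple[float, float]:
    """Return (total billed minutes, wall-clock minutes) for a grouped matrix.

    Assumes every job runs in parallel and pays the fetch cost once.
    """
    n_jobs = math.ceil(n_scenarios / group_size)
    usage = n_jobs * fetch_min + n_scenarios * run_min
    wall = fetch_min + min(group_size, n_scenarios) * run_min
    return usage, wall


ungrouped = usage_and_wall_minutes(20, 1)  # one scenario per job
grouped = usage_and_wall_minutes(20, 5)    # five scenarios per job
```

Under these assumptions, one scenario per job gives 40 billed minutes, half of which is fetch overhead, while groups of five bring that down to 24 billed minutes at the cost of a longer wall-clock time per job.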

This reverts commit ef4d804 from #8791, which broke CI on the main branch.

## Checklist

All author checklist items are checked.

## Checklist

All author checklist items are checked.

## Reviewer Checklist

No reviewer checklist items are checked.

…#8798)

This PR adds support to the botocore integration's bedrock service to correctly tag input/output messages from Anthropic calls. Previously bedrock's models only supported raw text prompts and returned text outputs. However, Anthropic's newest Claude 3 supports a chat message API, which means we need to support that as well.

This change also switches to using `tracer.trace()` instead of `tracer.start_span(..., activate=False)` for bedrock spans, because the latter meant that bedrock spans would always be root spans (messing up parenting for traces containing non-root bedrock spans).

Additionally, getting rid of the `activate=False` argument means that bedrock spans will now continue to be the active span until the stream/body is completely consumed. Previously we allowed bedrock spans to not be active, but if other downstream operations happened in the bedrock span then they would not correctly be child spans of the bedrock spans.

## Checklist

All author checklist items are checked.

## Reviewer Checklist

All reviewer checklist items are checked.
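The parenting difference between the two span-creation styles can be shown with a toy tracer. `ToyTracer` and `Span` below are a simplified model written for this sketch, not ddtrace's implementation. They only mimic the behaviour the description relies on: `start_span(..., activate=False)` does not look at or change the active span, while `trace()` does both.

```python
from typing import Optional


class Span:
    def __init__(self, name: str, parent: Optional["Span"]) -> None:
        self.name = name
        self.parent = parent


class ToyTracer:
    """Simplified model of active-span tracking (not ddtrace's tracer)."""

    def __init__(self) -> None:
        self.active: Optional[Span] = None

    def start_span(self, name: str, child_of: Optional[Span] = None,
                   activate: bool = False) -> Span:
        # Parent must be passed explicitly; the active span is ignored.
        span = Span(name, parent=child_of)
        if activate:
            self.active = span
        return span

    def trace(self, name: str) -> Span:
        # Parent is the currently active span, and the new span becomes active.
        span = Span(name, parent=self.active)
        self.active = span
        return span


tracer = ToyTracer()
request = tracer.trace("flask.request")

detached = tracer.start_span("bedrock.command")  # old style: becomes a root span
attached = tracer.trace("bedrock.command")       # new style: child of the request
downstream = tracer.trace("http.request")        # child of the active bedrock span
```

In this model the old style yields a span with no parent even though a request span is active, and downstream work attaches to the bedrock span only when that span was activated.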

…e checks

…_prevention

This reverts commit 998cb34.

…_for_exploit_prevention' into gnufede/ci-system-tests-test-fail

tests/appsec/contrib_appsec/flask_app/app.py

pr-commenter · 2024-04-01T15:24:24Z

Benchmarks

Benchmark execution time: 2024-04-01 15:54:32

Comparing candidate commit a087073 in PR branch gnufede/ci-system-tests-test-fail with baseline commit 907b7e8 in branch main.

Found 4 performance improvements and 2 performance regressions! Performance is the same for 195 metrics, 9 unstable metrics.

scenario:httppropagationextract-b3_single_headers

🟥 max_rss_usage [+548.554KB; +704.412KB] or [+2.584%; +3.318%]

scenario:httppropagationextract-full_t_id_datadog_headers

🟥 max_rss_usage [+523.489KB; +756.921KB] or [+2.480%; +3.586%]

scenario:httppropagationextract-large_header_no_matches

🟩 max_rss_usage [-806.350KB; -728.830KB] or [-3.682%; -3.328%]

scenario:httppropagationextract-medium_header_no_matches

🟩 max_rss_usage [-772.461KB; -689.811KB] or [-3.532%; -3.154%]

scenario:httppropagationextract-wsgi_large_header_no_matches

🟩 max_rss_usage [-817.192KB; -750.756KB] or [-3.730%; -3.427%]

scenario:httppropagationextract-wsgi_medium_header_no_matches

🟩 max_rss_usage [-798.193KB; -732.891KB] or [-3.646%; -3.348%]

christophe-papazian and others added 30 commits March 26, 2024 18:22

add ssrf support for urllib.request

5288ec7

update django and flask endpoints for threat tests

3356c97

Merge branch 'main' into christophe-papazian/ssrf_support_for_exploit…

c2373b7

…_prevention

test parallel system tests

cccf720

force run system-tests

7a9eac1

include sha, improve needs

1ae1763

checkout system tests in run step too

54afc50

add remaining scenarios

903ce90

store and restore venv

f83366b

execution permissions

2c39429

store and restore agent and runner images too

c05ea8d

only patch when exploit prevention is enabled

033e230

typo

8e7aa8b

don't store/restore the runner

cf6cc88

remove guard for patching

dc25103

add final step

b82b802

revert run always

91d8af7

test log artifacts

b091875

fix compress artifact step

c716b90

cleanup

92a5745

upgrade tests, iast check still wip

2347f00

fix label for stack_id, improve exploit prevention unit test with mor…

f03e74d

…e checks

christophe-papazian and others added 12 commits March 29, 2024 17:27

Merge branch 'main' into christophe-papazian/ssrf_support_for_exploit…

f40223f

…_prevention

test force skip

77783bd

group scenarios

ccbcbe4

revert force skip

4eaa629

Merge branch 'main' into gnufede/ci-parallel-system-tests

009324c

change if logic

f34171e

change matrix.scenario to scenario

61cb162

fix if condition for compress step

9630b34

test: force skip

998cb34

adds if to python 3.9 step

d64c503

Revert "test: force skip"

b6d02ef

This reverts commit 998cb34.

Merge remote-tracking branch 'origin/christophe-papazian/ssrf_support…

33168b4

…_for_exploit_prevention' into gnufede/ci-system-tests-test-fail

gnufede changed the title ~~Gnufede/ci system tests test fail~~ test: ci system tests - test failure Apr 1, 2024

github-advanced-security bot found potential problems Apr 1, 2024

tests/appsec/contrib_appsec/flask_app/app.py (three findings, all dismissed)

gnufede added 4 commits April 1, 2024 16:20

propagate failure to end jobs

a61650e

fix failure

37d2755

limit to flask

a087073

typo, make it faster

5fd617e

fix failure condition

680b0f0

gnufede closed this Apr 1, 2024

gnufede deleted the gnufede/ci-system-tests-test-fail branch April 1, 2024 16:03
