test: ci system tests - test failure #8815

Closed
wants to merge 47 commits into from

Conversation

gnufede
Member

@gnufede gnufede commented Apr 1, 2024

Checklist

  • Change(s) are motivated and described in the PR description
  • Testing strategy is described if automated tests are not included in the PR
  • Risks are described (performance impact, potential for breakage, maintainability)
  • Change is maintainable (easy to change, telemetry, documentation)
  • Library release note guidelines are followed or label changelog/no-changelog is set
  • Documentation is included (in-code, generated user docs, public corp docs)
  • Backport labels are set (if applicable)
  • If this PR changes the public interface, I've notified @DataDog/apm-tees.
  • If change touches code that signs or publishes builds or packages, or handles credentials of any kind, I've requested a review from @DataDog/security-design-and-guidance.

Reviewer Checklist

  • Title is accurate
  • All changes are related to the pull request's stated goal
  • Description motivates each change
  • Avoids breaking API changes
  • Testing strategy adequately addresses listed risks
  • Change is maintainable (easy to change, telemetry, documentation)
  • Release note makes sense to a user of the library
  • Author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
  • Backport labels are set in a manner that is consistent with the release branch maintenance policy

christophe-papazian and others added 30 commits March 26, 2024 18:22
Adds support for submitting coverage data using the CI Visibility manual
API.

The `CICoverageData` class is used as a container. One of the features
it will enable is adding coverage data "a piece at a time" instead of
one-shotting the entire coverage data (which is still possible).

This will be useful for cases where tests don't run in-order within
suites (which is possible in cases like `pytest-xdist`).

Two differences worth noting compared with the existing approach to coverage:
- paths are stored using `pathlib.Path` rather than strings
- paths are stored as absolute paths (by calling `.absolute()` on all
given `Path` objects)
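
As a rough illustration of the "a piece at a time" idea, here is a minimal sketch. `CoverageAccumulator` and its methods are hypothetical names made up for this example; it is not the actual `CICoverageData` class, only a stand-in that shows absolute `pathlib.Path` keys and incremental additions.

```python
from pathlib import Path
from typing import Dict, Iterable, Set


class CoverageAccumulator:
    """Hypothetical stand-in for a coverage container (not the real CICoverageData)."""

    def __init__(self) -> None:
        self._lines: Dict[Path, Set[int]] = {}

    def add(self, path: Path, lines: Iterable[int]) -> None:
        # Store keys as absolute paths so relative and absolute
        # spellings of the same file end up in the same entry.
        key = Path(path).absolute()
        self._lines.setdefault(key, set()).update(lines)

    def covered(self, path: Path) -> Set[int]:
        # Return a copy so callers cannot mutate the internal state.
        return set(self._lines.get(Path(path).absolute(), set()))


acc = CoverageAccumulator()
# Coverage can arrive in several pieces, in any order.
acc.add(Path("pkg/module.py"), [1, 2, 3])
acc.add(Path("pkg/module.py").absolute(), [3, 10])
```

Both calls land in the same entry, which is the property that matters when tests in a suite do not report in order.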

## Checklist

- [x] Change(s) are motivated and described in the PR description
- [x] Testing strategy is described if automated tests are not included
in the PR
- [x] Risks are described (performance impact, potential for breakage,
maintainability)
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [x] [Library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
are followed or label `changelog/no-changelog` is set
- [x] Documentation is included (in-code, generated user docs, [public
corp docs](https://github.com/DataDog/documentation/))
- [x] Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))
- [x] If this PR changes the public interface, I've notified
`@DataDog/apm-tees`.
- [x] If change touches code that signs or publishes builds or packages,
or handles credentials of any kind, I've requested a review from
`@DataDog/security-design-and-guidance`.

## Reviewer Checklist

- [x] Title is accurate
- [x] All changes are related to the pull request's stated goal
- [x] Description motivates each change
- [x] Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes
- [x] Testing strategy adequately addresses listed risks
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [x] Release note makes sense to a user of the library
- [x] Author has acknowledged and discussed the performance implications
of this PR as reported in the benchmarks PR comment
- [x] Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)

---------

Co-authored-by: Federico Mon <federico.mon@datadoghq.com>
Co-authored-by: Teague Bick <teague.bick@datadoghq.com>
Co-authored-by: Emmett Butler <723615+emmettbutler@users.noreply.github.com>
This PR fixes the AWS bedrock integration to tag an empty string by
default if an input parameter is not provided (previously we defaulted to
`"None"`), as that led to casting errors on the LLMObs integration side
when we tried to cast the span tag value to an int/float.

Now, we default tag to an empty string, which avoids the casting errors
we had previously, and only add the `max_tokens` parameter to the LLMObs
span event if it is a non-null value.
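
A minimal sketch of the behaviour described above, assuming tags are plain strings in a dict. The helper names `tag_value` and `build_llmobs_params` are hypothetical and do not come from the integration code.

```python
from typing import Any, Dict, Optional


def tag_value(param: Optional[Any]) -> str:
    # Missing parameters become "" instead of the string "None".
    return "" if param is None else str(param)


def build_llmobs_params(tags: Dict[str, str]) -> Dict[str, Any]:
    # Hypothetical helper: only cast and include max_tokens when it was set.
    params: Dict[str, Any] = {}
    if tags.get("max_tokens"):
        params["max_tokens"] = int(tags["max_tokens"])
    return params


# Casting the old default fails, which is the error this change avoids.
try:
    int("None")
    old_default_failed = False
except ValueError:
    old_default_failed = True

missing = build_llmobs_params({"max_tokens": tag_value(None)})
present = build_llmobs_params({"max_tokens": tag_value(256)})
```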

## Checklist

- [x] Change(s) are motivated and described in the PR description
- [x] Testing strategy is described if automated tests are not included
in the PR
- [x] Risks are described (performance impact, potential for breakage,
maintainability)
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [X] [Library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
are followed or label `changelog/no-changelog` is set
- [x] Documentation is included (in-code, generated user docs, [public
corp docs](https://github.com/DataDog/documentation/))
- [X] Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))
- [x] If this PR changes the public interface, I've notified
`@DataDog/apm-tees`.
- [x] If change touches code that signs or publishes builds or packages,
or handles credentials of any kind, I've requested a review from
`@DataDog/security-design-and-guidance`.

## Reviewer Checklist

- [x] Title is accurate
- [x] All changes are related to the pull request's stated goal
- [x] Description motivates each change
- [x] Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes
- [x] Testing strategy adequately addresses listed risks
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [x] Release note makes sense to a user of the library
- [x] Author has acknowledged and discussed the performance implications
of this PR as reported in the benchmarks PR comment
- [x] Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
This PR fixes three things:
- An issue in the LLMObs' OpenAI integration which stores tool calls
(via the Chat Completions endpoint). Chat completions for tool calls
return a list of tool calls, but we had previously assumed only one tool
call would be returned.
- How we construct streamed tool chat completions. We were previously
checking the first chunk in the response to know to join the
`tool/function_call` chunk fields together, but it appears that the
first chunk in a response can actually contain no data at all. We are
now constructing the streamed response chunk-by-chunk.
- Add type checking for request messages arg in the chat completions
endpoint, as OpenAI allows users to pass in OpenAI `ChatMessage` class
types. We were previously only looking for dictionary arguments, but now
we'll correctly extract the message content based on the message type.
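
The last two points can be sketched roughly as follows. The chunk shape used here (dicts with an optional `tool_calls` list of `index`/`name`/`arguments` fragments) is an assumption made for illustration, not the OpenAI SDK's actual types, and both function names are hypothetical.

```python
from typing import Any, Dict, Iterable, List


def join_tool_call_chunks(chunks: Iterable[Dict[str, Any]]) -> List[Dict[str, str]]:
    """Build the list of tool calls chunk by chunk (assumed chunk shape)."""
    calls: Dict[int, Dict[str, str]] = {}
    for chunk in chunks:
        # A chunk may carry no tool-call data at all, including the first one.
        for delta in chunk.get("tool_calls") or []:
            call = calls.setdefault(delta["index"], {"name": "", "arguments": ""})
            call["name"] += delta.get("name") or ""
            call["arguments"] += delta.get("arguments") or ""
    return [calls[i] for i in sorted(calls)]


def message_content(message: Any) -> str:
    # Accept both plain dicts and objects exposing a `content` attribute.
    if isinstance(message, dict):
        return str(message.get("content") or "")
    return str(getattr(message, "content", "") or "")


stream = [
    {},  # empty first chunk
    {"tool_calls": [{"index": 0, "name": "get_weather", "arguments": '{"city"'}]},
    {"tool_calls": [{"index": 0, "arguments": ': "Paris"}'}]},
    {"tool_calls": [{"index": 1, "name": "get_time", "arguments": "{}"}]},
]
tool_calls = join_tool_call_chunks(stream)
```

Because nothing is decided from the first chunk, an empty leading chunk is harmless, and more than one tool call can be returned.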

No changelog is required as this only affects private beta customers for
LLMObs.

## Checklist

- [x] Change(s) are motivated and described in the PR description
- [x] Testing strategy is described if automated tests are not included
in the PR
- [x] Risks are described (performance impact, potential for breakage,
maintainability)
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [X] [Library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
are followed or label `changelog/no-changelog` is set
- [x] Documentation is included (in-code, generated user docs, [public
corp docs](https://github.com/DataDog/documentation/))
- [X] Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))
- [x] If this PR changes the public interface, I've notified
`@DataDog/apm-tees`.
- [x] If change touches code that signs or publishes builds or packages,
or handles credentials of any kind, I've requested a review from
`@DataDog/security-design-and-guidance`.

## Reviewer Checklist

- [x] Title is accurate
- [x] All changes are related to the pull request's stated goal
- [x] Description motivates each change
- [x] Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes
- [x] Testing strategy adequately addresses listed risks
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [x] Release note makes sense to a user of the library
- [x] Author has acknowledged and discussed the performance implications
of this PR as reported in the benchmarks PR comment
- [x] Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
## Overview
This PR adds the
[`dramatiq`](https://github.com/Bogdanp/dramatiq/tree/master) library as
a supported integration for `dd-trace-py`. This addresses
#5043.

### In Scope for this PR
- Instrumenting the `dramatiq.actor.Actor.send_with_options()` method.
This is the method called when a function with the `@dramatiq.actor`
decorator is called asynchronously. See example below:

```python
# app.py
import dramatiq
from flask import Flask

app = Flask(__name__)

@dramatiq.actor
def my_func():
   return "response"

@dramatiq.actor
def my_other_func(a: int, b: int) -> int:
   return a + b

@app.route('/')
def index():
   my_func.send() # this calls send_with_options() under the hood
   my_other_func.send_with_options(args=(1, 1), options={"max_retries": 3})
   return 'hello world'

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=6060)
```

Running the above example Flask app with `ddtrace-run flask run` and
hitting the `/` endpoint would generate a trace in the UI like the
following:

<img width="1052" alt="Screenshot 2024-03-26 at 3 28 21 PM"
src="https://github.com/DataDog/dd-trace-py/assets/153395705/78c6e832-c406-4829-92ac-799821fc2e31">

The detailed span info for `my_other_func.send_with_options(...)` would
look something like this:

```python
{
   actor: {
      name: my_other_func
      options: {"max_retries": 3}
   }
   env: test
   language: python
   span: {
      kind: producer
   }
}
```

#### NOTE
- The duration of the span is of the `send_with_options()` call itself,
and not reflective of the execution duration of the function being
asynchronously completed.
- When calling a function asynchronously with `send()`, it will also
display on the UI as a `send_with_options()` span. This is because
`send()` calls `send_with_options()` with empty options, so tracing both
of these functions would create two spans for every instance that
`send()` is called, with the exact same span information.
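
To illustrate the second note, here is a toy model with no `dramatiq` or `ddtrace` dependency. `ToyActor`, `traced`, and `recorded_spans` are hypothetical stand-ins; the only thing taken from the description above is that `send()` delegates to `send_with_options()`, so wrapping the latter alone gives one span per call.

```python
import functools
from typing import Any, Dict, List

recorded_spans: List[Dict[str, Any]] = []  # stand-in for a real tracer


def traced(func):
    """Record one producer 'span' each time the wrapped method is called."""
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        recorded_spans.append(
            {"name": func.__name__, "actor": self.name, "kind": "producer"}
        )
        return func(self, *args, **kwargs)
    return wrapper


class ToyActor:
    def __init__(self, name: str) -> None:
        self.name = name

    def send(self, *args, **kwargs):
        # Not traced itself: it simply delegates with empty options.
        return self.send_with_options(args=args, kwargs=kwargs)

    @traced
    def send_with_options(self, *, args=(), kwargs=None, **options):
        return {"actor": self.name, "args": args, "options": options}


ToyActor("my_func").send()
ToyActor("my_other_func").send_with_options(args=(1, 1), max_retries=3)
```

Tracing `send()` as well would add a second entry for the first call that carries the same information.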

### Out of Scope for this PR
- All other `dramatiq` methods.
- Supporting the actual duration of the function executed asynchronously
by `send_with_options()`

The above and additional features can be investigated and added to this
initial iteration at a later time.

## Checklist

- [x] Change(s) are motivated and described in the PR description
- [x] Testing strategy is described if automated tests are not included
in the PR
- [x] Risks are described (performance impact, potential for breakage,
maintainability)
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [x] [Library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
are followed or label `changelog/no-changelog` is set
- [x] Documentation is included (in-code, generated user docs, [public
corp docs](https://github.com/DataDog/documentation/))
- [x] Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))
- [x] If this PR changes the public interface, I've notified
`@DataDog/apm-tees`.
- [x] If change touches code that signs or publishes builds or packages,
or handles credentials of any kind, I've requested a review from
`@DataDog/security-design-and-guidance`.

## Reviewer Checklist

- [ ] Title is accurate
- [ ] All changes are related to the pull request's stated goal
- [ ] Description motivates each change
- [ ] Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes
- [ ] Testing strategy adequately addresses listed risks
- [ ] Change is maintainable (easy to change, telemetry, documentation)
- [ ] Release note makes sense to a user of the library
- [ ] Author has acknowledged and discussed the performance implications
of this PR as reported in the benchmarks PR comment
- [ ] Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
CI: Slashes system tests running time in half (~40 min to ~20).

Main drawback of this approach:
It doubles actual usage time. We can group the scenarios to reduce the
overhead, since most of them take ~1 min to run, and each job now needs
an extra ~1 min to fetch the docker images.
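
A back-of-the-envelope sketch of that trade-off. The scenario count (20) and the assumption that all jobs run fully in parallel are illustrative only and are not measurements from this CI; the ~1 min run and ~1 min fetch figures come from the description above.

```python
import math


def billed_minutes(n_scenarios: int, group_size: int,
                   run_min: float = 1.0, fetch_min: float = 1.0) -> float:
    # Each job pays the image fetch once, then runs its scenarios in sequence.
    jobs = math.ceil(n_scenarios / group_size)
    return jobs * fetch_min + n_scenarios * run_min


def wall_clock_minutes(group_size: int,
                       run_min: float = 1.0, fetch_min: float = 1.0) -> float:
    # Assumes every job starts at once, so one job's duration is the total.
    return fetch_min + group_size * run_min


ungrouped = billed_minutes(20, 1)   # one scenario per job
grouped = billed_minutes(20, 4)     # four scenarios per job
```

With one scenario per job the fetch cost equals the run cost, which is where the doubling comes from; grouping amortizes the fetch at the price of a longer wall-clock time per job.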

## Checklist

- [x] Change(s) are motivated and described in the PR description
- [x] Testing strategy is described if automated tests are not included
in the PR
- [x] Risks are described (performance impact, potential for breakage,
maintainability)
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [x] [Library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
are followed or label `changelog/no-changelog` is set
- [x] Documentation is included (in-code, generated user docs, [public
corp docs](https://github.com/DataDog/documentation/))
- [x] Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))
- [x] If this PR changes the public interface, I've notified
`@DataDog/apm-tees`.
- [x] If change touches code that signs or publishes builds or packages,
or handles credentials of any kind, I've requested a review from
`@DataDog/security-design-and-guidance`.

## Reviewer Checklist

- [x] Title is accurate
- [x] All changes are related to the pull request's stated goal
- [x] Description motivates each change
- [x] Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes
- [x] Testing strategy adequately addresses listed risks
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [x] Release note makes sense to a user of the library
- [x] Author has acknowledged and discussed the performance implications
of this PR as reported in the benchmarks PR comment
- [x] Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
This reverts commit ef4d804 from #8791,
which broke CI on the main branch.

## Checklist

- [x] Change(s) are motivated and described in the PR description
- [x] Testing strategy is described if automated tests are not included
in the PR
- [x] Risks are described (performance impact, potential for breakage,
maintainability)
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [x] [Library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
are followed or label `changelog/no-changelog` is set
- [x] Documentation is included (in-code, generated user docs, [public
corp docs](https://github.com/DataDog/documentation/))
- [x] Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))
- [x] If this PR changes the public interface, I've notified
`@DataDog/apm-tees`.
- [x] If change touches code that signs or publishes builds or packages,
or handles credentials of any kind, I've requested a review from
`@DataDog/security-design-and-guidance`.
## Checklist

- [x] Change(s) are motivated and described in the PR description
- [x] Testing strategy is described if automated tests are not included
in the PR
- [x] Risks are described (performance impact, potential for breakage,
maintainability)
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [x] [Library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
are followed or label `changelog/no-changelog` is set
- [x] Documentation is included (in-code, generated user docs, [public
corp docs](https://github.com/DataDog/documentation/))
- [x] Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))
- [x] If this PR changes the public interface, I've notified
`@DataDog/apm-tees`.
- [x] If change touches code that signs or publishes builds or packages,
or handles credentials of any kind, I've requested a review from
`@DataDog/security-design-and-guidance`.

## Reviewer Checklist

- [ ] Title is accurate
- [ ] All changes are related to the pull request's stated goal
- [ ] Description motivates each change
- [ ] Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes
- [ ] Testing strategy adequately addresses listed risks
- [ ] Change is maintainable (easy to change, telemetry, documentation)
- [ ] Release note makes sense to a user of the library
- [ ] Author has acknowledged and discussed the performance implications
of this PR as reported in the benchmarks PR comment
- [ ] Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
…#8798)

This PR adds support to the botocore integration's bedrock service to
correctly tag input/output messages from Anthropic calls.

Previously bedrock's models only supported raw text prompts and returned
text outputs. However, Anthropic's newest Claude 3 models support a chat
message API, which means we need to support that as well.

This change also switches to using `tracer.trace()` instead of
`tracer.start_span(..., activate=False)` for bedrock spans, because the
latter meant that bedrock spans would always be root spans (messing up
parenting for traces containing non-root bedrock spans).
Additionally, dropping the `activate=False` argument means that bedrock
spans now remain the active span until the stream/body is completely
consumed. Previously bedrock spans were not active, so any downstream
operations that happened within a bedrock span were not correctly
parented as children of that span.

## Checklist

- [x] Change(s) are motivated and described in the PR description
- [x] Testing strategy is described if automated tests are not included
in the PR
- [x] Risks are described (performance impact, potential for breakage,
maintainability)
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [x] [Library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
are followed or label `changelog/no-changelog` is set
- [x] Documentation is included (in-code, generated user docs, [public
corp docs](https://github.com/DataDog/documentation/))
- [x] Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))
- [x] If this PR changes the public interface, I've notified
`@DataDog/apm-tees`.
- [x] If change touches code that signs or publishes builds or packages,
or handles credentials of any kind, I've requested a review from
`@DataDog/security-design-and-guidance`.

## Reviewer Checklist

- [x] Title is accurate
- [x] All changes are related to the pull request's stated goal
- [x] Description motivates each change
- [x] Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes
- [x] Testing strategy adequately addresses listed risks
- [X] Change is maintainable (easy to change, telemetry, documentation)
- [x] Release note makes sense to a user of the library
- [x] Author has acknowledged and discussed the performance implications
of this PR as reported in the benchmarks PR comment
- [x] Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
@gnufede gnufede changed the title from "Gnufede/ci system tests test fail" to "test: ci system tests - test failure" Apr 1, 2024
res.append(f"File: {f.read()}")
except Exception as e:
res.append(f"Error: {e}")
return "<\\br>\n".join(res)

Check warning

Code scanning / CodeQL

Information exposure through an exception Medium

Stack trace information
flows to this location and may be exposed to an external user.
tests/appsec/contrib_appsec/flask_app/app.py Dismissed
tests/appsec/contrib_appsec/flask_app/app.py Dismissed
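
For context, the usual way to address this kind of warning outside a test app is to log the exception server-side and return a generic message. The sketch below is a hypothetical variant of the flagged snippet; `read_files` and its signature are assumptions and are not taken from `app.py`.

```python
import logging
from pathlib import Path
from typing import Iterable, List

log = logging.getLogger(__name__)


def read_files(paths: Iterable[str]) -> str:
    res: List[str] = []
    for p in paths:
        try:
            res.append(f"File: {Path(p).read_text()}")
        except OSError:
            # Keep the details in the server log, not in the response body.
            log.exception("could not read %s", p)
            res.append("Error: file could not be read")
    return "\n".join(res)


response_body = read_files(["/nonexistent/definitely-missing-file"])
```

The warnings here were dismissed, so this is only a reference pattern, not a change proposed for this PR.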
@pr-commenter

pr-commenter bot commented Apr 1, 2024

Benchmarks

Benchmark execution time: 2024-04-01 15:54:32

Comparing candidate commit a087073 in PR branch gnufede/ci-system-tests-test-fail with baseline commit 907b7e8 in branch main.

Found 4 performance improvements and 2 performance regressions! Performance is the same for 195 metrics, 9 unstable metrics.

scenario:httppropagationextract-b3_single_headers

  • 🟥 max_rss_usage [+548.554KB; +704.412KB] or [+2.584%; +3.318%]

scenario:httppropagationextract-full_t_id_datadog_headers

  • 🟥 max_rss_usage [+523.489KB; +756.921KB] or [+2.480%; +3.586%]

scenario:httppropagationextract-large_header_no_matches

  • 🟩 max_rss_usage [-806.350KB; -728.830KB] or [-3.682%; -3.328%]

scenario:httppropagationextract-medium_header_no_matches

  • 🟩 max_rss_usage [-772.461KB; -689.811KB] or [-3.532%; -3.154%]

scenario:httppropagationextract-wsgi_large_header_no_matches

  • 🟩 max_rss_usage [-817.192KB; -750.756KB] or [-3.730%; -3.427%]

scenario:httppropagationextract-wsgi_medium_header_no_matches

  • 🟩 max_rss_usage [-798.193KB; -732.891KB] or [-3.646%; -3.348%]

@gnufede gnufede closed this Apr 1, 2024
@gnufede gnufede deleted the gnufede/ci-system-tests-test-fail branch April 1, 2024 16:03

7 participants