Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set <event>.duration.us field in ingest pipeline #7261

Merged
merged 6 commits into from
Feb 13, 2022

Conversation

axw
Copy link
Member

@axw axw commented Feb 11, 2022

Motivation/summary

Update package model to no longer set transaction.duration.us or span.duration.us, and instead set event.duration. Update our ingest pipelines to use event.duration to set the legacy fields, and then remove event.duration. This pipeline can be removed once when we are ready and able to update the APM UI to use event.duration directly.

I have listed this as a breaking change in the top-level changelog, as it will have visible effects for anyone not using the Elasticsearch output or the packaged ingest pipeline. I have not done this in the integration package changelog, as I assume the ingest pipeline is always in use there.

Checklist

How to test these changes

Send transactions and spans with zero and non-zero durations, and ensure they each have transaction.duration.us or span.duration.us set, and do not have event.duration set.

Related issues

Prerequisite of #5999
Part of #3565 and #4120

We'll set the <event>.duration.us fields in ingest.
Translate event.duration to <event>.duration.us for
transaction and span events, and then drop event.duration.
@axw axw added the v8.2.0 label Feb 11, 2022
@mergify
Copy link
Contributor

mergify bot commented Feb 11, 2022

This pull request does not have a backport label. Could you fix it @axw? 🙏
To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-7.x is the label to automatically backport to the 7.x branch.
  • backport-7./d is the label to automatically backport to the 7./d branch. /d is the digit

NOTE: backport-skip has been added to this pull request.

@mergify mergify bot added the backport-skip Skip notification from the automated backport with mergify label Feb 11, 2022
@apmmachine
Copy link
Collaborator

apmmachine commented Feb 11, 2022

💔 Tests Failed

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2022-02-13T04:45:50.708+0000

  • Duration: 60 min 37 sec

Test stats 🧪

Test Results
Failed 1
Passed 5625
Skipped 19
Total 5645

Test errors 1

Expand to view the tests failures

Build and Test / System and Environment Tests / TestFleetIntegrationBeatsMonitoring – github.com/elastic/apm-server/systemtest
    Expand to view the error details

     Failed 
    

    Expand to view the stacktrace

     === RUN   TestFleetIntegrationBeatsMonitoring
    2022/02/13 05:17:56 Building image elastic-agent-systemtest:8.2.0-ff5ac1e4-SNAPSHOT...
    2022/02/13 05:17:57 Built image elastic-agent-systemtest:8.2.0-ff5ac1e4-SNAPSHOT
    2022/02/13 05:17:57 Starting container id: 35f9fd482aab image: elastic-agent-systemtest:8.2.0-ff5ac1e4-SNAPSHOT
    2022/02/13 05:17:57 Waiting for container id 35f9fd482aab image: elastic-agent-systemtest:8.2.0-ff5ac1e4-SNAPSHOT
    2022/02/13 05:18:03 Container is ready id: 35f9fd482aab image: elastic-agent-systemtest:8.2.0-ff5ac1e4-SNAPSHOT
        fleet_test.go:78: 
            	Error Trace:	fleet_test.go:78
            	Error:      	Not equal: 
            	            	expected: map[string]interface {}{"output":map[string]interface {}{"events":map[string]interface {}{"acked":15, "active":0, "batches":1, "failed":0, "toomany":0, "total":15}}}
            	            	actual  : map[string]interface {}{"config":map[string]interface {}{"module":map[string]interface {}{"running":0, "starts":0, "stops":0}, "reloads":0, "scans":0}, "output":map[string]interface {}{"events":map[string]interface {}{"acked":0, "active":0, "batches":0, "dropped":0, "duplicates":0, "failed":0, "toomany":0, "total":0}, "read":map[string]interface {}{"bytes":0, "errors":0}, "type":"elasticsearch", "write":map[string]interface {}{"bytes":0, "errors":0}}, "pipeline":map[string]interface {}{"clients":0, "events":map[string]interface {}{"active":0, "dropped":0, "failed":0, "filtered":0, "published":0, "retry":0, "total":0}, "queue":map[string]interface {}{"acked":0, "max_events":4096}}}
            	            	
            	            	Diff:
            	            	--- Expected
            	            	+++ Actual
            	            	@@ -1,10 +1,46 @@
            	            	-(map[string]interface {}) (len=1) {
            	            	- (string) (len=6) "output": (map[string]interface {}) (len=1) {
            	            	-  (string) (len=6) "events": (map[string]interface {}) (len=6) {
            	            	-   (string) (len=5) "acked": (float64) 15,
            	            	+(map[string]interface {}) (len=3) {
            	            	+ (string) (len=6) "config": (map[string]interface {}) (len=3) {
            	            	+  (string) (len=6) "module": (map[string]interface {}) (len=3) {
            	            	+   (string) (len=7) "running": (float64) 0,
            	            	+   (string) (len=6) "starts": (float64) 0,
            	            	+   (string) (len=5) "stops": (float64) 0
            	            	+  },
            	            	+  (string) (len=7) "reloads": (float64) 0,
            	            	+  (string) (len=5) "scans": (float64) 0
            	            	+ },
            	            	+ (string) (len=6) "output": (map[string]interface {}) (len=4) {
            	            	+  (string) (len=6) "events": (map[string]interface {}) (len=8) {
            	            	+   (string) (len=5) "acked": (float64) 0,
            	            	    (string) (len=6) "active": (float64) 0,
            	            	-   (string) (len=7) "batches": (float64) 1,
            	            	+   (string) (len=7) "batches": (float64) 0,
            	            	+   (string) (len=7) "dropped": (float64) 0,
            	            	+   (string) (len=10) "duplicates": (float64) 0,
            	            	    (string) (len=6) "failed": (float64) 0,
            	            	    (string) (len=7) "toomany": (float64) 0,
            	            	-   (string) (len=5) "total": (float64) 15
            	            	+   (string) (len=5) "total": (float64) 0
            	            	+  },
            	            	+  (string) (len=4) "read": (map[string]interface {}) (len=2) {
            	            	+   (string) (len=5) "bytes": (float64) 0,
            	            	+   (string) (len=6) "errors": (float64) 0
            	            	+  },
            	            	+  (string) (len=4) "type": (string) (len=13) "elasticsearch",
            	            	+  (string) (len=5) "write": (map[string]interface {}) (len=2) {
            	            	+   (string) (len=5) "bytes": (float64) 0,
            	            	+   (string) (len=6) "errors": (float64) 0
            	            	+  }
            	            	+ },
            	            	+ (string) (len=8) "pipeline": (map[string]interface {}) (len=3) {
            	            	+  (string) (len=7) "clients": (float64) 0,
            	            	+  (string) (len=6) "events": (map[string]interface {}) (len=7) {
            	            	+   (string) (len=6) "active": (float64) 0,
            	            	+   (string) (len=7) "dropped": (float64) 0,
            	            	+   (string) (len=6) "failed": (float64) 0,
            	            	+   (string) (len=8) "filtered": (float64) 0,
            	            	+   (string) (len=9) "published": (float64) 0,
            	            	+   (string) (len=5) "retry": (float64) 0,
            	            	+   (string) (len=5) "total": (float64) 0
            	            	+  },
            	            	+  (string) (len=5) "queue": (map[string]interface {}) (len=2) {
            	            	+   (string) (len=5) "acked": (float64) 0,
            	            	+   (string) (len=10) "max_events": (float64) 4096
            	            	   }
            	Test:       	TestFleetIntegrationBeatsMonitoring
        fleet_test.go:151: elastic-agent logs: {"log.level":"info","@timestamp":"2022-02-13T05:17:59.020Z","log.origin":{"file.name":"cmd/enroll_cmd.go","file.line":455},"message":"Starting enrollment to URL: https://fleet-server:8220/","ecs.version":"1.6.0"}
            Successfully enrolled the Elastic Agent.
            {"log.level":"info","@timestamp":"2022-02-13T05:17:59.796Z","log.origin":{"file.name":"cmd/enroll_cmd.go","file.line":253},"message":"Elastic Agent might not be running; unable to trigger restart","ecs.version":"1.6.0"}
            {"log.level":"info","@timestamp":"2022-02-13T05:17:59.904Z","log.origin":{"file.name":"application/application.go","file.line":68},"message":"Detecting execution mode","ecs.version":"1.6.0"}
            {"log.level":"info","@timestamp":"2022-02-13T05:17:59.905Z","log.origin":{"file.name":"application/application.go","file.line":92},"message":"Agent is managed by Fleet","ecs.version":"1.6.0"}
            {"log.level":"info","@timestamp":"2022-02-13T05:17:59.905Z","log.origin":{"file.name":"capabilities/capabilities.go","file.line":59},"message":"capabilities file not found in /usr/share/elastic-agent/state/capabilities.yml","ecs.version":"1.6.0"}
            {"log.level":"info","@timestamp":"2022-02-13T05:18:00.524Z","log.logger":"composable.providers.docker","log.origin":{"file.name":"docker/docker.go","file.line":43},"message":"Docker provider skipped, unable to connect: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?","ecs.version":"1.6.0"}
            {"log.level":"info","@timestamp":"2022-02-13T05:18:00.526Z","log.logger":"api","log.origin":{"file.name":"api/server.go","file.line":62},"message":"Starting stats endpoint","ecs.version":"1.6.0"}
            {"log.level":"info","@timestamp":"2022-02-13T05:18:00.527Z","log.origin":{"file.name":"application/managed_mode.go","file.line":290},"message":"Agent is starting","ecs.version":"1.6.0"}
            {"log.level":"info","@timestamp":"2022-02-13T05:18:00.527Z","log.logger":"api","log.origin":{"file.name":"api/server.go","file.line":64},"message":"Metrics endpoint listening on: /usr/share/elastic-agent/state/data/tmp/elastic-agent.sock (configured: unix:///usr/share/elastic-agent/state/data/tmp/elastic-agent.sock)","ecs.version":"1.6.0"}
            {"log.level":"info","@timestamp":"2022-02-13T05:18:02.271Z","log.origin":{"file.name":"stateresolver/stateresolver.go","file.line":48},"message":"New State ID is rpa60MOt","ecs.version":"1.6.0"}
            {"log.level":"info","@timestamp":"2022-02-13T05:18:02.271Z","log.origin":{"file.name":"stateresolver/stateresolver.go","file.line":49},"message":"Converging state requires execution of 2 step(s)","ecs.version":"1.6.0"}
            {"log.level":"info","@timestamp":"2022-02-13T05:18:02.461Z","log.origin":{"file.name":"operation/operator.go","file.line":284},"message":"operation 'operation-install' skipped for apm-server.8.2.0-SNAPSHOT","ecs.version":"1.6.0"}
            {"log.level":"info","@timestamp":"2022-02-13T05:18:02.711Z","log.origin":{"file.name":"log/reporter.go","file.line":40},"message":"2022-02-13T05:18:02Z - message: Application: apm-server--8.2.0-SNAPSHOT[7265a07b-aba3-4df3-bff1-8cdd6118b1f2]: State changed to STARTING: Starting - type: 'STATE' - sub_type: 'STARTING'","ecs.version":"1.6.0"}
            {"log.level":"info","@timestamp":"2022-02-13T05:18:03.780Z","log.origin":{"file.name":"log/reporter.go","file.line":40},"message":"2022-02-13T05:18:03Z - message: Application: apm-server--8.2.0-SNAPSHOT[7265a07b-aba3-4df3-bff1-8cdd6118b1f2]: State changed to RUNNING: Running - type: 'STATE' - sub_type: 'RUNNING'","ecs.version":"1.6.0"}
    --- FAIL: TestFleetIntegrationBeatsMonitoring (22.03s)
     
    

Steps errors 1

Expand to view the steps failures

Run Linux tests
  • Took 14 min 50 sec . View more details here
  • Description: ./.ci/scripts/linux-test.sh

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /hey-apm : Run the hey-apm benchmark.

  • /package : Generate and publish the docker images.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

@axw axw marked this pull request as ready for review February 11, 2022 08:09
@axw
Copy link
Member Author

axw commented Feb 11, 2022

The integration tests are failing, so we'll need to update the package before merging this.

@axw axw requested a review from a team February 11, 2022 08:10
Copy link
Contributor

@simitt simitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change and handling of breaking change LGTM

@axw
Copy link
Member Author

axw commented Feb 13, 2022

/test

@axw
Copy link
Member Author

axw commented Feb 13, 2022

System test failure is unrelated.

@axw axw merged commit f599dbb into elastic:main Feb 13, 2022
@axw axw deleted the event-duration-ingest branch February 13, 2022 05:57
@stuartnelson3
Copy link
Contributor

confirmed with 772e8c1

  1. start elasticsearch/kibana
  2. start apm-server
  3. ingest spans/transactions with the following program:
package main

import (
	"context"
	"fmt"
	"os"
	"time"

	"go.elastic.co/apm"
)

func main() {
	version := "undefined"
	if len(os.Args) > 1 {
		version = os.Args[1]
	}
	name := fmt.Sprintf("apm-server-%s", version)

	tx := apm.DefaultTracer.StartTransaction(name, "type")
	ctx := apm.ContextWithTransaction(context.Background(), tx)
	span, ctx := apm.StartSpan(ctx, name, "type")
	span.Duration = 0
	span.End()
	tx.Duration = 0
	tx.End()
	<-time.After(time.Second)
	apm.DefaultTracer.Flush(nil)
	fmt.Printf("%s: %+v\n", name, apm.DefaultTracer.Stats())
}
  1. check the transaction and span documents have transaction.duration.us and span.duration.us in the console
GET traces-apm-default/_search
{
  "sort" : [{ "@timestamp" : "desc" }]
}
  1. repeat steps 3 and 4, modifying span.Duration and tx.Duration to be non-zero

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-skip Skip notification from the automated backport with mergify test-plan test-plan-ok v8.2.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants