Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Pipeline registration and add default pipelines. #1258

Merged
merged 8 commits into from
Aug 13, 2018

Conversation

simitt
Copy link
Contributor

@simitt simitt commented Aug 3, 2018

implements #1032

This PR introduces pipeline registration.

The default pipeline registered in ES when registration.ingestion.pipeline.enabled=true is the User Agent Pipeline, depending on the Ingest User Agent Plugin. As this plugin is not pre-installed in a standard ES download, I disabled the plugin registration by default.
As registration.ingest.pipeline.enabled=true APM Server loads all pipeline definitions from ingest/pipeline/definition.json and sends them to the ES API endpoint.

Generally the pipeline registration works for the run and the setup cmd, by registering to the according libbeat callbacks.

depends on #1256 to be merged first [UPDATE: merged]
The changes in the vendor folder will be removed, once elastic/beats#7851 is merged. [UPDATE: removed, as #1256 was merged]

@simitt
Copy link
Contributor Author

simitt commented Aug 7, 2018

@elastic/apm-server anyone up for a review?

@@ -450,4 +450,3 @@ release: mage

.PHONY: snapshot
snapshot: mage
@SNAPSHOT=true mage package
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leftover?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes! All changes in _beats and vendor will be replaced once the updated libbeat code is merged in.

_meta/beat.yml Outdated
# Pipeline definitions are registered at the configured Elasticsearch instances.
# Registering pipelines therefore requires to have `output.elasticsearch` enabled and configured.
#
# If registering pipelines is disabled, you can still configure `output.elasticsearch.pipelines`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably because i never looked into pipelines, this comment is hard to understand for me: is the case that output.elasticsearch.pipelines is the same as register.ingest.pipeline for existing documents? or why do you mention it here?

_meta/beat.yml Outdated
@@ -374,8 +393,14 @@ output.elasticsearch:
when.contains:
processor.event: "onboarding"

# Optional ingest node pipeline. By default no pipeline will be used.
#pipeline: ""
# Ingest node pipelines are applied when writing documents to Elasticsearch.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reading this i am not clear if i need to uncomment #pipelines: - pipeline: "apm_user_agent" or not...

@simitt simitt force-pushed the 1032-default-pipelines branch 2 times, most recently from d6a6221 to 83a7b27 Compare August 8, 2018 08:14
@simitt
Copy link
Contributor Author

simitt commented Aug 8, 2018

Thanks for the review @jalvz , I updated the description.

)

func RegisterPipelines(esClient *es.Client, overwrite bool, path string) error {
pipelines, err := loadPipelinesFromJSON(path)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this path being resolved correctly with deb/yum/etc packaging?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, i saw now that you are packing the json file with mage 👍
do you want me to test that or are you confident with it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested it only with the tar package so far, but can do more testing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think eg linux packaging is interesting because assets would end in another directory/etc/something/something i presume?
lmk if you want any help testing that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you are up for testing - much appreciated, otherwise I'll do I'll later on.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, i can test that 👍

if overwrite || !exists {
_, _, err := esClient.Connection.CreatePipeline(p.Id, nil, p.Body)
if err != nil {
logp.Debug("pipelines", "Pipeline registration failed for %s.", p.Id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the deprecated logging package, should be logp.NewLogger(selector).Debugf...

# You can manually register pipelines, or use this configuration option to ensure
# pipelines are loaded and registered at the configured Elasticsearch instances.
# Automatic pipeline registration requires
# * `output.elasticsearch` to be enabled and configured.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for updating the docs, much better now!
so there are 2 dependent settings, in output.elasticsearch.pipelines you refer to apm-server.register.ingest.pipeline, but not the other way around.

would be great to add another line like

  • output.elasticsearch.pipelines contains "apm_user_agent" or similar

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I refer to them both ways, see

  • line 120: (2) applying a pipeline during data ingestion (see output.elasticsearch.pipelines)
  • line 400: # (1) ensure pipelines are registered in Elasticsearch, see apm-server.register.ingest.pipeline

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missed L120, sorry

@@ -13,6 +13,7 @@ https://github.com/elastic/apm-server/compare/6.4\...master[View commits]
=== Added

- Provide basic server information at `/` {pull}1197[1197]
- Add pipeline registration and pipeline usage {pull}[]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can update this now

@simitt
Copy link
Contributor Author

simitt commented Aug 8, 2018

jenkins, run package tests please

require.NoError(t, err)
esClient := &esClients[0]
_, current, _, _ := runtime.Caller(0)
path := filepath.Join(filepath.Dir(current), "..", "..", "ingest", "pipeline", "definition.json")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not important, but i'd try to reuse functions in tests/loader.go for these things


case mage.DMG:
mage.Packages = append(mage.Packages[:idx], mage.Packages[idx+1:]...)

default:
panic(errors.Errorf("unhandled package type: %v", pkgType))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why should packaging ever panic, a log is not enough?
if adding a default, would make sense to drop the case mage.DMG then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The case should not be hit, but if additional packages are added to the package.yml, where we don't have a case defined, then we would be notified with this.

@@ -519,3 +520,41 @@ def test_metric_doc(self):
expected_type = "scaled_float"
actual_type = mappings[self.index_name]["mappings"]["doc"]["system.process.cpu.total.norm.pct"]["mapping"]["pct"]["type"]
assert expected_type == actual_type, "want: {}, got: {}".format(expected_type, actual_type)


class PipelineRegisterTest(ElasticTest):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💯 🥇

Copy link
Contributor

@jalvz jalvz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(looks good to / works for) me, GJ!

@simitt
Copy link
Contributor Author

simitt commented Aug 9, 2018

Thanks @jalvz for testing the packages, I updated the file handling and tested it.

@simitt
Copy link
Contributor Author

simitt commented Aug 10, 2018

jenkins, test this please.

@simitt
Copy link
Contributor Author

simitt commented Aug 10, 2018

jenkins, retest this please

1 similar comment
@simitt
Copy link
Contributor Author

simitt commented Aug 10, 2018

jenkins, retest this please

@simitt
Copy link
Contributor Author

simitt commented Aug 13, 2018

@jalvz as I made several changes after your approval, could you quickly re-review please?

@@ -66,6 +66,13 @@ def setUp(self):
super(ServerSetUpBaseTest, self).setUp()
shutil.copy(self._beat_path_join("fields.yml"), self.working_dir)

pipeline_dir = os.path.join("ingest", "pipeline")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a comment about why are you doing this so we remember in six months would be nice 👍

@jalvz
Copy link
Contributor

jalvz commented Aug 13, 2018

i was up to date with everything except last commit, lgtm 👍
thanks for this

* Disable pipeline registration and application by default.
* If enabled, register default pipelines on APM Server `run` or `setup`.
* Add user_agent pipeline as default pipeline.
* Add system tests and log entries.
* Add pipeline definition to packaging.

implements elastic#1032
@simitt simitt merged commit fc51041 into elastic:master Aug 13, 2018
@simitt simitt deleted the 1032-default-pipelines branch August 21, 2018 12:43
@@ -507,7 +508,7 @@ def test_expvar_exists(self):
assert r.status_code == 200, r.status_code


class MetricsIntegrationTest(Test):
class MetricsIntegrationTest(ElasticTest):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice catch

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants