Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add journald receiver #5160

Merged
merged 7 commits into from
Oct 13, 2021

Conversation

luckyj5
Copy link
Contributor

@luckyj5 luckyj5 commented Sep 8, 2021

Description: Add Journald receiver
Link to tracking Issue: #2332

Testing: Unit tests are included

Documentation: Journald receiver README.md is included

@luckyj5 luckyj5 requested a review from a team as a code owner September 8, 2021 07:47
@project-bot project-bot bot added this to In progress in Collector Sep 8, 2021
Copy link
Member

@djaglowski djaglowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new module looks fine, with a few comments to address.

However, I am not clear on the implications of adding journalctl to the dockerfile. Is this best approach? Is it wise to manage such a large set of specific files in this way? Are we sure this is the right set of files? Will this be a fragile pattern? At a minimum, I think we need a test that will fail if journald cannot run due to changes in the set of files that must be copied into the image.

We also have a challenge with the windows tests failing. opentelemetry-log-collection only builds the journald operator for linux builds. I would think we want a way to include this receiver only in linux builds, but I don't think we have such a mechanism in place for the collector. Any ideas on the best approach here?

receiver/journaldreceiver/go.mod Outdated Show resolved Hide resolved
receiver/journaldreceiver/README.md Outdated Show resolved Hide resolved
receiver/journaldreceiver/README.md Outdated Show resolved Hide resolved
receiver/journaldreceiver/testdata/config.yaml Outdated Show resolved Hide resolved
@luckyj5
Copy link
Contributor Author

luckyj5 commented Sep 8, 2021

The new module looks fine, with a few comments to address.

However, I am not clear on the implications of adding journalctl to the dockerfile. Is this best approach? Is it wise to manage such a large set of specific files in this way? Are we sure this is the right set of files? Will this be a fragile pattern? At a minimum, I think we need a test that will fail if journald cannot run due to changes in the set of files that must be copied into the image.

We also have a challenge with the windows tests failing. opentelemetry-log-collection only builds the journald operator for linux builds. I would think we want a way to include this receiver only in linux builds, but I don't think we have such a mechanism in place for the collector. Any ideas on the best approach here?

@djaglowski Thanks. PR is updated per your comments.
Regarding best approach for adding journalctl to the dockerfile, its added based on the discussion here #2332.

We'll look into it a bit more and share our thoughts. Thanks!

@gillg
Copy link
Contributor

gillg commented Sep 9, 2021

I like the global idea of this receiver I was searching for it some weeks ago and I fallbacked on syslog. Moreover reuse the sideproject for logs collection seems perfect for the consistency.
But I'm a little bit affraid about the difficult to maintain the docker image with journalctl installed by this way... We could have broken builds if the debian version change, incompatibilities between amd64 / arm64 builds, cross OS difficulties and so on.

Maybe an alternative could be to propose install systemd-journal-remote to the user like we need to install rsyslog to use syslog receiver ?
In this case instead of read local files with journalctl we call an HTTP endpoint (native journald proxy) to receive the journal content in a pre-translated format (json probably). Features of the HTTP endpoint are the same than journalctl.

@rockb1017
Copy link
Contributor

First, I would like to know how to not include journald receiver when building it for non-linux OS because it will only work for linux. Please let me know if you have a preferred way for it.

@rockb1017
Copy link
Contributor

rockb1017 commented Sep 10, 2021

But I'm a little bit affraid about the difficult to maintain the docker image with journalctl installed by this way... We could have broken builds if the debian version change, incompatibilities between amd64 / arm64 builds, cross OS difficulties and so on.

image
it has been updated 6 months ago and no changes since then. I think we will be fine since we use a specific version tag 10.8 not latest.

In this case instead of read local files with journalctl we bind an HTTP endpoint to receive the journal content in a pre-translated format (json probably).
what is sending journald logs to HTTP endpoint? I am not fan of requiring any operation on users host side to make it work.

@djaglowski
Copy link
Member

@rockb1017 I think we may want to do the following to the log-collection library: https://github.com/open-telemetry/opentelemetry-log-collection/pull/267/files

This should allow all operating systems to compile and pass tests, while still omitting the journald operator from non-linux builds. The result, in theory, is that users who try to use the journald receiver will get an error from the log-collection library, along the lines of "unknown operator 'journald'". (If we go this route, we may want to capture the error and wrap it in something clearer to the user.)

@rockb1017
Copy link
Contributor

@djaglowski Thanks for replying. I tried something along that approach but build still fails with this. receiver/journaldreceiver/journald.go:68:14: undefined: journald.NewJournaldInputConfig
it is because in internal/components/components.go, it calls journaldreceiver.NewFactory()

@djaglowski
Copy link
Member

djaglowski commented Sep 15, 2021

@rockb1017 I have opened an issue to discuss excluding components such as this, based on OS.

If this is not implemented, or not soon enough, I think the next best option may be to write a full "dummy" implementation for non-linux builds. Something like:

// build !linux

package journaldreceiver

import (
	"go.opentelemetry.io/collector/component"
	"go.opentelemetry.io/collector/config"
)

const typeStr = "journald"

// NewFactory creates a dummy factory.
func NewFactory() component.ReceiverFactory {
	return receiverhelper.NewFactory(
		typeStr,
		createDefaultConfig,
		receiverhelper.WithLogs(createLogsReceiver))
}

func createDefaultConfig() config.Receiver {
	return nil // or whatever dummy is necessary

func createLogsReceiver(
	_ context.Context,
	_ component.ReceiverCreateSettings,
	_ config.Receiver,
	_ consumer.Metrics,
) (component.MetricsReceiver, error) {
	return nil, fmt.Errorf("'journald' only supported on linux")
}

EDIT: This would seem to be the best current approach for "excluding" a component from a specific OS. It does not look like the builder will be changed to handle this.

@gillg
Copy link
Contributor

gillg commented Sep 15, 2021

Hello again.
I bring some new information with my strategy to avoid OS dependancies on OTEL collector build and docker image.
If you want expose journald in a readable format, journalctl is the way localy and journald-gateway is the way remotely.

My idea is :

  • We use a sidecar pattern container (called "journald-gateway" as exemple) built with something like that :
FROM debian:stable-slim

RUN apt update && apt install -y systemd-journal-remote && apt clean
# For RHEL : yum install systemd-journal-gateway

EXPOSE 19531
 
# Doc https://www.freedesktop.org/software/systemd/man/systemd-journal-gatewayd.service.html
ENTRYPOINT ["/lib/systemd/systemd-journal-gatewayd"]
# For RHEL : /usr/lib/systemd/systemd-journal-gatewayd

CMD ["--directory=/host/var/log/journal"]
receiver:
  journald:
    gateway:
      endpoint: http://journald-gateway:19531/
      units:  # can be used to filter the /entries http route

# directory and files parameters can be passed to the journald-gateway container at startup but not at otel-collector level

All documentation here https://www.freedesktop.org/software/systemd/man/systemd-journal-gatewayd.service.html#Supported%20URLs
We can eventualy have only one collector for multiple nodes if we don't want the sidecar pattern.

Probably, this change a lot the actual code but not the goal, and don't disturb otel-collector.

# This is required by the podmanreceiver.
RUN mkdir -p /rundir
RUN mkdir -p /tmp
RUN chown -R ${USER_UID}:${USER_UID} /rundir
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As already said, work on Dockerfile and introduce kind of complexity will probably introduce issues between containers runtimes and OS.
If tomorrow we provide an image for windows container (PR already exists) as an exemple, these steps will not be compatible, while just copying otelcontribcol is.

Copy link
Contributor

@rockb1017 rockb1017 Sep 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the dockerfile for windows would be a separate file yea? so it wouldn't affect the image for windows.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @gillg thanks. Just to clarify are you referring line 8-11 in the Dockerfile or changes related to journalctl?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not 100% sure if windows will be a different file. The image can be just an ARG and buildkit can create all flavors.
And my comment is pretty general but just trigger here last changes related to adaptations made in podman context.

@rockb1017
Copy link
Contributor

@gillg
Thanks for looking into it and providing alternatives.
I think it makes deployment complicated. in the k8s environment, all will be wrapped in a helm chart, so it doesn't matter much but for any other use case, it makes it complicated to collect journald logs.
Also when it is consuming from gateway, how would it track offset for checkpointing? is it possible to make sure it consumes journald logs without data loss and data duplication when pod gets restarted?

@gillg
Copy link
Contributor

gillg commented Sep 15, 2021

Also when it is consuming from gateway, how would it track offset for checkpointing? is it possible to make sure it consumes journald logs without data loss and data duplication when pod gets restarted?

Good point, you can use Range query parameter to start to a specific offset, you can put a cursor reference in a file or shared parameter. How do you deal it with journalctl ? Parameters are pretty the same.
And about dedup yes I don't have solution, but seems similar with journalctl if you start 2 container mounted on journal folder, no ?

@rockb1017
Copy link
Contributor

@gillg oh there is no checkpoint mechanism with journald input operator. I assumed there is.

@djaglowski
Copy link
Member

@gillg oh there is no checkpoint mechanism with journald input operator. I assumed there is.

Actually, there is. See references to cursor

@rockb1017
Copy link
Contributor

@djaglowski Awesome! yes it does have it. i missed that part. so we just need to enable storage extension just like filelog, yea?

@djaglowski
Copy link
Member

so we just need to enable storage extension just like filelog, yea?

Yes, it should work exactly the same.

@luckyj5 luckyj5 force-pushed the add-journald-receiver branch 3 times, most recently from 32d794a to c606e03 Compare September 27, 2021 22:58
@djaglowski
Copy link
Member

Please see comment here about using a stub receiver for non-linux OS.

@djaglowski
Copy link
Member

Also, the main branch was broken this morning, but was just fixed. Pulling in the update should resolve many of these failures.

@luckyj5 luckyj5 marked this pull request as ready for review October 5, 2021 18:11
COPY --from=prep /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certificates.crt
COPY otelcontribcol /
EXPOSE 4317 55680 55679
ENTRYPOINT ["/otelcontribcol"]
CMD ["--config", "/etc/otel/config.yaml"]
CMD ["--config", "/etc/otel/config.yaml"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a newline

cmd/otelcontribcol/Dockerfile Outdated Show resolved Hide resolved
@@ -0,0 +1,26 @@
## `Journald Receiver`

Parses Journald events using using the [opentelemetry-log-collection](https://github.com/open-telemetry/opentelemetry-log-collection) library.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rephrase as a complete sentence and remove the duplicate word.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Let me know if anything specific needs to be added. Thanks!


| Field | Default | Description |
| --- | --- | --- |
| `directory` | /run/log/journal | A directory containing journal files to read entries from |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The underlying operator will by default read from either /run/log/journal or /run/journal, whichever is present. Can you please note both here?

journald_input docs

receiver/journaldreceiver/README.md Outdated Show resolved Hide resolved
@@ -0,0 +1 @@
mode: atomic
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this file is supposed to be excluded?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thank you for pointing this. Removed it

logs:
receivers: [journald]
processors: [nop]
exporters: [nop]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Newline please

Copy link
Member

@djaglowski djaglowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The receiver code looks good to me, but there are two open questions in my mind:

  1. Are the changes to the Dockerfile acceptable? Should we consider adding a dedicated Dockerfile that enables this receiver? I'm not sure the if the implementation of that approach would be reasonable either. @tigrannajaryan, do you have an opinion on this?
  2. Can we have an integration test, and/or test(s) in the testbed? Perhaps take inspiration from the containerized integration tests in other receivers.

@dmitryax
Copy link
Member

dmitryax commented Oct 7, 2021

Are the changes to the Dockerfile acceptable? Should we consider adding a dedicated Dockerfile that enables this receiver? I'm not sure the if the implementation of that approach would be reasonable either. @tigrannajaryan, do you have an opinion on this?

I would suggest if we can remove dockerfile changes from this PR and make it clear in the docs that this receiver depends on journald to be available on the system.

We can handle containerized environments later -- I would lean towards having a dedicated Dockerfile. That part can probably go to https://github.com/open-telemetry/opentelemetry-collector-releases.

If this is an acceptable solution, I think we also need to add a graceful handling of journald absence, similar to how we handle Windows now.

@jpkrohling
Copy link
Member

I would suggest if we can remove dockerfile changes from this PR and make it clear in the docs that this receiver depends on journald to be available on the system.

+1, I don't think journald is even recommended to exist in containers...

@tigrannajaryan
Copy link
Member

I would suggest if we can remove dockerfile changes from this PR and make it clear in the docs that this receiver depends on journald to be available on the system.

+1.

@dmitryax I assume for the use case we have for journald (to be added to Helm Chart) this is not a problem? (I don't know, just want to make sure).

@gillg
Copy link
Contributor

gillg commented Oct 7, 2021

@tigrannajaryan @jpkrohling see my summarized proposal at the end of related issue.
Maybe we can document one or two solutions for users.

@dmitryax
Copy link
Member

dmitryax commented Oct 7, 2021

@dmitryax I assume for the use case we have for journald (to be added to Helm Chart) this is not a problem? (I don't know, just want to make sure).

@tigrannajaryan for now we can enhance docker image in our own distro since we do want to use journald by default

@djaglowski
Copy link
Member

Incorporating the conversation above, this is my understanding of what needs to be done on this PR:

  1. Changes to the Dockerfile should be reverted.
  2. The receiver will, in this initial iteration, depend upon the presence of journalctl, but should error gracefully if it is not present.
  3. A future PR can add an option to use journald-gateway, and possibly and associated Dockerfile.

@luckyj5
Copy link
Contributor Author

luckyj5 commented Oct 11, 2021

Incorporating the conversation above, this is my understanding of what needs to be done on this PR:

  1. Changes to the Dockerfile should be reverted.
  2. The receiver will, in this initial iteration, depend upon the presence of journalctl, but should error gracefully if it is not present.
  3. A future PR can add an option to use journald-gateway, and possibly and associated Dockerfile.

Thanks @djaglowski.

@rockb1017
Copy link
Contributor

rockb1017 commented Oct 11, 2021

Thanks for the review and the discussions.
Regarding graceful handling of missing journalctl, why should it be graceful error (warning)? we can let it fail the agent until user disable journald receiver input in the config (or add journalctl to host).
Anyways, like @luckyj5 said, it it should be implemented in log library, can we get this PR merged ?

@luckyj5
Copy link
Contributor Author

luckyj5 commented Oct 11, 2021

Update on unsuccessful checks:

Pipeline failures are not related to this PR.

Failed integration tests belongs to jmx receiver and were added by this pr
And build-and-test / loadtest (TestBallastMemory|TestLog10kDPS) test did fail on other PRs which are already merged.

@djaglowski
Copy link
Member

I think graceful handling of a missing journalctl primarily means that receiver fails in an expected manner, and that it returns an error message that sufficiently conveys the cause of the failure to the user.

The current the error message appears sufficient for this purpose, though perhaps could be improved upon in the future:

Error: cannot start receivers: start stanza: start journalctl: exec: "journalctl": executable file not found in $PATH
2021/10/11 20:20:42 collector server run finished with error: cannot start receivers: start stanza: start journalctl: exec: "journalctl": executable file not found in $PATH

Copy link
Member

@djaglowski djaglowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Collector automation moved this from In progress to Reviewer approved Oct 12, 2021
Co-authored-by: Daniel Jaglowski <jaglows3@gmail.com>
@djaglowski djaglowski added the ready to merge Code review completed; ready to merge by maintainers label Oct 12, 2021
@tigrannajaryan tigrannajaryan merged commit fab94ea into open-telemetry:main Oct 13, 2021
Collector automation moved this from Reviewer approved to Done Oct 13, 2021
hex1848 pushed a commit to hex1848/opentelemetry-collector-contrib that referenced this pull request Jun 2, 2022
…5160)

* change feature registry as a prototype to make test friendly

Signed-off-by: Kay Yan <kay.yan@daocloud.io>

* Change colTelemetry to work with a custom Registry, improve tests

Signed-off-by: Bogdan Drutu <bogdandrutu@gmail.com>

* Fix tests, windows, deprecate old API

Co-authored-by: Alex Boten <aboten@lightstep.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready to merge Code review completed; ready to merge by maintainers
Projects
No open projects
Collector
  
Done
Development

Successfully merging this pull request may close these issues.

None yet

8 participants