Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dockerfiles: Update the systemd libs used to build docker image #3177

Merged
merged 1 commit into from Oct 25, 2021

Conversation

yasn77
Copy link
Contributor

@yasn77 yasn77 commented Mar 5, 2021

The systemd libraries used in the docker image can not read the journal db in newer
versions of systemd. This specifically impacts users that run systemd 246 or
above on their host server:

  • systemd-journald gained support for zstd compression of large fields in
    journal files. The hash tables in journal files have been hardened against
    hash collisions. This is an incompatible change and means that journal files
    created with new systemd versions are not readable with old versions. If the
    $SYSTEMD_JOURNAL_KEYED_HASH boolean environment variable for
    systemd-journald.service is set to 0 this new hardening functionality may be
    turned off, so that generated journal files remain compatible with older
    journalctl implementations.

(Taken from https://github.com/systemd/systemd/blob/v246/NEWS)

This commit uses the backported version of the systemd libs, which are 247 for
Debian buster. Fixing it for most distributions.

It might be worth considering having a separate build stage for systemd, that
would give better flexibility to the version shipped in the image.


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change; please submit the following in a comment:

  • [N/A] Example configuration file for the change
  • [N/A] Debug log output from testing the change
  • [N/A] Attached Valgrind output that shows no leaks or memory corruption was found

Documentation

  • [N/A] Documentation required for this feature

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

@github-actions
Copy link
Contributor

github-actions bot commented Apr 8, 2021

This PR is stale because it has been open 45 days with no activity. Remove stale label or comment or this will be closed in 10 days.

@github-actions github-actions bot added the Stale label Apr 8, 2021
@yasn77
Copy link
Contributor Author

yasn77 commented Apr 8, 2021

@edsiper @fujimotos @koleini

Hi, is there any chance this can get looked at? I can rebase if it's something you would merge.

Thanks

@github-actions github-actions bot removed the Stale label Apr 9, 2021
@github-actions
Copy link
Contributor

This PR is stale because it has been open 45 days with no activity. Remove stale label or comment or this will be closed in 10 days.

@github-actions github-actions bot added the Stale label May 10, 2021
@yasn77
Copy link
Contributor Author

yasn77 commented May 11, 2021

Bump?

@github-actions github-actions bot removed the Stale label May 12, 2021
@yasn77
Copy link
Contributor Author

yasn77 commented Jun 11, 2021

Bumping again before it goes stale...

I am happy to look at the merge conflicts if I know this has a chance of being merged. But it's been hanging around for a while and still an issue with newer versions of systemd

@github-actions
Copy link
Contributor

This PR is stale because it has been open 45 days with no activity. Remove stale label or comment or this will be closed in 10 days.

@github-actions github-actions bot added the Stale label Jul 22, 2021
@george-angel
Copy link

remove stale

@github-actions github-actions bot removed the Stale label Jul 23, 2021
@Kuckkuck
Copy link

Any update on this? because new Flatcar images with zstd compression for journald logs are rolled out everywhere and fluent-bit is blind. ;-)

@ced0ps
Copy link

ced0ps commented Sep 8, 2021

Taking a look at the failing check it seems that it was a sporadically failing test which was reported here and fixed here

@yasn77
Copy link
Contributor Author

yasn77 commented Sep 8, 2021

I've rebased my branch with master. It would be really nice if someone can actually look at this.

@jonkerj
Copy link

jonkerj commented Sep 22, 2021

@edsiper (being commit champion at fluent), could you maybe look into this one? This PR is required to run fluent on recent container linux, which is quite common in Kubernetes-land. We are using @yasn77's branch for quite a while now, would love to see his work upstream.

@yasn77
Copy link
Contributor Author

yasn77 commented Oct 19, 2021

The commit message has been updated (+squashed) and signed.

Hope that's OK. I am still not sure why the integration step fails.

@patrick-stephens
Copy link
Contributor

I think you need to update the PR title as well

@yasn77 yasn77 changed the title Use updated systemd libs dockerfiles: Update the systemd libs used to build docker image Oct 19, 2021
@patrick-stephens
Copy link
Contributor

patrick-stephens commented Oct 19, 2021

It looks like it's assuming an image build for this PR so not sure if there has been an action change that happened which meant it has not been done, @agup006 ? It looks like other PRs have them skipped.

@agup006
Copy link
Member

agup006 commented Oct 22, 2021

I'm +1 to this, also adding @niedbalski to take a look as he maintains CI/CD/Docker

@niedbalski
Copy link
Collaborator

Hello @yasn77

The patch looks good to me.

  1. Could you please provide me with a configuration file that exposes this issue? I would like to add an integration test that validates the systemd journal input.

  2. "It might be worth considering having a separate build stage for systemd, that
    would give better flexibility to the version shipped in the image." -- Can you elaborate a bit more on this proposal?

Thanks,

@yasn77
Copy link
Contributor Author

yasn77 commented Oct 23, 2021

Hi @niedbalski

  1. Could you please provide me with a configuration file that exposes this issue? I would like to add an integration test that validates the systemd journal input.

I can't really provide a config file as it's related to the host running a newer version of systemd and not a fluentbit config issue. For example, you can try reading journalctl logs from a fluentbit image running on a flatcar host. Trying this will show that the Docker image doesn't pick up any of the logs.

  1. "It might be worth considering having a separate build stage for systemd, that
    would give better flexibility to the version shipped in the image." -- Can you elaborate a bit more on this proposal?

Sure, the current Docker build relies on systemd from Debian. Quite a few hosts running in container environments (such as Kubernetes), will be running distros that most likely will be using newer versions of systemd. My concern was that it's possible to have this issue come up again.

My thought was that it might be an idea to separately build systemd and pin the version, making it more flexible to keep up to date with systemd new features.

For most users, I suspect this might not be a big issue, but for users running Kubernetes with distros like flatcar, this issue leaves them blind to host logs (when using the container).

I suppose it's not a massive issue to pin systemd to Debian backports. That should be enough, so this PR should be fine. If I get time, I might open a PR to have systemd built from source... But TBH this PR has taken quite a while to be dealt with, I feel like it might not be worth it.

@patrick-stephens
Copy link
Contributor

I did also have a similar issue with my Openshift containers due to a security upgrade on the hosts but as I was building from source like you suggest it was just a case of triggering a new build.

@niedbalski niedbalski merged commit 3f99107 into fluent:master Oct 25, 2021
@jonkerj
Copy link

jonkerj commented Oct 25, 2021

fluent-devs, is it possible to cut out a release for this?

@niedbalski
Copy link
Collaborator

@jonkerj I guess you'll have to wait until the next milestone/tag release. In the meanwhile, you can use the
master branch images published here: https://hub.docker.com/layers/fluentbitdev/fluent-bit/x86_64-master/images/sha256-bb41f9517d2e49bdb36e0943f53822a59b64357265e5b69a5445bbb1bf9824fb?context=explore

@jonkerj
Copy link

jonkerj commented Jan 4, 2022

@niedbalski : there have been many releases since this PR was merged, but none of them seem to carry the fix. If we use the PR tag images like you suggest, they do not contain fixes introduced after that (such as #3905, which we face too). This makes it fairly convoluted to keep track with FB releases, as we have to cherry pick/build/upload every time.

Would it be possible to include this fix in the 1.8.x train?

@patrick-stephens
Copy link
Contributor

I agree this should be a simple one to backport although little concerned you then said you wanted other fixes as well. It might be easier just to submit a PR to the 1.8 branch with the changes you want so we can easily approve & merge then everything you need.

https://github.com/fluent/fluent-bit/tree/1.8

Currently master is targeting the next release (1.9) but is the default branch so we need another PR to then backport it. I might update the PR template to suggest that when people submit a PR they also consider one for the current release series (i.e. 1.8.X now).

I agree it's not always obvious that a merged change is not necessarily in the next release and I did try to suggest some improvements in this area at previous community meetings - be good to have some more suggestions or improvements to the docs/process too.

@jonkerj
Copy link

jonkerj commented Jan 4, 2022

@patrick-stephens that totally makes sense, thanks for elaborating this. My assumptions about the release process of FB were based on a linear model. I will create a PR to backport @yasn77's fixes to 1.8 as well.

I think a PR template would be great, as well as some explaning in CONTRIBUTING.md (you could steal some inspiration at elastic, they have a very complex branching system explained in their doc). I would have missed both, however, as I'm not the author of this PR, just a random stakeholder 😆

@patrick-stephens
Copy link
Contributor

@jonkerj great minds think alike - I've submitted a PR (#4565 ) to do exactly that. It's pretty basic for now but I'll have a look at the Elastic info if I can find the right bit or @agup006 may know already...

jonkerj pushed a commit to equinix-ms/fluent-bit that referenced this pull request Jan 5, 2022
…nt#3177)

The libraries created in the docker image can not read the journal db in newer
versions of systemd. This specifically impacts users that run systemd `246` or
above on their host server:

> * systemd-journald gained support for zstd compression of large fields in
> journal files. The hash tables in journal files have been hardened against
> hash collisions. This is an incompatible change and means that journal files
> created with new systemd versions are not readable with old versions. If the
> $SYSTEMD_JOURNAL_KEYED_HASH boolean environment variable for
> systemd-journald.service is set to 0 this new hardening functionality may be
> turned off, so that generated journal files remain compatible with older
> journalctl implementations.

(Taken from https://github.com/systemd/systemd/blob/v246/NEWS)

This commit uses the backported version of the systemd libs, which are `247` for
Debian buster. Fixing it for most distributions.

It might be worth considering having a separate build stage for systemd, that
would give better flexibility of the version shipped in the image.

Signed-off-by: Yasser Saleemi <yassersaleemi@gmail.com>
niedbalski pushed a commit that referenced this pull request Jan 12, 2022
* dockerfiles: Update the systemd libs used to build docker image

This commit is originally from @yasn77, but I had to sign it myself in
order to pass DCO checks. Commit message below is originally theirs.

The libraries created in the docker image can not read the journal db in newer
versions of systemd. This specifically impacts users that run systemd `246` or
above on their host server:

> * systemd-journald gained support for zstd compression of large fields in
> journal files. The hash tables in journal files have been hardened against
> hash collisions. This is an incompatible change and means that journal files
> created with new systemd versions are not readable with old versions. If the
> $SYSTEMD_JOURNAL_KEYED_HASH boolean environment variable for
> systemd-journald.service is set to 0 this new hardening functionality may be
> turned off, so that generated journal files remain compatible with older
> journalctl implementations.

(Taken from https://github.com/systemd/systemd/blob/v246/NEWS)

This commit uses the backported version of the systemd libs, which are `247` for
Debian buster. Fixing it for most distributions.

It might be worth considering having a separate build stage for systemd, that
would give better flexibility of the version shipped in the image.

Signed-off-by: Yasser Saleemi <yassersaleemi@gmail.com>
Signed-off-by: Jorik Jonker <jorik.jonker@eu.equinix.com>

* dockeriles: added missing dependency for libatomic.so.1 (#4405)

Signed-off-by: Patrick Stephens <pat@calyptia.com>

Signed-off-by: Jorik Jonker <jorik.jonker@eu.equinix.com>

* dockerfiles: resolve ARM64 build issues (#4408)

Signed-off-by: Patrick Stephens <pat@calyptia.com>

Signed-off-by: Jorik Jonker <jorik.jonker@eu.equinix.com>

Co-authored-by: Yasser Saleemi <yasser@giantswarm.io>
Co-authored-by: Pat <patrick-stephens@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants