Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Automatically fetch headers for Google's Container OS linux header install via initContainer #48

Merged
merged 5 commits into from
Mar 10, 2019

Conversation

dalehamel
Copy link
Member

@dalehamel dalehamel commented Jan 27, 2019

This should be reasonably ready for review, and should fix #47

As of now, the current status is:

  • This works! I can run a test script on container OS now.
  • It appears to be able to successfully download and generate headers to a shared location, and this is reasonably fast, subsequent loads don't have any noticeable overhead from the initContainer as I had hoped
  • When headers are already installed, the initContainer exists almost immediately and the tracer can start
  • Headers will be read unconditionally from a cached path at /var/cache/linux-headers, which should dependably be writeable (see https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch05s05.html). Since we are literally just caching the headers, this should be fine
  • For hosts where headers are already installed, this will symlink to those.
  • For hosts missing headers, they will be downloaded here and the symlink will point to that instead.

This may actually make #7 not needed / "fix" #7, as the approach can just be to ensure that headers are always pointed to by this symlink.

Work still to do:

  • Modify tracer container to use custom header path if specified I went with a different approach
  • Verify that this works end-to-end
  • Cleanups in preparation for merge.

I would like if someone else could test this against another environment where headers are already present, to ensure that it uses the headers at /lib/modules.host, as I haven't tested this codepath. Just running a trace should be sufficient to prove that this works, but examining the symlinks in /var/cache/linux-headers/modules_dir would show this conclusively. My main concern is that this fixes my use-case, and breaks for other users.

If there are hosts where /var/cache is not writeable (which would frankly baffle me), then this will break kubectl-trace for those users.

@fntlnz can you give this a review, and help me out by QA'ing against any environments you think might make sense? I would hate to have this be merged and break kubectl trace for other users in a way I hadn't anticipated :)

*"Container-Optimized OS"*)
install_cos_linux_headers
;;
*)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of just warning, a future version could check if it is apparent that headers exist.

If there are no headers, then a generic handling of checking kernel version via uname, and fetching from kernel.org or some other kernel mirror would be a good fallback.

For distros like ubuntu, arch, fedora, etc. if the administrator hasn't installed the kernel headers package, we could also try to do something clever by fetching the headers directly. for them here, as they should be available as binary archives (and even though alpine won't have access to the host's package manager, it should be able to determine how to fetch these archives directly from a package mirror easily enough, as these packages are basically just fancy tarballs anyways).

Until that becomes a practical issue for someone though, I don't see a reason to implement it and add the complexity.

@dalehamel dalehamel changed the title WIP Container OS linux header install via initContainer Container OS linux header install via initContainer Jan 27, 2019
@dalehamel
Copy link
Member Author

@fntlnz should be ready for a serious review if you have a chance

@dalehamel dalehamel changed the title Container OS linux header install via initContainer feat: Automatically fetch headers for Google's Container OS linux header install via initContainer Jan 27, 2019
@fntlnz fntlnz self-requested a review January 28, 2019 08:53
Copy link
Member

@fntlnz fntlnz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks shiny, I need two more eyes from @leodido and I'm testing this on GKE before merge.

@fntlnz
Copy link
Member

fntlnz commented Jan 28, 2019

I'm very thankful you did this @dalehamel - one of the main goals of this project is to address every Linux machine in a kubernetes cluster and this work here sets the foundations to let us support more and more different cases.

@dalehamel
Copy link
Member Author

one of the main goals of this project is to address every Linux machine in a kubernetes cluster

Yeah there are approaches that download headers directly from kernel.org, that could be used for other cases where headers are missing.

Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found some fixes!

P.S. share your ideas, feedbacks or issues with us at https://github.com/fixmie/feedback (this message will be removed after the beta stage).

pkg/version/version.go Outdated Show resolved Hide resolved
@dalehamel
Copy link
Member Author

updated my branch to add resource requests to the initcontainer, since #50 was merged.

I also noticed some whitespace errors I introduced with that one, fixed those here - sorry about that!

@leodido
Copy link
Member

leodido commented Jan 28, 2019

I've just tried this and it works for me. PTAL @fntlnz

@leodido leodido self-requested a review January 28, 2019 19:50
leodido
leodido previously approved these changes Jan 28, 2019
Copy link
Member

@leodido leodido left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🎉

p.s.: thanks for the whitespaces thing I did not noticed in the other PR review :)

pkg/version/version.go Outdated Show resolved Hide resolved
@dalehamel
Copy link
Member Author

@leodido thanks for the review - i'll fix the comment issue shortly.

I also saw a comment briefly about the image hosting being under my quay namespace, I agree it shouldn't be. I think both images should be moved to iovisor. I think we can probably handle that after merging this though, I'm not a member of iovisor org though, so it'll have to be someone who is.

The initContainer also doesn't have an automated build as part of CI, so whoever does that could just make sure both images are built and published to the right place automatically - maybe we should file a separate issue for that?

@fntlnz
Copy link
Member

fntlnz commented Jan 30, 2019

@dalehamel I was thinking about making this step optional, and having a short parameter that will enable the initContainer. Wdyt?

@fntlnz
Copy link
Member

fntlnz commented Jan 30, 2019

@dalehamel yes we need to address ci for those new images, we can do that in a separate PR, that's just ok to merge this first

@dalehamel
Copy link
Member Author

I was thinking about making this step optional, and having a short parameter that will enable the initContainer. Wdyt?

Would be good for compatibility purposes / for cases where users can dependably mount the headers from their host system, but we'd need to adjust this PR in some way, as the symlink that it creates over /lib/modules is created by the initContainer, so it's currently coupled.

@fntlnz
Copy link
Member

fntlnz commented Jan 31, 2019

What do you think about adding a flag to make the init container optional @leodido ?

@dalehamel I'm not yet sure I want to do that, I'm trying to think about different situations and asking as many users I can reach what they think about this. Also I want to make you aware of the fact that I'm at a company retreat this week and I'm having problems in keeping up with all the awesome things you are doing because I don't have many time available. So expect some delays until Monday but feel free to keep doing what you have in mind, I will find time to give feedback but I can't personally do proper code reviews now.

@dalehamel
Copy link
Member Author

dalehamel commented Jan 31, 2019

I'm not yet sure I want to do that

Yeah, this was my main hesitation with the PR, I think gating it as a default-disabled flag makes sense. However, if it's a flag to enable/disable, we can make the same flag toggle back to the default behavior of detecting from /lib/modules.

Related to #53, it would be nice to be able to toggle this with both an environment variable and command line flag.

So expect some delays until Monday but feel free to keep doing what you have in mind

No worries on the delay, but I was a little worried about overwhelming you with stuff to review so thanks for the confirmation.

I also did a bit of a deep dive on the bpftrace issue for pid namespaces and hit a bit of a brick wall. Would be good to sync up with you on that to figure out what the options are until BTF is available, as being able to trace (just) a specific pod (#33) is a bit of a "holy grail" for my use case.

@fntlnz
Copy link
Member

fntlnz commented Jan 31, 2019

@dalehamel yes we need to be able to change between the two modes, the flag would be the best way to do that while using kubectl trace to change the behavior on the fly while the environment variable would be the way to explicitly set a behavior as a default for this machine , like hey I always want to use the init container and this specific image with it.

I'm glad to hear that the pid namespace feature is important for you too, that's really what we wanted to achieve since the begging and it's something I need so bad too so feel free to continue exploring and working on it while I'm away this week and after we can find a way to do that together as a group.

@dalehamel
Copy link
Member Author

dalehamel commented Feb 1, 2019

I like @leodido's suggestion for --fetch-headers from #53, and probably a ENV['KUBECTL_TRACE_FETCH_HEADERS'] or similar to toggle this.

That also makes it clearer what this PR should do that it doesn't now:

  • If the user doesn't specify --fetch-headers, then the behavior should be to only look at /lib/modules (and perhaps in a future version, a path specified by an env var as in Support setting the kernel headers path via BPFTRACE_KERNEL_SOURCE #7 )
  • The script should add a handling for the generic case, where it just fetches kernel sources from a reputable mirror like kernel.org, so that --fetch-headers will work with more than container OS.

@leodido
Copy link
Member

leodido commented Feb 1, 2019 via email

@fntlnz
Copy link
Member

fntlnz commented Feb 7, 2019

Ok good, I think that the solution we ended up with is looking good.

@dalehamel I just merged the PR that introduces flags, #55 @leodido also added flags for the init container and fetch headers that are not used right now but are solely made for your pr here.

In the master now I changed the init container image to be under fntlnz, I'm talking with @drzaeus77 to have an iovisor owned registry to stop using my quay account.

So I would say, test your changes using the flags but when you rebase make sure you leave my image as default so that the CI can push it on a merge on master and in branches from members.

Also @dalehamel , good job here, thank you!

@dalehamel dalehamel force-pushed the cos-header-install branch 2 times, most recently from 9730065 to 71eb93c Compare March 6, 2019 16:01
@dalehamel
Copy link
Member Author

@fntlnz @leodido sorry for the long turnaround time 😅 should be ready for a review now though!

I've rebased against master and made use of the --fetch-headers flag.

The default behavior should be unchanged, it would be great if someone could verify this for me in their own environment.

Currently you need both --fetch-headers and --init-imagename for this to work, as the default init image doesn't exist. In a follow-up PR, we should probably update to point to images in the iovisor namespace, and ensure the init image is built and pushed there.

In my test, this worked as expected:

kubectl trace run NODE_NAME -f tcpconnect.bt --fetch-headers --init-imagename=quay.io/dalehamel/kubectl-trace-init:latest

},
})

job.Spec.Template.Spec.Containers[0].VolumeMounts = append(job.Spec.Template.Spec.Containers[0].VolumeMounts,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we ever have more than one container, this will lead to some "fun" behavior, but I don't think that is likely to occur.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, you're right.

BTW, not something to approach in this PR imho. :)

HEADERS_TARGET=${SOURCES_DIR}
;;
*)
echo "WARNING: ${distro} is not a supported distro, cannot install headers, ensure they are installed to /lib/modules"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll follow up with another patch to try a generic approach to downloading headers once this is merged, rather than just warning and bailing

@leodido
Copy link
Member

leodido commented Mar 6, 2019 via email

@leodido leodido self-requested a review March 7, 2019 23:51
Copy link
Member

@leodido leodido left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just tried it. Looks good to me 🎉

@leodido
Copy link
Member

leodido commented Mar 7, 2019

P.S.: I'm going to push images to iovisor quay org soon, and then opening a little PR for that. Ideally we should rebase this on top of that.

@dalehamel
Copy link
Member Author

dalehamel commented Mar 7, 2019 via email

@leodido leodido self-requested a review March 8, 2019 01:18
leodido
leodido previously approved these changes Mar 8, 2019
Copy link
Member

@leodido leodido left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just made a PR (#65) to use iovisor as the image's org.

I also added some suggestion here so to build the image here with the right name.

Please rebase this when #65 is merged in.

Makefile Show resolved Hide resolved
Makefile Show resolved Hide resolved
@dalehamel
Copy link
Member Author

thanks @leodido I tested and committed your changes. I'll happily rebase once #65 is done, hopefully we can merge this next week :)

@fntlnz
Copy link
Member

fntlnz commented Mar 9, 2019

This looks shiny @dalehamel ! #65 had been merged when you rebase I'll do another review then I think we will be good to go! 🦝

@dalehamel
Copy link
Member Author

@fntlnz rebased and tested locally, I had to override the image name since nothing has been pushed to quay.io/iovisor/kubectl-trace-init:latest yet, but otherwise it looks good.

@fntlnz fntlnz self-requested a review March 10, 2019 18:03
Copy link
Member

@fntlnz fntlnz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🥇

@fntlnz
Copy link
Member

fntlnz commented Mar 10, 2019

Good job @dalehamel thanks

@fntlnz fntlnz merged commit 03c1a99 into iovisor:master Mar 10, 2019
@dalehamel dalehamel deleted the cos-header-install branch March 11, 2019 00:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Doesn't work with google container os (GKE)
3 participants