Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8264482: container info misleads on non-container environment #3280

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

@YaSuenag
Copy link
Member

@YaSuenag YaSuenag commented Mar 31, 2021

hs_err log and VM.info dcmd shows cgroup information as container information even though the process run on non-container environment as following.

container (cgroup) information:
container_type: cgroupv2
cpu_cpuset_cpus: not supported
cpu_memory_nodes: not supported
active_processor_count: 4
cpu_quota: not supported
cpu_period: not supported
cpu_shares: not supported
memory_limit_in_bytes: unlimited
memory_and_swap_limit_in_bytes: unlimited
memory_soft_limit_in_bytes: unlimited
memory_usage_in_bytes: 164163584
memory_max_usage_in_bytes: not supported

We can use cgroup outside of container, so it is useful to show. However cgroup is different from container. We should distinguish them.
And also it is useful if we can see container runtime in this section. So I added it. We can see following contents in this section after this change.

cgroup information:
cgroup_type: cgroupv2
container runtime: podman
cpu_cpuset_cpus: not supported
cpu_memory_nodes: not supported
active_processor_count: 4
cpu_quota: not supported
cpu_period: not supported
cpu_shares: not supported
memory_limit_in_bytes: unlimited
memory_and_swap_limit_in_bytes: unlimited
memory_soft_limit_in_bytes: unlimited
memory_usage_in_bytes: 256176128
memory_max_usage_in_bytes: not supported

In case of systemd, it checks PID (PID 1 or not) and $container in PID 1. We should check them to know the JVM runs on the container or not.

https://github.com/systemd/systemd/blob/68337e55f62cf49b7bdfb73dc5662e23b0ea17fa/src/basic/virt.c#L619


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8264482: container info misleads on non-container environment

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/3280/head:pull/3280
$ git checkout pull/3280

Update a local copy of the PR:
$ git checkout pull/3280
$ git pull https://git.openjdk.java.net/jdk pull/3280/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 3280

View PR using the GUI difftool:
$ git pr show -t 3280

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/3280.diff

@bridgekeeper
Copy link

@bridgekeeper bridgekeeper bot commented Mar 31, 2021

👋 Welcome back ysuenaga! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

@openjdk openjdk bot commented Mar 31, 2021

@YaSuenag The following label will be automatically applied to this pull request:

  • hotspot-runtime

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@YaSuenag YaSuenag marked this pull request as ready for review Mar 31, 2021
@openjdk openjdk bot added the rfr label Mar 31, 2021
@mlbridge
Copy link

@mlbridge mlbridge bot commented Mar 31, 2021

Webrevs

@jerboaa
Copy link
Contributor

@jerboaa jerboaa commented Mar 31, 2021

I'm a bit nervous about the container_runtime addition. Can this be done in a separate bug, please? What testing has been done? Has this been tested with docker and podman? Did you run tests in test/hotspot/jtreg/containers ?

@@ -68,6 +70,29 @@ void OSContainer::init() {

_is_containerized = true;

if (getpid() == 1) {
// This process is in container
_runtime = os::strdup_check_oom(getenv("container"));

This comment has been minimized.

@jerboaa

jerboaa Mar 31, 2021
Contributor

In my testing this shows oci:

$ podman run --rm -ti fedora:33
[root@2322a30ef7cd /]# echo $container
oci

So I'm not sure this will be very helpful. Systemd does some fairly involved translation:
https://github.com/systemd/systemd/blob/68337e55f62cf49b7bdfb73dc5662e23b0ea17fa/src/basic/virt.c#L677

Those heuristics will involve a partial implementation of https://bugs.openjdk.java.net/browse/JDK-8261242

Also consider that there are multiple container runtimes when podman is in use (I don't know about docker). For example crun and runc. In a way, container runtime then becomes ambiguous too.

This comment has been minimized.

@YaSuenag

YaSuenag Mar 31, 2021
Author Member

I think it is ok if we can know the process was run on a container. It is better to know what runtime was used if possible - it might not be mandatory. So I think we can refer /run/.containerenv and /.dockerenv now like systemd. It might not be enough, but it would be acceptable.

JDK-8261242 has been assigned to Harold, so I will not start to work for it now.
(I can work for it of course if you agree with the above.)

@mlbridge
Copy link

@mlbridge mlbridge bot commented Mar 31, 2021

Mailing list message from David Holmes on hotspot-runtime-dev:

On 31/03/2021 7:51 pm, Yasumasa Suenaga wrote:

hs_err log and `VM.info` dcmd shows cgroup information as container information even though the process run on non-container environment as following.

container (cgroup) information:
container_type: cgroupv2
cpu_cpuset_cpus: not supported
cpu_memory_nodes: not supported
active_processor_count: 4
cpu_quota: not supported
cpu_period: not supported
cpu_shares: not supported
memory_limit_in_bytes: unlimited
memory_and_swap_limit_in_bytes: unlimited
memory_soft_limit_in_bytes: unlimited
memory_usage_in_bytes: 164163584
memory_max_usage_in_bytes: not supported

We can use cgroup outside of container, so it is useful to show. However cgroup is different from container. We should distinguish them.
And also it is useful if we can see container runtime in this section. So I added it. We can see following contents in this section after this change.

cgroup information:
cgroup_type: cgroupv2
container runtime: podman
cpu_cpuset_cpus: not supported
cpu_memory_nodes: not supported
active_processor_count: 4
cpu_quota: not supported
cpu_period: not supported
cpu_shares: not supported
memory_limit_in_bytes: unlimited
memory_and_swap_limit_in_bytes: unlimited
memory_soft_limit_in_bytes: unlimited
memory_usage_in_bytes: 256176128
memory_max_usage_in_bytes: not supported

In case of systemd, it checks PID (PID 1 or not) and `$container` in PID 1. We should check them to know the JVM runs on the container or not.

https://github.com/systemd/systemd/blob/68337e55f62cf49b7bdfb73dc5662e23b0ea17fa/src/basic/virt.c#L619

Our container support is based around cgroups. The actual containers are
still too ad-hoc to reliably interact with. I would not want to see this
additional code added at startup up time, but rather the container
environment should be interrogated when the information is desired. It
really bugs me that there are no (even informally) standardized API's
around containers and we have to provide custom support to deal with
each of them.

David
-----

@YaSuenag
Copy link
Member Author

@YaSuenag YaSuenag commented Mar 31, 2021

Thanks @jerboaa and @dholmes-ora for the comment!
I haven't aware JDK-8261242, so I will remove the code to set container runtime name.

The reason why I sent this PR is I found "container_type" in hs_err log even though I did not run JVM on the container. It was confising. I agree to check cgroups configuration, but we should not use "container" for them. cgroups is not only for containers.

Did you run tests in test/hotspot/jtreg/containers ?

I haven't run them yet. I attempted to fix OSContainer::is_containerized() as JDK-8261242 said, but I do not have environment to run test/hotspot/jtreg/containers test. So I fixed UL output only, and confirmed it does not affect their tests.

@YaSuenag
Copy link
Member Author

@YaSuenag YaSuenag commented Apr 1, 2021

I removed container runtime detection code - this PR changes log / hs_err / VM.info message only at result.
How about this? please review it.

@YaSuenag
Copy link
Member Author

@YaSuenag YaSuenag commented Apr 2, 2021

Commit 8fd9b3f seems to fail on macOS x64 (hs/tier1 compiler) tests, but I think it is not caused by this change because I modified Linux code only.

@gerard-ziemski
Copy link

@gerard-ziemski gerard-ziemski commented Apr 5, 2021

Commit 8fd9b3f seems to fail on macOS x64 (hs/tier1 compiler) tests, but I think it is not caused by this change because I modified Linux code only.

I'll check it out on my local macOS machine and report back...

@gerard-ziemski
Copy link

@gerard-ziemski gerard-ziemski commented Apr 5, 2021

Commit 8fd9b3f seems to fail on macOS x64 (hs/tier1 compiler) tests, but I think it is not caused by this change because I modified Linux code only.

I'll check it out on my local macOS machine and report back...

On my local mac with the latest jdk+PR3280 I see all tests pass:

==============================
Test summary
==============================
   TEST                                              TOTAL  PASS  FAIL ERROR   
   jtreg:test/hotspot/jtreg:tier1_compiler             783   783     0     0   
==============================
TEST SUCCESS

@YaSuenag
Copy link
Member Author

@YaSuenag YaSuenag commented Apr 5, 2021

Thanks @gerard-ziemski !
I'm waiting for reviewers...

Copy link
Member

@dholmes-ora dholmes-ora left a comment

Sorry Yasumasa but I don't agree with this. OSContainer is supposed to be the abstraction representing a containerized environment. Calling code doesn't know (nor care) what the underlying mechanism is - maybe it is cgroups, maybe it isn't. If there is an issue with reporting this stuff when not actually in a container then the fault lies in is_containerized() IMHO.

David

@YaSuenag
Copy link
Member Author

@YaSuenag YaSuenag commented Apr 6, 2021

If there is an issue with reporting this stuff when not actually in a container then the fault lies in is_containerized() IMHO.

To resolve this problem, I think JDK-8261242 is filed. But the discussion does not seem to active... it might take a lot of time to fix because there might not be clear solution to detect the process runs on the container.

As I said in before, I want to remove "container" from log output because it makes to mislead. Especially next JDK release is LTS, so the code will use a long time.
As an option, we can move logging code into CgroupSubsystem, but it might not be better...

@mlbridge
Copy link

@mlbridge mlbridge bot commented Apr 6, 2021

Mailing list message from David Holmes on hotspot-runtime-dev:

On 6/04/2021 12:31 pm, Yasumasa Suenaga wrote:

On Tue, 6 Apr 2021 01:38:50 GMT, David Holmes <dholmes at openjdk.org> wrote:

If there is an issue with reporting this stuff when not actually in a container then the fault lies in is_containerized() IMHO.

To resolve this problem, I think [JDK-8261242](https://bugs.openjdk.java.net/browse/JDK-8261242) is filed. But the discussion does not seem to active... it might take a lot of time to fix because there might not be clear solution to detect the process runs on the container.

As I said in before, I want to remove "container" from log output because it makes to mislead. Especially next JDK release is LTS, so the code will use a long time.
As an option, we can move logging code into `CgroupSubsystem`, but it might not be better...

If you are actually running under a cgroup and have resource limits
applied then aren't you effectively running in a "container"? its just
not a container that has been productized like Docker et al.

David

@YaSuenag
Copy link
Member Author

@YaSuenag YaSuenag commented Apr 6, 2021

If you are actually running under a cgroup and have resource limits
applied then aren't you effectively running in a "container"?

No, we can see container logs even if the process is not limited by cgroups. I (and maybe JDK-8261242) want to say it is a problem.

HotSpot detects whether it is run on container in following code. It checks whether specified directories exist.

const char* proc_cgroups = "/proc/cgroups";
const char* proc_self_cgroup = "/proc/self/cgroup";
const char* proc_self_mountinfo = "/proc/self/mountinfo";
bool valid_cgroup = determine_type(cg_infos, proc_cgroups, proc_self_cgroup, proc_self_mountinfo, &cg_type_flags);

According to cgoups manpage, /proc/cgroups seems to generated by kernel, so we cannot use them to detect.
I think we should fix like systemd, and JDK-8261242 has been discussed on to do so.

@jerboaa
Copy link
Contributor

@jerboaa jerboaa commented Apr 6, 2021

@YaSuenag Unfortunately that's the status quo, yes. On a plain linux system without cgroup limits for the java process, but with the cgroup subsystem mounted, it'll report some cgroup (container) metrics (same as the host values, though). What I'm unsure about is whether or not this is a problem worth fixing. The main idea is abstraction from cgroups in the container code. It does say container (cgroup) information in the heading.

@YaSuenag
Copy link
Member Author

@YaSuenag YaSuenag commented Apr 6, 2021

What I'm unsure about is whether or not this is a problem worth fixing.

I think fundamental solutions will be implemented in JDK-8261242, but It seems to be a difficult task.

The main idea is abstraction from cgroups in the container code. It does say container (cgroup) information in the heading.

I know that, but "container_type" label is in hs_err / VM.info log. In addition, UL reports some metrics as "container" in spite of cgroups.
If it takes a long time to fix JDK-8261242, I want to fix log messages for cgroups for next LTS release at least. "container" messages may make the customer do extra work to collect / analyzer their system.

@mlbridge
Copy link

@mlbridge mlbridge bot commented Apr 7, 2021

Mailing list message from David Holmes on hotspot-runtime-dev:

On 6/04/2021 10:09 pm, Yasumasa Suenaga wrote:

On Tue, 6 Apr 2021 10:06:18 GMT, Severin Gehwolf <sgehwolf at openjdk.org> wrote:

What I'm unsure about is whether or not this is a problem worth fixing.

I think fundamental solutions will be implemented in JDK-8261242, but It seems to be a difficult task.

The main idea is abstraction from cgroups in the container code. It does say `container (cgroup) information` in the heading.

I know that, but "container_type" label is in hs_err / VM.info log. In addition, UL reports some metrics as "container" in spite of cgroups.
If it takes a long time to fix JDK-8261242, I want to fix log messages for cgroups for next LTS release at least. "container" messages may make the customer do extra work to collect / analyzer their system.

Then can we compromise and just say "container/cgroup" everywhere?

David

@YaSuenag
Copy link
Member Author

@YaSuenag YaSuenag commented Apr 7, 2021

Then can we compromise and just say "container/cgroup" everywhere?

How about following change?
It is easy to add new resource controller other than cgroups in future.

// get_resource_container_name() will return "cgroups" or "cgroupsV2"
const char *resource_controller_name = OSContainer::get_resource_container_name();

log_debug(os, container)("%s memory limit %s: " JLONG_FORMAT ", using host value",
                       resource_controller_name,
                       mem_limit == OSCONTAINER_ERROR ? "failed" : "unlimited", mem_limit);
@YaSuenag
Copy link
Member Author

@YaSuenag YaSuenag commented Apr 10, 2021

I pushed new commit to use resource controller name to show both UL and VM.info.
In UL, resource controller name ("cgroupsv1" or "cgroupsv2") will be added to UL message, but the log is tagged os+container, so the user aware it might relates to the container, and they can know it says about cgroups.

It has tested with test/hotspot/jtreg/containers jtreg tests on Linux x64.

@bridgekeeper
Copy link

@bridgekeeper bridgekeeper bot commented May 8, 2021

@YaSuenag This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
4 participants