[wip] Added containerId to etc/hosts so as to resolve hostname. Closes #4446 #4816

Closed
wants to merge 2 commits into from

SaurabhAhuja1983

What type of PR is this?

#4446

/kind api-change
/kind bug
/kind cleanup
/kind deprecation
/kind design
/kind documentation
/kind failing-test
/kind feature
/kind flake
/kind other

What this PR does / why we need it:

To fix Java builds. The hostname (containerId) could not be resolved.

How to verify it

Run the script provided here before and after the fix.
#4446 (comment)

Which issue(s) this PR fixes:

#4446

Fixes #4446

Special notes for your reviewer:

I would appreciate any feedback; this is my first commit to buildah.

Does this PR introduce a user-facing change?

No


…ainers#4446

Signed-off-by: Saurabh Ahuja <nsit.saurabh@gmail.com>
@openshift-ci
Contributor

openshift-ci bot commented May 23, 2023

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: SaurabhAhuja1983
Once this PR has been reviewed and has the lgtm label, please assign cevich for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@TomSweeneyRedHat
Member

@SaurabhAhuja1983 Thanks for the PR! A couple of things. First, how did the file in the ./vendor directory change? If you did so "by hand", that won't work. The next time common is vendored in, it will be overwritten. That fix, if it is not already, needs to be made in the containers/common project and then vendored into Buildah.

Second, you did not add a test for this change. I think that's OK, but it would be best to add one if possible. If not, please add the tag [NO NEW TESTS NEEDED] to the description text. Otherwise, our CI will never let this in.

@SaurabhAhuja1983
Author

SaurabhAhuja1983 commented May 24, 2023

Thank you @TomSweeneyRedHat for the feedback.

First, how did the file in the ./vendor directory change? If you did so "by hand", that won't work.

Yes, I did it by hand.

That fix, if it is not already, needs to be made in the containers/common project and then vendored into Buildah.

Alright. Let me read more about it, and I will make the change in the containers/common project.

Second, you did not add a test for this change. I think that's OK, but it would be best to add one if possible. If not, please add the tag [NO NEW TESTS NEEDED] to the description text. Otherwise, our CI will never let this in.

I will go through the code once again and try to add a test if I can; otherwise I will do as you suggested.

@rhatdan rhatdan changed the title Added containerId to etc/hosts so as to resolve hostname. Closes #4446 [wip] Added containerId to etc/hosts so as to resolve hostname. Closes #4446 May 25, 2023
@SaurabhAhuja1983
Author

SaurabhAhuja1983 commented Jun 7, 2023

Created #1491 in containers/common for the vendor-specific code.
Next steps

  1. Waiting on approval.
  2. Re-vendor buildah with the latest containers/common.
  3. Update this PR to remove the vendor-specific code and add the tag [NO NEW TESTS NEEDED], since the test was added to the containers/common code.

@Luap99
Member

Luap99 commented Jun 8, 2023

The Java reproducer works fine for me on the latest buildah version. I don't understand what this is trying to achieve. The hostname is already in /etc/hosts in the current version, so the reproducer works fine.

@SaurabhAhuja1983
Author

@Luap99 I tried to reproduce the problem, and I can reproduce it even with the current version. The problem is that the hostname cannot be resolved: for a buildah intermediate RUN step the hostname is the containerId itself, and there is no IP -> containerId mapping in /etc/hosts. I see an entry for IP -> host.containers.internal but not for IP -> containerId.
Which environment/OS are you using? Are you using rootless buildah?

@SaurabhAhuja1983
Author

SaurabhAhuja1983 commented Jun 8, 2023

@Luap99
Here is the output where I am able to reproduce the problem.

buildah bud .
STEP 1/3: FROM <REPO-PleaseReplaceThisPublicRepoURL>/java_jdk_8:latest
STEP 2/3: COPY MyClass.java .
STEP 3/3: RUN echo "---" ; cat /etc/hosts ; echo "---" ; echo "Hostname: $(hostname)" ; echo "---"; cat /run/.containerenv; echo "---"; javac MyClass.java ; /usr/bin/java -cp . MyClass
---
127.0.0.1	localhost
::1	localhost ip6-localhost ip6-loopback
fe00::0	ip6-localnet
fe00::0	ip6-mcastprefix
fe00::1	ip6-allnodes
fe00::2	ip6-allrouters
10.20.213.177	agent-588ddcb847-sjkjz
10.20.213.177	host.containers.internal
---
Hostname: 4040283c83cd
---

engine="buildah-1.30.0"
name="java_jdk_8-working-container"
id="57b069c91740bd47dd5554051fab67e5f4f4f3fb9a5cccfddbb07595b94da0d4"
image="028647500327.dkr.ecr.us-west-2.amazonaws.com/base/java_jdk_8:latest"
imageid="a1d7714d474de0ad161716633b95157faf150d0d3700a112eac8335770f7f2b3"
rootless=1
---
Exception in thread "main" java.lang.IllegalStateException: cannot get host name
	at MyClass.main(MyClass.java:12)
Caused by: java.net.UnknownHostException: 4040283c83cd: 4040283c83cd: Name or service not known
	at java.net.InetAddress.getLocalHost(InetAddress.java:1432)
	at MyClass.main(MyClass.java:7)
Caused by: java.net.UnknownHostException: 4040283c83cd: Name or service not known
	at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
	at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:867)
	at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1302)
	at java.net.InetAddress$NameServiceAddresses.get(InetAddress.java:815)
	at java.net.InetAddress.getAllByName0(InetAddress.java:1291)
	at java.net.InetAddress.getLocalHost(InetAddress.java:1427)
	... 1 more
subprocess exited with status 1
subprocess exited with status 1
Error: building at STEP "RUN echo "---" ; cat /etc/hosts ; echo "---" ; echo "Hostname: $(hostname)" ; echo "---"; cat /run/.containerenv; echo "---"; javac MyClass.java ; /usr/bin/java -cp . MyClass": exit status 1
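
For anyone who wants to check this condition without Java, here is a minimal Go sketch that tests whether a name is listed in hosts-file text. The function name `hostsHasName` and the sample data in `main` (copied from the output above) are illustrative assumptions, not part of buildah or containers/common.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// hostsHasName reports whether name appears as a hostname or alias
// in hosts-file text. This is a hypothetical helper for illustration;
// it is not an API of buildah or containers/common.
func hostsHasName(hosts, name string) bool {
	sc := bufio.NewScanner(strings.NewReader(hosts))
	for sc.Scan() {
		line := sc.Text()
		// Drop trailing comments before splitting into fields.
		if i := strings.IndexByte(line, '#'); i >= 0 {
			line = line[:i]
		}
		fields := strings.Fields(line)
		if len(fields) < 2 {
			continue // need an address plus at least one name
		}
		for _, n := range fields[1:] {
			if strings.EqualFold(n, name) {
				return true
			}
		}
	}
	return false
}

func main() {
	// Entries taken from the /etc/hosts output in this thread.
	hosts := "127.0.0.1\tlocalhost\n" +
		"10.20.213.177\tagent-588ddcb847-sjkjz\n" +
		"10.20.213.177\thost.containers.internal\n"
	fmt.Println(hostsHasName(hosts, "host.containers.internal"))
	fmt.Println(hostsHasName(hosts, "4040283c83cd"))
}
```

With the entries shown in the output above, the containerId hostname is not found, which matches the UnknownHostException.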

@Luap99
Member

Luap99 commented Jun 9, 2023

How do you run buildah? Is this rootless or as root? Do you have any containers.conf options changed, i.e. base_hosts_file?

The hostname is already added here:

buildah/run_common.go

Lines 1223 to 1236 in 7468369

if hostsFile != "" {
	var entries etchosts.HostEntries
	if netstatus != nil {
		entries = etchosts.GetNetworkHostEntries(netstatus, spec.Hostname, buildContainerName)
	} else {
		// we have slirp4netns, default to slirp4netns ip since this is not configurable in buildah
		entries = etchosts.HostEntries{{IP: "10.0.2.100", Names: []string{spec.Hostname, buildContainerName}}}
	}
	// make sure to sync this with (b *Builder) generateHosts()
	err = etchosts.Add(hostsFile, entries)
	if err != nil {
		return err
	}
}

@SaurabhAhuja1983
Author

Yes, it's rootless. I already printed the environment in the comment above.
Posting it again here:
cat /run/.containerenv

engine="buildah-1.30.0"
name="java_jdk_8-working-container"
id="57b069c91740bd47dd5554051fab67e5f4f4f3fb9a5cccfddbb07595b94da0d4"
image="028647500327.dkr.ecr.us-west-2.amazonaws.com/base/java_jdk_8:latest"
imageid="a1d7714d474de0ad161716633b95157faf150d0d3700a112eac8335770f7f2b3"
rootless=1

@Luap99
Member

Luap99 commented Jun 9, 2023

You are running inside a container? Check the containers.conf file in that container, I assume you run with host networking. It should work if you set --network private.

We currently do not add the hostname in /etc/hosts when host networking is used, the same goes for podman.
I assume you could add the hostname with host networking but that requires you to get the actual host ip, reusing HostContainersInternalIP is not correct for this. And of course this must only happen when host network is set.
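
The condition described above can be sketched in Go as follows. This is only an illustration of the rule (add the hostname entry only for host networking, and only with the actual host IP); the `hostEntry` type and the `hostNetworkEntry` function are hypothetical and are not how buildah or containers/common implement it.

```go
package main

import (
	"fmt"
	"strings"
)

// hostEntry is a hypothetical stand-in for one /etc/hosts line.
type hostEntry struct {
	IP    string
	Names []string
}

// hostNetworkEntry returns an entry mapping the container hostname to
// the host's own IP, but only when host networking is in use and both
// values are known. The caller is assumed to supply the actual host IP,
// not the host.containers.internal address.
func hostNetworkEntry(hostNetwork bool, hostIP, hostname string) (hostEntry, bool) {
	if !hostNetwork || hostIP == "" || hostname == "" {
		return hostEntry{}, false
	}
	return hostEntry{IP: hostIP, Names: []string{hostname}}, true
}

func main() {
	// Example values only; 192.0.2.10 is a documentation address.
	if e, ok := hostNetworkEntry(true, "192.0.2.10", "4040283c83cd"); ok {
		fmt.Printf("%s\t%s\n", e.IP, strings.Join(e.Names, " "))
	}
}
```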

@SaurabhAhuja1983
Author

You are running inside a container?

Yes, I am running rootless buildah inside a container to build apps.

Check the containers.conf file in that container, I assume you run with host networking.

Yes, I want to run with the --network host option. I didn't find anything in containers.conf; I believe buildah uses --network host by default if we don't specify anything.
containers.conf

---
cat /etc/containers/containers.conf
[engine]
cgroup_manager = "cgroupfs"
---

It should work if you set --network private.

I tried both --network private and --network host but get the same problem.
I need a way to resolve the containerId hostname, and that requires an entry in /etc/hosts.
Or is there a way to specify a hostname to use instead of the containerId for the intermediate containers?

We currently do not add the hostname in /etc/hosts when host networking is used, the same goes for podman.

Alright. Is there a reason for that?

I assume you could add the hostname with host networking but that requires you to get the actual host ip, reusing HostContainersInternalIP is not correct for this. And of course this must only happen when host network is set.

OK, got it. But I am thinking that when the intermediate build hostname is the containerId, shouldn't the mapping be
HostContainersInternalIP to containerId (or the hostname of the intermediate container)?
Also, as I mentioned, I am getting the same problem with both --network host and --network private:


buildah --network host bud . 
buildah --network private bud .

@SaurabhAhuja1983
Author

@Luap99 did you get a chance to read my comments? I would really appreciate your guidance here. We don't want to maintain a patch on buildah in our codebase, and a fix should also help others who build Gradle/Java applications with rootless buildah in a container.
As long as the containerId, which is actually the hostname, resolves, we are good to go and Gradle builds should be happy.
We can talk about what the right thing to do here is.

@Luap99
Member

Luap99 commented Jun 14, 2023

I don't see that behaviour. I can confirm that the host entry is missing in the --network host case, but in the normal case it should really be there. That is something we can and maybe should address.

I think in your specific case the problem is that you are running in a container that has BUILDAH_ISOLATION=chroot set? Or buildah somehow knows it can only run with chroot isolation in the in-container use case. With chroot it apparently always uses host networking.

When using cli flags this errors out:

$ buildah bud --network none --isolation chroot
Error: cannot set --network other than host with --isolation chroot

However, when using the env var this does not produce an error, which is a bug!

BUILDAH_ISOLATION=chroot buildah bud --network private
<runs with host networking>

You haven't shown what image you are using, but the newer openjdk versions I tried always worked correctly regardless of the entry in /etc/hosts. The hostname is stored in /etc/hostname or can be retrieved via the gethostname() syscall after all, so I have no clue why the JDK even starts reading /etc/hosts.

Anyway I am well aware that switching jdk versions may not be possible for you and the missing entry can be considered a bug regardless of what java is doing.

@Luap99
Member

Luap99 commented Jun 15, 2023

@SaurabhAhuja1983 Can you give #4869 a try? I think this should solve your problem.

@SaurabhAhuja1983
Author

SaurabhAhuja1983 commented Jun 15, 2023

@Luap99 Thank you for submitting #4869. I will give it a try.

Yes, you are right, I am using BUILDAH_ISOLATION=chroot, but I am also using --network=host.

I have used a public openjdk image to reproduce the problem. Please use the Dockerfile and MyClass.java below and run rootless buildah in a container; you should be able to reproduce the problem.

1. As you can see from the output below, the only problem is
Hostname: 12cbcea00125
but there is no corresponding entry in /etc/hosts, so there is no way to resolve the hostname.

As long as I am able to resolve the hostname, I think we are good.

zokr@agent-588ddcb847-vggrd:/code/branches$ buildah build .
STEP 1/3: FROM gcr.io/google-appengine/openjdk:8
STEP 2/3: COPY MyClass.java .
STEP 3/3: RUN echo "---" ; cat /etc/hosts ; echo "---" ; echo "Hostname: $(hostname)" ; echo "---"; cat /run/.containerenv; echo "---"; javac MyClass.java ; /usr/bin/java -cp . MyClass
---
127.0.0.1	localhost
::1	localhost ip6-localhost ip6-loopback
fe00::0	ip6-localnet
fe00::0	ip6-mcastprefix
fe00::1	ip6-allnodes
fe00::2	ip6-allrouters
10.20.207.235	agent-588ddcb847-vggrd
10.20.207.235	host.containers.internal
---
Hostname: 12cbcea00125
---

engine="buildah-1.30.0"
name="openjdk-working-container"
id="12cbcea0012569529c3944d7c4fce35de6c23cbdd04975c03ac222bf3e4ffb65"
image="gcr.io/google-appengine/openjdk:8"
imageid="24b0b1cf77f0c22b1b72fd120769fb7659a3c22bc36acbd4146f19828d510e83"
rootless=1
---
Exception in thread "main" java.lang.IllegalStateException: cannot get host name
	at MyClass.main(MyClass.java:12)
Caused by: java.net.UnknownHostException: 12cbcea00125: 12cbcea00125: Name or service not known
	at java.net.InetAddress.getLocalHost(InetAddress.java:1506)
	at MyClass.main(MyClass.java:7)
Caused by: java.net.UnknownHostException: 12cbcea00125: Name or service not known
	at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
	at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:929)
	at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1324)
	at java.net.InetAddress.getLocalHost(InetAddress.java:1501)
	... 1 more
subprocess exited with status 1
subprocess exited with status 1
Error: building at STEP "RUN echo "---" ; cat /etc/hosts ; echo "---" ; echo "Hostname: $(hostname)" ; echo "---"; cat /run/.containerenv; echo "---"; javac MyClass.java ; /usr/bin/java -cp . MyClass": exit status 1

Dockerfile

zokr@agent-588ddcb847-vggrd:/code/branches$ cat Dockerfile
FROM gcr.io/google-appengine/openjdk:8
COPY MyClass.java .
RUN echo "---" ; cat /etc/hosts ; echo "---" ; echo "Hostname: $(hostname)" ; echo "---"; cat /run/.containerenv; echo "---"; javac MyClass.java ; /usr/bin/java -cp . MyClass

MyClass.java

zokr@agent-588ddcb847-vggrd:/code/branches$ cat MyClass.java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class MyClass {
    public static void main(String args[]) {
        try {
            InetAddress ia = InetAddress.getLocalHost();
            String str = ia.getHostAddress();
            String HOSTNAME = InetAddress.getLocalHost().getHostName();
            System.out.println("hostname: " + HOSTNAME);
         } catch (final UnknownHostException e) {
            throw new IllegalStateException("cannot get host name", e);
         }
    }
}

Env

zokr@agent-588ddcb847-vggrd:/code/branches$ env | grep -i buildah
_BUILDAH_STARTED_IN_USERNS=
BUILDAH_ISOLATION=chroot

@SaurabhAhuja1983
Author

@Luap99 Just verified #4869 and it worked fine.

Thank you for fixing it. I can see the hostname entry in /etc/hosts:

10.20.206.5	f8e90dcd64a1
---
Hostname: f8e90dcd64a1
---

Please post an update once #4869 is merged to main and when a new version is released.
I will close my PRs, and we can close the issue as well.

Here is the output for the same Dockerfile and MyClass.java

STEP 1/3: FROM gcr.io/google-appengine/openjdk:8
Trying to pull gcr.io/google-appengine/openjdk:8...
Getting image source signatures
Copying blob bf5fa58026af done
Copying blob 6c1ecc5fc89f done
Copying blob 3898cc33768f done
Copying blob 47053b57b33d done
Copying blob 4e645dc40a1e done
Copying blob 58006514b43a done
Copying blob 02de81cd3ec5 done
Copying blob 0ec52dc55701 done
Copying config 24b0b1cf77 done
Writing manifest to image destination
Storing signatures
STEP 2/3: COPY MyClass.java .
STEP 3/3: RUN echo "---" ; cat /etc/hosts ; echo "---" ; echo "Hostname: $(hostname)" ; echo "---"; cat /run/.containerenv; echo "---"; javac MyClass.java ; /usr/bin/java -cp . MyClass
---
127.0.0.1	localhost
::1	localhost ip6-localhost ip6-loopback
fe00::0	ip6-localnet
fe00::0	ip6-mcastprefix
fe00::1	ip6-allnodes
fe00::2	ip6-allrouters
10.20.206.5	agent-68b4457b45-v8qsk
10.20.206.5	host.containers.internal
10.20.206.5	f8e90dcd64a1
---
Hostname: f8e90dcd64a1
---

engine="buildah-1.31.0-dev"
name="openjdk-working-container"
id="f8e90dcd64a1fb8180e7e52f1b3821cdd637101dc2fe8f723976b6925e404659"
image="gcr.io/google-appengine/openjdk:8"
imageid="24b0b1cf77f0c22b1b72fd120769fb7659a3c22bc36acbd4146f19828d510e83"
rootless=1
---
hostname: f8e90dcd64a1
COMMIT
Getting image source signatures
Copying blob bae09f197569 skipped: already exists
Copying blob e8dfdad54e7f skipped: already exists
Copying blob 38f4df8894d4 skipped: already exists
Copying blob 9f7fab83bf06 skipped: already exists
Copying blob d45333c5e4e1 skipped: already exists
Copying blob cb46af800a4a skipped: already exists
Copying blob 2a0c9830ac72 skipped: already exists
Copying blob d871745b73e3 skipped: already exists
Copying blob 96c39b037d1f done
Copying config fbdbfeef33 done
Writing manifest to image destination
Storing signatures
--> fbdbfeef33bb
fbdbfeef33bb69d319b033afbb4d489fdba3623eec0252b62d66b79f006f84b2

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 21, 2023
@openshift-merge-robot
Collaborator

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@SaurabhAhuja1983
Author

This is taken care of in #4869.
Closing this pull request.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 5, 2023
Labels
do-not-merge/work-in-progress locked - please file new issue/PR needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD.
Development

Successfully merging this pull request may close these issues.

Buildah intermediate container(Id) is not resolvable
4 participants