
Trial run on pre-prod #88

Closed
sbose78 opened this issue Mar 30, 2020 · 15 comments

@sbose78
Member

sbose78 commented Mar 30, 2020

Next week we will first try to onboard the latest Build v2 on our dev environment and try to provide more e2e tests to make sure the existing features work fine, such as:

@zhangtbj , How did it go? Starting this ticket to track the status/feedback of test deployment.

@zhangtbj
Contributor

Hi @sbose78 ,

I first added the private builder image and different service account test cases in our CI/CD. If we find a common solution for the private resources, we will contribute them back to build v2.

And @qu1queee is helping add a private GitHub test case.

Last week, we tried to onboard build v2 on our dev env. But unfortunately, none of the build strategies work on our dev environment because of the restrictive PSP config.

For example:

Without them, the buildrun/taskrun will report failures like:

  • Operation not permitted
  • Privileged containers are not allowed
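
The exact PSP wasn't shared in the thread, but a restrictive PodSecurityPolicy along these lines would produce exactly those failures. This is an illustrative sketch only; the policy name and the specific rules are assumptions, not the actual config from that environment:

```yaml
# Hypothetical restrictive PSP (policy/v1beta1; PSPs were later removed
# in Kubernetes 1.25) of the kind that blocks privileged build strategies.
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted-example        # hypothetical name
spec:
  privileged: false               # causes "Privileged containers are not allowed"
  allowPrivilegeEscalation: false
  runAsUser:
    rule: MustRunAsNonRoot        # root-requiring build tools hit "Operation not permitted"
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  volumes:
    - configMap
    - emptyDir
    - secret
```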

I am not sure whether you have a similar PSP set up in the OpenShift project.

But in our multi-tenant environment, we have to fix or work around them one by one.

We have to focus on these blocking issues this week first... :(

Or do you have any idea about these permission issues?

@sbose78
Member Author

sbose78 commented Mar 30, 2020

private builder image and different service account

These should work, and yes, it would be good to ensure good test coverage for these 👍

@sbose78
Member Author

sbose78 commented Mar 30, 2020

@gabemontero , any pointers/thoughts on the issues above?

@gabemontero
Member

On the buildah example

I'm missing the context on why you need

      volumeMounts:
        - name: buildah-images
          mountPath: /var/lib/containers/storage

Without clarification, at first blush, it seems like those could be removed.

On buildpacks, yeah, running as root a la https://github.com/redhat-developer/build/blob/master/samples/buildstrategy/buildpacks-v3/buildstrategy_buildpacks-v3_cr.yaml#L62 is a non-starter on OpenShift for user data.

To draw a parallel to builds v1, any operation like that is encapsulated in the controller and its construction of the build pod, so as to remove the user from pod create/edit and to control what is done within the escalated pod.

In theory then, build v2 could mimic that pattern, but then you are moving away from your "plug in any image build tool seamlessly" advantage. That might be the compromise you have to make for build tools that require root.

Or you expand the list of folks to ask ;-)

Lastly, if @zhangtbj is allowed to share it with us, the precise specifics of the "restricted PSP config" @zhangtbj mentions might shed some light.

@zhangtbj
Contributor

I think buildah requires this:

      volumeMounts:
        - name: buildah-images
          mountPath: /var/lib/containers/storage

to pass the build artifacts to the push step. Without it, buildah will report an error:

error pushing image "us.icr.io/source-to-image-build/buildah-taxi" to "docker://us.icr.io/source-to-image-build/buildah-taxi": error locating image with name "us.icr.io/source-to-image-build/buildah-taxi" ([us.icr.io/source-to-image-build/buildah-taxi]): image not known
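
For context, the build and push steps share buildah's local image storage through that mount: the build step writes the image into `/var/lib/containers/storage`, and the push step reads it from there. A minimal sketch in the Tekton-style step layout these strategies use; the buildah image and the `$(...)` parameter references are illustrative, not the strategy's actual values:

```yaml
# Sketch: both steps mount the same emptyDir volume so the image built
# in step one is visible to the push in step two.
steps:
  - name: build
    image: quay.io/buildah/stable          # assumed buildah image
    command: ['buildah', 'bud', '--tag', '$(outputs.resources.image.url)', '.']
    volumeMounts:
      - name: buildah-images
        mountPath: /var/lib/containers/storage
  - name: push
    image: quay.io/buildah/stable
    command: ['buildah', 'push', '$(outputs.resources.image.url)', 'docker://$(outputs.resources.image.url)']
    volumeMounts:
      - name: buildah-images
        mountPath: /var/lib/containers/storage
volumes:
  - name: buildah-images
    emptyDir: {}                           # ephemeral; lives only for the build pod
```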

@gabemontero
Member

That's right, I totally forgot!! Thanks for the test run @zhangtbj.

As it turns out I see the same thing in the upstream tekton examples: https://github.com/tektoncd/catalog/blob/master/buildah/buildah.yaml#L39-L43

Also, I incorrectly mixed the two. The volumeMount in and of itself, I believe, does not need privileged (it is an ephemeral volume, if I remember correctly). There are other reasons buildah needs privileged.

Fixing it is beyond the scope of build v2 alone, and there is no short path for this.

There is a series of Jira issues opened to track the requirement.

@zhangtbj
Contributor

zhangtbj commented Mar 31, 2020

Hi @gabemontero ,

Thanks for the info; it is great that you and the buildah team already have plans for that.

Today I tried all 4 build strategies.

Buildpacks and kaniko run normally without privileged permission.

S2I requires buildah, and buildah requires privileged.

So S2I and buildah cannot build normally in our multi-tenant env.

I opened an issue to ask for help:
https://github.ibm.com/coligo/source-to-image/issues/196

And from a Google search, I saw that other people also have this problem.

I cannot access your https://issues.redhat.com. Do you have any issue/doc or plan to track this, and do you know when they plan to fix it?

Thanks!

@gabemontero
Member

Don't have dates from the buildah team for the current set of dependencies.

I'm going to direct you, @zhangtbj, to @siamaksade, our product manager for build v1, build v2, and Tekton / OpenShift Pipelines... so he product-manages for all the players here except buildah, and he can coordinate with his peer on the buildah team.

Thanks!

@gabemontero
Member

And as I surfaced this to the other players here on my end, I got a good tip on how to do buildah non-privileged for build v1 (though it results in slower performance).

See https://docs.openshift.com/container-platform/4.3/builds/custom-builds-buildah.html

To map this to build v2, @sbose78 and @zhangtbj, you'll need to reverse engineer it:

  • create the build v1 example there
  • look at the resulting build pod's init containers and containers
  • map those to tekton tasks / taskrun steps
  • etc. etc.

The Dockerfile machinations in that example are the secret sauce here.
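
The knobs that doc leans on are buildah's `vfs` storage driver and `chroot` isolation, which are the commonly cited flags for running buildah without privileged mode (at a performance cost). A rough sketch of how a build v2 strategy step might use them; the step name, image, and parameter references are assumptions, not a tested strategy:

```yaml
# Sketch: running buildah without privileged by trading isolation and
# storage-driver performance, per the linked OpenShift custom-build doc.
steps:
  - name: build-non-privileged
    image: quay.io/buildah/stable        # assumed buildah image
    securityContext:
      privileged: false                  # the whole point of the exercise
    command:
      - buildah
      - bud
      - --storage-driver=vfs             # vfs works unprivileged, but is slower
      - --isolation=chroot               # chroot isolation instead of OCI runtime isolation
      - --tag=$(outputs.resources.image.url)
      - .
```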

@gabemontero
Member

It is also perhaps possible to map the Dockerfile machinations in that example directly to Tekton tasks/steps.

@zhangtbj
Contributor

zhangtbj commented Mar 31, 2020

Thanks for the info! We need to track this issue.

Hmm... I am not sure whether it is worth using the custom buildah approach in build v2 before the buildah team supports an official unprivileged mode.

Or how long we could maintain such a reverse-engineered approach. Do you have any idea about it, @sbose78?

And can kaniko replace buildah for Dockerfile builds as a first stage?

@sbose78
Member Author

sbose78 commented Mar 31, 2020

If I understood you correctly, you are good with

  • Kaniko
  • Buildpacks

...and

  • buildah doesn't work
  • s2i doesn't work because of buildah

And can kaniko replace buildah at the first stage for Dockerfile build?

Conceptually, yeah. I don't see why it wouldn't work :-) .. it's about building from a Dockerfile anyway right? Do you want to give it a try? If you'd like to pair program over a call with @otaviof and try out an "s2i with kaniko", drop me an email and I will connect you both :)
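
One way "s2i with kaniko" could be sketched: use `s2i build --as-dockerfile` to generate a Dockerfile from the app source and builder image, then let kaniko build and push it without privileged mode. A hypothetical two-step strategy; the images, volume name, and `$(build.*)` parameter references are assumptions for illustration:

```yaml
# Sketch: step one generates a Dockerfile with the s2i CLI, step two
# builds and pushes it with kaniko, sharing the generated context
# through an emptyDir volume.
steps:
  - name: generate-dockerfile
    image: quay.io/openshift-pipeline/s2i            # assumed s2i CLI image
    command:
      - s2i
      - build
      - $(build.source.url)                          # app source
      - $(build.builder.image)                       # s2i builder image
      - --as-dockerfile=/gen-source/Dockerfile.gen   # emit a Dockerfile instead of building
    volumeMounts:
      - name: gen-source
        mountPath: /gen-source
  - name: build-and-push
    image: gcr.io/kaniko-project/executor:latest
    args:
      - --dockerfile=/gen-source/Dockerfile.gen
      - --context=/gen-source
      - --destination=$(build.output.image)
    volumeMounts:
      - name: gen-source
        mountPath: /gen-source
volumes:
  - name: gen-source
    emptyDir: {}
```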

@zhangtbj
Contributor

zhangtbj commented Mar 31, 2020

Hi @sbose78 ,

Yes, Kaniko and Buildpacks work, but the other two don't.

I can work with you to try the "s2i with kaniko", but can we open an issue and prioritize it in our feature list?

After this privileged problem, I still need to investigate another performance-blocking issue: a build executed in a tenant namespace is much slower than one executed by the cluster admin.

  • A cluster admin build takes about 40-60 seconds
  • A tenant user build takes 5-10 minutes

It is terrible... :(

If kaniko can replace buildah, I would prefer we open an issue to track "s2i with kaniko". And maybe in the middle of this month (April), we can work together on a proposal for "s2i with kaniko".

At the same time, I would also like to know whether we can work around or fix the buildah privileged issue with the buildah team...

Any idea? :)

@sbose78
Member Author

sbose78 commented Mar 31, 2020

I can co-work with you to have a try the "s2i with kaniko", but can we open an issue and prioritize it in our feature list?

Absolutely!

@qu1queee
Contributor

@zhangtbj I believe this issue can be closed, please close if possible.


4 participants