Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace dependency on mikefarah/yq build container in favour of kislyuk/yq (python wrapper for jq) #14076

Closed
nickboldt opened this issue Jul 30, 2019 · 24 comments
Assignees
Labels
area/devfile-registry area/plugin-registry kind/task Internal things, technical debt, and to-do tasks to be performed. severity/P1 Has a major impact to usage or development of the system. status/analyzing An issue has been proposed and it is currently being analyzed for effort and implementation approach
Milestone

Comments

@nickboldt
Copy link
Contributor

nickboldt commented Jul 30, 2019

Is your task related to a problem? Please describe.

Today, che-operator uses yq from https://github.com/kislyuk/yq but the devfile and plugin registries use a DIFFERENT yq implementation from https://github.com/mikefarah/yq

And as demonstrated in eclipse-che/che-devfile-registry#27 yaml processing may not always be required where a simple grep | sed scriplet can be used.

Additionally, the mikefarah/yq implementation requires building go code (or using his Alpine based image) whereas the kislyuk/yq works with both python2 and 3 as a wrapper for jq.

I've already prototyped a container that we can use to prebuild/preinstall yq as a build container:

https://github.com/nickboldt/containers/blob/master/ubi8-yq/Dockerfile

I will extend this approach so I can use this image in Brew/OSBS for CRW builds (it requires that the python deps be prepulled and pushed as a tarball into pkgs.devel's dist-git lookaside cache, as we do for the CRW 1.2 stacks' lang server deps).

Describe the solution you'd like

@nickboldt nickboldt added the kind/task Internal things, technical debt, and to-do tasks to be performed. label Jul 30, 2019
@nickboldt nickboldt changed the title Replace dependency on https://github.com/mikefarah/yq image container in favour of https://github.com/kislyuk/yq (python wrapper for jq) Replace dependency on mikefarah/yq build container in favour of kislyuk/yq (python wrapper for jq) Jul 30, 2019
@nickboldt nickboldt added this to the 7.1.0 milestone Jul 30, 2019
@nickboldt nickboldt added area/devfile-registry area/plugin-registry kind/enhancement A feature request - must adhere to the feature request template. severity/P1 Has a major impact to usage or development of the system. labels Jul 30, 2019
@amisevsk
Copy link
Contributor

Note that the commands for the different versions of yq are incompatible; in general the mikefarah version is far more limited.

I'm +1 for moving to the jq wrapper version because it can be very useful in general. Less so on moving to sed/grep since those commands are less readable.

@skabashnyuk
Copy link
Contributor

I'm +1 for moving to the jq wrapper version because it can be very useful in general. Less so on moving to sed/grep since those commands are less readable.

I'm supporting @amisevsk opinion. sed/grep solution looks less readable.

@l0rd
Copy link
Contributor

l0rd commented Jul 31, 2019

A couple of comments:

  • Using the go version (mikefarah) has the benefit that the resulting image can be ridiculously small. I have no idea if the limited features of the go version is enough for our use cases though.
  • Using registry.redhat.io/ubi8-minimal:8.0 as a base image is a no go: if I try to pull it I get "unauthorized: Please login to the Red Hat Registry". It's not compatible with an upstream project as Eclipse Che.

@skabashnyuk
Copy link
Contributor

skabashnyuk commented Jul 31, 2019

FYI in devfile-registry and plugin-registry yq is used ONLY in buld containers. This tool is not used in end-user images

@che-bot che-bot added the status/need-triage An issue that needs to be prioritized by the curator responsible for the triage. See https://github. label Jul 31, 2019
@benoitf benoitf added status/analyzing An issue has been proposed and it is currently being analyzed for effort and implementation approach and removed status/need-triage An issue that needs to be prioritized by the curator responsible for the triage. See https://github. labels Jul 31, 2019
@l0rd
Copy link
Contributor

l0rd commented Jul 31, 2019

@skabashnyuk the goal is to run it at runtime, in the container entrypoint, to switch from online to offline mode when starting the registry

@skabashnyuk
Copy link
Contributor

@l0rd it's just an apache in runtime that is serving static files, is there any reason to have a jq in runtime?

@nickboldt
Copy link
Contributor Author

nickboldt commented Jul 31, 2019

if "Using registry.redhat.io/ubi8-minimal:8.0 as a base image is a no go", try registry.access.redhat.com/ubi8-minimal:8.0. It's only 90M, and does not require registry authentication [1].

[1] https://access.redhat.com/RegistryAuthentication

@nickboldt
Copy link
Contributor Author

is there any reason to have a jq in runtime?

Yes, if we need to manipulate the index.json file to inject environment variables such as PLUGIN__REGISTRY__URL so that we can resolve vsix files from within the plugin registry container itself, as in eclipse-che/che-plugin-registry#195

@skabashnyuk
Copy link
Contributor

Yes, if we need to manipulate the index.json file to inject environment variables such as PLUGIN__REGISTRY__URL

That is a legitime request. It also means that current architecture with "static files & apache" aren't fit our need and we should start thinking about microservice written on some "high" language GO, NodeJS etc. I'm -1 with idea to write more logic in BASH.

@l0rd
Copy link
Contributor

l0rd commented Aug 1, 2019

@nickboldt ok to use registry.access.redhat.com then but what's the difference? I am afraid that registry.access.redhat.com may be deprecated. Do you know why the same image is provided on 2 distinct registries? Are the images identical or there are some differences?

@skabashnyuk we need to be pragmatic. Rewriting the registries as CRUD web api is something that we need to plan for next releases but in the short term 2 lines of jq to support offline mode are enough.

@slemeur slemeur removed the kind/enhancement A feature request - must adhere to the feature request template. label Aug 2, 2019
@benoitf
Copy link
Contributor

benoitf commented Aug 5, 2019

hello, I don't see why we should use python wrapper for jq instead of the go version which make the image smaller

and AFAIK we can compile go stuff on RHEL as well. (so I would avoid to require python as it's not useful)

@benoitf
Copy link
Contributor

benoitf commented Aug 5, 2019

Basically I think we're trying to solve something with a wrong solution on jq. It's like 100 workarounds.

Why can't we have a build image using https://github.com/mikefarah/yq on top of ubi8 ?

@nickboldt
Copy link
Contributor Author

nickboldt commented Aug 6, 2019

I am afraid that registry.access.redhat.com may be deprecated.

It'll be supported until at least the EOL of RHEL 8. That's RHEL 8.8 in 2023, with extended support through 2027. https://access.redhat.com/support/policy/updates/errata#Maintenance_Support_2_Phase

Will 8 years be enough time? :)

Do you know why the same image is provided on 2 distinct registries?

This is done so that people who want to use the auth'd registry can transition, and those that don't, do not need to.

Are the images identical or there are some differences?

If the same image appears on both, it's the same. If it only appears on one registry, it's probably an older one (eg., some RHSCL images are only on RHCC, but newer UBI images are on BOTH RHCC and RHIO URLs).

See also email I've fwd'd from Scott McCarthy with more details.

@nickboldt
Copy link
Contributor Author

hello, I don't see why we should use python wrapper for jq instead of the go version which make the image smaller

Because the python wrapper uses the more commonly used (and thus standard) jq implementation.

The golang version is not the same as the one used in the che-operator repo - different implementation, different CLI, different syntax. Do you really want to have two different yq implementations across the Che repos?

Either way, I'm going to need to pick something I can use in Brew. python36, pip, jq and PyYAML are already available as RPMs, so installation of yq is easy in a container. No compilation required, just some yum or microdnf installs.

Going the golang route would require importing the source code into Brew (or building a new rpm for it). That's more overhead. Are you offering to contribute the RPM build to Fedora so I can get it from the UBI 8 content sets?

and AFAIK we can compile go stuff on RHEL as well. (so I would avoid to require python as it's not useful)

Sure, but if we want a single-stage image for the 2 registries (as per Mario's BJN chat w/ me last Friday) then we don't want to have to install all the golang deps just to uninstall them afterwards. And there's the above problem w/ source importing too. It's not as straightforward as you think. :)

@nickboldt
Copy link
Contributor Author

nickboldt commented Aug 9, 2019

Did some quick performance testing of the sed and yq options in devfile registry. Results suggest that we should use the golang yq instead of the python one as it's much faster:

Using sed:

0.385s for ./check_mandatory_fields.sh
0.246s for ./index.sh

Using yq (mikefarah version):

0.38s for ./check_mandatory_fields.sh
0.27s for ./index.sh

Using yq (python version):

0m17.991s for ./check_mandatory_fields.sh
0m15.440s for ./index.sh

HOWEVER building this in Brew will be a bit challenging unless I just import the sources and compile it on the fly as a first stage build for the registries. Otherwise we need to maintain a 12th image container for CRW 2... or I start using cekit for templating these things.

@amisevsk
Copy link
Contributor

I've done a bit of testing, and while the python version of yq is a lot slower when used as a replacement in the existing scripts, the real benefit of using the more full-featured yq is that it can simplify scripts. E.g. index.sh could be rewritten

set -e

readarray -d '' metas < <(find devfiles -name 'meta.yaml' -print0)
for i in "${metas[@]}"; do
    # Workaround for self-links
    echo -e "links:\n  self: /${i}" >> "${i}"
done
yq -s 'map(.)' "${metas[@]}"

(note that in my testing this is as fast as the mikefarah version)

Also, if additional restructuring is required, it's pretty painless to implement.

@amisevsk
Copy link
Contributor

Since the work was pretty much done in my testing, I've opened PR eclipse-che/che-devfile-registry#80

@benoitf
Copy link
Contributor

benoitf commented Aug 13, 2019

I still don't get why we should use python bloatware

@nickboldt
Copy link
Contributor Author

I still don't get why we should use python bloatware

If it's just as fast, and it's a more product-friendly solution, what's the issue? container size too big? (I haven't compared ubi8-minimal + mikefarahyq vs. ubi8-minimal + python + jq + yq to see if we're really better off with the go impl than the py one.)

@amisevsk
Copy link
Contributor

I've also opened PR eclipse-che/che-plugin-registry#205 in the registry to look at what the change would entail.

@amisevsk
Copy link
Contributor

@skabashnyuk @l0rd @benoitf Any updates on this issue -- do we want to still go in this direction? I know @nickboldt is a +1 on the changes because it helps with CRW, but I need more feedback before devoting more time to it.

PRs eclipse-che/che-devfile-registry#80 and eclipse-che/che-plugin-registry#205 are still waiting for review.

In regards to the "bloat" complaint, since they're multi-stage builds, the resulting images are of identical size -- I don't think using a better tool is bloat, and most people working on Che or OpenShift are more familiar with the python wrapper version of yq and its syntax anyways (it's just jq syntax)

These PRs additionally have minor improvements:

  • For the plugin registry, the validation and update scripts are greatly simplified (and json is validated using a schema rather than manual bash script).
  • The base images used are more portable; we could use any base image we wanted rather than being required to build from mikefarah/yq.

@l0rd
Copy link
Contributor

l0rd commented Aug 28, 2019

@amisevsk PRs are ok for me. I have approved them.

@benoitf
Copy link
Contributor

benoitf commented Sep 10, 2019

AFAIK it should be closed now

@amisevsk
Copy link
Contributor

Yup, PRs are merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/devfile-registry area/plugin-registry kind/task Internal things, technical debt, and to-do tasks to be performed. severity/P1 Has a major impact to usage or development of the system. status/analyzing An issue has been proposed and it is currently being analyzed for effort and implementation approach
Projects
None yet
Development

No branches or pull requests

8 participants