Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SOLR-14789: Absorb the docker-solr repo. #1769

Merged
merged 16 commits into from Sep 11, 2020

Conversation

HoustonPutman
Copy link
Contributor

@HoustonPutman HoustonPutman commented Aug 20, 2020

WIP - https://issues.apache.org/jira/browse/SOLR-14789

This is intended for 9.0 only, and will not be backported to 8.x.
We can continue using the docker-solr repo for all 8.x.y releases.

Goals

The goal of this PR is to move all functionality for how current solr images are built to lucene solr. However there are clearly a lot of other things we want to accomplish too:

Since this docker image will not be officially used until 9.0 at the earliest, this gives us time to iterate and accomplish all of these goals. However merging in this initial functionality that works like the current docker-solr setup first lets us more easily make and test these changes.

Setup of /solr/docker

gradle assemble will now create a docker image that has been created with contents of :solr:packaging.

The "Solr image" is broken up into 2 docker files.

  1. The package dockerfile, which contains the solr release under /opt/solr-build/solr-${VERSION}. This image requires no other content at the end.
  2. The runtime dockerfile, which is given the package image name as an input and generates everything currently setup in the Solr docker image, copying the package contents from the package image.

The runtime docker image is the one that will be pushed to Dockerhub eventually, and is run by users.

The idea is that there is also a task to generate the release docker image (which can be found under /solr/docker/package/Dockerfile.release-package) that is then passed to the runtime docker image. But this is a future goal and not necessary in order to merge this PR IMO.

I basically learned gradle to write this, so please give any advice on how I could make the build process better. I'm sure it is not optimal.

Testing

Once you run gradle assemble, you can test the new docker image in two ways:

  • gradle :solr:docker:test, which will run the docker-solr integration tests using the newly assembled solr image. This will eventually fail when it reaches the test that needs root permissions however, since gradle doesn't have root permissions itself. This is a TODO.
  • docker run -d -p 8983:8983 -e SOLR_JETTY_HOST="0.0.0.0" apache/solr:9.0.0-SNAPSHOT, then visit the admin screen at http://localhost:8983 .

TODO:

  • Add ability to build with "release" package
  • Remove ability to push the package image.
  • Integrate gradle :solr:docker:tests into the overall gradle tests
  • Make all of the tests work with gradle.

@HoustonPutman
Copy link
Contributor Author

I should point out that the tests, docs and runtime/include/scripts are almost identical copies from the docker-solr repo. Some small changes have been made to the scripts files, but no actual runtime logic has changed.

I'd be happy to copy exactly from the repo and make the small changes in a different PR, if that would make y'all more comfortable.

@madrob
Copy link
Contributor

madrob commented Aug 21, 2020

I'd be happy to copy exactly from the repo and make the small changes in a different PR, if that would make y'all more comfortable.

Doesn't have to be a different PR but would be very helpful to have the unaltered copy as one commit, and then all of your changes as separate commits on top of that.

@madrob
Copy link
Contributor

madrob commented Aug 21, 2020

Also, I imagine @chatman is interested in this.

@HoustonPutman
Copy link
Contributor Author

I'd be happy to copy exactly from the repo and make the small changes in a different PR, if that would make y'all more comfortable.

Doesn't have to be a different PR but would be very helpful to have the unaltered copy as one commit, and then all of your changes as separate commits on top of that.

I'll try to make this happen.

@HoustonPutman
Copy link
Contributor Author

@madrob The first commit has the Dockerfile, tests, scripts and docs copied over from the docker-solr repo. The rest of the commits are small changes to the scripts, splitting the dockerfile logic into two parts, and adding the gradle logic.

@thelabdude
Copy link
Contributor

Looking great @HoustonPutman ! I ran g assemble and then launched the Docker image using a compose file:

version: '3.2'
services:

  zookeeper:
    image: zookeeper:3.5.8
    ports:
      - "2181:2181"

  solr:
    image: apache/solr:9.0.0-SNAPSHOT
    depends_on:
      - zookeeper
    ports:
      - "8983:8983"
    environment:
      SOLR_JETTY_HOST: "0.0.0.0"
    command: ["solr-foreground", "-z", "zookeeper:2181", "-c"]

Note the need to set SOLR_JETTY_HOST to bind to all interfaces in the container if you want to expose Solr to your local workstation. Still kicking the tires but looking great so far.

@madrob
Copy link
Contributor

madrob commented Aug 21, 2020

Note the need to set SOLR_JETTY_HOST to bind to all interfaces in the container if you want to expose Solr to your local workstation

I previously found the need to set SOLR_HOST=localhost with docker containers. Is one way preferred over the other? What's the actual difference?

@thelabdude
Copy link
Contributor

Note the need to set SOLR_JETTY_HOST to bind to all interfaces in the container if you want to expose Solr to your local workstation

I previously found the need to set SOLR_HOST=localhost with docker containers. Is one way preferred over the other? What's the actual difference?

@madrob This commit here (5377742#diff-2e431666cd6c6f1e08f79cdefa4988a4) makes it so Jetty only binds to 127.0.0.1 in the Docker container, which messes up Docker's ability to expose port 8983 outside of the container. It looks like that was done for security reasons ... So setting SOLR_JETTY_HOST to 0.0.0.0 allows me to reach Solr from outside Docker using http://locahost:8983

@madrob
Copy link
Contributor

madrob commented Aug 25, 2020

I was able to build this and did some very simple validation after launching containers using:

docker run --rm -d --name zk-for-solr -p 2181:2181 zookeeper
docker run --rm -p 8983:8983 --link zk-for-solr:zk-for-solr -e SOLR_JETTY_HOST="0.0.0.0" apache/solr:9.0.0-SNAPSHOT -c -z zk-for-solr:2181
docker run --rm -p 8984:8984 --link zk-for-solr:zk-for-solr -e SOLR_JETTY_HOST="0.0.0.0" apache/solr:9.0.0-SNAPSHOT -c -z zk-for-solr:2181 -p 8984

Can we make sure that user run instructions are included in the ref guide, or possibly the tutorial?

@dsmiley
Copy link
Contributor

dsmiley commented Aug 25, 2020

While SOLR_JETTY_HOST is a nice security feature for someone running Solr outside of Docker, isn't it pointless (needless pain) for anyone using Docker? I think so, thus the docker settings could set this so that you don't have to.

@janhoy
Copy link
Contributor

janhoy commented Aug 27, 2020

My first attempt at this gave some bash errors:

./gradlew assemble
> Configure project :solr:docker
readlink: illegal option -- f
usage: readlink [-n] [file ...]
/Users/janhoy/git/lucene-solr/solr/docker/tests/cases/create_core_exec/test.sh: line 18: ./../../shared.sh: No such file or directory

FAILURE: Build failed with an exception.

I recognize this from the docker-solr repo, the test scripts use some commands that do not work on MacOS. So I added the workaround with putting gnu variants of these tools in my path, but then I could still not make the assemble task run:

> Configure project :solr:docker
Test /Users/janhoy/git/lucene-solr/solr/docker/tests/cases/create_core_exec apache/solr:9.0.0-SNAPSHOT
Cleaning up left-over containers from previous runs
Running test_apache_solr_9.0.0_SNAPSHOT
Unable to find image 'apache/solr:9.0.0-SNAPSHOT' locally
docker: Error response from daemon: pull access denied for apache/solr, repository does not exist or may require 'docker login': denied: requested access to the resource is denied.
See 'docker run --help'.

FAILURE: Build failed with an exception.

I had to uncomment the task test() from docker/build.gradle and then my build succeeded.

@HoustonPutman
Copy link
Contributor Author

@janhoy The tests should hopefully work for you now, though I do have the gnu utils installed, so I'm not sure they will.

The tests are no longer run by default in the assemble, so that should work for you now regardless.

I have done some modification on the tests to simplify out some of the logic. I also changed the permissions of folders created during the tests, so that it doesn't require root permissions anymore to run them. This works for me, but it might not work for others. Not confident on that yet.

@janhoy
Copy link
Contributor

janhoy commented Aug 28, 2020

Ok, will give it another try soon.

Btw - should we set SOLR_JETTY_HOST as an ENV or ARG or something?

@HoustonPutman
Copy link
Contributor Author

Its currently set as an ENV, the last line of the large ENV section.

As a simple test, check the automatic assignment and specific assignment work:

```
mak@trinity10:~$ docker run -i --rm --net=netzksolr busybox ip -4 addr show eth0 | grep inet
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's remove mak@trinity10:~$ from the commands

Copy link
Contributor

@thelabdude thelabdude left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me, left one minor nit about removing Mak's old command-line references

@HoustonPutman
Copy link
Contributor Author

Thanks for reviewing the docs as well. I plan on migrating a lot of the docs to the ref guide, but I think we should probably leave a bulk of that work to a separate PR.

The earlier we get this PR merged, the easier it will be to break up the work we want to do into separate JIRAs that can be attacked independently. (more extensive documentation, reduction of the scripts, make it solr more "cloud native", a plan for release images, etc)

@HoustonPutman HoustonPutman changed the title SOLR-11245: Absorb the docker-solr repo. SOLR-14789: Absorb the docker-solr repo. Aug 28, 2020
solr/docker/docs/docker-compose.yml Outdated Show resolved Hide resolved
solr/docker/tests/shared.sh Show resolved Hide resolved
@dsmiley
Copy link
Contributor

dsmiley commented Aug 31, 2020

CC @dweiss you might want to briefly review this new module for the gradle aspect

@dsmiley
Copy link
Contributor

dsmiley commented Sep 8, 2020

FYI related conversation on the dev list about higher level integration tests: https://lists.apache.org/thread.html/r27d742357f9e1342cce1c9aaa215a10d94f64b43b2e681f19fa8d150%40%3Cdev.lucene.apache.org%3E

solr/docker/build.gradle Outdated Show resolved Hide resolved
solr/docker/package/Dockerfile.local-package Outdated Show resolved Hide resolved
@madrob
Copy link
Contributor

madrob commented Sep 10, 2020

Overall looks good, just a pair of minor questions/comments.

solr/docker/Docker-FAQ.md Outdated Show resolved Hide resolved
@HoustonPutman HoustonPutman merged commit 485d5fb into apache:master Sep 11, 2020
@HoustonPutman
Copy link
Contributor Author

Thanks everyone for the help! Now we should be able to iterate more easily on the next tasks 😄

@HoustonPutman HoustonPutman deleted the docker branch September 11, 2020 16:30
rishisankar pushed a commit to rishisankar/lucene-solr that referenced this pull request Sep 16, 2020
@dsmiley
Copy link
Contributor

dsmiley commented Nov 16, 2020

I'm looking at the docker image this produces closely, and I see it's more than a bit bloated thanks to this line:

COPY --from=solr_package "/opt/solr-$SOLR_VERSION.tgz" "/opt/solr-$SOLR_VERSION.tgz"

COPY --from=solr_package "/opt/solr-$SOLR_VERSION.tgz" "/opt/solr-$SOLR_VERSION.tgz"

Which adds an entire layer of waste that is all of Solr 😱

I'm also a confused why we need an entire gradle module :solr:docker:package just for providing input to the Dockerfile. I read the PR opening description which sort of says why it exists... but I think that could be provided all in one module using different tasks for the different steps. I'm also not sure we need to support Dockerfile.release-package does since the image production process is merged with the project; it needn't download a .tgz from anywhere. Right? I don't think direct production of the image prohibits the goals stated above about producing an official image.

@HoustonPutman
Copy link
Contributor Author

The extra step exists because there was no consensus around how to do official release images.

If we want to decide that the official image should be built the same way as it currently is in the project (via the local build), then we can get rid of the sub-module and the extra step. However if we want to have the official image use the official release binaries, as it does in docker-solr, then we will need to keep the submodule.

I would have preferred to have all of this done in one module, but the gradle docker plugin only supports building one image per-module. So if we want to build multiple images (which is necessary for supporting the two image types, local and release), we need two modules.

I am all for not adding support for official binary release strategy, and consolidating into one docker file. I just don't want to make that decision unanimously.

@dsmiley
Copy link
Contributor

dsmiley commented Nov 16, 2020

I'm glad your are amenable to changes, and that the complexity & Docker image weight I see will melt away if we only produce an image from the Solr assembly. That is identical to the "official release" except packaging -- plain dir vs tgz of the same. I can appreciate there were unknowns causing you to add this extra baggage because it might be useful but I prefer to follow a KISS/YAGNI philosophy so that we don't pay complexity debt on something not yet needed.

@janhoy
Copy link
Contributor

janhoy commented Nov 17, 2020

+1 to lighter weight. However, our users should somehow be able to verify that a Docker image pulled from Docker Hub (or downloaded from elsewhere) is indeed the officially voted-upon binaries that they find in the release repo. Downloads from mirrors are easy to verify as we provide .sha512 and .asc files for them. Likewise artifacts from maven also have .asc and .sha1 files for every jar. Current docker-solr Dockerfile can be inspected in that it downloads the official tarball and validates GPG signature. The lightweight Dockerfile performs no such checks and cannot be validated the same way.

So here is my proposal. We build the docker image from folder instead of tgz, but also add documentation to our download page on how to verify the solr binaries inside the image. Could even script it:

curl -o verify-docker.sh https://lucene.apache.org/solr/verify-docker.sh
docker run --rm -v ./verify-docker.sh:/verify-docker.sh apache/solr:9.0.0 sh /verify-docker.sh

@HoustonPutman
Copy link
Contributor Author

I completely agree with everything you've said @janhoy, and really like that solution! I am by no means an expert on this verification stuff, so what do you imagine verify-docker.sh would be doing under the hood?

@dsmiley
Copy link
Contributor

dsmiley commented Nov 17, 2020

Wouldn't it be simpler for the release manager to build the docker image, examine the sha256 hash of the image, and publish that to the download location, making it official? Someone who wants to use the official Solr docker image who is ultra-paranoid can reference the image by hash like so:

docker run --rm solr@sha256:02fe5f1ac04c28291fba23a18cd8765dd62c7a98538f07f2f7d8504ba217284d

That runs Solr 8.7, the official one. It's compact and can even be broadcasted easily in the release announcement for future Solr releases for people to get and run the latest release immediately, and be assured it's the correct one.

I wonder what other major Apache projects do.

@janhoy
Copy link
Contributor

janhoy commented Nov 18, 2020

what do you imagine verify-docker.sh would be doing under the hood

It would verify the sha512 sum of each jar with the corresponding checksum published by the project, and also allow the user to verify the PGP signature of each jar to assert it was signed by a committer.

Wouldn't it be simpler for the release manager to build the docker image, examine the sha256 hash of the image, and publish that to the download location, making it official

That's a great idea David, that the RM records the image SHA when pushing, and publishes that sha. People can then either pull with SHA or verify later with docker images --digests solr. That should be sufficient for most users. For those who want to futher assert that they use the very same binaries signed by the committer, we could still document how to perform PGP checks on the jars.

@dsmiley
Copy link
Contributor

dsmiley commented Nov 18, 2020

we could still document how to perform PGP checks on the jars

Just confirming... we agree that ./verify-docker.sh is needless, and we need to just document (e.g. in the ref guide) how to verify individual JARs at the CLI.

@janhoy
Copy link
Contributor

janhoy commented Nov 21, 2020

Yea, add the image hash to download page and document how to check each jar file.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
6 participants