This repository has been archived by the owner on Feb 12, 2023. It is now read-only.

Pending improvements in the Docker/Docker Compose/k8s workflow #446

Open
8 tasks
ggalmazor opened this issue Mar 20, 2019 · 6 comments

Comments

@ggalmazor
Contributor

ggalmazor commented Mar 20, 2019

This is a follow-up issue to @brettneese's recent work. We can use this issue's description as a backlog of things we agree on doing.

Ready-ish:

  • Add a sample k8s file
  • Write a separate markdown docs file at /docs describing how to deploy in k8s using a managed db
  • Publish Docker image in Docker hub
    • Tag the Docker image as latest

Ideas:

  • Wrap Gradle commands in Makefile
  • Improve Docker/Docker Compose file structure
  • Ensure PostgreSQL data folder belongs to host user, not container user
  • Study if the new Aggregate CLI tool could be useful for Docker/Docker Compose/k8s deployments

Let's discuss each point. We can cross things we won't be doing or add new things.

@ggalmazor
Contributor Author

ggalmazor commented Mar 20, 2019

So, I'll start the discussion :)

I'll comment only on the ideas I have an opinion about. The rest are fine by me.

  • k8s, managed db vs PostgreSQL container

    I'm worried that a managed db setup will depend heavily on the provider users choose for their deployment. @brettneese, you have experience in k8s. Do you think we could offer instructions that apply universally across providers?

    In any case, we can totally start with a managed db solution and then add more options later.

  • Publish Docker image in Docker hub

    I think we're almost ready to do it. There's an opendatakit account and I think we would just need to hook everything up. It would be best to wait until @brettneese can review the image file structure and do some sanity checks, and then @yanokwa can share (with me?) the Docker Hub credentials so that I can configure the autobuild feature.

  • Wrap Gradle commands in Makefile

    This is a bit controversial. We've invested a lot of effort in improving the tooling around Aggregate, and having Gradle as the one-stop source to deal with the project's build workflow has made the project much more approachable for new contributors (we put a high value on that). I agree that there are other tools that are better suited for some parts of the build workflow, but that's a tradeoff we're willing to accept so far.

    I'm not against having a Makefile as a wrapper for pre-defined Gradle tasks, but I'd rather start by having better documentation of the build workflow, with more examples, etc.

  • Ensure PostgreSQL data folder belongs to host user, not container user

    I don't know if this one is really an issue. I've listed it because I know the Docker Compose setup we use for development (at /db) has this problem. Once you spin up the containers, a pgdata folder owned by root is created, which is a bit of a nuisance.

    Not super important for a development environment, but I figure that end users would find this super annoying.
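
    One possible mitigation, sketched here under assumptions: when a bind-mounted host folder does not exist, Docker creates it as root, so creating it beforehand as the current user avoids a root-owned top-level folder. The helper name `prepare_pgdata` is hypothetical, and whether the PostgreSQL image used in /db can run as an arbitrary user without further setup has not been verified.

```shell
# Hypothetical helper: create the data folder as the current (host) user
# and print "uid:gid" so it can be passed to the container runtime.
prepare_pgdata() {
  dir="${1:-pgdata}"          # folder name taken from the thread; override as needed
  mkdir -p "$dir"             # created by the host user, so not owned by root
  printf '%s:%s\n' "$(id -u)" "$(id -g)"
}
```

    The printed value could then be supplied as the `user:` setting of the database service in Docker Compose, or via `docker run --user`. Files written inside the folder are only owned by the host user if the container actually runs with that user.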

@brettneese
Contributor

Hey @ggalmazor, thanks for getting this going!

  • k8s, managed db vs PostgreSQL container

It's true that the instructions will vary widely across providers. I tend to strongly believe in managed DBs, particularly for a project like this, so I'd suggest just providing links to the provider's help docs on "how to set up a postgresql." The configuration will be the same - it's how you get the DB that's different. We can work on automating this a bit, maybe, but I don't see a whole lot of value there.

Similarly, how you get yourself a Kubernetes cluster in the first place can vary widely across providers. GKE, EKS, and AKS are all a bit unique in their own way and you always have the option of spinning Kubernetes up for yourself on bare instances. Again, we should provide recommendations, but I suggest pointing people to first-party help docs (or blog posts, such as my series).

  • Publish Docker image in Docker hub

I don't see any issue with the current Docker file, now that my entrypoint tweaks are in. As I recall it, it's much lighter than the Docker image I'm currently using in production.

My only remaining thought with the Docker build process is that it's not currently tagging itself as aggregate:latest, which would be a lot simpler than copying around the weird tag that is autogenerated. It probably should tag itself as both. This would also make it easier to keep the docs up to date as they would apply whether someone pulled the image from Docker Hub or built it locally.
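
A minimal sketch of the double-tagging idea, assuming the image has already been built locally under some generated tag. The helper name `print_tag_commands` and the example tag are hypothetical; `docker tag SOURCE TARGET` is the standard Docker CLI form. The function only prints the command, so it can be reviewed before it is run.

```shell
# Hypothetical helper: given an image reference such as aggregate:<generated-tag>,
# print the command that adds a second "latest" tag to the same image.
print_tag_commands() {
  source_ref="$1"
  # Drop everything after the last colon to get the repository name.
  repo="${source_ref%:*}"
  printf 'docker tag %s %s:latest\n' "$source_ref" "$repo"
}
```

For example, `print_tag_commands aggregate:some-generated-tag | sh` would apply the extra tag, leaving the image reachable under both names.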

  • Wrap Gradle commands in Makefile

We can ignore this for now. It mostly came out of my annoyance with typing:

./gradlew clean dockerBuild -xtest -PwarMode=complete

And not really knowing what that did.
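
As a rough illustration of what a thin wrapper could look like, here is a shell sketch that names the command and annotates each argument. The function name `build_image` and the `GRADLEW` override are hypothetical; the comments on `dockerBuild` and `warMode` only state that they are project-defined, since the thread does not describe their behaviour.

```shell
# Hypothetical wrapper around the command quoted above.
# GRADLEW can be overridden (e.g. GRADLEW=echo) to preview the arguments.
build_image() {
  # clean              : Gradle's standard task that deletes the build directory
  # dockerBuild        : task defined by the project's build (not a Gradle built-in)
  # -xtest             : same as "-x test", excludes the test task from the run
  # -PwarMode=complete : sets the project property "warMode"; its effect is project-specific
  "${GRADLEW:-./gradlew}" clean dockerBuild -xtest -PwarMode=complete "$@"
}
```

Running `GRADLEW=echo build_image` prints the arguments without invoking Gradle, which gives a quick way to check what the wrapper would run.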

That being said, my background isn't Java, so that tooling is probably already familiar to your base (in the same way that npm is familiar to me; you're lucky I didn't suggest "add a package.json", which is my default plan of attack for build commands ;-))

Absolutely agree that a lot of this pain would also be solved with better docs (although I personally believe that if you have to write docs, it's too late). This could be baked into a CLI as well, though - the only advantage of a Makefile is that it essentially creates a CLI in a standard way.

  • Ensure PostgreSQL data folder belongs to host user, not container user

We can double check this once we add native DB support. That being said, as you've already mentioned, persistent volumes in container environments are a bit tricky, so I suggest we start by recommending that users use a managed DB and leave the DB-in-a-container trickiness to advanced users (it's def possible! but most users are much better off using a hosted DB).

@ggalmazor
Contributor Author

Thanks for your comments. I think we're in agreement overall. Some of the stuff we're talking about will have to be addressed further down the road but I think we're composing a fine list of actionable tasks nevertheless.

> My only remaining thought with the Docker build process is that it's not currently tagging itself as aggregate:latest, which would be a lot simpler than copying around the weird tag that is autogenerated. It probably should tag itself as both. This would also make it easier to keep the docs up to date as they would apply whether someone pulled the image from Docker Hub or built it locally.

Thanks! That makes sense and I've added it to the main list.

@chrismclarke

chrismclarke commented Feb 10, 2020

I'm not sure if this is still an active concern, but I recently tried automating Docker builds of Aggregate for one of my own projects. I ended up creating a GitHub Action to build and deploy, although I think it can probably also be done directly through the Docker Hub / GitHub integration with an additional build hook.

The action can be found here:
https://github.com/chrismclarke/aggregate/blob/gh-actions/docker-build-deploy/.github/workflows/docker-build-deploy.yml

The corresponding Docker image is published here:
https://hub.docker.com/repository/docker/chrismclarke/odkaggregate

I'm also planning to add a separate build for use with docker-compose (using the docker-compose.gradle build), although I'm still not sure of the best way to tag these things.

In any case, let me know if it's of any use to you.

(update - realised docker-compose works fine with existing build, I just needed to pass a DB_HOST environment variable)

@lognaturel
Member

Thanks for the update, @chrismclarke! One thing you may want to do is share your actions and docker file in a thread at https://forum.opendatakit.org/c/development/5 so it's a little more visible for folks who may benefit from it in the short term.

