
More flavors and better hierarchy #36

Open
zhicwu opened this issue Jul 11, 2017 · 4 comments

Comments

@zhicwu
Owner

zhicwu commented Jul 11, 2017

Instead of one image with everything (plugins, patches) included, it's better to refine the image hierarchy and introduce more flavors to fit different needs.

For example:

pentaho/biserver:7.1-base (vanilla BI server without any additional plugin and customization)
 |-- pentaho/biserver:7.1-full (7.1-base with more plugins, customizable in build and deployment time)
        |-- pentaho/biserver:7.1-patched (7.1-full with patches and entry points for further customization)

May also need to consider:

  • multi-stage builds and Alpine for a skinnier image (see the sketch after this list)
  • version numbers - e.g. 7.1-base-20170712 in addition to 7.1-base?
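As a rough illustration, a multi-stage build for the full flavor could look something like the sketch below (the Alpine helper stage, plugin URL and target path are placeholders, not the actual build files):

    # hypothetical sketch: a throwaway stage fetches and unpacks plugins, and
    # only the results are copied onto the vanilla base image, keeping layers slim
    FROM alpine:3.6 AS plugins
    RUN apk add --no-cache curl unzip
    RUN curl -fsSL -o /tmp/plugin.zip https://example.com/some-plugin.zip \
     && unzip /tmp/plugin.zip -d /opt/plugins

    FROM pentaho/biserver:7.1-base
    # pentaho-solutions/system is assumed to be the plugin directory here
    COPY --from=plugins /opt/plugins/ /biserver-ce/pentaho-solutions/system/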
@stonegatebi

Looking at the two approaches:

  1. CBF (community build framework, especially version 2)
  2. GitHub tags, and automated builds on hub.docker.com

Probably the best approach would be a standard folder hierarchy that divides the states: init (first-time stack launch), bind mounts and named volumes (repeated subsequent launches), backups (normal lifecycle), and secrets (a special case for init and subsequent launches, where the containers can consume configs and secrets to deploy and update correctly).
That's what I tried to do here:
https://github.com/usbrandon/docker-pentaho-stack
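Concretely, such a layout might look something like this (directory names are illustrative, not necessarily what that repo uses):

    stack/
    ├── init/      # first-time stack launch
    ├── volumes/   # bind mounts and named volumes for repeated subsequent launches
    ├── backups/   # normal lifecycle backups
    └── secrets/   # configs and secrets consumed at init and on later updates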

I have to work through dependencies like MariaDB, which does not support the new Docker configs and secrets features. Placing those in environment variables is not secure.

I'll be adding healthchecks too, for example
https://github.com/docker-library/healthcheck/blob/master/mysql/docker-healthcheck
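A minimal version of that idea baked into a Dockerfile could look like this (the base tag and probe command are assumptions; the linked script is more thorough and passes credentials properly):

    # hypothetical sketch: only report the database container healthy once the
    # server actually answers, not merely once the process has started
    FROM mariadb:10.1
    HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
      CMD mysqladmin ping --silent || exit 1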

@zhicwu
Owner Author

zhicwu commented Jul 21, 2017

Thanks Brandon. Your repo looks great and I'm going to copy and paste soon :D In real production deployments we still use environment variables at this point, although they're not secure enough. Orchestration tools like k8s or Docker Swarm have their own ways to manage secrets. I think I'll write a v3 docker-compose file and instructions for running the BI server in Docker Swarm - I actually tried that before and it didn't go well, although PDI works very well in Docker Swarm.

I wasn't aware of CBF, but it looks like it's discontinued. Is it still valid for 7.1 and the upcoming 8.0?

Docker Hub is not flexible enough to deal with dependencies, so I created two new branches (instead of tags or sub-directories) and configured automated builds accordingly.

Why do we need a health check? I added wait-for-it.sh to check database status before starting the BI server. Is the script used for the same purpose? For runtime monitoring, I think it's better to use Prometheus + Alertmanager + Grafana, which I should provide examples for too.
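For context, the wait-for-it.sh wiring is roughly the following (host, port and start script paths are placeholders, not the exact Dockerfile contents):

    # hypothetical sketch: block the BI server start until the database port
    # answers, so the container doesn't die with an initialization error
    COPY wait-for-it.sh /usr/local/bin/
    RUN chmod +x /usr/local/bin/wait-for-it.sh
    CMD ["/bin/bash", "-c", "wait-for-it.sh ${DB_HOST:-db}:3306 --timeout=120 -- /biserver-ce/start-pentaho.sh"]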

@usbrandon

usbrandon commented Oct 4, 2017

CBF2 is what you need to investigate: https://github.com/webdetails/cbf2. There was a CBF (v1), which was about automating the pulling of code and building of binaries. With CBF2 you provide the software and divide out the improvements, and their system of scripts and build process layers out a Docker image from that. They wrapped up some Docker commands in it. I would fully recommend playing with it as part of the planning process.

One of the observations, and hence your wait-for-it.sh script, is that Docker container startups wait for nothing. Once a container reaches a 'started' state, that may not mean the process inside it is ready to be used. That's a problem in a stack if you have dependencies. Health checks serve two purposes. First, they let us know when a container is ready. Second, if we have scaled horizontally (sharding, search nodes, etc.) and one of the instances fails (out of memory, etc.), the failed health check lets the Docker daemon know it can remove the container from the load balancing / ingress part of the ecosystem, basically allowing the system to clean up after zombie processes/containers. Not only that, but the failed health check could serve as a trigger to start a healthy instance somewhere else in the swarm.
I'm still learning about that and writing these things from memory.

I am fully on board with the Grafana stuff. That monitoring should probably be part of the stack, on at least one of the hosts. That's a really interesting area to pursue.

@zhicwu
Owner Author

zhicwu commented Oct 6, 2017

Thanks Brandon. I’ll definitely look into CBF2 and respond more here.

As to wait-for-it.sh, it's used to hold the BI server boot process until the external database is available (by watching a specific port on the database server); otherwise the BI server fails with an initialization error. This happens when the ops team restarts all servers/VMs in no particular order and the BI server comes up before the database. It's a better-than-nothing workaround; I think it would be better to write Java code to do the check instead. In Docker Swarm or k8s, services start in a similar way, even if you explicitly define dependencies among them. Yes, health checks are very useful for ensuring HA, and they're usually handled outside of the service. However, the OOM situation is already taken care of by the JVM and Docker - the JVM calls oom-killer.sh on the first OOM exception, and the Docker daemon restarts the container right after the JVM is killed. Scalability, on the other hand, is one of the weaknesses of the BI server, so I probably won't start more than two instances - actually only one in production :)
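For reference, the JVM side of that wiring is roughly the following (the script path and the variable used for JVM options are assumptions based on the description above):

    # hypothetical sketch: run a kill script on the first OutOfMemoryError so the
    # JVM exits quickly; a restart policy such as on-failure then lets the Docker
    # daemon bring the container back up
    ENV CATALINA_OPTS="-XX:OnOutOfMemoryError=/biserver-ce/oom-killer.sh"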

Yes, Grafana + Prometheus is the de facto standard for monitoring. Speaking of that, I'll tailor the metrics exposed by Tomcat, as most of them are not very helpful.
