
Conductor using Docker #14

Closed
blueelephants opened this issue Dec 23, 2016 · 25 comments

@blueelephants

Is there any Docker image available for Conductor (or at least are there plans for one)?
Searching on hub.docker.com didn't turn up any results.

Thanks a lot.

@jcantosz
Contributor

jcantosz commented Dec 27, 2016

EDIT: As of 2017-01-25, the Docker images have been incorporated into this repo. Please see the docker-compose setup in the docker folder. If an all-in-one Docker image is desired, use the Dockerfile in the serverAndUI subfolder together with config-local.properties.
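For reference, building and running that all-in-one image might look like this (a sketch only; the build context and host ports are assumptions, so verify against the Dockerfile):

cd docker/serverAndUI
docker build -t conductor:serverAndUI .
docker run -p 8080:8080 -p 3000:3000 conductor:serverAndUI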

Original comment:
Hey @blueelephants, I made a Docker image that runs the server from the getting started guide. The GitHub project is here: https://github.com/jcantosz/netflixConductor-Docker. The Docker Hub project is here: https://hub.docker.com/r/jcantosz/netflix-conductor-sample/. That might give you a starting point.

I'm going to see about running this on Bluemix and see if there are any issues to sort out (I imagine I will have to do some CORS configuration). EDIT: This worked without modification. Since the npm modules and the Gradle build are fetched at runtime, startup is a bit slow.

Hope this helps!

@v1r3n
Contributor

v1r3n commented Dec 29, 2016

@jcantosz the Docker configuration would be a good addition to the project. Feel free to submit a PR and we can publish a Docker image for each release.

@jcantosz
Contributor

Hey @v1r3n, I've created a PR for the Dockerfile I created (getting started guide code). If you create an official netflixoss version of the image, I would like to link to it from the README (instead of my Docker Hub repo).

@blueelephants
Author

@jcantosz: Many thanks for the Docker image; I will give it a try ASAP.

@blueelephants
Author

@jcantosz: Just a question concerning startup.sh.

What was your intention in putting the installation of the Node.js packages (npm install) into startup.sh and running it via ENTRYPOINT every time a container starts?

Or to put it differently, why not add this command to the Dockerfile using RUN, so that it runs only once when the Docker image is created/refreshed with docker build?
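For illustration, the difference comes down to this (a minimal sketch; the /ui path and script name are assumptions based on this image):

# Option A: bake the dependencies into the image; runs once, at docker build
RUN cd /ui && npm install

# Option B: fetch dependencies at container start; runs on every docker run
ENTRYPOINT ["/startup.sh"]   # startup.sh performs npm install before launching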

I appreciate any comments.

Many thanks.

@v1r3n
Contributor

v1r3n commented Jan 4, 2017

@jcantosz thanks for the PR. Let me go through it.

@v1r3n
Contributor

v1r3n commented Jan 5, 2017

@jcantosz how about bundling Dynomite and Elasticsearch along with the installation?
This would allow someone to run a fully persisted version of Conductor using a Docker image.

@jcantosz
Contributor

jcantosz commented Jan 6, 2017

@blueelephants There are a few reasons I did this on startup:

  1. Licensing - I did not want to search through each package and determine whether I was allowed to repackage and redistribute its code.
  2. Package versions - this method keeps the npm packages up to date.
  3. Image size - I did not check how much additional size this would add to the image.

I agree with you: requiring each user to re-download the npm modules may not be ideal. Once the licenses are sorted out, I'm happy to revisit this. I'd love to hear your thoughts!

@v1r3n See my comment on docker-compose in the PR. I think that would be a better way to go if we are able to separate Elastic/Dynomite from the project running conductor-server. Let me know your thoughts.

@v1r3n
Contributor

v1r3n commented Jan 9, 2017

@jcantosz I am trying to come up with a server module that can be put into a Docker image.
Agreed on the points for Elasticsearch; however, it is open source under the Apache 2.0 license, and I am assuming there are pre-built Docker images available for it.

For Dynomite, adding @shailesh33 and @ipapapa for suggestions.

@blueelephants
Author

Just another question for better understanding.

The Dockerfile from @jcantosz includes commands to install Tomcat within the Docker image:

...
# Install Tomcat
&& groupadd tomcat \
&& useradd -s /bin/false -g tomcat -d /opt/tomcat tomcat \ 
...

The startup.sh script builds and starts the "test-harness" code via Gradle:
./gradlew server &

Within Main.java, an embedded Jetty server gets created and used.

Do I understand correctly that, currently, the Tomcat server never actually gets used (because of the embedded Jetty server)?

If yes, those commands could be removed from the Dockerfile to reduce the image size.
(Leaving them in wouldn't do any harm, and once a "server JAR" is available, an external Tomcat/Jetty could be used again.)

@ipapapa
Contributor

ipapapa commented Jan 10, 2017

Had a brief discussion with @v1r3n about it. My proposal would be to have separate images for the app (Conductor) and the data layers (Dynomite and Elasticsearch). For example, you can build a self-contained Dynomite image based on the following: https://github.com/Netflix/dynomite/blob/dev/docker/HOWTO.md

That would make the scalability and deployment lifecycle of each layer independent --> microservices :)
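For reference, building a local Dynomite image per that HOWTO comes down to something like the following (a sketch; the exact Dockerfile location and tag are assumptions, the HOWTO is authoritative):

git clone https://github.com/Netflix/dynomite.git
cd dynomite
docker build -t dynomite:dev -f docker/Dockerfile .   # Dockerfile path is an assumption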

@v1r3n
Contributor

v1r3n commented Jan 11, 2017

As @ipapapa suggested, we should have separate Docker images for the app (Conductor), storage (Dynomite), and Elasticsearch.

test-harness is meant only as a quick demonstration of Conductor, so I would not recommend bundling it in a Docker image for production use. I have created task #29 to create a server module that can be used in a Docker image along with the UI (which could potentially be a separate Docker image too, as the scalability of the UI is a separate concern from the app server itself).

@jcantosz @blueelephants I will update this thread once I have the server module created.

@jcantosz
Contributor

Hey @blueelephants, you are exactly correct. I made that change in the pull request but did not update my repo. Thanks for catching that; I'll go back and make that update.

@ipapapa I agree with you completely. That, plus a docker-compose, would be very valuable.

@v1r3n Great, thanks! Agreed on the test-harness; it was the easiest path to creating an image (thus the notes on data persistence). If the test harness is going to be the entry point for new users, though, a Docker image tagged with test-harness might not be a bad idea.
A note about the licensing I mentioned: Dynomite and Elastic both seem to be fine with redistribution of binaries; I did not individually check the npm modules used for the UI.

@v1r3n
Contributor

v1r3n commented Jan 14, 2017

@jcantosz I have created a server module, with details on how to configure and use it:

https://netflix.github.io/conductor/server/

It uses an embedded Jetty container for the server; you might want to update your Dockerfile to use this instead of test-harness.
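A Dockerfile for that could be as small as the following (a minimal sketch; the jar and config file names are assumptions, see the server docs above for the actual build output and startup command):

FROM java:8
COPY conductor-server-all.jar /app/conductor-server.jar     # jar name is an assumption
COPY config.properties /app/config.properties               # config per the server docs
EXPOSE 8080
CMD ["java", "-jar", "/app/conductor-server.jar", "/app/config.properties"]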

The server module is evolving; I think it needs more fine-grained controls (thread pool size, etc.) for production readiness. Feel free to suggest changes or send a PR if you think it can be done better.

I would also suggest updating the documentation with instructions on how to use the Docker image (there is a section at the link above that can be updated).

Thanks

@jcantosz
Contributor

jcantosz commented Jan 16, 2017

@v1r3n: Thanks for the update! I'll begin looking at a compose for conductor/dynomite/elastic. I'm looking at the config for Dynomite, and I'm not sure how the dynamic cluster information will work with this.
It looks like @ipapapa created this Dockerfile that uses this config yml. Could you tell me where I can set my workflow.dynomite.cluster.hosts, workflow.dynomite.cluster.name, workflow.namespace.prefix, and workflow.namespace.queue.prefix on the Dynomite side? (I assume listen: 127.0.0.1:8102 defines my host:port, but I don't see a rack.) Also, do you know if there is a published version of this Docker image? I didn't see anything on the netflixoss Docker Hub.

@ipapapa
Contributor

ipapapa commented Jan 17, 2017

@jcantosz I do not think I have added it to the NetflixOSS Docker Hub. You can find the Dynomite Docker setup in the Dynomite repo (see the HOWTO linked above). The properties you mention are set at the application/Dyno level, not at the Dynomite level.

@jcantosz
Contributor

@ipapapa Thanks for the information. So to do this compose correctly, it sounds like I would need to create a Java project for Dyno, publish the jar, run it in a Dockerfile, and use that to connect to Dynomite, and this would expose the functionality to Conductor? Is that correct?

@v1r3n I think for now I will look into the pure Redis implementation instead of Dynomite. Can you point me to the configuration parameters for Redis?

@v1r3n
Contributor

v1r3n commented Jan 17, 2017

@jcantosz re: the Dyno project, you do not need that; Conductor already takes care of it. To use Dynomite, you just need the host and port of the Dynomite servers; provide them in the Conductor server config and it will take care of the connections.

re: pure Redis, here are the two config properties you need:

db=redis
# Host and port of the Redis server, plus a shard name. The shard name can be anything; I have put in "a".
workflow.dynomite.cluster.hosts=localhost:6379:a
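For illustration, a stock Redis container satisfying that config can be started like this (a sketch; the image tag is an assumption):

docker run -d --name conductor-redis -p 6379:6379 redis:3.2
# with the properties above, Conductor then connects to localhost:6379, shard "a"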

@jcantosz
Contributor

@v1r3n That is a relief, thanks for letting me know!
I have a compose that pulls up Conductor with the in-memory DB and Elasticsearch; I will begin converting it to use Dynomite or Redis and will create a new PR when that is done.

@v1r3n
Contributor

v1r3n commented Jan 17, 2017

@jcantosz we can use a Docker image for Dynomite. However, I am not sure how to specify networking details like the host/IP addresses of the images in a compose. If we can, it should be easy to create a docker-compose with Conductor, Dynomite, and Elasticsearch.

@jcantosz
Contributor

@v1r3n The docker-compose exposes the ports, and each service's name in the docker-compose file becomes the hostname of its container on the compose network.
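For illustration, a skeleton along these lines (a sketch only; the image names, rack value, and ES property name are assumptions, while workflow.dynomite.cluster.hosts is the property discussed above):

version: '2'
services:
  elasticsearch:
    image: elasticsearch:2.4            # a 2.x tag to match the Conductor ES client
    ports:
      - "9200:9200"
      - "9300:9300"
  dynomite:
    image: dynomite:dev                 # locally built image; name is an assumption
    ports:
      - "8102:8102"
  conductor:
    image: conductor:server             # image name is an assumption
    ports:
      - "8080:8080"
    depends_on:
      - elasticsearch
      - dynomite
    # in the server config, the service names resolve as hostnames, e.g.:
    #   workflow.dynomite.cluster.hosts=dynomite:8102:rack1   (rack name is an assumption)
    #   workflow.elasticsearch.url=elasticsearch:9300         (property name per the server docs; verify)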

Please take a look at PR #37. The compose appears to work: the UI comes up on localhost:3000, and I was able to pull up Swagger on 8080. I do not have much experience with Dynomite or Elastic, so I was hoping you could verify that they are being used by Conductor as intended.

@v1r3n
Contributor

v1r3n commented Jan 19, 2017

@blueelephants we just added a docker-compose with images for Dynomite, Conductor, and Elasticsearch. Give it a go. Instructions here:

https://github.com/Netflix/conductor/tree/dev/docker
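In short, that comes down to roughly the following (the linked instructions are authoritative):

git clone https://github.com/Netflix/conductor.git
cd conductor
git checkout dev          # the compose lives on the dev branch, per the link above
cd docker
docker-compose up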

I am working on getting an official Netflix OSS Conductor image published to Docker Hub in the meantime.

@blueelephants
Author

@v1r3n many thanks for the Docker images; I tried them and can provide some feedback.
My test was on my development machine:

  • Ubuntu 16.10
  • Docker 1.13.0, build 49bf474

First I ran into one of those "beginner's" problems when starting Elasticsearch via docker-compose:

ERROR: bootstrap checks failed
max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]

A proposed solution can be found here: docker-library/elasticsearch#98

In my case, the following command let me start the Elasticsearch Docker image:

sudo sysctl -w vm.max_map_count=262144
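(To make that setting survive reboots, the standard sysctl approach is to persist it in /etc/sysctl.conf:)

echo "vm.max_map_count=262144" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p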

But I still have some problems/issues where I would like to ask you for advice.

I can start the "kitchensink" workflow via the Swagger UI, but when I switch to the Node.js UI and click on "Running", I don't see the running workflow; instead I get the following error message:

"Error: Internal Server Error
   at Request.callback (/ui/node_modules/superagent/lib/node/index.js:698:17)
   at IncomingMessage.<anonymous> (/ui/node_modules/superagent/lib/node/index.js:922:12)
   at emitNone (events.js:91:20)
   at IncomingMessage.emit (events.js:185:7)
   at endReadableNT (_stream_readable.js:974:12)
   at _combinedTickCallback (internal/process/next_tick.js:74:11)
   at process._tickCallback (internal/process/next_tick.js:98:9)\n"

In the "test-harness version" I also got this error, when I clicked on "Running", "Failed", etc. but only if no workflow was started! Once a workflow was started the UI worked.

Interestingly, in the UI, I can see all task-types and workflow definitions (even when I add some of mine)

Having said that, I can also poll/complete for tasks of a running workflow => Conductor Server is working in the background.

But when taking a look at the console where docker-compose prints output I can also see these error messages:

conductor_1      | 164553 [elasticsearch[Phantom Rider][generic][T#3]] INFO  org.elasticsearch.client.transport  - [Phantom Rider] failed to get local cluster state for {#transport#-1}{172.19.0.3}{172.19.0.3:9300}, disconnecting...
conductor_1      | NodeDisconnectedException[[][172.19.0.3:9300][cluster:monitor/state] disconnected]
elasticsearch_1  | [2017-01-21T16:44:02,250][WARN ][o.e.t.n.Netty4Transport  ] [Ylg2kaW] exception caught on transport layer [[id: 0x5b7f06d4, L:/172.19.0.3:9300 - R:/172.19.0.4:48410]], closing connection
elasticsearch_1  | java.lang.IllegalStateException: Received message from unsupported version: [2.0.0] minimal compatible version is: [5.0.0]
elasticsearch_1  | 	at org.elasticsearch.transport.TcpTransport.messageReceived(TcpTransport.java:1236) ~[elasticsearch-5.1.1.jar:5.1.1]
elasticsearch_1  | 	at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:74) ~[transport-netty4-5.1.1.jar:5.1.1]
elasticsearch_1  | 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:373) [netty-transport-4.1.6.Final.jar:4.1.6.Final]
elasticsearch_1  | 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:359) [netty-transport-4.1.6.Final.jar:4.1.6.Final]
elasticsearch_1  | 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:351) [netty-transport-4.1.6.Final.jar:4.1.6.Final]
elasticsearch_1  | 	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:293) [netty-codec-4.1.6.Final.jar:4.1.6.Final]
elasticsearch_1  | 	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:280) [netty-codec-4.1.6.Final.jar:4.1.6.Final]
elasticsearch_1  | 	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:396) [netty-codec-4.1.6.Final.jar:4.1.6.Final]
elasticsearch_1  | 	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:248) [netty-codec-4.1.6.Final.jar:4.1.6.Final]
elasticsearch_1  | 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:373) [netty-transport-4.1.6.Final.jar:4.1.6.Final]
elasticsearch_1  | 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:359) [netty-transport-4.1.6.Final.jar:4.1.6.Final]
elasticsearch_1  | 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:351) [netty-transport-4.1.6.Final.jar:4.1.6.Final]
elasticsearch_1  | 	at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) [netty-transport-4.1.6.Final.jar:4.1.6.Final]
elasticsearch_1  | 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:373) [netty-transport-4.1.6.Final.jar:4.1.6.Final]
elasticsearch_1  | 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:359) [netty-transport-4.1.6.Final.jar:4.1.6.Final]
elasticsearch_1  | 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:351) [netty-transport-4.1.6.Final.jar:4.1.6.Final]
elasticsearch_1  | 	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1334) [netty-transport-4.1.6.Final.jar:4.1.6.Final]
elasticsearch_1  | 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:373) [netty-transport-4.1.6.Final.jar:4.1.6.Final]
elasticsearch_1  | 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:359) [netty-transport-4.1.6.Final.jar:4.1.6.Final]
elasticsearch_1  | 	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:926) [netty-transport-4.1.6.Final.jar:4.1.6.Final]
elasticsearch_1  | 	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:129) [netty-transport-4.1.6.Final.jar:4.1.6.Final]
elasticsearch_1  | 	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:651) [netty-transport-4.1.6.Final.jar:4.1.6.Final]
elasticsearch_1  | 	at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:536) [netty-transport-4.1.6.Final.jar:4.1.6.Final]
elasticsearch_1  | 	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:490) [netty-transport-4.1.6.Final.jar:4.1.6.Final]
elasticsearch_1  | 	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:450) [netty-transport-4.1.6.Final.jar:4.1.6.Final]
elasticsearch_1  | 	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:873) [netty-common-4.1.6.Final.jar:4.1.6.Final]
elasticsearch_1  | 	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_111]

=> Some problem with Elasticsearch which has a side effect on the UI?

Elasticsearch seems to run, but the Conductor server doesn't seem to be able to contact it:

I get empty output when I check the indices:

# curl http://127.0.0.1:9200/_cat/indices?v
health status index     pri rep docs.count docs.deleted store.size pri.store.size

In contrast, in the "test-harness" version I could see the conductor index (which, as far as I know, gets created by the Conductor server during startup):

# curl http://127.0.0.1:9200/_cat/indices?v
health status index     pri rep docs.count docs.deleted store.size pri.store.size 
yellow open   blog        5   1          4            0     17.3kb         17.3kb 
yellow open   conductor   5   1          3            1     28.9kb         28.9kb

I appreciate any help.

@v1r3n
Contributor

v1r3n commented Jan 24, 2017

@blueelephants the error you are seeing with ES is due to a version mismatch: we built Conductor against the 2.x Elasticsearch client, while the default Docker image for Elasticsearch runs 5.x, causing the errors.
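For anyone pinning the image manually in the compose file, a 2.x tag avoids the mismatch (a sketch; the dev branch fix may use a different tag):

elasticsearch:
  image: elasticsearch:2.4
  ports:
    - "9200:9200"
    - "9300:9300"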

I have fixed this in the current DEV branch. Give it a shot and let me know if that works.

The UI giving errors when there are no workflow executions is a known bug. For now, I am starting a new instance of kitchensink on Docker startup to avoid this.

@v1r3n
Contributor

v1r3n commented Feb 15, 2017

@blueelephants closing this issue. Please reopen if the issue still persists.

@v1r3n v1r3n closed this as completed Feb 15, 2017