Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use docker-py module in place of compose file #25

Merged
merged 26 commits into from
Jul 15, 2020

Conversation

22PoojaGaur
Copy link
Member

@22PoojaGaur 22PoojaGaur commented Jun 3, 2020

This PR is for issue #2

Done -

  • Implemented functions to create containers for solr, postgres, tomcat and intermine-builder
  • call the functions where needed
  • functionality to both build image then start container and pull image then start container.

Todo -

  • error handling not done.

@22PoojaGaur 22PoojaGaur changed the title User docker-py module in place of compose file Use docker-py module in place of compose file Jun 8, 2020
@22PoojaGaur
Copy link
Member Author

22PoojaGaur commented Jun 8, 2020

Hi @uosl @leoank ,
Please review the PR.

@22PoojaGaur
Copy link
Member Author

22PoojaGaur commented Jun 8, 2020

I have a few doubts -

  1. here, we are still just checking if config is same. But even with same config a user can manually delete .Dockerfile
    So, we should include check if file exist or not, right?

  2. The main work here is with docker image and docker container, so main place where I think unit test can be added is if I separate getting environment variables from functions. How does it sound to add unit test for that? Is there other place in these changes to add unit test?

  3. In PR#10 on docker-intermine-gradle for issue Update intermine_builder to handle an already built mine #12, we updated build.sh and compose files for handling .intermine folder.
    I think those changes have not reflect on dockerhub as it shows error for .intermine folder.

  4. For error handling, usually dockerclient.containers.run exits without error. For seeing the error, I have to look at container.logs(). How should I handle or show error in such case?

intermine_boot/intermine_docker.py Outdated Show resolved Hide resolved
intermine_boot/intermine_docker.py Outdated Show resolved Hide resolved
intermine_boot/intermine_docker.py Outdated Show resolved Hide resolved
@leoank
Copy link
Member

leoank commented Jun 8, 2020

  • Checking for the existence of Dockerfile is a good idea. Add checks around it.
  • I'll have to look into it.
  • I'll update the dockerhub images later today. Or better add a github workflow to automatically build and push images to dockerhub. Have a look at https://github.com/marketplace/actions/build-and-push-docker-images
  • For errors in container logs, you can do some string processing with regex to look for a specific string pattern.

@heralden
Copy link
Member

heralden commented Jun 8, 2020

Regarding error handling, you should check if you can get the exit code from running client.containers.run. If any of them are non-zero, you can display a message that says "Container X exited with code N" along with the contents of stderr, instead of the usual "Build completed. Visit http://localhost:9999/biotestmine to access your mine.".

@heralden
Copy link
Member

heralden commented Jun 8, 2020

I created the issue #26. This means you won't need to validate that the dockerfiles exist since they'll be included with intermine_boot. @22PoojaGaur You are welcome to include it in this PR, or tackle it later, depending on what you prefer.

@heralden
Copy link
Member

heralden commented Jun 8, 2020

I don't see any good places to put unit tests in the changes so far. If you write tests for obtaining the envvars, you'd basically be testing os.environ.get (we only want to test the code we write). If you refactor after Ank's advice for the default values, you can place the calls to os.environ.get directly inside the dict, making it pretty clean and easy to spot typos.

@22PoojaGaur
Copy link
Member Author

There is still a issue in running. Now all the containers run but the status is still 404

I printed the logs and I can not see any major issue there. I am not sure how can I debug it now. can you suggest something?
Screenshot from 2020-06-24 17-38-38

@heralden
Copy link
Member

There is still a issue in running. Now all the containers run but the status is still 404

I printed the logs and I can not see any major issue there. I am not sure how can I debug it now. can you suggest something?

My first thought is do you wait for the intermine_builder container to finish? All the other containers should continue running, while the intermine_builder container exits once the mine is ready. The logs for it seem very short, so maybe intermine_boot exits before it finishes (meaning the web server will only be ready in the 30 minutes or so it takes to build).

@22PoojaGaur
Copy link
Member Author

There is still a issue in running. Now all the containers run but the status is still 404
I printed the logs and I can not see any major issue there. I am not sure how can I debug it now. can you suggest something?

My first thought is do you wait for the intermine_builder container to finish? All the other containers should continue running, while the intermine_builder container exits once the mine is ready. The logs for it seem very short, so maybe intermine_boot exits before it finishes (meaning the web server will only be ready in the 30 minutes or so it takes to build).

ok. It can be the case.

I think the command starts the container and returns control. I am not sure whether it keeps the process running in container.

When I did separately I was in control of letting it run, maybe that's why it was possible.

Let me read more about command and see if it's a possible issue.

@22PoojaGaur
Copy link
Member Author

Hi @uosl

  1. Following your suggestion, I printed logs while a container was running so that the script does not stop before containers are build.
    But to see that a container is built, I have just done a string search for the last expected log.
    Is there a better way to do this? Like checking exit code of the Dockerfile script or something?

  2. Now, the containers for solr, tomcat and postgres run successfully. But intermine_builder get stuck at a "sed" command error (shown in attached screenshot).
    I think this maybe because of some error in file naming from my side which I'll need to debug. Can you think of any other reason this might be the case?
    Screenshot from 2020-06-29 19-06-44

@heralden
Copy link
Member

  1. Following your suggestion, I printed logs while a container was running so that the script does not stop before containers are build.
    But to see that a container is built, I have just done a string search for the last expected log.
    Is there a better way to do this? Like checking exit code of the Dockerfile script or something?

Firstly, I think it would be a good idea to pass detach=False to the intermine_builder container, since we need to wait for that anyways. According to the documentation, it will raise docker.errors.ContainerError If the container exits with a non-zero exit code and detach is False. This means you can catch that error and handle the case where intermine_builder fails so you can display an appropriate message (which is probably the contents of STDERR from intermine_builder, along with a helpful note that it failed).

When using detach=False, it will block and not return until intermine_builder exits, so this would also be a good way to know when it's done building.

For the other containers you'll be using detach=True so you won't know if they exit. I think going through each of those containers once intermine_builder has finished, checking that they're still running (if they're not running, something has gone wrong so you can use container.logs to get STDERR from the container and print it to the user) is a decent approach.

  1. Now, the containers for solr, tomcat and postgres run successfully. But intermine_builder get stuck at a "sed" command error (shown in attached screenshot).
    I think this maybe because of some error in file naming from my side which I'll need to debug. Can you think of any other reason this might be the case?

I found this stackoverflow answer, which make it seem the error message occurs when the string contains / characters. It sounds to me like one of the variables IM_DATA_DIR or DATA_DIR have a / in them. I think you should check if that's true, then determine if they're supposed to be there (it could be a trailing slash that's unnecessary) and then either avoid having the slash or handle it by using a different sed delimiter (the stackoverflow answer I linked has an example of using ~ at the delimiter).

@heralden
Copy link
Member

heralden commented Jun 29, 2020

@22PoojaGaur Here's some things you should make sure is done in this PR after you merge master into your branch (to get the changes from #27):

  • move data volume dirs out from docker-intermine-gradle dir (docker-intermine-gradle dir should only be used when building images)
  • replace old code creating volume dirs with doing it manually (this might be done already?)
  • make sure to delete volume directories from new location, instead of old code deleting repo
  • remove dependency on docker-compose files (also remove from manifest.in)

Please let me know if anything is unclear!

@22PoojaGaur
Copy link
Member Author

@22PoojaGaur Here's some things you should make sure is done in this PR after you merge master into your branch (to get the changes from #27):

* [ ]  move data volume dirs out from docker-intermine-gradle dir (docker-intermine-gradle dir should only be used when building images)

* [ ]  replace old code creating volume dirs with doing it manually (this might be done already?)

* [ ]  make sure to delete volume directories from new location, instead of old code deleting repo

* [ ]  remove dependency on docker-compose files (also remove from `manifest.in`)

Please let me know if anything is unclear!

I have handled these points.

@22PoojaGaur
Copy link
Member Author

22PoojaGaur commented Jul 6, 2020

Hi @uosl

Thanks for the hint, The issue with sed was resolved.

But after that the script seems to get stuck on creating postgres database
Screenshot from 2020-07-06 20-33-07

I dug a little deeper,
instructure/lti_tool_provider_example#4
Here, it says some changes. But, if it was the case the changes should be handled in building container and it works with simple compose file

After that, I went ahead to look the container status and details which shows this error.
Screenshot from 2020-07-06 20-24-30

(This occurs on both master branch and issue branch so I think this might not be a problem though)

Do you have an idea what might be causing that.

@heralden
Copy link
Member

heralden commented Jul 6, 2020

Alright, so I believe the Postgres is unavailable and docker-healthcheck errors are two separate problems. You can ignore the docker-healthcheck error as it occurs because the script is implemented incorrectly in the postgres dockerfile (we'll fix this in the future, but it's pretty low priority for now).

When using docker-compose, I'm guessing it binds the postgres host to the IP address of the postgres docker container (since our postgres container is separate from the intermine_builder container, the workaround in the github issue you linked won't work). Now that we're using docker-py instead, the Postgres is unavailable error means that the postgres host doesn't resolve to an IP address.

I think this happens because the containers aren't connected with each other, and we need to use the docker network API.
https://docker-py.readthedocs.io/en/stable/networks.html
After a quick look at the documentation, it seems you need to create a network and connect the docker containers to it, making sure to specify aliases for each of them (eg. postgres for the postgres container, so intermine_builder is able to reach it by that name).
It would probably be helpful to reference how docker-compose creates a network, since we know that's the minimum we need to get it working.

I hope this made sense; please let me know (here or on chat) if you're having trouble figuring out the documentation!

@22PoojaGaur
Copy link
Member Author

Hi @uosl

I looked into it and found out that compose file makes a network with name docker-intermine-gradle_default.

So, I created a docker network with name intermine_boot and attached all containers to it. After that I also added alias for postgres, solr, tomcat to the intermine_builder container.

But it does not solve the issue. after some searching it appears the linking in docker-py does not work as expected. This issue details it https://stackoverflow.com/questions/37144357/link-containers-with-the-docker-python-api

I will try the solution suggested there if that solves the problem and let you know in case problem is not resolved.

@heralden
Copy link
Member

heralden commented Jul 9, 2020

It may be worth to try passing name='postgres' in client.containers.run, and similarly for the other containers. Apparently if they're in the same network, they'll be able to reach each other by name, without having to setup links or aliases.

@22PoojaGaur
Copy link
Member Author

It may be worth to try passing name='postgres' in client.containers.run, and similarly for the other containers. Apparently if they're in the same network, they'll be able to reach each other by name, without having to setup links or aliases.

ok, I can give it a try.

I have set up the name as per compose file intermine_postgres, intermine_solr and so on. As far as I read, setting up aliases helps to refer them by alias name.

@22PoojaGaur
Copy link
Member Author

Hi @uosl

Finally everything is working!
Your tip helped to rename the containers =D

There are few minor things in my mind which I'll fix ->

  1. it does not check if network already exist. Should not create a new network if it exist.
  2. Intermine builder container is run in detach =False mode as suggested. But it does not print the building logs. I looked through documentation and github issues but the logs are not printing in detach = False mode.

@heralden
Copy link
Member

Great to hear! 🎉

Intermine builder container is run in detach =False mode as suggested. But it does not print the building logs. I looked through documentation and github issues but the logs are not printing in detach = False mode.

Have you tried passing stream=True?
stream (bool) – If true and detach is false, return a log generator instead of a string. Ignored if detach is true. Default: False.

Hopefully for log in intermine_builder_container: print(log) should work then.

@22PoojaGaur
Copy link
Member Author

Refactored the code and added log statements for intermine builder.

Copy link
Member

@heralden heralden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation looks great! 🚀

I added some comments, mostly small things we can clean up (I know 11 sounds like a lot, but I promise most of them are quick 😅).

intermine_boot/commands.py Show resolved Hide resolved
intermine_boot/commands.py Outdated Show resolved Hide resolved
intermine_boot/intermine_docker.py Outdated Show resolved Hide resolved
intermine_boot/intermine_docker.py Outdated Show resolved Hide resolved
intermine_boot/intermine_docker.py Outdated Show resolved Hide resolved
intermine_boot/intermine_docker.py Outdated Show resolved Hide resolved
intermine_boot/intermine_docker.py Outdated Show resolved Hide resolved
intermine_boot/intermine_docker.py Show resolved Hide resolved
intermine_boot/intermine_docker.py Outdated Show resolved Hide resolved
intermine_boot/intermine_docker.py Show resolved Hide resolved
@22PoojaGaur
Copy link
Member Author

@uosl handled all comments.

Copy link
Member

@heralden heralden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Great job [=

@heralden heralden merged commit d1f55ff into intermine:master Jul 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants