Add "devshell" Docker container with all tools required for a dev env pre-installed #2559

Closed
wants to merge 22 commits into from

vorburger
Contributor

@konzz
Member

konzz commented Oct 6, 2019

Hi @vorburger, I think this is just awesome, I'll have a look at it asap

RUN dnf install -y git findutils less which \

# Install system packages specifically required by Uwazi
libpng12-devel libpng-devel mongodb-org-shell mongodb-org-tools poppler
Contributor

How do you control the poppler version? I think it needs to be 0.26 specifically.

Contributor Author

How do you control the poppler version? I think it needs to be 0.26 specifically.

@bdittes We can't control the version of an (RPM or DEB) OS package; e.g. on the fedora:30 image used as the base for this container, the poppler-utils package comes with pdftotext -v => pdftotext version 0.73.0 (see the Packaged Versions on https://poppler.freedesktop.org).

The node:8-jessie used as base image over in https://github.com/fititnt/uwazi-docker/blob/master/Dockerfile indeed includes the much older pdftotext version 0.26.5.

@konzz @txau (or @fititnt ?) are there any known incompatibilities in up-to-date Poppler versions like 0.73.0 which make them totally unusable for Uwazi? I guess it would be possible to try to rebuild an ancient version from source (in another Dockerfile), but unless there are very strong reasons, why would we?

PS: I noticed while checking these versions that poppler was actually the wrong 😈 package by mistake, it's poppler-utils - have just fixed that in a new commit.

Contributor Author

BTW the https://github.com/huridocs/uwazi/blob/development/.circleci/config.yml, on both lines 29 and 89, also just does sudo apt-get install poppler-utils. So depending on which version of Debian CircleCI runs, which does not seem to be fixed and which they could change at any time (I've actually run into similar problems on TravisCI in another project in the past), you get a Poppler version you can't control in CI either.
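
Since neither the base image nor CI pins Poppler, one option is to make the installed version visible and fail early when it is older than some floor. A minimal sketch, assuming `pdftotext -v` prints a line of the form `pdftotext version 0.73.0` as quoted above; the function names are made up for illustration and nothing here is part of Uwazi:

```shell
#!/bin/sh
# Hypothetical helpers, not part of Uwazi.

# Read `pdftotext -v` style output on stdin and print only the version number.
parse_poppler_version() {
  sed -n 's/^pdftotext version \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Succeed when dotted version $1 is greater than or equal to dotted version $2.
# Components are compared numerically; missing components count as 0.
version_at_least() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    n = split(a, x, "."); m = split(b, y, ".")
    for (i = 1; i <= (n > m ? n : m); i++) {
      if (x[i] + 0 > y[i] + 0) exit 0
      if (x[i] + 0 < y[i] + 0) exit 1
    }
    exit 0
  }'
}
```

In a build or CI step this could be used as `version_at_least "$(pdftotext -v 2>&1 | parse_poppler_version)" 0.26 || exit 1`; the `2>&1` is there because the version banner may be written to stderr, and the 0.26 floor is only the value mentioned in this thread, not a verified requirement.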

Member

@RafaPolit RafaPolit Oct 24, 2019

Locally I use poppler v0.74.0, installed from the cran/poppler repo (https://askubuntu.com/a/1123774), and have had no problems at all.

I think we should change the dependency version in the README.

@konzz
Member

konzz commented Oct 7, 2019

Hi @vorburger I'm having some problem testing this.

These are the containers that show up when I run sudo docker ps:

CONTAINER ID   IMAGE               COMMAND                  CREATED          STATUS          PORTS                                              NAMES
4024687a70b1   uwazi_uwazi         "/bin/sh -c 'sleep i…"   38 minutes ago   Up 22 seconds   0.0.0.0:32779->3000/tcp, 0.0.0.0:32778->8080/tcp
4b6066c3afdd   mongo:3.4-xenial    "docker-entrypoint.s…"   38 minutes ago   Up 23 seconds   0.0.0.0:32777->27017/tcp                           uwazi_mongo_1
de2cd4fb6c1e   elasticsearch:5.6   "/docker-entrypoint.…"   38 minutes ago   Up 25 seconds   9300/tcp, 0.0.0.0:32776->9200/tcp

Then I run sudo docker exec -it uwazi_uwazi_1 bash to enter the container
and when I run yarn blank-state I get the following error

Deleting uwazi_development database
MongoDB shell version v3.4.23
connecting to: mongodb://127.0.0.1:27017/uwazi_development
2019-10-07T08:53:50.944+0000 W NETWORK  [thread1] Failed to connect to 127.0.0.1:27017, in(checking socket for error after poll), reason: Connection refused
2019-10-07T08:53:50.944+0000 E QUERY    [thread1] Error: couldn't connect to server 127.0.0.1:27017, connection attempt failed :
connect@src/mongo/shell/mongo.js:240:13
@(connect):1:6
exception: connect failed
2019-10-07T08:53:54.499+0000	Failed: error connecting to db server: no reachable servers

The only difference I noticed, is that in your demo, the containers expose the ports to 3000

@txau
Collaborator

txau commented Oct 7, 2019

@vorburger just in case it makes sense to work together with @fititnt , he has been maintaining a dockerized Uwazi: fititnt/uwazi-docker#12

@fititnt

fititnt commented Oct 7, 2019

Great initiative! I Did not tested your submission at this moment, but let me say upfront, @vorburger, even if the license already is Public Domain, you are free to copy/adapt or look around the https://github.com/fititnt/uwazi-docker, no need to mention authors and etc. The user base for humanitarian open source is already small compared to open source in general, and even though Docker/containers are pretty popular, it is hard to find more people to review or test.

@txau @konzz I'm also OK if in the next weeks we eventually have separate repository at @huridocs with some container version of Uwazi, with the license, copyright, authors, etc. being the same as Uwazi's. I can even send a formal Contributor License Agreement (CLA) equivalent via e-mail if this eventually becomes more than just this PR.

One of the issues I have on fititnt/uwazi-docker is that we have few people who already know DevOps and are testing/using Uwazi. So if, after this PR (merged or not), the people from @huridocs think we could eventually have some minimal number of people external to Uwazi to discuss in some place (like a different repository under the @huridocs) only about this, I would like to be part of it. One issue with having an "easy to deploy" option for software as complex as Uwazi is supporting first-time developers. It is not that Uwazi is hard, but the requirements can demand a lot of skills from someone who wants to test it.

But for a Docker/Docker Compose setup committed to the main repository, I think choosing the simplest solution, the one easiest for the core developers of Uwazi to maintain, tends to be better, as it will need to be updated along with the Uwazi source more often. In the short term, this may mean being very explicit that the Docker~ files are for development only and do not try to be backward compatible (only changing the Docker~ files when that will not break anything would require much more testing). This approach could reduce the fear of adopting some "official" Docker alternative, and it would in fact become the reference for what needs to change in the next releases for those who use containers. Maybe a second step (that does not need to be soon, but is an option if there are more people than just me) could be to have a dedicated repository on @huridocs just for a "not only development" way of deploying Uwazi, maybe more focused on a 'Community Contributed dockerized version', with more explicit disclaimers of what it is and what it is not.

(I will also ping @vasyugan, just to be aware of this)

@vorburger
Contributor Author

Hi @vorburger I'm having some problem testing this.

Thank you for your interest in this PR and for testing it so quickly! 🥇

The only difference I noticed, is that in your demo, the containers expose the ports to 3000

The difference most likely is that I don't actually use "real" Docker and Docker Compose (from docker.com), but instead Podman with Podman Compose, and while containers should be largely portable and compatible (that's the whole point of containers, after all), the tooling may have subtle differences, such as how exactly shared networking is set up.

Just FYI I use Fedora 30 as host OS, but Podman is now available e.g. for Ubuntu or Debian as well, just in case anyone wants to try it; the reason why I much prefer Podman over Docker is that it's more secure.

(...)
exception: connect failed

OK, so for you, the uwazi container can't connect to 127.0.0.1:27017, to the mongodb container.

Reading https://docs.docker.com/compose/compose-file/#links, and looking at @fititnt's docker-compose.yml for inspiration (thanks for sharing!), we find that he used the MongoDB (and ElasticSearch) containers' names as hostnames to connect to, instead of localhost and port. I gather that he's doing that by setting the DBHOST and ELASTICSEARCH_URL environment variables, which seems easy enough, so that's probably a better solution - I'll change this PR to use that approach, and re-test it.
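
The same idea can be sketched on the shell-script side: read the hosts from the environment and fall back to the current localhost behaviour when nothing is set. `DBHOST` and `ELASTICSEARCH_URL` are the variables named above; the helper names, the port numbers, the `DATABASE_NAME` variable, and its `uwazi_development` default (taken from the log output earlier in this thread) are assumptions for illustration only:

```shell
#!/bin/sh
# Hypothetical helpers; Uwazi's own scripts may be structured differently.

# Print the MongoDB connection string, defaulting to the pre-container behaviour.
mongo_url() {
  printf 'mongodb://%s:27017/%s\n' "${DBHOST:-127.0.0.1}" "${DATABASE_NAME:-uwazi_development}"
}

# Print the Elasticsearch base URL, defaulting to a local node on port 9200.
elastic_url() {
  printf '%s\n' "${ELASTICSEARCH_URL:-http://localhost:9200}"
}
```

With `DBHOST=mongo` set by the compose file, `mongo_url` prints `mongodb://mongo:27017/uwazi_development`; with nothing set it prints the 127.0.0.1 address that the failing run tried to reach.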

@vorburger
Contributor Author

I'll change this PR to use that approach, and re-test it.

This would need #2564 to be addressed, first.

@vorburger
Contributor Author

Great initiative! I Did not tested your submission at this moment, but let me say upfront, @vorburger, even if the license already is Public Domain, you are free to copy/adapt or look around the https://github.com/fititnt/uwazi-docker

Cool, thanks for clarifying that here. BTW it's great to see https://github.com/fititnt/uwazi-docker#license; clear licenses are important in open source.

no need to mention authors and etc.

Credit where credit is due! 😸

eventually have separate repository at @huridocs with some container version of Uwazi
(...) to discuss in some place (like a different repository under the @huridocs) only about this
could be have a dedicated repository on @huridocs

IMHO it's just simpler in practice to keep things like Dockerfiles etc. as part of the main repository. Having a few files in a repo IMHO does not automatically imply guaranteed professional support from an entity - that can be clarified elsewhere.

@fititnt

fititnt commented Oct 8, 2019

@vorburger I would have no problem switching from Debian to CentOS/Red Hat, or using Podman instead of docker-compose directly. I just saw "rootless" and "k3s", so you definitely have my attention if we go further.

One thing we could consider (not for this PR, because it can take more time than ideal) is to choose some LTS version as the base image (for the container running the Uwazi app; the others are not important here). It can make things easier later and avoid surprises, while still keeping control. (One issue we had at fititnt/uwazi-docker#37: one change in the intermediate image FROM node:8-slim sadly broke the chance of anyone building older fititnt/uwazi-docker releases without changing the code.)

IMHO it's just simpler in practice to keep things like Dockerfiles etc. as part of the main repository.

That's one option too, but (even if just to make it easier to reduce image size, since a production image does not need dependencies that are used only once to generate assets) we could have at least 2 Dockerfiles for the app instead of a separate repository: one more "production"-like, and another with no optimizations, closer to what a developer uses daily, with debug tools, etc. I'm saying this considering that one ideal scenario would eventually be (again, in the future, not in this PR) to have the container image built automatically and pushed to some Docker registry. I have not started this even on my personal account, which is why everyone has had to build the images locally.

Anyway, at least for this first PR, I think it tends to be better to keep it simple, even if that means fewer features (but fewer bugs). The more specific comments or ideas could be raised later.

@vorburger
Contributor Author

use MongoDB (and ElasticSearch) containers' names as hostnames to connect to, instead of localhost (...) by setting the DBHOST and ELASTICSEARCH_URL environment variables

I'll change this PR to use that approach, and re-test it.

This would need #2564 to be addressed, first.

@konzz I've now addressed #2564 via proposed #2575, but now run into #2576 & #2577 ...

@fititnt

fititnt commented Oct 12, 2019

Nice! I like what you are doing here, @vorburger! This may increase the need for tests. And maybe, if we can make new changes somewhat backward compatible, this could even make it possible to test older versions of Uwazi.

This PR also changes some of the shell scripts like

  • database/admin_user.sh
  • database/blank_state.sh
  • database/dump_blank_state.sh
  • database/dump_sync_state.sh
  • nightmare/fixtures/dump.sh

At fititnt/uwazi-docker, one of the main initial reasons for starting a different repository was to get to an MVP faster (I even had to do a monkey patch, see https://github.com/fititnt/uwazi-docker/commits/68c0f0538689f16294260d3533372422a7cf936b/scripts/patch/uwazi/database/reindex_elastic.js, but this was fixed in the Uwazi codebase very soon). There, if I could fix something at the Docker and scripting level only, I would try that and just open an issue mentioning it.

I did not have to touch the shell scripts but was able to do a lot of the work in uwazi-docker/docker-entrypoint.sh. But I'm mentioning that I like your changes because they make it easier to use Docker directly on the main code base.

One comment I would make here (both to maybe get this PR merged faster and to potentially allow using some of the Docker~ files on older versions): if you have to change some of these shell scripts (or put in some hotfix before the JavaScript code is updated), consider making the breaking changes activate only when some special environment variable is set.

This could be very useful if we hit issues related to what I suspected in #2578 (comment) (TL;DR: something fails because the last step of populating the database was not ready yet).

Note that this only applies if a change to these scripts alters the old default behavior (not if it just adds an option to configure a variable).
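
One way to read this suggestion, as a sketch only: the scripts keep their old behaviour unless an opt-in variable is set. `UWAZI_CONTAINER_MODE` is a made-up name, not an existing Uwazi setting, and the `mongo` fallback assumes the compose service is called `mongo`:

```shell
#!/bin/sh
# UWAZI_CONTAINER_MODE is a hypothetical opt-in flag, not an existing Uwazi variable.

# Print the database host a script should use.
# The old behaviour (127.0.0.1) stays the default; the new one needs the flag.
db_host() {
  if [ "${UWAZI_CONTAINER_MODE:-0}" = "1" ]; then
    printf '%s\n' "${DBHOST:-mongo}"
  else
    printf '%s\n' "127.0.0.1"
  fi
}
```

Without the flag, `db_host` ignores `DBHOST` entirely, so an older checkout that never sets the flag behaves exactly as before.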


# Install utility packages generally useful in any devshell
RUN dnf install -y git findutils less which \

Just a quick comment: this empty line could cause problems, though in this case it was just a warning when using docker-compose. Maybe just remove it?

# fititnt at bravo in /alligo/code/vorburger/uwazi on git:devshell o [22:37:01]
$ docker-compose --version
docker-compose version 1.21.2, build a133471

# (...)

[WARNING]: Empty continuation line found in:
    RUN dnf install -y git findutils less which     libpng12-devel libpng-devel mongodb-org-shell mongodb-org-tools poppler poppler-utils
[WARNING]: Empty continuation lines will become errors in a future release
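
A quick way to spot this before Docker warns about it is to scan the Dockerfile for blank lines that follow a line ending in a backslash. A small sketch; the function name is made up, and it only looks at trailing backslashes, so it does not understand Dockerfile syntax in general:

```shell
#!/bin/sh
# Hypothetical helper: print the line number of every blank line that follows
# a continued line (one ending in a backslash) in the file given as "$1".
find_empty_continuations() {
  awk '
    in_continuation && /^[ \t]*$/ { print NR }
    !/^[ \t]*$/ { in_continuation = /\\[ \t]*$/ }
  ' "$1"
}
```

Running `find_empty_continuations Dockerfile` prints nothing when the file is clean, which makes it easy to use as a pre-commit or CI check.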

Screenshot from 2019-10-11 22-35-49

@RafaPolit
Member

RafaPolit commented Oct 24, 2019

@vorburger This is just incredible!

There are a few issues we need to take care of on our side, but I believe there are also a few commands missing after the initial setup, before running yarn blank-state:

$ npm config set scripts-prepend-node-path true
$ yarn install

The first one is just to prevent any conflicts between yarn and npm.
The second one is fairly critical! Without it, there is no chance of having a working environment.

These should be run once, after everything else has succeeded and before blank-state. So they should be added to the script.
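
If these commands are added to the script, a marker file is one simple way to make sure they run only once per checkout. A sketch under that assumption; `run_once` and the marker file names are hypothetical, while the two commands are the ones listed above:

```shell
#!/bin/sh
# Hypothetical helper: run a command only if its marker file does not exist yet,
# and create the marker only when the command succeeds.
run_once() {
  marker="$1"
  shift
  [ -e "$marker" ] && return 0
  "$@" && : > "$marker"
}

# Intended use inside the devshell (commented out so the sketch has no side effects):
# run_once .npm-config-done npm config set scripts-prepend-node-path true
# run_once .yarn-install-done yarn install
```

Because the marker is written only after success, a failed yarn install is retried the next time the script runs.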

@RafaPolit
Member

As for the failure to run blank-state, I think that our migration system is not respecting the paths correctly, because I detected that, under mongo, there are two databases created:

  • uwazi_development
  • undefined

There, we have collections for updatelogs and migrations. So, apparently, something in the app is still pointing to the wrong DBHOST.

RafaPolit and others added 2 commits October 24, 2019 13:56
* development:
  Add hints to avoid pitfalls to Installation guide
  Refactor Text component
  Fix error when selection range ends right before current page range
  Document environment variables in README
@RafaPolit
Member

RafaPolit commented Oct 24, 2019

So, I have found the culprit:

Our database.js file (https://github.com/huridocs/uwazi/blob/development/app/api/config/database.js) has this weird configuration: if you define a DBHOST, you MUST also configure a DATABASE_NAME. We have no default or fallback for it.

So, I have just added that config env variable, and things seem to work fantastically!

I've committed that change already.
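
Until database.js has a fallback, this pitfall can also be caught before the app starts. A minimal sketch of such a guard, assuming it would run in an entrypoint or startup script; `check_db_env` is a made-up name, and only `DBHOST` and `DATABASE_NAME` come from the configuration described above:

```shell
#!/bin/sh
# Hypothetical guard: fail with an explanation when DBHOST is set
# but DATABASE_NAME is missing or empty.
check_db_env() {
  if [ -n "${DBHOST:-}" ] && [ -z "${DATABASE_NAME:-}" ]; then
    echo "DBHOST is set but DATABASE_NAME is not; set DATABASE_NAME as well" >&2
    return 1
  fi
}
```

Calling `check_db_env || exit 1` early in a startup script would turn the silent `undefined` database into an immediate, readable error.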

@RafaPolit
Member

RafaPolit commented Oct 24, 2019

One final thing we may need to address is that we have some HOT paths configured to use localhost. In my Ubuntu environment, the container (I used Docker and Docker-Compose) is mapped to a particular IP, in my case 172.18.0.4.

So, I had to change two files internally for it to work:

  • all occurrences of 'localhost' in /app/react/App/Root.js
  • single occurrence of 'localhost' in /app/react/ServerRouter.js

So, we either find a way to have the IP of the container mapped to localhost by default, or we have to develop something in these files that would somehow be configurable (the second one is probably still needed for a good long-term solution).

PS: I have already incorporated this improvement of having a DOMAIN env variable that can be used to select the domain name in Uwazi. For this dockerized version, it defaults to uwazi.localdomain (but can be changed in the compose file to whatever the user needs)

- elasticsearch
environment:
- DBHOST=mongo
- DATABASE_NAME=uwazi_development
Member

This line is critical in order to have our system point to the right database when DBHOST is configured.

@RafaPolit
Member

I have also rebased our current dev branch into this one so all the recent changes are already included. Hope you don't mind.

@RafaPolit
Member

This hasn't been worked on in a while. Closing it for now.

6 participants