This repository has been archived by the owner on Dec 16, 2019. It is now read-only.

Add RES microservice to the Docker Compose setup #369

Merged
merged 16 commits into dev from feature/res
Oct 10, 2018

Conversation

dtitov

@dtitov dtitov commented Sep 28, 2018

Describe the pull request:

  • Bug fix
  • Functional change
  • New feature
  • Code cleanup
  • Build system change
  • Documentation change
  • Language translation

Pull request long description:

Brings the RES Data Out microservice into the Docker Compose setup.

@dtitov dtitov added this to the Sprint 37 milestone Sep 28, 2018
@dtitov dtitov self-assigned this Sep 28, 2018
@blankdots

I think it might be worth having a separate stage on Travis where we run the Outgestion integration tests in parallel, and I assume this translates to being able to run them like mvn test -Dtest=OutgestionTests -B.
This also raises the question of whether we should rename CommonTests to IngestionTests, but the naming is not the issue; more importantly, should there be separate stages for Ingestion and Outgestion?
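A hedged sketch of how such a split could look in .travis.yml (the stage names and the IngestionTests class name are assumptions, not taken from this repo):

```yaml
# Hypothetical .travis.yml fragment: run the ingestion and outgestion
# integration tests as separate build stages (names are assumptions).
jobs:
  include:
    - stage: ingestion tests
      script: mvn test -Dtest=IngestionTests -B
    - stage: outgestion tests
      script: mvn test -Dtest=OutgestionTests -B
```

Stages run sequentially, but jobs within a stage run in parallel, so each group of tests could still be parallelised internally.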

@nanjiangshu

Hi,

I've tested with the following three combinations of bootstrapping, but none of them got through the ingestion test (with make -C test). The tests were done on the branch feature/res at commit f8d99d3.

  1. Using the OpenSSH inbox and the Python keyserver
    make bootstrap
  2. Using the Apache Mina inbox and the Python keyserver
    make bootstrap ARGS='--inbox mina'
  3. Using the Mina inbox and the ega-keyserver
    make bootstrap ARGS='--inbox mina --keyserver ega'

Also, for the EGA microservices, the images are retrieved from cscfi/ega-*. Would it be better to use the ega-data-api/* tag, so that locally built images for specific branches can be used when needed?

Detailed error messages from the logs for the above three tests are pasted below. For test 3, it seems the format of the PGP key is not set correctly.

verify         | [lega.verify][ INFO ] (L34) Key ID F90814577E68C267
verify         | [lega.verify][ INFO ] (L37) Retrieving the Private Key from https://keys:8443/retrieve/F90814577E68C267/private (verify certificate: False)
verify         | [lega.verify][ERROR ] (L52) HTTP Error 404: Not Found
verify         | [lega.utils.db][ERROR ] (L284) Converting PGP Key error: HTTP Error 404: Not Found to a FromUser error
verify         | [lega.utils.db][ERROR ] (L251) Exception: <class 'lega.utils.exceptions.FromUser'> in /usr/lib/python3.6/site-packages/lega/utils/db.py on line: 285
verify         | [lega.utils.db][ERROR ] (L254) PGP Key error: HTTP Error 404: Not Found (from user: True)
verify         | [lega.utils.db][DEBUG ] (L141) Setting error for 1: PGP Key error | Cause: None    
verify         | [lega.verify][ INFO ] (L34) Key ID F90814577E68C267
verify         | [lega.verify][ INFO ] (L37) Retrieving the Private Key from https://keys:8443/retrieve/F90814577E68C267/private (verify certificate: False)
verify         | [lega.verify][ERROR ] (L52) HTTP Error 404: Not Found
verify         | [lega.utils.db][ERROR ] (L284) Converting PGP Key error: HTTP Error 404: Not Found to a FromUser error
verify         | [lega.utils.db][ERROR ] (L251) Exception: <class 'lega.utils.exceptions.FromUser'> in /usr/lib/python3.6/site-packages/lega/utils/db.py on line: 285
verify         | [lega.utils.db][ERROR ] (L254) PGP Key error: HTTP Error 404: Not Found (from user: True)
verify        | [lega.verify][ INFO ] (L34) Key ID F90814577E68C267
verify        | [lega.verify][ INFO ] (L37) Retrieving the Private Key from http://keys:8080/keys/retrieve/F90814577E68C267/private/bin?idFormat=hex (verify certificate: False)
verify        | [lega.utils.db][CRITICAL] (L287) Keyserver error: Expected: ASCII-armored PGP data
verify        | [lega.utils.db][ERROR ] (L251) Exception: <class 'lega.utils.exceptions.KeyserverError'> in /usr/lib/python3.6/site-packages/lega/utils/db.py on line: 282
verify        | [lega.utils.db][ERROR ] (L254) Keyserver error: Expected: ASCII-armored PGP data (from user: False)
verify        | [lega.utils.db][DEBUG ] (L141) Setting error for 1: Keyserver error | Cause: None
verify        | [lega.utils.db][DEBUG ] (L48) 30 attempts (every 1 seconds)
verify        | [lega.utils.db][ INFO ] (L33) Initializing a connection to: db:5432/lega
verify        | [lega.utils.db][DEBUG ] (L261) Catching error on file id: 1
verify        | [lega.utils.amqp][DEBUG ] (L104) Sending ACK for message 1 (Correlation ID: 9e82a1c8-c95b-4862-8164-882c29be06ea)

@dtitov
Author

dtitov commented Oct 1, 2018

@nanjiangshu

  1. It will work only with Data Out's keyserver: our Python keyserver is not supported at the moment, because it has not been updated yet.
  2. You need the updated image for RES, which is not yet on Docker Hub. You can build it yourself locally from this branch: LocalEGA S3 and decryption fixes EGA-archive/ega-data-api#45

@nanjiangshu

Hi @dtitov, thanks for the clarification. That brings me back to my second point: it is probably better to use the ega-data-api/ tag for the (locally built) images instead of cscfi/, since the latter is not always up to date.

@nanjiangshu

nanjiangshu commented Oct 1, 2018

Tested with both the OpenSSH inbox and the Mina inbox; the ingestion test passed.
For the moment, only the EGA keyserver works (not the Python keyserver).

This pull request requires the merging of EGA-archive/ega-data-api#45

@blankdots

Getting some inconsistent behaviour with the Outgestion, similar to what we are getting on Travis, e.g. https://travis-ci.org/NBISweden/LocalEGA/jobs/436193340#L1737
Steps to reproduce:

  1. clean existing ega-images from local
  2. either build images with make -C images all or proceed to 3.
  3. make bootstrap ARGS='--inbox mina --keyserver ega' - this will pull the required images
  4. make up
  5. run Outgestion tests mvn test -Dtest=OutgestionTests -B

On the first run I get the same error as on Travis; on the following runs the tests pass.
Exploring the RES service logs, there seem to be some errors:
error_logs_res_october_2018.log

@blankdots blankdots left a comment

@dtitov Please squash the last 11 commits :D

Regarding the travis tests:

  1. I will test whether we can speed this up by dividing the parallel jobs into different stages, and whether this solves our issues
  2. we will have to consider whether we have outgrown Travis and need a dedicated CI (we will have two more services)
  3. we will need to look at the memory consumption of RES and the Keyserver

@blankdots blankdots left a comment

I think we should exclude RES from the IngestionTests; this would improve the Travis time.
As for the Robustness tests, I think we should keep it, or, if we test the robustness of the Data Out in the ega-data-api repo, here we can test only the robustness of the Data Ingestion.

Testing the robustness of the whole setup should be performed on a testing environment along with end-to-end testing (meaning trigger an update on a custom server and run the tests there).

@blankdots

Follow up with the https://github.com/EGA-archive/ega-data-api/ devs, as the Keyserver and RES services each consume > 1 GB of memory, and we still need to add two more of these.

Some optimisation, if possible, is required on the Data Out.

@nanjiangshu

Hi @blankdots, is this 1 GB RAM usage the peak consumption, or constant?

@blankdots

@nanjiangshu you can take a look for yourself

Scenario is as follows:

  1. start everything from docker
  2. wait to finish, e.g. services to be up
  3. run the integration tests
  4. wait for them to finish
  5. ponder why it is like that

Peek 2018-10-03 15-54.webm.zip

Note: I could only upload as *.zip

@nanjiangshu

Thanks @blankdots for sharing this video. It seems to be a general problem with Java. The Mina Inbox also uses significantly more memory (+400 MB) than the others.

@dtitov
Author

dtitov commented Oct 8, 2018

Guys, I wouldn't be too fast to call it a "Java problem". There are some good explanations here: https://spring.io/blog/2015/12/10/spring-boot-memory-performance

What I can say right now is that each REST Spring Boot application contains at least a built-in web server (Tomcat or Jetty), which is already a memory-consuming thing. Then there are other things like the recently added Hystrix, which resides in memory and does its job in the background, etc.

At the moment, I don't see a problem in the fact that a microservice consumes 1.5 GB of RAM - unless memory usage starts growing indefinitely, which would signal a memory leak (but we don't see that kind of behavior currently). Another problematic thing is Travis - but I would say that it's more of a Travis problem than an application problem.

@blankdots

blankdots commented Oct 8, 2018

We can take this discussion elsewhere, but it was just something I observed while debugging why tests failed on Travis. I do not blame it on anything, but I am curious whether and what we can optimize.

The one concern, which I should have articulated, is the trend of ~1 GB of RAM per service in Data Out: with the 2 more services we have to add, it will result in 4-5 GB just to run the Data Out, and I do not think this is desirable, but it is what it is.
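If capping the footprint ever becomes necessary, one option is to bound the JVM heap from the Compose file. A minimal sketch, assuming the image honours a JAVA_OPTS-style variable (the service name, variable, and limit values are illustrative assumptions, not from this PR):

```yaml
# Hypothetical docker-compose fragment bounding RES memory use.
# JAVA_OPTS support and the limits below are assumptions.
services:
  res:
    image: cscfi/ega-res
    environment:
      - JAVA_OPTS=-Xmx512m -Xss512k
    mem_limit: 700m
```

Whether RES stays healthy under such limits would of course need testing.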

@dtitov
Author

dtitov commented Oct 8, 2018

Well, as Alexander mentioned, with caching enabled RES alone can consume up to 12 GB in their setup, and it's considered to be okay... But of course, I agree that the lower the memory consumption we can achieve, the better.

@blankdots blankdots left a comment

The Travis build seems to be stable and passing without supervision, and RES was added and tested.

I still observed some issues with the timeout, e.g. https://travis-ci.org/NBISweden/LocalEGA/jobs/438959330#L1747 and https://travis-ci.org/NBISweden/LocalEGA/jobs/438959330#L1844, and the fact that the build takes 25 min and the memory consumption is too high is not desirable. However, these are not the subject of this issue, and I think we should investigate them in another issue (up to a certain point, as this CI/CD work seems to be a time black hole).

@dtitov
Author

dtitov commented Oct 9, 2018

Great. Shall we merge it?

@dtitov dtitov merged commit c9aa621 into dev Oct 10, 2018
@blankdots blankdots deleted the feature/res branch October 23, 2018 07:22
viklund pushed a commit that referenced this pull request Nov 22, 2018
Add RES microservice to the Docker Compose setup