
employ docker-compose v3.4 and shared logging config #152

Merged
merged 5 commits into master from docker-compose-update on May 6, 2021

Conversation

Collaborator

@fmigneault fmigneault commented Apr 21, 2021

Export only the specific changes of the docker-compose version and the logging config from #114.

Change overview:

  • Cap container logs to 500m. Before there was no limit, so containers could log indefinitely and eat up precious disk space.
  • Remove the unused volumes_from: mongodb from the solr and ncwms2 containers.
  • Bump the docker-compose file version to 3.4 to support sharing the logging config (illustrated below).

Matching PR for the Ouranos-specific config: bird-house/birdhouse-deploy-ouranos#10

Fixes #10
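
For reference, a minimal sketch of what the shared logging setup enabled by this change could look like in a compose file (service, image, and values are illustrative assumptions, not the actual repository contents):

```yaml
version: "3.4"

# Extension field holding the shared logging config (extension fields require
# compose file format >= 3.4, which is why the version bump matters here).
x-logging: &default-logging
  driver: "json-file"
  options:
    max-size: "500m"   # illustrative per-file cap
    max-file: "10"     # rotate over 10 files, as discussed in the review below

services:
  twitcher:
    image: example/twitcher     # hypothetical image name
    logging: *default-logging   # each service opts into the shared config
```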

Collaborator

@dbyrns dbyrns left a comment


We'll wait for the tests to pass, then it should be all good.

Collaborator

@tlvu tlvu left a comment


A few questions. Not approving immediately because I'd like to try it on my dev server to see the difference with the new logging config, but it looks good overall, except for the .gitignore part, which I find unusual.

  &default-logging
  driver: "json-file"
  options:
    max-size: "200k"
Collaborator


So we have not set any of those configs currently (driver, max-size, max-file). What was their default?

It's very nice to have a unified logging config for all containers, but can each container, if needed, skip this config and roll its own?

And to loop back to the original Weaver PR: does Weaver need to "hook" into other components' logs, and is that why it prefers other components to have a unified logging config, or is there another reason to introduce this unified logging?

Just to be clear, I like having a default, unified logging config; I just want to understand its relation to Weaver.

Collaborator Author


Docker's default driver is json-file, so that remains the same.
The difference here is that we add log rotation over 10 files.
This is a config we found a long time ago when deploying on our instances, since twitcher and magpie can generate an immense quantity of logs (on each request). This allows discarding them gradually instead of accumulating GBs of useless logs.
Each container can define its own logging section by using the relevant config instead of the default-logging reference.
I don't think there is a nice way to automate that, though. The best option could be an optional-component, but I think it is better to keep control of the logs.
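
For illustration, a hedged sketch of such a per-service override, where one container defines its own logging section instead of aliasing the shared config (service name and values are hypothetical):

```yaml
services:
  some-service:          # hypothetical service name
    image: example/some-service
    logging:             # own logging section; no *default-logging alias
      driver: "json-file"
      options:
        max-size: "50m"
        max-file: "3"
```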

Collaborator Author


Using Weaver, it is a bit more tricky since a different docker-compose file is employed.
Depending on how docker-compose merges the files, the reference may or may not work across files.
Basically, if the files are merged first and the references are resolved afterwards, it would work. But if docker-compose internally uses PyYAML or a similar lib to read each file individually and then merge them, Weaver's compose file will raise a missing reference. In that case, the config would need to be duplicated in its own file.
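
As a hedged illustration of the duplication mentioned above (file name, service, and values are assumptions): YAML anchors only resolve within a single file, so a separate Weaver compose file would have to redefine the anchor rather than alias the one declared in the base file:

```yaml
# docker-compose.weaver.yml -- hypothetical override file
version: "3.4"

x-logging: &default-logging   # redefined here; aliasing the anchor from the
  driver: "json-file"         # base compose file would raise a missing reference
  options:
    max-size: "500m"
    max-file: "10"

services:
  weaver:
    image: example/weaver     # hypothetical image name
    logging: *default-logging
```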

Collaborator


So if I understand correctly, Weaver did not need this central logging, nor the log rotation to avoid accumulating infinite logs (which is indeed a nice thing).

Per our recent "contributing guideline", this PR should not have been mixed with the Weaver PR to begin with. Or, if mixed in, it should have been "well explained" that it is not required by Weaver?

Collaborator Author


Yes, this is why it has now been split out on its own, since we updated the contribution guideline.
Please consider that the Weaver PR was started in January and its review only started in March, both long before we even began discussing the contribution guidelines.

@@ -213,9 +233,8 @@ services:
       - "48983:9001"
     volumes:
       - /data/solr:/data
-    volumes_from:
-      - mongodb
Collaborator


From what I understood at our meeting, the removal of the volumes_from directive is because of the update to format version 3.4, not because we want it or Weaver needs it?

Collaborator Author


Yes. volumes_from is no longer supported in compose file format >= 3.
There are a lot of improvements in docker-compose since v2, so better to bump it while problems are still few.

This needs to be tested, but if I understand the compose file correctly, that volumes_from directive specifically is useless, since mongodb doesn't even have a volumes section. Starting with Weaver though, I require a data persistence volume in mongo, so I would like to have this adjusted for docker-compose v3 beforehand, since the other images are not using it anyway (and obviously shouldn't start breaking).
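
For illustration, a hedged sketch (image tags and host paths are assumptions) of how mongo data persistence can be declared directly in compose v3, which also makes volumes_from unnecessary:

```yaml
version: "3.4"

services:
  mongodb:
    image: mongo:3.4               # hypothetical tag
    volumes:
      - /data/mongodb:/data/db     # assumed host path for persistence

  solr:
    image: example/solr            # hypothetical image name
    volumes:
      - /data/solr:/data           # solr keeps its own volume; no volumes_from
```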

Collaborator


Agreed, this looks harmless but should be tested to be sure. Have you or Mat gone over the failed build to determine that the error is not related to this? Currently only the ESGF notebook should be intermittently failing, because their server seems flaky, but everything else should be green.

docs/.gitignore (outdated review comment, resolved)
tlvu added a commit to bird-house/birdhouse-deploy-ouranos that referenced this pull request Apr 23, 2021
Collaborator

@tlvu tlvu left a comment


Please confirm failing tests are okay.

Please let me do this merge, since I need to sync it with my other PR bird-house/birdhouse-deploy-ouranos#10, else our prod won't even start.

@nikola-rados I am pretty sure you will also need to do something like my PR bird-house/birdhouse-deploy-ouranos#10 above on your side. I can also sync with you for this merge. Let me know.


@fmigneault
Collaborator Author

@tlvu
I have no problem waiting for your merge on this.
If anything, I prefer you do it so that it is validated against a full stack.

I'm not too sure what the problem was on the last build; it was stopped for some reason.
Manual trigger produced the following outputs:

=> http://daccs-jenkins.crim.ca/job/DACCS-iac/401
=> http://daccs-jenkins.crim.ca/job/PAVICS-e2e-workflow-tests/job/master/316

22:12:17  notebooks/hummingbird.ipynb ............                                 [  8%]
22:12:25  notebooks-auth/test_thredds.ipynb ..........                             [ 15%]
22:12:32  pavics-sdi-master/docs/source/notebooks/WCS_example.ipynb .......        [ 20%]
22:13:06  pavics-sdi-master/docs/source/notebooks/WFS_example.ipynb .FFFF.         [ 24%]
22:13:15  pavics-sdi-master/docs/source/notebooks/WMS_example.ipynb ........       [ 30%]
22:13:17  pavics-sdi-master/docs/source/notebooks/WPS_example.ipynb ..........     [ 37%]
22:13:31  pavics-sdi-master/docs/source/notebooks/esgf-dap.ipynb .                 [ 37%]
22:13:32  pavics-sdi-master/docs/source/notebooks/jupyter_extensions.ipynb .       [ 38%]
22:13:40  pavics-sdi-master/docs/source/notebooks/opendap.ipynb .......            [ 43%]
22:13:44  pavics-sdi-master/docs/source/notebooks/pavics_thredds.ipynb ....F       [ 46%]
22:16:52  pavics-sdi-master/docs/source/notebooks/regridding.ipynb ...FF.FF.FFF.FF [ 57%]
22:16:56  FFFFFF...FFFFF.                                                          [ 67%]
22:17:00  pavics-sdi-master/docs/source/notebooks/rendering.ipynb ....             [ 70%]
22:17:05  pavics-sdi-master/docs/source/notebooks/subsetting.ipynb ......          [ 74%]
22:17:17  finch-master/docs/source/notebooks/dap_subset.ipynb ..........           [ 81%]
22:17:24  finch-master/docs/source/notebooks/finch-usage.ipynb ......              [ 86%]
22:17:51  finch-master/docs/source/notebooks/subset.ipynb ...FF....FFFFF....FF     [100%]

@fmigneault
Collaborator Author

Just to confirm, the whole thing was rerun, one time targeting master and the other time docker-compose-update.

master
http://daccs-jenkins.crim.ca/job/DACCS-iac/402
http://daccs-jenkins.crim.ca/job/PAVICS-e2e-workflow-tests/job/master/317

docker-compose-update
http://daccs-jenkins.crim.ca/job/DACCS-iac/403
http://daccs-jenkins.crim.ca/job/PAVICS-e2e-workflow-tests/job/master/318

Results are the same in both cases, as presented below.
The previous run in #152 (comment) above just hit a connection timeout with boreas.ouranos.ca that trickled down and failed many tests.

22:43:28  notebooks/hummingbird.ipynb ............                                 [  8%]
22:43:35  notebooks-auth/test_thredds.ipynb ..........                             [ 15%]
22:43:51  pavics-sdi-master/docs/source/notebooks/WCS_example.ipynb .......        [ 20%]
22:44:00  pavics-sdi-master/docs/source/notebooks/WFS_example.ipynb ......         [ 24%]
22:44:08  pavics-sdi-master/docs/source/notebooks/WMS_example.ipynb ........       [ 30%]
22:44:10  pavics-sdi-master/docs/source/notebooks/WPS_example.ipynb ..........     [ 37%]
22:44:25  pavics-sdi-master/docs/source/notebooks/esgf-dap.ipynb .                 [ 37%]
22:44:25  pavics-sdi-master/docs/source/notebooks/jupyter_extensions.ipynb .       [ 38%]
22:44:31  pavics-sdi-master/docs/source/notebooks/opendap.ipynb .......            [ 43%]
22:44:35  pavics-sdi-master/docs/source/notebooks/pavics_thredds.ipynb ....F       [ 46%]
22:46:28  pavics-sdi-master/docs/source/notebooks/regridding.ipynb ............... [ 57%]
22:47:12  ...............                                                          [ 67%]
22:47:18  pavics-sdi-master/docs/source/notebooks/rendering.ipynb ....             [ 70%]
22:47:23  pavics-sdi-master/docs/source/notebooks/subsetting.ipynb ......          [ 74%]
22:47:34  finch-master/docs/source/notebooks/dap_subset.ipynb ..........           [ 81%]
22:47:42  finch-master/docs/source/notebooks/finch-usage.ipynb ......              [ 86%]
22:48:08  finch-master/docs/source/notebooks/subset.ipynb ...FF....FFFFF....FF     [100%]

@tlvu
Collaborator

tlvu commented May 5, 2021

@nikola-rados I am pretty sure you will also need to do something like my PR for the Ouranos custom config (bird-house/birdhouse-deploy-ouranos#10) on your side. Let me know once you are done so we can sync this merge.

@nikola-rados
Collaborator

@tlvu I have updated our config.

@tlvu
Collaborator

tlvu commented May 6, 2021

@tlvu I have updated our config.

@nikola-rados Thanks, I'll merge this during the day tomorrow so you can go live manually if required.

@tlvu tlvu merged commit 0179962 into master May 6, 2021
@tlvu tlvu deleted the docker-compose-update branch May 6, 2021 19:00
tlvu added a commit to bird-house/birdhouse-deploy-ouranos that referenced this pull request May 6, 2021
update docker-compose file version to 3.4

To match bird-house/birdhouse-deploy#152, otherwise our deployment will not start.
tlvu added a commit that referenced this pull request May 6, 2021
@tlvu
Collaborator

tlvu commented May 6, 2021

Default logging retention should have been bumped to something more reasonable for prod. I forgot to push the bump here, so I'm bumping it in PR #165.

tlvu added a commit that referenced this pull request May 6, 2021
…on-to-something-more-reasonable-for-prod

bump default log retention to 500m instead of 2m, more suitable for prod

# Overview

Bump default log retention to 500m instead of 2m, more suitable for prod

Forgot to push during PR #152.

## Changes

**Non-breaking changes**
- Bump default log retention to 500m instead of 2m, more suitable for prod

Successfully merging this pull request may close these issues.

Free disk space problem due to docker images logging