💡 [Feature] Log download stats from THREDDS server #444

Open
huard opened this issue Apr 5, 2024 · 3 comments

@huard
Collaborator

huard commented Apr 5, 2024

Description

It would be useful for reporting purposes to monitor data downloads from THREDDS:

  • total download volume (GB/day);
  • per-file download volume (GB/day);
  • per-dataset OPeNDAP streaming volume (GB/day).

References

This information can be parsed from NGINX logs, but those logs need to be exposed to Prometheus to be aggregated and archived within the current architecture.
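As a rough illustration of the parsing step only, here is a minimal sketch that sums the bytes sent per day and per request path. It assumes the proxy writes NGINX's default "combined" log format; a custom `log_format` would need a different pattern. The names `LOG_PATTERN` and `aggregate_bytes` are hypothetical, and exposing the totals to Prometheus is not covered.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumption: NGINX default "combined" log format.
LOG_PATTERN = re.compile(
    r'(?P<addr>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<bytes>\d+)'
)


def aggregate_bytes(lines):
    """Return {(iso_date, path): bytes_sent} for GET requests with a 2xx status."""
    totals = defaultdict(int)
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not follow the assumed format
        if match["method"] != "GET" or not match["status"].startswith("2"):
            continue  # 206 (partial content) is kept, errors are dropped
        day = datetime.strptime(match["time"], "%d/%b/%Y:%H:%M:%S %z").date()
        path = match["path"].split("?", 1)[0]  # drop the query string
        totals[(day.isoformat(), path)] += int(match["bytes"])
    return dict(totals)
```

Summing the values for a given date gives the total daily volume, and dividing by 1e9 converts bytes to GB.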

Possible solutions:

Additional info

See also:

Concerned Organizations

@huard huard added the enhancement New feature or request label Apr 5, 2024
@fmigneault
Collaborator

Consider downloads from WPS outputs and STAC data proxy endpoints as well for the same reasons.
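One way to cover several endpoints with the same log parsing is to classify each request path by service before summing. The sketch below is only an illustration: `fileServer` and `dodsC` are the usual THREDDS service path segments, while the `/wpsoutputs/` and `/stac-data/` markers are assumptions that would need to match the actual proxy configuration. The names `SERVICE_MARKERS`, `classify` and `volume_by_service` are hypothetical.

```python
from collections import defaultdict

# Assumed path markers; adjust to the routes actually served by the proxy.
SERVICE_MARKERS = {
    "thredds_file": "/thredds/fileServer/",
    "thredds_opendap": "/thredds/dodsC/",
    "wps_outputs": "/wpsoutputs/",   # assumption
    "stac_data": "/stac-data/",      # hypothetical placeholder
}


def classify(path):
    """Return the service name whose marker appears in the path, else 'other'."""
    # Substring match rather than prefix match, so that paths routed
    # through an extra proxy prefix are still recognized.
    for service, marker in SERVICE_MARKERS.items():
        if marker in path:
            return service
    return "other"


def volume_by_service(path_bytes):
    """Sum bytes per service from an iterable of (path, bytes_sent) pairs."""
    totals = defaultdict(int)
    for path, sent in path_bytes:
        totals[classify(path)] += sent
    return dict(totals)
```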

@huard
Collaborator Author

huard commented Apr 26, 2024

ESGF uses Beats and Logstash to collect logs and compute their stats. See https://drive.google.com/drive/folders/1LbvoYeQ_6L_bzTsO-EEhwqjIx1jZ-G1k

@fmigneault
Collaborator

If the "node collector" can be located on the same instance, Logstash seems like an interesting candidate. If there is no distinction between Beats and Logstash as "log producers", I would favor the second architecture to limit the number of configurations and technologies involved.

mishaschwartz added a commit that referenced this issue May 14, 2024
## Overview

This version of canarie-api permits running the proxy (nginx) container
independently of the canarie-api application. This makes it easier to
monitor the logs of canarie-api and proxy containers simultaneously and
allows for the configuration files for canarie-api to be mapped to the
canarie-api containers where appropriate.

## Changes

**Non-breaking changes**
- New component version canarie-api:1.0.0

**Breaking changes**

## Related Issue / Discussion

- Resolves [issue id](url)

## Additional Information

Links to other issues or sources.

- This might also make parsing the nginx logs slightly easier, which
could help with #12 and #444

## CI Operations

<!--
The test suite can be run using a different DACCS config with
``birdhouse_daccs_configs_branch: branch_name`` in the PR description.
To globally skip the test suite regardless of the commit message use
``birdhouse_skip_ci`` set to ``true`` in the PR description.
Note that using ``[skip ci]``, ``[ci skip]`` or ``[no ci]`` in the
commit message will override ``birdhouse_skip_ci`` from the PR
description.
-->

birdhouse_daccs_configs_branch: master
birdhouse_skip_ci: false