Add Datadog system agent #10380

Merged
merged 45 commits into main from datadog-system-agent on Dec 8, 2021

@markus-hinsche markus-hinsche commented Nov 24, 2021

Motivation: For the benchmarking project, we want to collect CPU utilization and memory usage. Tracking these metrics manually requires a lot of custom code (e.g. mprof, nvidia-smi, top) that clutters the tests. To avoid this, we can run a Datadog agent in the background that reports the metrics at a regular interval of a few seconds.

Proposed changes:

  • add a Datadog agent to send system metrics (e.g. CPU and memory) to Datadog
  • introduce a bash script that can be called from the GitHub Actions YAML (to avoid code duplication)
  • add the NVML integration (if ACCELERATOR_TYPE=GPU)
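
The GPU-conditional part of the proposal can be sketched as follows. This is a hypothetical illustration in Python, not the bash script added by this PR: the function name `select_integrations` and the labels `system` and `nvml` are assumptions made for the example. Only the `ACCELERATOR_TYPE=GPU` condition comes from the description above.

```python
import os
from typing import List, Mapping, Optional


def select_integrations(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the metric integrations to enable on the current runner.

    Hypothetical helper: system metrics (CPU and memory) are always
    collected; NVML (GPU) metrics are added only when the environment
    variable ACCELERATOR_TYPE is exactly "GPU".
    """
    # Fall back to the real process environment when no mapping is passed.
    env = os.environ if env is None else env

    integrations = ["system"]
    if env.get("ACCELERATOR_TYPE") == "GPU":
        integrations.append("nvml")
    return integrations


if __name__ == "__main__":
    print(select_integrations({"ACCELERATOR_TYPE": "GPU"}))  # ['system', 'nvml']
    print(select_integrations({}))                           # ['system']
```

Keeping this decision in a single place (one script or one function) is what lets several workflow jobs share it without duplicating the condition.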

Status (please check what you already did):

  • added some tests for the functionality
  • updated the documentation
  • updated the changelog (please check changelog for instructions)
  • reformat files using black (please check Readme for instructions)


github-actions bot commented Dec 7, 2021

Commit: 9253866. The full report is available as an artifact.

Dataset: financial-demo, Dataset repository branch: fix-model-regression-tests (external repository), commit: 52a3ad3eb5292d56542687e23b06703431f15ead
Configuration repository branch: main

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
|---|---|---|---|
| Sparse + BERT + DIET(seq) + ResponseSelector(t2t); test: 1m5s, train: 2m7s, total: 3m12s | 1.0000 (0.00) | 0.8800 (0.00) | no data |


github-actions bot commented Dec 7, 2021

Commit: 5a4c8a2. The full report is available as an artifact.

Dataset: financial-demo, Dataset repository branch: fix-model-regression-tests (external repository), commit: 52a3ad3eb5292d56542687e23b06703431f15ead
Configuration repository branch: main

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
|---|---|---|---|
| Sparse + BERT + DIET(seq) + ResponseSelector(t2t); test: 1m33s, train: 3m17s, total: 4m49s | 1.0000 (0.00) | 0.8800 (0.00) | no data |


github-actions bot commented Dec 8, 2021

Commit: ae38aad. The full report is available as an artifact.

Dataset: financial-demo, Dataset repository branch: fix-model-regression-tests (external repository), commit: 52a3ad3eb5292d56542687e23b06703431f15ead
Configuration repository branch: main

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
|---|---|---|---|
| Sparse + BERT + DIET(seq) + ResponseSelector(t2t); test: 1m25s, train: 2m56s, total: 4m21s | 1.0000 (0.00) | 0.8800 (0.00) | no data |


github-actions bot commented Dec 8, 2021

Hey @markus-hinsche! 👋 To run model regression tests, comment with the /modeltest command and a configuration.

Tips 💡: The model regression tests run on push events. You can re-run them by re-adding the status:model-regression-tests label or by using the Re-run jobs button in the GitHub Actions workflow.

Tips 💡: Whenever you want to change the configuration, edit the comment that contains the previous configuration.

You can copy this into your comment and customize it:

/modeltest

```yml
##########
## Available datasets
##########
# - "Carbon Bot" (NLU)
# - "Hermit" (NLU)
# - "Private 1" (NLU)
# - "Private 2" (NLU)
# - "Private 3" (NLU)
# - "Sara" (NLU, Core)
# - "financial-demo" (NLU, Core)
# - "helpdesk-assistant" (NLU, Core)
# - "insurance-demo" (NLU, Core)
# - "retail-demo" (NLU, Core)

##########
## Available NLU configurations
##########
# - "BERT + DIET(bow) + ResponseSelector(bow)"
# - "BERT + DIET(seq) + ResponseSelector(t2t)"
# - "Spacy + DIET(bow) + ResponseSelector(bow)"
# - "Spacy + DIET(seq) + ResponseSelector(t2t)"
# - "Sparse + BERT + DIET(bow) + ResponseSelector(bow)"
# - "Sparse + BERT + DIET(seq) + ResponseSelector(t2t)"
# - "Sparse + DIET(bow) + ResponseSelector(bow)"
# - "Sparse + DIET(seq) + ResponseSelector(t2t)"
# - "Sparse + Spacy + DIET(bow) + ResponseSelector(bow)"
# - "Sparse + Spacy + DIET(seq) + ResponseSelector(t2t)"

##########
## Available Core configurations
##########
# - "Rules"
# - "Rules + AugMemo"
# - "Rules + AugMemo + TED"
# - "Rules + Memo"
# - "Rules + Memo + TED"
# - "Rules + TED"

## Example configuration
#################### syntax #################
## include:
##   - dataset: ["<dataset_name>"]
##     config: ["<configuration_name>"]
#
## Example:
## include:
##  - dataset: ["Carbon Bot"]
##    config: ["Sparse + DIET(bow) + ResponseSelector(bow)"]
#
## Shortcut:
## You can use the "all" shortcut to include all available configurations or datasets
#
## Example: Use the "Sparse + DIET(bow) + ResponseSelector(bow)" configuration
## for all available datasets
## include:
##  - dataset: ["all"]
##    config: ["Sparse + DIET(bow) + ResponseSelector(bow)"]
#
## Example: Use all available configurations for the "Carbon Bot" and "Sara" datasets
## and for the "Hermit" dataset use the "Sparse + DIET(seq) + ResponseSelector(t2t)" and
## "BERT + DIET(seq) + ResponseSelector(t2t)" configurations:
## include:
##  - dataset: ["Carbon Bot", "Sara"]
##    config: ["all"]
##  - dataset: ["Hermit"]
##    config: ["Sparse + DIET(seq) + ResponseSelector(t2t)", "BERT + DIET(seq) + ResponseSelector(t2t)"]
#
## Example: Define a branch name to check-out for a dataset repository. Default branch is 'main'
## dataset_branch: "test-branch"
## include:
##  - dataset: ["Carbon Bot", "Sara"]
##    config: ["all"]
##
## Shortcuts:
## You can use the "all" shortcut to include all available configurations or datasets.
## You can use the "all-nlu" shortcut to include all available NLU configurations or datasets.
## You can use the "all-core" shortcut to include all available core configurations or datasets.

include:
 - dataset: ["Carbon Bot"]
   config: ["Sparse + DIET(bow) + ResponseSelector(bow)"]

```
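
The `include` syntax and the "all", "all-nlu" and "all-core" shortcuts described in the template can be illustrated with a small expansion routine. This is a minimal sketch under stated assumptions, not the workflow's actual implementation: the names `DATASETS`, `CONFIGS`, `expand` and `expand_include` are hypothetical, the catalogs hold only a subset of the names listed above, and skipping Core configurations for NLU-only datasets is an assumption about how incompatible pairs are handled.

```python
from itertools import product
from typing import Dict, List, Set, Tuple

# Subset of the datasets and configurations listed in the template,
# tagged with the kinds ("nlu", "core") given there.
DATASETS: Dict[str, Set[str]] = {
    "Carbon Bot": {"nlu"},
    "Hermit": {"nlu"},
    "Sara": {"nlu", "core"},
    "financial-demo": {"nlu", "core"},
}
CONFIGS: Dict[str, Set[str]] = {
    "Sparse + DIET(bow) + ResponseSelector(bow)": {"nlu"},
    "Sparse + DIET(seq) + ResponseSelector(t2t)": {"nlu"},
    "Rules": {"core"},
    "Rules + TED": {"core"},
}


def expand(names: List[str], catalog: Dict[str, Set[str]]) -> List[str]:
    """Resolve explicit names and shortcuts against a catalog, keeping order."""
    result: List[str] = []
    for name in names:
        if name == "all":
            matches = list(catalog)
        elif name in ("all-nlu", "all-core"):
            kind = name.split("-")[1]
            matches = [n for n, kinds in catalog.items() if kind in kinds]
        elif name in catalog:
            matches = [name]
        else:
            raise ValueError(f"unknown dataset or configuration: {name!r}")
        for match in matches:
            if match not in result:
                result.append(match)
    return result


def expand_include(include: List[dict]) -> List[Tuple[str, str]]:
    """Turn an `include` list into concrete (dataset, config) pairs."""
    pairs: List[Tuple[str, str]] = []
    for entry in include:
        datasets = expand(entry["dataset"], DATASETS)
        configs = expand(entry["config"], CONFIGS)
        for dataset, config in product(datasets, configs):
            # Assumption: a config only runs on datasets that support its kind.
            compatible = CONFIGS[config] <= DATASETS[dataset]
            if compatible and (dataset, config) not in pairs:
                pairs.append((dataset, config))
    return pairs


if __name__ == "__main__":
    include = [{"dataset": ["Hermit"], "config": ["all"]}]
    for dataset, config in expand_include(include):
        print(f"{dataset}: {config}")
```

With the catalogs above, `"all"` for the NLU-only "Hermit" dataset resolves to the two NLU configurations, while `"all-core"` as a dataset shortcut selects only "Sara" and "financial-demo".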


github-actions bot commented Dec 8, 2021

/modeltest

include:
 - dataset: ["financial-demo"]
   config: ["Sparse + BERT + DIET(seq) + ResponseSelector(t2t)"]


github-actions bot commented Dec 8, 2021

The model regression tests have started. This might take a while; please be patient.
As soon as the results are ready, you'll see a new comment with them.

The configuration used can be found in the comment.


github-actions bot commented Dec 8, 2021

Commit: 208d68b. The full report is available as an artifact.

Dataset: financial-demo, Dataset repository branch: fix-model-regression-tests (external repository), commit: 52a3ad3eb5292d56542687e23b06703431f15ead
Configuration repository branch: main

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
|---|---|---|---|
| Sparse + BERT + DIET(seq) + ResponseSelector(t2t); test: 1m23s, train: 3m35s, total: 4m57s | 1.0000 (0.00) | 0.8800 (0.00) | no data |

@markus-hinsche
Contributor Author

This is ready to be merged from my side, but I can't merge yet because @tczekajlo requested changes.

@markus-hinsche markus-hinsche merged commit 402798b into main Dec 8, 2021
@markus-hinsche markus-hinsche deleted the datadog-system-agent branch December 8, 2021 18:12
3 participants