Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Datadog agent integration #2824

Merged
merged 6 commits into from
Sep 22, 2021
Merged

Datadog agent integration #2824

merged 6 commits into from
Sep 22, 2021

Conversation

timotheeg
Copy link
Contributor

@timotheeg timotheeg commented Sep 20, 2021

Context

We are implementing DataDog as a central monitoring system. The FormSg AWS account has been integrated to DataDog already, but more needs to be done:

  • Run DataDog Agent on our servers
  • Install various DataDog connectors (e.g. for express and winston)
  • Report custom data as needed

This PR only install the DataDog agent on the FormSG servers.

Approach

Run the DataDog agent on the FormSG server. It will report host system metrics.

The EB environment needs 2 new environment variables: DD_API_KEY (see here) and DD_ENV (staging|production).

Tested on staging, sample data here:
https://app.datadoghq.com/dashboard/b3u-egb-apu/formsg?tpl_var_env=staging&from_ts=1632260114510&to_ts=1632274514510&live=true

Note: The DD agent is configured to accept non-local traffic to allow host-Docker communication. Care should be taken to never expose the DogStatsD port 8125/8126(?) to other hosts, or the internet.

Deploy Notes

  • Early (e.g. 730am or 8am) add the 2 environment variables below (cause EB redeploy, but new env vars are not used yet)
  • Deploy new as normal

New environment variables:

  • DD_API_KEY : FormSG specific API key for DataDog
  • DD_ENV: staging or production

New dependencies:

@mantariksh
Copy link
Contributor

Early (e.g. 730am or 8am) add the 2 environment variables below (cause EB redeploy, but new env vars are not used yet)

@timotheeg can I check why we need to do this? adding environment variables is safe, we do it regularly (e.g. to update site banners)

@timotheeg
Copy link
Contributor Author

timotheeg commented Sep 21, 2021

Early (e.g. 730am or 8am) add the 2 environment variables below (cause EB redeploy, but new env vars are not used yet)

@timotheeg can I check why we need to do this? adding environment variables is safe, we do it regularly (e.g. to update site banners)

As we discussed earlier, this was me being paranoid. Changing env vars causes an EB deployment, which means old instances are put in drain mode. That is safe, provided we do not have long running requests which exceed the drain timeout (I haven't checked what's the drain timeout for our setup btw). Such long requests can break, so doing an early change when there's low traffic is just reducing risks. Since the majority of our requests are short, that note may indeed be overkill. But ideally, we have predictable response times for all requests (or built-in retries mechanisms for known long requests), guaranteeing that changing env vars is indeed completely safe at any time.

Btw, the same issue of possibly breaking long requests applies for all deploys.

@timotheeg timotheeg force-pushed the datadog_agent_integration branch 2 times, most recently from dde0339 to 70448a7 Compare September 22, 2021 01:21
@liangyuanruo
Copy link
Contributor

I would also note that enabling non-local traffic to allow host-Docker communication also means that we should never expose the DogStatsD port 8125/8126(?) to the internet.

@timotheeg
Copy link
Contributor Author

I would also note that enabling non-local traffic to allow host-Docker communication also means that we should never expose the DogStatsD port 8125/8126(?) to the internet.

That is correct. I don't think think we would ever open these ports (?). In fact they should not even be opened to any other hosts in the network.

Copy link
Contributor

@tshuli tshuli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm except one nit

src/app/loaders/express/index.ts Outdated Show resolved Hide resolved
Copy link
Contributor

@mantariksh mantariksh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, one question

src/app/loaders/express/index.ts Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants