Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: added datadog-metrics package #63

Closed
wants to merge 3 commits into from

Conversation

theburningmonk
Copy link
Contributor

@theburningmonk theburningmonk commented Jun 23, 2019

What did you implement:

Closes #51

How did you implement it:

Moved the datadog-metrics package into this repo, and updated the HTTP client to use it instead. Behaviour-wise there's no breaking changes.

Todos:

  • Write tests
  • Write documentation
  • Fix linting errors
  • Make sure code coverage hasn't dropped
  • Update the messages below

Is this ready for review?: YES
Is it a breaking change?: NO

Copy link
Contributor

@ChrisWestcottUK ChrisWestcottUK left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to open source the datadog metrics package separately?

The HTTP client could be optionally passed an object with interface for metrics require to measure. This would mean that don't have to force users to use datadog but another provider using some very simple shim code.

Even if you do bring in this library, do we need the synchronous version at all? We could remove the dependency on datadog-metrics without it and keep it as simply StatsD (+ datadog extensions)

@theburningmonk
Copy link
Contributor Author

theburningmonk commented Jun 25, 2019

@ChrisWestcottUK it would be better to just open source that package, but as I explained in #51 the process for open sourcing new projects at DAZN is still pretty slow - and without it the HTTP client is only usable for DAZN employees.

Thought about making it take a shim as well, but decided to leave it for another PR.

I actually left out the synchronous version at first, happy to take it out if you guys think we're better off without it. At the moment, the default behaviour is to write to DogStatsD to CW logs already, so you'd have to explicitly say you want the synchronous version, and the datadog-metrics package should be lazily loaded so wouldn't add to your cold start if you just stick to the default, async metrics tracking.

@ChrisWestcottUK
Copy link
Contributor

Thank you @theburningmonk and I'm happy with that. Maybe the approach of get it out there first then work on it is a better one.

As a side note - I'd prefer not using process.env to pass config to this modules too, but we can iterate on that later maybe.

@gsingh1
Copy link
Contributor

gsingh1 commented Jun 27, 2019

It seems to me that using Datadog metrics with the http-client is quite a specific use case just for DAZN. As this project has now been open sourced, I don't think we should assume that all users of this package will be using Datadog and it feels like we are only bringing in this package for this one use case. At the least, there should be some seperation of concerns.

Perhaps it should be up to the user to decide if they want metrics at all, and if they do, they can roll their own or through other modules.

We'd still have to find a way to gracefully handle an alternative though if we do go down the route of not including Datadog metrics in the http client.

@theburningmonk
Copy link
Contributor Author

@gsingh1 On the first point about whether to record custom metrics from the HTTP client. It is a good practice that folks should be doing to harden around the system integration points. You need to monitor the performance (both latency and error rate) of systems that you depend on, how else would you know which IO op is the culprit when your function runs sluggishly, or times out, or errors? You need to leave those breadcrumbs so you can debug issues that happen in production.

If anything, we oughtta do more to harden these integration points, e.g.:

All these, are leading practices that we should support, or at least make it easy for users to adopt.

On a related note, if you haven't already, you should read Michael Nygard's Release It, and check out Hystrix which codifies a lot of the patterns (and many more) from Nygard's book.

That is in line with the goal of this project - to make productionizing serverless applications easy by helping you do the right thing by default.

I agree we should add options to disable the custom metrics, but the default should be to record the custom metrics from the HTTP client.

On the second point about the DogStatsD syntax. If you're going to support async tracking (by writing to CW Logs) then you'd have to settle on a format regardless. If you choose a custom format then everybody needs to parse it. Using the DogStatsD format then at least people who use DataDog don't have to parse it themselves. Again, we can make it customizable so you can add your own module for how you want to track custom metrics (or to disable it) but that's for another PR.

And lastly, as I mentioned in #51, removing metrics tracking would result in a breaking change that would cause DAZN people's dashboards, etc. to stop working. And I don't see a good argument for removing metrics tracking apart from making it easier for us to ship a version of the HTTP client that doesn't depend on the dazn-datadog-metrics package (hence why I did this PR).

@gsingh1
Copy link
Contributor

gsingh1 commented Jun 29, 2019

@theburningmonk Indeed it is true that we should all be monitoring our serverless applications with enhanced monitoring. Yes it is best practise and we should support this as much as we can, but to a degree. I am not aware of any other http clients that record custom metrics out of the box.

However, in the confines of the lambda-powertools-http-client, where its main purpose is to act as the http client in addition to decorating the request with correlation ids which is also a core part of this project, it just feels as if we forcing the datadog metrics upon the user. Maybe, returning timings is enough and gives users the chance to handle their own instrumentation?

Other libraries have some interesting approaches to deal with timing measurements, e.g:

The above examples make the timing metrics available to be used for instrumentation and whatever else the user desires but don't get in the way of enforcing users to use a specific vendor or format.

Another alternative is to make the datadog-metrics package an optional dependency in the meantime (if more discussions are required and as to not delay the breakage) and handle the exclusion in code.

That being said, I am more than happy to have this package included as part of powertools but just wanted to offer my opinion and provoke a thought for the wider users of the package outside DAZN to whom the datadog metrics may be obsolete.

@theburningmonk
Copy link
Contributor Author

@gsingh1 is your objection the fact that we're recording custom metrics out of the box, or that the custom metrics are currently recorded with DogStatsD format?

For the first point, I strongly believe that we should continue to record custom metrics, by default.

When you said:
image
This wouldn't be how I'd categorize what we're doing with the HTTP client as well as the rest of the packages here.

As stated in the design goals for this project:
image
The goal for this project, including the HTTP client, is to give you an opinionated set of tools that does the right thing (as according to our opinion of what's required for a production-ready app) by default. Tracking correlation IDs (and therefore building observability into the system) is certainly a big part of that, but not the only thing - monitoring all your integration points, and error handling, and having timeout, etc. are all part of "doing the right thing".

We have different design goals to other HTTP clients, which are general purpose client libraries for making HTTP requests, and they don't make assumptions about what you do to harden those HTTP interactions, so they give you the flexibility to do whatever you want. That's not our design goal.

We have an opinion about what you need to do to harden those HTTP interactions, and that's why we take a battery-included approach and have a much narrower focus. I certainly wouldn't want us to end up writing a general purpose HTTP client, so features like interceptors should be out of the scope. If you (as in, the user of this library) don't agree with our opinion of what's right (in terms of whether to record custom metrics, etc.), then you can still use your fav HTTP clients and use the other tools, like the correlation-ids module, and forward the correlation IDs via HTTP headers yourself.

For the second point, as I mentioned in my previous response, we gotta pick a format somehow and we picked what DAZN is using today - DataDog. I think we should offer the option to record the custom metrics in your own format (for ease of parsing and what not), but not right now. As things stand, non-DAZN users can't even install the HTTP client that's starving us of the external feedback we need to figure out how to make it actually useful for non-DAZN people.

I could have missed something, but I saw the options (to resolve the pressing issue of the HTTP client not being externally accessible) as:

  1. dropping custom metrics tracking
  2. open source the dazn-datadog-metrics package
  3. port the dazn-datadog-metrics to this project
  4. give you option on how to record custom metrics

I don't agree with option 1 because of the aforementioned design goals, and that it'll be a breaking change for internal DAZN users.

Option 2 would be great, but is likely going to take too long, unless one of you are happy to figure it out?

Option 4 would still require dazn-datadog-metrics for all internal DAZN users, and it would be a breaking change if we don't do it by default (as in, if we require extra configurations, and therefore code change to record custom metrics in DogStatsD format). Again, I'd like to avoid breaking change until we've had some external feedback to help decide future direction for this thing.

which kinda left me with option 3, which is where we are. Do you have other suggestions in terms of immediate actions we can take instead?

@simontabor
Copy link
Contributor

We might be able to reach a happy middle-ground here...

  • Allow metrics to be passed as an option. This should be an object with increment and histogram methods. This approach works for all use-cases - most of the time you can simply pass an instance of statds/hot-shots/datadog-metrics/whatever directly through, but custom implementations are easy
  • Move our datadog-metrics module to an optional dependency and make it the default option for metrics

After these steps, we'll unblock everyone and prevent breaking changes both internally and externally. We can still highly recommend that people pass metrics as an option.

Then we can take some time to properly review the http-client (and datadog-metrics) and the long-term approach we want to take. Personally, I don't think we should be building our own http client, and I also don't think we should force people to use Datadog, but these discussions shouldn't block fixing the existing solution for other people.

@theburningmonk
Copy link
Contributor Author

@simontabor do you mean keeping the existing behaviour of writing metrics in DogStatsD format, and give you a way to override it? That I believe we should support, and does not introduce a breaking change.

Let's talk about how far we go with taking the plumbing with the HTTP client in the other thread. The idea is not to build our own HTTP client, but just a wrapper around an existing one and maintain the narrow use cases we currently provide with the http-client module.

@theburningmonk
Copy link
Contributor Author

no longer necessary now that the datadog package is open sourced

@theburningmonk theburningmonk deleted the feature/add_datadog_metrics branch July 25, 2019 00:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

dazn-datadog-metrics is not public
4 participants