Datadog Replacement - Roll out New Relic #10997

Closed
Tracked by #11304 ...
lin-d-hop opened this issue Jun 13, 2023 · 18 comments
Assignees
Labels
priority We focus on this issue right now

Comments

@lin-d-hop
Contributor

lin-d-hop commented Jun 13, 2023

Description

Datadog is one of our more expensive tools. We value it, but we just don't use it enough to justify the cost.
Is there another tool that will serve our needs at a lower cost?

Acceptance Criteria & Tests

  1. Research potential alternatives
  2. Compare features and costs and make a proposal
  3. Get approval from budget holders
  4. Remove our paid Datadog Plan manually by June 28
  5. Set up OFN instance on new tool
  6. Document new tool
@Matt-Yorkley
Contributor

Alright, after reviewing a lot of options I've settled on two that I think are viable. The difficulty here is that these things are generally quite expensive and they're aimed at medium-large enterprises that usually have a fair bit of money to spare, but we're in the unenviable position of being surprisingly big in scale but small in budget...

The current costs of our monitoring are $2500 per year plus ~$25 per month so somewhere around $2800 in total.

Option 1: Sentry.io (SaaS)

Sentry offers the holy trinity of logs, metrics and APM and they have a pretty decent pricing structure. I think we could go for the "teams" option, which is $26/month and comes with a certain usage allowance, plus additional cost based on usage. It's pretty difficult to predict what that "maybe extra" amount might be, but it would probably end up at something like $30 - $40 per month in total.

They also have a free offering for Open Source non-profit projects here: https://sentry.io/for/open-source/ which we could apply for.

If we apply and get it, I think we should go with this option, otherwise if we don't get it I think it's a toss up between this and option 2...

Option 2: ElasticStack (self-hosted)

I looked at this a couple of years back and put together a working implementation and it seemed pretty good. At the time we were blocked from using the APM because it had a minimum requirement for Rails and Ruby versions that we didn't meet, but now we're on Rails 7.x and Ruby 3.x that's not an issue.

It would cost maybe $28/month for a DO droplet but we'd also have to fix it if/when it breaks. I don't think it'd be much maintenance. It could possibly replace some other tools like uptime monitoring, maybe error monitoring, and some other things (there are a lot of features), and there would be no extra cost for usage or for monitoring additional servers.
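
As a rough sanity check on the figures above, the annual costs can be compared side by side. This is only a sketch: the Sentry range is the guessed $30 - $40 per month, the droplet price is the ~$28/month estimate, maintenance time for self-hosting is not costed, and the variable names are made up for illustration.

```python
# Annual cost comparison using only the figures quoted in this comment.
MONTHS_PER_YEAR = 12

# Current monitoring: $2500 per year plus ~$25 per month.
current_annual = 2500 + 25 * MONTHS_PER_YEAR

# Option 1, Sentry "teams": guessed $30 - $40 per month including usage.
sentry_annual_low = 30 * MONTHS_PER_YEAR
sentry_annual_high = 40 * MONTHS_PER_YEAR

# Option 2, self-hosted ElasticStack: ~$28 per month for a droplet
# (assumption: our own maintenance time is not included).
elastic_annual = 28 * MONTHS_PER_YEAR

print(f"Current:      ${current_annual} per year")
print(f"Sentry:       ${sentry_annual_low} - ${sentry_annual_high} per year")
print(f"ElasticStack: ${elastic_annual} per year")
```

Either option comes out at a fraction of the current spend, so the choice is more about maintenance effort than price.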

@mkllnk
Member

mkllnk commented Jun 18, 2023

Sentry seems values-aligned, and even if we had to pay, the cost wouldn't be that different to a self-hosted stack. Self-hosting always adds work for us. I often prefer self-hosting when it's based on open source and the commercial offers are totally closed, but Sentry seems to be very open. I would like to support that.

My vote goes for Sentry.

@jibees
Contributor

jibees commented Jun 19, 2023

My vote goes for Sentry as well, mainly because they have a free open-source sponsorship plan and I don't want to add tasks for the team with another self-hosted application.

@abdellani
Member

I also vote for Sentry.

@dacook
Member

dacook commented Jun 20, 2023

I think the preference is clear, but I agree with Sentry too. Seems like we are eligible for the free plan, and even if not, it's still way cheaper than Datadog.

@Matt-Yorkley
Contributor

Hmmm interesting, so we can only apply for the Open Source thing after we've signed up, so I'm signing up to a 14 day trial now.

Secondly, Sentry also does error reporting so we could potentially replace both Bugsnag and Datadog?

@mkllnk
Member

mkllnk commented Jun 20, 2023

I really like Bugsnag but it would be nice to have it all in one place. So if Sentry is just as good then we can switch.

@filipefurtad0
Contributor

> I really like Bugsnag but it would be nice to have it all in one place. So if Sentry is just as good then we can switch.

I've just noticed a performance tab on Bugsnag:

[image: screenshot of the performance tab on Bugsnag]

@dacook
Member

dacook commented Jul 19, 2023

Just wondering about current status, so I've pulled together an update on what I can find:

Delivery Circle: Matt to write up a list of features required.

We have abandoned Sentry:

I tried out Sentry on a free trial account and found it doesn't really do logs or metrics, so I'm not sure if it quite fits the bill...

Delivery Circle: next planned step is to try New Relic, as the price is now quite reasonable. Will put together an issue for it, maybe for Gaetan to pick up and pair.

Delivery Circle: Gaetan prioritising setting up of New Relic

@dacook
Member

dacook commented Jul 19, 2023

@Matt-Yorkley, in talking with @mkllnk we realised it would be helpful to be able to document what we need, alongside what is available, so I thought about putting together a matrix like this: https://docs.google.com/spreadsheets/d/1unIp7TPny81wylu2yFyrFn-i_2n0A2lAnCL5uGFxvcs/edit (anyone should be able to edit with the link).
[image: screenshot of the spreadsheet matrix, 2023-07-19]

Firstly, would you be able to update the list of requirements on the left (add/remove/break into smaller units)?
Then would you be able to fill out the matrix where possible? Hopefully this will help make it clearer to all what we need to achieve and what the best option(s) will be.

@rioug
Collaborator

rioug commented Jul 24, 2023

New Relic has some infrastructure monitoring: https://newrelic.com/platform/infrastructure, which can be set up via Ansible: https://docs.newrelic.com/docs/infrastructure/install-infrastructure-agent/config-management-tools/configure-infrastructure-agent-using-ansible/
I am not sure if it's something we need, and I don't know much about Ansible, but I am happy to have a go if needed.

@lin-d-hop
Contributor Author

lin-d-hop commented Aug 15, 2023

Some notes on the next steps to get this wrapped up.

1. Have a look at New Relic on AU, specifically considering how it compares to Datadog. Can we do everything we want?

If we don't get Thumbs up then abort, we're back to the drawing board. What is a drawing board?

2. Add payment details to New Relic

  • Decide which instance is paying
  • Decide if we are rolling out to all managed instances or only Core instances. Costs involved?
  • Set up payments

3. Roll out to all instances

  • All included servers provisioned to add New Relic.

4. Document

  • Add some wiki documentation to help others use and configure it in the future

@lin-d-hop lin-d-hop changed the title from "Datadog Replacement" to "Datadog Replacement - Roll out New Relic" Aug 15, 2023
@lin-d-hop lin-d-hop mentioned this issue Aug 15, 2023
6 tasks
@rioug
Collaborator

rioug commented Aug 18, 2023

  1. Have a look at New Relic on AU. Specifically considering comparison to data-dog. Can we do everything we want?

Quick note on this: New Relic also offers platform monitoring, which we don't have set up right now (they offer a third-party Ansible script, so it should be easy to add to ofn-install). I can't remember where it was discussed, but we agreed to just go with the basics for now, and look into adding more monitoring as we need it.

@mkllnk
Member

mkllnk commented Aug 21, 2023

I had a quick look at the current monitoring and it's quite comprehensive. We do need the infrastructure monitoring though to monitor memory usage.

In terms of pricing, it looks like we are using roughly 50% of our free plan at the moment. It will be interesting to see how that looks when we add infrastructure monitoring. If we keep it only on one server then we can have New Relic for free. We can also open one account per instance to get it for free. But then we have that annoying thing of needing to look at different accounts.

If we go down the paid option, we are probably going to pay around $15 - $30 per server. We can reduce the amount of data going in to reduce the cost but it adds up quickly.
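
To get a feel for how the paid option adds up, here is a small sketch of the cost range for a given number of servers. It assumes the rough $15 - $30 per server estimate above is a monthly figure (the billing period isn't stated here) and that cost scales linearly with server count. The function name and the server count in the example are hypothetical.

```python
def paid_cost_range(servers, low_per_server=15, high_per_server=30):
    """Return the (low, high) estimated cost in USD for `servers` servers.

    Assumption: cost scales linearly with the number of monitored servers,
    using the rough per-server range quoted above.
    """
    return servers * low_per_server, servers * high_per_server


# Illustrative server count only, not a confirmed number of instances.
low, high = paid_cost_range(6)
print(f"6 servers: ${low} - ${high}")
```

Reducing the amount of ingested data would lower the per-server figures, which is why they are left as parameters.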

I would probably sign up for the paid account, add all managed instances for standard monitoring and maybe one instance for the infrastructure monitoring.

Oh, hey, we should sign up for a not-for-profit account first. We may get all we need for free. Which organisation credentials do we use for this?

@mkllnk
Member

mkllnk commented Aug 28, 2023

I opened a new account and will see what that will yield.

@mkllnk
Member

mkllnk commented Aug 28, 2023

I tried to use the Ansible role to install the agent but our Ansible version is too old. I will try another method to test but we also need to update ofn-install. It's currently in a bad state, tests not running, out of date dependencies...

@mkllnk
Member

mkllnk commented Oct 9, 2023

Progress so far:

  • I wrote a little role to install New Relic. It has been installed on six servers so far.
  • Our account (that Gaetan created) has not-for-profit status which allows us to ingest 1,000 GB of data.
  • Alerts are set up and notify the Slack channel #devops-alerts.
  • The ofn-install build has been partially fixed.

Next steps:

  • Completely fix the ofn-install build.
  • Monitor all managed servers.
  • Add all servers to alerts.

@mkllnk
Member

mkllnk commented Oct 11, 2023

I think I completed all the acceptance criteria. Documentation:

https://github.com/openfoodfoundation/ofn-install/wiki/Server-monitoring#new-relic

@mkllnk mkllnk closed this as completed Oct 11, 2023