API uptime monitoring #91
Hi @Alan-R, @Birne94, @BusinessFawn, @cetanu, @megetron, @funkyfuture, @WhileLoop, @zhaogp, @sdementen, @slmingol, @aweidner, @justinfay, @vit-goncharov, @jones77, @Fosity, @ppr-A320, @raghavakora, @DeadDuck, @hugogu, @lpodl,
We're investigating whether there is demand for additional tools in the Tavern ecosystem. Uptime monitoring using Tavern looks the most exciting at the moment.
You'd use your existing Tavern tests to define uptime checks with a CLI-based CI/CD deployment, keeping your uptime tests in sync with your code changes. The service would generate a performance overview dashboard and an uptime page, and send alerts when there is a problem.
Some other services that offer this kind of thing are Postman's monitoring, API Fortress and Runscope, but we think the clarity of Tavern's syntax, the multi-stage tests and the chance to re-use your integration tests could make it a substantial improvement on those tools.
Does this sound like a useful tool/service? If not, how do you monitor APIs at the moment?
If you'd rather not reply here, please feel free to drop me an email on email@example.com.
Finally - thanks so much for using Tavern and helping to build a community around the tool! We're looking at value-add services around Tavern so that we can spend more time working on the open source product, and we hope that new products and tools built around Tavern will help the main project thrive!
Our main use case for tavern so far is integration testing our backend services and how they interact with each other. I have played with the thought of having the tests run against a staging or even production server, but refrained from doing so for several reasons:
These reasons of course only result from my rather limited experience in the operational aspect of software engineering, so if there are any misconceptions here or any concepts I should know about, I would really like to know!
Since we deploy all of our services on AWS, we use their integrated monitoring tools for now. Each api server provides a simple
For catching issues in production, we have set a collection of alarms which include:
These alarms, together with stack traces from Sentry, are collected and pushed to a Slack channel. If anything breaks or behaves abnormally, we usually receive a notification within seconds.
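As a rough illustration of the "push to a Slack channel" step, here is a minimal sketch that builds a POST request for a Slack incoming webhook, which accepts a JSON body with a `text` field. The function names, the alarm name and the webhook URL are all made up for the example, and this is not how the AWS alarms described above are actually wired up.

```python
import json
import urllib.request


def build_slack_request(webhook_url: str, alarm_name: str, detail: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for a Slack incoming webhook.

    Assumption: webhook_url is an incoming-webhook URL you created in Slack;
    the message format below is purely illustrative.
    """
    payload = {"text": f"ALARM {alarm_name}: {detail}"}
    return urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 5.0) -> int:
    """Send the prepared request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Keeping the request construction separate from the sending makes the message format easy to check without touching the network.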
Additionally, AWS allows creating dashboards of different metrics which I have opened on a spare monitor most of the time. If I catch anything suspicious, I can usually investigate in no time.
I can see uptime monitoring with Tavern becoming a valuable asset when building a system of many (internal or external) services which need to communicate with each other. Such a system has many points of failure, and one failing service might affect many others. Getting early notifications if any of this happens (ideally before a user encounters the problem) is crucial, so short-interval integration tests look promising to me!
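To make the idea of short-interval integration tests with early notifications a bit more concrete, here is a minimal sketch. It assumes the Tavern tests are run through pytest (Tavern is a pytest plugin) and treats a non-zero exit code as "down". The names `run_suite` and `monitor`, and the interval and rounds parameters, are hypothetical and not part of Tavern or of the proposed service.

```python
import subprocess
import sys
import time
from typing import Callable, List, Sequence


def run_suite(command: Sequence[str]) -> bool:
    """Run a test command and report True when it exits with status 0."""
    return subprocess.run(command, capture_output=True).returncode == 0


def monitor(
    check: Callable[[], bool],
    notify: Callable[[str], None],
    interval_seconds: float,
    rounds: int,
    sleep: Callable[[float], None] = time.sleep,
) -> List[bool]:
    """Run `check` `rounds` times and notify only when the state changes.

    Notifying on transitions (up -> down, down -> up) avoids sending an
    alert on every failing round of the same outage.
    """
    was_up = True  # assumption: the service is considered up before the first check
    history: List[bool] = []
    for i in range(rounds):
        is_up = check()
        history.append(is_up)
        if is_up != was_up:
            notify("recovered" if is_up else "failing")
        was_up = is_up
        if i < rounds - 1:
            sleep(interval_seconds)
    return history
```

A call could look like `monitor(lambda: run_suite([sys.executable, "-m", "pytest", "tests/"]), print, 60, rounds=10)`, where `tests/` is a placeholder for wherever the `.tavern.yaml` files live.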
Overall, I am very happy with the development and community of this project and - even though I haven't had much time recently to play around with it - would really like to see it grow.
We use tavern for local acceptance tests for microservices and some of our code-driven reverse proxies. In addition we currently use Tavern to perform post-deployment verification (PDV) in various environments including production, where we utilize canaries.
We have a lot of monitoring systems in place currently, and most of the things that tickle us at night-time and lead to real problems are about raw metrics such as memory, cache, CPU, error rates, latency.
We would consider it, but we're unsure if it's going to change our lives at this point. @Birne94 makes a good point regarding analytics/log pollution as well, but this wasn't an immediate concern for myself.
We generally just run our PDV once after a deployment, and locally as part of our builds, and we're happy with that and haven't thought about performing them continuously on live systems.
Probably my biggest concern is the implications for the Tavern library if such an application were made. I imagine there would be additions to suit the new application, and we are wary of this. We like the lightweight nature of the library in its current state. I'd probably like those things to go through a more democratic process (if that is even possible).