Hello,
Could you please add support for the metrics below? (Or provide a guide on how to use them if they already exist.)
- Cost
  - Total cost
  - Cost per request
  - Cost per feature – Could we somehow identify that a group of requests belongs to a single feature?
  - Token consumption
- Performance
  - Latency, throughput, CPU, memory – I guess these should be available out of the box from container metrics, right?
  - CPU and memory utilization per request
  - Token usage
- Reliability
  - Error rate and/or retry rate
  - Success rate
  - Uptime – By the way, do we have any health checks now?
Finally, based on these metrics, we would like to set up alerting, e.g.:
- High error rates
- Cost spikes
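
To illustrate what we mean by cost per feature and error rate, here is a rough sketch of the aggregation we have in mind. It is only an assumption of how this could look, not a description of anything that exists today: the `feature` tag, the `RequestRecord` type, the `summarize` function, and the price constant are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical price per 1,000 tokens; a placeholder, not a real rate.
PRICE_PER_1K_TOKENS = 0.002


@dataclass
class RequestRecord:
    feature: str  # hypothetical tag the caller attaches to each request
    tokens: int   # total tokens consumed by the request
    ok: bool      # whether the request succeeded


def summarize(records):
    """Aggregate cost, token usage, and error rate per feature tag."""
    totals = defaultdict(lambda: {"requests": 0, "tokens": 0, "errors": 0})
    for r in records:
        t = totals[r.feature]
        t["requests"] += 1
        t["tokens"] += r.tokens
        t["errors"] += 0 if r.ok else 1

    summary = {}
    for feature, t in totals.items():
        cost = t["tokens"] / 1000 * PRICE_PER_1K_TOKENS
        summary[feature] = {
            "total_cost": cost,
            "cost_per_request": cost / t["requests"],
            "tokens": t["tokens"],
            "error_rate": t["errors"] / t["requests"],
        }
    return summary
```

If requests could carry such a tag, alerts like "high error rate" or "cost spike" could then be thresholds on these per-feature values.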