Caseflow metrics for OIT #11198
Comments
Note: learned that OIT/OMB did not approve the metrics Chris suggested. Here is what was approved.
Questions clarified:
We track many of these in DataDog, e.g. https://app.datadoghq.com/dashboard/jcd-hng-7gh/caseflow-service-level-indicators
Is it worth considering this issue an epic with stuff assigned to it? I ask because it seems like there should be a link between #10542 and this one.
Subsumed by #11994
Metrics we need to share with OIT/OMB
AMA appeals decisions - 132 - https://caseflow-looker.va.gov/looks/195
The metrics below this line were not the ones actually approved by OIT/OMB. See above.
Caseflow Metrics
1. System availability (percentage), measured as the percentage of HTTP responses returned with a non-500 status code
Original metric: “System availability (percentage)”
Revised metric: “System availability (percentage), measured as the percentage of HTTP responses returned with a non-500 status code”
Reporting: Monthly, with a target of 99.9% or higher. The DevOps team will create a dashboard to report this number using DataDog, Grafana, or similar.
Caseflow recommendation: This is an important measure of a system’s performance. There are two factors left ambiguous here, however. First, it is not clear whether downtime that results from a system dependency should count against this measure. Second, even when the system and its dependencies are listed with an available status, a bug or unknown outage in a dependency could result in an error being shown to the user.
Our view is that a measure of availability should include downtime as a result of dependencies, even when the system being measured is not at fault, because this reflects the experience of the system’s users, and could lead the team to take steps to insulate the system from dependency unreliability. For the same reason, we believe that a single measure of availability should be designed to measure errors propagated to the user, even when the system and its dependencies are nominally available.
By measuring at the load balancer the percentage of HTTP responses returned with a non-500 status code, we can understand the rate at which an error in the system or one of its dependencies is visible to a user. Note that this measure is contingent on proper semantic use of HTTP status codes, which is a practice of the Caseflow development team.
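As a rough sketch of how this number could be derived (not the DevOps team's actual DataDog/Grafana dashboard), the snippet below computes availability from load balancer access logs. The log path, the whitespace-delimited format, the field index of the status code, and the treatment of all 5xx responses as errors are assumptions for illustration only.

```python
# Hypothetical sketch: compute availability from load balancer access logs.
# Assumes a whitespace-delimited log format with the backend HTTP status code
# at a fixed field position -- both of these are assumptions.
from pathlib import Path

STATUS_FIELD_INDEX = 8  # assumed position of the status code in each log line


def availability_percentage(log_path: str) -> float:
    total = 0
    errors = 0
    for line in Path(log_path).read_text().splitlines():
        fields = line.split()
        if len(fields) <= STATUS_FIELD_INDEX or not fields[STATUS_FIELD_INDEX].isdigit():
            continue  # skip malformed lines
        total += 1
        status = int(fields[STATUS_FIELD_INDEX])
        # The metric text says "non-500"; counting any 5xx as an error is a
        # slightly broader reading and is an assumption here.
        if 500 <= status <= 599:
            errors += 1
    return 100.0 * (total - errors) / total if total else 100.0


print(f"Availability: {availability_percentage('alb-2019-06.log'):.3f}%")  # target: 99.9%+
```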
2. Average page load time (in seconds)
Reporting: Monthly, with a target of 3 seconds or less. Available through the existing New Relic integration.
Caseflow recommendation: We concur with this measure as is. It is important to note, however, that depending on the architecture of any two systems, this measure may not present an apples-to-apples comparison. Caseflow is a Single-Page Application (SPA). This means that much of the application code is provided to the user’s browser on the first load, with additional client-server communication facilitating data transfer or updates. This is different from an application that re-renders a page of the application on the server on every request and provides just that page to the user’s browser.
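To illustrate the comparability caveat above, here is a minimal sketch with invented timing samples (these are not Caseflow measurements): blending an SPA's heavy initial load with its fast in-app transitions produces a single average that looks quite different from either number on its own.

```python
# Hypothetical sketch: why a single blended "average page load time" can be
# misleading for a Single-Page Application. All values below are invented.
initial_loads_sec = [2.8, 3.1, 2.9]       # first visit: full application bundle
route_transitions_sec = [0.3, 0.4, 0.2]   # subsequent in-app navigation

blended = initial_loads_sec + route_transitions_sec
print(f"Blended mean:      {sum(blended) / len(blended):.2f}s")
print(f"Initial-load mean: {sum(initial_loads_sec) / len(initial_loads_sec):.2f}s")
print(f"Transition mean:   {sum(route_transitions_sec) / len(route_transitions_sec):.2f}s")
```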
3. Adoption of Caseflow Reader for appeal decisions (percentage)
Original metric: Average time (days) to process an appeal
Revised metric: “Adoption of Caseflow Reader for appeal decisions (percentage)”
Reporting: Monthly, with a target of 98% or more. This metric is already included in the Caseflow Product Impact Statement, and Chris Given will port it to Looker for ease of access.
Caseflow recommendation: There is no single measure of an appeal’s processing time. Caseflow tracks all VA decision reviews, not just Board Appeals. Even for appeals, there are multiple different dockets that the appellant might select, each of which will entail a different duration of wait before being distributed to a judge for a decision.
An outcome-oriented metric that is more directly connected to the value Caseflow offers is usage of Caseflow Reader by attorneys and Veterans Law Judges (VLJs). Reader is designed for these users to help them more efficiently complete time-consuming tasks related to evidence review and annotation as they draft decisions on appeals. Using Reader is not mandatory, and attorneys and VLJs can access the same information through VBMS if they prefer. Adoption of Caseflow Reader thus measures the extent to which Caseflow is delivering a better tool that empowers VA employees to improve the timeliness of the appeals process.
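A minimal sketch of the adoption calculation, assuming it is the share of appeal decisions drafted using Reader out of all decisions in the reporting period; the exact definition used in the Product Impact Statement and Looker may differ, and the counts below are illustrative only.

```python
# Hypothetical sketch of the Reader adoption calculation; the inputs (decision
# counts per tool) and the definition of the numerator are assumptions.
def reader_adoption_percentage(decisions_with_reader: int, total_decisions: int) -> float:
    """Share of appeal decisions whose drafting attorney/VLJ used Caseflow Reader."""
    if total_decisions == 0:
        return 0.0
    return 100.0 * decisions_with_reader / total_decisions


# Illustrative example: 130 of 132 decisions in the period drafted with Reader.
print(f"{reader_adoption_percentage(130, 132):.1f}%")  # target: 98% or more
```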
4. Customer Effort Score
Original metric: “Average System Usability Scale (SUS) score of all systems in the investment that are currently in the target architecture”
Revised metric: “Average Customer Effort Score (CES)”
Reporting: Quarterly, with a target of 5 (out of 7) or higher. Survey to be designed and administered by the Caseflow design team, ideally within the Caseflow application itself.
Caseflow recommendation: SUS is not intended for this sort of ongoing measurement of software with which the user is familiar. SUS measures perceived usability as reported by the user and is generally administered proximate to a usability test. Instead, we recommend a simpler, single-question instrument, the Customer Effort Score (CES). CES assesses the subjective ease of use of a specific task. When these responses are aggregated at the system level, they can provide a view of overall system usability; the disaggregated data provides additional value to a design team in identifying specific workflows that are pain points for users.
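A minimal sketch of how CES responses could be aggregated per task and rolled up to a system-level score; the task names, scores, and use of a simple mean are assumptions, not the design team's actual instrument.

```python
# Hypothetical sketch: aggregate Customer Effort Score (CES) responses.
# Responses are on a 1-7 scale (7 = easiest); the tasks and values are invented.
from statistics import mean

responses = {
    "complete evidence review": [6, 7, 5, 6],
    "annotate a document": [4, 5, 3],
}

per_task = {task: mean(scores) for task, scores in responses.items()}
system_score = mean(score for scores in responses.values() for score in scores)

for task, score in per_task.items():
    print(f"{task}: {score:.2f}")               # flags specific workflows that are pain points
print(f"System-level CES: {system_score:.2f}")  # reported quarterly; target of 5 (of 7) or higher
```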
5. Mean (average) time to recovery (minutes)
Original metric: “Number of identified shared services (per quarter) integrated into Benefits Integrated Platform (BIP)”
Revised metric: “Mean (average) time to recovery (minutes)”
Reporting: Quarterly, with a target of 30 minutes or less. To be reported by the DevOps team on the basis of data captured in incident post-mortems written in response to outages during the reporting period.
Caseflow recommendation: Although using shared services was an important consideration in developing the initial architecture of Caseflow, our view is that Caseflow is now fully integrated with all the services required for known business needs. As a result, this measure is not useful going forward.
We suggest in its place “mean (average) time to recovery,” measuring the team’s ability to respond to an outage. Outages entail significant cost to our customers in the form of lost productivity. Effective DevOps practices ensure that when such an error is identified, the team has ready options at its disposal to quickly return the system to an operational state.
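A minimal sketch of the MTTR calculation from post-mortem data; the incident timestamps and their format are invented for illustration, since a real report would pull start and resolution times from the post-mortems written during the reporting quarter.

```python
# Hypothetical sketch: mean time to recovery (MTTR) from incident post-mortems.
# Each tuple is (outage start, service restored); all values below are invented.
from datetime import datetime

incidents = [
    ("2019-06-03 14:02", "2019-06-03 14:21"),
    ("2019-06-18 09:45", "2019-06-18 10:30"),
]

def minutes_to_recover(start: str, resolved: str) -> float:
    fmt = "%Y-%m-%d %H:%M"
    return (datetime.strptime(resolved, fmt) - datetime.strptime(start, fmt)).total_seconds() / 60

durations = [minutes_to_recover(start, resolved) for start, resolved in incidents]
print(f"MTTR: {sum(durations) / len(durations):.0f} minutes")  # target: 30 minutes or less
```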