
Caseflow metrics for OIT #11198

Closed
lpciferri opened this issue Jun 25, 2019 · 5 comments

lpciferri commented Jun 25, 2019

Metrics we need to share with OIT/OMB

| Number | Metric | Status | Numerator | Denominator |
| --- | --- | --- | --- | --- |
| 1 | Percentage of all cases certified with Caseflow | Needs CorpDB data, blocked | | |
| 2 | Percent of non-denial decisions with an EP created within 7 days | Needs CorpDB data, blocked | | |
| 3 | Veteran show rates at Board hearings ((total scheduled meetings held - postponed hearings) / total scheduled meetings from Caseflow Hearing Schedule) | Testing in Looker here: https://caseflow-looker.va.gov/looks/193 (though I think this is just AMA hearings) | | |
| 4 | Percentage of AMA reviews established within 7 days | Testing in Looker here: https://caseflow-looker.va.gov/looks/192 (appeals only to start) | | |
| 5 | Adoption of Caseflow Reader for appeal decisions (percentage) | Got denominator. Need help with "opened in Reader." | | Legacy appeals decisions: 7767 (https://caseflow-looker.va.gov/looks/194); AMA appeals decisions: 132 (https://caseflow-looker.va.gov/looks/195) |
| 6 | Average Customer Effort Score (CES) | Will do next month. Need to finish, then send out survey. | | |
| 7 | Mean (average) time to recovery (minutes) | To do | | |

The metrics below this line were not actually the ones approved by OIT/OMB. See above.

Caseflow Metrics

1. System availability (percentage), measured as the percentage of HTTP responses returned with a non-500 status code

Original metric: “System availability (percentage)”
Revised metric: “System availability (percentage), measured as the percentage of HTTP responses returned with a non-500 status code”
Reporting: Monthly, with a target of 99.9% or higher. The DevOps team will create a dashboard to report this number using DataDog, Grafana, or similar.

Caseflow recommendation: This is an important measure of a system’s performance. There are two factors left ambiguous here, however. First, it is not clear whether downtime that results from a system dependency should count against this measure. Second, even when the system and its dependencies are listed with an available status, a bug or unknown outage in a dependency could result in an error being shown to the user.

Our view is that a measure of availability should include downtime as a result of dependencies, even when the system being measured is not at fault, because this reflects the experience of the system’s users, and could lead the team to take steps to insulate the system from dependency unreliability. For the same reason, we believe that a single measure of availability should be designed to measure errors propagated to the user, even when the system and its dependencies are nominally available.

By measuring, at the load balancer, the percentage of HTTP responses returned with a non-500 status code, we can understand the rate at which an error in the system or one of its dependencies is visible to a user. Note that this measure is contingent on proper semantic use of HTTP status codes, which is a practice of the Caseflow development team.
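
For illustration, a minimal sketch of how this number could be derived once status codes are collected at the load balancer; the input format and the DataDog/Grafana plumbing are assumptions for illustration, not the actual reporting pipeline:

```python
# Minimal sketch: availability as the share of responses that are not HTTP 500,
# tallied from status codes observed at the load balancer. The input format is
# an assumption for illustration, not Caseflow's actual logging pipeline.

def availability(status_codes):
    """Percentage of responses returned with a non-500 status code."""
    if not status_codes:
        return 100.0  # no traffic in the window; report as fully available
    non_errors = sum(1 for code in status_codes if code != 500)
    return 100.0 * non_errors / len(status_codes)

# Example: 9,990 successful responses and 10 HTTP 500s -> 99.90%
codes = [200] * 9990 + [500] * 10
print(f"Availability: {availability(codes):.2f}%")  # target: 99.9% or higher
```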

2. Average page load time (in seconds)

Reporting: Monthly, with a target of 3 seconds or less. Available through the existing New Relic integration.

Caseflow recommendation: We concur with this measure as is. It is important to note, however, that depending on the architecture of any two systems, this measure may not present an apples-to-apples comparison. Caseflow is a Single-Page Application (SPA). This means that much of the application code is provided to the user’s browser on the first load, with additional client-server communication facilitating data transfer or updates. This is different from an application that re-renders a page of the application on the server on every request and provides just that page to the user’s browser.

3. Adoption of Caseflow Reader for appeal decisions (percentage)

Original metric: Average time (days) to process an appeal
Revised metric: “Adoption of Caseflow Reader for appeal decisions (percentage)”
Reporting: Monthly, with a target of 98% or more. This metric is already included in the Caseflow Product Impact Statement, and Chris Given will port it to Looker for ease of access.

Caseflow recommendation: There is no single measure of an appeal’s processing time. Caseflow tracks all VA decision reviews, not just Board Appeals. Even for appeals, there are multiple different dockets that the appellant might select, each of which will entail a different duration of wait before being distributed to a judge for a decision.

An outcome-oriented metric that is more directly connected to the value that Caseflow offers is in usage of Caseflow Reader by attorneys and Veterans Law Judges. Reader is designed for these users to help them more efficiently complete time-consuming tasks related to evidence review and annotation as they draft decisions on appeals. Using Reader is not mandatory, and attorneys and VLJs can access the same information through VBMS if they prefer. Adoption of Caseflow Reader thus measures the extent to which Caseflow is delivering a better tool that empowers VA employees to improve the timeliness of the appeals process.

4. Customer Effort Score

Original metric: “Average System Usability Scale (SUS) score of all systems in the investment that are currently in the target architecture”
Revised metric: “Average Customer Effort Score (CES)”
Reporting: Quarterly, with a target of 5 (out of 7) or higher. Survey to be designed and administered by the Caseflow design team, ideally within the Caseflow application itself.

Caseflow recommendation: SUS is not intended for this sort of ongoing measurement of software with which the user is familiarized. SUS measures perceived usability as reported by the user and is generally administered proximate to a usability test. Instead, we recommend a simpler, single-question instrument, the Customer Effort Score (CES). CES assesses the subjective ease of use of a specific task. When these responses are aggregated at the system level, they can provide a view of overall system usability; the disaggregated data provides additional value to a design team in identifying specific workflows that are pain points for users.
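
For illustration, a minimal sketch of how CES responses could be rolled up into the reported system-level average while preserving per-workflow breakdowns; the workflow names and scores below are made up:

```python
# Sketch: aggregate 7-point Customer Effort Score responses. The system-level
# average is the reported number; per-workflow averages help the design team
# spot pain points. Workflow names and scores are illustrative only.
from collections import defaultdict
from statistics import mean

responses = [
    ("schedule hearing", 6), ("schedule hearing", 7),
    ("certify appeal", 3), ("certify appeal", 4),
    ("intake review", 6),
]

system_average = mean(score for _, score in responses)
print(f"System CES: {system_average:.1f} / 7 (target: 5 or higher)")

by_workflow = defaultdict(list)
for workflow, score in responses:
    by_workflow[workflow].append(score)
for workflow, scores in sorted(by_workflow.items()):
    print(f"  {workflow}: {mean(scores):.1f}")
```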

5. Mean (average) time to recovery (minutes)

Original metric: “Number of identified shared services (per quarter) integrated into Benefits Integrated Platform (BIP)”
Revised metric: “Mean (average) time to recovery (minutes)”
Reporting: Quarterly, with a target of 30 minutes or less. To be reported by the DevOps team on the basis of data captured in incident post-mortems written in response to outages during the reporting period.

Caseflow recommendation: Although using shared services was an important consideration in developing the initial architecture of Caseflow, our view is that Caseflow is now fully integrated with all the services required for known business needs. As a result, this measure is not useful going forward.

We suggest in its place “mean (average) time to recovery,” measuring the team’s ability to respond to an outage. Outages entail significant cost to our customers in the form of lost productivity. Effective DevOps practices ensure that when such an error is identified, the team has ready options at its disposal to quickly return the system to an operational state.
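
For illustration, a minimal sketch of how this could be computed from post-mortem timestamps; the record fields are assumptions about how the data might be captured, not an existing Caseflow data model:

```python
# Sketch: mean time to recovery (minutes) from incident post-mortem records.
from datetime import datetime
from statistics import mean

incidents = [
    {"reported": datetime(2019, 5, 2, 14, 0), "recovered": datetime(2019, 5, 2, 14, 25)},
    {"reported": datetime(2019, 6, 11, 9, 30), "recovered": datetime(2019, 6, 11, 10, 5)},
]

recovery_minutes = [
    (i["recovered"] - i["reported"]).total_seconds() / 60 for i in incidents
]
print(f"MTTR: {mean(recovery_minutes):.0f} minutes (target: 30 or less)")
```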

lpciferri self-assigned this Jun 25, 2019

lpciferri commented Jun 25, 2019

Note: I learned that OIT/OMB did not approve the metrics Chris suggested. Here is what was approved.

| Metric ID | Metric Description | Unit of Measure | Performance Measurement Category Mapping | Agency Baseline Capability | 2018 Target | 2019 Target | Measurement Condition | Reporting Frequency |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1709076050 | Caseflow - Percentage of all cases certified with Caseflow, which assists Agency Original Jurisdiction in uploading all required documents of record to a shared electronic file repository and automatically prepares the Certification of Appeal to transfer the appeal to the Board of Veterans' Appeals. | Number of appeals transferred to the Board using Caseflow / total number of appeals transferred to the Board (manual certifications of paper appeals and Caseflow certifications) | Strategic and Business Results | 85.7 | 90 | 90 | Over target | Monthly |
| 1709076052 | Caseflow Dispatch - Percent of non-denial decisions with an EP created within 7 days. Caseflow facilitates the transfer of cases to the Regional Office following issuance of a Board of Veterans' Appeals decision, with an end product (EP) created to track work. Not creating an EP increases the possibility of miscommunication and slows average processing times, delaying ratings adjustment and working of remand orders. | Number of EPs with date <= 7 days after outcoding date / number of non-denial decisions | Strategic and Business Results | 79.6 | 80 | 80 | Over target | Quarterly |
| 1812216039 | Increase Veteran show rates at Board hearings using Caseflow Hearing Schedule, which enables a standardized scheduling process providing the ability to schedule hearings at both regional offices and alternate locations closer to Veterans' homes. | (Total scheduled meetings held - postponed hearings) / total scheduled meetings from Caseflow Hearing Schedule | Strategic and Business Results | | N/A | 75 | Over target | Quarterly |
| 1812216040 | Percentage of AMA reviews established within 7 days | Date in Intake appeal established minus date entered | Strategic and Business Results | | N/A | 90% | Over target | Monthly |
| 1812216042 | Adoption of Caseflow Reader for appeal decisions (percentage) | Date of decision minus date of appeal establishment | Customer Satisfaction (Results) | | N/A | 98% | Over target | Annual |
| 1812216043 | Average Customer Effort Score (CES) | 5 out of 7 positive responses or better | Customer Satisfaction (Results) | | N/A | 5 out of 7 or better | Over target | Annual |
| 1812216044 | Mean (average) time to recovery (minutes) | Outage recovery time minus outage report time | Innovation | | N/A | 30 minutes or less | Under target | Quarterly |
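
For reference, a worked example of the show-rate formula from metric 1812216039 above; the counts are illustrative, not actual Hearing Schedule data:

```python
# Worked example of the approved show-rate formula:
# (scheduled hearings held - postponed hearings) / total scheduled hearings.
held = 180
postponed = 20
scheduled = 240

show_rate = 100.0 * (held - postponed) / scheduled
print(f"Show rate: {show_rate:.1f}% (2019 target: 75)")  # 66.7% in this example
```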

@lpciferri
Copy link
Contributor Author

lpciferri commented Jul 9, 2019

Questions clarified:

  1. Can you confirm the time window we're evaluating for "Monthly" metrics is June 1 - June 30, and for "Quarterly" metrics April 1 - June 30? If it's through the end of the month, what is the exact date that these metrics are due?
     Answer: Yes, it's for the calendar month. And yes, it's the calendar quarter, not the fiscal quarter, so you'll give us April 1 to June 30.
  2. Mean (average) time to recovery (minutes) - Should this metric include outages due to dependencies or not? (We recommend excluding outages due to dependencies. For example, when CSEM in Austin, TX had an outage, Caseflow was down. We hope this doesn't count against our scores.) Also, can you clarify the timing for this metric - is this 24/7, or should this reflect some definition of business hours? (For example, if there is an outage at midnight, our team will respond when online the next morning.)
     Answer: I agree that it shouldn't include dependencies, as that doesn't measure the actual application. I also don't think it should include planned maintenance window outages, just unplanned ones. I also agree that outage recoveries should occur within a business-hours window, from 7:00 am to 6:00 pm ET, which provides an hour of extra catch-up time on either side. (See the sketch after this list.)
  3. Percentage of AMA reviews established within 7 days - the calculation listed above is date in Intake appeal established minus date entered. Can you please clarify what you mean by date entered?
     Answer: It's my understanding that there is a lag between the time an appeal request is entered in the system and the date it becomes an End Product (EP). That is what this metric refers to.
  4. Adoption of Caseflow Reader for appeal decisions (percentage) - I want to clarify the unit of measure for this. I think you've copied and pasted the wrong description.
     Answer: This was from Chris Given's recommendation. He was supposed to have migrated the metric to Looker before he left. See below.
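
For illustration, a minimal sketch of applying the clarifications in item 2 when computing mean time to recovery; the incident fields are assumptions, and the business-hours (7:00 am - 6:00 pm ET) clock is noted in a comment rather than implemented:

```python
# Sketch: only unplanned outages in the application itself count toward MTTR.
# Per the clarification, the recovery clock would run only within the
# 7:00 am - 6:00 pm ET business window (not implemented in this sketch).
from datetime import datetime
from statistics import mean

incidents = [
    {"reported": datetime(2019, 6, 3, 9, 0), "recovered": datetime(2019, 6, 3, 9, 20),
     "planned": False, "dependency_outage": False},
    {"reported": datetime(2019, 6, 10, 13, 0), "recovered": datetime(2019, 6, 10, 15, 0),
     "planned": False, "dependency_outage": True},   # e.g. a CSEM outage: excluded
    {"reported": datetime(2019, 6, 15, 2, 0), "recovered": datetime(2019, 6, 15, 2, 45),
     "planned": True, "dependency_outage": False},   # maintenance window: excluded
]

countable = [i for i in incidents if not i["planned"] and not i["dependency_outage"]]
minutes = [(i["recovered"] - i["reported"]).total_seconds() / 60 for i in countable]
print(f"MTTR over countable incidents: {mean(minutes):.0f} minutes")
```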


pkarman commented Jul 19, 2019

We track many of these in DataDog, e.g. https://app.datadoghq.com/dashboard/jcd-hng-7gh/caseflow-service-level-indicators


carodew commented Jul 24, 2019

Is it worth considering this issue an epic with stuff assigned to it? I ask because it seems like there should be a link between #10542 and this one.


pkarman commented Sep 27, 2019

subsumed by #11994

pkarman closed this as completed Sep 27, 2019