
Post Trial Analysis

Joshua Selsky edited this page Feb 10, 2014 · 13 revisions

Clinician-Patient

As patients finish trials, their self-report data will be statistically analyzed against each regimen. During a clinician-patient meeting after a patient finishes a trial, the analysis results will be viewed in visualizations that are part of the Trialist front end.

Process

  • A nightly job will run inside of ohmage (the Open mHealth DSU) that will kick off the analysis process.
  • For each patient with a completed trial:
    • The raw self-report data will be converted to an intermediate representation more suitable for visualization and processing. See the Data Input Format section below.
    • The transformed data will be stored for later retrieval and visualization.
    • The transformed data will also be passed to our post-analysis processing hosted in OpenCPU.
    • When the analysis completes, the analyzed data set will be stored for later retrieval and visualization. See the Data Output Format section below.
  • The patient's data points and final results analysis will be available through the ohmage stream API. See the Reading Data section below.

Completed Trial

A completed trial is defined by a patient exercising all regimen cycles that were defined in the patient setup process. The length of a trial is calculated from the period length and the number of cycles:

numberOfDaysInPeriod * 2 * numberOfCycles = totalNumberOfDaysInTrial

To determine whether a patient has completed their trial, the DSU looks at the date the patient submitted the "Start Trial" survey and checks whether the full trial length has elapsed since then.

Trial lengths can be from 28 to 84 days. Patients are instructed to submit a daily self-report against the outcomes being measured. In certain cases, patients may generate fewer or more than one response per day, so the total number of daily self-reports may not match the trial length exactly.
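The trial length formula above can be sketched in a few lines of Python. This is a minimal illustration, not the DSU's actual code: the function names are hypothetical, and the end-date rule (the start date counts as day 1) is an assumption, chosen because it reproduces the trial_start_date and trial_end_date in the example metadata further down this page.

```python
from datetime import date, timedelta


def trial_length_days(days_in_period: int, number_of_cycles: int) -> int:
    """numberOfDaysInPeriod * 2 * numberOfCycles (one A and one B period per cycle)."""
    return days_in_period * 2 * number_of_cycles


def trial_end_date(start: date, days_in_period: int, number_of_cycles: int) -> date:
    # Assumption: the start date is day 1, so the last day is start + length - 1.
    length = trial_length_days(days_in_period, number_of_cycles)
    return start + timedelta(days=length - 1)


def is_trial_complete(start: date, days_in_period: int,
                      number_of_cycles: int, today: date) -> bool:
    # Assumption: a trial counts as complete once its last day has fully passed.
    return today > trial_end_date(start, days_in_period, number_of_cycles)


if __name__ == "__main__":
    print(trial_length_days(7, 2))    # shortest trial
    print(trial_length_days(14, 3))   # longest trial
    print(trial_end_date(date(2013, 11, 1), 7, 4))
```

With 7-day periods and 2 cycles this gives 28 days, and with 14-day periods and 3 cycles it gives 84 days, matching the range stated above.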

By default, the nightly job will look for trials that completed the previous day. The job will also allow a date parameter and a "process all" parameter so the outcome analysis can be run again for a particular date or for all trials.
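The selection rule for the nightly job might look roughly like the sketch below. The trial records, the select_trials function, and its run_date and process_all arguments are hypothetical stand-ins for the job's real parameters, and "all trials" is read here as all trials whose end date has already passed.

```python
from datetime import date, timedelta
from typing import Dict, Iterable, List, Optional


def select_trials(trials: Iterable[Dict], today: date,
                  run_date: Optional[date] = None,
                  process_all: bool = False) -> List[Dict]:
    """Pick the trials the job should analyze.

    Each trial is a dict with a hypothetical 'trial_end_date' key (a date).
    """
    if process_all:
        # Assumption: "process all" means every trial that has already ended.
        return [t for t in trials if t["trial_end_date"] < today]
    # Default: trials that completed the previous day, unless a date is given.
    target = run_date if run_date is not None else today - timedelta(days=1)
    return [t for t in trials if t["trial_end_date"] == target]


if __name__ == "__main__":
    trials = [
        {"username": "patient1", "trial_end_date": date(2013, 12, 26)},
        {"username": "patient2", "trial_end_date": date(2013, 12, 20)},
    ]
    print(select_trials(trials, today=date(2013, 12, 27)))
```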

Data Input Format

The data input ("intermediate representation") to the analysis function is a JSON object that contains a metadata object and a data array, where each element in the array represents a self-reported data point. Each data point is bucketed by regimen and cycle.

The data list will be sorted by time ascending. All timestamps are W3C ISO-8601 formatted.

Note that it is possible for a day to have more than one data point and for there to be daily (or longer) gaps between data points.
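A consumer of this data may want to find the days with more than one data point and the days with none. The sketch below is one way to do that, assuming each data point is a dict with a timestamp key as in the example that follows; the daily_coverage function is hypothetical, it uses the calendar date in the timestamp's own UTC offset, and it only looks for gaps between the first and last data point.

```python
from collections import Counter
from datetime import datetime, timedelta


def daily_coverage(data):
    """Return (days with more than one data point, days with none).

    Gaps are only detected between the first and the last reported day.
    """
    # fromisoformat parses timestamps such as 2013-11-01T20:05:00.000-08:00.
    days = [datetime.fromisoformat(p["timestamp"]).date() for p in data]
    counts = Counter(days)
    duplicates = sorted(d for d, n in counts.items() if n > 1)
    missing = []
    if days:
        day, last = min(days), max(days)
        while day <= last:
            if day not in counts:
                missing.append(day)
            day += timedelta(days=1)
    return duplicates, missing


if __name__ == "__main__":
    points = [
        {"timestamp": "2013-11-01T20:05:00.000-08:00"},
        {"timestamp": "2013-11-01T21:00:00.000-08:00"},
        {"timestamp": "2013-11-04T20:00:00.000-08:00"},
    ]
    print(daily_coverage(points))
```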

Example Data

{
    "metadata": {
        "regimen_a":["Tylenol", "Complementary treatment: including but not limited to physical activity (exercise, stretching, yoga), mindfulness (meditation, relaxation, music therapy)"],
        "regimen_b":["Hydrocodone combination product (e.g., Vicodin, Norco)"],
        "trial_start_date":"2013-11-01",
        "trial_end_date":"2013-12-26",
        "regimen_duration":7,
        "number_of_cycles":4,
        "cycle_ab_pairs":"AB,AB",
        "cognitiveFunctionPromptKey":"cognitiveFunctionFoggyThinkingPrompt"
    },
    "data": [
        {
            "timestamp":"2013-11-01T20:05:00.000-08:00",
            "regimen":"A",
            "cycle":1,
            "averagePainIntensity":5,
            "enjoymentOfLife":5,
            "generalActivity":5,
            "fatiguePrompt":3,
            "drowsinessPrompt":4,
            "constipationPrompt":2,
            "cognitiveFunctionFoggyThinkingPrompt":1,
            "painSharpness":2,
            "painHotness":6,
            "painSensitivity":6,
            "sleepDisturbancePrompt":3
        }
    ]
}

Metadata

Key Definition
regimen_a An array of strings that represent the patient's regimen A choices.
regimen_b An array of strings that represent the patient's regimen B choices.
trial_start_date The date at which the phone app calculated the patient's start date. If the patient tapped the "start trial" button before 8pm, their trial starts on the current day. If the button was tapped after 8pm, the trial starts the following day.
trial_end_date The end date that is calculated based on the start date and the patient's setup configuration.
regimen_duration The number of days a patient adheres to a regimen.
number_of_cycles The number of AB cycles. The shortest trial is 2 cycles of 7-day periods for a total of 28 days. The longest trial is 3 cycles of 14-day periods for a total of 84 days.
cycle_ab_pairs The randomized AB pairs. The possible values are AB,AB; AB,BA; BA,AB; BA,BA.
cognitiveFunctionPromptKey Defines which cognitive function question the patient was asked. The value will be either cognitiveFunctionFoggyThinkingPrompt or cognitiveFunctionWorkingHarderPrompt.
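The 8pm rule for trial_start_date can be illustrated as follows. This is a sketch of the rule as described in the table, not the phone app's code; the function name is hypothetical, and treating a tap at exactly 8:00pm as "after 8pm" is an assumption, since the table does not cover that case.

```python
from datetime import date, datetime, time, timedelta

CUTOFF = time(20, 0)  # 8pm in the patient's local time


def trial_start_date(tapped_at: datetime) -> date:
    """Start today if "start trial" was tapped before 8pm, otherwise tomorrow."""
    if tapped_at.time() < CUTOFF:
        return tapped_at.date()
    # Assumption: a tap at exactly 8:00pm is treated like a tap after 8pm.
    return tapped_at.date() + timedelta(days=1)


if __name__ == "__main__":
    print(trial_start_date(datetime(2013, 11, 1, 19, 59)))
    print(trial_start_date(datetime(2013, 11, 1, 20, 5)))
```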

Data

Key Definition
timestamp The timestamp at which the survey was submitted.
regimen The regimen the patient was taking. A string that will have the value A or B.
cycle The cycle for the above regimen. An integer between 1 and 4.
averagePainIntensity The patient's response to the prompt, "What number best describes your pain on average during the past 24 hours?" The possible values are the whole numbers between 0 and 10 inclusive where 0 represents "No pain" and 10 represents "Pain as bad as you can imagine".
enjoymentOfLife The patient's response to the prompt, "What number best describes how much pain has interfered with your enjoyment of life during the past 24 hours?" The possible values are the whole numbers between 0 and 10 inclusive where 0 represents "Does not interfere" and 10 represents "Completely interferes".
generalActivity The patient's response to the prompt, "What number best describes how much pain has interfered with your general activity during the past 24 hours?" The possible values are the whole numbers between 0 and 10 inclusive where 0 represents "Does not interfere" and 10 represents "Completely interferes".
fatiguePrompt The patient's response to the prompt, "I felt fatigued during the past 24 hours ...". The possible values are the whole numbers between 1 and 5 inclusive where 1 represents "Not at all" and 5 represents "Very much".
drowsinessPrompt The patient's response to the prompt, "How often did you feel drowsy or sleepy today?". The possible values are the whole numbers between 1 and 6 inclusive where 1 represents "None of the time" and 6 represents "All of the time".
constipationPrompt The patient's response to the prompt, "How often over the past 24 hours did you feel constipated?". The possible values are whole numbers between 1 and 5 inclusive where 1 represents "Not at all" and 5 represents "Very much".
cognitiveFunctionFoggyThinkingPrompt The patient's response to the prompt, "My thinking has been foggy ...". The possible values are the whole numbers between 1 and 5 inclusive where 1 represents "Not at all" and 5 represents "Very much".
cognitiveFunctionWorkingHarderPrompt The patient's response to the prompt, "I have had to work harder than usual to keep track of what I was doing ...". The possible values are the whole numbers between 1 and 5 inclusive where 1 represents "Not at all" and 5 represents "Very much".
painSharpness The patient's response to the prompt, "How sharp does your pain feel?". The possible values are the whole numbers from 0 to 10 inclusive where 0 represents "Not sharp" and 10 represents "Like a Knife".
painHotness The patient's response to the prompt, "How hot does your pain feel?". The possible values are the whole numbers from 0 to 10 inclusive where 0 represents "Not hot" and 10 represents "On Fire".
painSensitivity The patient's response to the prompt, "How sensitive is your skin to light touch or clothing?". The possible values are the whole numbers from 0 to 10 inclusive where 0 represents "Not sensitive" and 10 represents "Like Raw or Sunburned Skin".
sleepDisturbancePrompt The patient's response to the prompt, "My sleep quality last night was ...". The possible values are the whole numbers between 1 and 5 inclusive where 1 represents "Very poor" and 5 represents "Very good".

Special Cases

  • painSharpness, painHotness and painSensitivity will either all be present or none will be present. These are the optional neuropathic pain outcomes.
  • Only one of cognitiveFunctionFoggyThinkingPrompt or cognitiveFunctionWorkingHarderPrompt will be present.
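The value ranges and special cases above lend themselves to a simple validity check. The sketch below is illustrative only: validate_point is a hypothetical helper, the ranges are copied from the Data table, and "only one" of the cognitive function keys is read here as exactly one.

```python
# Inclusive (min, max) ranges copied from the Data table above.
RANGES = {
    "averagePainIntensity": (0, 10),
    "enjoymentOfLife": (0, 10),
    "generalActivity": (0, 10),
    "fatiguePrompt": (1, 5),
    "drowsinessPrompt": (1, 6),
    "constipationPrompt": (1, 5),
    "cognitiveFunctionFoggyThinkingPrompt": (1, 5),
    "cognitiveFunctionWorkingHarderPrompt": (1, 5),
    "painSharpness": (0, 10),
    "painHotness": (0, 10),
    "painSensitivity": (0, 10),
    "sleepDisturbancePrompt": (1, 5),
}
NEUROPATHIC = ("painSharpness", "painHotness", "painSensitivity")
COGNITIVE = ("cognitiveFunctionFoggyThinkingPrompt",
             "cognitiveFunctionWorkingHarderPrompt")


def validate_point(point):
    """Return a list of problems found in one data point (empty if none)."""
    errors = []
    if point.get("regimen") not in ("A", "B"):
        errors.append("regimen must be A or B")
    for key, (low, high) in RANGES.items():
        if key in point:
            value = point[key]
            is_whole = isinstance(value, int) and not isinstance(value, bool)
            if not is_whole or not low <= value <= high:
                errors.append(f"{key} must be a whole number from {low} to {high}")
    # Neuropathic pain outcomes are all present or all absent.
    present = [k for k in NEUROPATHIC if k in point]
    if present and len(present) != len(NEUROPATHIC):
        errors.append("neuropathic pain keys must all be present or all absent")
    # Assumption: exactly one cognitive function key is expected.
    if sum(1 for k in COGNITIVE if k in point) != 1:
        errors.append("exactly one cognitive function key must be present")
    return errors


if __name__ == "__main__":
    print(validate_point({"regimen": "A", "drowsinessPrompt": 7,
                          "cognitiveFunctionFoggyThinkingPrompt": 1}))
```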

Data Output Format

The output from the analysis is a JSON object with seven nested objects, one for each outcome being tracked, that contain the analysis results. In addition to the analysis results, an outcome object may also contain a metadata object (TODO: discuss the metadata object). The metadata object will only be present if errors occur during analysis. It will contain information that is relevant to debugging as well as explanatory information for the clinician.

Outcomes

  • Pain
  • Sleep Problems
  • Constipation
  • Drowsiness
  • Thinking Problems
  • Fatigue
  • Neuropathic Pain

Example Data

The values for each of the keys below do not represent actual analysis output.

{
   "pain":{
      "successful_run":true,
      "graph_5":{
         "more_effective_regimen":"A",
         "median_effect":0.5,
         "upper_bound":0.75,
         "upper_bound_regimen":"A",
         "lower_bound":0.1,
         "lower_bound_regimen":"B"
      },    
      "graph_6":{
         "b_clinically_better":1.0,
         "b_marginally_better":0.75,
         "a_clinically_better":0.5,
         "a_marginally_better":0.25
      }          
   },
   "sleep_problems":{
      "successful_run":true,
      "graph_5":{
         "more_effective_regimen":"A",
         "median_effect":0.5,
         "upper_bound":0.75,
         "upper_bound_regimen":"A",
         "lower_bound":0.1,
         "lower_bound_regimen":"B"
      },
      "graph_6":{
         "b_clinically_better":1.0,
         "b_marginally_better":0.75,
         "a_clinically_better":0.5,
         "a_marginally_better":0.25
      }
   },
   "constipation":{
      "successful_run":true,
      "graph_5":{
         "more_effective_regimen":"A",
         "median_effect":0.5,
         "upper_bound":0.75,
         "upper_bound_regimen":"A",
         "lower_bound":0.1,
         "lower_bound_regimen":"B"
      },
      "graph_6":{
         "b_clinically_better":1.0,
         "b_marginally_better":0.75,
         "a_clinically_better":0.5,
         "a_marginally_better":0.25
      }
   },
   "drowsiness":{
      "successful_run":true,
      "graph_5":{
         "more_effective_regimen":"A",
         "median_effect":0.5,
         "upper_bound":0.75,
         "upper_bound_regimen":"A",
         "lower_bound":0.1,
         "lower_bound_regimen":"B"
      },
      "graph_6":{
         "b_clinically_better":1.0,
         "b_marginally_better":0.75,
         "a_clinically_better":0.5,
         "a_marginally_better":0.25
      }
   },
   "thinking_problems":{
      "successful_run":true,
      "graph_5":{
         "more_effective_regimen":"A",
         "median_effect":0.5,
         "upper_bound":0.75,
         "upper_bound_regimen":"A",
         "lower_bound":0.1,
         "lower_bound_regimen":"B"
      },
      "graph_6":{
         "b_clinically_better":1.0,
         "b_marginally_better":0.75,
         "a_clinically_better":0.5,
         "a_marginally_better":0.25
      }
   },
   "fatigue":{
      "successful_run":true,
      "graph_5":{
         "more_effective_regimen":"A",
         "median_effect":0.5,
         "upper_bound":0.75,
         "upper_bound_regimen":"A",
         "lower_bound":0.1,
         "lower_bound_regimen":"B"
      },
      "graph_6":{
         "b_clinically_better":1.0,
         "b_marginally_better":0.75,
         "a_clinically_better":0.5,
         "a_marginally_better":0.25
      }
   },
   "neuropathic_pain":{
      "successful_run":true,
      "graph_5":{
         "more_effective_regimen":"A",
         "median_effect":0.5,
         "upper_bound":0.75,
         "upper_bound_regimen":"A",
         "lower_bound":0.1,
         "lower_bound_regimen":"B"
      },
      "graph_6":{
         "b_clinically_better":1.0,
         "b_marginally_better":0.75,
         "a_clinically_better":0.5,
         "a_marginally_better":0.25
      }
   }
}

Definitions

Outcome Object

Key Definition
successful_run A boolean which denotes whether analysis could be successfully performed or not.

Graph 5

Key Definition
more_effective_regimen A string that will be "A" or "B". The regimen that was more effective in treating the outcome.
median_effect The increase in effectiveness of more_effective_regimen compared to the other regimen. Describes the median of a range of possible comparisons between the two regimens calculated through a 95% confidence interval.
upper_bound The maximum comparative advantage A can have over B or the minimum comparative advantage B can have over A. Represents a percent.
upper_bound_regimen If A, upper_bound is the maximum comparative advantage of A over B; if B, upper_bound is the minimum comparative advantage of B over A.
lower_bound The maximum comparative advantage B can have over A or the minimum comparative advantage A can have over B. Represents a percent.
lower_bound_regimen If A, lower_bound is the minimum comparative advantage of A over B; if B, lower_bound is the maximum comparative advantage of B over A.
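Because the meaning of each bound flips with its accompanying regimen key, it can help to spell the four cases out. The helper below is a hypothetical illustration that turns one bound into a plain-language phrase following the definitions in the table; it is not part of the Trialist front end.

```python
def describe_bound(value, regimen, kind):
    """Phrase one graph_5 bound; kind is "upper" or "lower"."""
    if regimen not in ("A", "B") or kind not in ("upper", "lower"):
        raise ValueError("regimen must be A or B and kind must be upper or lower")
    other = "B" if regimen == "A" else "A"
    if kind == "upper":
        # upper_bound: maximum advantage of A over B, or minimum advantage of B over A.
        extent = "at most" if regimen == "A" else "at least"
    else:
        # lower_bound: minimum advantage of A over B, or maximum advantage of B over A.
        extent = "at least" if regimen == "A" else "at most"
    return f"{regimen} better than {other} by {extent} {value}"


if __name__ == "__main__":
    graph_5 = {"upper_bound": 0.75, "upper_bound_regimen": "A",
               "lower_bound": 0.1, "lower_bound_regimen": "B"}
    print(describe_bound(graph_5["upper_bound"], graph_5["upper_bound_regimen"], "upper"))
    print(describe_bound(graph_5["lower_bound"], graph_5["lower_bound_regimen"], "lower"))
```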

Graph 6

Key Definition
b_clinically_better Probability that regimen B is at least 20% more effective than regimen A.
b_marginally_better Probability that regimen B is more effective than regimen A, but by less than 20%.
a_clinically_better Probability that regimen A is at least 20% more effective than regimen B.
a_marginally_better Probability that regimen A is more effective than regimen B, but by less than 20%.

Successful Run

If successful_run is false for a particular outcome, a metadata object will be included in the output. The metadata object has a wildcard schema with data intended as technical information for researchers or programmers.

If successful_run is false for a particular outcome, the values for the graph objects will indicate a toss-up between the regimens.

"graph_5":{
    "more_effective_regimen":"A",
    "median_effect":100,
    "upper_bound":100,
    "upper_bound_regimen":"A",
    "lower_bound":0,
    "lower_bound_regimen":"B"
},
"graph_6":{
    "b_clinically_better":25,
    "b_marginally_better":25,
    "a_clinically_better":25,
    "a_marginally_better":25
}
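A client reading the results may want to separate outcomes that were analyzed successfully from those that fell back to the toss-up values. The sketch below assumes the metadata object sits inside the outcome object under a key named metadata; that key name and the split_outcomes helper are assumptions, not confirmed by this page.

```python
def split_outcomes(results):
    """Split analysis output into (successful outcomes, metadata of failed ones)."""
    succeeded, failed = {}, {}
    for name, outcome in results.items():
        if outcome.get("successful_run") is True:
            succeeded[name] = outcome
        else:
            # Assumption: error details live under a "metadata" key on the outcome.
            failed[name] = outcome.get("metadata", {})
    return succeeded, failed


if __name__ == "__main__":
    results = {
        "pain": {"successful_run": True, "graph_5": {}, "graph_6": {}},
        "fatigue": {"successful_run": False, "metadata": {"error": "too few points"}},
    }
    ok, failed = split_outcomes(results)
    print(sorted(ok), failed)
```

Keeping the failed outcomes separate lets a visualization show the toss-up values together with a note that the analysis did not run, rather than presenting them as a real result.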

Finding the Users in a Trial

In ohmage, groups of users are known as classes. Privileged class users can read other users' data. For Trialist, the clinicians and RAs are privileged. To determine the usernames visible to a privileged user, use the ohmage Class Read API. To limit data visibility to only those who should have access, there are three classes for Trialist, and each class corresponds to a web front end rooted at a different URI.

| Class URN | URL Path | Git Branch in this Repo | Comments |
| --- | --- | --- | --- |
| urn:class:trialist | /trialist | master | The class for official PREEMPT RCTs. There may be more than one class URN here. TBD. |
| urn:class:trialist:internal | /trialist_internal | internal-testing | The class for internal team testing. |
| urn:class:trialist:mock | /trialist_mock | mock-trial | The class for mock trial testing. |

Reading Data

The ohmage Stream Read API will be used for reading participant data.

Parameters for Reading Intermediate Normalized Self-Report Data

Parameter Name Parameter Value
auth_token The authentication token for the logged-in user.
client The name of the software client accessing the API.
observer_id io.omh.trialist
observer_version 2013013000
stream_id data
stream_version 2013013000
username The ohmage username for the patient whose data is being requested.
chronological false

Parameters for Reading Analysis Results Data

Parameter Name Parameter Value
auth_token The authentication token for the logged-in user.
client The name of the software client accessing the API.
observer_id io.omh.trialist
observer_version 2013013000
stream_id results
stream_version 2013013000
username The ohmage username for the patient whose data is being requested.
chronological false

Controlling Output

Setting chronological to false will return the query results in descending order. This is desirable because in certain cases both the normalized data point stream and the analysis data stream may have multiple results and the most recent result is the correct one to visualize (e.g., in the case where the analysis software has been tuned over time, the latest analysis result should be shown).
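The two parameter tables differ only in stream_id, so a client could build both requests from one helper. The sketch below only assembles the parameters listed above and encodes them; it deliberately leaves out the server URL and the response format, which are not specified on this page, and both function names are hypothetical.

```python
from urllib.parse import urlencode

OBSERVER_ID = "io.omh.trialist"
VERSION = "2013013000"  # used for both observer_version and stream_version


def stream_read_params(auth_token, client, username, stream_id):
    """Build the Stream Read parameters for the "data" or "results" stream."""
    if stream_id not in ("data", "results"):
        raise ValueError("stream_id must be 'data' or 'results'")
    return {
        "auth_token": auth_token,
        "client": client,
        "observer_id": OBSERVER_ID,
        "observer_version": VERSION,
        "stream_id": stream_id,
        "stream_version": VERSION,
        "username": username,
        "chronological": "false",  # newest first
    }


def latest_record(records):
    """With chronological=false the first record is the most recent one."""
    return records[0] if records else None


if __name__ == "__main__":
    params = stream_read_params("token", "trialist-web", "patient1", "results")
    print(urlencode(params))
```

Here records stands for whatever list of results the client has already extracted from the API response, in the order the server returned them.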

Data Schemas

Follow the links below to view Concordia schemas for the two different data types.