# CHI Notebook 2
### This notebook is written in R. The goal is to answer some questions about the CGM data and to explore visualization apps for CGM data

## Introduction

## In this class we discuss consumer health informatics and give examples of the kinds of data collected, the analyses performed, and the visualizations used to educate and motivate individuals, using type 2 diabetes as the example.

### Continuous Glucose Monitoring (CGM) Data
### We will look at data from a [continuous glucose monitor](https://www.niddk.nih.gov/health-information/diabetes/overview/managing-diabetes/continuous-glucose-monitoring)
### First, we read in a dataset of CGM data collected from volunteers for a research study
### CGM data is collected every ____

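library(dplyr)  # provides arrange()

# Read the CGM readings; empty strings are treated as missing values
cgm <- read.csv("CGM_vals.csv", header = TRUE, na.strings = "")
names(cgm)

# Parse the timestamp text (month/day/year hour:minute) into POSIXct
cgm$DisplayTime <- as.POSIXct(cgm$DisplayTime, format = "%m/%d/%Y %H:%M", tz = "")

# Copy the glucose reading into a descriptively named column, then sort by participant ID
cgm$bg.value <- cgm$Value
cgm <- arrange(cgm, ID)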
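
The timestamp conversion above only works when the format string matches the raw text exactly. The following is a minimal sketch on made-up timestamps (not values from `CGM_vals.csv`) showing that non-matching strings silently become `NA`, which is worth checking after any conversion. The time zone is fixed to UTC here only to make the example reproducible.

```r
# Toy timestamps, made up for illustration; the third uses a different layout
raw <- c("01/15/2020 08:00", "01/15/2020 08:30", "2020-01-15 09:00")

# Parse with the same format string used for DisplayTime
parsed <- as.POSIXct(raw, format = "%m/%d/%Y %H:%M", tz = "UTC")

# Strings that do not match the format come back as NA
bad <- is.na(parsed)
print(bad)

# Gap between the first two toy readings, in minutes
gap <- as.numeric(difftime(parsed[2], parsed[1], units = "mins"))
print(gap)
```

Counting `sum(is.na(cgm$DisplayTime))` after the conversion is one quick way to spot rows whose timestamps did not parse.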

##  Intensity data
### This dataset records how intense each user's physical activity was, minute by minute

In [1]:
library(dplyr)  # arrange() comes from dplyr, which must be loaded before this cell runs

pa.intensity <- read.csv("Intensity.csv", header = TRUE, na.strings = "")
names(pa.intensity)
pa.intensity$ActivityMinute <- as.POSIXct(pa.intensity$ActivityMinute, format = "%m/%d/%Y %H:%M", tz = "")

# Use the same ID column name as the CGM data
pa.intensity$ID <- pa.intensity$id

# Sort by participant ID (the result belongs in pa.intensity, not pa.steps)
pa.intensity <- arrange(pa.intensity, ID)


## Steps Data

In [None]:
pa.steps <- read.csv("min_steps.csv", header = TRUE, na.strings = "")
names(pa.steps)
# Parse the minute-level timestamp into POSIXct
pa.steps$ActivityMinute <- as.POSIXct(pa.steps$ActivityMinute, format = "%m/%d/%Y %H:%M", tz = "")



## Heart Rate

In [2]:
hr <- read.csv("hr.csv", header = TRUE, na.strings = "")
names(hr)
# Parse the timestamp into POSIXct
hr$Time <- as.POSIXct(hr$Time, format = "%m/%d/%Y %H:%M", tz = "")
# Copy the heart rate reading into a descriptively named column
hr$hr.value <- hr$Value

### Choose every 5th row of the Fitbit data to match the CGM data

In [4]:
# This cell needs pa.steps, pa.intensity, and hr, so run the Steps, Intensity, and Heart Rate cells first

# Keep rows 1, 6, 11, ... of each Fitbit dataset
pa.steps <- pa.steps[seq(1, nrow(pa.steps), by = 5), ]
pa.intensity <- pa.intensity[seq(1, nrow(pa.intensity), by = 5), ]
hr <- hr[seq(1, nrow(hr), by = 5), ]

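
To see what the every-5th-row selection keeps and discards, here is a small sketch on made-up minute-level data (the `toy` data frame and its values are hypothetical). It also shows, for comparison only, a windowed sum, which is a different technique from the row selection used in this notebook.

```r
# Toy minute-level data, made up for illustration: 12 one-minute rows
toy <- data.frame(minute = 1:12,
                  steps  = c(0, 3, 5, 0, 0, 10, 12, 0, 0, 0, 4, 6))

# Every 5th row, starting at the first: rows 1, 6, 11
toy.sub <- toy[seq(1, nrow(toy), by = 5), ]
print(toy.sub)

# For comparison: total steps in each 5-minute window (not what the notebook does)
window.sum <- tapply(toy$steps, (toy$minute - 1) %/% 5, sum)
print(window.sum)
```

Row selection keeps only the value recorded in the selected minute, so steps taken in the four minutes in between are dropped. Whether that matters depends on the question being asked of the data.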


## Match CGM and Fitbit Data

In [9]:
require(dplyr)

# Sort every dataset by participant ID
cgm <- arrange(cgm, ID)
pa.steps <- arrange(pa.steps, ID)
pa.intensity <- arrange(pa.intensity, ID)
hr <- arrange(hr, ID)

# Flag rows whose ID also appears in the other data source (1 = match, 0 = no match).
# %in% is vectorized, so no loop over rows is needed.
# Note: CGM rows are checked against the steps data only.
cgm$match <- as.numeric(cgm$ID %in% pa.steps$ID)
pa.steps$match <- as.numeric(pa.steps$ID %in% cgm$ID)
pa.intensity$match <- as.numeric(pa.intensity$ID %in% cgm$ID)
hr$match <- as.numeric(hr$ID %in% cgm$ID)

# Keep only the matched rows
cgm <- subset(cgm, match == 1)
pa.steps <- subset(pa.steps, match == 1)
pa.intensity <- subset(pa.intensity, match == 1)
hr <- subset(hr, match == 1)


Loading required package: dplyr

Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union



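
The matching step keeps only participants who appear in both data sources. A minimal sketch of the same idea on two made-up data frames (`a` and `b` are hypothetical names, not study data):

```r
# Toy data, made up for illustration
a <- data.frame(ID = c(1, 1, 2, 3), bg.value = c(110, 120, 95, 140))
b <- data.frame(ID = c(2, 3, 3, 4), steps = c(10, 0, 25, 7))

# Keep rows whose ID also occurs in the other data frame
a.matched <- a[a$ID %in% b$ID, ]
b.matched <- b[b$ID %in% a$ID, ]

# Both now contain only the shared participants
print(unique(a.matched$ID))
print(unique(b.matched$ID))
```

Note that this matches on participant only. The number of rows per participant can still differ between the two data frames.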


## Create the combined dataset

In [None]:
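# Number of participants in each dataset
length(unique(cgm$ID))
length(unique(pa.steps$ID))
length(unique(pa.intensity$ID))
length(unique(hr$ID))

# Number of rows in each dataset
nrow(cgm)
nrow(pa.steps)
nrow(pa.intensity)
nrow(hr)

# Keep the same number of rows, and only the needed columns, from each dataset
n <- 31260
hr <- hr[1:n, c("ID", "Time", "hr.value")]
cgm <- cgm[1:n, c("ID", "bg.value", "DisplayTime")]
pa.steps <- pa.steps[1:n, c("ID", "steps")]
pa.intensity <- pa.intensity[1:n, c("ID", "Intensity")]

# cbind() pairs rows by position, not by ID or timestamp,
# so this assumes the four datasets are sorted and aligned in the same way
combined <- cbind(hr, cgm, pa.steps, pa.intensity)

# Keep ID, hr.value, bg.value, DisplayTime, steps, Intensity
combined <- combined[, c(1, 3, 5, 6, 8, 10)]
combined$bg.value <- as.integer(combined$bg.value)

# Time of day (HH:MM). This assumes DisplayTime is the intended source,
# because the heart-rate Time column is dropped by the selection above
combined$Time <- format(combined$DisplayTime, "%H:%M")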
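
Because `cbind()` pairs rows purely by position, one way to check or avoid misalignment is to join on participant and timestamp instead. The sketch below uses base R `merge()` on made-up data (`toy.cgm` and `toy.hr` are hypothetical). It is an alternative technique shown for comparison, not the method this notebook uses, and it assumes the two sources share exactly equal timestamps.

```r
# Toy data, made up for illustration; note that toy.hr is in a different row order
toy.cgm <- data.frame(
  ID = c(1, 1, 2),
  DisplayTime = as.POSIXct(c("2020-01-15 08:00", "2020-01-15 08:05",
                             "2020-01-15 08:00"), tz = "UTC"),
  bg.value = c(110L, 118L, 95L)
)
toy.hr <- data.frame(
  ID = c(1, 1, 2),
  Time = as.POSIXct(c("2020-01-15 08:05", "2020-01-15 08:00",
                      "2020-01-15 08:00"), tz = "UTC"),
  hr.value = c(72, 70, 64)
)

# Join on participant and timestamp rather than on row position
joined <- merge(toy.cgm, toy.hr,
                by.x = c("ID", "DisplayTime"),
                by.y = c("ID", "Time"))
joined <- joined[order(joined$ID, joined$DisplayTime), ]
print(joined)
```

Rows without a partner in the other data frame are dropped by default. `merge()` has `all.x` and `all.y` arguments to keep them with `NA` in the missing columns.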


## EXAMPLE PLOT OF DAILY GLUCOSE CURVE WITH CONTROL LINES

In [None]:
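id <- 5
day <- 3  # which day of this participant's record to plot (1 = first day)
combined.id <- subset(combined, ID == id)

# The indexing assumes 288 readings per day
readings.per.day <- 288
low.index <- (day - 1) * readings.per.day + 1
high.index <- day * readings.per.day

plot(combined.id$DisplayTime[low.index:high.index], combined.id$bg.value[low.index:high.index],
     ylab = "blood glucose", xlab = "Time")

# Lower and upper control lines
abline(h = 70, col = 2)
abline(h = 180, col = 2)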

### CREATE A LAG VARIABLE

In [None]:
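library(DataCombine)  # provides slide()

# Add the previous glucose reading within each participant (slideBy = -1 lags by one row)
combined <- slide(combined, Var = "bg.value", NewVar = "bg.value.lag", TimeVar = "DisplayTime", GroupVar = "ID", slideBy = -1)

# Drop rows with no previous reading (for example, each participant's first row)
combined <- subset(combined, !is.na(bg.value.lag))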
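
To make the lag step concrete, here is a minimal base R sketch of a one-row lag within each participant, on made-up data. It uses `ave()` rather than `slide()`, so it is an illustration of the idea and not the notebook's own call, and it assumes rows are already sorted by time within each ID.

```r
# Toy data, made up for illustration; rows are already in time order within each ID
toy <- data.frame(ID = c(1, 1, 1, 2, 2),
                  bg.value = c(100, 110, 105, 90, 95))

# Shift each participant's values down by one row; the first row of each ID gets NA
toy$bg.value.lag <- ave(toy$bg.value, toy$ID,
                        FUN = function(x) c(NA, head(x, -1)))
print(toy)

# Drop rows that have no previous reading
toy <- toy[!is.na(toy$bg.value.lag), ]
print(toy)
```

The lag never crosses from one participant to the next, which is why the first row of each ID is lost.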

## CREATE A CUMULATIVE SUM STEPS VARIABLE TO USE AS A PREDICTOR

In [None]:
combined$csum.steps <- ave(combined$steps, combined$ID, FUN=cumsum)
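
A small sketch of what `ave(..., FUN = cumsum)` produces, on made-up data. The `day` column and the per-day variant are hypothetical additions for illustration. The notebook's own variable accumulates over each participant's whole record.

```r
# Toy data, made up for illustration
toy <- data.frame(ID    = c(1, 1, 1, 2, 2),
                  day   = c(1, 1, 2, 1, 1),
                  steps = c(5, 0, 10, 3, 4))

# Running total of steps within each participant (same call as above)
toy$csum.steps <- ave(toy$steps, toy$ID, FUN = cumsum)

# Variant: restart the running total for each participant on each day
toy$csum.steps.day <- ave(toy$steps, toy$ID, toy$day, FUN = cumsum)
print(toy)
```

The running total never resets in the first version, so late in a long record it mostly reflects how long the participant has been monitored.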

## MODEL THE EFFECTS OF ACTIVITY ON GLUCOSE ACCOUNTING FOR PRIOR GLUCOSE

In [None]:
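library(nlme)  # provides lme()

# Linear mixed model with a random intercept for each participant
mdl <- lme(bg.value ~ bg.value.lag + csum.steps + hr.value + Intensity, random = ~ 1 | ID, data = combined)
summary(mdl)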

## IMPROVED MODEL 

In [None]:
mdl<- lme(bg.value~  bg.value.lag + csum.steps , random=~1|ID, data=combined)
summary(mdl)
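
One way to judge whether dropping predictors is justified is to compare the two models directly. The sketch below does this on simulated data, so all values, effect sizes, and the `sim` data frame are made up, and the results say nothing about the study data. Both models are fitted with `method = "ML"` because likelihood ratio comparisons of models with different fixed effects are not valid under the default REML fit.

```r
library(nlme)

# Simulated data, made up for illustration: 8 participants, 40 readings each
set.seed(1)
n.id <- 8
n.obs <- 40
sim <- data.frame(ID = factor(rep(1:n.id, each = n.obs)))
n <- nrow(sim)
sim$bg.value.lag <- rnorm(n, mean = 120, sd = 20)
sim$csum.steps <- ave(rpois(n, lambda = 20), sim$ID, FUN = cumsum)
sim$hr.value <- rnorm(n, mean = 75, sd = 8)
sim$Intensity <- sample(0:3, n, replace = TRUE)

# Outcome depends on lagged glucose and cumulative steps only,
# plus a participant-level shift and noise
id.shift <- rnorm(n.id, mean = 0, sd = 5)[as.integer(sim$ID)]
sim$bg.value <- 20 + 0.8 * sim$bg.value.lag - 0.005 * sim$csum.steps +
  id.shift + rnorm(n, mean = 0, sd = 5)

# Full and reduced models, both with a random intercept per participant
full <- lme(bg.value ~ bg.value.lag + csum.steps + hr.value + Intensity,
            random = ~ 1 | ID, data = sim, method = "ML")
reduced <- lme(bg.value ~ bg.value.lag + csum.steps,
               random = ~ 1 | ID, data = sim, method = "ML")

# Likelihood ratio test plus AIC and BIC for both models
cmp <- anova(reduced, full)
print(cmp)
```

With the real data, the same comparison would use `data = combined` and store the two fits under different names, so that the second model does not overwrite the first.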
