# Step 3: Execute the Cox Proportional Hazards model

This notebook fits a Cox Proportional Hazards model on the data in the collaboration at hand. Having identified the column names in the previous step (see [1_check_variables.ipynb](1_check_variables.ipynb)), we now want to run this algorithm on the data available at the stations. Specifically, we want to use the following input variables:

- Age
- Clinical tumor stage
- Clinical nodal stage

The outcome of the model is survival, a right-censored variable: for patients who were still alive at their last follow-up, we only know that they survived at least that long.
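
To make right-censoring concrete, here is a minimal local sketch. It is not the federated algorithm used in this notebook: it fits an ordinary single-site Cox model with the `survival` package on a small data frame. The column names mirror those used later in this notebook, but all values are invented purely for illustration.

```r
library(survival)

# Made-up data for illustration only; not taken from any station.
# deadstatus.event: 1 = death observed, 0 = right-censored
# (patient was still alive at last follow-up).
toy <- data.frame(
  Survival.time    = c(5, 8, 12, 15, 20, 24, 30, 36, 40, 48),
  deadstatus.event = c(1, 1, 0, 1, 0, 1, 1, 0, 1, 0),
  age              = c(71, 65, 58, 69, 60, 74, 55, 62, 67, 50)
)

# Surv() pairs each follow-up time with its event indicator, so censored
# patients contribute "survived at least this long" instead of being dropped.
fit <- coxph(Surv(Survival.time, deadstatus.event) ~ age, data = toy)

# In the summary, exp(coef) is the hazard ratio per one-year increase in age.
summary(fit)
```

With only ten invented rows the estimate itself is meaningless. The point is how the time column and the event column together describe the outcome.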

**Task: run the cell below to install the Vantage CoxPH client package**

In [None]:
# This also installs the package vtg
devtools::install_github('iknl/vtg.coxph', subdir="src")

Now that the R package is installed, we can start using it.

**Task: fill in the correct connection details in the cell below, and execute this cell**

In [None]:
setup.client <- function() {
  # Define parameters
  username <- ""
  password <- ""
  host <- ''
  api_path <- '/api'
  
  # Create the client
  client <- vtg::Client$new(host, api_path=api_path)
  client$authenticate(username, password)

  return(client)
}

# Create a client
client <- setup.client()
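
If you prefer not to type credentials into the notebook itself, one option is to read them from environment variables. The sketch below assumes hypothetical variable names (`VANTAGE_USERNAME` and so on), and `read.setting` is a helper defined here, not part of `vtg`. It only shows how to read the values; they would then be assigned to `username`, `password` and `host` inside `setup.client()`.

```r
# Hypothetical helper: read one setting from the environment and
# fail with a clear message if it is missing or empty.
read.setting <- function(name) {
  value <- Sys.getenv(name, unset = NA)
  if (is.na(value) || value == "") {
    stop(sprintf("Environment variable %s is not set", name))
  }
  value
}

# For demonstration only: normally this is set outside R,
# before the notebook is started.
Sys.setenv(VANTAGE_USERNAME = "demo-user")

username <- read.setting("VANTAGE_USERNAME")
```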

We are now connected to the Vantage central server, and have access to several collaborations.

**Task: execute the cell below. Which collaboration(s) do we have access to?**

In [None]:
# Get a list of available collaborations
print( client$getCollaborations() )

Now we can select the collaboration we want to use. Pick the correct collaboration from the output of the previous cell and enter its ID below.

**Task: enter the collaboration ID, and execute the cell**

In [None]:
# Select a collaboration
client$setCollaborationId(1)

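Now we can specify the model we want to fit. We set the input variables (called `expl_vars`) and the right-censored outcome, which is described by two columns: the follow-up time (`time_col`) and the column that marks whether the event was observed or censored (`censor_col`).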

**Task: enter the correct input and outcome variables and execute the cell**

In [None]:
# Define explanatory variables, time column and censor column
expl_vars <- c("age",
               "clinical.T.Stage",
               "Clinical.N.Stage")
time_col <- "Survival.time"
censor_col <- "deadstatus.event"
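
As an aid to intuition, the sketch below shows how these three settings would map onto the formula of an ordinary single-site Cox model in R. This is an illustration only and does not claim to show how `vtg.coxph` handles them internally. The values mirror the cell above and should be adjusted if you changed them there.

```r
# Mirrors the definitions in the cell above.
expl_vars <- c("age", "clinical.T.Stage", "Clinical.N.Stage")
time_col <- "Survival.time"
censor_col <- "deadstatus.event"

# Build the equivalent single-site model formula as text, then parse it.
# The explanatory variables form the right-hand side; the time and censor
# columns together form the survival outcome on the left-hand side.
rhs <- paste(expl_vars, collapse = " + ")
cox_formula <- as.formula(sprintf("Surv(%s, %s) ~ %s", time_col, censor_col, rhs))

print(cox_formula)
```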

Finally, we can execute the model, and inspect the results.

**Task: execute the cell below. What output do you get? What does this output represent?**

In [None]:
result <- vtg.coxph::dcoxph(client, expl_vars, time_col, censor_col)
result
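
A Cox model is usually summarised by one coefficient per explanatory variable, together with its standard error. The exact structure of `result` is not assumed here, so the sketch below uses a small table of hypothetical numbers to show how a coefficient translates into a hazard ratio with a 95% Wald confidence interval.

```r
# Hypothetical coefficients and standard errors, for illustration only;
# these are NOT results from the collaboration.
example <- data.frame(
  variable = c("age", "clinical.T.Stage"),
  coef     = c(0.02, 0.35),
  se       = c(0.01, 0.10)
)

# Hazard ratio: exp(coef). A value above 1 means a higher hazard
# (shorter expected survival) per unit increase of the variable.
z <- qnorm(0.975)
example$HR    <- exp(example$coef)
example$lower <- exp(example$coef - z * example$se)
example$upper <- exp(example$coef + z * example$se)

print(example)
```

If the interval for a variable excludes 1, the association with survival is statistically significant at the 5% level under the usual Wald approximation.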