### **Modelling Survival from Malignant Melanoma**


This analysis models survival from malignant melanoma, using the `Melanoma_df` dataset from the MedDataSets
package. The hypothesis is that survival probabilities from melanoma decrease over time, after accounting for
predictors. Statistical approaches to quantifying survival from melanoma are documented in the academic literature.
Florell et al. (2005) used multivariate logistic regression to evaluate prognostic and survival statistics
for the familial melanoma population in a Utah database. Pollack et al. (2011) calculated unadjusted cause-specific
survival for 68,495 primary melanoma cases diagnosed from 1992 to 2005. Through multivariate analysis, Pollack
et al. (2011) found that 5-year melanoma survival increased from 87.7% for cases diagnosed in 1992-1995 to 90.1%
for cases diagnosed in 1999-2001. Jalving et al. (2024) used a Cox proportional hazards model to investigate factors
associated with progression-free survival from melanoma, in a study of 2,490 patients with advanced cases.
Lastly, Karponis et al. (2025) assessed incidence and mortality trends for melanoma in situ (MIS) and malignant
melanoma (MM) in England between 2001 and 2020. Karponis et al. (2025) used joinpoint regression
to calculate the average percentage change in mortality, alongside the age-standardised incidence of the condition.

---

In [2]:
library(stats)
library(knitr)
library(ggplot2)
library(magick)
library(forecast)
library(fpp2)
library(GGally)
library(gridExtra)
library(patchwork)
library(BSDA)
library(dplyr)
library(GLMsData)
library(MedDataSets)
library(NHSRdatasets)
library(medicaldata)
library(predictmeans)
library(tidyverse) 
library(MASS)
library(nnet)
library(survival)
library(VGAM) 
library(mlbench)
library(ResourceSelection)
library(pROC)
library(caret)
library(datasets) 
library(foreign)
library(brant)
library(svyVGAM)
library(pscl)
library(tibble)
library(cards)
library(cardx)
library(lubridate)
library(ggsurvfit)
library(gtsummary)
library(tidycmprsk)
library(survminer)
library(pwr)
library(blockrand)
library(randomizeR)

In [3]:
data(package = "MedDataSets")
?Melanoma_df
view(Melanoma_df)

Data sets in package 'MedDataSets':

Aids2_df                Australian AIDS Survival Data
Cushings_df             Diagnostic Tests on Patients with Cushing's
                        Syndrome
GAGurine_df             Level of GAG in Urine of Children
Melanoma_df             Survival from Malignant Melanoma
Mixtures_Drug_tbl_df    Drug Mixture
Pima_te_df              Diabetes in Pima Indian Women
Pima_tr2_df             Diabetes in Pima Indian Women
Pima_tr_df              Diabetes in Pima Indian Women
Puromycin_df            Reaction Velocity of an Enzymatic Reaction
ToothGrowth_df          The Effect of Vitamin C on Tooth Growth in
                        Guinea Pigs
VADeaths_matrix         Death Rates in Virginia (1940)
VA_df                   Veteran's Administration Lung Cancer Trial
anorexia_df             Anorexia Data on Weight Change
antibiotics_tbl_df      Pre-existing Conditions in Children
avandia_tbl_df          Cardiovascular Problems for Two Types of
                      

Melanoma_df            package:MedDataSets             R Documentation

_S_u_r_v_i_v_a_l _f_r_o_m _M_a_l_i_g_n_a_n_t _M_e_l_a_n_o_m_a

_D_e_s_c_r_i_p_t_i_o_n:

     The dataset name has been changed to 'Melanoma_df' to avoid
     confusion with other datasets from packages in the R ecosystem and
     to follow the naming conventions of the 'MedDataSets' package. The
     suffix '_df' indicates that this dataset is a data frame, helping
     to distinguish it from other datasets within the package and from
     those in the broader R ecosystem. The original content of the
     dataset has not been modified in any way.

_U_s_a_g_e:

     data(Melanoma_df)
     
_F_o_r_m_a_t:

     A data frame with 205 observations and 7 variables:

     time An integer representing the survival time of the patients (in
          months).

     status An integer indicating the status of the patient at the end
          of the study; typically coded as 1 

In [5]:
summary(Melanoma_df)

      time          status          sex              age             year     
 Min.   :  10   Min.   :1.00   Min.   :0.0000   Min.   : 4.00   Min.   :1962  
 1st Qu.:1525   1st Qu.:1.00   1st Qu.:0.0000   1st Qu.:42.00   1st Qu.:1968  
 Median :2005   Median :2.00   Median :0.0000   Median :54.00   Median :1970  
 Mean   :2153   Mean   :1.79   Mean   :0.3854   Mean   :52.46   Mean   :1970  
 3rd Qu.:3042   3rd Qu.:2.00   3rd Qu.:1.0000   3rd Qu.:65.00   3rd Qu.:1972  
 Max.   :5565   Max.   :3.00   Max.   :1.0000   Max.   :95.00   Max.   :1977  
   thickness         ulcer      
 Min.   : 0.10   Min.   :0.000  
 1st Qu.: 0.97   1st Qu.:0.000  
 Median : 1.94   Median :0.000  
 Mean   : 2.92   Mean   :0.439  
 3rd Qu.: 3.56   3rd Qu.:1.000  
 Max.   :17.42   Max.   :1.000  
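
As a minimal sketch of how the hypothesis could be examined with these columns, the helper below recodes `status` into an event indicator and fits a Kaplan-Meier curve with `survival::survfit()`. The helper name `km_fit` and its `event_code` argument are hypothetical. The default assumes that `status == 1` marks death from melanoma; the help output above is cut off before the coding is fully stated, so this should be checked against `?Melanoma_df`. All other status values are treated as censored, which is a simplification. The help page lists `time` in months, although a maximum of 5565 suggests a finer unit, so the unit is worth confirming before reading the time axis.

```r
library(survival)

# Hypothetical helper (not part of the original notebook).
# Assumption: `event_code` is the value of `status` that marks the event of
# interest; every other status value is treated as censored.
km_fit <- function(df, event_code = 1) {
  df$event <- as.integer(df$status == event_code)
  survfit(Surv(time, event) ~ 1, data = df)
}

# Toy data with the same column names, used only to show the call
toy <- data.frame(time = c(1, 2, 3, 4), status = c(1, 2, 1, 3))
toy_fit <- km_fit(toy)
summary(toy_fit)
```

With the notebook's data the call would be `km_fit(Melanoma_df)`, and the returned `survfit` object can be passed to `plot()` to draw the estimated survival curve.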

In [6]:
cor(Melanoma_df)

| | time | status | sex | age | year | thickness | ulcer |
|---|---|---|---|---|---|---|---|
| time | 1.0 | 0.31614601 | -0.146499215 | -0.30151794 | -0.485504359 | -0.2354087 | -0.26475748 |
| status | 0.316146 | 1.0 | -0.098967345 | 0.01596386 | 0.138166927 | -0.2047216 | -0.27032555 |
| sex | -0.1464992 | -0.09896735 | 1.0 | 0.06833741 | -0.002645159 | 0.1854126 | 0.16797915 |
| age | -0.3015179 | 0.01596386 | 0.068337413 | 1.0 | 0.188229089 | 0.2124798 | 0.12606294 |
| year | -0.4855044 | 0.13816693 | -0.002645159 | 0.18822909 | 1.0 | -0.1333454 | -0.03312562 |
| thickness | -0.2354087 | -0.20472162 | 0.185412563 | 0.21247979 | -0.133345424 | 1.0 | 0.42445931 |
| ulcer | -0.2647575 | -0.27032555 | 0.167979154 | 0.12606294 | -0.033125618 | 0.4244593 | 1.0 |
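
Pairwise correlations ignore censoring and treat the coded `status` values as if they were numeric, so they give only a rough view of the predictors. One hedged sketch of a model that accounts for predictors, as the hypothesis calls for, is a Cox proportional hazards fit with `survival::coxph()`. The helper name `fit_cox`, the choice of covariates, and the assumption that `status == 1` marks death from melanoma are illustrative and are not taken from the original analysis.

```r
library(survival)

# Hypothetical helper (not part of the original notebook).
# Assumptions: `event_code` marks the event of interest in `status`; all other
# values are treated as censored; the covariate set is illustrative only.
fit_cox <- function(df, event_code = 1) {
  df$event <- as.integer(df$status == event_code)
  coxph(Surv(time, event) ~ age + sex + thickness + ulcer, data = df)
}

# Simulated data with the same column names, used only to show the call
set.seed(1)
n <- 100
toy <- data.frame(
  time      = rexp(n, rate = 1 / 2000) + 1,
  status    = sample(1:3, n, replace = TRUE),
  sex       = rbinom(n, 1, 0.4),
  age       = rnorm(n, mean = 50, sd = 15),
  thickness = rexp(n, rate = 1 / 3),
  ulcer     = rbinom(n, 1, 0.4)
)
toy_fit <- fit_cox(toy)
exp(coef(toy_fit))  # hazard ratios for the simulated data
```

On the notebook's data the call would be `fit_cox(Melanoma_df)`. `cox.zph()` from the same package can then be used to check the proportional hazards assumption.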
