R training for Public Health Scotland analysts using SMR data

jackhannah95/smr-training

SMR Training

This repository contains materials used to train Public Health Scotland analysts in R using Scottish Morbidity Record (SMR) data. Although originally written for analysts working in LIST, the materials should also be relevant to analysts in other teams who use SMR data.

Please note that this GitHub repository contains the main copy of this training material. Any local copies which exist on the network will not be maintained.

Instructions for Running

To download this repository, click the green 'Clone or download' button and then click 'Download ZIP'. Unzip the folder in a location on the network which is accessible via the RStudio server.

To open the project in the RStudio server, click File -> Open Project -> navigate to the folder where the project is saved -> open the smr-training.Rproj file.

RStudio Projects

This code uses RStudio Projects, which are a way of bundling together related files and scripts. Each RStudio Project comes with a .Rproj file, and RStudio sets the working directory to the folder where this file is saved. Other filepaths can then be defined relative to that folder using the here package. A new project which follows the recommended structure within PHS can be created using the phstemplates package.

Type getwd() into the RStudio console to get the working directory for this project.
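As a minimal sketch of how relative filepaths can be built with the here package: the folder name "data" and the file name "smr01_extract.rds" below are hypothetical and are not part of this repository.

```r
library(here)

# Build a path relative to the project root (the folder holding the .Rproj file).
# "data" and "smr01_extract.rds" are hypothetical names used for illustration only.
extract_path <- here("data", "smr01_extract.rds")

# Print the full path that here() has constructed
print(extract_path)
```

Because the path is built from the project root rather than typed out in full, the same script should keep working if the project folder is moved elsewhere on the network.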

SPSS Equivalent Functions

The table below contains an approximate and non-exhaustive list of equivalent functions in R and SPSS which are commonly used in the analysis of SMR data. The R functions come from the dplyr, tidyr and magrittr packages, which are part of the tidyverse collection of packages.

Please note that, where not explicitly stated, it is assumed in the R code listed in the below table that the data have first been piped (%>% or %<>%) to the function, for example:

  • new_df <- old_df %>%
       arrange(x) %>%
       filter(x == first(x))

  • df %<>%
       select(x, y) %>%
       mutate(z = x + y)
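The two piping patterns above can be tried on a small made-up data frame. This is only a sketch: the values in old_df are invented for illustration and do not come from SMR data.

```r
library(dplyr)
library(magrittr)

# Invented toy data, used only to show how the two pipes behave
old_df <- tibble(x = c(3, 1, 2, 1), y = c(10, 20, 30, 40))

# %>% passes old_df along the chain and the result is assigned to a new object
new_df <- old_df %>%
  arrange(x) %>%
  filter(x == first(x))

# %<>% pipes df along the chain and then overwrites df with the result
df <- old_df
df %<>%
  select(x, y) %>%
  mutate(z = x + y)
```

Here new_df keeps only the rows with the smallest value of x, while df is updated in place with the new column z.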

R                                             SPSS
arrange(x)                                    SORT CASES BY X (A)
arrange(desc(x))                              SORT CASES BY X (D)
first(x)                                      FIRST(X)
last(x)                                       LAST(X)
filter(x == 2)                                SELECT IF X = 2
filter(x != 2)                                SELECT IF NOT (X = 2)
select(x)                                     /KEEP X
select(-x)                                    /DROP X
mutate(x = 2)                                 COMPUTE X = 2
drop_na(x)                                    SELECT IF NOT (SYSMIS(X))
df %<>%                                       MATCH FILES
   left_join(lookup, by = "common_variable")     /FILE = *
                                                 /TABLE = "/PATH/TO/LOOKUP"
                                                 /BY COMMON_VARIABLE
df %<>%                                       AGGREGATE OUTFILE = *
   group_by(x) %>%                               /BREAK X
   summarise(y = sum(y)) %>%                     /Y = SUM(Y)
   ungroup()
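As a rough illustration of how several of the R functions in the table combine in practice, the sketch below matches on a lookup and then aggregates. The data frames and the column names (hb_code, hb_name, los) are invented for this example and are not real SMR variables.

```r
library(dplyr)
library(tidyr)

# Invented episode-level data; hb_code and los are hypothetical column names
episodes <- tibble(
  hb_code = c("A", "A", "B", NA),
  los     = c(2, 5, 1, 4)
)

# Invented lookup, standing in for the file named in /TABLE in SPSS
lookup <- tibble(
  hb_code = c("A", "B"),
  hb_name = c("Board A", "Board B")
)

result <- episodes %>%
  drop_na(hb_code) %>%                  # SELECT IF NOT (SYSMIS(...))
  left_join(lookup, by = "hb_code") %>% # MATCH FILES ... /BY
  group_by(hb_name) %>%                 # AGGREGATE ... /BREAK
  summarise(total_los = sum(los)) %>%
  ungroup()
```

Unlike MATCH FILES with a BY variable in SPSS, left_join() does not need either data frame to be sorted by the matching variable first.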
