01_longCovidSymp_cr_split_by_stp.do
/*==============================================================================
DO FILE NAME: 01_longCovidSymp_cr_split_by_stp.do
PROJECT: Long covid symptoms
DATE: 28th May 2022
AUTHOR: Kevin Wing
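DESCRIPTION OF FILE: Splits the case file and the control file into one .csv file per STP (region). In dummy data stp takes a single value, so the split can only be exercised by recoding stp manually (see the commented-out replace lines below).
DATASETS USED: .csv files from study definitions (./output/input_covid_communitycases_correctedCaseIndex.csv and ./output/input_controls_contemporary.csv)
DATASETS CREATED: one .csv file per STP for cases (./output/input_covid_communitycases_stp*.csv) and for controls (./output/input_controls_contemporary_stp*.csv)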
OTHER OUTPUT: logfiles, printed to folder analysis/$logdir
sysdir set PLUS "/Users/kw/Documents/GitHub/households-research/analysis/adofiles"
sysdir set PERSONAL "/Users/kw/Documents/GitHub/households-research/analysis/adofiles"
==============================================================================*/
sysdir set PLUS ./analysis/adofiles
sysdir set PERSONAL ./analysis/adofiles
pwd
* Open a log file
cap log close
log using ./logs/01_longCovidSymp_split_by_stp.log, replace t
*(1)=========Split cases into separate stp files============
import delimited ./output/input_covid_communitycases_correctedCaseIndex.csv, clear
*tab just so I can see list of stps
safetab stp
*stp is always set to 1 in dummy data so manually splitting up here (just for dummy data)
*COMMENT OUT THE FOLLOWING WHEN NOT RUNNING ON DUMMY DATA
*replace stp="STP2" if _n<300
**END OF COMMENT OUT
*stps are coded E54000005-9, 10, 12-17, 20-27, 29, 33, 35-37, 40-44, 49
*files need to be .csv format as this is what the matching program needs as input
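*A minimal sketch, not part of the original analysis: it lists in the log the full
*STP codes that the loops below look for. Each code is "E54" followed by the numeric
*suffix zero-padded to six digits (e.g. 5 -> E54000005, 10 -> E54000010).
*The local name stpcode is illustrative only and is not used elsewhere.
foreach i of numlist 5/9 10 12/17 20/27 29 33 35/37 40/44 49 {
	local stpcode = "E54" + string(`i', "%06.0f")
	display "`stpcode'"
}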
foreach i of numlist 5/9 {
	preserve
		capture noisily keep if stp=="E5400000`i'"
		capture noisily export delimited using "./output/input_covid_communitycases_stp`i'.csv", replace
		count
		capture noisily safetab stp
	restore
}
foreach i of numlist 10 12/17 20/27 29 33 35/37 40/44 49 {
	preserve
		capture noisily keep if stp=="E540000`i'"
		capture noisily export delimited using "./output/input_covid_communitycases_stp`i'.csv", replace
		count
		capture noisily safetab stp
	restore
}
*(2)=========Split controls into separate stp files============
import delimited ./output/input_controls_contemporary.csv, clear
safetab stp
*stp is always set to 1 in dummy data so manually splitting up here (just for dummy data)
*COMMENT OUT THE FOLLOWING WHEN NOT RUNNING ON DUMMY DATA
*replace stp="STP2" if _n>300
**END OF COMMENT OUT
*stps are coded E54000005-9, 10, 12-17, 20-27, 29, 33, 35-37, 40-44, 49
foreach i of numlist 5/9 {
	preserve
		capture noisily keep if stp=="E5400000`i'"
		capture noisily export delimited using "./output/input_controls_contemporary_stp`i'.csv", replace
		count
		capture noisily safetab stp
	restore
}
foreach i of numlist 10 12/17 20/27 29 33 35/37 40/44 49 {
	preserve
		capture noisily keep if stp=="E540000`i'"
		capture noisily export delimited using "./output/input_controls_contemporary_stp`i'.csv", replace
		count
		capture noisily safetab stp
	restore
}
log close