-
Notifications
You must be signed in to change notification settings - Fork 3
/
Eng_Diagnosis.Rmd
657 lines (462 loc) · 26.4 KB
/
Eng_Diagnosis.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
---
title: "Getting started with dxpr: Diagnosis"
author: "Hsiang-Ju, Chiu, Yi-Ju Tseng, Chun-Ju, Chen"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{Intro: Dx}
%\VignetteEngine{knitr::rmarkdown}
%\usepackage{UTF-8}{inputenc}
---
# Description
The proposed open-source dxpr package is a software tool aimed at expediting an integrated analysis of electronic health records (EHRs). By implementing dxpr package, it is easier to integrate, analyze, and visualize clinical data.
In this part, the instruction of how dxpr package works with diagnosis records is provided.
### Development version
```r
install.packages("remotes")
# Install development version from GitHub
remotes::install_github("DHLab-TSENG/dxpr")
library(dxpr)
```
```{r setup, echo = FALSE, message = FALSE}
knitr::opts_chunk$set(collapse = T, comment = "#>")
options(tibble.print_min = 4L, tibble.print_max = 4L)
#remotes::install_github("DHLab-TSENG/dxpr")
library(dxpr)
```
### Data Format
dxpr (diagnosis part) is used to pre-process diagnosis codes of EHRs. To execute functions in dxpr, the EHR data input should be a data frame object in R, and contain at least three columns: patient ID, ICD diagnosis codes and date.
Column names or column order of these three columns does not need to necessarily follow a rule. Each required column name will be an argument in functions. Detailed information of required data type of every column and argument of functions can be found in the reference section.
Also, in the R ecosystem, DBI, odbc, and other packages provide access to databases within R. As long as the data is retrieved from databases to a data frame in R, the processes are the same as the following example.
```r
con <- DBI::dbConnect(odbc::odbc(),
Driver = "[your driver's name]",
Server = "[your server's path]",
Database = "[your database's name]",
UID = "[Database user]",
PWD = "[Database password]")
dxDataFile <- dbSendQuery(con, "SELECT * FROM example_table WHERE ID = 'sampleID'")
dbFetch(dxDataFile)
```
### Sample file
A sample rda file is included in dxpr package:
This dataset is a simulated medical dataset of 38 patients with overall 300 records.
```{r}
head(sampleDxFile)
```
### ICD-CM code with two format
dxpr package uses ICD-CM codes as diagnosis standard. There are two formats of ICD-9 and ICD-10 diagnostic codes, decimal (with a decimal point separating the code) and short format, respectively. So two tables of ICD-9-CM and ICD-10-CM are generated to deal with different user needs: `ICD9DxwithTwoFormat` and `ICD10DxwithTwoFormat`.
ICD-9-CM
```{r}
# ICD-9-CM_Short
head(ICD9DxwithTwoFormat$Short)
# ICD-9-CM_Decimal
head(ICD9DxwithTwoFormat$Decimal)
```
ICD-10-CM
```{r}
# ICD-10-CM_Short
head(ICD10DxwithTwoFormat$Short)
# ICD-10-CM_Decimal
head(ICD10DxwithTwoFormat$Decimal)
```
## I.Code Integration
### A. ICD diagnostic code format transformation:Short <-> Decimal
dxpr package helps users to standardize the ICD-9 and ICD-10 diagnostic codes into a uniform format before further code grouping. The formats used for different grouping methods are shown as **Table 1**.
**Table 1** Format of code classification methods
| |ICD format|
|--------|----|
|Clinical Classifications Software (CCS)|short format|
|Comorbidity |short format|
|Phenome-Wide Association Studies (PheWAS)|decimal format|
Since formats of ICD codes used within a dataset could be different, users can choose a target type (short or decimal) according to the corresponding grouping method.
For example, if a user wants to group data by CCS, then ICD codes should be transformed into **short** format.
Code standardization for ICD-9 and ICD-10 are executed seperately. There are two ways to distinguish the version of ICD diagnostic code (ICD-9/ICD-10) used in data: one is a specific extra column that records version used (data type in this column should be numeric `9` or `10`), the other is a specific date that is the starting date of using ICD-10 in the dataset. For example, reimbursement claims with a date required to use ICD-10 codes in the United States and Taiwan are October 1st, 2015 and January 1st, 2016, respectively.
**Warning message**
Besides, code standardization functions generate data of diagnosis codes with potential error to help researchers identify the potential coding mistake that may affect the result of following clinical data analysis.
There are two error type:**wrong format** and **wrong version**. The former one means the ICD code does not exist (maybe because ICD is wrongly coded or with a wrong place of decimal point). And the latter one means the version is wrong (still use ICD 9 after `icd10usingDate`, etc.).
Users can check data after receiving the warning message.
dxpr package also provides an overview of error ICD data by Pareto chart (function `plotICDError`).
#### A-1. Uniform decimal format
The standardization function `icdDxShortToDecimal` converts the ICD diagnostic codes to a uniform decimal format, which can be used for grouping diagnostic code to PheWAS classification.
```{r, message = TRUE, warning = TRUE}
# Short to decimal
decimal <- icdDxShortToDecimal(dxDataFile = sampleDxFile,
icdColName = ICD,
dateColName = Date,
icd10usingDate = "2015/10/01")
```
In this example, the starting using date of ICD-10 is "2015/10/01" (format: "YYYY/MM/DD").
Also, there are 9 ICD codes labeled as "wrong format", and 7 ICD labeled as "wrong version".
The results are:
```{r}
decimal$ICD[6:10]
```
```{r}
decimal$Error
```
`decimal$Error` shows individual error ICD codes in descending order.
#### A-2. Uniform short format
`icdDxDecimalToShort` function converts the diagnostic codes to the short format, which can be used for grouping to CCS and comorbidities classification.
```{r, message = FALSE, warning = FALSE}
# Decimal to short
short <- icdDxDecimalToShort(dxDataFile = sampleDxFile,
icdColName = ICD,
dateColName = Date,
icd10usingDate = "2015/10/01")
short$ICD[6:10]
```
### B. Data integration
Functions in data integration section collapse ICD codes into a smaller number of clinically meaningful categories that are more useful for presenting descriptive statistics than individual ICD diagnostic codes are.
dxpr package supports four strategies to group EHR diagnosis codes, including CCS, PheWAS, comorbidities, and customized defined grouping methods.
The output of code classification contains two data frames.
**1) groupedDT**
**Table 2** groupedDT
Short/Decimal|ID|ICD|Date|GroupType
----|--|-----|---|----
ICD short/Decimal|patient ID| ICD |Admission date| group of code classification
The original row order of the data is remained the same in groupedDT, and only one extra column `GroupType` is added.
**2) summarised_groupedDT**
**Table 3** summarised_groupedDT
ID|GroupType|FirstCaseDate|EndCaseDate|Count|Period
|---------|------------|--------------|--------------|------------|---------|
patient ID|group of code classification|first admission date|last admission date|counts of period|record period
summarised_groupedDT summarised the ICD codes in the same group of the same patient together.
The two outputs can be used in the following functions. **groupedDT** can be used to select relevant cases (function `selectCases`) and calculate condition era (function `getConditionEra`). **summarised_groupedDT** can be used to convert the long format of grouped data into a wide format (function `groupedDataLongToWide`) which is fit to other analytical and plotting packages.
Users can choose the column information of `groupType` is "category" or "description" (`isDescription` = `TRUE` or `FALSE`)
For example, the ccs description is "Tuberculosis" while the category is "1".
#### B-1-1. Clinical Classifications Software (CCS)
The CCS classification for ICD-9 and ICD-10 codes is a diagnostic categorization scheme that can employ in many types of projects analyzing data on diagnoses.
**Stop updating since 2019.**
**1) single-level**: `icdDxToCCS`
Both ICD-9-CM and ICD-10-CM code contains 260 single-level CCS categories which can be corresponded with each other.
```{r, message = FALSE, warning = FALSE}
## ICD to CCS with category description
CCS_description <- icdDxToCCS(dxDataFile = sampleDxFile,
idColName = ID,
icdColName = ICD,
dateColName = Date,
icd10usingDate = "2015-10-01",
isDescription = TRUE)
head(CCS_description$groupedDT, 5)
head(CCS_description$summarised_groupedDT, 5)
## ICD to CCS with category
CCS_category <- icdDxToCCS(dxDataFile = sampleDxFile,
idColName = ID,
icdColName = ICD,
dateColName = Date,
icd10usingDate = "2015-10-01",
isDescription = FALSE)
head(CCS_category$groupedDT, 5)
head(CCS_category$summarised_groupedDT, 5)
```
**2) multi-level**: `icdDxToCCSLvl`
Multi-level CCS in ICD-9-CM has four levels, and multi-level CCS in ICD-10-CM has two levels.
```{r, message = FALSE, warning = FALSE}
## ICD to CCS multiple level 2 description
CCSlvl_description <- icdDxToCCSLvl(dxDataFile = sampleDxFile,
idColName = ID,
icdColName = ICD,
dateColName = Date,
icd10usingDate = "2015-10-01",
CCSLevel = 2,
isDescription = TRUE)
head(CCSlvl_description$groupedDT, 5)
head(CCSlvl_description$summarised_groupedDT, 5)
## ICD to CCS multiple level 3 category
CCSLvl_category <- icdDxToCCSLvl(dxDataFile = sampleDxFile,
idColName = ID,
icdColName = ICD,
dateColName = Date,
icd10usingDate = "2015-10-01",
CCSLevel = 3,
isDescription = FALSE)
```
#### B-1-2. Clinical Classifications Software Refined (CCSR)
The CCSR classification for ICD-10 codes is a diagnostic categorization scheme that can employ in many types of projects analyzing data on diagnoses.
Unlike CCS, a diagnosis code can be classified into more than one categories in CCSR. However, it is only applicable to ICD-10.
```{r, message = FALSE, warning = FALSE}
## ICD to CCSR with category description
CCSR_description <- icdDxToCCSR(dxDataFile = sampleDxFile,
idColName = ID,
icdColName = ICD,
dateColName = Date,
icd10usingDate = "2015-10-01",
isDescription = TRUE)
head(CCSR_description$groupedDT, 5)
head(CCSR_description$summarised_groupedDT, 5)
```
#### B-2. PheWAS
The dxpr package applied PheWAS, performing a hierarchical grouping of ICD-9 diagnostic codes and ICD-10 diagnostic codes (beta version).
```{r, message = FALSE, warning = FALSE}
## ICD to PheWAS
PheWAS <- icdDxToPheWAS(dxDataFile = sampleDxFile,
idColName = ID,
icdColName = ICD,
dateColName = Date,
icd10usingDate = "2015-10-01",
isDescription = FALSE)
PheWAS$groupedDT[7:11]
PheWAS$summarised_groupedDT[7:11]
```
#### B-3. Comorbidities
The dxpr package provides three grouping methods of comorbidity as below:
**1) AHRQ**
AHRQ comorbidity measure dataset is based on AHRQ Elixhauser Comorbidity Index.
```{r, message = FALSE, warning = FALSE}
AHRQ <- icdDxToComorbid(dxDataFile = sampleDxFile,
idColName = ID,
icdColName = ICD,
dateColName = Date,
icd10usingDate = "2015-10-01",
comorbidMethod = AHRQ)
AHRQ$groupedDT[160:164]
head(AHRQ$summarised_groupedDT, 5)
```
**2) Charlson**
Charlson comorbidity measure dataset is based on Quan's translations of the Charlson Comorbidity Index.
```{r, message = FALSE, warning = FALSE}
Charlson <- icdDxToComorbid(dxDataFile = sampleDxFile,
idColName = ID,
icdColName = ICD,
dateColName = Date,
icd10usingDate = "2015-10-01",
comorbidMethod = charlson)
Charlson$groupedDT[160:164]
head(Charlson$summarised_groupedDT, 5)
```
**3) Elixhauser**
The Elixhauser comorbidity software is one in a family of databases and software tools developed as part of the Healthcare Cost and Utilization Project (HCUP).
```{r, message = FALSE, warning = FALSE}
ELIX <- icdDxToComorbid(dxDataFile = sampleDxFile,
idColName = ID,
icdColName = ICD,
dateColName = Date,
icd10usingDate = "2015-10-01",
comorbidMethod = elix)
ELIX$groupedDT[160:164]
head(ELIX$summarised_groupedDT, 5)
```
#### B-4. Customize group method
The dxpr package provided customized grouping functions, in which researches can define the grouping categories; therefore, it is more flexible for grouping ICD diagnostic codes.
For example, researcher can declare a customized grouping table for **chronic kidney disease** category, and convert an existing dataset into a grouped chronic kidney disease dataset.
There are two functions for customized defined grouping method based on precise and fuzzy grouping method, respectively.
**1) Precise method**: `icdDxToCustom`
```{r, message = FALSE, warning = FALSE}
# CustomGroupingTable
groupingTable <- data.frame(Group = rep("Chronic kidney disease",6),
ICD = c("N181","5853","5854","5855","5856","5859"),
stringsAsFactors = FALSE)
CustomGroup <- icdDxToCustom(dxDataFile = sampleDxFile,
idColName = ID,
icdColName = ICD,
dateColName = Date,
customGroupingTable = groupingTable)
CustomGroup$groupedDT[10:14]
head(CustomGroup$summarised_groupedDT, 5)
```
**2) Fuzzy method**: `icdDxToCustomGrep`
```{r, message = FALSE}
# CustomGroupingTable
grepTable <- data.frame(Group = "Chronic kidney disease",
grepIcd = "^585|^N18",
stringsAsFactors = FALSE)
CustomGrepGroup <- icdDxToCustomGrep(dxDataFile = sampleDxFile,
idColName = ID,
icdColName = ICD,
dateColName = Date,
customGroupingTable = grepTable)
CustomGrepGroup$groupedDT[10:14]
head(CustomGrepGroup$summarised_groupedDT, 5)
```
## II. Data Wrangling
### A. Cases selection
The query function `selectCases` can select the cases matching defined case conditions (been diagnosed with certain condition for certain times within a specific duration). User can select cases by diagnostic categories, such as CCS category, ICD codes, etc.
The output of this function provides the start and end dates of the cases, the
number of days between them, and the most common ICD codes used in the case
definition.
```{r, message = FALSE, warning = FALSE}
Case <- selectCases(dxDataFile = sampleDxFile,
idColName = ID,
icdColName = ICD,
dateColName = Date,
groupDataType = ccslvl2,
icd10usingDate = "2015/10/01",
caseCondition = "Diseases of the urinary system",
caseCount = 1,
caseName = "Selected")
head(Case)
```
### B. Get eligible period of patient records
The function `getEligiblePeriod` is used for querying the earliest and latest admission date for each patient.
```{r, message = FALSE}
admissionDate <- getEligiblePeriod(dxDataFile = sampleDxFile,
idColName = ID,
dateColName = Date)
head(admissionDate)
```
### C. Data split
Function `splitDataByDate` extracts data by a specific clinical event (e.g., first diagnosis dates of chronic diseases).
Users can define a table of clinical index dates of each patient. The date can generate by `selectCases` function or first/last admission date by `getEligiblePeriod` function.
This function can split data through classifying the data recorded before or after the defined index date and calculating the period between the record date and index date based on a self-defined window gap.
```{r, message = FALSE}
indexDateTable <- data.frame(ID = c("A0","B0","C0","D0"),
indexDate = c("2009-07-25", "2015-12-26",
"2015-12-05", "2017-01-29"),
stringsAsFactors = FALSE)
Data <- splitDataByDate(dxDataFile = sampleDxFile,
idColName = ID,
icdColName = ICD,
dateColName = Date,
indexDateFile = indexDateTable,
gap = 30)
head(Data, 5)
```
### D. Condition era
The concept of condition era is committed to the length of the persistence gap: when the time interval of any two consecutive admissions for certain conditions is smaller than the length of the persistence gap, then these two admission events will be aggregated into the same condition era.
Function `getConditionEra` calculates condition era by using the grouped categories or self-defining groups of each patient and then generates a table with individual IDs, the first and last record of an era, and the sequence number of each episode.
```{r, message = FALSE, warning = FALSE}
Era <- getConditionEra(dxDataFile = sampleDxFile,
idColName = ID,
icdColName = ICD,
dateColName = Date,
icd10usingDate = "2015-10-01",
gapDate = 30,
groupDataType = ccs,
isDescription = FALSE)
head(Era)
```
### E. EDA preparation
After data integration, dxpr package provides a function to convert long format of grouped data into wide format which is fit to other analytical and plotting packages.
There are two type of output: numeric and binary (`numericOrBinary` = `B` or `N` )
```{r, message = FALSE, warning = FALSE}
#binary
groupedData_Wide <- groupedDataLongToWide(dxDataFile = ELIX$groupedDT,
idColName = ID,
categoryColName = Comorbidity,
dateColName = Date,
numericOrBinary = B)
head(groupedData_Wide, 5)
```
```{r, message = FALSE, warning = FALSE}
# numeric
groupedData_Wide <- groupedDataLongToWide(dxDataFile = ELIX$groupedDT,
idColName = ID,
categoryColName = Comorbidity,
dateColName = Date,
numericOrBinary = N)
head(groupedData_Wide, 5)
```
## III. Visualization
Visualization provides overview of clinical data.
### A. Pareto chart of error ICD list
Through code standardization, the functions `icdDxDecimalToShort` and `icdDxShortToDecimal` generate a table of diagnosis codes with potential errors.
The Pareto chart includes bar plot and line chart to visualize individual possible error ICD codes represented in descending order and cumulative total.
```{r, message = FALSE, warning = FALSE}
error <- icdDxDecimalToShort(dxDataFile = sampleDxFile,
icdColName = ICD,
dateColName = Date,
icd10usingDate = "2015/10/01")
Plot_error1 <- plotICDError(errorFile = error$Error,
icdVersion = all,
wrongICDType = all,
others = TRUE,
topN = 10)
```
<img src="https://raw.githubusercontent.com/DHLab-TSENG/dxpr/master/image/plotICDError.png" style="display:block; margin:auto; width:70%;">
For instance, if a user chooses top 10 of common error ICD in dataset (`topN` = `10`), then the Pareto chart output shows with top 10 error codes in this dataset and a list of the detail of error ICD codes.
```{r, message = FALSE}
Plot_error1$ICD
```
The most common error ICD is *A0.11* which has 20 admission records and error type is "wrong format"
Also, users can divide **ICD-9** by the prefix of the ICD code: 0, 1, 2,..., 9, V and E (`groupICD = TRUE`)
ICD-9-CM divided into 19 chapters:
001-139: Infectious And Parasitic Diseases
140-239: Neoplasms
....
For instance, if user chooses top 3 of common error *ICD-9*, the output Pareto chart shows with top 3 error codes and a list of the detail of error ICD codes.
```{r, message = FALSE}
Plot_error2 <- plotICDError(errorFile = error$Error,
icdVersion = 9,
wrongICDType = all,
groupICD = TRUE,
others = TRUE,
topN = 3)
```
<img src="https://raw.githubusercontent.com/DHLab-TSENG/dxpr/master/docs/reference/plotError-1.png" style="display:block; margin:auto; width:70%;">
```r
Plot_error2$ICD
#> ICDGroup groupCount CumCountPerc MostICDInGroup ICDPercInGroup WrongType
#> 1: A 13 41.94% A01.05 61.54% Wrong version
#> 2: 7 9 70.97% 75.52 44.44% Wrong format
00#> 3: 0 5 87.1% 001 100% Wrong format
#> 4: Others 4 100% E03.0 100% Wrong version
```
The most common error ICD is *A01.05* which has 13 admission records and the error type is "wrong version".
### B. histogram plot
`plotDiagCat` function provides an overview of grouping category of the diagnostic code in histogram plot. User can observe the proportion of diagnostic categories in their dataset.
```{r, message = FALSE, warning = FALSE}
groupedDataWide <- groupedDataLongToWide(ELIX$groupedDT,
idColName = ID,
categoryColName = Comorbidity,
dateColName = Date)
plot1 <- plotDiagCat(groupedDataWide = groupedDataWide,
idColName = ID,
topN = 10,
limitFreq = 0.01)
```
<img src="https://raw.githubusercontent.com/DHLab-TSENG/dxpr/master/image/plotDiagSing.png" style="display:block; margin:auto; width:70%;">
The first group, for instance, is grouped into "RENLFAIL" of ELIX comorbidity index.
```{r}
plot1$sigCate
```
This function can also do the Chi-square test and Fisher’s exact test to see if it is statistical significantly different between the diagnostic categories of case and control.
The default level of significance is of 5% (p = 0.05).
```{r, message = FALSE, warning = FALSE}
selectedCaseFile <- selectCases(dxDataFile = sampleDxFile,
idColName = ID,
icdColName = ICD,
dateColName = Date,
icd10usingDate = "2015/10/01",
groupDataType = ccslvl2,
caseCondition = "Diseases of the urinary system",
caseCount = 1)
groupedDataWide <- groupedDataLongToWide(ELIX$groupedDT, ID, Comorbidity, Date,
selectedCaseFile = selectedCaseFile)
plot2 <- plotDiagCat(groupedDataWide = groupedDataWide,
idColName = ID,
groupColName = selectedCase,
topN = 10,
limitFreq = 0.01,
pvalue = 0.05)
```
<img src="https://raw.githubusercontent.com/DHLab-TSENG/dxpr/master/image/plotDiagMult.png" style="display:block; margin:auto; width:70%;">
There are stastitcal significant difference in "RENLFAIL" of ELIX comorbidity index between case and control.
```{r}
plot2$sigCate
```
## Reference
### I. Code transformation
ICD-9-CM code (2014): https://www.cms.gov/Medicare/Coding/ICD9ProviderDiagnosticCodes/codes.html
ICD-10-CM code (2019-2022): https://www.cms.gov/Medicare/Coding/ICD10
https://www.findacode.com/search/search.php
https://www.cms.gov/Medicare/Quality-Initiatives-Patient-Assessment-Instruments/HospitalQualityInits/Downloads/HospitalAppendix_F.pdf
### II. Code grouping
**CCS (Clinical Classifications Software)**
ICD-9-CM (2015): https://www.hcup-us.ahrq.gov/toolssoftware/ccs/ccs.jsp
https://www.hcup-us.ahrq.gov/toolssoftware/ccs/Multi_Level_CCS_2015.zip
ICD-10-CM (2019): https://www.hcup-us.ahrq.gov/toolssoftware/ccsr/ccsr_archive.jsp
https://www.hcup-us.ahrq.gov/toolssoftware/ccs10/ccs_dx_icd10cm_2019_1.zip
**CCSR (Clinical Classifications Software Refined)**
ICD-10-CM (v2022-1): https://www.hcup-us.ahrq.gov/toolssoftware/ccsr/ccs_refined.jsp
**PheWAS**
ICD-9-Phecode (version 1.2, 2015): https://phewascatalog.org/phecodes
ICD-10 Phecode (version 1.2 beta, 2019): https://phewascatalog.org/phecodes_icd10cm
**Comorbidities**
ICD-9-AHRQ (2012-2015): https://www.hcup-us.ahrq.gov/toolssoftware/comorbidity/comorbidity.jsp#references
ICD-10-AHRQ (2019): https://www.hcup-us.ahrq.gov/toolssoftware/comorbidityicd10/comformat_icd10cm_2019_1.txt
ICD-9-Charlson: http://mchp-appserv.cpe.umanitoba.ca/Upload/SAS/ICD9_E_Charlson.sas.txt
ICD-10-Charlson: http://mchp-appserv.cpe.umanitoba.ca/Upload/SAS/ICD10_Charlson.sas.txt
ICD-9-Elixhauser (2012-2015): https://www.hcup-us.ahrq.gov/toolssoftware/comorbidity/comorbidity.jsp#references
ICD-10-Elixhauser (2019): https://www.hcup-us.ahrq.gov/toolssoftware/comorbidityicd10/comformat_icd10cm_2019_1.txt