Commit 5883369 (0 parents): 63 changed files with 13,317 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,45 @@ | ||
Package: Rmonize | ||
Type: Package | ||
Title: Support Retrospective Harmonization of Data | ||
Version: 1.0.1 | ||
Authors@R: | ||
c(person(given = "Guillaume", | ||
family = "Fabre", | ||
role = c("aut", "cre"), | ||
email = "guijoseph.fabre@gmail.com", | ||
comment = c(ORCID = "0000-0002-0124-9970")), | ||
person("Maelstrom-research group", | ||
role=c("fnd"))) | ||
Maintainer: Guillaume Fabre <guijoseph.fabre@gmail.com> | ||
Description: Functions to support rigorous retrospective data harmonization | ||
processing, evaluation, and documentation across datasets from different | ||
studies based on Maelstrom Research guidelines. The package includes the | ||
core functions to evaluate and format the main inputs that define the | ||
harmonization process, apply specified processing rules to generate | ||
harmonized data, diagnose processing errors, and summarize and evaluate | ||
harmonized outputs. The main inputs that define the processing are a | ||
DataSchema (list and definitions of harmonized variables to be generated) | ||
and Data Processing Elements (processing rules to be applied to generate | ||
harmonized variables from study-specific variables). The main outputs of | ||
processing are harmonized datasets, associated metadata, and tabular and | ||
visual summary reports. As described in | ||
Maelstrom Research guidelines for rigorous retrospective data | ||
harmonization (Fortier I et al. (2017) <doi:10.1093/ije/dyw075>). | ||
License: GPL-3 | ||
LazyData: true | ||
Depends: R (>= 3.4) | ||
Imports: dplyr (>= 1.1.0), rlang, stringr, tidyr, crayon, haven, utils, | ||
fs, fabR (>= 2.0.0), madshapR | ||
Suggests: janitor, car, knitr | ||
URL: https://github.com/maelstrom-research/Rmonize/ | ||
BugReports: https://github.com/maelstrom-research/Rmonize/issues | ||
RoxygenNote: 7.2.3 | ||
VignetteBuilder: knitr | ||
Encoding: UTF-8 | ||
Language: en-US | ||
NeedsCompilation: no | ||
Packaged: 2023-12-20 16:52:38 UTC; guill | ||
Author: Guillaume Fabre [aut, cre] (<https://orcid.org/0000-0002-0124-9970>), | ||
Maelstrom-research group [fnd] | ||
Repository: CRAN | ||
Date/Publication: 2023-12-21 16:30:04 UTC |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,62 @@ | ||
bf50bea9d1cc9129fb706d303b689576 *DESCRIPTION | ||
0ab457c3e5eec24233535e292f660a49 *NAMESPACE | ||
fad0853e7a8f13c7f4672c15ccfda2b3 *NEWS.md | ||
41991d11778dd8e6ae86217d81b48ba3 *R/00-import-from-madshapR.R | ||
53c13717949302761598bdcdb2bb32a8 *R/01-utils.R | ||
3a365a666412a54f452bfdfde1b89902 *R/02-harmo_process_harmonization.R | ||
5da1cd4bc06bbcec23a45962d41680c7 *R/03-harmonized_data_evaluate.R | ||
0c4fbf469dab718bd61dd3ed475a4078 *R/04-harmonized_data_summarize.R | ||
258b557a71aa07abc81d5fb7fe916492 *R/05-harmonized_data_visualize.R | ||
1b54137d4ae82b40a266122de6d1201b *R/Rmonize-package.R | ||
30473863dc61a6763ab0e3fc9a4414a1 *README.md | ||
2836b2c5db38c9819380bd093383d89d *build/partial.rdb | ||
ff70bbda47a299185435e70394cd80f0 *build/vignette.rds | ||
c55d2d23abff5dca755002de2fd59a6e *data/Rmonize_DEMO.rda | ||
c1b9c0d0eac32d433fc090109ef1cabe *inst/WORDLIST | ||
cf0d88805ee1cb048eb11ff1bd88956a *inst/doc/a-Glossary-and-templates.R | ||
651e86fc34c5a870430596950bac697e *inst/doc/a-Glossary-and-templates.Rmd | ||
215a19ff1d1ac9ca2e399b2da3df585e *inst/doc/a-Glossary-and-templates.html | ||
cf0d88805ee1cb048eb11ff1bd88956a *inst/doc/b-Data-processing-elements.R | ||
dd565779c4eed1cb36ed7bfd60a720e5 *inst/doc/b-Data-processing-elements.Rmd | ||
0c53ffa75047c9c01a885dbda5da3a19 *inst/doc/b-Data-processing-elements.html | ||
a4b22c9cec9c2b67460ed40ef757f5af *inst/doc/c-Example-with-Rmonize_DEMO.R | ||
8c0f211c628d343e5a8f7008cd30491d *inst/doc/c-Example-with-Rmonize_DEMO.Rmd | ||
06c705ce33467d188b6221e8d77a61db *inst/doc/c-Example-with-Rmonize_DEMO.html | ||
564666a2a16494478d205ec1f55f31c7 *man/Rmonize-package.Rd | ||
7fda5bfa7a3d1183f1d8212f421320e9 *man/Rmonize_DEMO.Rd | ||
c8f7a2640b2e456dcb742a5213a32bd9 *man/Rmonize_help.Rd | ||
7727ce6a976e9679ed88af24c5d07828 *man/Rmonize_templates.Rd | ||
d259ad678473874c4306ab68b0872d02 *man/as_data_dict.Rd | ||
d88ef7e030e090e5df435a0e3672964f *man/as_data_proc_elem.Rd | ||
b70d6c025e6d08337f6c39a9ab27abba *man/as_dataschema.Rd | ||
f63f4dec4d027dc32e0fe3c521319737 *man/as_dataschema_mlstr.Rd | ||
1ab90b88583c61e96914c75fdfb7053e *man/as_dataset.Rd | ||
4bf4167df0feec0728f471ecfd97e42f *man/as_dossier.Rd | ||
e157b56ffbab1279572c9b92dda3db94 *man/as_harmonized_dossier.Rd | ||
5478e2862ba5ad3a8f0f858b6eb5957a *man/bookdown_open.Rd | ||
553ddc568949babb77aca23a6ae9dd51 *man/data_dict_apply.Rd | ||
8cc40b23ce5830c623bf02ac4dbdcc98 *man/data_dict_evaluate.Rd | ||
945e834191fdae7aa32c07fe1797602e *man/data_dict_extract.Rd | ||
10f1594c21afd01fa8dc8173ef1e5ae5 *man/dataschema_evaluate.Rd | ||
c740bef912c858af98bf0a2abb5e37df *man/dataschema_extract.Rd | ||
04b898c19d70dfc9cdfe4fbee31f8f75 *man/dataset_evaluate.Rd | ||
74c32e28ebd360e8e8c6b1d04c2557f2 *man/dataset_summarize.Rd | ||
5624151e4884ae4824b7baaf992fd42b *man/dataset_visualize.Rd | ||
2fad6730b0858d3791518a4a9b862425 *man/dossier_create.Rd | ||
fbb667e37efc99716e8d9289c81e28d6 *man/dossier_evaluate.Rd | ||
847635a7d77f99a9a4c055ad20d6a6df *man/dossier_summarize.Rd | ||
5a7d2982c964337268173b8fd137dd80 *man/figures/fig_readme.png | ||
ce3dc6dab597db8cc0b7aa7891eed2b1 *man/harmo_process.Rd | ||
f84312d6c9b24d1c763c178a7b76bb1c *man/harmonized_dossier_evaluate.Rd | ||
fff6cc8f04ecadf214ed21de9a4cb620 *man/harmonized_dossier_summarize.Rd | ||
807a1748f5e3c74e8826745bba8f7f67 *man/harmonized_dossier_visualize.Rd | ||
0d9c7f49ca27b0350761f55ecf5bd6cd *man/is_data_proc_elem.Rd | ||
400c8b84468856d30218f9421463ad65 *man/is_dataschema.Rd | ||
eda8fb22078d83d58e0622cd34e4c470 *man/is_dataschema_mlstr.Rd | ||
c6f092b26dcf223382e55fa8801cf3b7 *man/pooled_harmonized_dataset_create.Rd | ||
0d4d180b940b2f09ff608528e539a9fa *man/reexports.Rd | ||
e5455210434ca6b0035b1ddb29c135b6 *man/show_harmo_error.Rd | ||
651e86fc34c5a870430596950bac697e *vignettes/a-Glossary-and-templates.Rmd | ||
dd565779c4eed1cb36ed7bfd60a720e5 *vignettes/b-Data-processing-elements.Rmd | ||
8c0f211c628d343e5a8f7008cd30491d *vignettes/c-Example-with-Rmonize_DEMO.Rmd | ||
df77331e292754e7d823c16b5c16cf9e *vignettes/datatables.R |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,74 @@ | ||
# Generated by roxygen2: do not edit by hand | ||
|
||
export(Rmonize_help) | ||
export(Rmonize_templates) | ||
export(as_data_dict) | ||
export(as_data_proc_elem) | ||
export(as_dataschema) | ||
export(as_dataschema_mlstr) | ||
export(as_dataset) | ||
export(as_dossier) | ||
export(as_harmonized_dossier) | ||
export(bookdown_open) | ||
export(data_dict_apply) | ||
export(data_dict_evaluate) | ||
export(data_dict_extract) | ||
export(dataschema_evaluate) | ||
export(dataschema_extract) | ||
export(dataset_evaluate) | ||
export(dataset_summarize) | ||
export(dataset_visualize) | ||
export(dossier_create) | ||
export(dossier_evaluate) | ||
export(dossier_summarize) | ||
export(harmo_process) | ||
export(harmonized_dossier_evaluate) | ||
export(harmonized_dossier_summarize) | ||
export(harmonized_dossier_visualize) | ||
export(is_data_proc_elem) | ||
export(is_dataschema) | ||
export(is_dataschema_mlstr) | ||
export(pooled_harmonized_dataset_create) | ||
export(show_harmo_error) | ||
import(dplyr) | ||
import(fabR) | ||
import(fs) | ||
import(haven) | ||
import(stringr) | ||
import(tidyr) | ||
importFrom(crayon,bold) | ||
importFrom(crayon,green) | ||
importFrom(madshapR,as_category) | ||
importFrom(madshapR,as_data_dict) | ||
importFrom(madshapR,as_data_dict_mlstr) | ||
importFrom(madshapR,as_dataset) | ||
importFrom(madshapR,as_dossier) | ||
importFrom(madshapR,as_taxonomy) | ||
importFrom(madshapR,as_valueType) | ||
importFrom(madshapR,bookdown_open) | ||
importFrom(madshapR,col_id) | ||
importFrom(madshapR,data_dict_apply) | ||
importFrom(madshapR,data_dict_evaluate) | ||
importFrom(madshapR,data_dict_extract) | ||
importFrom(madshapR,data_dict_filter) | ||
importFrom(madshapR,data_extract) | ||
importFrom(madshapR,dataset_evaluate) | ||
importFrom(madshapR,dataset_summarize) | ||
importFrom(madshapR,dataset_visualize) | ||
importFrom(madshapR,dataset_zap_data_dict) | ||
importFrom(madshapR,dossier_create) | ||
importFrom(madshapR,dossier_evaluate) | ||
importFrom(madshapR,dossier_summarize) | ||
importFrom(madshapR,is_category) | ||
importFrom(madshapR,is_data_dict) | ||
importFrom(madshapR,is_data_dict_mlstr) | ||
importFrom(madshapR,is_dataset) | ||
importFrom(madshapR,is_dossier) | ||
importFrom(madshapR,is_taxonomy) | ||
importFrom(madshapR,valueType_adjust) | ||
importFrom(rlang,":=") | ||
importFrom(rlang,.data) | ||
importFrom(rlang,is_error) | ||
importFrom(rlang,is_warning) | ||
importFrom(utils,browseURL) | ||
importFrom(utils,capture.output) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,152 @@ | ||
|
||
# Rmonize 1.0.1 | ||
|
||
Bug corrections and enhancements after testing with real data. | ||
|
||
## Bug fixes and improvements | ||
|
||
### Improvement in handling pooled data | ||
|
||
The functions `harmo_process()`, `pooled_harmonized_dataset_create()`, | ||
`harmonized_dossier_create()`, `harmonized_dossier_evaluate()`, | ||
`harmonized_dossier_summarize()`, and `harmonized_dossier_visualize()` share | ||
the same parameter `harmonized_col_dataset`, which is the name of the | ||
column (if it exists) that identifies the input dataset names. If this | ||
column exists and is declared by the user, it is used across the | ||
pipeline as a grouping/separating variable. By default, the name of each | ||
dataset is used instead. | ||
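
The grouping behaviour described above can be pictured with a minimal base R sketch. It is not Rmonize's implementation: the helper `group_by_dataset()`, the column name `study`, and the sample data are all hypothetical and only illustrate the idea of "use the declared column if present, otherwise fall back to the dataset name".

```r
# Hypothetical pooled data: `study` plays the role of a user-declared
# dataset column (an assumed name, not taken from Rmonize).
pooled <- data.frame(
  id    = c("a1", "a2", "b1", "b2", "b3"),
  study = c("TOKYO", "TOKYO", "PARIS", "PARIS", "PARIS"),
  age   = c(34, 51, 47, 29, 62),
  stringsAsFactors = FALSE
)

# Split rows into groups using the declared column if it exists;
# otherwise return a single group named after the dataset itself.
group_by_dataset <- function(data, col_dataset = NULL,
                             dataset_name = "dataset") {
  if (!is.null(col_dataset) && col_dataset %in% names(data)) {
    return(split(data, data[[col_dataset]]))
  }
  stats::setNames(list(data), dataset_name)
}

# Declared column: one group per distinct value of `study`.
groups_declared <- group_by_dataset(pooled, col_dataset = "study")

# No declared column: a single group carrying the dataset name.
groups_default <- group_by_dataset(pooled, dataset_name = "pooled")
```

In this sketch, `names(groups_declared)` lists the distinct values of the declared column, while `names(groups_default)` contains only the fallback dataset name.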
|
||
Rename `DEMO_file_harmo` to `Rmonize_DEMO` and update the examples. | ||
|
||
Remove the parameter `overwrite = TRUE` from the `xxx_visualize()` functions. | ||
|
||
- <https://github.com/maelstrom-research/Rmonize/issues/38> | ||
|
||
In visual reports, avoid confusing changes in color scheme. | ||
|
||
- <https://github.com/maelstrom-research/Rmonize/issues/37> | ||
|
||
Histograms for date variables display valid ranges. | ||
|
||
- <https://github.com/maelstrom-research/Rmonize/issues/31> | ||
|
||
In reports, show % NA as a proportion. | ||
|
||
- <https://github.com/maelstrom-research/Rmonize/issues/29> | ||
|
||
`harmonized_dossier_visualize()` report shows variable labels in the | ||
same language. | ||
|
||
- <https://github.com/maelstrom-research/Rmonize/issues/28> | ||
|
||
Put `id_creation` in the script and in the rule in the DPE (as in `direct_mapping`). | ||
|
||
- <https://github.com/maelstrom-research/Rmonize/issues/27> | ||
|
||
Allow special characters in the names of datasets and data dictionaries. | ||
|
||
- <https://github.com/maelstrom-research/Rmonize/issues/23> | ||
|
||
In visual reports, the bar plot only appears when there are multiple | ||
missing value types, otherwise only the pie chart is shown. | ||
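
The rule above amounts to a simple condition on the number of distinct missing value types. The following base R sketch illustrates that condition only; the function `charts_for_missing()` and the codes `-8` and `-9` are hypothetical and are not taken from Rmonize or madshapR.

```r
# Decide which charts to show for the missing values of one variable.
# `missing_codes` is an assumed vector of values treated as missing.
charts_for_missing <- function(values, missing_codes) {
  # Distinct missing value types actually observed in the data
  observed_types <- unique(values[values %in% missing_codes])
  # Bar plot only when several missing types are present
  if (length(observed_types) > 1) c("pie", "bar") else "pie"
}

# Two missing types observed (-8 and -9): pie chart and bar plot
two_types <- charts_for_missing(c(1, 2, -8, -9), missing_codes = c(-8, -9))

# A single missing type observed (-8): pie chart only
one_type <- charts_for_missing(c(1, 2, -8, -8), missing_codes = c(-8, -9))
```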
|
||
- <https://github.com/maelstrom-research/Rmonize/issues/22> | ||
|
||
Enhance `harmonized_dossier_visualize()` output. | ||
|
||
- <https://github.com/maelstrom-research/Rmonize/issues/17> | ||
|
||
Enhance `show_harmo_error()` output. | ||
|
||
- <https://github.com/maelstrom-research/Rmonize/issues/5> | ||
|
||
In reports, all of the percentages are now included under “Other values | ||
(non categorical)”, which gives a single value. | ||
|
||
- <https://github.com/maelstrom-research/Rmonize/issues/4> | ||
|
||
The recode function now works with special characters. | ||
|
||
# Rmonize 1.0.0 | ||
|
||
Functions to support rigorous retrospective data harmonization | ||
processing, evaluation, and documentation across datasets in a dossier | ||
based on Maelstrom Research guidelines. The package includes the core | ||
functions to evaluate and format the main inputs that define the | ||
harmonization process, apply specified processing rules to generate | ||
harmonized data, diagnose processing errors, and summarize and evaluate | ||
harmonized outputs. | ||
|
||
This is still a work in progress, so please let us know if a function | ||
you used before no longer works. | ||
|
||
## Helper functions and objects | ||
|
||
- `Rmonize_help()` Open the help center for full documentation | ||
- `dowload_templates()` Open the template download page of the help | ||
center | ||
- `Rmonize_DEMO` Built-in material allowing the user to test the package | ||
with demo data | ||
|
||
## Assess and manipulate input files | ||
|
||
- `as_data_proc_elem()` Validate and coerce any object as Data | ||
Processing Elements | ||
- `as_dataschema()`, `as_dataschema_mlstr()` Validate and coerce any | ||
object as a DataSchema | ||
- `as_harmonized_dossier()` Validate and coerce any object as a | ||
harmonized dossier | ||
- `dataschema_extract()` Extract and create the DataSchema from | ||
Data Processing Elements | ||
|
||
## Data processing | ||
|
||
- `harmo_process()` Generate harmonized dataset(s) and annotated Data | ||
Processing Elements. This function internally runs the following | ||
functions: | ||
|
||
- `harmo_parse_process_rule()`, | ||
`harmo_process_add_variable()`,`harmo_process_case_when()`, | ||
`harmo_process_direct_mapping()`,`harmo_process_id_creation()`, | ||
`harmo_process_impossible()`,`harmo_process_merge_variable()`, | ||
`harmo_process_operation()`,`harmo_process_other()`, | ||
`harmo_process_paste()`,`harmo_process_recode()`, | ||
`harmo_process_rename()`,`harmo_process_undetermined()` | ||
|
||
- `pooled_harmonized_dataset_create()` Generate the pooled dataset from | ||
harmonized datasets in a dossier | ||
|
||
## Evaluation of the harmonization process | ||
|
||
- `show_harmo_error()` Generate a summary of the annotated Data | ||
Processing Elements | ||
- `data_proc_elem_evaluate()`, `dataschema_evaluate()`, | ||
`harmonized_dossier_evaluate()`, `harmonized_dossier_summarize()`, | ||
`harmonized_dossier_visualize()` Generate quality assessment reports | ||
and summary statistics of inputs and outputs. | ||
|
||
## Imported from the madshapR package | ||
|
||
- Shape and prepare input (datasets and data dictionaries): | ||
|
||
`as_data_dict()`,`is_data_dict()`, | ||
`as_data_dict_mlstr()`,`is_data_dict_mlstr()`, | ||
`as_dataset()`,`is_dataset()`, `as_dossier()`,`is_dossier()`, | ||
`as_taxonomy()` | ||
|
||
- Extract and manipulate information from input: | ||
|
||
`data_extract()`,`data_dict_extract()`, | ||
`data_dict_apply()`,`dataset_zap_data_dict()`,`dossier_create()`, | ||
`valueType_adjust()` | ||
|
||
- Assess input data: | ||
|
||
`dataset_evaluate()`, `data_dict_evaluate()`,`dossier_evaluate()`, | ||
`dataset_summarize()`,`dossier_summarize()` | ||
|
||
- Visualize input data: | ||
|
||
`bookdown_template()`,`bookdown_render()`,`bookdown_open()`, | ||
`dataset_visualize()` |