version 1.0.1
GuiFabre authored and cran-robot committed Dec 22, 2023
0 parents commit 5883369
Showing 63 changed files with 13,317 additions and 0 deletions.
45 changes: 45 additions & 0 deletions DESCRIPTION
@@ -0,0 +1,45 @@
Package: Rmonize
Type: Package
Title: Support Retrospective Harmonization of Data
Version: 1.0.1
Authors@R:
c(person(given = "Guillaume",
family = "Fabre",
role = c("aut", "cre"),
email = "guijoseph.fabre@gmail.com",
comment = c(ORCID = "0000-0002-0124-9970")),
person("Maelstrom-research group",
role=c("fnd")))
Maintainer: Guillaume Fabre <guijoseph.fabre@gmail.com>
Description: Functions to support rigorous retrospective data harmonization
processing, evaluation, and documentation across datasets from different
studies based on Maelstrom Research guidelines. The package includes the
core functions to evaluate and format the main inputs that define the
harmonization process, apply specified processing rules to generate
harmonized data, diagnose processing errors, and summarize and evaluate
harmonized outputs. The main inputs that define the processing are a
DataSchema (list and definitions of harmonized variables to be generated)
and Data Processing Elements (processing rules to be applied to generate
harmonized variables from study-specific variables). The main outputs of
processing are harmonized datasets, associated metadata, and tabular and
visual summary reports. As described in
Maelstrom Research guidelines for rigorous retrospective data
harmonization (Fortier I and al. (2017) <doi:10.1093/ije/dyw075>).
License: GPL-3
LazyData: true
Depends: R (>= 3.4)
Imports: dplyr (>= 1.1.0), rlang, stringr, tidyr, crayon, haven, utils,
fs, fabR (>= 2.0.0), madshapR
Suggests: janitor, car, knitr
URL: https://github.com/maelstrom-research/Rmonize/
BugReports: https://github.com/maelstrom-research/Rmonize/issues
RoxygenNote: 7.2.3
VignetteBuilder: knitr
Encoding: UTF-8
Language: en-US
NeedsCompilation: no
Packaged: 2023-12-20 16:52:38 UTC; guill
Author: Guillaume Fabre [aut, cre] (<https://orcid.org/0000-0002-0124-9970>),
Maelstrom-research group [fnd]
Repository: CRAN
Date/Publication: 2023-12-21 16:30:04 UTC
62 changes: 62 additions & 0 deletions MD5
@@ -0,0 +1,62 @@
bf50bea9d1cc9129fb706d303b689576 *DESCRIPTION
0ab457c3e5eec24233535e292f660a49 *NAMESPACE
fad0853e7a8f13c7f4672c15ccfda2b3 *NEWS.md
41991d11778dd8e6ae86217d81b48ba3 *R/00-import-from-madshapR.R
53c13717949302761598bdcdb2bb32a8 *R/01-utils.R
3a365a666412a54f452bfdfde1b89902 *R/02-harmo_process_harmonization.R
5da1cd4bc06bbcec23a45962d41680c7 *R/03-harmonized_data_evaluate.R
0c4fbf469dab718bd61dd3ed475a4078 *R/04-harmonized_data_summarize.R
258b557a71aa07abc81d5fb7fe916492 *R/05-harmonized_data_visualize.R
1b54137d4ae82b40a266122de6d1201b *R/Rmonize-package.R
30473863dc61a6763ab0e3fc9a4414a1 *README.md
2836b2c5db38c9819380bd093383d89d *build/partial.rdb
ff70bbda47a299185435e70394cd80f0 *build/vignette.rds
c55d2d23abff5dca755002de2fd59a6e *data/Rmonize_DEMO.rda
c1b9c0d0eac32d433fc090109ef1cabe *inst/WORDLIST
cf0d88805ee1cb048eb11ff1bd88956a *inst/doc/a-Glossary-and-templates.R
651e86fc34c5a870430596950bac697e *inst/doc/a-Glossary-and-templates.Rmd
215a19ff1d1ac9ca2e399b2da3df585e *inst/doc/a-Glossary-and-templates.html
cf0d88805ee1cb048eb11ff1bd88956a *inst/doc/b-Data-processing-elements.R
dd565779c4eed1cb36ed7bfd60a720e5 *inst/doc/b-Data-processing-elements.Rmd
0c53ffa75047c9c01a885dbda5da3a19 *inst/doc/b-Data-processing-elements.html
a4b22c9cec9c2b67460ed40ef757f5af *inst/doc/c-Example-with-Rmonize_DEMO.R
8c0f211c628d343e5a8f7008cd30491d *inst/doc/c-Example-with-Rmonize_DEMO.Rmd
06c705ce33467d188b6221e8d77a61db *inst/doc/c-Example-with-Rmonize_DEMO.html
564666a2a16494478d205ec1f55f31c7 *man/Rmonize-package.Rd
7fda5bfa7a3d1183f1d8212f421320e9 *man/Rmonize_DEMO.Rd
c8f7a2640b2e456dcb742a5213a32bd9 *man/Rmonize_help.Rd
7727ce6a976e9679ed88af24c5d07828 *man/Rmonize_templates.Rd
d259ad678473874c4306ab68b0872d02 *man/as_data_dict.Rd
d88ef7e030e090e5df435a0e3672964f *man/as_data_proc_elem.Rd
b70d6c025e6d08337f6c39a9ab27abba *man/as_dataschema.Rd
f63f4dec4d027dc32e0fe3c521319737 *man/as_dataschema_mlstr.Rd
1ab90b88583c61e96914c75fdfb7053e *man/as_dataset.Rd
4bf4167df0feec0728f471ecfd97e42f *man/as_dossier.Rd
e157b56ffbab1279572c9b92dda3db94 *man/as_harmonized_dossier.Rd
5478e2862ba5ad3a8f0f858b6eb5957a *man/bookdown_open.Rd
553ddc568949babb77aca23a6ae9dd51 *man/data_dict_apply.Rd
8cc40b23ce5830c623bf02ac4dbdcc98 *man/data_dict_evaluate.Rd
945e834191fdae7aa32c07fe1797602e *man/data_dict_extract.Rd
10f1594c21afd01fa8dc8173ef1e5ae5 *man/dataschema_evaluate.Rd
c740bef912c858af98bf0a2abb5e37df *man/dataschema_extract.Rd
04b898c19d70dfc9cdfe4fbee31f8f75 *man/dataset_evaluate.Rd
74c32e28ebd360e8e8c6b1d04c2557f2 *man/dataset_summarize.Rd
5624151e4884ae4824b7baaf992fd42b *man/dataset_visualize.Rd
2fad6730b0858d3791518a4a9b862425 *man/dossier_create.Rd
fbb667e37efc99716e8d9289c81e28d6 *man/dossier_evaluate.Rd
847635a7d77f99a9a4c055ad20d6a6df *man/dossier_summarize.Rd
5a7d2982c964337268173b8fd137dd80 *man/figures/fig_readme.png
ce3dc6dab597db8cc0b7aa7891eed2b1 *man/harmo_process.Rd
f84312d6c9b24d1c763c178a7b76bb1c *man/harmonized_dossier_evaluate.Rd
fff6cc8f04ecadf214ed21de9a4cb620 *man/harmonized_dossier_summarize.Rd
807a1748f5e3c74e8826745bba8f7f67 *man/harmonized_dossier_visualize.Rd
0d9c7f49ca27b0350761f55ecf5bd6cd *man/is_data_proc_elem.Rd
400c8b84468856d30218f9421463ad65 *man/is_dataschema.Rd
eda8fb22078d83d58e0622cd34e4c470 *man/is_dataschema_mlstr.Rd
c6f092b26dcf223382e55fa8801cf3b7 *man/pooled_harmonized_dataset_create.Rd
0d4d180b940b2f09ff608528e539a9fa *man/reexports.Rd
e5455210434ca6b0035b1ddb29c135b6 *man/show_harmo_error.Rd
651e86fc34c5a870430596950bac697e *vignettes/a-Glossary-and-templates.Rmd
dd565779c4eed1cb36ed7bfd60a720e5 *vignettes/b-Data-processing-elements.Rmd
8c0f211c628d343e5a8f7008cd30491d *vignettes/c-Example-with-Rmonize_DEMO.Rmd
df77331e292754e7d823c16b5c16cf9e *vignettes/datatables.R
74 changes: 74 additions & 0 deletions NAMESPACE
@@ -0,0 +1,74 @@
# Generated by roxygen2: do not edit by hand

export(Rmonize_help)
export(Rmonize_templates)
export(as_data_dict)
export(as_data_proc_elem)
export(as_dataschema)
export(as_dataschema_mlstr)
export(as_dataset)
export(as_dossier)
export(as_harmonized_dossier)
export(bookdown_open)
export(data_dict_apply)
export(data_dict_evaluate)
export(data_dict_extract)
export(dataschema_evaluate)
export(dataschema_extract)
export(dataset_evaluate)
export(dataset_summarize)
export(dataset_visualize)
export(dossier_create)
export(dossier_evaluate)
export(dossier_summarize)
export(harmo_process)
export(harmonized_dossier_evaluate)
export(harmonized_dossier_summarize)
export(harmonized_dossier_visualize)
export(is_data_proc_elem)
export(is_dataschema)
export(is_dataschema_mlstr)
export(pooled_harmonized_dataset_create)
export(show_harmo_error)
import(dplyr)
import(fabR)
import(fs)
import(haven)
import(stringr)
import(tidyr)
importFrom(crayon,bold)
importFrom(crayon,green)
importFrom(madshapR,as_category)
importFrom(madshapR,as_data_dict)
importFrom(madshapR,as_data_dict_mlstr)
importFrom(madshapR,as_dataset)
importFrom(madshapR,as_dossier)
importFrom(madshapR,as_taxonomy)
importFrom(madshapR,as_valueType)
importFrom(madshapR,bookdown_open)
importFrom(madshapR,col_id)
importFrom(madshapR,data_dict_apply)
importFrom(madshapR,data_dict_evaluate)
importFrom(madshapR,data_dict_extract)
importFrom(madshapR,data_dict_filter)
importFrom(madshapR,data_extract)
importFrom(madshapR,dataset_evaluate)
importFrom(madshapR,dataset_summarize)
importFrom(madshapR,dataset_visualize)
importFrom(madshapR,dataset_zap_data_dict)
importFrom(madshapR,dossier_create)
importFrom(madshapR,dossier_evaluate)
importFrom(madshapR,dossier_summarize)
importFrom(madshapR,is_category)
importFrom(madshapR,is_data_dict)
importFrom(madshapR,is_data_dict_mlstr)
importFrom(madshapR,is_dataset)
importFrom(madshapR,is_dossier)
importFrom(madshapR,is_taxonomy)
importFrom(madshapR,valueType_adjust)
importFrom(rlang,":=")
importFrom(rlang,.data)
importFrom(rlang,is_error)
importFrom(rlang,is_warning)
importFrom(utils,browseURL)
importFrom(utils,capture.output)
152 changes: 152 additions & 0 deletions NEWS.md
@@ -0,0 +1,152 @@

# Rmonize 1.0.1

Bug corrections and enhancements after testing with real data.

## Bug fixes and improvements

### Improvement in handling pooled data

The functions `harmo_process()`, `pooled_harmonized_dataset_create()`,
`harmonized_dossier_create()`, `harmonized_dossier_evaluate()`,
`harmonized_dossier_summarize()`, and `harmonized_dossier_visualize()` share
the same parameter `harmonized_col_dataset`, which is the name of the
column (if it exists) that identifies the input dataset names. If this
column exists and is declared by the user, it is used across the
pipeline as a grouping/separating variable. By default, the name of each
dataset is used instead.
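
The fallback described above can be illustrated with a minimal base R sketch. It is not the package's implementation: `resolve_dataset_group()` and the `cohort` column are hypothetical names chosen for the example, and only the parameter name `harmonized_col_dataset` comes from the text.

```r
# Hypothetical helper (not part of Rmonize): for each dataset in a named
# list, return the label used to group/separate its rows.
resolve_dataset_group <- function(datasets, harmonized_col_dataset = NULL) {
  Map(function(dataset, dataset_name) {
    # Use the declared column only if it was given and exists in the dataset.
    has_col <- !is.null(harmonized_col_dataset) &&
      harmonized_col_dataset %in% names(dataset)
    if (has_col) {
      as.character(dataset[[harmonized_col_dataset]])
    } else {
      # Default: fall back to the name of the dataset itself.
      rep(dataset_name, nrow(dataset))
    }
  }, datasets, names(datasets))
}

# Toy input: two datasets, each with an assumed 'cohort' column.
datasets <- list(
  study_A = data.frame(id = 1:2, cohort = c("A1", "A2")),
  study_B = data.frame(id = 3:4, cohort = c("B1", "B1"))
)

by_column <- resolve_dataset_group(datasets, "cohort")  # column declared
by_name   <- resolve_dataset_group(datasets)            # default behaviour
```

With the column declared, rows are labelled by the values of `cohort`; without it, every row carries the name of its dataset.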

Rename `DEMO_file_harmo` to `Rmonize_DEMO` and update examples.

Remove the parameter `overwrite = TRUE` from the `xxx_visualize()` functions.

- <https://github.com/maelstrom-research/Rmonize/issues/38>

In visual reports, avoid confusing changes in color scheme.

- <https://github.com/maelstrom-research/Rmonize/issues/37>

Histograms for date variables display valid ranges.

- <https://github.com/maelstrom-research/Rmonize/issues/31>

In reports, % NA is now shown as a proportion.

- <https://github.com/maelstrom-research/Rmonize/issues/29>

`harmonized_dossier_visualize()` report shows variable labels in the
same language.

- <https://github.com/maelstrom-research/Rmonize/issues/28>

Put `id_creation` in the script and in the rule in the Data Processing
Elements (as in `direct_mapping`).

- <https://github.com/maelstrom-research/Rmonize/issues/27>

Allow special characters in the names of datasets and data dictionaries.

- <https://github.com/maelstrom-research/Rmonize/issues/23>

In visual reports, the bar plot only appears when there are multiple
missing value types, otherwise only the pie chart is shown.

- <https://github.com/maelstrom-research/Rmonize/issues/22>

Enhance `harmonized_dossier_visualize()` output.

- <https://github.com/maelstrom-research/Rmonize/issues/17>

Enhance `show_harmo_error()` output.

- <https://github.com/maelstrom-research/Rmonize/issues/5>

In reports, all of the percentages are now included under “Other values
(non categorical)”, which gives a single value.

- <https://github.com/maelstrom-research/Rmonize/issues/4>

Recoding with special characters is now possible.

# Rmonize 1.0.0

Functions to support rigorous retrospective data harmonization
processing, evaluation, and documentation across datasets in a dossier
based on Maelstrom Research guidelines. The package includes the core
functions to evaluate and format the main inputs that define the
harmonization process, apply specified processing rules to generate
harmonized data, diagnose processing errors, and summarize and evaluate
harmonized outputs.

This is still a work in progress, so please let us know if a function
you used before no longer works.

## Helper functions and objects

- `Rmonize_help()` Call the help center for full documentation
- `dowload_templates()` Open the template download page in the help
center
- `Rmonize_DEMO` Built-in material allowing the user to test the package
with demo data

## Assess and manipulate input files

- `as_data_proc_elem()` Validate and coerce any object as Data
Processing Elements
- `as_dataschema()`, `as_dataschema_mlstr()` Validate and coerce any
object as the DataSchema
- `as_harmonized_dossier()` Validate and coerce any object as a
harmonized dossier
- `dataschema_extract()` Extract and create the DataSchema from Data
Processing Elements

## Data processing

- `harmo_process()` Generate harmonized dataset(s) and annotated Data
Processing Elements. This function internally runs other functions,
which are:

- `harmo_parse_process_rule()`,
`harmo_process_add_variable()`,`harmo_process_case_when()`,
`harmo_process_direct_mapping()`,`harmo_process_id_creation()`,
`harmo_process_impossible()`,`harmo_process_merge_variable()`,
`harmo_process_operation()`,`harmo_process_other()`,
`harmo_process_paste()`,`harmo_process_recode()`,
`harmo_process_rename()`,`harmo_process_undetermined()`

- `pooled_harmonized_dataset_create()` Generate the pooled dataset from
harmonized datasets in a dossier
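
As a rough illustration of what pooling harmonized datasets involves, the base R sketch below stacks datasets that share the same columns and records the source of each row. It is not the implementation of `pooled_harmonized_dataset_create()`: the helper `pool_datasets()`, the column name `dataset_name`, and the toy data are assumptions made for the example.

```r
# Hypothetical helper (not part of Rmonize): stack a named list of data
# frames with identical columns and keep track of each row's origin.
pool_datasets <- function(datasets, col_dataset = "dataset_name") {
  labelled <- Map(function(dataset, dataset_name) {
    # Add a column holding the name of the dataset the rows come from.
    dataset[[col_dataset]] <- rep(dataset_name, nrow(dataset))
    dataset
  }, datasets, names(datasets))
  pooled <- do.call(rbind, labelled)
  rownames(pooled) <- NULL  # drop the "study_A.1"-style row names
  pooled
}

# Toy harmonized datasets sharing the same variables.
harmonized <- list(
  study_A = data.frame(id = c("a1", "a2"), age = c(34, 51)),
  study_B = data.frame(id = "b1", age = 47)
)

pooled <- pool_datasets(harmonized)
```

The sketch relies on every dataset having the same variables, which is what harmonization against a common DataSchema is meant to provide.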

## Evaluation of the harmonization process

- `show_harmo_error()` Generate a summary of the annotated Data
Processing Elements
- `data_proc_elem_evaluate()`,`dataschema_evaluate()`,
`harmonized_dossier_evaluate()`,`harmonized_dossier_summarize()`,
`harmonized_dossier_visualize()` Generate quality assessment reports
and summary statistics of inputs and outputs.

## Imports from the madshapR package

- Shape and prepare input (datasets and data dictionaries):

`as_data_dict()`,`is_data_dict()`,
`as_data_dict_mlstr()`,`is_data_dict_mlstr()`,
`as_dataset()`,`is_dataset()`, `as_dossier()`,`is_dossier()`,
`as_taxonomy()`

- Extract and manipulate information from input:

`data_extract()`,`data_dict_extract()`,
`data_dict_apply()`,`dataset_zap_data_dict()`,`dossier_create()`,
`valueType_adjust()`

- Assess input data:

`dataset_evaluate()`, `data_dict_evaluate()`,`dossier_evaluate()`,
`dataset_summarize()`,`dossier_summarize()`

- Visualize input data:

`bookdown_template()`,`bookdown_render()`,`bookdown_open()`,
`dataset_visualize()`
