Speed up the process of simple analysis using a data-set template
R
man
.Rbuildignore
.gitignore
.~lock.template.xlsx#
00-test.R
DESCRIPTION
NAMESPACE
README.md
autostudy.Rproj
template.xlsx
~$template.xlsx

README.md

autostudy - clean data for swift analysis in R

The autostudy R package validates files provided in the expected format and helps with their analysis.

The idea

As an epidemiologist in a French public hospital, part of my work is to help physicians with their studies. This can be quite time-consuming, as the data is almost always in the worst possible shape and data management is a pain. On the other hand, the analyses themselves are often simple.

This package is based on the assumption that you asked for the data to be handed to you in a predefined format. The package automatically validates the data and outputs logs that help the clinician put the data in the right shape.

A template for data gathering and an explanatory document have to be provided to the clinician.

Some elements

The convert_to_ functions are wrappers around common data management (dm) functions. They aim to maximise the amount of data imported and to remove some unauthorized values.

The converted df is then checked against the original one to find the newly introduced NAs.
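
As a rough illustration of the idea above, here is a minimal sketch in base R. The names `convert_to_numeric` and `find_new_nas` are hypothetical and are not taken from the package; the decimal-comma handling is an assumption about typical French spreadsheets.

```r
# Hypothetical wrapper: the package's real convert_to_* functions may differ.
convert_to_numeric <- function(x) {
  # Assumption: values may use a decimal comma and carry stray spaces
  x <- gsub(",", ".", trimws(as.character(x)), fixed = TRUE)
  # Values that cannot be parsed become NA; the warning is silenced
  # because new NAs are tracked explicitly below
  suppressWarnings(as.numeric(x))
}

# Positions where the conversion created an NA that was not in the raw data
find_new_nas <- function(original, converted) {
  which(!is.na(original) & is.na(converted))
}

raw_age <- c("42", "37,5", "unknown", NA, " 61 ")
age <- convert_to_numeric(raw_age)
new_na_rows <- find_new_nas(raw_age, age)

print(age)
print(new_na_rows)
```

Only the third value is reported: the fourth was already missing in the raw data, so it is not counted as a conversion problem.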

errors:

  • a df with the columns "table", "column", "line", "error_type"
  • try to make it as clear and concise as possible
    • maybe output whole line with errors and outline the errors themselves
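
The error log described above could look something like the following sketch. The helpers `new_error` and `log_new_nas`, the table name "patients" and the error message are all hypothetical; only the four column names come from the list above.

```r
# Hypothetical helper: build rows of the error log with the four columns
# named above. Scalars are recycled across all offending lines.
new_error <- function(table, column, line, error_type) {
  data.frame(table = table, column = column, line = line,
             error_type = error_type, stringsAsFactors = FALSE)
}

# Hypothetical helper: log every value that became NA during conversion
log_new_nas <- function(original, converted, table, column) {
  bad <- which(!is.na(original) & is.na(converted))
  if (length(bad) == 0) {
    # Empty log that keeps the same columns
    return(new_error(character(0), character(0), integer(0), character(0)))
  }
  new_error(table, column, bad, "value could not be converted")
}

raw <- data.frame(age = c("42", "abc", "37"), stringsAsFactors = FALSE)
converted <- suppressWarnings(as.numeric(raw$age))
errors <- log_new_nas(raw$age, converted, table = "patients", column = "age")

print(errors)
```

Keeping one row per offending cell makes it easy to bind the logs of several columns together with `rbind()` and hand the result back to the clinician.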

To Do

  • Make some faulty files to add the error handling
  • in the template:
    • impose the data types
    • same for the date format; separate the info into categories and date_format
    • see if I can protect the rows of the var dict
    • add a "pretty name variable" (for plot labels and tables)
  • make a mapping table of "var_type" vs "r type" (many to one)
  • just say where it went wrong in the data import
  • afterwards, look for outliers / produce a report for validation (descriptor)
  • document functions
  • When the data format is finished, provide the template and the explanatory document
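
The many-to-one mapping of "var_type" to "r type" mentioned in the list could be sketched as a named character vector. The var_type labels below are placeholders invented for illustration; the template's actual labels are not defined yet.

```r
# Hypothetical mapping: several template var_type labels can share one R type
type_map <- c(
  integer    = "integer",
  decimal    = "numeric",
  percentage = "numeric",
  category   = "factor",
  yes_no     = "logical",
  date       = "Date",
  free_text  = "character"
)

# Look up the R type of each var_type, failing loudly on unknown labels
r_type_for <- function(var_type) {
  unknown <- setdiff(var_type, names(type_map))
  if (length(unknown) > 0) {
    stop("Unknown var_type: ", paste(unknown, collapse = ", "))
  }
  unname(type_map[var_type])
}

r_types <- r_type_for(c("decimal", "percentage", "date"))
print(r_types)
```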