
Meta data

jeromyanglim edited this page Oct 19, 2014 · 5 revisions

The file

I use the file data/meta.xls to store metadata for analysis projects. Spreadsheets make it easy to enter and edit this kind of data. I use the xls file extension rather than xlsx because the import function in the gdata package does not throw an error when the xls file is open, which is convenient when reimporting metadata while the spreadsheet is still open for viewing.
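As a rough sketch of the import step, the helper below wraps read.xls from gdata. The function name import_meta and the sheet name in the commented call are assumptions made for illustration, not part of the original setup, and gdata itself needs a working Perl installation.

```r
# Hypothetical helper: import one sheet of the metadata workbook.
# Assumes the gdata package (and the Perl interpreter it relies on) is installed.
import_meta <- function(sheet, path = file.path("data", "meta.xls")) {
  if (!requireNamespace("gdata", quietly = TRUE)) {
    stop("The gdata package is required to read ", path)
  }
  if (!file.exists(path)) {
    stop("Metadata file not found: ", path)
  }
  # stringsAsFactors = FALSE keeps ids and item text as character vectors
  gdata::read.xls(path, sheet = sheet, stringsAsFactors = FALSE)
}

# Example call; "personality" is an illustrative sheet name
# meta.personality <- import_meta("personality")
```

Wrapping the call in a function makes it easy to rerun the import after each edit to the spreadsheet.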

Psychological scales

I often deal with psychological tests (e.g., personality, well-being, clinical measures). Such tests include multiple items, where each item is typically measured on a common scale (e.g., 1 to 5, strongly disagree to strongly agree). This metadata can then be combined with scoring functions such as scoreItems in the psych package. Variables in such sheets include:

  • id: name of the variable in R (one per item; typically named foo1, foo2, ... fooK where foo is a short abbreviation of the test name and the number indicates the item number)
  • itemnumber: The item number, typically 1, 2, and so on. This is important for sorting.
  • reverse: Many tests have reversed items, which are rescored as (max + min) - score. I indicate reversal status using 1 = not reversed and -1 = reversed.
  • text: The item text
  • scale: Scoring of tests varies. In the simplest form, there is a one-to-one mapping between items and scales, which can be recorded in a single variable called scale. Other scenarios require different data structures: (a) a many-to-many mapping of items to scales requires one column per scale, with indicators for item inclusion; (b) two levels of one-to-one item-to-scale mapping can be represented by two columns. The latter covers a simple set of scales plus a total score, as well as hierarchical tests (e.g., 30 facets and 5 factors of personality, where each factor has six facets).
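To make the columns above concrete, here is a minimal base R sketch of how such a sheet might drive scoring in the simple one-to-one case. The test name foo, the item responses, and the 1 to 5 response range are made-up values for illustration. Scales are scored as item means here, which is an assumption. The same metadata could instead be converted into a keys matrix for scoreItems in the psych package.

```r
# Illustrative item metadata: one row per item (all values are made up)
meta.foo <- data.frame(
  id         = c("foo1", "foo2", "foo3", "foo4"),
  itemnumber = 1:4,
  reverse    = c(1, -1, 1, -1),
  scale      = c("a", "a", "b", "b"),
  stringsAsFactors = FALSE
)

# Illustrative responses on a 1 to 5 scale: three respondents, four items
responses <- data.frame(
  foo1 = c(1, 4, 5),
  foo2 = c(5, 2, 1),
  foo3 = c(2, 3, 4),
  foo4 = c(4, 3, 1)
)

item_min <- 1
item_max <- 5

# Reverse the flagged items using (max + min) - score
scored <- responses
reversed_ids <- meta.foo$id[meta.foo$reverse == -1]
scored[reversed_ids] <- (item_max + item_min) - scored[reversed_ids]

# Score each scale as the mean of its items
scale_names <- unique(meta.foo$scale)
scale_scores <- sapply(scale_names, function(s) {
  ids <- meta.foo$id[meta.foo$scale == s]
  rowMeans(scored[ids])
})
scale_scores
```

Because the reversal flags and scale membership live in the metadata, changing how a test is scored only requires editing the spreadsheet, not the scoring code.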

Variable Labels (meta.variablelabels)

Variable names often need to be replaced with labels in some form of tabular output. Thus, meta.variablelabels is a place to store these replacement rules: variable is the variable name, and label is the text that replaces it in the tabular output.
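One way such replacement rules might be applied is sketched below in base R. It treats meta.variablelabels as a data frame. The helper name apply_labels and all of the example values are assumptions for illustration.

```r
# Illustrative label metadata (made-up values)
meta.variablelabels <- data.frame(
  variable = c("foo_a", "foo_b", "age"),
  label    = c("Foo: Scale A", "Foo: Scale B", "Age (years)"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: swap variable names for labels,
# leaving names with no matching rule unchanged
apply_labels <- function(x, rules = meta.variablelabels) {
  idx <- match(x, rules$variable)
  ifelse(is.na(idx), x, rules$label[idx])
}

# Example: relabel the row names of a small summary table
tab <- data.frame(mean = c(3.2, 2.8, 4.1),
                  row.names = c("foo_a", "gender", "age"))
rownames(tab) <- apply_labels(rownames(tab))
tab
```

Names without a rule (gender in this example) pass through untouched, so the table still prints even when the metadata is incomplete.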
