Derotation of data
is necessary if brands, products or something similar (levels) are assessed and each respondent (case) rates a different selection of that levels. So each case has several responses. Derotation now means, that the data
is switched from case-level to responses-level.
Example: q1_1/q1_2
: On a scale from 1 to 10, how much do you like the following drinks?
2: cola
3: lemonade
4: beer
``data``
id | drink_1 | drink_2 | q1_1 | q1_2 | gender |
case1 | 1 | 3 | 2 | 8 | 1 |
case2 | 1 | 4 | 9 | 5 | 2 |
case3 | 2 | 4 | 6 | 10 | 1 |
derotated ``data``
drink | drink_levelled | q1 | gender | |
case1 | 1 | 1 | 2 | 1 |
case1 | 2 | 3 | 8 | 1 |
case2 | 1 | 1 | 9 | 2 |
case2 | 2 | 4 | 5 | 2 |
case3 | 1 | 2 | 6 | 1 |
case3 | 2 | 4 | 10 | 1 |
To identify which case rates which levels, some key-/level-variables are included in the data
, in this example drink_1
and drink_2
. Variables (for example gender
) that are not included to this loop can also be added.
The DataSet
method takes a few parameters:
levels
:dict
oflist
Contains all key-/level-variables and the name for the new levelled variable. All key-/level-variables must have the same
value_map
.>>> levels = {'drink': ['drink_1', 'drink_2']}
mapper
:list
ofdicts
oflist
Contains the looped questions and the new
column
name to which the looped questions will be combinded.>>> mapper = [{'q1': ['q1_1', 'q1_2']}]
other
:str
orlist
ofstr
Contains all variables that should be assumed to the derotated
data
, but which are not included in the loop.>>> other = 'gender'
unique_key
:str
Name of varibale that identifies cases in the initial
data
.>>> unique_key = 'id'
dropna
:bool
, defaultTrue
If a case rates less then the possible counts of levels, these responses will be droped.
>>> ds = dataset.derotate(levels = {'drink': ['drink_1', 'drink_2']}, ... mapper = [{'q1': ['q1_1', 'q1_2']}], ... other = 'gender', ... unique_key = 'id', ... dropna = True)
It is possible that also arrays
are looped. In this case a mapper can look like this:
>>> mapper = [{'q12_1': ['q12a[{q12a_1}].q12a_grid', 'q12b[{q12b_1}].q12b_grid', ... 'q12c[{q12c_1}].q12c_grid', 'q12d[{q12d_1}].q12d_grid']}, ... {'q12_2': ['q12a[{q12a_2}].q12a_grid', 'q12b[{q12b_2}].q12b_grid', ... 'q12c[{q12c_2}].q12c_grid', 'q12d[{q12d_2}].q12d_grid']}, ... {'q12_3': ['q12a[{q12a_3}].q12a_grid', 'q12b[{q12b_3}].q12b_grid', ... 'q12c[{q12c_3}].q12c_grid', 'q12d[{q12d_3}].q12d_grid']}, ... {'q12_4': ['q12a[{q12a_4}].q12a_grid', 'q12b[{q12b_4}].q12b_grid', ... 'q12c[{q12c_4}].q12c_grid', 'q12d[{q12d_4}].q12d_grid']}, ... {'q12_5': ['q12a[{q12a_5}].q12a_grid', 'q12b[{q12b_5}].q12b_grid', ... 'q12c[{q12c_5}].q12c_grid', 'q12d[{q12d_5}].q12d_grid']}, ... {'q12_6': ['q12a[{q12a_6}].q12a_grid', 'q12b[{q12b_6}].q12b_grid', ... 'q12c[{q12c_6}].q12c_grid', 'q12d[{q12d_6}].q12d_grid']}, ... {'q12_7': ['q12a[{q12a_7}].q12a_grid', 'q12b[{q12b_7}].q12b_grid', ... 'q12c[{q12c_7}].q12c_grid', 'q12d[{q12d_7}].q12d_grid']}, ... {'q12_8': ['q12a[{q12a_8}].q12a_grid', 'q12b[{q12b_8}].q12b_grid', ... 'q12c[{q12c_8}].q12c_grid', 'q12d[{q12d_8}].q12d_grid']}, ... {'q12_9': ['q12a[{q12a_9}].q12a_grid', 'q12b[{q12b_9}].q12b_grid', ... 'q12c[{q12c_9}].q12c_grid', 'q12d[{q12d_9}].q12d_grid']}, ... {'q12_10': ['q12a[{q12a_10}].q12a_grid', 'q12b[{q12b_10}].q12b_grid', ... 'q12c[{q12c_10}].q12c_grid', 'q12d[{q12d_10}].q12d_grid']}, ... {'q12_11': ['q12a[{q12a_11}].q12a_grid', 'q12b[{q12b_11}].q12b_grid', ... 'q12c[{q12c_11}].q12c_grid', 'q12d[{q12d_11}].q12d_grid']}, ... {'q12_12': ['q12a[{q12a_12}].q12a_grid', 'q12b[{q12b_12}].q12b_grid', ... 'q12c[{q12c_12}].q12c_grid', 'q12d[{q12d_12}].q12d_grid']}, ... {'q12_13': ['q12a[{q12a_13}].q12a_grid', 'q12b[{q12b_13}].q12b_grid', ... 'q12c[{q12c_13}].q12c_grid', 'q12d[{q12d_13}].q12d_grid']}]]
Can be also writen like this:
>>> for y in frange('1-13'): ... q_group = [] ... for x in ['a', 'b', 'c', 'd']: ... var = 'q12{}'.format(x) ... var_grid = var + '[{' + var + '_{}'.format(y) + '}].' + var + '_grid' ... q_group.append(var_grid) ... mapper.append({'q12{}'.format(y): q_group})
So the derotated dataset
will lose its meta
information about the mask
and only the columns
q12_1
to q12_13
will be added. To receive back the mask
structure, use the method dataset.to_array()
:
>>> variables = [{'q12_1': u'label 1'}, ... {'q12_2': u'label 2'}, ... {'q12_3': u'label 3'}, ... {'q12_4': u'label 4'}, ... {'q12_5': u'label 5'}, ... {'q12_6': u'label 6'}, ... {'q12_7': u'label 7'}, ... {'q12_8': u'label 8'}, ... {'q12_9': u'label 9'}, ... {'q12_10': u'label 10'}, ... {'q12_11': u'label 11'}, ... {'q12_12': u'label 12'}, ... {'q12_13': u'label 13'}] >>> ds.to_array('qTP', variables, 'Var_name')
variables
can also be a list of variable-names, then the mask-items
will be named by its belonging columns
.
arrays
included in other
will keep their meta
structure.