View key clean-up #67
Comments
The more I think about this the more I wonder if the meaning of
One option would be instead of:
x-position | method | relationship | relative to | weight | shortname
We could move to:
x-position | method/s | condition/s | relative to | weight | shortname
Since the third part of the view key actually describes the conditions placed on
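As a rough illustration of the six-part layout listed above, a key can be split on the pipe character into named parts. This is only a sketch: the `ViewKey` type, its field names and `split_view_key` are hypothetical names derived from the part labels in this comment, not an existing API.

```python
from typing import NamedTuple


class ViewKey(NamedTuple):
    # Field names are assumptions based on the proposed part labels.
    x_position: str
    methods: str
    conditions: str
    relative_to: str
    weight: str
    shortname: str


def split_view_key(key: str) -> ViewKey:
    """Split a pipe-delimited view key into its six parts."""
    parts = key.split("|")
    if len(parts) != 6:
        raise ValueError(f"expected 6 pipe-delimited parts, got {len(parts)}: {key!r}")
    return ViewKey(*parts)


key = split_view_key("x|f|x[{1,2}]:|||clogic")
print(key.methods, key.conditions, key.shortname)
```

Empty parts (such as an unnamed weight) come back as empty strings, so the position of each part is always preserved.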
Examples of frequency-only keys:
############################ Counts
x|f|:|||counts
1 2 3 4 5
1 1 7 3 2 7
2 4 3 5 4 6
3 6 2 4 6 3
############################ Column base
x|f|x:|||cbase
1 2 3 4 5
cbase 11 12 12 12 16
############################ Column base percentages
x|f|:|y||counts
a b c d e
x 9 58 25 17 44
y 36 25 42 33 38
z 55 17 33 50 19
############################ Row base
x|f|:y|||rbase
rbase
1 20
2 22
3 21
############################ Row base percentages
x|f|:|x||counts
a b c d e
x 5 35 15 10 35
y 18 14 23 18 27
z 29 10 19 29 14
############################ Intersection base
x|f|x:y|||base
rbase
cbase 63
############################ Intersection base percentage
x|f|:|xy||counts
a b c d e
x 2 11 5 3 11
y 6 5 8 6 10
z 10 3 6 10 5
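The base and percentage tables above can be reproduced from the counts table with a few lines of arithmetic. The sketch below is illustrative only: the variable names and the `pct` helper are hypothetical, and rounding halves up to whole percentages is an assumption that happens to match the figures shown.

```python
# Counts from the x|f|:|||counts example (rows are x codes, columns are y codes).
counts = [
    [1, 7, 3, 2, 7],
    [4, 3, 5, 4, 6],
    [6, 2, 4, 6, 3],
]


def pct(n, base):
    """Whole-number percentage of n over base, rounding halves up."""
    return (200 * n + base) // (2 * base)


cbase = [sum(col) for col in zip(*counts)]   # x|f|x:|||cbase
rbase = [sum(row) for row in counts]         # x|f|:y|||rbase
total = sum(rbase)                           # x|f|x:y|||base

# x|f|:|y||counts: each cell relative to its column base
col_pct = [[pct(n, b) for n, b in zip(row, cbase)] for row in counts]
# x|f|:|x||counts: each cell relative to its row base
row_pct = [[pct(n, b) for n in row] for row, b in zip(counts, rbase)]
# x|f|:|xy||counts: each cell relative to the intersection base
tot_pct = [[pct(n, total) for n in row] for row in counts]

print(cbase, rbase, total)
```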
############################ Unfiltered column base percentages
x|f|:|y@||counts
a b c d e
x 9 58 25 17 44
y 36 25 42 33 38
z 55 17 33 50 19
############################ Unfiltered y base percentages
x|f|:|@y||counts
a b c d e
x 9 58 25 17 44
y 36 25 42 33 38
z 55 17 33 50 19
############################ Unfiltered row base percentages
x|f|:|x@||counts
a b c d e
x 5 35 15 10 35
y 18 14 23 18 27
z 29 10 19 29 14
############################ Unfiltered x base percentages
x|f|:|@x||counts
a b c d e
x 5 35 15 10 35
y 18 14 23 18 27
z 29 10 19 29 14
############################ Unfiltered total N percentages
x|f|:|N||counts
a b c d e
x 1 7 3 2 7
y 4 3 5 4 6
z 6 2 4 6 3
############################ Column logic
x|f|x[{1,2}]:|||clogic
1 2 3 4 5
clogic 5 10 8 6 13
############################ Column count logic
x|f|x[{1,2}(1)]:|||cclogic
1 2 3 4 5
cclogic 3 4 2 3 4
############################ Column arithmetic logic
x|f.math:f|x[{1,2}-{3}]:|||calogic
1 2 3 4 5
calogic -1 8 4 0 10
############################ Row logic
x|f|:y[{3,4}]|||rlogic
rlogic
1 5
2 9
3 10
############################ Row count logic
x|f|:y[{3,4}(1)]|||rclogic
rclogic
1 3
2 5
3 4
############################ Row arithmetic logic
x|f:f.math|:y[{3,4}-{5}]|||ralogic
ralogic
1 -2
2 3
3 7
############################ Intersection logic
x|f|x[{1,2}]:y[{3,4}]|||base
rlogic
clogic 14
############################ Block logic rows
x|f|x[{1,2}],x[{2,3}]:|||clogic
1 2 3 4 5
clogic1 5 10 8 6 13
clogic2 10 5 9 10 9
############################ Block logic columns
x|f|:y[{3,4}],y[{4,5}]|||rlogic
rlogic1 rlogic2
1 5 9
2 9 10
3 10 9
############################ Intersection block logic
x|f|x[{1,2}],x[{2,3}]:y[{3,4}],y[{4,5}]|||base
rlogic1 rlogic2
clogic1 14 19
clogic2 19 19
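The logic, arithmetic and block examples above are consistent with simple sums over the counts table. The sketch below reproduces them under the assumption that the data is single-response, so a net over several codes is just the sum of their counts; the function names are hypothetical. The count-logic examples (`cclogic`, `rclogic`) cannot be derived from aggregated counts and are not covered.

```python
# Counts keyed by x code; list positions correspond to y codes 1..5.
counts = {
    1: [1, 7, 3, 2, 7],
    2: [4, 3, 5, 4, 6],
    3: [6, 2, 4, 6, 3],
}


def x_net(codes):
    """Net the given x codes: sum their count rows, per y column."""
    return [sum(col) for col in zip(*(counts[c] for c in codes))]


def y_net(codes):
    """Net the given y codes: sum their count columns, per x row."""
    return [sum(row[c - 1] for c in codes) for row in counts.values()]


def xy_net(x_codes, y_codes):
    """Cases falling in both the x net and the y net."""
    return sum(counts[x][y - 1] for x in x_codes for y in y_codes)


clogic = x_net({1, 2})                                         # x[{1,2}]:
calogic = [a - b for a, b in zip(x_net({1, 2}), x_net({3}))]   # x[{1,2}-{3}]:
rlogic = y_net({3, 4})                                         # :y[{3,4}]
ralogic = [a - b for a, b in zip(y_net({3, 4}), y_net({5}))]   # :y[{3,4}-{5}]

# x[{1,2}],x[{2,3}]:y[{3,4}],y[{4,5}] as a grid of intersections
block = [[xy_net(x, y) for y in ({3, 4}, {4, 5})] for x in ({1, 2}, {2, 3})]
print(clogic, rlogic, block)
```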
############################ Effective column base
x|f.eff:f|x:||weight|ecbase
1 2 3 4 5
ecbase 11 12 12 12 16
############################ Effective row base
x|f:f.eff|:y||weight|ernet
ernet
1 5
2 9
3 10
############################ Effective intersection base
x|f.eff|x:y||weight|base
erbase
ecbase 63
These examples include something we're not planning to support for a while yet:
############################ Unfiltered column base percentages
x|f|:|y@||counts
############################ Unfiltered y base percentages
x|f|:|@y||counts
############################ Unfiltered row base percentages
x|f|:|x@||counts
############################ Unfiltered x base percentages
x|f|:|@x||counts
############################ Unfiltered total N percentages
x|f|:|N||counts
In these cases:
Nested notation
Following is an example of notation describing column logic on the 2nd
Nested notation also requires the presence of
As with the absence of
x|f|>x1[{1,2}]:|||cnlogic
x0 x1 1 2 3 4 5
1 clogic 2 4 9 8 3
2 clogic 3 2 1 2 4
3 clogic 6 7 3 4 7
4 clogic 3 3 5 1 5
As with the presence of
x|f|x0>x1[{1,2}]:|||cnlogic
x0 x1 1 2 3 4 5
cbase clogic 2 4 9 8 3
Other than the explicit
x|f|>x1[{1,2}]:y|||cnlogic
x0 x1 rbase
1 clogic 26
2 clogic 12
3 clogic 27
4 clogic 17
Nested notation also applies to relative notation, let's assume the
x|f|>x1[{1,2}]:|y0||cnlogic
y0 1 1 2 2
y1 1 2 1 2
x0 x1
1 clogic 34 42 74 45
2 clogic 22 75 63 23
3 clogic 58 87 22 36
4 clogic 63 63 15 17
In any case, eschewing explicit level notation will always be interpreted as the last level. So if the
This will be resolved by #290.
There are some problems with the current view key notation that need to be cleaned up.
method colon-delimiting
The method-part of the view key needs to be colon-delimitable so that it can describe the effect of different methods acting on x and y. Where only one method is named and both x and y are present, the same method should be assumed to be working on both.
The general rule should be that `method_a:method_b|x:y` means the intersection of `method_a(x)` by `method_b(y)`. A more concrete example: `frequency:mean|x:y` means the intersection of `frequency(x)` by `mean(y)`.
By extension, though, this renders what is currently `frequency|x:y` incorrect as the key for a column base row, because this should mean the intersection of `frequency(x)` by `frequency(y)`, or in plain speak where the row and column bases intersect (e.g. the number of cases in `x` and `y`).
As a consequence, the correct key for a column base row should be simply `frequency|x` and for a row base column `frequency|y`. Incidentally, this is perfectly in keeping with the fundamental meaning of `frequency|` as basic counts, since the mention of either `x` or `y` is an implied collapse of all their values, respectively.
More examples (assuming `x` and `y` each have 3 possible values):... and so on.
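The colon-delimiting rule for the method-part can be sketched as follows. This is a minimal illustration, not an implementation from the project: `split_methods` and `describe` are hypothetical names, and the wording of the description simply mirrors the phrasing used above.

```python
def split_methods(method_part):
    """Return (x_method, y_method) for a method-part such as 'frequency:mean'.

    When only one method is named, it is assumed to apply to both x and y.
    """
    if ":" not in method_part:
        return method_part, method_part
    x_method, y_method = method_part.split(":", 1)
    if not x_method or not y_method or ":" in y_method:
        raise ValueError(f"malformed method-part: {method_part!r}")
    return x_method, y_method


def describe(method_part, x="x", y="y"):
    """Spell out what a method-part means when both x and y are present."""
    x_method, y_method = split_methods(method_part)
    return f"intersection of {x_method}({x}) by {y_method}({y})"


print(describe("frequency:mean"))
print(describe("frequency"))
```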
Another important change that should be made is to use the conventional curly brace for set notation, so logic descriptors should be written as `x[{1,2}]:` instead of `x[(1,2)]:`. Currently the curly brace is used for answer count, but the two uses should be swapped. In this way one answer from codes 1 or 2 would be written as `x[{1,2}(1)]:`.
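Because the proposal is a straight swap of the two bracket types, converting an existing descriptor could be as simple as a character translation. The sketch below assumes the current answer-count form is written like `x[(1,2){1}]:`, which is inferred from the description above rather than shown in it; `modernize_logic` is a hypothetical name.

```python
# Swap round and curly brackets in one pass.
_SWAP = str.maketrans("(){}", "{}()")


def modernize_logic(descriptor):
    """Convert a logic descriptor from the current bracket usage to the proposed one."""
    return descriptor.translate(_SWAP)


print(modernize_logic("x[(1,2)]:"))
# The input form below is an assumption about how answer counts are currently written.
print(modernize_logic("x[(1,2){1}]:"))
```

Since the swap is symmetric, applying the function twice returns the original string, which makes it easy to check a migration for accidental changes.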
Due to the required delimitable nature of the method-part of the view key, it may be prudent to put in place some truncation rules that method names must adhere to. For example, instead of `frequency` perhaps simply `f` will suffice, especially given that it's so common. For other methods, a 6-character limit per sub/method-part (to allow for needed abbreviations like `stddev`, `stderr` and so on) would help condense the overall key length and improve readability.
To avoid ambiguity, what is currently the relation part of the view key must always include a colon.
`|:|` means no conditions placed on either x or y
`|x:|` collapsed x, no conditions placed on y
`|:y|` collapsed y, no conditions placed on x
`|x:y|` collapsed x and y
The new convention means you should never see something like `|y:x|` because the left-hand side will always describe x and the right-hand side will always describe y.
In accordance with all of these proposed changes, the above view keys would become:
However, all of these examples use the same method on `x` and `y`, which will often not be the case. Where a different method is used on each, both methods must be named and must be colon-delimited.
In conjunction with the need for descriptive stats to be named using sub-methods, this leads to:
Including the change for set notation, block nets also need to appear in discrete `x`/`y`-blocks delimited with a comma, meaning they will change from `|x[(1,2),(3,4),(5,6):y` to `|x[{1,2}],x[{3,4}],x[{5,6}]:`. This both corrects for ambiguity compared to complex logic and provides for a comma-delimited relationship between the multiple methods and x/y.
Given the likely eventuality of other block methods, the conventions should be similarly lazy, where `f|x[{1,2}],x[{3,4}],x[{5,6}]:` is effectively shorthand for `f,f,f:f|x[{1,2}],x[{3,4}],x[{5,6}]:`.
This is more relevant when imagining the needs of a block of descriptive stats, in which case `d.mean,d.stddev,d.stderr:f|x:` is more meaningful. In any case, parts that are not mentioned explicitly imply uniform application, so as to prevent the need for something like `d.mean,d.stddev,d.stderr:f|x,x,x:`.
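The lazy block convention described above amounts to broadcasting whichever side is mentioned once across the side that is mentioned several times. The following sketch expands the shorthand into its explicit form for the x side only; `split_top_level` and `expand_blocks` are hypothetical helpers, and the y side is passed through unchanged for brevity.

```python
def split_top_level(text, sep=","):
    """Split on sep, ignoring separators nested inside [], {} or ()."""
    parts, current, depth = [], [], 0
    for ch in text:
        if ch in "[{(":
            depth += 1
        elif ch in "]})":
            depth -= 1
        if ch == sep and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def expand_blocks(method_part, relation_part):
    """Expand lazy block shorthand on the x side into its explicit form."""
    x_methods, _, y_methods = method_part.partition(":")
    y_methods = y_methods or x_methods  # one named method applies to both x and y
    x_rel, _, y_rel = relation_part.partition(":")
    methods = split_top_level(x_methods)
    blocks = split_top_level(x_rel)
    size = max(len(methods), len(blocks))
    if len(methods) == 1:
        methods = methods * size  # same method for every block
    if len(blocks) == 1:
        blocks = blocks * size    # same x for every method
    if len(methods) != len(blocks):
        raise ValueError("methods and x-blocks cannot be paired up")
    return ",".join(methods) + ":" + y_methods + "|" + ",".join(blocks) + ":" + y_rel


print(expand_blocks("f", "x[{1,2}],x[{3,4}],x[{5,6}]:"))
print(expand_blocks("d.mean,d.stddev,d.stderr:f", "x:"))
```

Splitting only at the top level matters here because commas also appear inside the set notation, for example in `{1,2}`.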
effective base
Effective base view keys should indicate a sub-method of frequency and must name a weight-part. What is currently `x|frequency|x:y|||ebase` should become `x|f.eff:f|x:||weight|ecbase`. Similarly, an effective row base would be `x|f:f.eff|:y||weight|erbase`.
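Several of the rules proposed in this issue are mechanical enough to check automatically. The sketch below tests a key against three of them: the 6-character limit per sub/method-part, the mandatory colon in the relation part, and the weight-part required by `f.eff`. The `check_view_key` function and its messages are hypothetical, and the six-part layout is assumed.

```python
import re


def check_view_key(key):
    """Return a list of problems found in a view key (empty if none)."""
    parts = key.split("|")
    if len(parts) != 6:
        return [f"expected 6 pipe-delimited parts, got {len(parts)}"]
    _, methods, relation, _, weight, _ = parts

    problems = []
    # Each name between ':', ',' and '.' counts as one sub/method-part.
    for name in re.split(r"[:,.]", methods):
        if len(name) > 6:
            problems.append(f"method name {name!r} is longer than 6 characters")
    if ":" not in relation:
        problems.append("relation part must include a colon")
    if "f.eff" in re.split(r"[:,]", methods) and not weight:
        problems.append("f.eff requires a weight part")
    return problems


print(check_view_key("x|f.eff:f|x:||weight|ecbase"))
print(check_view_key("x|frequency|x:y|||ebase"))
```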