Demographics: Using numbers instead of brackets? #177
Replies: 6 comments 12 replies
-
Great overview! I think this makes a lot of sense, and I especially like the idea of limiting income to thousands. The only thing is that if we reduce the number of "Company Size" options we also lose some granularity, but since we're gaining granularity for the other three questions maybe that's fine.
We could also use a slider that moves in $5,000 or $10,000 increments?
-
What is the wanted/needed precision? I think the survey should not ask personal questions at a higher level of specificity than absolutely needed. With brackets I feel there is some anonymity in their vagueness, and I'm a little more relaxed when answering. Brackets are also faster to answer: I don't need to work out the right precision for a semi-anonymous answer, and I probably know the ballpark easily.
For example, in Finland we do not talk about our salaries on a yearly level in USD. We generally talk about our pay on a monthly level and in EUR. So to answer a survey like this I already need to do some math in my head. Brackets help me know the useful precision and may speed up the math. So, no, I do not have a number for my yearly salary at the top of my mind.
Another thing to consider about the number input type is accessibility. Creating truly accessible forms is surprisingly difficult, and input type number might be problematic. Here are some pointers: https://technology.blog.gov.uk/2020/02/24/why-the-gov-uk-design-system-team-changed-the-input-type-for-numbers/
I might stop answering the survey if I'm asked too precise a private question, as it feels more insecure and targeted. Why does the survey maker need my exact salary? The reasoning behind the precision requirement (or lax requirements) would need to be written into the survey in a convincing way. This text would further slow down filling in the form. And even if the intentions behind asking precise private questions are pure, leaks happen, so criminals might get their hands on personal information through a breach or similar.
Regarding form-filling speed, I feel that manual typing/clicking is not the only part we should optimise. Privacy, mental burden and ease of answering should also be considered in addition to mechanical speed.
Thank you for discussing these things in public! I was motivated to share my thoughts on the matter.
-
About the company size: For very small companies, it makes a difference whether you ask for the number of employees or the total headcount. "Just me!" in your proposal probably refers to a freelancer? I'd suggest starting the next bracket with 1 and not with 2. In my experience it is not all that uncommon to be the only employee but have 1 to 3 bosses, for example. Looking forward to the survey! :)
-
Quick informal poll: https://twitter.com/SachaGreif/status/1688447006459805696
-
Since the consensus is that we're keeping brackets, should we try to make the brackets more granular? For reference, the current ones are (in thousands of USD):
Maybe we could do this instead?
The reason for the change would be that in my early attempts to generate box plot charts off the ranges (by using the midpoint of each range as its value) I got quite a few charts where the various percentiles overlap, which I assume is because there are too many respondents concentrated in too few datapoints?
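To illustrate the overlap described above, here is a minimal Python sketch. The bracket boundaries and respondent counts are made up for illustration and are not survey data. The point is only that once every respondent in a bracket is replaced by that bracket's midpoint, several percentiles can land on the same value.

```python
from statistics import quantiles

# Hypothetical brackets (in thousands of USD) with hypothetical respondent counts.
# These numbers are assumptions for illustration, not actual survey results.
brackets = [
    ((0, 10), 5),
    ((10, 30), 10),
    ((30, 50), 15),
    ((50, 100), 50),
    ((100, 200), 20),
]

# Replace each respondent by the midpoint of the bracket they selected.
values = []
for (low, high), count in brackets:
    midpoint = (low + high) / 2
    values.extend([midpoint] * count)

# Quartiles of the midpoint data, as a box plot would use them.
q1, median, q3 = quantiles(values, n=4)

# Half of the respondents sit on a single midpoint, so the median
# and the upper quartile fall on the same value.
print(f"Q1={q1}, median={median}, Q3={q3}")
print("median and Q3 overlap:", median == q3)
```

With narrower brackets in the crowded range, the same respondents would be spread over more distinct midpoints, which should make this kind of collapse less likely.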
-
Another UI idea I just had, to adapt to the user's privacy comfort level: [image]
-
I posted this as a blog post, but I’m also including a modified version here for discussion.
This spawned out of the discussion about changing "Salary" to "Income" as @ShaineRosewel proposed using a number input instead of brackets for salary.
There are actually four demographics questions in State of X surveys where the answer is essentially a number, yet we ask respondents to select a bracket:
age, years of experience, company size, and income.
The arguments for brackets are:
The arguments for numerical input are:
Which one is faster?
We can actually calculate this!
Average reading speed for non-fiction is around 240 wpm (= 250ms/word) [1].
Therefore, we can approximate reading time for each question by multiplying number of brackets × average words per bracket (wpb) × 250ms.
However, this assumes the respondent reads every bracket from top to bottom, which is a rare worst-case scenario.
Usually they stop reading once they find the bracket that matches their answer, and they may even skip some brackets, performing a sort of manual binary search. We should probably halve these times to get a more realistic estimate.
Average typing speed is 200 cpm [2] (≈ 300ms/character). This means we can approximate typing time for each question by multiplying the average number of digits × 300ms.
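The two estimates above can be sketched in a few lines of Python. The per-word and per-character times come from the figures quoted above. The function names, the `fraction_read` parameter, and the example question (8 brackets of about 3 words each, answered with a 2-digit number) are assumptions for illustration, not figures from the survey.

```python
MS_PER_WORD = 250  # 240 wpm reading speed
MS_PER_CHAR = 300  # 200 cpm typing speed


def reading_time_ms(num_brackets: int, words_per_bracket: float,
                    fraction_read: float = 0.5) -> float:
    """Estimated time to scan a bracketed question.

    fraction_read=1.0 is the worst case (every bracket is read);
    0.5 applies the halving suggested for a more realistic estimate.
    """
    return num_brackets * words_per_bracket * MS_PER_WORD * fraction_read


def typing_time_ms(num_digits: float) -> float:
    """Estimated time to type a numeric answer with the given digit count."""
    return num_digits * MS_PER_CHAR


# Hypothetical question: 8 brackets of about 3 words each vs a 2-digit answer.
print("brackets:", reading_time_ms(num_brackets=8, words_per_bracket=3), "ms")
print("typing:  ", typing_time_ms(num_digits=2), "ms")
```

Keeping `fraction_read` as a parameter makes it easy to compare the worst case against the halved estimate for the same question.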
Let’s see how this works out for each question:
As you can see, despite our initial intuition that brackets are faster, the time it takes to read each bracketed question vastly outweighs typing time for all questions!
Of course, this is a simplification. There are models in HCI, such as KLM (the Keystroke-Level Model), that can more accurately estimate the time it takes for certain UI flows.
For example, here are some of the variables we left out in our analysis above:
and then focus the input so they can write in it, which takes an additional click (estimated as 0.2s in KLM)
However, given the vast difference in times, I don't think a more accurate model would change the conclusion much.
What about sliders?
Sliders are uncommon in surveys, and for good reason.
They offer the most benefit in UIs where changes to the value provide feedback, and allow users to iteratively approach the desired value by reacting to this feedback.
For example:
In surveys, there is usually no feedback, which eliminates this core benefit.
When the number is known in advance, sliders are usually a poor choice, except when we have very few numbers to choose among (e.g. a 1-5 rating)
and the slider UI makes it very clear where to click to select each of them, or we don't much care about the number we select (e.g. search flights by departure time) [3].
None of our demographics questions falls in this category (unless bracketed, in which case why not use regular brackets?).
There are several reasons for this:
<input type=number> all the things?
Efficiency is not the only consideration here.
Privacy is a big one. These surveys are anonymous, but respondents are still often concerned about entering data they consider sensitive.
Also, for the efficiency argument to hold true, the numerical answer needs to be top of mind, which is not always the case.
I summarize my recommendations below.
Age
This is a two-digit number that is always top of mind. Number input.
Years of experience
This is a 1-2 digit number, and it is either top of mind, or very close to it. Number input.
Company size
While most people know their rough company size, they would rarely be able to provide an exact number without searching.
This is a good candidate for brackets.
However, the number of brackets should be reduced from the current 9 (does the difference between 2-5 and 6-10 employees really matter?),
and their labels should be copyedited for scannability.
We should also take existing data into account.
Looking at the State of CSS 2022 results for this question,
it appears that about one third of respondents work at companies with 2-100 people,
so we should probably not combine these 5 brackets into one, like I was planning to propose.
101 to 1000 employees is also the existing bracket with the most responses (15.1%), so we could narrow it a little,
shifting some of its respondents to the previous bracket.
Taking all these factors into consideration,
I proposed the following brackets:
Income
The question that started it all is unfortunately the hardest.
Income is a number that people know (or can approximate).
It is faster to type, but only marginally (1.5s to type vs 1.75s to read the brackets).
We can however reduce typing time further (from 1.5s to 0.6s on average) by asking people to enter thousands.
The biggest concern here is privacy.
Would people be comfortable sharing a more precise number?
We could mitigate this somewhat by explicitly instructing respondents to round it further, e.g. to the nearest multiple of 10:
However, this assumes that the privacy issues are about granularity, or about the number being too low (and rounding to 10s could help with both).
David Karger also made an excellent point in the comments,
that people in the higher income brackets may be reluctant to share their income too:
Another idea was to offer UI that lets users indicate that the number they have entered is actually an upper or lower bound.
Of course, a dropdown PLUS a number input is much slower than using brackets, but if only a tiny fraction of respondents uses it, it does not affect the analysis of the average case.
However, after careful consideration and input, both qualitative and quantitative, it appears that privacy is a much bigger factor than I had previously realized.
Even though I was aware that people see income level as sensitive data (more so in certain cultures than others),
I had not fully realized the extent of this.
In the end, I think the additional privacy afforded by brackets far outweighs any argument for efficiency or data analysis convenience.
@SachaG what do you think?
Footnotes
1. https://www.sciencedirect.com/science/article/abs/pii/S0749596X19300786
2. https://www.typingpal.com/en/blog/good-typing-speed
3. Slider Design: Rules of Thumb, NNGroup, 2015
4. KLM is a poor model for dragging tasks for two reasons:
First, it regards dragging as simply a combination of three actions: button press, mouse move, button release.
But we all know from experience that dragging is much harder than simply pointing, as managing two tasks simultaneously (holding down the mouse button and moving the pointer) is almost always harder than doing them sequentially.
Second, it assumes that all pointing tasks have a fixed cost (1.1s), which may be acceptable for actual pointing tasks, but the inaccuracy is magnified for dragging tasks.
A lot of HCI literature (and even NNGroup) refers to the Steering Law to estimate the time it takes to use a slider.
However, modern sliders (and scrollbars) do not require steering, as they are not constrained to a single axis:
once dragging is initiated, moving the pointer in any direction adjusts the slider, until the mouse button is released.
Fitts's Law actually appears to be a better model here, and indeed there are many papers extending it to dragging.
However, evaluating this research is out of scope for this post.