## Who and How
When building a survey, you need to consider:
1. Who you want to learn about
2. How to administer the survey

### Weights
- Some demographic groups are more likely to complete surveys than others
- Can mitigate this effect using weights
- **Weights**: multipliers applied to each demographic group's responses so that the weighted sample matches the demographics of the actual population
    - If demographic groups are over- or underrepresented, and their answers differ along demographic lines, the survey results won't accurately reflect the population as a whole
    - Compare the survey response sample to known data (census, etc.)
    - Apply less weight to overrepresented groups, more to underrepresented groups
        - Weights determined by dividing true population percentage by sample percentage
- Major polling companies (e.g. Gallup) have their own weighting schemes
- Demographic definitions range from broad (e.g. race, sex) to narrow (e.g. a certain sex of a certain race falling within an age range)
    - Narrower slices contain fewer respondents, so they are more vulnerable to bias from weighting if those respondents aren't representative of the demographic
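
The weighting rule above (true population percentage divided by sample percentage) can be sketched in a few lines. The age groups, shares, and "yes" rates below are made-up numbers for illustration only, not data from any real survey.

```python
# Hypothetical shares: the true population vs. who actually responded.
population_share = {"18-34": 0.30, "35-54": 0.35, "55+": 0.35}
sample_share = {"18-34": 0.15, "35-54": 0.35, "55+": 0.50}

# Weight = true population percentage / sample percentage.
# Underrepresented groups get a weight above 1, overrepresented groups below 1.
weights = {g: population_share[g] / sample_share[g] for g in population_share}

# Hypothetical share of each group answering "yes" to some question.
yes_rate = {"18-34": 0.70, "35-54": 0.50, "55+": 0.30}

# Unweighted estimate: each group counts in proportion to its sample share.
unweighted = sum(sample_share[g] * yes_rate[g] for g in sample_share)

# Weighted estimate: scaling each sample share by its weight
# restores the population share for that group.
weighted = sum(sample_share[g] * weights[g] * yes_rate[g] for g in sample_share)

print(weights)
print(round(unweighted, 2), round(weighted, 2))
```

With these assumed numbers the unweighted "yes" estimate is about 0.43 and the weighted one is about 0.49, because the youngest group answers "yes" most often and is underrepresented in the sample.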

### Power
- Larger samples provide more precise estimates and can detect smaller effects
- **Effect size**: the ratio of the size of the parameter of interest to the variability (standard deviation) of that parameter
    - Example: compare the ages of employees at 2 different company locations
        - Parameter of interest: difference in mean ages
        - Variability estimate: the pooled standard deviation of the ages at the 2 locations
        - An effect size of 1 means checking whether the age difference between the 2 locations is at least as large as the pooled standard deviation
- **Statistical power**: the probability of detecting an effect in the data when the effect really exists
    - Larger samples have more statistical power
    - Sample statistics more closely resemble the true population values since individual observations have less influence
- Estimates of variance (noise) are scaled by sample size
    - Standard error of the sample mean is the std dev divided by the square root of the sample size ($\sigma / \sqrt{n}$)
    - A larger sample size gives a smaller standard error
- Statistical power is a balance between:
    - $\mu$: size of parameter of interest in the population
    - $\sigma$: amount of variability (standard deviation) in the population
    - $n$: sample size
    - Notice $\mu$ and $\sigma$ depend on knowledge of the population mean and variance
        - If these were known, there would be no need to do a survey
        - These are typically estimated from previous surveys and/or domain knowledge
    - Ideally, collect the smallest $n$ that gives enough power to detect your effect
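
The relationship between effect size, standard error, and sample size can be sketched as follows. This is a rough illustration rather than a definitive procedure: it assumes the common convention of standardizing by the pooled standard deviation (Cohen's d), a two-sided comparison of two group means, and a normal approximation. The function names and the defaults `alpha=0.05` and `power=0.80` are choices made for this example.

```python
import math
from statistics import NormalDist


def standard_error(std_dev, n):
    """Standard error of a sample mean: std dev / sqrt(n)."""
    return std_dev / math.sqrt(n)


def effect_size(mean_diff, pooled_std_dev):
    """Standardized effect size: difference in means / pooled std dev."""
    return mean_diff / pooled_std_dev


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate sample size per group needed to detect effect size d.

    Normal approximation for a two-sided, two-sample comparison of means;
    an exact t-based calculation would give slightly larger values.
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = z.inv_cdf(power)          # quantile for the desired power
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)


# Quadrupling n halves the standard error.
print(standard_error(10, 25), standard_error(10, 100))

# Hypothetical example: a 5-year age difference with a pooled std dev of 10.
d = effect_size(5, 10)
print(d, n_per_group(d), n_per_group(1.0))
```

Under these assumptions an effect size of 0.5 needs about 63 respondents per group, while an effect size of 1 needs only about 16, which is why smaller effects call for larger samples.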

### Question and response option order
- **Random assignment to groups**: keeps extraneous factors from affecting the experiment
    - Any participant is equally likely to be assigned to any group
    - Random assignment helps create groups with similar proportions of gender, age, IQ, etc.
- **Randomization** is also useful for survey questions
    - The order in which questions are asked can influence responses
    - Questions near the end are more likely to be skipped
    - Answering an earlier question can impact answers later in the survey
- Showing questions in a random order can help mitigate these order problems
    - Limitations: hard to do with paper surveys and with questions that depend on earlier answers
    - Response options can also be randomized in multiple response questions (multiple choice, constant sum, ranking)
        - Item shown first is typically chosen more often
    - Text shown on the left side of screen/paper is often given preference (for languages read left to right)
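
Both kinds of randomization described above can be sketched with Python's standard `random` module. The function names, the participant IDs, and the sample questions are hypothetical, and the sketch ignores questions with dependencies, which would need to keep their relative order.

```python
import random


def assign_groups(participants, groups, seed=None):
    """Randomly assign participants to groups of near-equal size."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    # Deal the shuffled participants round-robin, so every participant is
    # equally likely to land in any group and group sizes differ by at most 1.
    return {p: groups[i % len(groups)] for i, p in enumerate(shuffled)}


def randomized_questionnaire(questions, seed=None):
    """Shuffle question order and each question's response options.

    `questions` maps question text to a list of response options.
    """
    rng = random.Random(seed)
    order = list(questions)
    rng.shuffle(order)
    # rng.sample with k=len(...) returns a shuffled copy of the options.
    return [(q, rng.sample(questions[q], k=len(questions[q]))) for q in order]


participants = ["p1", "p2", "p3", "p4", "p5", "p6"]
print(assign_groups(participants, ["control", "treatment"], seed=1))

questions = {
    "Preferred contact method?": ["Email", "Phone", "Text"],
    "Favorite season?": ["Spring", "Summer", "Fall", "Winter"],
}
# Use a different seed (or none) per respondent so each sees a different order.
print(randomized_questionnaire(questions, seed=1))
```

Passing a seed makes an assignment reproducible for auditing; leaving it as `None` gives a fresh random order on each call.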