# Statistical Analysis Project Proposal

## Analysis 1: Impact of Social Connection on Mental Health During COVID-19

### Research Question
To what extent do virtual and in-person social interactions correlate with self-reported mental health during the COVID-19 pandemic?

### Variables and Exploration Plan
- **Outcome Variable**: `WELLNESS_self_rated_mental_health`
  - Description: Self-reported mental health rating on an ordinal scale
  - Visualization: Histogram
    - Rationale: To illustrate the overall distribution of mental health ratings, helping to identify potential skewness or biases in the data.

- **Predictor Variables**: `CONNECTION_activities_*`
  - **CONNECTION_activities_video_chat_p3m**
    - Description: Frequency of video chat interactions in past 3 months
    - Visualization: Bar chart
      - Rationale: To assess the distribution across different frequency categories, allowing comparison with other types of communication.

  - **CONNECTION_activities_visited_friends_p3m**
    - Description: Frequency of in-person friend visits in past 3 months
    - Visualization: Bar chart
      - Rationale: To compare with virtual interactions and explore preferences for in-person versus digital contact.

  - **CONNECTION_activities_phone_p3m**
    - Description: Frequency of phone calls in past 3 months
    - Visualization: Bar chart
      - Rationale: To show the distribution across frequency categories and compare it with the other communication methods.

  - **Composite Social Interaction Score**
    - Description: A combined measure of all social interaction types, created by summing standardized frequencies.
    - Visualization: Scatterplot against mental health
      - Rationale: To visualize the potential linear relationship, with a fitted trend line to show its direction and strength.

### Analysis Method
Simple linear regression will be used to estimate the relationship between the composite social interaction score and mental health ratings. The fitted slope gives the direction and size of the association, and the coefficient of determination (R²) gives its strength.
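As a minimal sketch of this step, the code below builds a composite score by summing z-scores and fits an ordinary least squares line using only the Python standard library. The data are synthetic stand-ins, not survey results; the 1-5 frequency coding, the helper names (`zscores`, `composite_score`, `simple_linear_regression`), and the treatment of the ordinal outcome as numeric are all assumptions made for illustration.

```python
import random
import statistics


def zscores(values):
    """Standardize values to mean 0 and sample standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def composite_score(*columns):
    """Sum the z-scores of several equally long columns, row by row."""
    standardized = [zscores(col) for col in columns]
    return [sum(row) for row in zip(*standardized)]


def simple_linear_regression(x, y):
    """Least-squares fit of y = intercept + slope * x; also returns R^2."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Synthetic stand-in data (assumption): frequencies coded 1-5.
rng = random.Random(0)
n = 200
video_chat = [rng.randint(1, 5) for _ in range(n)]
visited_friends = [rng.randint(1, 5) for _ in range(n)]
phone = [rng.randint(1, 5) for _ in range(n)]
composite = composite_score(video_chat, visited_friends, phone)

# Hypothetical outcome generated with a known positive slope plus noise.
mental_health = [3 + 0.4 * c + rng.gauss(0, 1) for c in composite]

slope, intercept, r_squared = simple_linear_regression(composite, mental_health)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, R^2={r_squared:.3f}")
```

Summing z-scores weights each interaction type equally regardless of its original spread. A single composite predictor cannot separate in-person from virtual effects, so comparing those would require fitting the individual predictors as well.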

### Hypothesis and Expected Results
- Hypothesis: Higher levels of social interaction, including virtual forms, are positively associated with better mental health ratings.
- Expected Outcomes: A positive regression slope is anticipated, indicating that more frequent social interaction is associated with better mental health ratings. When the individual predictors are examined separately, in-person interactions are expected to show a stronger positive correlation than virtual ones.

## Analysis 2: Burnout and Work-From-Home Transition

### Research Question
Do employees who transitioned to a work-from-home (WFH) setting during the pandemic report significantly different burnout levels than those who did not transition?

### Variables and Exploration Plan
- **Outcome Variables**: Burnout measures (`WELLNESS_malach_pines_burnout_measure_*`)
  - **WELLNESS_malach_pines_burnout_measure_tired**
    - Description: Frequency of feeling tired.
    - Visualization: Histogram and box plot for WFH and non-WFH groups.
      - Rationale: To compare the distribution and median between groups.

  - **WELLNESS_malach_pines_burnout_measure_hopeless**
    - Description: Frequency of feeling hopeless.
    - Visualization: Box plots for WFH and non-WFH groups.
      - Rationale: To show median differences and data spread between groups.

  - **WELLNESS_malach_pines_burnout_measure_depressed**
    - Description: Frequency of feeling depressed
    - Visualization: Histograms for WFH and non-WFH groups.
      - Rationale: To directly compare distribution shapes between groups.

  - **Composite Burnout Score**
    - Description: A combined measure calculated by averaging normalized scores of all burnout indicators.
    - Visualization: Histograms for each group.
      - Rationale: To visualize and compare the distribution of burnout levels.

- **Grouping Variable**: `WORK_shift_from_home`
  - Description: Whether employee transitioned to WFH
  - Visualization: Bar chart showing proportions of each category.
    - Rationale: To check the sample size of each group.

### Analysis Method
Bootstrap hypothesis testing will be used to compare mean composite burnout scores between the WFH and non-WFH groups. Because it resamples the observed data instead of assuming normality, this method suits non-normal scores, and the same resampling approach can produce a confidence interval for the mean difference.
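The sketch below shows one way this comparison could be set up with the Python standard library. It makes several assumptions not stated in the proposal: "normalized" is read as min-max scaling, items are assumed to be coded 1-7, the test is two-sided, and the null distribution is built by shifting both groups to the pooled mean before resampling. All data are synthetic and the helper names are hypothetical.

```python
import random
import statistics


def composite_burnout(item_columns, low=1, high=7):
    """Min-max normalize each item to [0, 1], then average items per respondent."""
    normalized = [[(v - low) / (high - low) for v in col] for col in item_columns]
    return [statistics.fmean(row) for row in zip(*normalized)]


def bootstrap_mean_diff_test(group_a, group_b, n_resamples=2000, seed=0):
    """Two-sided bootstrap test of equal means; returns (observed diff, p-value)."""
    rng = random.Random(seed)
    mean_a, mean_b = statistics.fmean(group_a), statistics.fmean(group_b)
    observed = mean_a - mean_b
    pooled_mean = statistics.fmean(list(group_a) + list(group_b))
    # Shift each group to the pooled mean so the null hypothesis holds
    # while each group's own spread is preserved.
    null_a = [v - mean_a + pooled_mean for v in group_a]
    null_b = [v - mean_b + pooled_mean for v in group_b]
    extreme = 0
    for _ in range(n_resamples):
        resample_a = rng.choices(null_a, k=len(null_a))
        resample_b = rng.choices(null_b, k=len(null_b))
        diff = statistics.fmean(resample_a) - statistics.fmean(resample_b)
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the p-value strictly above zero.
    p_value = (extreme + 1) / (n_resamples + 1)
    return observed, p_value


def fake_items(rng, n, center):
    """Hypothetical responses for three items, clipped to the assumed 1-7 range."""
    return [[min(7, max(1, round(rng.gauss(center, 1.5)))) for _ in range(n)]
            for _ in range(3)]


# Synthetic stand-in data (assumption), with the WFH group centered higher.
rng = random.Random(1)
wfh_scores = composite_burnout(fake_items(rng, 150, 4.5))
non_wfh_scores = composite_burnout(fake_items(rng, 150, 3.5))

observed_diff, p_value = bootstrap_mean_diff_test(wfh_scores, non_wfh_scores)
print(f"observed difference={observed_diff:.3f}, p-value={p_value:.4f}")
```

A one-sided version matching the directional hypothesis would count only resampled differences at least as large as the observed one, without taking absolute values.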

### Hypothesis and Expected Results
- Hypothesis: Employees who transitioned to WFH are expected to report higher levels of burnout due to factors such as isolation and work-life balance challenges.
- Expected Outcomes: Significant differences in burnout scores are anticipated, with WFH employees showing higher average scores.

## Analysis 3: Predictors of Loneliness During COVID-19

### Research Question
Which demographic and lifestyle factors most strongly predict loneliness levels during the COVID-19 pandemic?

### Variables and Exploration Plan
- **Outcome Variable**: UCLA Loneliness Scale measures (`LONELY_ucla_loneliness_scale_*`)
  - Description: Measures of perceived loneliness, such as lacking companionship.
  - Visualization: Histograms with density curves and box plots by age groups.
    - Rationale: To show distribution shapes and explore patterns by age.

- **Predictor Variables**:
  - **DEMO_age**
    - Description: Age of respondent
    - Visualization: Histogram
      - Rationale: To understand the age distribution and identify potential biases.

  - **`GEO_housing_live_with_*`** (multiple binary variables)
    - Description: Living situation indicators
    - Visualization: Bar chart
      - Rationale: To compare loneliness levels across different living arrangements.

  - **DEMO_relationship_status**
    - Description: Current relationship status
    - Visualization: Box plots
      - Rationale: To examine how loneliness levels vary by relationship status.

### Analysis Method
Bootstrapping will be used to construct confidence intervals for each predictor's effect on loneliness scores. Because it resamples respondents and makes no distributional assumption about the scores, this approach gives robust interval estimates of each relationship.
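One possible reading of this plan is a percentile bootstrap: resample respondents with replacement, recompute the effect estimate each time, and take the middle 95% of the estimates. The sketch below illustrates that under stated assumptions. The effect of age is measured as a least-squares slope and the effect of a binary living-alone indicator as a difference in group means; the data are synthetic, and the function names and the numeric loneliness score are hypothetical.

```python
import random
import statistics


def slope(pairs):
    """Least-squares slope of y on x for a list of (x, y) pairs."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    return sxy / sxx


def mean_difference(pairs):
    """Mean outcome where the binary flag is 1 minus mean where it is 0.

    Assumes every resample contains both groups, which is near certain
    for the sample sizes used here.
    """
    ones = [y for flag, y in pairs if flag == 1]
    zeros = [y for flag, y in pairs if flag == 0]
    return statistics.fmean(ones) - statistics.fmean(zeros)


def percentile_bootstrap_ci(rows, statistic, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for statistic(rows), resampling whole rows."""
    rng = random.Random(seed)
    estimates = sorted(
        statistic(rng.choices(rows, k=len(rows))) for _ in range(n_resamples)
    )
    lower = estimates[round(alpha / 2 * n_resamples)]
    upper = estimates[round((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Synthetic stand-in data (assumption): loneliness falls with age and is
# higher for respondents who live alone.
rng = random.Random(2)
n = 200
ages = [rng.uniform(18, 80) for _ in range(n)]
lives_alone = [1 if rng.random() < 0.3 else 0 for _ in range(n)]
loneliness = [6 - 0.05 * age + 1.5 * alone + rng.gauss(0, 1)
              for age, alone in zip(ages, lives_alone)]

age_ci = percentile_bootstrap_ci(list(zip(ages, loneliness)), slope)
alone_ci = percentile_bootstrap_ci(list(zip(lives_alone, loneliness)), mean_difference)
print(f"age slope 95% CI: ({age_ci[0]:.3f}, {age_ci[1]:.3f})")
print(f"living-alone difference 95% CI: ({alone_ci[0]:.3f}, {alone_ci[1]:.3f})")
```

An interval that excludes zero suggests the predictor is associated with loneliness. The slope and the mean difference are on different scales, so comparing predictor strength would need a common scale, for example standardized effects.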

### Hypothesis and Expected Results
- Hypothesis: Age, living situation, and relationship status will be significant predictors of loneliness during the pandemic.
- Expected Outcomes: Stronger associations with living situation are anticipated, suggesting that living alone may contribute more to feelings of loneliness than other demographic factors.