## 1. One-sample hypothesis testing

### - **Research Question**:
We wish to determine whether the average life satisfaction of participants differs from the national benchmark. Assuming that the national benchmark average for life satisfaction is 5.0 (on a 10-point scale), we use this value as the reference and test whether mean life satisfaction in the project data is significantly above or below it.

### - **Variables**:
  - **Analytical variable**: `WELLNESS_life_satisfaction`, a score that represents an individual's overall satisfaction with their life, often closely related to factors such as social support and mental health.
  - **Rationale for variable selection**: `WELLNESS_life_satisfaction` is an important indicator of well-being. It captures an individual's overall assessment of their life, including whether they have had positive experiences of social relationships and emotional support.
  - **Visualisation**: use a histogram and a box plot to show the distribution of `WELLNESS_life_satisfaction`. The histogram will show the shape and central tendency of the scores, and the box plot will show the median, the quartiles and any outliers.
  
### - **Analysis Plan**:
  - **Hypothesis**:
    - **Null hypothesis (H0)**: the mean value of life satisfaction is equal to the benchmark value, i.e. H0: μ = 5.0.
    - **Alternative hypothesis (H1)**: the mean value of life satisfaction is different from the benchmark value, i.e. H1: μ ≠ 5.0.
    
  - **Prerequisites**:
    - **Random Sampling**: the sample needs to be randomly selected from the overall population so that it is representative and the simulation based on it is meaningful.
    - **Independence**: each observation in the sample is assumed to be independent, i.e. each individual's life satisfaction score is not influenced by others.  
    
  - **Simulation method**:
    1. Calculate the sample mean from the original data and record it.
    2. Assuming that the null hypothesis (H0) holds, generate a large number of simulated samples from a population whose mean is 5.0 (i.e., assuming that life satisfaction meets the national benchmark), for example by shifting the observed scores so that their mean is 5.0 and resampling them with replacement.
    3. Compute the mean of each simulated sample and find where the observed mean (i.e., the mean of the original data) falls within the distribution of simulated means.
    4. Calculate the two-sided p-value: the proportion of simulated means that lie at least as far from 5.0, in either direction, as the observed mean does.
    5. Compare the p-value to the significance level (0.05). If the p-value is less than 0.05, the null hypothesis is rejected and mean life satisfaction is considered significantly different from the benchmark.
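
The simulation steps above can be sketched in Python as follows. This is a minimal illustration rather than the final analysis code: the function name `one_sample_simulation_test`, the shift-and-resample way of imposing H0, the number of simulations and the placeholder scores are all assumptions made for the example.

```python
import numpy as np


def one_sample_simulation_test(scores, benchmark=5.0, n_sims=10_000, seed=0):
    """Return (observed mean, two-sided simulation p-value) for H0: mean == benchmark."""
    rng = np.random.default_rng(seed)
    scores = np.asarray(scores, dtype=float)
    scores = scores[~np.isnan(scores)]  # drop missing responses
    n = len(scores)

    # Step 1: observed sample mean.
    observed_mean = scores.mean()

    # Step 2: impose H0 by shifting the data so its mean equals the benchmark
    # (assumed approach), then resample with replacement n_sims times.
    null_scores = scores - observed_mean + benchmark

    # Step 3: distribution of simulated means under H0.
    sim_means = np.array(
        [rng.choice(null_scores, size=n, replace=True).mean() for _ in range(n_sims)]
    )

    # Step 4: two-sided p-value = share of simulated means at least as far
    # from the benchmark as the observed mean is.
    observed_gap = abs(observed_mean - benchmark)
    p_value = float(np.mean(np.abs(sim_means - benchmark) >= observed_gap))
    return observed_mean, p_value


# Placeholder scores for illustration only; replace with the real
# WELLNESS_life_satisfaction column.
demo_rng = np.random.default_rng(1)
placeholder_scores = demo_rng.integers(1, 11, size=200)
observed_mean, p_value = one_sample_simulation_test(placeholder_scores, n_sims=2000)

# Step 5: compare with the significance level.
print(f"mean = {observed_mean:.2f}, p = {p_value:.4f}, reject H0: {p_value < 0.05}")
```

Shifting the data keeps the observed shape and spread of the scores while forcing the mean to match the benchmark, which is one common way to simulate under H0 without assuming a particular distribution.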

### - **Expected Results and Interpretation**:
  - **Expected Results**: if the p-value is less than 0.05, the null hypothesis is rejected, indicating that the mean life satisfaction of the current program population is significantly different from the benchmark value of 5.0.
  - **Interpretation of results**: If the analysis supports the alternative hypothesis that mean life satisfaction deviates significantly from the benchmark value, it would indicate that the well-being of the project participants differs from the national expectation. The direction of the difference (above or below 5.0) determines whether this is evidence for further well-being enhancement measures.
  - **Possible conclusions**: Based on the results of the analyses, specific policy recommendations or psychological interventions can be derived. For example, if life satisfaction is low, there may be a need to enhance community socialisation activities to improve individuals' sense of social support.

## 2. Two-sample hypothesis testing

### - **Research Question**:
We wish to examine whether life satisfaction differs between individuals who are emotionally and socially lonely and those who are not. Specifically, we want to understand whether loneliness is associated with lower life satisfaction, which will help clarify the role of social support in well-being.

### - **Variables**:
  - **Outcome variable**: `WELLNESS_life_satisfaction`, which represents participants' life satisfaction scores. This score captures an individual's overall satisfaction with life and is closely related to mental health and well-being.
  - **Grouping variable**: `LONELY_dejong_emotional_social_loneliness_scale_rejected`, a dichotomous variable indicating whether the participant is lonely or not. This variable divides the sample into a lonely group and a non-lonely group.
  - **Rationale for variable selection**: Comparing life satisfaction between lonely and non-lonely people can help us explore how social and emotional isolation relates to an individual's sense of well-being, thus providing evidence on the importance of social support.
  - **Visualisation**: side-by-side box plots are used to show the distribution of life satisfaction for the two groups (‘lonely’ vs. ‘non-lonely’). This type of chart shows the median, interquartile range and potential outliers of life satisfaction for each group, helping us to make initial observations of differences in the centre and spread of the two groups.

### - **Analysis Plan**:
  - **Hypotheses**:
    - **Null hypothesis (H0)**: the mean values of life satisfaction are equal for the lonely and non-lonely groups, i.e. H0: μlone = μnon-lone.
    - **Alternative hypothesis (H1)**: the mean values of life satisfaction for the lonely and non-lonely groups are not equal, i.e. H1: μlone ≠ μnon-lone.
    
  - **Prerequisites**:
    - **Random sampling**: each group of data should be randomly selected from the overall population to ensure a representative sample.
    - **Independence**: the observations within each group, and across the two groups, should be independent of each other, i.e. one participant's loneliness status and score do not affect another's. This is needed for the simulation results to be reliable.
    
  - **Simulation method**:
    1. Calculate the sample means of the “lonely” and “non-lonely” groups, and record the difference between the two means.
    2. Assuming that the null hypothesis H0 is true, randomly reassign the “lonely” and “non-lonely” labels to the data and recompute the difference between the two group means. Repeat this process a large number of times to build a simulated distribution of mean differences.
    3. Locate the observed mean difference in the simulated distribution, and calculate the two-sided p-value as the proportion of simulated differences that are at least as extreme, in either direction, as the observed one.
    4. Compare the p-value to a significance level (e.g., 0.05). If the p-value is less than 0.05, reject the null hypothesis and conclude that there is a significant difference between the means of the “lonely” and “non-lonely” groups.
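
The label-shuffling procedure above is a permutation test, and it can be sketched in Python as follows. The function name `permutation_test_mean_diff`, the boolean `is_lonely` array and the placeholder data are assumptions for illustration; in particular, how `LONELY_dejong_emotional_social_loneliness_scale_rejected` is coded is not stated here, so converting it to a boolean array is left to the reader.

```python
import numpy as np


def permutation_test_mean_diff(scores, is_lonely, n_sims=10_000, seed=0):
    """Return (observed mean difference, two-sided permutation p-value)."""
    rng = np.random.default_rng(seed)
    # Rows with missing values are assumed to have been removed beforehand.
    scores = np.asarray(scores, dtype=float)
    is_lonely = np.asarray(is_lonely, dtype=bool)

    # Step 1: observed difference in means (lonely minus non-lonely).
    observed_diff = scores[is_lonely].mean() - scores[~is_lonely].mean()

    # Step 2: under H0 the labels are exchangeable, so shuffle them
    # and recompute the difference each time.
    sim_diffs = np.empty(n_sims)
    for i in range(n_sims):
        shuffled = rng.permutation(is_lonely)
        sim_diffs[i] = scores[shuffled].mean() - scores[~shuffled].mean()

    # Step 3: two-sided p-value = share of simulated differences at least
    # as large in absolute value as the observed one.
    p_value = float(np.mean(np.abs(sim_diffs) >= abs(observed_diff)))
    return observed_diff, p_value


# Placeholder data for illustration only; replace with the real columns.
demo_rng = np.random.default_rng(2)
placeholder_scores = demo_rng.integers(1, 11, size=120).astype(float)
placeholder_lonely = demo_rng.random(120) < 0.4
diff, p_value = permutation_test_mean_diff(
    placeholder_scores, placeholder_lonely, n_sims=2000
)

# Step 4: compare with the significance level.
print(f"diff = {diff:.2f}, p = {p_value:.4f}, reject H0: {p_value < 0.05}")
```

Shuffling the labels keeps the group sizes fixed and breaks any link between loneliness status and life satisfaction, which is exactly what H0 asserts.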

### - **Expected Results and Interpretation**:
  - **Expected results**: if the p-value is less than 0.05, the null hypothesis is rejected, indicating that mean life satisfaction differs significantly between the lonely and non-lonely groups.
  - **Interpretation of results**: If the analysis shows that life satisfaction is significantly lower in the ‘lonely’ group than in the ‘non-lonely’ group, this would be consistent with emotional and social support playing an important role in individual well-being. Such a difference may reflect social isolation leading to lower life satisfaction, although the test establishes an association rather than a causal effect.
  - **Possible conclusions**: The findings of the study may provide a quantitative basis for the importance of social and emotional support, suggesting that mental health interventions and social connection support should be strengthened for people experiencing isolation in order to improve their life satisfaction and overall well-being.

## 3. Simple linear regression

### - **Research Question**:
We wish to explore the relationship between subjective well-being (`WELLNESS_subjective_happiness_scale_score`) and social support (`PSYCH_zimet_multidimensional_social_support_scale_score`). It is hypothesised that higher levels of social support are associated with higher levels of subjective well-being. Studying this relationship contributes to an understanding of the importance of social support in well-being enhancement.

### - **Variables**:
  - **Dependent variable (Y)**: `WELLNESS_subjective_happiness_scale_score`, which represents an individual's subjective happiness score.
  - **Independent variable (X)**: `PSYCH_zimet_multidimensional_social_support_scale_score`, a multidimensional social support score, which measures the level of social support received by an individual.
  - **Rationale for variable selection**: both subjective well-being and social support reflect an individual's level of psychological well-being, and examining their relationship can reveal the potential role of social support in an individual's well-being.
  - **Visualisation**: a scatterplot will be used to show the relationship between subjective well-being and social support, with a fitted regression line added to show the linear trend. If the points follow an upward-sloping pattern, it indicates that there may be a positive correlation between the two.

### - **Analysis Plan**:
  - **Assumptions**:
    - **Linear relationship**: the relationship between the independent and dependent variables is assumed to be linear, i.e., the expected change in well-being per unit change in social support is constant.
    - **Residual independence**: observations in the data should be independent of each other, so that there is no correlation between the residuals of the regression model.
    - **Homoskedasticity**: the variance of the residuals should be constant, i.e. it does not vary with the value of the independent variable.
    - **Normality**: residuals are assumed to follow a normal distribution to facilitate statistical inference.
    
  - **Hypothesis**:
    - **Null hypothesis (H0)**: there is no linear relationship between social support and subjective well-being, i.e. the coefficient of social support in the regression model is zero (β₁ = 0).
    - **Alternative hypothesis (H1)**: there is a positive relationship between social support and subjective well-being, i.e. (β₁ > 0).
  - **Regression analysis**: Construct a simple linear regression model Y = β₀ + β₁X + ε, where β₀ is the intercept, ε is the error term, and β₁ represents the direction and strength of the association between the independent variable and the dependent variable.
  - **Hypothesis testing**:
    1. A t-test is used to assess the significance of the regression coefficient β₁.
    2. Calculate the p-value of β₁. Because H1 is one-sided (β₁ > 0), a one-sided p-value is used. If the p-value is less than 0.05, the regression coefficient is considered significant, indicating that higher social support is associated with higher subjective well-being.
    3. Because the model has a single predictor, a significant β₁ also means that the regression model as a whole is statistically significant and can be used to describe the relationship between social support and subjective well-being.
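
The regression and slope test above can be sketched in Python as follows, using `scipy.stats.linregress`. The function name `slope_test` and the placeholder data are assumptions for illustration. `linregress` reports a two-sided p-value for the slope, so the sketch converts it to the one-sided p-value that matches H1: β₁ > 0.

```python
import numpy as np
from scipy import stats


def slope_test(x, y):
    """Fit y = b0 + b1 * x and return (b0, b1, one-sided p-value for H1: b1 > 0)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = ~(np.isnan(x) | np.isnan(y))  # drop rows with a missing value
    fit = stats.linregress(x[keep], y[keep])

    # Convert the two-sided p-value for H0: b1 == 0 into a one-sided
    # p-value for the alternative b1 > 0.
    if fit.slope > 0:
        p_one_sided = fit.pvalue / 2
    else:
        p_one_sided = 1 - fit.pvalue / 2
    return fit.intercept, fit.slope, p_one_sided


# Placeholder data for illustration only; replace with the real
# social support (x) and subjective happiness (y) columns.
demo_rng = np.random.default_rng(3)
placeholder_support = demo_rng.uniform(1, 7, size=150)
placeholder_happiness = 2.0 + 0.4 * placeholder_support + demo_rng.normal(0, 1, size=150)

b0, b1, p_value = slope_test(placeholder_support, placeholder_happiness)
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}, p = {p_value:.4g}, reject H0: {p_value < 0.05}")
```

The t-test behind this p-value relies on the assumptions listed above, so residual plots should be checked before the result is interpreted.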
    
### - **Expected Results and Interpretation**:
  - **Expected results**: If the p-value indicates that the regression coefficient is significant and positive, it means that higher social support is associated with higher subjective well-being.
  - **Results Interpretation**: If the results support a positive relationship, it suggests that social support may contribute to subjective well-being, although a regression on observational data cannot by itself establish causation. The results can be used to motivate community activities or the enhancement of psychological support services.
  - **Possible conclusion**: This analysis can lead to the conclusion that the enhancement of social support should be emphasised in social policies or mental health interventions to help individuals achieve higher levels of well-being and life satisfaction. This conclusion could provide data to support mental health management and social support policies.
