# Final Proposal

We will explore how the perception of a strong emotional bond correlates with general life satisfaction, and whether this relationship differs across age groups.

## Background
Psychology scholars have presented theoretical evidence for a positive relationship between emotional attachment and mental well-being (Greenman and Johnson, 2022). Furthermore, Vandeleur et al. (2009) suggested that this relationship may differ across life stages. In a sample of 95 adolescent-parent pairs, parents tended to report higher emotional well-being when with friends or colleagues than when alone, whereas adolescents reported lower well-being when with peers or siblings than when alone. In our paper, we will analyze a nationwide dataset empirically to explore this relationship, and we will extend previous results by systematically estimating the interaction effect between emotional bond and age on life satisfaction.

## Data
We use a cross-sectional data set, the Canadian Community Health Survey (CCHS) - Annual Component 2017-2018, provided by Statistics Canada. The CCHS is organized into sections that focus on specific health topics, including mental health, diseases, lifestyle, and social conditions. The variables most relevant to our research fall under the satisfaction and relationships subcategories. Based on our research question, the main variables of interest are “satisfaction with life in general (GEN_010)”, which ranges from 0 (very dissatisfied) to 10 (very satisfied); the degree of agreement with “relationship-strong emotional bond with at least one person (SPS_040)”; and “age (DHHGAGE)”. Additionally, we include two demographic variables, “sex (DHH_SEX)” and “family arrangement (DHHDGLVG)”, as control variables in the model, since both appear to affect satisfaction (Nordenmark, 2018).

## Summary Statistics

The descriptive statistics for the variables are summarized in Table 1. Before using the data, we applied several transformations. The age-group variable is treated as quantitative by taking the midpoint of each interval. The only exception is the last category, “age 80 or older”, for which we took the lower bound (80). All qualitative variables (emotional bond, sex, and family arrangement) are transformed into dummy variables. Here, “male” in “sex”, the “strongly disagree” level of “emotional bond”, and the “other” category of “family arrangement” are chosen as the base groups and omitted from the model to avoid the dummy variable trap.
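To illustrate why one level per qualitative variable is omitted, the following is a minimal sketch on a made-up five-row data frame (the object names `toy`, `X`, and `X_full` and the values are assumptions for illustration, not part of the CCHS data). It shows that R drops the first factor level as the base group when an intercept is present, and that keeping all four emotional-bond dummies alongside an intercept produces a rank-deficient design matrix.

```r
# Toy data with hypothetical values; level names mirror the categories described above.
toy <- data.frame(
  emo_bond = factor(
    c("Strongly agree", "Agree", "Disagree", "Strongly disagree", "Agree"),
    levels = c("Strongly disagree", "Disagree", "Agree", "Strongly agree")
  ),
  sex = factor(c("Male", "Female", "Female", "Male", "Female"),
               levels = c("Male", "Female"))
)

# model.matrix() expands each factor into indicator columns and omits the
# first level of each factor (the base group) because an intercept is included.
X <- model.matrix(~ emo_bond + sex, data = toy)
colnames(X)

# Dummy variable trap: an intercept plus ALL four emotional-bond indicators.
# The four indicators sum to the intercept column, so the columns are collinear.
X_full <- cbind(intercept = 1, model.matrix(~ 0 + emo_bond, data = toy))
c(columns = ncol(X_full), rank = qr(X_full)$rank)
```

Because the rank is smaller than the number of columns, the coefficients on the full set of dummies would not be separately identified, which is why a base group is dropped.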

### Data Cleaning

In [1]:
library(tidyverse)
library(haven)


In [2]:
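# Load the CCHS extract and keep only the variables used in the analysis
dat <- read_dta("data/CCHS_Annual_2017_2018_curated_trimmed_25%.dta") |>
    select(GEN_010, SPS_040, dhhgage, DHH_SEX, dhhdglvg) |>
    na.omit() # drop rows with a missing value in any selected variable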

In [3]:
dat_cleaned <- dat |>
    rename(satisfaction = GEN_010, emo_bond = SPS_040, age = dhhgage,
           sex = DHH_SEX, family = dhhdglvg) |>
    # filter out invalid values (codes outside the valid range of each variable)
    filter(satisfaction < 11, emo_bond <= 4, age <= 16, sex <= 2, family <= 8) |>
    # convert labelled categorical variables to factors so the value labels become levels
    mutate(sex = as_factor(sex),
           emo_bond = as_factor(emo_bond),
           family = as_factor(family),
           age = as_factor(age))

We take the midpoint of each age group (and the lower bound, 80, for the open-ended top group) and treat age as a quantitative variable.

In [4]:
dat_cleaned <- dat_cleaned |>
    mutate(age = case_when(
        age == "Age between 12 and 14" ~ 13,
        age == "Age between 15 and 17" ~ 16,
        age == "Age between 18 and 19" ~ 18.5,
        age == "Age between 20 and 24" ~ 22,
        age == "Age between 25 and 29" ~ 27,
        age == "Age between 30 and 34" ~ 32,
        age == "Age between 35 and 39" ~ 37,
        age == "Age between 40 and 44" ~ 42,
        age == "Age between 45 and 49" ~ 47,
        age == "Age between 50 and 54" ~ 52,
        age == "Age between 55 and 59" ~ 57,
        age == "Age between 60 and 64" ~ 62,
        age == "Age between 65 and 69" ~ 67,
        age == "Age between 70 and 74" ~ 72,
        age == "Age between 75 and 79" ~ 77,
        age == "Age 80 and older" ~ 80 # lower bound of the open-ended top group
    ))

### Summary Statistics

In [5]:
data_destat <- dat_cleaned |>
    mutate("female" = ifelse(sex == "Male", 0, 1),
           emo_bond_strongly_agree = ifelse(emo_bond == "Strongly agree", 1, 0),
           emo_bond_agree = ifelse(emo_bond == "Agree", 1, 0),
           emo_bond_disagree = ifelse(emo_bond == "Disagree", 1, 0),
          "Unattached individual living alone" = 
               ifelse(family == "Unattached individual living alone.", 1, 0),
          "Unattached individual living with others" = 
               ifelse(family == "Unattached individual living with others.", 1, 0),
          "Individual living with spouse/partner" = 
               ifelse(family == "Individual living with spouse/partner.", 1, 0),
          "Parent living with spouse/partner and child(ren)" = 
               ifelse(family == "Parent living with spouse/partner and child(ren).", 1, 0),
          "Single parent living with children" = 
               ifelse(family == "Single parent living with children.", 1, 0),
          "Child living with a single parent with or without siblings" = 
               ifelse(family == "Child living with a single parent with or without siblings.", 1, 0),
          "Child living with two parents with or without siblings" = 
               ifelse(family == "Child living with two parents with or without siblings", 1, 0))

In [6]:
mean_table <- data_destat |>
    select(-c("sex", "emo_bond", "family")) |>
    summarize_all(mean)

sd_table <- data_destat |>
    select(-c("sex", "emo_bond", "family")) |>
    summarize_all(sd)

max_table <- data_destat |>
    select(-c("sex", "emo_bond", "family")) |>
    summarize_all(max)

min_table <- data_destat |>
    select(-c("sex", "emo_bond", "family")) |>
    summarize_all(min)

summary_table <- rbind(mean_table, sd_table, max_table, min_table) |>
    rename("satisfaction with life in general" = satisfaction,
           "strong emotional bond with >= 1 person (strongly agree)" = emo_bond_strongly_agree,
           "strong emotional bond with >= 1 person (agree)" = emo_bond_agree,
           "strong emotional bond with >= 1 person (disagree)" = emo_bond_disagree)

summary_table <- t(summary_table)

colnames(summary_table) <- c("mean", "standard deviation", "max", "min")

In [7]:
summary_table

Table 1: Descriptive statistics

| Variable | Mean | Standard deviation | Max | Min |
|---|---|---|---|---|
| satisfaction with life in general | 8.03062731 | 1.6924781 | 10 | 0 |
| age | 48.54778598 | 19.5925424 | 80 | 13 |
| female | 0.54059041 | 0.4983803 | 1 | 0 |
| strong emotional bond with >= 1 person (strongly agree) | 0.58154982 | 0.4933351 | 1 | 0 |
| strong emotional bond with >= 1 person (agree) | 0.38093481 | 0.4856465 | 1 | 0 |
| strong emotional bond with >= 1 person (disagree) | 0.03333333 | 0.1795165 | 1 | 0 |
| Unattached individual living alone | 0.2798278 | 0.4489421 | 1 | 0 |
| Unattached individual living with others | 0.03677737 | 0.1882263 | 1 | 0 |
| Individual living with spouse/partner | 0.2897909 | 0.4536931 | 1 | 0 |
| Parent living with spouse/partner and child(ren) | 0.18733087 | 0.3902009 | 1 | 0 |


## Model
We will estimate the following linear regression model:

$$
Y_i = \beta_0 + \sum_{b=1}^3 \beta_{1, b} E^b_{i} + \beta_2 A_i + \sum_{b=1}^3 \sigma_{b} (E^b_{i} \times A_i) + \alpha X_i + \epsilon_i
$$

Let $i$ index the observations. $Y_i$ is satisfaction with life in general. $E_i$ is the degree of agreement with having a strong emotional bond with at least one person. In the sum $\sum_{b=1}^3 \beta_{1, b} E^b_{i}$, $E^b_{i}$ is an indicator variable equal to one if $E_i$ falls in level $b$ (e.g., “agree”); $b$ runs over the three levels other than the omitted base level, “strongly disagree”. $A_i$ is age. $E^b_{i} \times A_i$ is the interaction between emotional-bond level $b$ and age, with coefficient $\sigma_b$. We include this term because we hypothesize that the effect of emotional bond on life satisfaction depends on age, as suggested by Vandeleur et al. (2009). $X_i$ is a vector of control variables with coefficient vector $\alpha$, and $\epsilon_i$ is the error term. As mentioned above, the controls are sex and living/family arrangement; we will estimate several specifications that differ in which controls are included.
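As a rough sketch of how this specification could be estimated with `lm()`, the example below fits the interaction model on simulated data. Everything about the data is an assumption made for illustration: the object names (`sim`, `fit`, `effect_at_40`), the sample size, and the data-generating process are invented, and only sex is included as a control to keep the example short. Under the model, the difference in expected satisfaction between level $b$ and the base level at age $A$ is $\beta_{1,b} + \sigma_b A$, which the last lines compute for “Strongly agree” at age 40.

```r
set.seed(42)  # arbitrary seed so the simulated example is reproducible

bond_levels <- c("Strongly disagree", "Disagree", "Agree", "Strongly agree")
age_midpoints <- c(13, 16, 18.5, 22, 27, 32, 37, 42, 47, 52, 57, 62, 67, 72, 77, 80)
n <- 1000

# Simulated stand-in for the cleaned data (NOT the CCHS sample).
sim <- data.frame(
  emo_bond = factor(sample(bond_levels, n, replace = TRUE), levels = bond_levels),
  age = sample(age_midpoints, n, replace = TRUE),
  sex = factor(sample(c("Male", "Female"), n, replace = TRUE),
               levels = c("Male", "Female"))
)

# Made-up data-generating process, used only so that lm() has something to fit.
sim$satisfaction <- 6 + 0.5 * (as.integer(sim$emo_bond) - 1) +
  0.01 * sim$age + rnorm(n)

# emo_bond * age expands to emo_bond + age + emo_bond:age, i.e. the
# beta_{1,b}, beta_2, and sigma_b terms; "Strongly disagree" is the base level.
fit <- lm(satisfaction ~ emo_bond * age + sex, data = sim)

# Estimated difference between "Strongly agree" and the base level at age 40:
# beta_{1,b} + sigma_b * 40
b <- coef(fit)
effect_at_40 <- b[["emo_bondStrongly agree"]] + 40 * b[["emo_bondStrongly agree:age"]]
effect_at_40
```

Adding the family arrangement factor to the right-hand side (for example `+ family`) would give the fuller specification; comparing fits with and without each control is one way to run the alternative specifications.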

## References

- Greenman, P. S., & Johnson, S. M. (2022). Emotionally focused therapy: Attachment, connection, and health. Current Opinion in Psychology, 43, 146–150. https://doi.org/10.1016/j.copsyc.2021.06.015
- Nordenmark, M. (2018). The importance of job and family satisfaction for happiness among women and men in different gender regimes. Societies, 8(1), 1. https://doi.org/10.3390/soc8010001
- Vandeleur, C. L., Jeanpretre, N., Perrez, M., & Schoebi, D. (2009). Cohesion, satisfaction with family bonds, and emotional well-being in families with adolescents. Journal of Marriage and Family, 71(5), 1205–1219. https://doi.org/10.1111/j.1741-3737.2009.00664.x 

