## Descriptive analysis of FB survey demographics

This notebook runs on the full FB survey data (n=1777). Some analyses run on subsets of this data; for example, respondents under 18 are excluded below, leaving n=1773. <br>
Author: Rosalynn Yang <br>
Date: 11/25/2020

In [1]:
import numpy as np
import pandas as pd

import utility as util

In [2]:
fb = pd.read_csv("../output/fb_numeric.csv")

In [3]:
fb.shape

(1777, 43)

In [4]:
# manually exclude those <18 yrs old
fb = fb.drop(fb[fb['Q11'] == 1].index)

In [5]:
fb.shape

(1773, 43)
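The exclusion above drops rows by index. A minimal sketch on toy data (not the survey file) suggests that a boolean mask gives the same result, and that rows with a missing `Q11` are kept either way, which matters for the missing counts reported later. The toy values and the reading of code 1 as "under 18" are assumptions taken from the comment in the cell above.

```python
import numpy as np
import pandas as pd

# Toy data only: Q11 code 1 is assumed to mean "under 18".
toy = pd.DataFrame({"Q11": [1, 2, np.nan, 7, 1]})

# Approach used in the notebook: drop by index.
dropped = toy.drop(toy[toy["Q11"] == 1].index)

# Equivalent boolean mask; NaN != 1 evaluates to True, so missing ages are retained.
masked = toy[toy["Q11"] != 1]

same = dropped.equals(masked)
n_missing_kept = int(masked["Q11"].isna().sum())
print(same, len(masked), n_missing_kept)
```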

#### Recode: (1) gender

Recode Q9 responses 3, 4, 5, and 6 as missing; <br> {male: 1, female: 0} (same as codes used in UAS)

In [6]:
fb['gender'] = np.nan
fb.loc[fb['Q9']==1, ['gender']] = 0
fb.loc[fb['Q9']==2, ['gender']] = 1

In [7]:
fb['gender'].value_counts(dropna=False).sort_index()

0.0    790
1.0    503
NaN    480
Name: gender, dtype: int64
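The same recoding pattern (listed codes mapped to new values, everything else set to missing) can also be written with `Series.map` and a dictionary. This is a sketch on toy data, not a replacement for the cell above; the toy `Q9` values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Toy responses; codes outside the dictionary (3, 6) and NaN should end up missing.
toy = pd.DataFrame({"Q9": [1, 2, 3, np.nan, 2, 6]})

# Same mapping as the notebook cell: Q9 == 1 -> 0, Q9 == 2 -> 1.
gender_codes = {1: 0, 2: 1}

# Series.map returns NaN for any value that is not a key of the dictionary.
toy["gender"] = toy["Q9"].map(gender_codes)

print(toy["gender"].value_counts(dropna=False).sort_index())
```

A dictionary with several keys pointing to the same value (for example `{2: 1, 3: 1, 4: 2, 5: 2, 6: 2, 7: 3}`) would cover the grouped recodes used for age and education in the same way.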

#### Recode: (2) marital status

Recode Q10 response 3 as missing; <br> {married/living with a partner: 1, not married/not living with a partner: 0}

In [8]:
fb['married'] = np.nan
fb.loc[fb['Q10']==1, ['married']] = 1
fb.loc[fb['Q10']==2, ['married']] = 0

In [9]:
fb['married'].value_counts(dropna=False).sort_index()

0.0    434
1.0    891
NaN    448
Name: married, dtype: int64

#### Recode: (3) age group

{19-35 yrs old: 1, 36-65 yrs old: 2, >=66 yrs old: 3}

In [10]:
fb['age_group'] = np.nan
fb.loc[np.isin(fb['Q11'], [2,3]), ['age_group']] = 1
fb.loc[np.isin(fb['Q11'], [4,5,6]), ['age_group']] = 2
fb.loc[fb['Q11']==7, ['age_group']] = 3

In [11]:
fb['age_group'].value_counts(dropna=False).sort_index()

1.0     83
2.0    768
3.0    485
NaN    437
Name: age_group, dtype: int64

#### Recode: (4) education level

{HS grad or less: 1, Some college: 2, BA or above: 3}

In [12]:
fb['education'] = np.nan
fb.loc[np.isin(fb['Q12'], [1,2]), ['education']] = 1
fb.loc[fb['Q12']==3, ['education']] = 2
fb.loc[np.isin(fb['Q12'], [4,5]), ['education']] = 3

In [13]:
fb['education'].value_counts(dropna=False).sort_index()

1.0     108
2.0     244
3.0     297
NaN    1124
Name: education, dtype: int64
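The `NaN` counts reported above differ a lot between variables, so it may help to express them as shares of the n=1773 analysis sample. The sketch below only reuses the counts printed in this notebook; the variable names `n_total`, `n_missing`, and `missing_pct` are illustrative.

```python
import pandas as pd

# Rows remaining after excluding respondents under 18.
n_total = 1773

# NaN counts copied from the value_counts outputs above.
n_missing = pd.Series(
    {"gender": 480, "married": 448, "age_group": 437, "education": 1124}
)

# Share of the sample with a missing value, in percent.
missing_pct = (n_missing / n_total * 100).round(1)
print(missing_pct)
```

On the real data, `fb[demo_cols].isna().mean() * 100` should give the same shares once `demo_cols` is defined.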

#### Crosstabs and chi-squared tests

Check demographic distributions across the five image conditions <br>
{Control: 1, COVID: 2, Data Privacy: 3, Finance: 4, Mental Health: 5}

In [14]:
fb['Image'].value_counts().sort_index()

1    466
2    747
3     76
4    183
5    301
Name: Image, dtype: int64

In [15]:
demo_cols = ['gender', 'married', 'age_group', 'education']
col_names = ['Control', 'COVID', 'Data Privacy', 'Finance', 'Mental Health']

In [16]:
for col in demo_cols:
    util.crosstab_chisq(col, 'Image', fb, col_names)

#### Crosstab of gender and Image

| gender | Control | COVID | Data Privacy | Finance | Mental Health |
|---|---|---|---|---|---|
| 0.0 | 47.9 | 67.3 | 61.0 | 44.8 | 77.8 |
| 1.0 | 52.1 | 32.7 | 39.0 | 55.2 | 22.2 |
| Total n | 334.0 | 559.0 | 59.0 | 143.0 | 198.0 |


*Chi-squared statistic = 72.6, degrees of freedom = 4, p < 0.001*

-----

#### Crosstab of married and Image

| married | Control | COVID | Data Privacy | Finance | Mental Health |
|---|---|---|---|---|---|
| 0.0 | 33.9 | 27.6 | 42.6 | 28.1 | 45.2 |
| 1.0 | 66.1 | 72.4 | 57.4 | 71.9 | 54.8 |
| Total n | 342.0 | 568.0 | 61.0 | 146.0 | 208.0 |


*Chi-squared statistic = 25.7, degrees of freedom = 4, p < 0.001*

-----

#### Crosstab of age_group and Image

| age_group | Control | COVID | Data Privacy | Finance | Mental Health |
|---|---|---|---|---|---|
| 1.0 | 2.3 | 6.1 | 6.6 | 4.8 | 13.8 |
| 2.0 | 42.0 | 74.0 | 45.9 | 38.6 | 53.8 |
| 3.0 | 55.7 | 19.9 | 47.5 | 56.6 | 32.4 |
| Total n | 343.0 | 577.0 | 61.0 | 145.0 | 210.0 |


*Chi-squared statistic = 178.4, degrees of freedom = 8, p < 0.001*

-----

#### Crosstab of education and Image

| education | Control | COVID | Data Privacy | Finance | Mental Health |
|---|---|---|---|---|---|
| 1.0 | 19.0 | 19.3 | 7.4 | 9.8 | 12.6 |
| 2.0 | 45.8 | 37.0 | 51.9 | 29.3 | 28.7 |
| 3.0 | 35.3 | 43.7 | 40.7 | 61.0 | 58.6 |
| Total n | 153.0 | 300.0 | 27.0 | 82.0 | 87.0 |


*Chi-squared statistic = 25.3, degrees of freedom = 8, p = 0.001*

-----
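The source of `util.crosstab_chisq` is not included in this notebook. The sketch below is one way such a helper might be written, assuming it reports column percentages with a `Total n` row (the tables above sum to 100 within each condition) and runs a Pearson chi-squared test on the underlying counts. The function name `crosstab_chisq_sketch`, its return values, the toy data, and the use of `scipy.stats.chi2_contingency` are all assumptions, not the author's implementation.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency


def crosstab_chisq_sketch(row, col, df, col_names):
    """Hypothetical stand-in for util.crosstab_chisq (assumed behavior)."""
    # Counts of row categories within each column category; rows with NaN are excluded.
    counts = pd.crosstab(df[row], df[col])
    # Pearson chi-squared test of independence on the raw counts.
    chi2, p, dof, _ = chi2_contingency(counts)
    # Column percentages, plus the number of non-missing cases per column.
    pct = (counts / counts.sum(axis=0) * 100).round(1)
    pct.loc["Total n"] = counts.sum(axis=0)
    # Assumes every condition appears in the data, in sorted code order.
    pct.columns = col_names
    return pct, chi2, dof, p


# Toy data: 40 non-missing cases per condition, with a varying share coded 0.
rows = []
for image in [1, 2, 3, 4, 5]:
    n_zero = 10 + 5 * image
    rows += [(image, 0.0)] * n_zero + [(image, 1.0)] * (40 - n_zero)
rows.append((1, np.nan))  # one missing value, dropped from the crosstab
toy = pd.DataFrame(rows, columns=["Image", "gender"])

names = ["Control", "COVID", "Data Privacy", "Finance", "Mental Health"]
pct, chi2, dof, p = crosstab_chisq_sketch("gender", "Image", toy, names)
print(pct)
print(f"Chi-squared statistic = {chi2:.1f}, degrees of freedom = {dof}, p = {p:.3g}")
```

Printing `p` with a format such as `.3g`, or reporting it as `p < 0.001` below a threshold, avoids very small p-values being displayed as `0.0`.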