# Progressivism and Social Media / Video Game Use and Political Engagement

Here we examine the relationships between the levels of progressivism in the respondents and their use of social media and video games.  "Progressivism" here is based on responses to six questions about various social issues and the role of government.

## Table of contents

<ol>
    <li><a href='#ProgressivismScoresDistributions'>Progressivism Scores - Distributions</a></li>
    <li><a href='#ProgressivismAssumptions'>Investigating Assumptions about Progressivism</a></li>
    <li><a href='#SocialMediaVideoGameProgressivism'>Social Media/Video Game Use and Progressivism</a></li>
    <li><a href='#Summary'>Summary</a></li>
</ol>

In [None]:
!pip install joblib

import pandas as pd
import numpy as np
import joblib
from plotnine import *

import plotly.express as px
import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
# Load in the chi2 testing results & preprocessed data
results = joblib.load('Chi2Results.pkl')
data = joblib.load('GroupedAndUngroupedData.pkl')

In [None]:
# Small utility function to help create Two Way tables visualized as heatmaps
def ShowTwoWayHeatmap (df, row, col, normalize = False, colSort=None, rowSort=None):
    """
    Little function to help with creation of heatmaps.
    @params:
        df                  - required - dataframe containing data to be mapped
        row                 - required - name of column in df to plot on the rows
        col                 - required - name of column in df to plot on the columns
        normalize           - optional - specifies whether the table should be normalized along 'index', 'columns', 'all' (or True), or neither (False)
        rowSort             - optional - list specifying how the values in the rows should be sorted
        colSort             - optional - list specifying how the values in the columns should be sorted
    """
    # Build the table from the dataframe passed in, rather than the global `data`
    ctab = pd.crosstab (df[row], df[col], normalize = normalize)
    
    if rowSort:
        ctab = ctab.loc[rowSort]
    
    if colSort:
        ctab = ctab[colSort]
    
    if normalize:
        mult = 100
        fmt = '%.2f%%'
    else:
        mult = 1
        fmt = '%.0f'

    fig = plt.figure()
    fig.set_size_inches(13,6)

    heatmap = plt.pcolor(ctab)
    for y in range (ctab.shape[0]):
        for x in range (ctab.shape[1]):
            plt.text (x + 0.5, y + 0.5, fmt % (ctab.iloc[y,x] * mult),
                       ha='center', va='center')
    plt.yticks(np.arange(len (ctab.index))+0.5, ctab.index)
    plt.xticks(np.arange(len (ctab.columns))+0.5, ctab.columns, rotation=90)
    plt.colorbar(heatmap)

    plt.title(f'{row} vs {col}')
    plt.xlabel(col)
    plt.ylabel(row)
    plt.show()


In [None]:
def ShowGroupedBar(df, x, groupVar, categoryArrayOrder=[]):
    """
    Little function to help with creation of grouped bar charts.
    @params:
        df                  - required - dataframe containing data to be mapped
        x                   - required - name of column in df to plot on the X axis
        groupVar            - required - list of columns in df to color / group the bars by
        categoryArrayOrder  - optional - ordered list indicating how the X axis should be sorted
    """
    scores = df[[x, *groupVar]].melt(id_vars=x)
    scores = scores.groupby([x, 'variable']).sum().reset_index()

    p = px.bar (scores, x=x, y='value', color='variable', barmode='group')
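    # Tidy legend names: drop any 'variable=' prefix (only present in some plotly versions), then the '_Score' suffix
    p = p.for_each_trace(lambda t: t.update(name=t.name.split('=')[-1]))
    p = p.for_each_trace(lambda t: t.update(name=t.name.split('_')[0]))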
    p = p.update_xaxes(type='category', categoryorder='array', categoryarray=categoryArrayOrder)
    p.show()

In [None]:
score_cols = [col for col in data.columns if '_Score' in col and col != 'progressivism_Score']
govt_score_cols = [col for col in score_cols if 'govt' in col]


## Progressivism Scores - Distributions <a id='ProgressivismScoresDistributions'></a>

Overall, we can see that the scores follow a fairly symmetrical distribution.  The most common score is 0, indicating neither conservative nor progressive.  However, there is a slight lean towards the conservative side.

In [None]:
p = px.histogram (data, 'progressivism_Score')

print (data.progressivism_Score.describe())
p.show()


Checking the distribution of the groups, we confirm that the slightly conservative group is the largest, followed by the slightly progressive group.
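As a rough illustration of how a composite score and its groups could be derived, here is a minimal sketch on synthetic data. It assumes that each of the six questions is scored from -2 to 2 (negative meaning conservative), that the overall score is their sum, and that groups come from fixed cut points. The column names `q1_Score`...`q6_Score` and the cut points are hypothetical; the preprocessing that actually produced `GroupedAndUngroupedData.pkl` is not shown in this notebook and may differ.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for six per-question scores, each in -2..2.
# Column names and sign convention are assumptions for illustration only.
rng = np.random.default_rng(0)
questions = [f'q{i}_Score' for i in range(1, 7)]
demo = pd.DataFrame(rng.integers(-2, 3, size=(200, 6)), columns=questions)

# Assumed composite: the sum of the six question scores (range -12..12).
demo['progressivism_Score'] = demo[questions].sum(axis=1)

# Hypothetical cut points; pd.cut uses right-inclusive intervals by default,
# so these map -12..-6, -5..-1, 0, 1..5 and 6..12 to the five labels.
bins = [-13, -6, -1, 0, 5, 12]
labels = ['Very Conservative', 'Slightly Conservative', 'Neither',
          'Slightly Progressive', 'Very Progressive']
demo['progressivism_Groups'] = pd.cut(demo['progressivism_Score'], bins=bins, labels=labels)

print(demo['progressivism_Groups'].value_counts().reindex(labels))
```

With symmetric cut points like these, a lean in the group counts reflects a lean in the underlying scores rather than an artifact of the binning.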

In [None]:
px.histogram(data.sort_values('progressivism_Score'), x='progressivism_Groups')

In [None]:
# Set up a sorting list for our heatmaps
progSort = ['Very Conservative', 'Slightly Conservative', 'Neither', 'Slightly Progressive', 'Very Progressive']

## Investigating assumptions about Progressivism <a id='ProgressivismAssumptions'></a>

### Age

One common assumption would be that people would tend to become more conservative as they get older.  The data does appear to support that assumption.  Surprisingly, however, of the younger age groups, only the teens are majority progressive.

In [None]:
p = px.histogram(data.sort_values(['progressivism_Score','age']), x='progressivism_Groups', facet_col='age_Groups')
p = p.for_each_annotation(lambda a: a.update(text=a.text.split("=")[-1]))
p = p.update_xaxes(tickangle=90)
p.show()


Looking at it another way, we can see that the largest score group in each age group except for teens is "slightly conservative".
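To check a claim like this numerically rather than by reading the heatmap, one option is to take the row-normalized crosstab and pull out the largest share per row. The sketch below uses a tiny made-up sample (the values are illustrative, not survey data) and also flags whether the largest group is a true majority (more than half of the row) or only a plurality.

```python
import pandas as pd

# Tiny made-up sample; values are illustrative only.
demo = pd.DataFrame({
    'age_Groups': ['LateTeens'] * 4 + ['Early20s'] * 5,
    'progressivism_Groups': ['Slightly Progressive', 'Slightly Progressive',
                             'Slightly Progressive', 'Neither',
                             'Slightly Conservative', 'Slightly Conservative',
                             'Neither', 'Slightly Progressive', 'Very Conservative'],
})

# normalize='index' makes each row sum to 1, i.e. shares within each age group.
shares = pd.crosstab(demo['age_Groups'], demo['progressivism_Groups'], normalize='index')

summary = pd.DataFrame({
    'largest_group': shares.idxmax(axis=1),
    'share': shares.max(axis=1),
})
# More than half of the row is a majority; anything less is only a plurality.
summary['is_majority'] = summary['share'] > 0.5
print(summary)
```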

In [None]:
ageSort = ['LateTeens', 'Early20s', 'Mid20s', 'Late20s']
ShowTwoWayHeatmap(data, 'age_Groups', 'progressivism_Groups', normalize='index', rowSort=ageSort, colSort=progSort)

We can also see from our chi-squared testing that there appears to be a statistically significant relationship between the age and progressivism groups (p ≈ 0.02).
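The test results themselves are loaded from `Chi2Results.pkl`, so the computation is not shown here. For readers who want to see what such a test involves, a minimal sketch using `scipy.stats.chi2_contingency` on random synthetic data follows. It assumes a plain Pearson chi-squared test of independence on the two-way table of counts, which may not match exactly how the stored results were produced.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

# Random synthetic categories; there is no real association in this data.
rng = np.random.default_rng(1)
demo = pd.DataFrame({
    'age_Groups': rng.choice(['LateTeens', 'Early20s', 'Mid20s', 'Late20s'], size=400),
    'progressivism_Groups': rng.choice(['Conservative', 'Neither', 'Progressive'], size=400),
})

# The test works on raw counts, not on normalized percentages.
observed = pd.crosstab(demo['age_Groups'], demo['progressivism_Groups'])
chi2, p, dof, expected = chi2_contingency(observed)

print(f'chi2 = {chi2:.2f}, dof = {dof}, p = {p:.3f}')
```

Note that the test takes counts as input: passing a table built with `normalize=...` would give misleading results.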

In [None]:
results[(results.Y=='progressivism_Groups') & (results.X=='age_Groups')]

The topic of abortion was clearly the most contentious, eliciting the strongest conservative responses.

In [None]:
scores = data[score_cols].melt()
p = px.box (scores, x='variable', y='value')
p.show()

The results by age group closely follow the results for the dataset as a whole (although teens appear to be a bit more progressive with regard to immigration).

In [None]:
for ag in ageSort:
    scores = data[(data.age_Groups==ag)][score_cols].melt()
    p = px.box (scores, x='variable', y='value', title = f'Progressivism scores by question for {ag}')
    p.show()

We can see a trend emerge around the question of immigration:  there is a steady movement towards a more conservative stance as the respondents get older.

In [None]:
ShowGroupedBar (data, 'age_Groups', score_cols, ageSort)

### Area Type

Another common assumption might be that people in rural areas / smaller towns might be more conservative. Surprisingly, respondents in small towns tend to go against that trend. Suburban areas tend to have the highest number of progressives.

In [None]:
data['USAAreaType_Groups'] = pd.Categorical (data['USAAreaType_Groups'], ['Rural', 'SmallTown', 'Suburban', 'City'])

p = px.histogram(data.sort_values(['progressivism_Score', 'USAAreaType_Groups']), x='progressivism_Groups', facet_col='USAAreaType_Groups')
p = p.for_each_annotation(lambda a: a.update(text=a.text.split("=")[-1]))
p = p.update_xaxes(tickangle=90)
p.show()

print (data.USAAreaType_Groups.value_counts())

We can see that the progressivism groups have approximately the same strength of association with area type as with age.
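One common way to put a number on "strength of association" between two categorical variables is Cramér's V, which rescales the chi-squared statistic to the range 0 to 1. The sketch below is an assumption about how strength could be measured, not necessarily the statistic stored in `results`; the helper name `cramers_v` is hypothetical, and it omits any bias correction.

```python
import numpy as np
import pandas as pd

def cramers_v(table: pd.DataFrame) -> float:
    """Cramér's V for a contingency table of counts (no bias correction)."""
    observed = table.to_numpy(dtype=float)
    n = observed.sum()
    # Expected counts under independence: (row total * column total) / n
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / n
    chi2 = ((observed - expected) ** 2 / expected).sum()
    k = min(observed.shape) - 1
    return float(np.sqrt(chi2 / (n * k)))

# Two toy tables: one with no association, one with perfect association.
independent = pd.DataFrame([[10, 10], [10, 10]])
perfect = pd.DataFrame([[10, 0], [0, 10]])
print(cramers_v(independent), cramers_v(perfect))
```

Because V does not depend on sample size the way a p-value does, it is better suited to comparing associations across variables with different numbers of respondents or categories.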

In [None]:
results[(results.Y=='progressivism_Groups') & (results.X=='USAAreaType_Groups')]

Looking at it another way, we see that the data does appear to support one assumption:  respondents in rural areas are mostly conservative.  Also as we might expect, the highest concentration of very conservative responses is in the rural and small town areas, while the highest concentration of very progressive responses is in suburbs and cities.  However, the slightly conservative group is in the majority in almost all area types (except small towns).

In [None]:
areaSort = ['Rural', 'SmallTown', 'Suburban', 'City']
ShowTwoWayHeatmap (data, 'progressivism_Groups', 'USAAreaType_Groups', normalize='columns', rowSort=progSort, colSort=areaSort)

In [None]:
# Reuse the area ordering defined above instead of repeating the list
for area in areaSort:
    scores = data[(data.USAAreaType_Groups==area)][score_cols].melt()
    p = px.box (scores, x='variable', y='value', title = f'Progressivism scores by question for {area}')
    p.show()

Here we can see clearly that the questions that asked about the role of government elicited the most conservative responses, while the others tended more progressive. Only the question about helping vulnerable people elicits a slightly progressive response from respondents in cities.

In [None]:
ShowGroupedBar (data, 'USAAreaType_Groups', score_cols)

In [None]:
ShowGroupedBar (data, 'USAAreaType_Groups', govt_score_cols)

### Political Affiliation

We'd also expect to see a strong relationship between political party and progressivism in the US, and there does appear to be a strong association between the two.

In [None]:
results[(results.X=='USAPoliticalParty_Groups') & (results.Y=='progressivism_Groups')]

As we'd expect, most people who identify as Democrat are slightly progressive, whereas most people who identify as Republican are either slightly or very conservative.

In [None]:
ShowTwoWayHeatmap(data, 'USAPoliticalParty_Groups', 'progressivism_Groups', normalize='index', colSort=progSort)

As expected, Republicans tend to hold more conservative views overall, especially regarding the role of government.  However, even Democrats tend slightly conservative when it comes to the question of poor people's dependence on the government for help.  Somewhat surprisingly, Democrats even trend conservative on the question of the use of government regulation.

In [None]:
ShowGroupedBar (data, 'USAPoliticalParty_Groups', score_cols)

In [None]:
ShowGroupedBar (data, 'USAPoliticalParty_Groups', govt_score_cols)

## Social Media / Video Game Use and Progressivism <a id='SocialMediaVideoGameProgressivism'></a>

### Social Media as a source of news

There appears to be a statistically significant association between the use of Social Media as a source for news and progressivism.

In [None]:
results[(results.Y=='progressivism_Groups') & (results.X == 'USAPoliticalNewsSourceSocialMedia_Groups')]

Those who used Social Media as their source of news were mostly conservative, whereas those who didn't were mostly progressive.

In [None]:
ShowTwoWayHeatmap (data, 'progressivism_Groups', 'USAPoliticalNewsSourceSocialMedia_Groups', rowSort=progSort, normalize='columns')

The responses to the questions seem largely consistent between those who use social media as a source of political news vs those who don't, with one notable exception:  The question regarding the government helping those who are most vulnerable.  That question tends to skew more progressive for those who do use social media in this way.

In [None]:
ShowGroupedBar (data, 'USAPoliticalNewsSourceSocialMedia_Groups', score_cols)

Turning to our central questions, we can see that the amount of Facebook use and of video game use both seem to have a statistically significant association with progressivism, but Twitter use does not.
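Reading several test results at once is easier with an explicit significance flag. The sketch below shows one way to do that on a stand-in table with the same `X`, `Y` and `p` columns that `results` exposes; the predictor names and p-values are made up for illustration, and the 0.05 threshold is an assumed convention rather than something fixed by this analysis.

```python
import pandas as pd

# Stand-in for the results table; names and p-values are made up.
results_demo = pd.DataFrame({
    'X': ['predictor_a', 'predictor_b', 'predictor_c', 'predictor_d'],
    'Y': ['progressivism_Groups'] * 4,
    'p': [0.010, 0.030, 0.400, 0.020],
})

alpha = 0.05  # assumed significance threshold

# Add a boolean flag and list the strongest evidence first.
flagged = (results_demo
           .assign(significant=results_demo['p'] < alpha)
           .sort_values('p')
           .reset_index(drop=True))
print(flagged)
```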

In [None]:
results[(results.X.isin(['facebookUseAmount_Groups','facebookDailyRoutine_Groups','twitterUseAmount_Groups','videoGameUseFrequency_Groups'])) & (results.Y=='progressivism_Groups')]

### Video Game Use

The relationship between video game use and progressivism is not entirely clear.  Similar to the overall distribution, we can see that slightly conservative tends to be the majority for most frequency types.  The biggest deviation from that trend is in the "multiple times per day" group, which has a solid progressive majority.

In [None]:
freqSort = ['Never', '<1PerMonth', '1PerMonth', '2-3PerMonth', '1PerWeek', '2-3PerWeek', 'Daily', 'MultipleTimesPerDay']

ShowTwoWayHeatmap(data, 'videoGameUseFrequency_Groups', 'progressivism_Groups', normalize='index', rowSort=freqSort, colSort=progSort)

One interesting trend appears around the question of whether the government should help vulnerable groups even if it means going into debt:  respondents who play video games more often appear to be more receptive to this idea.
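A trend like this can be checked by averaging the question score within each frequency group, with the groups kept in their natural order. The sketch below does that on a small made-up sample; the column name `govtHelpVulnerable_Score`, the shortened frequency list and the values are all assumptions for illustration.

```python
import pandas as pd

freq_order = ['Never', '1PerMonth', '1PerWeek', 'Daily']  # shortened, illustrative
demo = pd.DataFrame({
    'videoGameUseFrequency_Groups': ['Daily', 'Never', '1PerWeek', 'Never',
                                     '1PerMonth', 'Daily', '1PerWeek', '1PerMonth'],
    # Hypothetical column name and made-up scores in -2..2
    'govtHelpVulnerable_Score': [2, -1, 1, -2, 0, 1, 0, -1],
})

# An ordered categorical keeps the groups in frequency order, not alphabetical.
demo['videoGameUseFrequency_Groups'] = pd.Categorical(
    demo['videoGameUseFrequency_Groups'], categories=freq_order, ordered=True)

trend = (demo.groupby('videoGameUseFrequency_Groups', observed=False)
             ['govtHelpVulnerable_Score']
             .agg(['mean', 'count']))
print(trend)
```

The mean is used here rather than the sum that `ShowGroupedBar` plots, so that groups with more respondents do not look more extreme simply because they are larger. The `count` column shows how many responses each mean rests on.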

In [None]:
ShowGroupedBar (data, 'videoGameUseFrequency_Groups', score_cols, freqSort)

In [None]:
ShowGroupedBar (data, 'videoGameUseFrequency_Groups', govt_score_cols, freqSort)

In [None]:
ShowTwoWayHeatmap (data, 'govtHelpVulnerable_Groups', 'videoGameUseFrequency_Groups', colSort=freqSort, normalize='columns')

### Video Game Usage - type and activities

Here we examine the relationship between progressivism and the types of video games played and the types of activities performed in video games.

We can see that there are fairly strong associations with most questions, except for the Single Player and Type of Game most frequently played.

In [None]:
vgCols = ['videoGamePlayingGamer_Groups',
          'videoGamePlayingHelpOthers_Groups',
          'videoGamePlayingLearnSocietyProblems_Groups',
          'videoGamePlayingMoralEthicalIssues_Groups',
          'videoGameTypeMultiplayerPVP_Groups',
          'videoGameTypeMultiplayerCoop_Groups',
          'videoGameTypeSingleplayer_Groups',
          'videoGameTypeMostFrequent_Groups']

results[(results.Y=='progressivism_Groups') & (results.X.isin(vgCols))].sort_values('p')

These questions asked if respondents considered moral/ethical issues, helped others, or learned about society's problems while playing games.  In each case, we can see that those who responded that they did tended to be mostly on the slightly conservative side.  However, those who did not play video games ("NotMe") tended towards slightly progressive.

In [None]:
ShowTwoWayHeatmap (data, 'progressivism_Groups', 'videoGamePlayingMoralEthicalIssues_Groups', rowSort=progSort, normalize='columns')
ShowTwoWayHeatmap (data, 'progressivism_Groups', 'videoGamePlayingHelpOthers_Groups', rowSort=progSort, normalize='columns')
ShowTwoWayHeatmap (data, 'progressivism_Groups', 'videoGamePlayingLearnSocietyProblems_Groups', rowSort=progSort, normalize='columns')

Broken out by question, we can see a pattern emerge:  Those who responded that they did perform these activities while playing video games tended to have a much more progressive view on the role of government helping vulnerable people.

In [None]:
ShowGroupedBar (data, 'videoGamePlayingMoralEthicalIssues_Groups', score_cols)
ShowGroupedBar (data, 'videoGamePlayingHelpOthers_Groups', score_cols)
ShowGroupedBar (data, 'videoGamePlayingLearnSocietyProblems_Groups', score_cols)

When asked if they played video games where they played against other players, those who responded Yes tended to be more on the conservative side than those who responded no.

In [None]:
ShowTwoWayHeatmap(data, 'progressivism_Groups', 'videoGameTypeMultiplayerPVP_Groups', rowSort=progSort, normalize='columns')

We also see that those who responded that they did play these types of games tended to have a slightly more progressive view on the role of government with regards to helping vulnerable people.

In [None]:
ShowGroupedBar (data, 'videoGameTypeMultiplayerPVP_Groups', score_cols)

### Facebook usage

Regarding Facebook usage: It appears that, similar to the overall distribution, slightly conservative tends to be the majority group. However, at both extremes of usage (> 90 minutes per day in the past week or "n/a", which we could interpret as not using Facebook), the balance shifts a bit.

In [None]:
freqSort = ['N/A', 'NotInLastWeek', '<10Minutes', '10-30Minutes','31-60Minutes', '61-90Minutes', '>90Minutes']

ShowTwoWayHeatmap (data, 'facebookUseAmount_Groups', 'progressivism_Groups', normalize='index', rowSort=freqSort, colSort=progSort)

People across the spectrum of progressivism appear to use Facebook as part of their daily routine.  However, we can see that the majority is much greater at the extreme ends of the spectrum.  That is, a large number of people who identify as very progressive or very conservative use Facebook as part of their daily routine (more so than people closer to the middle of the spectrum).

In [None]:
ShowTwoWayHeatmap (data, 'facebookDailyRoutine_Groups', 'progressivism_Groups', colSort=progSort, normalize='columns')

In [None]:
ShowGroupedBar (data, 'facebookUseAmount_Groups', score_cols, freqSort)

It appears that people on the extreme ends of the Facebook use spectrum have a more progressive stance towards the government helping vulnerable people.

In [None]:
ShowGroupedBar (data, 'facebookUseAmount_Groups', govt_score_cols, freqSort)

In [None]:
ShowGroupedBar (data, 'facebookDailyRoutine_Groups', score_cols)

Comparing those who do consider Facebook part of their daily routine vs those who do not:

We see that, although the distributions are approximately the same, those who do consider Facebook part of their daily routine tend to have scores closer to the ends of the spectrum (2 or -2) for the questions about poor people's dependence on the government and gay marriage.
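One way to quantify "closer to the ends of the spectrum" is to compare the share of responses scored exactly 2 or -2 between the two routine groups. The sketch below does this on a small made-up sample; the column name `gayMarriage_Score` and the values are assumptions, while the `neg` / `pos` labels follow the ones used for `facebookDailyRoutine_Groups`.

```python
import pandas as pd

# Made-up sample; 'gayMarriage_Score' is a hypothetical column name.
demo = pd.DataFrame({
    'facebookDailyRoutine_Groups': ['pos', 'pos', 'pos', 'pos',
                                    'neg', 'neg', 'neg', 'neg'],
    'gayMarriage_Score': [2, -2, 2, 1, 1, 0, -1, 2],
})

# True where the response sits at either end of the -2..2 scale.
extreme = demo['gayMarriage_Score'].abs() == 2

# The mean of a boolean column is the proportion of True values per group.
share_extreme = extreme.groupby(demo['facebookDailyRoutine_Groups']).mean()
print(share_extreme)
```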

In [None]:
routines = ['neg', 'pos']

for routine in routines:
    scores = data[(data.facebookDailyRoutine_Groups==routine)][score_cols].melt()
    p = px.box (scores, x='variable', y='value', title = f'Progressivism scores by question for {routine}')
    p.show()

### Facebook usage - politically-focused activities

There is a statistically significant association between progressivism and using Facebook for various politically-focused activities.

In [None]:
results[(results.Y=='progressivism_Groups') & (results.X.isin(['facebookPostPoliticalLinks_Groups','facebookEncouragePoliticalAction_Groups','facebookEncourageVote_Groups' ]))]

In each case, we can see that those who responded that they used Facebook for various political activities tended to be on the conservative side.

In [None]:
ShowTwoWayHeatmap (data, 'progressivism_Groups', 'facebookEncouragePoliticalAction_Groups', rowSort=progSort, normalize='columns')

In [None]:
ShowTwoWayHeatmap (data, 'progressivism_Groups', 'facebookEncourageVote_Groups', rowSort=progSort, normalize='columns')

In [None]:
ShowTwoWayHeatmap (data, 'progressivism_Groups', 'facebookPostPoliticalLinks_Groups', rowSort=progSort, normalize='columns')

We see a similar trend for each activity:  those who reported using Facebook for the specified activity tended to have more progressive views on the government helping vulnerable people.  Responses to the other questions otherwise seemed similar.

In [None]:
ShowGroupedBar (data, 'facebookEncouragePoliticalAction_Groups', score_cols)

In [None]:
ShowGroupedBar (data, 'facebookEncourageVote_Groups', score_cols)

In [None]:
ShowGroupedBar (data, 'facebookPostPoliticalLinks_Groups', score_cols)

In [None]:
twitterPolCols = ['twitterUseReadNewsPolitics_Groups', 'twitterUseShareNewsPolitics_Groups', 'twitterUseDiscussNewsPolitics_Groups']

### Twitter Usage - politically-focused activities

Although the frequency of Twitter usage did not have a strong association with progressivism, the various politically-focused activities seem to.

In [None]:
results[(results.Y=='progressivism_Groups')& (results.X.isin(twitterPolCols))]

Similar to Facebook, those who used Twitter for political activities tended to be on the conservative side.

In [None]:
for col in twitterPolCols:
    ShowTwoWayHeatmap (data, 'progressivism_Groups', col, rowSort=progSort, normalize='columns')

We see a similar trend here as with Facebook usage; those who reported using Twitter for the specified activity tended to have more progressive views on the government helping vulnerable people.  Responses to the other questions otherwise seemed similar.

In [None]:
for col in twitterPolCols:
    ShowGroupedBar (data, col, score_cols)


## Summary <a id='Summary'></a>

1) In general, people were hesitant to strongly agree or disagree with any particular opinion.  Most responses ranged between
   agree, neutral and disagree.  There was overall a slight skew towards conservative. 
   
2) The topics that seemed to elicit the most conservative responses had to do with the role of government, especially around
   the use of government regulations and providing services to the poor.  Meanwhile, most respondents seemed to be generally in favor of legalizing gay marriage.  Opinions on abortion were very polarized.
   
3) In general, those who used social media as a source of political news or for political activities tended to be
   more conservative than those who did not.

4) Similarly, those who played video games and used social media for political activities tended to be more conservative than 
   those who did not.  However, the question regarding the role of government in helping vulnerable people often bucked that 
   trend.