# Exploring Political Values Based on Reaction to Nike's Kaepernick Campaign

## Introduction

One of the most important aspects of sports is the gear worn by athletes.
Historically, sports apparel and the brands that make it have served two functions:
* Improve performance - often through innovative materials
* Express the player's personality - often through design elements such as the use of color or a symbol (for example, the Jordan Jumpman)

More recently, a new function for sports apparel appears to be emerging: the expression of political values.

This is most evident in Nike's decision to publicly support Colin Kaepernick, who knelt during the national anthem at NFL games to protest police brutality in America. The issue is sharply divided along political lines, and the company's endorsement was publicly criticized on Twitter by Donald Trump, the Republican President of the United States at the time.

## Goal

This notebook explores the tweets made on the day Nike announced its endorsement of Colin Kaepernick (September 7, 2018). The goal is to apply a series of data science techniques to see whether the tweets yield insights that can inform Nike's brand strategy in the United States.




## 1. Upload Dataset

First, we load the [dataset](https://www.kaggle.com/eliasdabbas/5000-justdoit-tweets-dataset) of Twitter data related to Nike's Kaepernick campaign, which is hosted on Kaggle.

Run the following notebook to extract the relevant tweet information from the Kaggle download and to assign a state and a state-based political value to each tweet.



[1-CreateDataset.ipynb](./1-CreateDataset.ipynb)

This notebook saves the dataset in the file `./intermediate_data/cleaned_tweet_data.json`.
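
To make the state-assignment step concrete, here is a minimal sketch of how a state and a state-based political value could be attached to a tweet. It is not taken from `1-CreateDataset.ipynb`: the helper names `extract_state` and `label_tweet`, the idea of parsing the user's free-text location, the four-state lookup tables, and the use of the 2016 presidential election are all assumptions made for illustration.

```python
import re
from typing import Optional, Tuple

# Hypothetical, deliberately incomplete lookup: state abbreviation -> party that
# won the state's electoral votes. Using the 2016 election is an assumption.
STATE_PARTY = {
    "CA": "Democratic",
    "NY": "Democratic",
    "TX": "Republican",
    "FL": "Republican",
}

# Hypothetical full-name table covering the same four states.
STATE_NAMES = {"california": "CA", "new york": "NY", "texas": "TX", "florida": "FL"}


def extract_state(location: str) -> Optional[str]:
    """Return a state abbreviation found in a free-text location, or None."""
    if not location:
        return None
    # Case 1: a trailing "City, ST" style abbreviation.
    match = re.search(r",\s*([A-Z]{2})\s*$", location)
    if match and match.group(1) in STATE_PARTY:
        return match.group(1)
    # Case 2: a full state name anywhere in the string.
    lowered = location.lower()
    for name, abbrev in STATE_NAMES.items():
        if name in lowered:
            return abbrev
    return None


def label_tweet(location: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (state, party) for a tweet's location; both None if no state is found."""
    state = extract_state(location)
    party = STATE_PARTY.get(state) if state else None
    return state, party


for loc in ["Austin, TX", "Los Angeles, California", "Earth"]:
    print(loc, "->", label_tweet(loc))
```

Tweets whose location cannot be resolved get `(None, None)` here; whether such tweets are dropped or kept in the real dataset is not something this sketch can answer.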


## 2. Explore Data

Raw tweet text cannot be used directly as input to a machine learning algorithm. Here we use scikit-learn's `TfidfVectorizer` to compute a fixed-size feature vector for each tweet, and then cluster the tweets into 5 groups.

Run the following notebook to create and explore the clusters.

[2-ExploreData.ipynb](./2-ExploreData.ipynb)
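
As a rough illustration of the vectorize-then-cluster idea, the sketch below turns a handful of tweets into TF-IDF vectors and groups them. The example tweets are made up, and the choice of `KMeans` with five clusters, English stop-word removal, and the fixed random seed are assumptions; the notebook may use a different clustering algorithm or settings.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Made-up stand-ins for tweet text; the real input is the cleaned dataset.
tweets = [
    "Just bought new Nike shoes to support Kaepernick #justdoit",
    "Burning my Nike gear today, never buying again",
    "Believe in something, even if it means sacrificing everything",
    "Nike stock falls after the Kaepernick ad announcement",
    "Proud of Nike for standing with Kaepernick #justdoit",
    "Boycott Nike, respect the flag and the anthem",
    "That Nike ad gave me chills, believe in something",
    "Nike shares drop as investors react to the ad",
    "Cutting the swoosh off my socks, boycott Nike",
    "New Nike running shoes feel great #justdoit",
]

# Each tweet becomes one row of a sparse matrix with one column per term.
vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(tweets)

# Assumed clustering step: k-means with five clusters.
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

# Recover the column -> term mapping from the fitted vocabulary.
terms = sorted(vectorizer.vocabulary_, key=vectorizer.vocabulary_.get)

# Show the three highest-weighted terms of each cluster centroid.
order = kmeans.cluster_centers_.argsort(axis=1)[:, ::-1]
for cluster_id in range(kmeans.n_clusters):
    top_terms = [terms[i] for i in order[cluster_id, :3]]
    print(cluster_id, top_terms)
```

Reading the top centroid terms is one simple way to get a feel for what each cluster is about before looking at individual tweets.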

## 3. Fit a Model and Predict Favoriting


Next, we fit a classification model on the same `TfidfVectorizer` features to see whether the text of a tweet can predict whether it was "favorited".

Run the following notebook to fit a machine learning model on a training set and evaluate its performance on a test set.

[3-FitPredictFavoriting.ipynb](./3-FitPredictFavoriting.ipynb)
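
The fit-and-evaluate workflow could look roughly like the sketch below. The tweets and favorite counts are made up, and several choices are assumptions rather than the notebook's actual setup: treating "favorited" as a favorite count above zero, using `LogisticRegression` as the classifier, and holding out 25% of the tweets as the test set.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Made-up tweets paired with made-up favorite counts.
texts = [
    "Proud of Nike for this ad #justdoit",
    "Believe in something #justdoit",
    "Nike just earned a customer for life",
    "This campaign is brilliant marketing",
    "Standing with Kaepernick and Nike",
    "Best ad of the year, well done Nike",
    "ok",
    "nike",
    "what is going on today",
    "so many tweets about shoes",
    "just checking the hashtag",
    "link in bio",
]
favorite_counts = [12, 3, 7, 1, 25, 4, 0, 0, 0, 0, 0, 0]

# Assumed target definition: 1 if the tweet received at least one favorite.
y = [int(count > 0) for count in favorite_counts]

# Stratify so both classes appear in the training and the test split.
X_train, X_test, y_train, y_test = train_test_split(
    texts, y, test_size=0.25, stratify=y, random_state=0
)

# The pipeline fits the vectorizer on the training tweets only, so no
# vocabulary or document-frequency information leaks from the test set.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Test accuracy: {accuracy:.2f}")
```

Wrapping the vectorizer and the classifier in one pipeline keeps the test set untouched until prediction time, which matters for an honest performance estimate.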




## 4. Fit a Model and Predict Home State Electoral College Results

Finally, we fit a classification model on the same `TfidfVectorizer` features to see whether the text of a tweet can predict whether it came from a state whose electoral votes went to the Republican or the Democratic candidate.

Run the following notebook to fit a machine learning model on a training set and evaluate its performance on a test set.





[4-FitPredictECResults.ipynb](./4-FitPredictECResults.ipynb)
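
When the two classes are unevenly represented, raw accuracy can look better than it is, so it helps to compare the model with a trivial baseline. The sketch below does that on made-up tweets with made-up state labels; the `LogisticRegression` classifier, the `DummyClassifier` majority-class baseline, and the 25% test split are assumptions for illustration, not the notebook's actual configuration.

```python
from sklearn.dummy import DummyClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Made-up tweets with made-up labels for the sender's home state result.
texts = [
    "Proud of Nike for standing with Kaepernick",
    "Believe in something #justdoit",
    "Buying more Nike this weekend",
    "Great ad, Nike gets it",
    "Respect to Nike for taking a stand",
    "Love the new campaign #justdoit",
    "This ad is powerful",
    "Nike on the right side of history",
    "Boycott Nike, respect the anthem",
    "Done with Nike, burning my shoes",
    "Stand for the flag, not for Nike",
    "Never buying Nike again",
]
labels = ["Democratic"] * 8 + ["Republican"] * 4

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=0
)

# Text classifier on TF-IDF features.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
model_accuracy = accuracy_score(y_test, model.predict(X_test))

# Baseline that ignores the text and always predicts the majority class.
baseline = make_pipeline(TfidfVectorizer(), DummyClassifier(strategy="most_frequent"))
baseline.fit(X_train, y_train)
baseline_pred = baseline.predict(X_test)
baseline_accuracy = accuracy_score(y_test, baseline_pred)

print(f"Model accuracy:    {model_accuracy:.2f}")
print(f"Baseline accuracy: {baseline_accuracy:.2f}")
```

A model is only informative here to the extent that it beats the baseline on the test set.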

## Version and Hardware Information

```python
%load_ext watermark
%watermark -v -m -p ipywidgets,matplotlib,numpy,pandas,sklearn
```

```
Python implementation: CPython
Python version       : 3.7.4
IPython version      : 7.8.0

ipywidgets: 7.5.1
matplotlib: 3.3.2
numpy     : 1.17.2
pandas    : 0.25.1
sklearn   : 0.24.2

Compiler    : Clang 4.0.1 (tags/RELEASE_401/final)
OS          : Darwin
Release     : 19.6.0
Machine     : x86_64
Processor   : i386
CPU cores   : 16
Architecture: 64bit
```



---

**Author:** [Nick Capaldini](mailto:nickcaps@umich.edu), University of Michigan, January 19, 2022

---