# Mass Mobilization Project
<img style="float: right;" src="../images/man_with_hammer.png">

#### 05. Import built models to predict responses with five United States protests

Models in this notebook all use a Sklearn MultiOutputClassifier.

One model uses a RandomForestClassifier, and one uses LogisticRegression.

It reads in pickled models that were exported from other notebooks, and runs them against five manually created U.S protests.

In [16]:
import pandas as pd
import gzip
import pickle

In [17]:
X = pd.read_csv('../data/x_sample.csv')
X.head()

Unnamed: 0,protestnumber,protesterviolence,pop_total,pop_density,prosperity_2020,region_Africa,region_Asia,region_Central America,region_Europe,region_MENA,...,protester_id_type_victims_families,protester_id_type_women,protester_id_type_workers_unions,labor_wage_dispute,land_farm_issue,police_brutality,political_behavior_process,price increases_tax_policy,removal_of_politician,social_restrictions
0,1,0.0,73030.879,244.93,56.900724,0,1,0,0,0,...,0,0,0,0,0,0,1,0,0,0
1,2,1.0,3862.998,71.024,44.029639,1,0,0,0,0,...,0,0,0,0,0,0,1,0,0,0
2,1,0.0,927403.866,311.922,53.640353,0,1,0,0,0,...,0,0,0,0,0,0,1,0,0,0
3,14,0.0,4058.131,58.907,80.176044,0,0,0,1,0,...,0,0,0,0,0,0,1,0,0,0


#### Load model from pickle file

In [18]:
# load the model from disk

# The Model from MultiOutputClassifier was around 90MB, but zipped down to 20MB. Use gzip to unzip the pickle
f = gzip.open('../models/01_multi_label_forest_smaller.pickle_zip','rb')
multi_label_forest_model = pickle.load(f)
f.close()

In [19]:
multi_label_forest_model

MultiOutputClassifier(estimator=RandomForestClassifier(max_depth=30,
                                                       max_leaf_nodes=100,
                                                       min_impurity_decrease=0,
                                                       min_samples_leaf=5,
                                                       min_samples_split=10,
                                                       n_estimators=20))

In [20]:
result = multi_label_forest_model.predict(X)

print(result)

[[0 0 0 0 1 0 0]
 [0 0 0 1 0 0 0]
 [0 0 0 0 1 0 0]
 [0 0 0 0 1 0 0]]


First, delete all rows from X that were the four rows of test data.

In [21]:
X.drop([0,1,2,3], inplace=True)  # drop the four rows of old data.

### Create input records for a few protests in the United States.


### November, 1999
#### Seattle World Trade Organization protest

From Wikipedia: [1999 Seattle World Trade Organization protest](https://en.wikipedia.org/wiki/1999_Seattle_WTO_protests)

>The 1999 Seattle WTO protests, sometimes referred to as the Battle of Seattle, were a series of protests surrounding the WTO Ministerial Conference of 1999, when members of the World Trade Organization (WTO) convened at the Washington State Convention and Trade Center in Seattle, Washington on November 30, 1999. The Conference was to be the launch of a new millennial round of trade negotiations.


Create a row for each protest.

In [22]:
# 1999 Seattle
# Create a dictionary that contains a value for each of the 40 input features

new_protest = {
    'protestnumber': 4,
    'protesterviolence': 1,
    'pop_total': 278548.148,  # Taken from WPP2019_TotalPopulationBySex.csv
    'pop_density':30.45,
    'prosperity_2020': 77.5,  # Taken from Legatum_Prosperity_Index_Full_2020_Data_Set Excel file - Score from 2007
    'region_Africa': 0,
    'region_Asia': 0,
    'region_Central America': 0,
    'region_Europe': 0,
    'region_MENA': 0,
    'region_North America': 1,
    'region_Oceania': 0,
    'region_South America': 0,
    'protest_size_category_1,000-4,999': 0,
    'protest_size_category_10,000-100,000': 1,   #Protest size around 50,000 or more
    'protest_size_category_100-999': 0,
    'protest_size_category_5,000-9,999': 0,
    'protest_size_category_50-99':0 ,
    'protest_size_category_Less than 50': 0,
    'protest_size_category_Over 100,000': 0,
    'protester_id_type_civil_human_rights': 0,
    'protester_id_type_ethnic_group': 0,
    'protester_id_type_locals_residents': 0,
    'protester_id_type_pensioners_retirees': 0,
    'protester_id_type_political_group': 0,
    'protester_id_type_prisoners': 0,
    'protester_id_type_protestors_generic': 0,
    'protester_id_type_religious_group': 0,
    'protester_id_type_soldiers_veterans': 0,
    'protester_id_type_students_youth':0,
    'protester_id_type_victims_families': 0,
    'protester_id_type_women':0,
    'protester_id_type_workers_unions': 1,
    'labor_wage_dispute': 1,
    'land_farm_issue': 0,
    'police_brutality':0,
    'political_behavior_process': 1,
    'price increases_tax_policy': 1,
    'removal_of_politician':0,
    'social_restrictions':1
}
        
X.loc["1999 Seattle WTO Protest"] = new_protest
X

Unnamed: 0,protestnumber,protesterviolence,pop_total,pop_density,prosperity_2020,region_Africa,region_Asia,region_Central America,region_Europe,region_MENA,...,protester_id_type_victims_families,protester_id_type_women,protester_id_type_workers_unions,labor_wage_dispute,land_farm_issue,police_brutality,political_behavior_process,price increases_tax_policy,removal_of_politician,social_restrictions
1999 Seattle WTO Protest,4.0,1.0,278548.148,30.45,77.5,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,1.0,1.0,0.0,0.0,1.0,1.0,0.0,1.0


### October, 2011
#### Occupy Atlanta

From [Wikipedia](https://en.wikipedia.org/wiki/Occupy_movement):
>The Occupy movement is an international progressive socio-political movement that expresses opposition to social and economic inequality and to the lack of "real democracy" around the world.

[Occupy Atlanta](https://en.wikipedia.org/wiki/Occupy_Atlanta)

>Occupy Atlanta began on October 6, 2011 in Woodruff Park, located in downtown Atlanta, Georgia. As part of the Occupy movement, it is inspired by Occupy Wall Street which began in New York City on September 17.

>52 Arrests on October 26, 2011


In [23]:
# October, 2011 - Occupy Atlanta
# Create a dictionary that contains a value for each of the 40 input features

new_protest = {
    'protestnumber': 30,  # Random guess
    'protesterviolence': 0,
    'pop_total': 309011.469,  # Taken from WPP2019_TotalPopulationBySex.csv
    'pop_density':33.781,
    'prosperity_2020': 77.5,  # Taken from Legatum_Prosperity_Index_Full_2020_Data_Set Excel file - Score from 2007
    'region_Africa': 0,
    'region_Asia': 0,
    'region_Central America': 0,
    'region_Europe': 0,
    'region_MENA': 0,
    'region_North America': 1,
    'region_Oceania': 0,
    'region_South America': 0,
    'protest_size_category_1,000-4,999': 0,  
    'protest_size_category_10,000-100,000': 0,   
    'protest_size_category_100-999': 1,  #Protesters numbering around 120-150 were warned to leave the park or they would be arrested.  
    'protest_size_category_5,000-9,999': 0,
    'protest_size_category_50-99':0 ,
    'protest_size_category_Less than 50': 0,
    'protest_size_category_Over 100,000': 0,  
    'protester_id_type_civil_human_rights': 0,
    'protester_id_type_ethnic_group': 0,
    'protester_id_type_locals_residents': 1,
    'protester_id_type_pensioners_retirees': 0,
    'protester_id_type_political_group': 0,
    'protester_id_type_prisoners': 0,
    'protester_id_type_protestors_generic': 0,
    'protester_id_type_religious_group': 0,
    'protester_id_type_soldiers_veterans': 0,
    'protester_id_type_students_youth':0,
    'protester_id_type_victims_families': 0,
    'protester_id_type_women':0,
    'protester_id_type_workers_unions': 0,
    'labor_wage_dispute': 1,
    'land_farm_issue': 0,
    'police_brutality':0,
    'political_behavior_process': 1, 
    'price increases_tax_policy': 0,
    'removal_of_politician':0,
    'social_restrictions':0
}
        
#append row to the dataframe
#X = X.append(new_protest, ignore_index=True)
X.loc["2011 Occupy Atlanta"] = new_protest
X

Unnamed: 0,protestnumber,protesterviolence,pop_total,pop_density,prosperity_2020,region_Africa,region_Asia,region_Central America,region_Europe,region_MENA,...,protester_id_type_victims_families,protester_id_type_women,protester_id_type_workers_unions,labor_wage_dispute,land_farm_issue,police_brutality,political_behavior_process,price increases_tax_policy,removal_of_politician,social_restrictions
1999 Seattle WTO Protest,4.0,1.0,278548.148,30.45,77.5,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,1.0,1.0,0.0,0.0,1.0,1.0,0.0,1.0
2011 Occupy Atlanta,30.0,0.0,309011.469,33.781,77.5,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0


### March, 2018 
#### March for Our Lives Protest

The [March for Our Lives (MFOL)](https://en.wikipedia.org/wiki/March_for_Our_Lives) was a student-led demonstration in support of gun control legislation.


In [24]:
# March, 2018 - March for Our Lives Protest
# Create a dictionary that contains a value for each of the 40 input features

new_protest = {
    'protestnumber': 4,  # Random guess
    'protesterviolence': 0,
    'pop_total': 327096.263,  # Taken from WPP2019_TotalPopulationBySex.csv
    'pop_density':35.76,
    'prosperity_2020': 77.5,  # Taken from Legatum_Prosperity_Index_Full_2020_Data_Set Excel file - Score from 2007
    'region_Africa': 0,
    'region_Asia': 0,
    'region_Central America': 0,
    'region_Europe': 0,
    'region_MENA': 0,
    'region_North America': 1,
    'region_Oceania': 0,
    'region_South America': 0,
    'protest_size_category_1,000-4,999': 0,  
    'protest_size_category_10,000-100,000': 0,   
    'protest_size_category_100-999': 0,
    'protest_size_category_5,000-9,999': 0,
    'protest_size_category_50-99':0 ,
    'protest_size_category_Less than 50': 0,
    'protest_size_category_Over 100,000': 1,  #Protest nation-wide, over 1 million
    'protester_id_type_civil_human_rights': 0,
    'protester_id_type_ethnic_group': 0,
    'protester_id_type_locals_residents': 0,
    'protester_id_type_pensioners_retirees': 0,
    'protester_id_type_political_group': 0,
    'protester_id_type_prisoners': 0,
    'protester_id_type_protestors_generic': 0,
    'protester_id_type_religious_group': 0,
    'protester_id_type_soldiers_veterans': 0,
    'protester_id_type_students_youth':1,
    'protester_id_type_victims_families': 0,
    'protester_id_type_women':0,
    'protester_id_type_workers_unions': 0,
    'labor_wage_dispute': 0,
    'land_farm_issue': 0,
    'police_brutality':0,
    'political_behavior_process': 1,   # This seems to match gun control legislation the best
    'price increases_tax_policy': 0,
    'removal_of_politician':0,
    'social_restrictions':0
}
        
#append row to the dataframe
#X = X.append(new_protest, ignore_index=True)
X.loc["2018 March For Our Lives"] = new_protest
X

Unnamed: 0,protestnumber,protesterviolence,pop_total,pop_density,prosperity_2020,region_Africa,region_Asia,region_Central America,region_Europe,region_MENA,...,protester_id_type_victims_families,protester_id_type_women,protester_id_type_workers_unions,labor_wage_dispute,land_farm_issue,police_brutality,political_behavior_process,price increases_tax_policy,removal_of_politician,social_restrictions
1999 Seattle WTO Protest,4.0,1.0,278548.148,30.45,77.5,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,1.0,1.0,0.0,0.0,1.0,1.0,0.0,1.0
2011 Occupy Atlanta,30.0,0.0,309011.469,33.781,77.5,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0
2018 March For Our Lives,4.0,0.0,327096.263,35.76,77.5,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0


### April 2020
#### Anti-Covid Lockdown Protest in Michigan

Per [Wikipedia](https://en.wikipedia.org/wiki/2020%E2%80%932021_United_States_anti-lockdown_protests): 
>On April 15, 2020, an estimated 3,000 people took part in a protest they called "Operation Gridlock" in the area surrounding the Michigan State Capitol in Lansing. Most protestors remained in their vehicles, jamming the streets around the capitol building, although around 150 protested on the capitol lawn. The protest lasted eight hours. The protest caused delays during a shift-change at Sparrow Hospital. Police described the protesters as respectful, with most trying to maintain social distancing; no arrests were made.

In [25]:
# April 2020 Anti-Covid Lockdown protest

new_protest = {
    'protestnumber': 20,  # 2020 had a lot of protests - this is just a guess
    'protesterviolence': 0,
    'pop_total': 331002.647,  # Taken from WPP2019_TotalPopulationBySex.csv
    'pop_density':36.185,
    'prosperity_2020': 77.5,  # Taken from Legatum_Prosperity_Index_Full_2020_Data_Set Excel file - Score from 2007
    'region_Africa': 0,
    'region_Asia': 0,
    'region_Central America': 0,
    'region_Europe': 0,
    'region_MENA': 0,
    'region_North America': 1,
    'region_Oceania': 0,
    'region_South America': 0,
    'protest_size_category_1,000-4,999': 1,  #Protest size around 3,000
    'protest_size_category_10,000-100,000': 0,   
    'protest_size_category_100-999': 0,
    'protest_size_category_5,000-9,999': 0,
    'protest_size_category_50-99':0 ,
    'protest_size_category_Less than 50': 0,
    'protest_size_category_Over 100,000': 0,
    'protester_id_type_civil_human_rights': 0,
    'protester_id_type_ethnic_group': 0,
    'protester_id_type_locals_residents': 0,
    'protester_id_type_pensioners_retirees': 0,
    'protester_id_type_political_group': 1,
    'protester_id_type_prisoners': 0,
    'protester_id_type_protestors_generic': 0,
    'protester_id_type_religious_group': 0,
    'protester_id_type_soldiers_veterans': 0,
    'protester_id_type_students_youth':0,
    'protester_id_type_victims_families': 0,
    'protester_id_type_women':0,
    'protester_id_type_workers_unions': 0,
    'labor_wage_dispute': 0,
    'land_farm_issue': 0,
    'police_brutality':0,
    'political_behavior_process': 1,
    'price increases_tax_policy': 0,
    'removal_of_politician':1,
    'social_restrictions':1
}

X.loc["2020 Michigan Covid Lockdown"] = new_protest
X


Unnamed: 0,protestnumber,protesterviolence,pop_total,pop_density,prosperity_2020,region_Africa,region_Asia,region_Central America,region_Europe,region_MENA,...,protester_id_type_victims_families,protester_id_type_women,protester_id_type_workers_unions,labor_wage_dispute,land_farm_issue,police_brutality,political_behavior_process,price increases_tax_policy,removal_of_politician,social_restrictions
1999 Seattle WTO Protest,4.0,1.0,278548.148,30.45,77.5,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,1.0,1.0,0.0,0.0,1.0,1.0,0.0,1.0
2011 Occupy Atlanta,30.0,0.0,309011.469,33.781,77.5,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0
2018 March For Our Lives,4.0,0.0,327096.263,35.76,77.5,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0
2020 Michigan Covid Lockdown,20.0,0.0,331002.647,36.185,77.5,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,1.0


### Jan, 2021

#### [Jan 6 storming of the U.S. Capitol](https://en.wikipedia.org/wiki/2021_storming_of_the_United_States_Capitol) in Washington D.C.

In [26]:
# 2021 D.C. Jan 6

# Create a dictionary that contains a value for each of the 40 input features

new_protest = {
    'protestnumber': 1,  # First protest of the year
    'protesterviolence': 1,
    'pop_total': 332915.074,  # Taken from WPP2019_TotalPopulationBySex.csv
    'pop_density':36.40,
    'prosperity_2020': 77.5,  # Taken from Legatum_Prosperity_Index_Full_2020_Data_Set Excel file - Score from 2007
    'region_Africa': 0,
    'region_Asia': 0,
    'region_Central America': 0,
    'region_Europe': 0,
    'region_MENA': 0,
    'region_North America': 1,
    'region_Oceania': 0,
    'region_South America': 0,
    'protest_size_category_1,000-4,999': 1,  #Protest size in the thousands
    'protest_size_category_10,000-100,000': 0,   
    'protest_size_category_100-999': 0,
    'protest_size_category_5,000-9,999': 0,
    'protest_size_category_50-99':0 ,
    'protest_size_category_Less than 50': 0,
    'protest_size_category_Over 100,000': 0,
    'protester_id_type_civil_human_rights': 0,
    'protester_id_type_ethnic_group': 0,
    'protester_id_type_locals_residents': 0,
    'protester_id_type_pensioners_retirees': 0,
    'protester_id_type_political_group': 1,
    'protester_id_type_prisoners': 0,
    'protester_id_type_protestors_generic': 0,
    'protester_id_type_religious_group': 0,
    'protester_id_type_soldiers_veterans': 0,
    'protester_id_type_students_youth':0,
    'protester_id_type_victims_families': 0,
    'protester_id_type_women':0,
    'protester_id_type_workers_unions': 0,
    'labor_wage_dispute': 0,
    'land_farm_issue': 0,
    'police_brutality':0,
    'political_behavior_process': 1,
    'price increases_tax_policy': 0,
    'removal_of_politician':1,
    'social_restrictions':0
}

X.loc["2021 D.C. Riot"] = new_protest
X


Unnamed: 0,protestnumber,protesterviolence,pop_total,pop_density,prosperity_2020,region_Africa,region_Asia,region_Central America,region_Europe,region_MENA,...,protester_id_type_victims_families,protester_id_type_women,protester_id_type_workers_unions,labor_wage_dispute,land_farm_issue,police_brutality,political_behavior_process,price increases_tax_policy,removal_of_politician,social_restrictions
1999 Seattle WTO Protest,4.0,1.0,278548.148,30.45,77.5,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,1.0,1.0,0.0,0.0,1.0,1.0,0.0,1.0
2011 Occupy Atlanta,30.0,0.0,309011.469,33.781,77.5,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0
2018 March For Our Lives,4.0,0.0,327096.263,35.76,77.5,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0
2020 Michigan Covid Lockdown,20.0,0.0,331002.647,36.185,77.5,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,1.0
2021 D.C. Riot,1.0,1.0,332915.074,36.4,77.5,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0


Create dataframe to contain prediction results

In [27]:
pred_values = multi_label_forest_model.predict(X)

predictions_df = pd.DataFrame(pred_values, columns=['arrests', 'accomodation', 'beatings',
       'crowddispersal', 'ignore', 'killings', 'shootings'], index=X.index)

predictions_df

Unnamed: 0,arrests,accomodation,beatings,crowddispersal,ignore,killings,shootings
1999 Seattle WTO Protest,0,0,0,0,0,0,0
2011 Occupy Atlanta,0,0,0,0,1,0,0
2018 March For Our Lives,0,0,0,0,1,0,0
2020 Michigan Covid Lockdown,0,0,0,0,1,0,0
2021 D.C. Riot,0,0,0,0,0,0,0


### How does this model compare to what actually happened in these protests?
* 1999 Seattle WTO had **arrests** and **crowd dispersal** (tear gas)
* 2011 Occupy Atlanta was mostly peaceful, but did have **arrests** and **crowd dispersal** with helicopters, SWAT team and motorcycle police.
* 2018 March For Our Lives was peaceful. No arrests, no accomodations, no crowd dispersal, no killings and no shootings or violence. It was **ignored**.
* 2020 Michigan Covid Lockdown was a small protest nicknamed Operation Gridlock. It had no arrests, no beatings, no crowd dispersal, no killings, no shootings or violence.  It was **ignored**.
* 2021 D.C. Riot had **arrests**. No accomodation, but there were **beatings** between law enforcement and the protesters. There was eventually **crowd dispersal**. Five were people **killed**, so there were **shootings**, and a **violent response**.

---
#### Load model using Mult-label classification of logistic regression classifiers

In [28]:

# load the model from disk
model_file_2 = '../models/02_multi_label_logistic.pickle'
multi_label_logistic = pickle.load(open(model_file_2, 'rb'))

#### Use StandardScaler to scale the U.S. protests data

In [29]:
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_sc = sc.fit_transform(X)

In [30]:
pred_multi_log_values = multi_label_forest_model.predict(X_sc)

predictions_multi_log_df = pd.DataFrame(pred_multi_log_values, columns=['arrests', 'accomodation', 'beatings',
       'crowddispersal', 'ignore', 'killings', 'shootings'], index=X.index)

predictions_multi_log_df

Unnamed: 0,arrests,accomodation,beatings,crowddispersal,ignore,killings,shootings
1999 Seattle WTO Protest,0,0,0,0,0,0,0
2011 Occupy Atlanta,0,0,0,0,1,0,0
2018 March For Our Lives,0,0,0,0,1,0,0
2020 Michigan Covid Lockdown,0,0,0,0,1,0,0
2021 D.C. Riot,0,0,0,1,0,0,0


### How does this model compare to what actually happened in these protests?
* 1999 Seattle WTO had **arrests** and **crowd dispersal** (tear gas)
* 2011 Occupy Atlanta was mostly peaceful, but did have **arrests** and **crowd dispersal** with helicopters, SWAT team and motorcycle police.
* 2018 March For Our Lives was peaceful. No arrests, no accomodations, no crowd dispersal, no killings and no shootings or violence. It was **ignored**.
* 2020 Michigan Covid Lockdown was a small protest nicknamed Operation Gridlock. It had no arrests, no beatings, no crowd dispersal, no killings, no shootings or violence.  It was **ignored**.
* 2021 D.C. Riot had **arrests**. No accomodation, but there were **beatings** between law enforcement and the protesters. There was eventually **crowd dispersal**. Five were people **killed**, so there were **shootings**, and a **violent response**.