# Data Summary

The protest dataset covers 12,652 anti-government protests in 132 countries between 1990 and 2019. The average number of protests per year has nearly doubled since 2010 (from 343 before 2010 to 605 after).

<table>
  <tr>
    <td><img src="./output/images/year_dist.jpg" alt="Protests per Year"/></td>
  </tr>
</table>

The plurality of protests (33 percent) are from Europe; only 38 are from Oceania. We will use the distribution of protests across regions as a baseline for comparing statistics across clusters. For example, we expect any measurement to be more prevalent in Europe than elsewhere simply because more protests are recorded from Europe.

<table>
  <tr>
    <td><img src="./output/images/regions_dist.jpg" alt="Distribution of Regions"/></td>
    <td><img src="https://imgs.xkcd.com/comics/heatmap.png" alt="xkcd: heatmap"/></td>
  </tr>
  <tr>
    <td></td>
    <td style="text-align:center;"><p>(relevant xkcd comic)</p></td>
  </tr>
</table>
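
The baseline comparison described above can be sketched as follows. This is only an illustration on toy rows: the `region` and `cluster` column names and all values are assumptions, not taken from the real dataset. The idea is to divide each region's share within a cluster by its share of all protests, so that a region is flagged only when it is over-represented relative to the baseline.

```python
import pandas as pd

# Toy data; the `region` and `cluster` column names and values are hypothetical.
protests = pd.DataFrame({
    "region": ["Europe", "Europe", "Europe", "Africa",
               "Africa", "Asia", "Europe", "Asia"],
    "cluster": [0, 0, 1, 1, 1, 0, 1, 0],
})

# Baseline: share of all protests recorded in each region.
baseline = protests["region"].value_counts(normalize=True)

# Share of each region within each cluster (each row sums to 1).
by_cluster = pd.crosstab(protests["cluster"], protests["region"], normalize="index")

# A ratio above 1 means the region is over-represented in that cluster
# relative to its share of the whole dataset.
relative = by_cluster.div(baseline, axis="columns")
print(relative.round(2))
```

In this toy example Europe has a ratio of 1 in both clusters, even though it supplies half of each cluster's protests, because it also supplies half of all protests.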

The Varieties of Democracy variables are components that measure each country's level of liberal democracy at the time of a protest. For each data point, these five scores tend to be highly correlated (r > 0.90). To limit the number of feature inputs, I combine them into a single measure called `libdem`, which is the average of the five scores. In contrast, the HDI score is much less correlated with these scores, so it is kept as a separate measure for use in models. HDI is most correlated with the Egalitarian component (r = 0.71).

```python
import pandas as pd

protests = pd.read_csv('./local/data/full.csv')

# Pairwise Pearson correlations between libdem, its five components, and HDI
score_cols = ['libdem', 'Electoral_Score', 'Liberal_Score', 'Participatory_Score',
              'Deliberative_Score', 'Egalitarian_Score', 'HDI_Score']
protests[score_cols].corr(method='pearson').round(2)
```

| | libdem | Electoral_Score | Liberal_Score | Participatory_Score | Deliberative_Score | Egalitarian_Score | HDI_Score |
|---|---|---|---|---|---|---|---|
| libdem | 1.00 | 0.99 | 0.99 | 0.99 | 0.99 | 0.98 | 0.65 |
| Electoral_Score | 0.99 | 1.00 | 0.98 | 0.98 | 0.96 | 0.94 | 0.60 |
| Liberal_Score | 0.99 | 0.98 | 1.00 | 0.98 | 0.98 | 0.97 | 0.65 |
| Participatory_Score | 0.99 | 0.98 | 0.98 | 1.00 | 0.96 | 0.95 | 0.63 |
| Deliberative_Score | 0.99 | 0.96 | 0.98 | 0.96 | 1.00 | 0.95 | 0.62 |
| Egalitarian_Score | 0.98 | 0.94 | 0.97 | 0.95 | 0.95 | 1.00 | 0.71 |
| HDI_Score | 0.65 | 0.60 | 0.65 | 0.63 | 0.62 | 0.71 | 1.00 |
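
For reference, the `libdem` average could be computed along these lines. This is a minimal sketch on made-up rows, assuming an unweighted row mean of the five component columns; how missing scores are handled in the real pipeline is not stated, so the default behavior of `mean` (skipping missing values) is an assumption here.

```python
import pandas as pd

component_cols = ["Electoral_Score", "Liberal_Score", "Participatory_Score",
                  "Deliberative_Score", "Egalitarian_Score"]

# Toy rows standing in for the real data (values are made up).
toy = pd.DataFrame({
    "Electoral_Score":     [0.8, 0.3, 0.5],
    "Liberal_Score":       [0.7, 0.2, 0.5],
    "Participatory_Score": [0.6, 0.2, 0.4],
    "Deliberative_Score":  [0.7, 0.2, 0.6],
    "Egalitarian_Score":   [0.7, 0.1, 0.5],
})

# libdem as the unweighted mean of the five components for each protest.
# Note: mean(axis=1) skips missing values by default (an assumption about
# how gaps should be treated, not something the source specifies).
toy["libdem"] = toy[component_cols].mean(axis=1)
print(toy["libdem"].round(2).tolist())
```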


To examine state response patterns, I create an indicator `stateviolence`, which is 1 if any `stateresponse*` variable is 'arrests', 'beatings', 'crowd dispersal', 'shootings', or 'killings'. I also create an indicator `accomodation`, which is 1 if any `stateresponse*` variable is 'accomodation'.
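
One way these two indicators might be built is sketched below. The rows are toy data with only two `stateresponse*` columns (the real data has seven), and the 'ignore' value is just a placeholder for a non-violent response; the exact code used for the project may differ.

```python
import pandas as pd

# Toy rows; values are illustrative only.
toy = pd.DataFrame({
    "stateresponse1": ["ignore", "crowd dispersal", "accomodation", "arrests"],
    "stateresponse2": [None, "arrests", None, "accomodation"],
})

violent_responses = ["arrests", "beatings", "crowd dispersal", "shootings", "killings"]
response_cols = [c for c in toy.columns if c.startswith("stateresponse")]

# 1 if any stateresponse* column holds a violent response, else 0.
toy["stateviolence"] = toy[response_cols].isin(violent_responses).any(axis=1).astype(int)

# 1 if any stateresponse* column is 'accomodation' (spelled as in the data), else 0.
toy["accomodation"] = toy[response_cols].eq("accomodation").any(axis=1).astype(int)

print(toy[["stateviolence", "accomodation"]])
```

The last toy row shows that the two indicators are not mutually exclusive: a protest can be met with arrests and still end in an accommodation.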



## Violence Summary

 - 56 percent of all protests were completely peaceful: neither protesters nor the state engaged in any violence.
 - 26 percent of protests had some level of protester violence.
    - Of those, 86 percent were met with state violence.
 - 18 percent were peaceful protests that were met with state violence. In total, 40 percent of protests had state violence, and only about half of those were in response to protester violence.

 !['Distribution of Violence'](./output/images/violence_dist.jpg)
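
The 40 percent total and the "about half" figure follow from the shares listed above. The short check below only redoes that arithmetic with the rounded percentages from the list, so small rounding differences from the underlying data are expected.

```python
# Shares of all protests, taken from the rounded figures in the list above.
protester_violence = 0.26
peaceful_met_with_state_violence = 0.18

# 86 percent of protests with protester violence were met with state violence.
both_violent = protester_violence * 0.86

# All protests with state violence, whether or not protesters were violent.
state_violence_total = both_violent + peaceful_met_with_state_violence

# Share of state violence that came in response to protester violence.
share_in_response = both_violent / state_violence_total

print(f"{state_violence_total:.2f}")  # 0.40
print(f"{share_in_response:.2f}")     # 0.55
```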

## Accommodation Summary

 - Only 10 percent of protests led to accommodations by the government.

The distribution of violence for protests that led to accommodations is similar to the overall distribution of violence. Of the protests that led to accommodations:

 - 60 percent were completely peaceful,
 - 28 percent had protester violence,
    - (75 percent of these were met with state violence), and
 - 12 percent were peaceful protests that were met with state violence.

<table>
  <tr>
    <td><img src="./output/images/violence_dist_2.jpg" alt="Distribution of Violence, Accommodated"/></td>
    <td><img src="./misc/same_picture.png" alt="Same Picture" width=300/></td>
  </tr>
  <tr>
    <td></td>
    <td style="text-align:center;"><p>I mean, they're not. But they are about the same.</p></td>
  </tr>
</table>

## Violence by Region

 - Overall, Africa and MENA have the highest rates of violence, and Europe has the lowest.
    - MENA has an average rate of protester violence but the highest rate of state violence.
    - Europe and South America have lower-than-average rates of protester violence.
 - Both MENA and South America have higher-than-average rates of violent state response.
 - North America is far less likely to have a protest met with state violence.
 - MENA has the highest rate of violent state response to peaceful protest.

!['Violence by Region'](./output/images/violence_region.jpg)
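
Regional rates like these could be produced with a group-by over the indicator columns. The sketch below uses toy rows; the `region` column name and the helper column `peaceful_met_with_violence` are hypothetical, and that helper measures peaceful protests met with state violence as a share of all protests in a region, which is only one possible reading of that statistic.

```python
import pandas as pd

# Toy rows; values are made up and cover only two regions.
toy = pd.DataFrame({
    "region": ["MENA", "MENA", "Europe", "Europe", "Europe", "MENA"],
    "protesterviolence": [0, 1, 0, 0, 1, 0],
    "stateviolence": [1, 1, 0, 0, 1, 1],
})

# Hypothetical helper: peaceful protest that was still met with state violence.
toy["peaceful_met_with_violence"] = (
    (toy["protesterviolence"] == 0) & (toy["stateviolence"] == 1)
).astype(int)

rate_cols = ["protesterviolence", "stateviolence", "peaceful_met_with_violence"]

# The mean of a 0/1 indicator is the rate at which it occurs.
by_region = toy.groupby("region")[rate_cols].mean()
overall = toy[rate_cols].mean()

# Positive values mean the region is above the overall rate.
diff_from_overall = by_region - overall
print(diff_from_overall.round(2))
```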

Oceania has by far the highest rate of state accommodations, although with only 38 protests recorded there, this rate rests on a small sample.



