<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Modeling" data-toc-modified-id="Modeling-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Modeling</a></span><ul class="toc-item"><li><span><a href="#Victims" data-toc-modified-id="Victims-1.1"><span class="toc-item-num">1.1&nbsp;&nbsp;</span>Victims</a></span><ul class="toc-item"><li><span><a href="#graphQL" data-toc-modified-id="graphQL-1.1.1"><span class="toc-item-num">1.1.1&nbsp;&nbsp;</span>graphQL</a></span></li></ul></li><li><span><a href="#Perpetrators" data-toc-modified-id="Perpetrators-1.2"><span class="toc-item-num">1.2&nbsp;&nbsp;</span>Perpetrators</a></span></li><li><span><a href="#ViolenceEvent" data-toc-modified-id="ViolenceEvent-1.3"><span class="toc-item-num">1.3&nbsp;&nbsp;</span>ViolenceEvent</a></span></li></ul></li></ul></div>

In [54]:
import sys
sys.version

'3.8.0 (default, Nov 18 2019, 15:40:53) \n[GCC 7.4.0]'

In [55]:
%load_ext cypher
# https://ipython-cypher.readthedocs.io/en/latest/
# used for cell magic

The cypher extension is already loaded. To reload it, use:
  %reload_ext cypher


In [56]:
from py2neo import Graph
NEO4J_URI="bolt://localhost:7687"
graph = Graph(NEO4J_URI)
graph

<Graph database=<Database uri='bolt://localhost:7687' secure=False user_agent='py2neo/4.3.0 neobolt/1.7.17 Python/3.8.0-final-0 (linux)'> name='data'>

In [100]:
def clear_graph():
    """Delete every node and relationship in the graph and print the update stats."""
    print(graph.run("MATCH (n) DETACH DELETE n").stats())

In [101]:
clear_graph()

constraints_added: 0
constraints_removed: 0
contained_updates: False
indexes_added: 0
indexes_removed: 0
labels_added: 0
labels_removed: 0
nodes_created: 0
nodes_deleted: 0
properties_set: 0
relationships_created: 0
relationships_deleted: 0


In [58]:
graph.run("RETURN apoc.version();").data()

[{'apoc.version()': '3.5.0.9'}]

In [59]:
graph.run("call dbms.components() yield name, versions, edition unwind versions as version return name, version, edition;").data()

[{'name': 'Neo4j Kernel', 'version': '3.5.15', 'edition': 'enterprise'}]

# Modeling

In [60]:
import pandas as pd

We model data from the Pinochet dataset, available at https://github.com/danilofreire/pinochet:

> Freire, D., Meadowcroft, J., Skarbek, D., & Guerrero, E. (2019). Deaths and Disappearances in the Pinochet Regime: A New Dataset. https://doi.org/10.31235/osf.io/vqnwu.

The dataset has 59 variables with information about the victims, the perpetrators, and geographical
coordinates of each incident. 

In [61]:
PINOCHET_DATA = "../pinochet/data/pinochet.csv"
pin = pd.read_csv(PINOCHET_DATA)
pin.head()

Unnamed: 0,individual_id,group_id,start_date_daily,end_date_daily,start_date_monthly,end_date_monthly,last_name,first_name,minor,age,...,latitude_5,longitude_5,exact_coordinates_5,place_6,end_location_6,latitude_6,longitude_6,exact_coordinates_6,page,additional_comments
0,1,1,1973-09-12,1973-09-12,1973-09-01,1973-09-01,Corredera Reyes,Mercedes del Pilar,1.0,,...,,,,,,,,,159,
1,2,2,1973-09-11,1973-09-12,1973-09-01,1973-09-01,Torres Torres,Benito Heriberto,0.0,57.0,...,,,,,,,,,159-60,
2,3,3,1973-09-12,1973-09-12,1973-09-01,1973-09-01,Lira Morales,Juan Manuel,0.0,23.0,...,,,,,,,,,160,
3,4,4,1973-09-12,1973-09-14,1973-09-01,1973-09-01,Fontela Alonso,Alberto Mariano,0.0,26.0,...,,,,,,,,,160,
4,5,5,1973-09-12,1973-09-12,1973-09-01,1973-09-01,Quintilliano Cardozo,Tulio Roberto,0.0,29.0,...,,,,,,,,,160-61,


The dataset contains information about perpetrators, victims, violence events, and event locations. We will build a model for each of these concepts and establish the relationships between them later.
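As a rough illustration of how the location data could be pulled apart from the rest of a record, the sketch below groups the numbered columns visible in the preview (`place_6`, `latitude_6`, `longitude_6`, and so on) by their numeric suffix. The helper name `split_locations` and the regular expression are assumptions made for this example, not part of the dataset or of the notebook's code.

```python
import re
from collections import defaultdict

# Assumption: a column whose name ends in _1 .. _6 describes the n-th location.
LOCATION_COLUMN = re.compile(r"^(?P<field>[a-z_]+)_(?P<index>[1-6])$")


def split_locations(row):
    """Split one record (a dict) into scalar fields and per-index location dicts."""
    scalars = {}
    locations = defaultdict(dict)
    for column, value in row.items():
        match = LOCATION_COLUMN.match(column)
        if match:
            index = int(match.group("index"))
            locations[index][match.group("field")] = value
        else:
            scalars[column] = value
    return scalars, dict(locations)


# A few columns of the dataset's first record, as raw CSV strings.
row = {
    "individual_id": "1",
    "place_1": "In Public",
    "latitude_1": "-33.501342",
    "longitude_1": "-70.654242",
    "place_2": "In Hospital",
    "latitude_2": "-33.484124",
    "longitude_2": "-70.646406",
}

scalars, locations = split_locations(row)
print(scalars)
print(locations[2]["place"])
```

Keeping the suffix as a dictionary key would make it straightforward to turn each numbered group into its own location node later, if that is how the locations end up being modeled.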


## Victims

- victim_id*: an identifier of our own; it is not the same as the dataset's `individual_id`.
- individual_id
- group_id
- first_name
- last_name
- age
- minor
- male
- number_previous_arrests
- occupation
- occupation_detail
- victim_affiliation
- victim_affiliation_detail
- targeted

In [62]:
victim_attributes = [
    "individual_id",
    "group_id",
    "first_name",
    "last_name",
    "age",
    "minor",
    "male",
    "number_previous_arrests",
    "occupation",
    "occupation_detail",
    "victim_affiliation",
    "victim_affiliation_detail",
    "targeted",
]

pin_victims = pin[victim_attributes]
pin_victims.head()

Unnamed: 0,individual_id,group_id,first_name,last_name,age,minor,male,number_previous_arrests,occupation,occupation_detail,victim_affiliation,victim_affiliation_detail,targeted
0,1,1,Mercedes del Pilar,Corredera Reyes,,1.0,0.0,,School Student,high school,,,
1,2,2,Benito Heriberto,Torres Torres,57.0,0.0,1.0,,Blue Collar,plumbing installer,,,
2,3,3,Juan Manuel,Lira Morales,23.0,0.0,1.0,,White Collar,office worker,,,
3,4,4,Alberto Mariano,Fontela Alonso,26.0,0.0,1.0,,Blue Collar,small fisherman,,,
4,5,5,Tulio Roberto,Quintilliano Cardozo,29.0,0.0,1.0,,Blue Collar,engineer,Opposition,Communist party,


In [63]:
# https://neo4j.com/docs/labs/apoc/current/import/load-csv/
PINOCHET_CSV_GITHUB = "https://raw.githubusercontent.com/danilofreire/pinochet/master/data/pinochet.csv"

query = """
WITH $url AS url 
CALL apoc.load.csv(url) 
YIELD lineNo, map, list
RETURN *
LIMIT 1"""

graph.run(query, url = PINOCHET_CSV_GITHUB).data()

[{'lineNo': 0,
  'list': ['1',
   '1',
   '1973-09-12',
   '1973-09-12',
   '1973-09-01',
   '1973-09-01',
   'Corredera Reyes',
   'Mercedes del Pilar',
   '1',
   'NA',
   '0',
   'School Student',
   'high school',
   'NA',
   'NA',
   'Killed',
   'Gun',
   'NA',
   'NA',
   'NA',
   'NA',
   '0',
   '0',
   'NA',
   'NA',
   'NA',
   'Chilean',
   'In Public',
   'Calle Gran Avenida',
   '-33.501342',
   '-70.654242',
   '0',
   'In Hospital',
   'Medical Legal Institute (by the Barros Luco Hospital)',
   '-33.484124',
   '-70.646406',
   '1',
   'NA',
   'NA',
   'NA',
   'NA',
   'NA',
   'NA',
   'NA',
   'NA',
   'NA',
   'NA',
   'NA',
   'NA',
   'NA',
   'NA',
   'NA',
   'NA',
   'NA',
   'NA',
   'NA',
   'NA',
   '159',
   'NA'],
  'map': {'occupation': 'School Student',
   'minor': '1',
   'perpetrator_affiliation': 'NA',
   'start_date_monthly': '1973-09-01',
   'individual_id': '1',
   'perpetrator_affiliation_detail': 'NA',
   'press': '0',
   'method': 'Gun',
   'mi

In [64]:
%%cypher
CALL apoc.load.csv('pinochet.csv') 
YIELD lineNo, map, list
RETURN *
LIMIT 1

1 rows affected.


lineNo,list,map
0,"['1', '1', '1973-09-12', '1973-09-12', '1973-09-01', '1973-09-01', 'Corredera Reyes', 'Mercedes del Pilar', '1', 'NA', '0', 'School Student', 'high school', 'NA', 'NA', 'Killed', 'Gun', 'NA', 'NA', 'NA', 'NA', '0', '0', 'NA', 'NA', 'NA', 'Chilean', 'In Public', 'Calle Gran Avenida', '-33.501342', '-70.654242', '0', 'In Hospital', 'Medical Legal Institute (by the Barros Luco Hospital)', '-33.484124', '-70.646406', '1', 'NA', 'NA', 'NA', 'NA', 'NA', 'NA', 'NA', 'NA', 'NA', 'NA', 'NA', 'NA', 'NA', 'NA', 'NA', 'NA', 'NA', 'NA', 'NA', 'NA', '159', 'NA']","{'occupation': 'School Student', 'minor': '1', 'perpetrator_affiliation': 'NA', 'start_date_monthly': '1973-09-01', 'individual_id': '1', 'perpetrator_affiliation_detail': 'NA', 'press': '0', 'method': 'Gun', 'mistreatment': 'NA', 'occupation_detail': 'high school', 'nationality': 'Chilean', 'number_previous_arrests': 'NA', 'page': '159', 'start_date_daily': '1973-09-12', 'start_location_1': 'Calle Gran Avenida', 'male': '0', 'war_tribunal': '0', 'end_date_monthly': '1973-09-01', 'location_2': 'Medical Legal Institute (by the Barros Luco Hospital)', 'latitude_5': 'NA', 'latitude_6': 'NA', 'victim_affiliation_detail': 'NA', 'latitude_3': 'NA', 'latitude_4': 'NA', 'interrogation': 'NA', 'targeted': 'NA', 'first_name': 'Mercedes del Pilar', 'violence': 'Killed', 'end_location_6': 'NA', 'exact_coordinates_3': 'NA', 'last_name': 'Corredera Reyes', 'exact_coordinates_2': '1', 'exact_coordinates_1': '0', 'victim_affiliation': 'NA', 'latitude_1': '-33.501342', 'latitude_2': '-33.484124', 'additional_comments': 'NA', 'place_2': 'In Hospital', 'longitude_4': 'NA', 'place_3': 'NA', 'longitude_5': 'NA', 'torture': 'NA', 'longitude_6': 'NA', 'place_1': 'In Public', 'place_6': 'NA', 'group_id': '1', 'longitude_1': '-70.654242', 'exact_coordinates_6': 'NA', 'longitude_2': '-70.646406', 'place_4': 'NA', 'exact_coordinates_5': 'NA', 'longitude_3': 'NA', 'exact_coordinates_4': 'NA', 'place_5': 'NA', 'end_location_4': 'NA', 
'end_location_5': 'NA', 'end_date_daily': '1973-09-12', 'end_location_3': 'NA', 'age': 'NA'}"


In [104]:
clear_graph()

constraints_added: 0
constraints_removed: 0
contained_updates: True
indexes_added: 0
indexes_removed: 0
labels_added: 0
labels_removed: 0
nodes_created: 0
nodes_deleted: 2398
properties_set: 0
relationships_created: 0
relationships_deleted: 0


In [105]:
query = """
WITH $url AS url 
CALL apoc.load.csv(url, {skip:0, header:true,
   mapping:{
     individual_id: {type:'int'},
     group_id: {type:'int'},
     minor: {type:'bool'},
     age: {type:'string'},
     male: {type:'bool'},
     number_previous_arrests: {type:'string'}
   }
}) 
YIELD lineNo, map, list
MERGE (v:Victim {
    individual_id: map.individual_id,
    group_id: map.group_id,
    first_name: map.first_name,
    last_name: map.last_name,
    age: map.age,
    minor: map.minor,
    male: map.male,
    number_previous_arrests: map.number_previous_arrests,
    occupation: map.occupation,
    occupation_detail: map.occupation_detail,
    victim_affiliation: map.victim_affiliation,
    victim_affiliation_detail: map.victim_affiliation_detail,
    targeted: map.targeted
    })
"""

graph.run(query, url = PINOCHET_CSV_GITHUB).stats()

constraints_added: 0
constraints_removed: 0
contained_updates: True
indexes_added: 0
indexes_removed: 0
labels_added: 2398
labels_removed: 0
nodes_created: 2398
nodes_deleted: 0
properties_set: 31174
relationships_created: 0
relationships_deleted: 0
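The `MERGE` above matches on all thirteen properties at once, so an existing node is reused only when every value is identical. An alternative worth considering is to merge on `individual_id` alone and set the remaining properties afterwards. The sketch below prepares rows for such a query on the Python side; `VICTIM_UPSERT`, `clean_row`, and `batches` are hypothetical names, and the query is shown as a string only, not executed here.

```python
# Assumed alternative query: merge on the key, then copy the other properties.
# Setting a property to null with `+=` simply leaves it unset on the node.
VICTIM_UPSERT = """
UNWIND $rows AS row
MERGE (v:Victim {individual_id: row.individual_id})
SET v += row
"""


def clean_row(row, attributes):
    """Keep the requested attributes and turn 'NA' markers into None."""
    cleaned = {}
    for name in attributes:
        value = row.get(name)
        cleaned[name] = None if value in (None, "", "NA") else value
    if cleaned.get("individual_id") is not None:
        cleaned["individual_id"] = int(cleaned["individual_id"])
    return cleaned


def batches(rows, size):
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


attributes = ["individual_id", "first_name", "last_name", "age"]
raw_rows = [
    {"individual_id": "1", "first_name": "Mercedes del Pilar",
     "last_name": "Corredera Reyes", "age": "NA"},
    {"individual_id": "2", "first_name": "Benito Heriberto",
     "last_name": "Torres Torres", "age": "57"},
]

# MERGE cannot match on a null key, so rows without an id are dropped.
cleaned = [r for r in (clean_row(row, attributes) for row in raw_rows)
           if r["individual_id"] is not None]
grouped = list(batches(cleaned, 1))
print(grouped)

# With a live connection, each batch could then be sent with py2neo, e.g.:
#     for batch in batches(cleaned, 500):  # 500 is an arbitrary batch size
#         graph.run(VICTIM_UPSERT, rows=batch)
```

Merging on the key alone keeps the load idempotent: re-running it updates existing `Victim` nodes instead of creating near-duplicates when a single attribute changes.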

### graphQL

In [93]:
import graphene


## Perpetrators

- perpetrator_id*
- perpetrator_affiliation
- perpetrator_affiliation_detail
- war_tribunal


## ViolenceEvent

TODO