# Penelope network components pipelines

This notebook demonstrates two pipelines built from the new Penelope complex network components.

## P1: Community detection and network visualisation 

<br>

![pipeline](./img/p1.png)

We demonstrate the first pipeline on a classic benchmark, Zachary's Karate Club graph. We load the graph from networkx and convert it to node-link JSON with the built-in `node_link_data` converter.

### Prepare graph data

In [None]:
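import networkx as nx
# Load Zachary's Karate Club graph (34 nodes, 78 edges), which ships with networkx.
kclub = nx.karate_club_graph()
# Convert it to a JSON-serialisable node-link dictionary.
kclub = nx.node_link_data(kclub)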
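
For orientation, the node-link format is a plain dictionary holding a list of nodes and a list of edges. The sketch below uses a tiny hand-written graph in that shape (not the Karate Club data) and derives an adjacency list from it. The `"links"` key is an assumption: it matches the long-standing networkx default, but recent networkx releases may name the edge list `"edges"`, so the hypothetical `adjacency` helper accepts either.

```python
from collections import defaultdict

# Hand-written toy graph in node-link form (illustrative only).
toy_graph = {
    "directed": False,
    "multigraph": False,
    "graph": {},
    "nodes": [{"id": 0}, {"id": 1}, {"id": 2}],
    "links": [{"source": 0, "target": 1}, {"source": 1, "target": 2}],
}


def adjacency(node_link):
    """Return {node id: sorted neighbour ids} for an undirected node-link dict."""
    # Assumption: the edge list lives under "links" or, failing that, "edges".
    edges = node_link.get("links", node_link.get("edges", []))
    neighbours = defaultdict(set)
    for node in node_link["nodes"]:
        neighbours[node["id"]]  # touch the key so isolated nodes appear too
    for edge in edges:
        neighbours[edge["source"]].add(edge["target"])
        neighbours[edge["target"]].add(edge["source"])
    return {node: sorted(adj) for node, adj in neighbours.items()}


toy_adjacency = adjacency(toy_graph)
```

Calling `adjacency(kclub)` on the converted Karate Club graph should work the same way, provided the edge list sits under one of the two keys.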

### Detect communities using Louvain
Perform community detection on the network. The component returns the JSON graph with a new key, `"louvain_com"`, that holds the detected communities.

In [None]:
import requests

jsondata = {'data': kclub,
            'resolution': 1.0}

response = requests.post('https://penelope.vub.be/network-components/louvain', json=jsondata)
louvaingraph = response.json()
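
A quick way to sanity-check the result is to count how many nodes fall into each community. The sketch below does this on a hand-written mock rather than on the live response. It assumes that each node object carries its community under `louvain_com`; the notebook only states that the key is added to the graph, so the exact location is a guess, and `community_sizes` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical excerpt of a Louvain response; the real output may differ.
mock_louvaingraph = {
    "nodes": [
        {"id": 0, "louvain_com": 0},
        {"id": 1, "louvain_com": 0},
        {"id": 2, "louvain_com": 1},
    ],
    "links": [{"source": 0, "target": 1}, {"source": 1, "target": 2}],
}


def community_sizes(graph, key="louvain_com"):
    """Count nodes per community, skipping nodes that lack the key."""
    return Counter(node[key] for node in graph["nodes"] if key in node)


sizes = community_sizes(mock_louvaingraph)
```

If the assumption holds, `community_sizes(louvaingraph)` gives the same summary for the real graph.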

### Visualise graph
Visualise the network, auto-coloring the nodes by the newly found Louvain communities. The visualisation is saved as an HTML file that contains the graph information.

In [None]:
jsondata = {'data': louvaingraph,
            'nodecoloring': 'louvain_com',  # colour nodes by detected community
            'nodelabel': 'id',
            'darkmode': False,
            'edgevisibility': True,
            'particles': False}

response = requests.post('https://penelope.vub.be/network-components/visualiser', json=jsondata)
html = response.json().get('graph')
with open("graph.html", "w") as f:
    f.write(html)

### [Open visualisation](./graph.html)

---

## P2: Parliamentary speech data visualisation

<br>

![pipeline](./img/p2.png)

### Data retrieval using API
Use the Penelope API to retrieve speeches from the UK House of Commons.

In [None]:
import requests
# Search query and date range for the UK ('gbr') dataset.
payload = {'search_query': 'Brexit',
           'dataset_name': 'gbr',
           'start_date': '2018-1-1',
           'end_date': '2019-1-1'}
result = requests.post('https://penelope.vub.be/parliament-data/get-speeches-agg', json=payload)

### Data selection
Pick one discussion from the pool:

In [None]:
speeches = result.json()["speeches"]

# All distinct discussion titles, for choosing one to analyse.
discussiontitles = {speech["discussion_title"] for speech in speeches}

# Keep only the speeches from the chosen discussion.
chosenspeeches = [speech for speech in speeches
                  if speech["discussion_title"] == 'LEAVING THE EU: PREPARATIONS 2019-09-03']
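
When choosing a discussion, it can help to see which ones contain the most speeches. The sketch below groups speeches by title and ranks the titles by size. It runs on hand-written mock records; only the `discussion_title` field is taken from the notebook, and the other fields and the `group_by_discussion` helper are illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical speeches; titles and texts are made up for illustration.
mock_speeches = [
    {"id": 1, "discussion_title": "TOPIC A", "text": "First speech."},
    {"id": 2, "discussion_title": "TOPIC B", "text": "Second speech."},
    {"id": 3, "discussion_title": "TOPIC A", "text": "Third speech."},
]


def group_by_discussion(speeches):
    """Return {discussion title: list of speeches}, preserving speech order."""
    grouped = defaultdict(list)
    for speech in speeches:
        grouped[speech["discussion_title"]].append(speech)
    return dict(grouped)


by_discussion = group_by_discussion(mock_speeches)
# Rank discussions by number of speeches, largest first.
ranked_titles = sorted(by_discussion, key=lambda title: len(by_discussion[title]), reverse=True)
```

On the real data, the same call would be `group_by_discussion(result.json()["speeches"])`.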

### Split into sentences
Split each speech into sentences using the [EHAI spaCy sentence splitter](https://app.swaggerhub.com/apis/EHAI/vub-spacy-services/1.0.0).

In [None]:
# Strip the full stop from 'hon.' (presumably so it is not read as a sentence boundary).
for speech in chosenspeeches:
    speech['text'] = speech['text'].replace('hon.', 'hon')

texts = [el['text'] for el in chosenspeeches]
payload = {"texts": texts,
           "model": "en"}
response = requests.post('https://penelope.vub.be/spacy-api/texts-split-sentences', json=payload)
sentences_per_speech = response.json()['texts_sentences']

# Emit one record per sentence, copying the speech metadata but not its 'id'.
sentenized_speeches = []
for speech, sentences in zip(chosenspeeches, sentences_per_speech):
    for sentence in sentences:
        temp = speech.copy()
        del temp['id']
        temp['text'] = sentence
        sentenized_speeches.append(temp)
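
The expansion step can also be written as a small function, which makes it easy to check offline. The sketch below is one possible formulation, not the notebook's own helper: `expand_to_sentences` is a hypothetical name, the `party` field is an assumed example of speech metadata, and the sentence lists are hand-written stand-ins for the API response.

```python
def expand_to_sentences(speeches, sentences_per_speech):
    """Return one record per sentence, copying speech metadata except 'id'."""
    if len(speeches) != len(sentences_per_speech):
        raise ValueError("expected one sentence list per speech")
    records = []
    for speech, sentences in zip(speeches, sentences_per_speech):
        for sentence in sentences:
            # Copy every field but 'id', then overwrite the text with the sentence.
            record = {key: value for key, value in speech.items() if key != "id"}
            record["text"] = sentence
            records.append(record)
    return records


# Hypothetical input: two speeches and their pre-split sentences.
mock_speeches = [
    {"id": 7, "party": "X", "text": "One. Two."},
    {"id": 8, "party": "Y", "text": "Three."},
]
mock_sentences = [["One.", "Two."], ["Three."]]

records = expand_to_sentences(mock_speeches, mock_sentences)
```

Unlike a plain `zip`, the length check surfaces a mismatch between speeches and sentence lists instead of silently dropping entries. The input speeches are left unmodified.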

### Generate Statement Graph

Transform the list of statements into a graph:

In [None]:
jsondata = {
    "data"          : sentenized_speeches,
    "language"      : "en",
    "relevant_pos"  : ['ADJ'],
    "ignore"        : ['Brexit', 'hon', 'friend']
}

response = requests.post('https://penelope.vub.be/network-components/statementgraphgenerator', json=jsondata)
stmgraph = response.json()

### Keep only the giant component
You may want to keep only the giant component, that is, the largest connected component, of the statement graph.

In [None]:
jsondata  = {'data': stmgraph}

response  = requests.post('https://penelope.vub.be/network-components/giantcomponent', json=jsondata)
giantcomp = response.json()
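
To make the idea concrete, the sketch below extracts the largest connected component from a node-link dictionary with a breadth-first search. It illustrates the concept only and is not the service's implementation; `giant_component_local` is a hypothetical helper, the graph is treated as undirected, and the edge list is assumed to live under `"links"` (or `"edges"`).

```python
from collections import deque


def giant_component_local(node_link):
    """Return a copy of an undirected node-link dict restricted to its largest component."""
    edge_key = "links" if "links" in node_link else "edges"
    edges = node_link.get(edge_key, [])

    # Build an undirected adjacency map; every edge endpoint must be a listed node.
    neighbours = {node["id"]: set() for node in node_link["nodes"]}
    for edge in edges:
        neighbours[edge["source"]].add(edge["target"])
        neighbours[edge["target"]].add(edge["source"])

    seen, largest = set(), set()
    for start in neighbours:
        if start in seen:
            continue
        # Breadth-first search collects the component containing `start`.
        component, queue = {start}, deque([start])
        while queue:
            current = queue.popleft()
            for nxt in neighbours[current]:
                if nxt not in component:
                    component.add(nxt)
                    queue.append(nxt)
        seen |= component
        if len(component) > len(largest):
            largest = component

    result = dict(node_link)
    result["nodes"] = [n for n in node_link["nodes"] if n["id"] in largest]
    result[edge_key] = [e for e in edges
                        if e["source"] in largest and e["target"] in largest]
    return result


# Toy graph with two components: {a, b, c} and {d, e}.
toy = {
    "nodes": [{"id": name} for name in "abcde"],
    "links": [
        {"source": "a", "target": "b"},
        {"source": "b", "target": "c"},
        {"source": "d", "target": "e"},
    ],
}
toy_giant = giant_component_local(toy)
```

When two components tie in size, this sketch keeps the one encountered first.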

### Visualise the graph
Visualise the Statement Graph, auto-coloring the nodes by the speaker's party.

In [None]:
jsondata={'data': giantcomp,
          'nodecoloring':'party',
          'nodelabel':'text',
          'darkmode': False,
          'edgevisibility': True,
          'particles': False}

response = requests.post('https://penelope.vub.be/network-components/visualiser', json=jsondata)

html = response.json().get('graph')
# Note: this overwrites the graph.html written by the first pipeline.
with open("graph.html", "w") as f:
    f.write(html)

### [Open visualisation](./graph.html)
---