Create a `.env` file in the same directory as the notebook and add the following lines:

```env
NEO4J_USERNAME=your_username
NEO4J_PASSWORD=your_password
NEO4J_ENDPOINT=your_endpoint
```
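The notebook does not show how `DatabaseConnection` reads these values, so the following is only a minimal sketch of one way to check that all three settings are present before connecting. The helper name `read_neo4j_settings` is hypothetical, and it assumes the `.env` file has already been loaded into the process environment (for example by a package such as python-dotenv).

```python
import os

REQUIRED_KEYS = ("NEO4J_USERNAME", "NEO4J_PASSWORD", "NEO4J_ENDPOINT")


def read_neo4j_settings(env=None):
    """Return the three connection settings, or raise if any is missing or empty.

    Hypothetical helper: not part of the notebook's DatabaseConnection class.
    """
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise KeyError(f"Missing settings: {', '.join(missing)}")
    return {key: env[key] for key in REQUIRED_KEYS}


# Example with an explicit mapping instead of the real environment.
settings = read_neo4j_settings({
    "NEO4J_USERNAME": "neo4j",
    "NEO4J_PASSWORD": "secret",
    "NEO4J_ENDPOINT": "bolt://localhost:7687",
})
print(sorted(settings))
```

Failing early with the names of the missing keys tends to be easier to debug than a connection error raised later by the driver.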

Install the Python dependencies.

In [None]:
%pip install -r requirements.txt

Connect to the database.

In [15]:
from databaseconnection import DatabaseConnection
gds = DatabaseConnection().get_database_connection()
gds.version()

'2.3.2'

A version number is shown if the connection to the database is successful.

Reference: the FastRP and kNN example notebook in the graph-data-science-client repository:
https://github.com/neo4j/graph-data-science-client/blob/main/examples/fastrp-and-knn.ipynb

In [7]:
node_projection = {
    "Respondent": {},
    "QuestionAlternative": { "properties": { "position": { "defaultValue": 0 }}}
}
relationship_projection = "CHOSE"

In [9]:
G, result = gds.graph.project("respondentAnswer", node_projection, relationship_projection)

print(f"The projection took {result['projectMillis']} ms")
print(f"Graph '{G.name()}' node count: {G.node_count()}")
print(f"Graph '{G.name()}' node labels: {G.node_labels()}")

The projection took 1161 ms
Graph 'respondentAnswer' node count: 5010
Graph 'respondentAnswer' node labels: ['Respondent', 'QuestionAlternative']


In [10]:
result = gds.fastRP.mutate(
    G,
    mutateProperty='embedding',
    randomSeed=42,
    embeddingDimension=128,
    iterationWeights=[0.8, 1, 1, 1, 1, 1, 1, 1, 1, 1]
)
print(f"Number of embedding vectors produced: {result['nodePropertiesWritten']}")

Number of embedding vectors produced: 5010
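To give a feel for what `iterationWeights` controls, here is a simplified, pure-Python illustration of a FastRP-style embedding. It is not the GDS implementation. It assumes undirected relationships and dense random start vectors, and the function name `toy_fastrp` is made up for this sketch. Each iteration averages the neighbours' vectors from the previous iteration, normalises the result, and adds it to the final embedding scaled by that iteration's weight.

```python
import math
import random


def l2_normalize(vec):
    """Scale a vector to unit length; leave all-zero vectors unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def toy_fastrp(edges, dimension, iteration_weights, seed=42):
    """Simplified FastRP-style embedding on an undirected edge list (illustration only)."""
    rng = random.Random(seed)
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    nodes = sorted(neighbors)

    # Start: one random vector per node (assumption: dense entries from {-1, 0, 1}).
    current = {n: [rng.choice((-1.0, 0.0, 1.0)) for _ in range(dimension)] for n in nodes}
    embedding = {n: [0.0] * dimension for n in nodes}

    for weight in iteration_weights:
        following = {}
        for n in nodes:
            nbrs = sorted(neighbors[n])
            # Average the neighbours' vectors from the previous iteration.
            mean = [sum(current[m][d] for m in nbrs) / len(nbrs) for d in range(dimension)]
            following[n] = l2_normalize(mean)
        for n in nodes:
            # Add this iteration's contribution, scaled by its weight.
            embedding[n] = [e + weight * x for e, x in zip(embedding[n], following[n])]
        current = following
    return embedding


# r1 and r2 chose exactly the same alternatives, r3 did not.
edges = [("r1", "a1"), ("r1", "a2"), ("r2", "a1"), ("r2", "a2"), ("r3", "a2"), ("r3", "a3")]
emb = toy_fastrp(edges, dimension=8, iteration_weights=[0.8, 1, 1])
print(emb["r1"] == emb["r2"])
```

In this toy version, nodes with identical neighbourhoods end up with identical embeddings, which is one plausible reason for similarity scores of exactly 1.0 between respondents further down.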


In [11]:
result = gds.knn.write(
    G,
    topK=2,
    nodeProperties=["embedding"],
    randomSeed=42,
    concurrency=1,
    sampleRate=1.0,
    deltaThreshold=0.0,
    writeRelationshipType="SIMILAR",
    writeProperty="score",
)

print(f"Relationships produced: {result['relationshipsWritten']}")
print(f"Nodes compared: {result['nodesCompared']}")
print(f"Mean similarity: {result['similarityDistribution']['mean']}")

Knn:   0%|          | 0/100 [00:00<?, ?%/s]

Relationships produced: 10020
Nodes compared: 5010
Mean similarity: 0.5773449408555935
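With `topK=2`, every compared node gets two outgoing `SIMILAR` relationships, which matches 10020 = 2 × 5010 above. The sketch below shows the idea with an exact, brute-force nearest-neighbour search over plain cosine similarity. This is a stand-in for illustration: GDS kNN uses its own search procedure and may scale the score differently, and the names `cosine_similarity` and `brute_force_knn` are made up here.

```python
import math


def cosine_similarity(u, v):
    """Plain cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def brute_force_knn(embeddings, top_k):
    """Return {node: [(neighbor, similarity), ...]} with the top_k most similar nodes."""
    result = {}
    for node, vec in embeddings.items():
        scored = [
            (other, cosine_similarity(vec, other_vec))
            for other, other_vec in embeddings.items()
            if other != node
        ]
        # Highest similarity first; ties broken by node name for determinism.
        scored.sort(key=lambda pair: (-pair[1], pair[0]))
        result[node] = scored[:top_k]
    return result


# Made-up two-dimensional embeddings for four respondents.
embeddings = {
    "r1": [1.0, 0.0],
    "r2": [1.0, 0.0],
    "r3": [0.0, 1.0],
    "r4": [1.0, 1.0],
}
knn = brute_force_knn(embeddings, top_k=2)
for node, neighbors in knn.items():
    print(node, neighbors)
```

The brute-force version compares every pair, so it only suits small inputs, but it makes the meaning of `topK` and of the per-relationship score easy to inspect.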


In [19]:
gds.run_cypher(
    """
        MATCH (p1:Respondent)-[r:SIMILAR]->(p2:Respondent)
        WHERE r.score > 0.0
        RETURN p1.id AS person1, p2.id AS person2, r.score AS similarity
        ORDER BY similarity DESCENDING, person1, person2
    """
)

Unnamed: 0,person1,person2,similarity
0,7ba5ac16-8c73-401b-b3e7-f3acaf65bff9,f07c936e-0be2-4232-9e7e-6d04ec1751f1,1.000000
1,f07c936e-0be2-4232-9e7e-6d04ec1751f1,7ba5ac16-8c73-401b-b3e7-f3acaf65bff9,1.000000
2,7ba5ac16-8c73-401b-b3e7-f3acaf65bff9,7fd4dcb1-b166-42f5-9484-02985bb2ef49,0.985398
3,7fd4dcb1-b166-42f5-9484-02985bb2ef49,7ba5ac16-8c73-401b-b3e7-f3acaf65bff9,0.985398
4,7fd4dcb1-b166-42f5-9484-02985bb2ef49,f07c936e-0be2-4232-9e7e-6d04ec1751f1,0.985398
...,...,...,...
6927,2c94e7f1-e29a-45c4-84d2-bd024415d105,c5df8342-02c8-46b4-98d8-8b1f0a06ecac,0.622691
6928,21eef46c-9fae-4353-9e67-6bab2ce496d0,921c8300-1059-40ea-b3df-429f51826443,0.613362
6929,553ce11a-44c6-4258-b7a5-61c6950d64b8,f7d6178f-fc3a-410d-b08c-1102487adc92,0.612722
6930,7c2016d6-d74f-4c9b-bdeb-21626b71adc6,b362f068-b8b8-4d41-ae39-cc4c488c36ca,0.608685


In [40]:
result = gds.run_cypher("""
    MATCH (r:Respondent)-[ha:HAS_ANSWERED]-(q:Question)
    WHERE r.id = '55134294-48bc-4e6a-8fa5-41d7ae3b7a70'
    OR r.id = 'b145c256-954c-4108-b1f5-a8eb19ee3e50'
    RETURN r.id AS respondent, q.name AS question
""")

# Print a running index, the respondent id and the question text for each row.
for ix, (respondent, question) in enumerate(zip(result['respondent'], result['question'])):
    print(ix, respondent, question)

0 55134294-48bc-4e6a-8fa5-41d7ae3b7a70 Vi delar resultat och diskuterar kartläggningarna tillsammans med all personal på skolan
1 55134294-48bc-4e6a-8fa5-41d7ae3b7a70 I vår skola respekterar eleverna alla människor
2 55134294-48bc-4e6a-8fa5-41d7ae3b7a70 Tillsammans med eleverna diskuterar vi återkommande andra människors egenvärde samt deras kroppsliga och personliga integritet
3 55134294-48bc-4e6a-8fa5-41d7ae3b7a70 I min(a) klass(er) är klassreglerna framtagna gemensamt av elever och lärare
4 55134294-48bc-4e6a-8fa5-41d7ae3b7a70 I min(a) klass(er) kan alla elever känna sig trygga 
5 55134294-48bc-4e6a-8fa5-41d7ae3b7a70 Rektor eller skolledning är bra på att stötta och uppmuntra personalen i arbetet mot kränkningar eller mobbning
6 55134294-48bc-4e6a-8fa5-41d7ae3b7a70 Skolans personal finns på de platser i skolan som eleverna i kartläggningar har upplevt som mest otrygga
7 55134294-48bc-4e6a-8fa5-41d7ae3b7a70 Har du tankar eller kommentarer om hur det var att svara på denna enkät? Skri

In [34]:
for i, q in enumerate(result['question']):
    for j, p in enumerate(result['question']):
        if i == j:
            continue
        if q == p:
            print(f"""
            {i} and {j} have the same question:
            Question: {q}
            """)


            8 and 122 have the same question:
            Question: Särskilt utarbetade strategier för att skapa goda relationer mellan eleverna tillämpas i vår skola 
            

            10 and 125 have the same question:
            Question: Lärare på vår skola skapar ordning och arbetsro i klassrummen
            

            14 and 112 have the same question:
            Question: Vi har ett fungerande system där personal befinner sig bland eleverna på rasterna, tex. ett utarbetat rastvaktsschema 
            

            15 and 114 have the same question:
            Question: Skolpersonalen har över lag lätt för att engagera sig i arbetet mot kränkningar eller mobbning
            

            17 and 90 have the same question:
            Question: Vi delar resultat och diskuterar kartläggningarna tillsammans med eleverna 
            

            18 and 127 have the same question:
            Question: I min(a) klass(er) gillar de flesta eleverna skolan
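The pairwise loop above compares every question with every other one. As an alternative sketch, duplicates can be grouped in a single pass by collecting the positions of each value. The helper `find_duplicates` and the placeholder strings are made up for illustration. With the notebook's data, the input would be `result['question']`.

```python
from collections import defaultdict


def find_duplicates(values):
    """Map each value that occurs more than once to the list of its positions."""
    positions = defaultdict(list)
    for index, value in enumerate(values):
        positions[value].append(index)
    return {value: idxs for value, idxs in positions.items() if len(idxs) > 1}


# Placeholder question texts standing in for the real query result.
questions = ["Q1", "Q2", "Q1", "Q3", "Q2", "Q1"]
duplicates = find_duplicates(questions)
for question, idxs in duplicates.items():
    print(f"{idxs} share the question: {question}")
```

Each duplicated question is reported once with all of its positions, rather than once per pair.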
            


### Before removing duplicate relationships
![Before removing duplicate relationships](fastrp_knn_duplicate_relationships.png)

### Remove duplicate relationships

In [2]:
gds.run_cypher("""
    MATCH (r1:Respondent)-[rel1:SIMILAR]->(r2:Respondent)
    WHERE id(r1) < id(r2) AND EXISTS((r2)-[:SIMILAR]->(r1))
    WITH rel1
    DELETE rel1
    """)

### After removing duplicate relationships
![After removing duplicate relationships](fastrp_knn_removed_duplicate_relationships.png)
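The effect of the `DELETE` query can be reproduced on a plain list of directed pairs. This is a sketch with made-up integer ids and a hypothetical helper name. A pair is dropped only when its source id is smaller than its target id and the reverse pair also exists, so one direction of every mutual pair survives and one-way pairs are untouched.

```python
def drop_reciprocal_duplicates(pairs):
    """Keep one direction of every mutual pair; leave one-way pairs untouched."""
    existing = set(pairs)
    kept = []
    for source, target in pairs:
        # Mirrors the WHERE clause: drop (source)->(target) when
        # source < target and the reverse relationship exists.
        if source < target and (target, source) in existing:
            continue
        kept.append((source, target))
    return kept


pairs = [(1, 2), (2, 1), (1, 3), (4, 3), (3, 4)]
deduplicated = drop_reciprocal_duplicates(pairs)
print(deduplicated)  # [(2, 1), (1, 3), (4, 3)]
```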

## Community detection for the similar Respondents

### Project the Respondents and their `SIMILAR` relationships, including the `score` property

In [5]:
G, result = gds.graph.project(
    "knnSimilarRespondents",
    ["Respondent"],
    {
        "SIMILAR": { "orientation": "UNDIRECTED" }
    },
    relationshipProperties="score"
)

print(f"The projection took {result['projectMillis']} ms")
print(f"Graph '{G.name()}' node count: {G.node_count()}")
print(f"Graph '{G.name()}' node labels: {G.node_labels()}")

The projection took 107 ms
Graph 'knnSimilarRespondents' node count: 3475
Graph 'knnSimilarRespondents' node labels: ['Respondent']


### Louvain

In [13]:
result = gds.louvain.stream(G)

print(f"Number of communities: {len(set(result['communityId']))}")

Louvain:   0%|          | 0/100 [00:00<?, ?%/s]

Number of communities: 25


In [12]:
result = gds.louvain.stream(G, relationshipWeightProperty="score")

print(f"Number of communities: {len(set(result['communityId']))}")

Louvain:   0%|          | 0/100 [00:00<?, ?%/s]

36

In [14]:
result = gds.louvain.write(G, relationshipWeightProperty="score", writeProperty="louvainCommunity")
print(f"No. of communities: {result['communityCount']}")
print(f"Modularity: {result['modularity']}")

Louvain:   0%|          | 0/100 [00:00<?, ?%/s]

No. of communities: 38
Modularity: 0.7312269661112804
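Louvain searches for a partition with high modularity. As a rough sketch of what that number measures, the function below computes the standard modularity of a given partition of an undirected, weighted graph, Q = Σ_c [ w_in(c) / m − (d(c) / 2m)² ], where m is the total relationship weight, w_in(c) the weight inside community c, and d(c) the summed weighted degree of its nodes. The toy graph and the function name are made up, and GDS may differ in implementation details.

```python
from collections import defaultdict


def modularity(weighted_edges, community_of):
    """Modularity of a partition of an undirected, weighted graph."""
    total_weight = sum(w for _, _, w in weighted_edges)  # m
    internal = defaultdict(float)  # weight of relationships inside each community
    degree = defaultdict(float)    # summed weighted degree per community
    for a, b, w in weighted_edges:
        degree[community_of[a]] += w
        degree[community_of[b]] += w
        if community_of[a] == community_of[b]:
            internal[community_of[a]] += w
    return sum(
        internal[c] / total_weight - (degree[c] / (2 * total_weight)) ** 2
        for c in degree
    )


# Two triangles joined by a single relationship, all weights 1.0.
edges = [
    ("a", "b", 1.0), ("b", "c", 1.0), ("a", "c", 1.0),
    ("d", "e", 1.0), ("e", "f", 1.0), ("d", "f", 1.0),
    ("c", "d", 1.0),
]
two_communities = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_community = {node: 0 for node in two_communities}

q_split = modularity(edges, two_communities)
q_single = modularity(edges, one_community)
print(q_split, q_single)
```

Splitting the toy graph into its two triangles gives a modularity of 5/14, while putting every node into one community gives 0, which is why the split is preferred.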


### Unweighted Louvain

```cypher
CALL gds.louvain.stream('knnSimilarRespondents')
YIELD nodeId, communityId, intermediateCommunityIds
RETURN count(DISTINCT communityId);
```

### Weighted Louvain

```cypher
CALL gds.louvain.stream('knnSimilarRespondents', { relationshipWeightProperty: 'score' })
YIELD nodeId, communityId, intermediateCommunityIds
RETURN count(DISTINCT communityId);
```

### APOC does not seem to be installed

https://github.com/neo4j-contrib/neo4j-apoc-procedures/releases  
https://neo4j.com/labs/apoc/5/installation/



### Visualize

Export to Gephi with APOC: https://neo4j.com/labs/apoc/5/export/gephi/

Gephi: https://gephi.org/

### APOC installed
The configuration had to be edited to get APOC working.