# Laboratory 1: Property Graphs
### Luis Alfredo Leon Villapún
### Liliia Aliakberova

# Part B: Querying
* * *
In this section we run the suggested queries against the graph.

## Creating the connector
As in the previous parts, we first create the connector that handles communication with Neo4j.

In [1]:
# Uncomment to install extension
#!pip install ipython-cypher

In [2]:
%load_ext cypher

In [3]:
from connector import Neo4jConnector
from getpass import getpass

uri = "neo4j://localhost:7687"
user = "neo4j"
password = getpass("Input your password to connect")
conn = Neo4jConnector(uri, user, password)

Input your password to connect········


## Queries
1. Find the top 3 most cited papers of each conference.

In [4]:
# From documentation site: https://neo4j.com/docs/python-manual/current/

def query_1(tx):
    result = tx.run(
        """
        MATCH (p:Paper)<-[c:CITED_BY]-(a:Paper)-[r:PUBLISHED_AT]->(d:Document)
        WHERE d.DocumentType = "Conference"
        WITH d.ConferenceName AS name, a, count(c) AS citations
        ORDER BY name, citations DESC
        WITH name, collect({paper: a.Title, cited: citations}) AS papers
        RETURN name AS Conference,
               [p IN papers[..3] | p.paper] AS Papers,
               [p IN papers[..3] | p.cited] AS Cited
        """
    )
    records = list(result)
    summary = result.consume()
    return records, summary

with conn.driver.session(database="neo4j") as session:
    records, summary = session.execute_read(query_1)
    print(records, summary)

[<Record Conference='AAAI Workshop' Papers=['Semantic style creation'] Cited=[6]>, <Record Conference='ACM-BCB-International Conference on Bioinformatics, Computational Biology, and Health Informatics' Papers=['Feedback regulation of immune response to maximum exercise in Gulf war illness', 'The PepSeq Pipeline: Software for Antimicrobial Motif Discovery in Randomly-Generated Peptide Libraries'] Cited=[8, 3]>, <Record Conference='ACM/IEEE International Conference on Human-Robot Interaction' Papers=['Haptic Shape-Based Management of Robot Teams in Cordon and Patrol', 'Design and Evaluation of Adverb Palette: A GUI for Selecting Tradeoffs in Multi-objective Optimization Problems'] Cited=[9, 8]>, <Record Conference='AIAA Aerospace Sciences Meeting' Papers=['Mixing plane simulation of a high-performance fan using kestrel', 'Effect of the turbulence modeling in large-eddy simulations of nonpremixed flames undergoing extinction and reignition', 'Analysis of distortion transfer and generation
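
The ranking logic of the query (sort by conference and citation count, collect per conference, keep the first three) can also be sketched in plain Python. This is a minimal illustration on hypothetical rows, not data from the graph; the function name `top_cited` and the row layout are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical (conference, paper title, citation count) rows, standing in for
# the rows produced by the first WITH clause of the Cypher query.
rows = [
    ("Conf A", "Paper 1", 5),
    ("Conf A", "Paper 2", 9),
    ("Conf A", "Paper 3", 2),
    ("Conf A", "Paper 4", 7),
    ("Conf B", "Paper 5", 1),
]

def top_cited(rows, k=3):
    """Return {conference: [(title, citations), ...]} with at most k papers each."""
    grouped = defaultdict(list)
    # Sort by conference name, then by citations descending
    # (the ORDER BY name, citations DESC step).
    for name, title, citations in sorted(rows, key=lambda r: (r[0], -r[2])):
        grouped[name].append((title, citations))
    # Keep the first k entries of each list, like papers[..3] in Cypher.
    return {name: papers[:k] for name, papers in grouped.items()}

result = top_cited(rows)
print(result)
```

As in the query output, a conference with fewer than three cited papers simply returns a shorter list.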

2. Find the community of each conference, i.e. the authors who have published there in more than three distinct editions.

In [5]:
# From documentation site: https://neo4j.com/docs/python-manual/current/

def query_2(tx):
    result = tx.run(
        """
        MATCH (a:Author)<-[w:WRITTEN_BY]-(p:Paper)-[r:PUBLISHED_AT]->(d:Document)
        WITH a,d.ConferenceName as Conference_Name, count(DISTINCT d.Volume) as Editions
        WHERE Editions > 3
        RETURN Conference_Name, collect(a.AuthorName)  as Community_member
        """
    )
    records = list(result)
    summary = result.consume()
    return records, summary

with conn.driver.session(database="neo4j") as session:
    records, summary = session.execute_read(query_2)
    print(records, summary)

[<Record Conference_Name='AIAA/CEAS Aeroacoustics Conference' Community_member=[' James M.M.', ' Wall A.T.', ' Gee K.L.', ' Neilsen T.B.']>, <Record Conference_Name='ASME Design Engineering Technical Conference' Community_member=[' Mattson C.A.', ' Howell L.L.', ' Magleby S.P.']>, <Record Conference_Name='ASME Turbo Expo' Community_member=[' Gorrell S.E.']>, <Record Conference_Name='Fall Technical Meeting of the Western States Section of the Combustion Institute, WSSCI' Community_member=[' Fletcher T.H.']>, <Record Conference_Name='Geotechnical Special Publication' Community_member=['Franke K.W.', 'Rollins K.M.']>, <Record Conference_Name='IEEE Photonics Conference, IPC' Community_member=[' Hawkins A.R.', ' Schmidt H.']>, <Record Conference_Name='International Conference on Engineering and Product Design Education' Community_member=[' Howell B.']>, <Record Conference_Name='International Telemetering Conference' Community_member=['Afran M.S.', ' Rice M.', ' Saquib M.', 'Rice M.']>, <Rec
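
The output above lists both `' Rice M.'` and `'Rice M.'` for the same conference, which suggests that some author names carry leading whitespace and are therefore counted as different authors. The sketch below reproduces the community rule in plain Python and strips names before grouping. It runs on hypothetical rows; the function name `communities` and the row layout are assumptions made for the example.

```python
from collections import defaultdict

def communities(rows, min_editions=4):
    """Return {conference: [authors]} for authors seen in >= min_editions editions.

    rows is an iterable of (author, conference, edition) tuples.
    """
    editions = defaultdict(set)
    for author, conference, edition in rows:
        # Strip whitespace so ' Rice M.' and 'Rice M.' count as one author.
        editions[(author.strip(), conference)].add(edition)

    result = defaultdict(list)
    for (author, conference), seen in sorted(editions.items()):
        if len(seen) >= min_editions:  # same threshold as Editions > 3
            result[conference].append(author)
    return dict(result)

# Hypothetical sample data.
rows = [
    (" Ada", "Conf A", 1), ("Ada", "Conf A", 2),
    (" Ada", "Conf A", 3), ("Ada", "Conf A", 4),
    ("Bob", "Conf A", 1), ("Bob", "Conf A", 2), ("Bob", "Conf A", 3),
    ("Cy", "Conf B", 10), ("Cy", "Conf B", 11),
    ("Cy", "Conf B", 12), ("Cy", "Conf B", 13), ("Cy", "Conf B", 13),
]

members = communities(rows)
print(members)
```

Whether the names should instead be cleaned once at load time depends on how the graph was populated, which is outside this section.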

3. Find the impact factors of the journals in your graph.

In [8]:
# From documentation site: https://neo4j.com/docs/python-manual/current/

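def query_3(tx):
    # Placeholder: this query only returns five Author nodes; it does not
    # compute journal impact factors.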
    result = tx.run(
        """
        MATCH(n:Author) RETURN n LIMIT 5
        """
    )
    records = list(result)
    summary = result.consume()
    return records, summary

with conn.driver.session(database="neo4j") as session:
    records, summary = session.execute_read(query_3)
    print(records, summary)

[<Record n=<Node element_id='29101' labels=frozenset({'Author'}) properties={'AuthorName': 'Khah F.S.'}>>, <Record n=<Node element_id='29114' labels=frozenset({'Author'}) properties={'AuthorName': ' Rybkowski Z.K.'}>>, <Record n=<Node element_id='29116' labels=frozenset({'Author'}) properties={'AuthorName': ' Ray Pentecost A.'}>>, <Record n=<Node element_id='29117' labels=frozenset({'Author'}) properties={'AuthorName': ' Smith J.P.'}>>, <Record n=<Node element_id='29118' labels=frozenset({'Author'}) properties={'AuthorName': ' Muir R.'}>>] <neo4j.work.summary.ResultSummary object at 0x7f7c4b589760>
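
For reference, the commonly used two-year impact factor of a journal in year Y is the number of citations received in Y by papers the journal published in Y-1 and Y-2, divided by the number of papers it published in those two years. The sketch below computes it in plain Python on hypothetical data. The data layout and the function name `impact_factors` are assumptions made for the example and do not reflect the graph schema.

```python
from collections import Counter

def impact_factors(papers, citations, year):
    """Two-year impact factor per journal for the given year.

    papers:    {paper_id: (journal, publication_year)}
    citations: list of (cited_paper_id, citing_year)
    """
    window = {year - 1, year - 2}
    published = Counter()  # papers per journal published in the window
    cited = Counter()      # citations in `year` to those papers

    for journal, pub_year in papers.values():
        if pub_year in window:
            published[journal] += 1

    for paper_id, citing_year in citations:
        journal, pub_year = papers[paper_id]
        if citing_year == year and pub_year in window:
            cited[journal] += 1

    # Journals with no papers in the window are left out (no division by zero).
    return {journal: cited[journal] / n for journal, n in published.items()}

# Hypothetical sample data.
papers = {
    "p1": ("J1", 2021),
    "p2": ("J1", 2022),
    "p3": ("J1", 2019),  # outside the 2021-2022 window
    "p4": ("J2", 2022),
}
citations = [
    ("p1", 2023), ("p1", 2023), ("p2", 2023),
    ("p2", 2022),  # cited in the wrong year, ignored
    ("p3", 2023),  # cited paper outside the window, ignored
    ("p4", 2023),
]

factors = impact_factors(papers, citations, 2023)
print(factors)
```

Translating this into Cypher would require publication years on both the cited and the citing papers, which the queries in this section do not show.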


4. Find the h-indexes of the authors in your graph.

In [10]:
# From documentation site: https://neo4j.com/docs/python-manual/current/

def query_4(tx):
    result = tx.run(
        """
        MATCH (a:Author)<-[w:WRITTEN_BY]-(p:Paper)-[b:CITED_BY]->(p2:Paper)
        WITH a, p, count(b) AS citations
        WITH a, p, citations ORDER BY citations DESC
        WITH a, count(p) AS total, collect(citations) AS list
        WITH a, total, list,
             [x IN range(1, size(list)) WHERE x <= list[x - 1] | [list[x - 1], x]] AS list_hindex
        WITH *, list_hindex[-1][1] AS h_index
        ORDER BY h_index DESC
        RETURN a.AuthorName AS Author, h_index
        """
    )
    records = list(result)
    summary = result.consume()
    return records, summary

with conn.driver.session(database="neo4j") as session:
    records, summary = session.execute_read(query_4)
    print(records, summary)

[<Record Author=' Magleby S.P.' h_index=8>, <Record Author=' Howell L.L.' h_index=8>, <Record Author=' Hawkins A.R.' h_index=7>, <Record Author=' Schmidt H.' h_index=7>, <Record Author=' Beard R.W.' h_index=7>, <Record Author=' Hedengren J.D.' h_index=7>, <Record Author=' Gee K.L.' h_index=6>, <Record Author=' Fletcher T.H.' h_index=6>, <Record Author=' Iverson B.D.' h_index=6>, <Record Author=' Mattson C.A.' h_index=6>, <Record Author=' Neilsen T.B.' h_index=5>, <Record Author=' James M.M.' h_index=5>, <Record Author=' Wall A.T.' h_index=5>, <Record Author=' Warnick S.' h_index=5>, <Record Author=' Rice M.' h_index=5>, <Record Author=' Stott M.A.' h_index=5>, <Record Author=' Jensen B.D.' h_index=5>, <Record Author=' Ning A.' h_index=5>, <Record Author=' Gorrell S.E.' h_index=5>, <Record Author=' Harrison W.K.' h_index=5>, <Record Author=' Seamons K.' h_index=5>, <Record Author=' Zappala D.' h_index=5>, <Record Author=' McLain T.W.' h_index=5>, <Record Author=' McLain T.' h_index=4>, 
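
The query computes the h-index by sorting each author's citation counts in descending order and taking the largest rank x whose paper has at least x citations. The same rule can be sketched in plain Python; the function name `h_index` and the sample counts are hypothetical and only serve to check the logic.

```python
def h_index(citations):
    """Largest h such that at least h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank   # the paper at this rank still has enough citations
        else:
            break      # counts only decrease from here, so stop
    return h

# Hypothetical citation counts for one author's papers.
print(h_index([10, 8, 5, 4, 3]))
```

Because the Cypher pattern requires at least one `CITED_BY` relationship, authors without any cited paper do not appear in the query result, whereas this sketch returns 0 for an empty list.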