This tutorial describes existing approaches to modeling graph databases, the techniques implemented in RDF and database engines, and their main drawbacks when a large volume of interconnected data must be traversed. We focus on solutions proposed in the context of both the Semantic Web and databases to manage large graphs. The target audience includes researchers and practitioners who develop or use query engines to consume RDF graphs. Participants will learn the properties of existing RDF and graph-based engines and how current approaches need to be extended to support efficient graph-based operations. A hands-on session will allow attendees to evaluate the performance and robustness of existing approaches.
Attendees will take part in a hands-on session and implement different core graph tasks using the APIs offered by existing graph database engines. We will provide Java libraries to interact with the APIs of Neo4j and Sparksee. We assume that participants have installed Java 6 and a text editor on their laptops. A Mac or Linux environment is also recommended.
The hands-on session consists of four assignments.
First of all, open the terminal and go to the Hands-On folder. You have been given almost complete code for these assignments in the experiment/ folder. Complete the code wherever a comment says INSERT CODE HERE or INSERT String HERE.
- Implement the query “Papers written by Peter Smith” using the Sparksee API and Neo4j API.
Complete the code for Neo4j, located in the files experiment/Neo4jCreate.java and experiment/Neo4jQuery.java. Then run:
> make Neo4jCreate
> make Neo4jQuery
> ./publications Neo4jCreate neo4j_db
> ./publications Neo4jQuery neo4j_db
Now complete the code for Sparksee, located in the files experiment/SparkseeCreate.java and experiment/SparkseeQuery.java. Then run:
> make SparkseeCreate
> make SparkseeQuery
> ./publications SparkseeCreate sparksee_db
> ./publications SparkseeQuery sparksee_db
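The query in this assignment comes down to selecting the edges with a given label that point at a given author. The sketch below shows that pattern on a plain in-memory edge list, independent of both engines. It does not use the Neo4j or Sparksee API, and the class name `AuthorQuerySketch`, the `Edge` type, the edge label `author`, and the sample data are all hypothetical names chosen for illustration, not taken from publications.nt or the provided libraries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Engine-independent sketch: the graph is a plain list of labeled edges.
final class AuthorQuerySketch {

    // One labeled edge: source --label--> target.
    static final class Edge {
        final String source;
        final String label;
        final String target;

        Edge(String source, String label, String target) {
            this.source = source;
            this.label = label;
            this.target = target;
        }
    }

    // Returns the sources of all "author" edges that point at the given author.
    // The label "author" is an assumption made for this example.
    static List<String> papersWrittenBy(List<Edge> edges, String author) {
        List<String> papers = new ArrayList<String>();
        for (Edge e : edges) {
            if ("author".equals(e.label) && author.equals(e.target)) {
                papers.add(e.source);
            }
        }
        return papers;
    }

    public static void main(String[] args) {
        // Made-up sample data.
        List<Edge> edges = Arrays.asList(
            new Edge("paper1", "author", "Peter Smith"),
            new Edge("paper2", "author", "Juan Perez"),
            new Edge("paper3", "author", "Peter Smith"),
            new Edge("paper1", "cites", "paper2"));
        System.out.println(papersWrittenBy(edges, "Peter Smith")); // prints [paper1, paper3]
    }
}
```

A graph engine would typically answer the same question by starting at the author node and following its incident edges rather than scanning every edge, which is the difference the assignment lets you observe.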
- Implement the following queries using the Graphium API.
First, let's create the Graphium DB from the NT file publications.nt:
> ./create <Neo4j or Sparksee> publications.nt graphium_db
Now...
- Papers written by Peter Smith
Complete the code in experiment/A.java, and run (in the terminal)...
> make A
> ./publications A graphium_db
- Papers cited by a paper written by Peter Smith that have at most 2 citations
Complete the code in experiment/B.java, and run (in the terminal)...
> make B
> ./publications B graphium_db
- Papers cited by a paper written by Peter Smith or cited by papers cited by a paper written by Peter Smith
Complete the code in experiment/C.java, and run (in the terminal)...
> make C
> ./publications C graphium_db
- Number of papers cited by a paper written by Peter Smith or cited by papers cited by a paper written by Peter Smith
Complete the code in experiment/D.java, and run (in the terminal)...
> make D
> ./publications D graphium_db
- Number of papers cited by a paper written by Peter Smith or cited by papers cited by a paper written by Peter Smith, and that have been published in ESWC
Complete the code in experiment/E.java, and run (in the terminal)...
> make E
> ./publications E graphium_db
- Number of papers cited by a paper written by Peter Smith or cited by papers cited by a paper written by Peter Smith, that have been published in ESWC and that have at most 4 co-authors
Complete the code in experiment/F.java, and run (in the terminal)...
> make F
> ./publications F graphium_db
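Several of the queries above share one traversal: collect the papers reachable from Peter Smith's papers in one or two citation hops, then count or filter them. The following is a minimal sketch of that pattern on in-memory maps. It is not the Graphium API; the class `CitationQuerySketch`, its methods, and the sample data are hypothetical and only illustrate the logic.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch of the one-or-two-hop citation pattern on plain maps (not the Graphium API).
final class CitationQuerySketch {

    // Collects every paper cited by at least one of the given papers.
    static Set<String> citedBy(Map<String, List<String>> cites, Iterable<String> papers) {
        Set<String> result = new LinkedHashSet<String>();
        for (String paper : papers) {
            List<String> cited = cites.get(paper);
            if (cited != null) {
                result.addAll(cited);
            }
        }
        return result;
    }

    // Papers cited by an authored paper, or cited by a paper that an authored paper cites.
    static Set<String> citedWithinTwoHops(Map<String, List<String>> cites, List<String> authored) {
        Set<String> oneHop = citedBy(cites, authored);
        Set<String> all = new LinkedHashSet<String>(oneHop);
        all.addAll(citedBy(cites, oneHop)); // the set keeps each paper once
        return all;
    }

    // Counts the papers in the set whose venue equals the given one.
    static int countPublishedIn(Set<String> papers, Map<String, String> venueOf, String venue) {
        int count = 0;
        for (String paper : papers) {
            if (venue.equals(venueOf.get(paper))) {
                count++;
            }
        }
        return count;
    }

    // Made-up citation edges: paper -> papers it cites.
    static Map<String, List<String>> sampleCitations() {
        Map<String, List<String>> cites = new HashMap<String, List<String>>();
        cites.put("p1", Arrays.asList("p2", "p3"));
        cites.put("p2", Arrays.asList("p4"));
        cites.put("p3", Arrays.asList("p4", "p1"));
        cites.put("p4", Arrays.asList("p5"));
        return cites;
    }

    // Made-up venues: paper -> venue.
    static Map<String, String> sampleVenues() {
        Map<String, String> venueOf = new HashMap<String, String>();
        venueOf.put("p1", "ESWC");
        venueOf.put("p2", "ESWC");
        venueOf.put("p3", "ISWC");
        venueOf.put("p4", "ESWC");
        return venueOf;
    }

    public static void main(String[] args) {
        // Assume p1 is the only paper written by the author of interest.
        Set<String> reached = citedWithinTwoHops(sampleCitations(), Arrays.asList("p1"));
        System.out.println(reached);                                            // prints [p2, p3, p4, p1]
        System.out.println(reached.size());                                     // prints 4
        System.out.println(countPublishedIn(reached, sampleVenues(), "ESWC"));  // prints 3
    }
}
```

Using a set matters here: p4 is reached through both p2 and p3 but is counted once, and p1 appears in its own result because p3 cites it. Whether an author's own papers should be excluded is a design choice the sketch leaves open.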
- Implement graph invariants using the Graphium API.
- Number of nodes/vertices in the graph.
Complete the code in experiment/Nodes.java, and run (in the terminal)...
> make Nodes
> ./publications Nodes graphium_db
- Graph Density.
Complete the code in experiment/Density.java, and run (in the terminal)...
> make Density
> ./publications Density graphium_db
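As a reference point for the two invariants above, here is a small sketch that computes them from a plain edge list. It assumes a directed graph without self-loops, where a node is anything that appears at either end of an edge and density is |E| / (|V| * (|V| - 1)). The hands-on code may define density differently, and the class `GraphInvariantsSketch` is a hypothetical name that does not come from the Graphium API.

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of two graph invariants on a plain edge list (not the Graphium API).
final class GraphInvariantsSketch {

    // Each edge is a {source, target} pair; a node is any endpoint of an edge.
    static int countNodes(String[][] edges) {
        Set<String> nodes = new HashSet<String>();
        for (String[] edge : edges) {
            nodes.add(edge[0]);
            nodes.add(edge[1]);
        }
        return nodes.size();
    }

    // Density of a directed graph without self-loops: |E| / (|V| * (|V| - 1)).
    // Returns 0.0 for graphs with fewer than two nodes to avoid dividing by zero.
    static double density(long nodeCount, long edgeCount) {
        if (nodeCount < 2) {
            return 0.0;
        }
        return (double) edgeCount / ((double) nodeCount * (nodeCount - 1));
    }

    public static void main(String[] args) {
        // Made-up directed cycle a -> b -> c -> a.
        String[][] edges = { {"a", "b"}, {"b", "c"}, {"c", "a"} };
        int nodes = countNodes(edges);
        System.out.println(nodes);                         // prints 3
        System.out.println(density(nodes, edges.length));  // prints 0.5
    }
}
```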
- Compute graph invariants of different RDF graphs using the Chrysalis tool, and upload the results to the Graphium Chrysalis website.
Go to the DAW website and download one of the datasets (.nt files). Then...
> ./create <Neo4j or Sparksee> <.nt file path> test_db
> ./chrysalis test_db
Now, go to the Graphium Chrysalis website and upload the file chrysalis.log to visualize it.
After you finish all the assignments, zip the experiment folder and send it to the judge. Good luck!