Performance Benchmarks

This document charts the results of a series of benchmarks that compare the performance of graphy against other libraries, as well as against itself under different modes and options. The benchmarks cover a small set of task objectives designed to simulate real-world scenarios.

Want to see how other libraries stack up? Feel free to open an issue.

Interpreting the Charts

  • Each data point in the following charts represents the mean value of 5 trials.
  • The X-axis of every chart is in millions of quads and corresponds to the number of triples/quads fed into the process via stdin.
  • The Y-axis for each 'Velocity' chart denotes the number of Quads per millisecond (Quads/ms) at which the task objective completed.
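As a rough illustration of how a single 'Velocity' data point could be derived, the sketch below averages the per-trial throughput in Quads/ms. The function name `meanVelocity` and the sample timings are hypothetical, and the sketch assumes the mean is taken over per-trial velocities; the actual benchmark harness may average elapsed times instead.

```javascript
// Hypothetical helper (not part of graphy): mean throughput in Quads/ms.
// quadCount: number of triples/quads fed to the process.
// elapsedMsPerTrial: elapsed wall-clock time of each trial, in milliseconds.
function meanVelocity(quadCount, elapsedMsPerTrial) {
  if (elapsedMsPerTrial.length === 0) {
    throw new Error('at least one trial is required');
  }
  // Velocity of each trial, then the arithmetic mean across trials.
  const velocities = elapsedMsPerTrial.map((ms) => quadCount / ms);
  const total = velocities.reduce((sum, v) => sum + v, 0);
  return total / velocities.length;
}

// Made-up timings for 1 million quads over 5 trials.
const exampleVelocity = meanVelocity(1e6, [2000, 2000, 2500, 2500, 2000]);
console.log(`${exampleVelocity} Quads/ms`);
```

Averaging velocities and averaging elapsed times give slightly different results when trial times vary, so it is worth checking which one a given chart uses before comparing numbers.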

Disclaimers

  • Memory-intensive tasks (e.g., the distinct task) were run with the --max-old-space-size=8192 node.js option. Some charts show a non-linear progression in time because V8's GC starts aggressively trying to free up memory.
  • Memory usage represents the resident set size (RSS) at the moment the results are reported. For graphy/scan modes, memory usage stats are not yet available.
  • All Turtle input files use prefixed names for identifiers when possible.
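For readers who want to take a comparable memory reading themselves, here is a minimal sketch using Node.js built-ins: `process.memoryUsage()` for the RSS and `v8.getHeapStatistics()` for the heap limit set by --max-old-space-size. The helper names `toMiB`, `reportMemory` and `heapLimitMiB` are assumptions made for illustration and are not taken from the benchmark scripts.

```javascript
const v8 = require('v8');

// Convert a byte count to mebibytes (MiB), the unit used in the charts.
function toMiB(bytes) {
  return bytes / (1024 * 1024);
}

// Snapshot of the current process memory at the moment of the call.
function reportMemory() {
  const { rss, heapUsed } = process.memoryUsage();
  return { rssMiB: toMiB(rss), heapUsedMiB: toMiB(heapUsed) };
}

// Heap size limit V8 is currently running with; it is affected by
// --max-old-space-size but does not necessarily equal that value exactly.
function heapLimitMiB() {
  return toMiB(v8.getHeapStatistics().heap_size_limit);
}

const snapshot = reportMemory();
console.log(`RSS: ${snapshot.rssMiB.toFixed(1)} MiB`);
console.log(`Heap limit: ${heapLimitMiB().toFixed(0)} MiB`);
```

Because the reading is a single snapshot, it reflects memory held at reporting time rather than the peak reached during the run.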

There are multiple modes for graphy:

  • the default mode, which means that validation is enabled for reading
  • 'relaxed' mode, which skips validation for faster reading
  • 'scan' mode, which uses multiple threads (2, 4, 8 or 16 in these trials) to read the input stream

Competitors

Task Objectives

  • Count Task -- Count the number of statements in an RDF document.
  • Distinct Task -- Count the distinct number of triples/quads in an RDF document.
  • Convert Task -- Convert an RDF document from one serialization format to another.
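To make the difference between the count and distinct objectives concrete, the following is a deliberately naive sketch for N-Triples input only. It does not use graphy or any RDF parser: it assumes one statement per line and treats two statements as equal only when their trimmed lines are identical, which a real parser would not require. All function names are hypothetical.

```javascript
// True for lines that carry a statement (not blank, not a comment).
function isStatementLine(line) {
  const trimmed = line.trim();
  return trimmed.length > 0 && !trimmed.startsWith('#');
}

// Count task: constant memory, one counter.
function countStatements(lines) {
  let count = 0;
  for (const line of lines) {
    if (isStatementLine(line)) count++;
  }
  return count;
}

// Distinct task: every unique statement must be retained, so memory
// grows with the number of distinct statements in the input.
function countDistinctStatements(lines) {
  const seen = new Set();
  for (const line of lines) {
    if (isStatementLine(line)) seen.add(line.trim());
  }
  return seen.size;
}

// Illustrative sample with one duplicated triple.
const sample = [
  '# a comment',
  '<http://example.org/a> <http://example.org/p> "1" .',
  '<http://example.org/b> <http://example.org/p> "2" .',
  '<http://example.org/a> <http://example.org/p> "1" .',
  '',
];
console.log(countStatements(sample), countDistinctStatements(sample));
```

The set of seen statements is why the distinct task is the memory-intensive one, while the count task needs only a single counter.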

⬇️        ⬇️        ⬇️


Count Task

Count the number of statements in an RDF document.

Test Flavors:

Count Task With N-Triples as input

Input File: Wikidata Data Dump

Charts: Velocity (Quads/ms, ▲=👍) and Memory Usage (MiB, ▼=👍) for the count task with N-Triples as input.

Input File: DBpedia "Person Data" Dump

Charts: Velocity (Quads/ms, ▲=👍) and Memory Usage (MiB, ▼=👍) for the count task with N-Triples as input.

Count Task With Turtle as input

Input File: Wikidata Data Dump

Charts: Velocity (Quads/ms, ▲=👍) and Memory Usage (MiB, ▼=👍) for the count task with Turtle as input.

Input File: DBpedia "Person Data" Dump

Charts: Velocity (Quads/ms, ▲=👍) and Memory Usage (MiB, ▼=👍) for the count task with Turtle as input.

Distinct Task

Count the distinct number of triples/quads in an RDF document.

Test Flavors:

Distinct Task With N-Triples as input

Input File: Wikidata Data Dump

Charts: Velocity (Quads/ms, ▲=👍) and Memory Usage (MiB, ▼=👍) for the distinct task with N-Triples as input.

Input File: DBpedia "Person Data" Dump

Charts: Velocity (Quads/ms, ▲=👍) and Memory Usage (MiB, ▼=👍) for the distinct task with N-Triples as input.

Distinct Task With Turtle as input

Input File: Wikidata Data Dump

Charts: Velocity (Quads/ms, ▲=👍) and Memory Usage (MiB, ▼=👍) for the distinct task with Turtle as input.

Input File: DBpedia "Person Data" Dump

Charts: Velocity (Quads/ms, ▲=👍) and Memory Usage (MiB, ▼=👍) for the distinct task with Turtle as input.

Convert Task

Convert an RDF document from one serialization format to another.

Test Flavors:

Convert Task With N-Triples as input => Turtle as output

Input File: Wikidata Data Dump

Charts: Velocity (Quads/ms, ▲=👍) and Memory Usage (MiB, ▼=👍) for the convert task with N-Triples as input => Turtle as output.

Input File: DBpedia "Person Data" Dump

Charts: Velocity (Quads/ms, ▲=👍) and Memory Usage (MiB, ▼=👍) for the convert task with N-Triples as input => Turtle as output.

Convert Task With Turtle as input => N-Triples as output

Input File: Wikidata Data Dump

Charts: Velocity (Quads/ms, ▲=👍) and Memory Usage (MiB, ▼=👍) for the convert task with Turtle as input => N-Triples as output.

Input File: DBpedia "Person Data" Dump

Charts: Velocity (Quads/ms, ▲=👍) and Memory Usage (MiB, ▼=👍) for the convert task with Turtle as input => N-Triples as output.