Configuration Options

The following configuration options are stored in the file test.properties and define the behaviour of the SPB test driver.

ontologiesPath - path to ontologies from reference knowledge, default: ./data/ontologies
referenceDatasetsPath - path to data from reference knowledge, default: ./data/datasets
creativeWorksPath - path to generated data, default: ./data/generated
queriesPath - path to query templates, default: ./data/sparql
definitionsPath - path to definitions.properties configuration file, default: ./definitions.properties
endpointURL - URL of SPARQL endpoint provided by the RDF database, requires updating
endpointUpdateURL - URL of SPARQL endpoint used for update operations, requires updating
datasetSize - amount of generated data (triples), requires updating
generatedTriplesPerFile - number of triples per generated file. Used to split the data generation into a number of files
adjustRefDatasetsSizes - optional. If reference dataset files with the extension '.adjustablettl' exist, a new .ttl file is created for each of them, with its size adjusted to the selected amount of data to be generated (parameter 'datasetSize'), default value is true
allowSizeAdjustmentsOnDataModels - allows the data generator to dynamically adjust the amount of correlations, clusterings and randomly generated models, keeping a ratio of 1/3 for each in the generated data model. This property overrides the definitions.properties parameters majorEvents, minorEvents and correlationsAmount. Default value is true
queryTimeoutSeconds - query timeout in seconds, default value is 300 s
systemQueryTimeoutSeconds - timeout for system queries, default value is 1 h
validationPath - location of the generated and reference data used in the validation phase, the default value can be used
generateCreativeWorksFormat - serialization format for generated data. Available options: TriG, TriX, N-Triples, N-Quads, N3, RDF/XML, RDF/JSON, Turtle. Use the exact names. Context-aware serialization formats such as N-Quads, TriX or TriG are required
warmupPeriodSeconds - warmup period in seconds, requires updating
benchmarkRunPeriodSeconds - benchmark period in seconds, requires updating
aggregationAgents - number of aggregation agents that will execute a mix of aggregation queries simultaneously, requires updating
editorialAgents - number of editorial agents that will execute a mix of update operations simultaneously, requires updating
dataGeneratorWorkers - number of worker threads used by the data generator to produce data, requires updating
generatorRandomSeed - sets the random seed for the data generator (default value is 0). Useful e.g. when several benchmark drivers are started in separate processes to generate data; to be used together with the creativeWorkNextId parameter
creativeWorkNextId - sets the next ID for Creative Works. When the benchmark driver is run in separate processes to generate synthetic data, increment this value by approximately 2.6M for each 50M generated triples to guarantee that the IDs of the generated creative works do not overlap
creativeWorksInfo - name of a file containing system info about the generated dataset, e.g. interesting entities, etc. (will be saved in 'creativeWorksPath')
querySubstitutionParameters - number of substitution parameters that will be generated for each query, default value is 100000
benchmarkByQueryRuns - sets the number of aggregate queries which the benchmark phase will execute. If the value is greater than zero, the parameter 'benchmarkRunPeriodSeconds' is ignored. E.g. if set to 100, the benchmark will measure the time to execute 100 aggregate operations
benchmarkByQueryMixRuns - sets the number of query mixes that will be executed by the benchmark. If the value is zero, execution of query mixes is not controlled by this parameter, default: 0
scriptsPath - sets the path to scripts participating in various benchmark actions. e.g. scripts may be executed after the load process has completed.
minUpdateRateThresholdOps - defines the minimum rate of editorial operations per second that must be reached within the initial part of the benchmark run (the time frame set by 'minUpdateRateThresholdReachTimePercent') and sustained for the rest of the run for the result to be valid. If set to zero, the update rate threshold is ignored. E.g. if the required update rate is set to 6.3 update operations per second, the benchmark checks that value during its run and reports an invalid result if the rate drops below the threshold
minUpdateRateThresholdReachTimePercent - defines the time frame within which the value of the property 'minUpdateRateThresholdOps' should be reached. Default value is 0.1 (10%). E.g. if set to 0.1, the update rate defined in 'minUpdateRateThresholdOps' should be reached during the first 10% of the benchmark run time; if it is not reached, the result is considered invalid
maxUpdateRateThresholdOps - defines the maximum rate of editorial operations per second. If set to zero that threshold is ignored.
interruptSignalLocation - defines the location of the interrupt signal (a file) which is used to interrupt the current driver's run when that signal has been set by another driver
enableEditorialOpeartionsValidation - enables validation of editorial operations (insert/delete) during the benchmark run. Validation is performed once every 'editorialOpsValidationInterval' operations, default: true
editorialOpsValidationInterval - sets the validation interval for editorial operations, default : 100
enableCompressionOnGeneratedData - enables gzip compression on generated data, default: false
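
The creativeWorkNextId rule of thumb above (roughly 2.6M IDs per 50M generated triples) can be turned into a small calculation when data is generated by several driver processes. The following Python sketch is illustrative only and is not part of the SPB driver: the helper names `next_id_offsets` and `render_overrides` are hypothetical, and it assumes that the first process starts at ID 0, that each process generates the same `datasetSize`, that partial 50M blocks are rounded up, and that the process index is an acceptable value for generatorRandomSeed.

```python
# Rule of thumb from the list above: advance creativeWorkNextId by ~2.6M
# for every 50M generated triples.
ID_INCREMENT_PER_BLOCK = 2_600_000
TRIPLES_PER_BLOCK = 50_000_000


def next_id_offsets(triples_per_process, processes):
    """Return a starting creativeWorkNextId for each generator process.

    Assumption: the first process starts at 0 and every process generates
    the same number of triples.
    """
    # Ceiling division, so a partial 50M block still reserves a full ID range.
    blocks = -(-triples_per_process // TRIPLES_PER_BLOCK)
    step = blocks * ID_INCREMENT_PER_BLOCK
    return [i * step for i in range(processes)]


def render_overrides(triples_per_process, processes):
    """Build the per-process property lines, one text fragment per process."""
    fragments = []
    offsets = next_id_offsets(triples_per_process, processes)
    for index, next_id in enumerate(offsets):
        fragments.append(
            f"datasetSize={triples_per_process}\n"
            # Assumption: a distinct seed per process, here the process index.
            f"generatorRandomSeed={index}\n"
            f"creativeWorkNextId={next_id}\n"
        )
    return fragments


if __name__ == "__main__":
    # Hypothetical scenario: three processes, 100M triples each.
    for fragment in render_overrides(100_000_000, 3):
        print(fragment)
```

Each printed fragment would be merged by hand into the test.properties file used by the corresponding process. Rounding up is a conservative choice here: it may leave gaps between ID ranges, but it avoids overlapping IDs when the per-process size is not a multiple of 50M.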
