This repository has been archived by the owner on Apr 26, 2022. It is now read-only.

Configuration Options

venkot edited this page Dec 4, 2014 · 7 revisions

The following configuration options are stored in the file test.properties and define the behaviour of the SPB test driver.

  • ontologiesPath - path to ontologies from reference knowledge, default: ./data/ontologies
  • referenceDatasetsPath - path to data from reference knowledge, default: ./data/datasets
  • creativeWorksPath - path to generated data, default: ./data/generated
  • queriesPath - path to query templates, default: ./data/sparql
  • definitionsPath - path to definitions.properties configuration file, default: ./definitions.properties
  • endpointURL - URL of SPARQL endpoint provided by the RDF database, requires updating
  • endpointUpdateURL - URL of SPARQL endpoint used for update operations, requires updating
  • datasetSize - amount of generated data (triples), requires updating
  • generatedTriplesPerFile - number of triples per generated file. Used to split the data generation into a number of files
  • adjustRefDatasetsSizes - optional. If reference dataset files with the extension '.adjustablettl' exist, a new .ttl file is created for each of them, with its size adjusted to the amount of data to be generated (parameter 'datasetSize'). Default value is true
  • allowSizeAdjustmentsOnDataModels - allows the data generator to dynamically adjust the amount of correlations, clusterings and randomly generated models, keeping a ratio of 1/3 for each in the generated data. This property overrides the definitions.properties parameters majorEvents, minorEvents and correlationsAmount. Default value is true
  • queryTimeoutSeconds - query timeout in seconds, default value is 300 s
  • systemQueryTimeoutSeconds - timeout for system queries, default value is 1 h
  • validationPath - location of the generated and reference data used by the validation phase; the default value can be used
  • generateCreativeWorksFormat - serialization format for generated data. Available options: TriG, TriX, N-Triples, N-Quads, N3, RDF/XML, RDF/JSON, Turtle. Use the exact names. A context-aware serialization format is required, such as N-Quads, TriX or TriG
  • warmupPeriodSeconds - warmup period in seconds, requires updating
  • benchmarkRunPeriodSeconds - benchmark period in seconds, requires updating
  • aggregationAgents - number of aggregation agents that will execute a mix of aggregation queries simultaneously, requires updating
  • editorialAgents - number of editorial agents that will execute a mix of update operations simultaneously, requires updating
  • dataGeneratorWorkers - number of worker threads used by the data generator to produce data, requires updating
  • generatorRandomSeed - sets the random seed for the data generator (default value is 0). Useful, for example, when several benchmark drivers are started in separate processes to generate data; to be used together with the creativeWorkNextId parameter
  • creativeWorkNextId - sets the next ID of Creative Works. When the benchmark driver is run in separate processes to generate synthetic data, the IDs of the generated creative works must not overlap. To guarantee this, increase the value by approximately 2.6M for each 50M generated triples
  • creativeWorksInfo - name of a file containing system info about the generated dataset, e.g. interesting entities, etc. (will be saved in 'creativeWorksPath')
  • querySubstitutionParameters - number of substitution parameters that will be generated for each query, default value is 100000
  • benchmarkByQueryRuns - sets the number of aggregate queries that the benchmark phase will execute. If the value is greater than zero, the parameter 'benchmarkRunPeriodSeconds' is ignored. E.g. if set to 100, the benchmark will measure the time taken to execute 100 aggregate operations
  • benchmarkByQueryMixRuns - sets the number of query mixes that the benchmark will execute. If the value is zero, the execution of query mixes is not controlled by this parameter. Default: 0
  • scriptsPath - sets the path to scripts participating in various benchmark actions. e.g. scripts may be executed after the load process has completed.
  • minUpdateRateThresholdOps - defines the minimum rate of editorial operations per second. For the result to be valid, this rate must be reached within the initial part of the benchmark time (the time frame is set by 'minUpdateRateThresholdReachTimePercent') and maintained for the rest of the benchmark run. If set to zero, the update rate threshold is ignored. E.g. if the required update rate is set to 6.3 update operations per second, the benchmark will check that value during its run and will report invalid results if the rate drops below the threshold
  • minUpdateRateThresholdReachTimePercent - defines the time frame within which the value defined in property 'minUpdateRateThresholdOps' should be reached. Default value is 0.1 (10%). E.g. if set to 0.1, the update rate defined in 'minUpdateRateThresholdOps' should be reached during the first 10% of the benchmark run time; if it is not reached, the result is considered invalid
  • maxUpdateRateThresholdOps - defines the maximum rate of editorial operations per second. If set to zero, that threshold is ignored
  • interruptSignalLocation - defines the location of the interrupt signal (a file), which is used to interrupt the current driver's run when the signal has been set by another driver
  • enableEditorialOpeartionsValidation - enables validation of editorial operations (insert/delete) during the benchmark run. Validation is performed on every N-th operation, where N is the value of 'editorialOpsValidationInterval'. Default: true
  • editorialOpsValidationInterval - sets the validation interval for editorial operations, default : 100
  • enableCompressionOnGeneratedData - enables gzip compression on generated data, default: false
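
As an illustration of how the options above fit together, here is a minimal sketch of a test.properties fragment. It is not the file shipped with the driver. The paths, timeout and random seed are the defaults listed above. Everything marked "example" (endpoint URLs, sizes, agent and worker counts, periods) is a hypothetical value chosen for illustration and must be replaced with values that match your setup.

```properties
# Hypothetical test.properties fragment (a sketch, not the shipped file).

# Paths: defaults as documented above
ontologiesPath=./data/ontologies
referenceDatasetsPath=./data/datasets
creativeWorksPath=./data/generated
queriesPath=./data/sparql
definitionsPath=./definitions.properties

# SPARQL endpoints: example URLs only, replace with those of your RDF database
endpointURL=http://localhost:8080/sparql
endpointUpdateURL=http://localhost:8080/sparql/update

# Data generation: example values
datasetSize=50000000
generatedTriplesPerFile=100000
dataGeneratorWorkers=4
# A context-aware format, as the format option requires
generateCreativeWorksFormat=N-Quads
# Documented default seed
generatorRandomSeed=0

# Benchmark run: example values, except the documented 300 s query timeout
warmupPeriodSeconds=60
benchmarkRunPeriodSeconds=600
aggregationAgents=8
editorialAgents=2
queryTimeoutSeconds=300

# Assumed setup for a second generator process that produces another
# 50M triples: offset the IDs by roughly 2.6M per 50M triples, as
# described for creativeWorkNextId. Using a different seed per process
# is an assumption here, not something the list above states.
# generatorRandomSeed=1
# creativeWorkNextId=2600000
```

Comments are kept on their own lines because the Java properties format treats text after a value as part of that value.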