iBench is a metadata generator for creating arbitrarily large and complex mappings, schemas and schema constraints. iBench can be used with a data generator to efficiently generate realistic data integration scenarios with varying degrees of size and complexity. iBench can be used to create benchmarks for different integration tasks including (virtual) data integration, data exchange, schema evolution, mapping operators like composition and inversion, and schema matching.
The user provides a configuration file specifying which types of metadata and data to generate (e.g., schemas, data, constraints, mappings, ...) and the properties of the generated scenario. iBench produces an XML file storing the metadata generated from the user configuration. If requested, iBench also generates instance data for the generated schemas.
Detailed documentation is available in the wiki:
- Installation and Tutorial
- Supported output formats
- Built-in primitives
- User-defined primitives
- Controlling data generation
iBench is written in Java. To build the system you need ant. Run the ant build in the main directory. This builds a jar file and creates a build folder. The build folder contains a "fat" jar file, iBench.jar, and several shell scripts (.sh for Linux and Mac, .bat for Windows):
- iBench.sh: runs iBench to generate metadata and data
- loader.sh: loads schema and data generated by iBench
- configGen.sh: automatically generates iBench configuration files
The input to iBench is a configuration file (a text file with key-value pairs, i.e., a Java Properties file) that determines the structure and characteristics of the scenario to be created. Some parameters control the structure of the generated schemas, mappings, and metadata, some determine which mapping primitives the integration scenario is composed of, and others control what metadata is produced and in which format.
- The tech report mentioned below explains the available primitives and parameters
- The wiki page "Configuration File" also describes the configuration file format and parameters
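Because the configuration file is a Java Properties file, it can be inspected with the standard `java.util.Properties` class before it is passed to iBench. The following is a minimal sketch of that idea, not part of iBench itself. The class name `ConfigPreview` and the keys `example.numSourceRelations` and `example.outputFormat` are hypothetical placeholders chosen for illustration; the actual parameter names are listed in configTemplate.txt and the wiki.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;
import java.util.TreeSet;

public class ConfigPreview {

    /** Parses text written in Java Properties syntax (key = value lines). */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            // Reading from an in-memory string should not fail; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical keys for illustration only; they are NOT real iBench options.
        String sample = String.join("\n",
                "# lines starting with '#' are comments",
                "example.numSourceRelations = 4",
                "example.outputFormat = xml");

        Properties props = parse(sample);

        // Print the key-value pairs in sorted key order.
        for (String key : new TreeSet<>(props.stringPropertyNames())) {
            System.out.println(key + " -> " + props.getProperty(key));
        }
    }
}
```

To preview a real configuration, the same `Properties.load` call can be given a `FileReader` for the file instead of the in-memory string.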
To run iBench with a configuration file, change into the build folder and call the script:

    cd build
    ./iBench.sh -c CONFIG_FILE
Example Configuration Files and UDPs
You can find example configurations in the exampleConfigurations directory. In particular, configTemplate.txt in this directory is a configuration template containing all supported options, each with a brief description. The exampleScenarios folder contains example iBench scenarios that can be used as UDPs, and the exampleData folder contains an example user-defined data type.
Currently, the following example configurations are available:
- configTemplate.txt: a configuration template containing all supported options with a brief description
- customData.txt: showcases the use of a custom data type
- customPrimitive.txt: showcases the use of a UDP
- tutorialOneCopy.txt: a very simple configuration file
- independentTargetData.txt: showcases how to generate target data that is not exchanged from the source
Public configuration and UDP repository
Furthermore, we maintain a public repository with example configuration files and integration scenarios (which can be used as UDPs). Additions to this repository from the community are highly encouraged.
More detailed explanations of the configuration file format and how to use user-defined primitives (UDPs) will be added to the Wiki in the future.
- Our recent technical report gives a good overview of the project and current status
- See the iBench project page at University of Toronto for a full list of publications
- Patricia Arocena - Lead Data Architect at the University of Toronto Database Group
- Boris Glavic - Professor at the Illinois Institute of Technology DBGroup
- Renée J. Miller - Professor at the University of Toronto Database Group