iBench

iBench is a metadata generator for creating arbitrarily large and complex mappings, schemas and schema constraints. iBench can be used with a data generator to efficiently generate realistic data integration scenarios with varying degrees of size and complexity. iBench can be used to create benchmarks for different integration tasks including (virtual) data integration, data exchange, schema evolution, mapping operators like composition and inversion, and schema matching.

The user provides a configuration file specifying what types of metadata and data to generate (e.g., schemas, data, constraints, mappings, ...) and the properties of the generated scenario. iBench produces an XML file storing the metadata generated from the user configuration. If requested, iBench also generates instance data for the generated schemas.

Detailed documentation is available in the wiki

Wiki Documentation

Installation and Tutorial

Usage

Background

Setup Guide

iBench is written in Java. To build the system you need Apache Ant. Simply run

ant

in the main directory. This creates a build folder containing a "fat" jar file, iBench.jar, and several scripts (.sh for Linux and macOS, .bat for Windows):

  • iBench.sh - runs iBench to generate metadata and data
  • loader.sh - loads schema and data generated by iBench
  • configGen.sh - automatically generates iBench configuration files

Getting Started

The input to iBench is a configuration file (a text file with key-value pairs, i.e., a Java Properties file) that determines the structure and characteristics of the scenario to be created. Some parameters control the structure of the generated schemas, mappings, and metadata; some determine which mapping primitives the integration scenario is composed of; and others control what metadata is produced and in which format.

  • The tech report mentioned below explains the available primitives and parameters
  • The Wiki - Configuration File page describes the configuration file format and parameters
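
Because the configuration file uses the Java Properties format, its key-value structure can be inspected with the standard `java.util.Properties` class. The following is a minimal sketch of that format only, not of how iBench itself reads its configuration. The class name `ConfigPreview` and the keys `exampleOption` and `example.nested.option` are hypothetical placeholders, not real iBench parameters; see configTemplate.txt for the supported options.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper for illustration; not part of iBench.
class ConfigPreview {
    // Parses text in Java Properties format into key/value pairs.
    static Properties parse(String text) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(text));
        return props;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder keys only; these are NOT real iBench parameters.
        String sample = String.join("\n",
            "# lines starting with '#' are comments",
            "exampleOption = 42",
            "example.nested.option=true");
        Properties props = parse(sample);
        // Print every key/value pair found in the sample text.
        for (String key : props.stringPropertyNames()) {
            System.out.println(key + " -> " + props.getProperty(key));
        }
    }
}
```

Whitespace around the `=` separator is ignored and comment lines are skipped, so both spellings of a key-value pair shown in the sample are parsed the same way.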

Usage Example

cd build
./iBench.sh -c CONFIG_FILE

Example Configuration Files and UDPs

You can find example configurations in the exampleConfigurations directory. In particular, configTemplate.txt in this directory is a configuration template containing all supported options, each with a brief description. The exampleScenarios folder contains example iBench scenarios that can be used as user-defined primitives (UDPs), and the exampleData folder contains an example user-defined data type.

Example configurations

Currently, the following configurations are in folder exampleConfigurations:

  • configTemplate.txt - a configuration template containing all supported options with a brief description
  • customData.txt - showcases the use of a custom data type
  • customPrimitive.txt - showcases the use of a UDP
  • tutorialOneCopy.txt - a very simple configuration file
  • independentTargetData.txt - showcases how to generate target data that is not exchanged from the source

Public configuration and UDP repository

Furthermore, we maintain a public repository with example configuration files and integration scenarios (which can be used as UDPs). Additions to this repository from the community are highly encouraged.

Wiki

More detailed explanations of the configuration file format and how to use user-defined primitives (UDPs) will be added to the Wiki in the future.

Publications

  • Our recent technical report gives a good overview of the project and current status
  • See the iBench project page at University of Toronto for a full list of publications

Contact

  • Patricia Arocena - Lead Data Architect at the University of Toronto Database Group
  • Boris Glavic - Professor at the Illinois Institute of Technology DBGroup
  • Renée J. Miller - Professor at the University of Toronto Database Group