Semantic-Join

SEMA-JOIN: Joining Semantically-Related Tables Using Big Table Corpora

Overview

This is the benchmark data set used in our experiments described here.

There are 50 test cases of joinable web tables, collected from Google Tables. Each test case has two key columns taken from two separate tables. The columns are not equi-join-able, but they have semantic relationships that can be used to produce joins.

Data set description

There are two files per test case:

  • CaseN_input.txt: the test case input, containing two key columns from two separate tables, separated by an empty line.
  • CaseN_ground.txt: the manually labelled ground truth join results, with join-able keys in the same row.
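
The file layout described above can be read with a few lines of Python. This is a minimal sketch, not part of the benchmark itself: the function names `parse_input`, `parse_ground`, and `load_case` are hypothetical, the files are assumed to be UTF-8 text with one key per line, and the tab separator between keys in a ground truth row is an assumption (check a ground truth file and adjust `sep` if it differs).

```python
from pathlib import Path


def parse_input(text: str) -> tuple[list[str], list[str]]:
    """Split the body of a CaseN_input.txt file into its two key columns.

    Assumes one key per line, with an empty line between the two columns.
    """
    left: list[str] = []
    right: list[str] = []
    current = left
    for line in text.splitlines():
        if not line.strip():
            # An empty line after the first column switches to the second.
            if left:
                current = right
            continue
        current.append(line.strip())
    return left, right


def parse_ground(text: str, sep: str = "\t") -> list[tuple[str, ...]]:
    """Parse the body of a CaseN_ground.txt file into rows of joined keys.

    The separator between keys in a row is an assumption (tab by default).
    """
    rows = []
    for line in text.splitlines():
        if line.strip():
            rows.append(tuple(part.strip() for part in line.split(sep)))
    return rows


def load_case(directory: str, n: int):
    """Load both files of test case n from a directory (hypothetical helper)."""
    base = Path(directory)
    columns = parse_input((base / f"Case{n}_input.txt").read_text(encoding="utf-8"))
    ground = parse_ground((base / f"Case{n}_ground.txt").read_text(encoding="utf-8"))
    return columns, ground
```

For example, `load_case(".", 1)` would return the two key columns of `Case1_input.txt` together with the labelled rows of `Case1_ground.txt`, assuming both files sit in the current directory.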