Target Type Identification for Entity-Bearing Queries

This repository provides resources developed within the following paper:

D. Garigliotti, F. Hasibi and K. Balog. Target Type Identification for Entity-Bearing Queries. In SIGIR '17, August 2017.

These resources make it possible to reproduce the results presented in the paper.

The repository is structured as follows:

  • data/test_collection/: TSV-formatted dataset with our test collection, built from crowdsourcing annotations;
  • data/qrels/: TSV file used for evaluating the rankings. It was obtained by post-processing the test collection (details in the paper);
  • data/ml/: TSV-formatted machine learning dataset with all the pre-computed features;
  • lib/trec_eval.9.0/: the trec_eval evaluation tool (see its README);
  • output/: all the final TSV run files, containing target types ranked by baseline methods and our proposed approach.

Test collection

This TSV dataset contains the test collection built through a crowdsourcing annotation experiment (details in the paper).

A special <dbo:NONETYPE> label represents a NIL-type annotation.

The column names of this TSV file are self-explanatory.

Precomputed features for learning to rank target types

Each instance of this TSV-formatted dataset is structured as follows:

  • The first and second columns correspond to the query and type;
  • The third column is the target to predict;
  • The remaining columns correspond to the 25 features, in the same order as presented in Table 1 of the paper.
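
The column layout above can be read with a few lines of Python. This is a minimal sketch, not code shipped with the repository: the names `Instance` and `parse_instance` are hypothetical, and it assumes that the target and all feature values are numeric and that the row being parsed is not a header line.

```python
from typing import List, NamedTuple

NUM_FEATURES = 25  # number of feature columns stated in this README


class Instance(NamedTuple):
    """One (query, type) pair with its target and feature vector."""
    query: str
    type_id: str
    target: float          # assumption: the target is stored as a number
    features: List[float]  # assumption: all feature values are numeric


def parse_instance(line: str) -> Instance:
    """Split one tab-separated row into query, type, target, and features."""
    fields = line.rstrip("\n").split("\t")
    expected = 3 + NUM_FEATURES
    if len(fields) != expected:
        raise ValueError(f"expected {expected} columns, got {len(fields)}")
    query, type_id, target = fields[:3]
    return Instance(query, type_id, float(target), [float(v) for v in fields[3:]])
```

Calling `parse_instance` on each line of a file under data/ml/ yields the feature vectors in the order of Table 1 of the paper. If the file turns out to start with a header row, skip it before parsing.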

Results

Results presented in the paper can be obtained by running the TREC evaluation script with the metrics of interest. For example, when run from the sigir2017-query_types directory, the following command

$ /path/to/trec_eval -c -m ndcg_cut.1,5 data/qrels/qrels-tti-CF-filtered_by_NIL+merged.tsv output/ltr/scores-tti-ltr-rf-n_1000-m_3.tsv

evaluates our proposed learning-to-rank method using the NDCG@1 and NDCG@5 metrics.
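
To evaluate several run files in a row, the command above can be wrapped in a small Python script. This is a sketch under stated assumptions rather than part of the repository: the function names `build_command`, `parse_summary`, and `evaluate` are hypothetical, and the parser assumes that trec_eval prints one summary line per measure with three whitespace-separated columns (measure name, the literal `all`, and the value).

```python
import subprocess
from typing import Dict, List


def build_command(trec_eval: str, qrels: str, run: str,
                  cutoffs: str = "1,5") -> List[str]:
    """Build the same argument list as the shell command in this README."""
    # -c averages over all queries in the qrels file;
    # -m selects the measure to report.
    return [trec_eval, "-c", "-m", f"ndcg_cut.{cutoffs}", qrels, run]


def parse_summary(output: str) -> Dict[str, float]:
    """Collect the summary ('all') lines, assuming three columns per line."""
    scores: Dict[str, float] = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[1] == "all":
            scores[parts[0]] = float(parts[2])
    return scores


def evaluate(trec_eval: str, qrels: str, run: str) -> Dict[str, float]:
    """Run trec_eval and return its summary scores keyed by measure name."""
    result = subprocess.run(build_command(trec_eval, qrels, run),
                            capture_output=True, text=True, check=True)
    return parse_summary(result.stdout)
```

For instance, `evaluate("/path/to/trec_eval", "data/qrels/qrels-tti-CF-filtered_by_NIL+merged.tsv", "output/ltr/scores-tti-ltr-rf-n_1000-m_3.tsv")` runs the evaluation shown above and returns the reported scores as a dictionary.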

Citation

If you use the resources presented in this repository, please cite:

@inproceedings{Garigliotti:2017:TTI,
 author =     {Garigliotti, Dar\'{\i}o and Hasibi, Faegheh and Balog, Krisztian},
 title =      {Target Type Identification for Entity-Bearing Queries},
 booktitle =  {Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval},
 series =     {SIGIR '17},
 year =       {2017},
 pages =      {845--848},
 abstract =   {Identifying the target types of entity-bearing queries can help improve retrieval performance as well as the overall search experience. In this work, we address the problem of automatically detecting the target types of a query with respect to a type taxonomy. We propose a supervised learning approach with a rich variety of features. Using a purpose-built test collection, we show that our approach outperforms existing methods by a remarkable margin.},
 doi =        {10.1145/3077136.3080659},
 publisher =  {ACM},
 keywords =   {entity search, query types, query understanding, semantic search},
}

Contact

Should you have any questions, please contact Darío Garigliotti at dario.garigliotti[AT]uis.no (with [AT] replaced by @).