This repository has been archived by the owner on Dec 16, 2022. It is now read-only.
An Improvement of SMAPH-S for Entity Linking of Web Queries

taivop/eth-nlp-project

Abstract:

SMAPH-S is a precursor of SMAPH-2, a state-of-the-art system for joint entity mention detection and linking in web queries. Both systems use a piggyback approach to annotate queries: the set of candidate entities is drawn directly from Bing search results or from annotations of Bing snippets, so performance depends heavily on the accuracy of Bing itself. Our system improves on SMAPH-S by systematically detecting queries that produce uninformative Bing results and rewriting them to extract better candidate entities. To this end, we split query strings into smaller chunks based on their linking probability. We also improve the way mention candidates are generated so that the system can handle noisy input, which is very common in web queries. Finally, we report the results of experimenting with different regressors in the pruning phase, such as Probabilistic Logistic Regression and AdaBoost.
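As a rough illustration of splitting a query into chunks by linking probability, the sketch below greedily takes the longest n-gram whose link probability clears a threshold. This is not the project's actual algorithm: the function name `split_query`, the `threshold` and `max_len` parameters, and the toy probability table are all assumptions made for the example.

```python
from typing import Dict, List


def split_query(
    query: str,
    link_prob: Dict[str, float],
    threshold: float = 0.1,
    max_len: int = 4,
) -> List[str]:
    """Greedily split a query into chunks, preferring long linkable n-grams.

    `link_prob` maps a lowercased n-gram to an (assumed) probability that
    the n-gram is used as a link anchor. All values here are hypothetical.
    """
    tokens = query.lower().split()
    chunks: List[str] = []
    i = 0
    while i < len(tokens):
        # Try the longest n-gram first and shrink until one qualifies.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            # A single token is always accepted so the loop makes progress.
            if n == 1 or link_prob.get(candidate, 0.0) >= threshold:
                chunks.append(candidate)
                i += n
                break
    return chunks


# Toy, made-up link probabilities for demonstration only.
toy_probs = {"moon landing": 0.7, "armstrong": 0.3}
chunks = split_query("armstrong moon landing", toy_probs)
```

Each resulting chunk could then be sent to the search engine on its own when the full query returns uninformative results. A greedy split is the simplest choice; it can miss a better segmentation that a dynamic-programming search would find.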

The piggyback paper contains additional details.

This project is based on marcocor's query annotator stub and is built with Maven.

Dependencies

  • Python with scikit-learn and Flask.
    • The pruner is written in Python using scikit-learn and relies on Flask to expose an API that is started and called from the Java pipeline.
  • Scala
    • We use Scala to generate the dataset for training the pruner.
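The pruner setup described above can be sketched as a small Flask service wrapping a scikit-learn model. This is only a minimal sketch under stated assumptions, not the project's actual code: the `/predict` route, the JSON layout, the port, and the helper names `create_app` and `train_toy_model` are all hypothetical, and the training data is a made-up toy set.

```python
from flask import Flask, jsonify, request


def create_app(model):
    """Build a Flask app that scores candidate annotations with `model`.

    `model` is any object with a scikit-learn style `predict_proba` method.
    """
    app = Flask(__name__)

    @app.route("/predict", methods=["POST"])  # hypothetical endpoint name
    def predict():
        payload = request.get_json(force=True)
        features = payload["features"]  # list of feature vectors
        # Column 1 of predict_proba is the probability of the positive class,
        # read here as "keep this annotation".
        scores = [float(row[1]) for row in model.predict_proba(features)]
        return jsonify({"scores": scores})

    return app


def train_toy_model():
    """Fit a logistic regression on made-up data, for illustration only."""
    from sklearn.linear_model import LogisticRegression

    X = [[0.0, 0.1], [0.2, 0.0], [0.9, 0.8], [1.0, 0.7]]
    y = [0, 0, 1, 1]
    return LogisticRegression().fit(X, y)


# To serve the toy model locally (the port is an arbitrary choice):
# create_app(train_toy_model()).run(port=5000)
```

A Java client would then POST a JSON body such as `{"features": [[0.9, 0.8]]}` and read back the `scores` array. Passing the model into `create_app` makes it easy to swap in another scikit-learn estimator that provides `predict_proba`, for example `AdaBoostClassifier`.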

Running

Included classes and POM

POM

The file pom.xml defines a Maven project. It includes two dependencies: bat-framework and bing-api-java. The BAT-framework is needed to benchmark your annotation system, and the Bing Java API is needed to access the Bing API (in case your project is built on top of Bing).

Important classes
