Task Overview

Readers of digital news are often overwhelmed by the deluge of online information. One way to address this problem is to outline a news story by highlighting its most relevant facts; for example, recent studies summarize news articles by generating representative headlines. In this repository, we focus on news augmentation and argue that news understanding can also be enhanced by surfacing contextual data relevant to the article, such as structured web tables. Specifically, our goal is to match news articles with web tables. To that end, we introduce a novel BERT-based attention model that computes the matching degree between an article and a table. Through an extensive experimental evaluation over Wikipedia tables, we compare our model with standard IR techniques, document/sentence encoders, and neural IR models on this task. Lastly, we present the first news-table corpus in the literature. By crawling Wikipedia pages, we collected 275,352 news articles and 298,792 web tables. In addition, our ground truth contains 93,818 matching pairs created by distant supervision strategies.
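To make the task formulation concrete, the sketch below ranks candidate tables for one article with a plain TF-IDF cosine score, in the spirit of the standard IR techniques mentioned above. It is an illustration only, not the BERT-based model or any baseline shipped in this repository. The function names (`tokenize`, `tfidf_vectors`, `cosine`, `rank_tables`), the idea of flattening each table into a single string, and the sample texts are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep alphanumeric runs only (hypothetical preprocessing).
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    # Build sparse TF-IDF vectors (dicts) with smoothed IDF over the given docs.
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    return [{t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized]


def cosine(a, b):
    # Cosine similarity between two sparse vectors; 0.0 if either is empty.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_tables(article, tables):
    # Score every table against the article and sort by descending score.
    # Each table is assumed to be flattened to text (caption, headers, cells).
    vectors = tfidf_vectors([article] + tables)
    query, table_vectors = vectors[0], vectors[1:]
    scores = [(i, cosine(query, v)) for i, v in enumerate(table_vectors)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Made-up sample data, for illustration only.
article = "Brazil wins the football world cup final against Germany"
tables = [
    "FIFA world cup winners year country Brazil Germany Italy",
    "List of tallest buildings city height floors",
]
ranked = rank_tables(article, tables)
```

Here `ranked` is a list of `(table_index, score)` pairs with the best match first. A learned model would replace `cosine` with its own matching degree, while the ranking step stays the same.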

Dataset

Link to dataset: https://drive.google.com/drive/folders/1UBjHqpTvem9bxHDKnWvMAoBuNQ_N7Ad7?usp=sharing

levysouza/News-Table-Matching
