
CLEF2020-CheckThat! Task 2: Verified Claim Retrieval

This repository contains the data set, format checker, scorer and baselines for the CLEF2020-CheckThat! task 2.
Given an input claim and a set of already verified claims, the task consists of ranking the verified claims so that those that verify the input claim, or a sub-claim in it, are ranked on top.
The goal of the task is to build a tool that supports journalists and fact-checkers in determining whether a claim has already been fact-checked. This task is part of the CLEF2020-CheckThat! lab. For more information about deadlines, updates and other related tasks, visit the site of the lab.

FCPD corpus for the CLEF-2020 LAB on "Automatic Identification and Verification of Claims"
Jun 8th, 2020 (Test Data Released)

This file contains the basic information regarding the CLEF2020-CheckThat! Task 2 data set provided for the CLEF2020-CheckThat! Lab on "Automatic Identification and Verification of Claims".

The current version of the data is the release of the test data.

All changes and updates on these data sets and tools are reported in Section 1 of this document.


Evaluation Results

You can find the results in this spreadsheet, https://tinyurl.com/y9sjooxo.

List of Versions

  • v1.0 [2020/03/20] - Version 1 of the training data: 626 Tweets and 518 already verified claims.
  • v2.0 [2020/03/29] - Version 2 of the training data: 1,003 Tweets and 784 already verified claims.
  • v3.0 [2020/05/11] - Version 3 of the training data: 1,003 Tweets and 10,373 already verified claims. Fixed some labels, in addition to extending the dataset.
  • Test [2020/05/26] - Release of test input: 200 Tweets to be matched against the 10,373 already verified claims released with version v.3.0 of the data.
  • v4.0 [2020/06/08] - Release of the gold labels for the test tweets.

Contents of the Repository

We provide the following files:

Data Format

The format used in the task is inspired by the Text REtrieval Conference (TREC) campaigns for information retrieval (a description of the TREC format can be found here).

The data sets are tab-separated (TSV) files. The text encoding for all files is UTF-8.

The data is separated into train and dev splits. They may be used as-is, or they can be combined and used with cross-validation. It is entirely up to the participants how the given train and dev data are managed.

Already Verified Claims

All the verified claims used for both training and test are found in the file data/verified_claims.docs.tsv.

The file has the following format:

vclaim_id vclaim title

where

  • vclaim_id: unique ID of the verified claim
  • vclaim: text of the verified claim
  • title: title of the document fact checking the verified claim

Example:

vclaim_id vclaim title
2 "A ""law to separate families"" was enacted prior to April 2018, and the federal government is powerless not to enforce it." Was the ‘Law to Separate Families’ Passed in 1997 or ‘by Democrats’?
222 Former U.S. Vice President Joe Biden owns the largest mansion in his state. Does Joe Biden Own the Largest Mansion in His State?
503 "U.S. Sen. Bernie Sanders compared Baltimore to a ""third world country.""" Did U.S. Sen. Bernie Sanders Say Baltimore Was Like a ‘Third World Country’?
...

Note: Not all verified claims in the file have a corresponding tweet.
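
For illustration, here is a minimal Python sketch for loading this file, assuming a header row as in the example above and the file path used by the baseline command further below; adjust both if your local copy differs.

import csv

# Minimal sketch: load the verified claims into a dict keyed by vclaim_id.
# The path and the (vclaim_id, vclaim, title) column order follow this README.
def load_vclaims(path="data/verified_claims.docs.tsv"):
    vclaims = {}
    with open(path, encoding="utf-8") as f:
        reader = csv.reader(f, delimiter="\t")
        next(reader)  # skip the header row, if present
        for vclaim_id, vclaim, title in reader:
            vclaims[vclaim_id] = {"vclaim": vclaim, "title": title}
    return vclaims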

Queries file

A tab-separated file with the input tweets. Each row of the file has the following format:

tweet_id tweet_content

where:

  • tweet_id: unique ID for a given tweet
  • tweet_content: text of the tweet

Example:

tweet_id tweet_content
8 im screaming. google featured a hoax article that claims Minecraft is being shut down in 2020 pic.twitter.com/ECRqyfc8mI — Makena Kelly (@kellymakena) January 2, 2020
335 BREAKING: Footage in Honduras giving cash 2 women & children 2 join the caravan & storm the US border @ election time. Soros? US-backed NGOs? Time to investigate the source! pic.twitter.com/5pEByiGkkN — Rep. Matt Gaetz (@RepMattGaetz) October 17, 2018
622 y’all really joked around so much that tide put their tide pods in plastic boxes…smh pic.twitter.com/Z44efALcX5 — ㅤnavid (@NavidHasan_) January 13, 2018
...

Note: tweet_id does not correspond to the tweet's ID on the Twitter platform.
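
A similar sketch can read the queries file into a dict mapping tweet_id to tweet text (again assuming a header row as in the example above; the path is the dev queries file used in the baseline command below).

import csv

# Minimal sketch: load the tweets file into {tweet_id: tweet_content}.
def load_tweets(path="data/dev/tweets.queries.tsv"):
    tweets = {}
    with open(path, encoding="utf-8") as f:
        reader = csv.reader(f, delimiter="\t")
        next(reader)  # skip the header row, if present
        for row in reader:
            tweets[row[0]] = row[1]
    return tweets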

Qrels file

A tab-separated file containing all pairs of a tweet and a verified claim such that the verified claim (vclaim_id) proves the tweet (tweet_id). Each row has the following format:

tweet_id 0 vclaim_id relevance

where:

  • tweet_id: unique ID for a given tweet. Tweet details found in the queries file.
  • 0: literally 0 (this column is needed to comply with the TREC format).
  • vclaim_id: unique ID for a given verified claim. Details on the verified claim are in the file data/verified_claims.docs.tsv
  • relevance: 1 if the verified claim whose id is vclaim_id proves the tweet with id tweet_id; 0 otherwise.

Note: In the qrels file only pairs with relevance = 1 are reported. Relevance = 0 is assumed for all pairs not appearing in the qrels file.

Example:

tweet_id 0 vclaim_id relevance
422 0 92 1
538 0 454 1
221 0 12 1
137 0 504 1
...
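
For illustration, a minimal sketch that reads a qrels file in this format into a mapping from each tweet_id to its set of relevant vclaim_ids; keeping only relevance = 1 rows also makes it harmless if a header row is present (the path is the dev qrels file used with the scorer below).

import csv
from collections import defaultdict

# Minimal sketch: read a qrels file into {tweet_id: set of relevant vclaim_ids}.
# Only relevance = 1 pairs appear in the file; anything missing is non-relevant.
def load_qrels(path="data/dev/tweet-vclaim-pairs.qrels"):
    relevant = defaultdict(set)
    with open(path, encoding="utf-8") as f:
        for row in csv.reader(f, delimiter="\t"):
            tweet_id, _zero, vclaim_id, relevance = row
            if relevance == "1":
                relevant[tweet_id].add(vclaim_id)
    return relevant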

Results File

Each row of the results file corresponds to a pair of a tweet and a verified claim and indicates how highly your model ranks that verified claim for the input tweet. Each row has the following format:

tweet_id Q0 vclaim_id rank score tag

where

  • tweet_id: the ID of the tweet, as given in the queries file
  • Q0: not a meaningful column (it is needed to comply with the TREC format)
  • vclaim_id: the ID of the verified claim, as found in the verified claims file (data/verified_claims.docs.tsv)
  • rank: the rank of the pair based on the scores of all possible pairs for a given tweet_id (not taken into account when calculating metrics; always equal to 1)
  • score: the score given by your model for the pair tweet_id and vclaim_id
  • tag: a string identifier of the team

For example:

tweet_id Q0 vclaim_id rank score tag
359 Q0 303 1 1.1086285 elastic
476 Q0 292 1 4.680018 elastic
35 Q0 373 1 5.631936 elastic
474 Q0 352 1 0.8830346 elastic
174 Q0 408 1 0.98045605 elastic
...

Your results file MUST contain each (tweet_id, vclaim_id) pair at most once. You can skip pairs if you deem them not relevant.
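
For illustration, a minimal sketch that writes such a run file from a hypothetical per-tweet scores structure produced by your model; the tag value and the shape of scores are placeholders, not something defined by the task.

# Minimal sketch: write predictions in the six-column run format described above.
# scores maps each tweet_id to a list of (vclaim_id, score) pairs from your model.
def write_run(scores, path, tag="my-team"):
    with open(path, "w", encoding="utf-8") as f:
        for tweet_id, pairs in scores.items():
            ranked = sorted(pairs, key=lambda p: p[1], reverse=True)
            for rank, (vclaim_id, score) in enumerate(ranked, start=1):
                # The rank column is ignored by the scorer; only the score matters.
                f.write(f"{tweet_id}\tQ0\t{vclaim_id}\t{rank}\t{score}\t{tag}\n")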

Example Ranking

The following is an example ranking of verified claims for a given tweet.

Let's take a random tweet from the data set:

tweet_id: 251
tweet_content: A big scandal at @ABC News. They got caught using really gruesome FAKE footage of the Turks bombing in Syria. A real disgrace. Tomorrow they will ask softball questions to Sleepy Joe Biden’s son, Hunter, like why did Ukraine & China pay you millions when you knew nothing? Payoff? — Donald J. Trump (@realDonaldTrump) October 15, 2019

Using the content of the tweet, your model should give the highest score to the verified claim that matches the tweet in the qrels file. In this case the verified claim is:

vclaim_id: 115
vclaim: ABC News mistakenly aired a video from a Kentucky gun range during its coverage of Turkey's attack on northern Syria in October 2019.

Example of the top 5 ranked verified claims from the baseline model in this repository:

vclaim score
ABC News mistakenly aired a video from a Kentucky gun range during its coverage of Turkey's attack on northern Syria in October 2019. 21.218384
In a speech to U.S. military personnel, President Trump said if soldiers were real patriots, they wouldn't take a pay raise. 19.962847
Former President Barack Obama tweeted: "Ask Ukraine if they found my birth certificate." 19.414398
Mark Twain said, "Do not fear the enemy, for your enemy can only take your life. It is far better that you fear the media, for they will steal your HONOR." 16.810490
Dolly Parton wrote "Jolene" and "I Will Always Love You" in one day. 16.005116

Format checkers

The format checker verifies that the generated results file from your model complies with the expected format. To launch it run:

python3 lib/format_checker.py --model-prediction <path_to_your_results_file>

Note: The checker can't verify whether the prediction file you submit contains all lines/claims, because it does not have access to the corresponding gold file.

Note: The Python files in this repo require Python 3.6 or later.

Evaluation metrics and Scorers

The official metric for the task is Mean Average Precision (MAP), more specifically MAP@5. The scorer also reports R-Precision, Average Precision, Reciprocal Rank, Precision@k, and the means of these over all verified claims.

You can use these repositories as a reference for the evaluation: https://github.com/joaopalotti/trectools and https://github.com/usnistgov/trec_eval.
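
To make the official metric concrete, the following is a small illustrative sketch of MAP@5 computed from a qrels mapping and per-tweet rankings. It is not the official scorer (use evaluate.py below), and its normalization of average precision may differ from the scorer's in edge cases.

# Illustrative sketch of MAP@5.
# qrels: {tweet_id: set of relevant vclaim_ids}
# runs:  {tweet_id: list of vclaim_ids sorted by descending model score}
def average_precision_at_k(relevant, ranked, k=5):
    hits, precision_sum = 0, 0.0
    for i, vclaim_id in enumerate(ranked[:k], start=1):
        if vclaim_id in relevant:
            hits += 1
            precision_sum += hits / i
    # Normalize by the number of relevant claims, capped at k.
    return precision_sum / min(len(relevant), k) if relevant else 0.0

def map_at_k(qrels, runs, k=5):
    return sum(average_precision_at_k(rel, runs.get(tid, []), k)
               for tid, rel in qrels.items()) / len(qrels)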

Before using the scorers or running the baseline, make sure you have all python packages in requirements.txt installed.

If you have pipenv installed, one way to do it is by using the following command:

pipenv install -r requirements.txt --skip-lock
pipenv shell

The script evaluate.py evaluates a submission. Example:

python3 evaluate.py -s <results-file> -g data/dev/tweet-vclaim-pairs.qrels

The results file contains the predictions of the model.

Note: The metric reciprocal_rank in the output of the evaluation script corresponds to Mean reciprocal rank.

Baseline

To use the Elasticsearch baseline, you need to have a locally running Elasticsearch instance. You can follow this article for Elasticsearch installation. You can then run Elasticsearch using the following command:

/path/to/elasticsearch

Alternatively, if you have Docker installed, you can run Elasticsearch using this command:

docker run -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch:7.6.1

Once you have Elasticsearch running, you can run the baseline script using the following:

python3 elastic_search_baseline.py --vclaims data/verified_claims.docs.tsv --tweets data/dev/tweets.queries.tsv --predict-file <results-file>
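
The core retrieval idea behind such a baseline can be sketched with the Elasticsearch Python client as below. The index name, field names, and query here are illustrative assumptions and need not match what elastic_search_baseline.py actually does.

from elasticsearch import Elasticsearch

# Minimal sketch: index the verified claims, then rank them for each tweet
# with a BM25 "match" query against the vclaim text.
es = Elasticsearch()  # assumes a local instance on the default port 9200

def index_vclaims(vclaims):
    # vclaims: {vclaim_id: {"vclaim": ..., "title": ...}}, as loaded earlier
    for vclaim_id, doc in vclaims.items():
        es.index(index="vclaims", id=vclaim_id, body=doc)

def rank_vclaims(tweet_text, size=5):
    response = es.search(
        index="vclaims",
        body={"query": {"match": {"vclaim": tweet_text}}},
        size=size,
    )
    return [(hit["_id"], hit["_score"]) for hit in response["hits"]["hits"]]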

Licensing

These data sets are free for general research use.

Credits

Task Organizers:

  • Nikolay Babulkov, Sofia University

  • Shaden Shaar, Qatar Computing Research Institute, HBKU

  • Giovanni Da San Martino, Qatar Computing Research Institute, HBKU

  • Preslav Nakov, Qatar Computing Research Institute, HBKU

Task website: https://sites.google.com/view/clef2020-checkthat/tasks/task-2-claim-retrieval

Contact: clef-factcheck@googlegroups.com

Citation

You can find the overview papers on the CLEF2020-CheckThat! Lab, "Overview of CheckThat! 2020 --- Automatic Identification and Verification of Claims in Social Media" (see citation below) at this link, and "CheckThat! at CLEF 2020: Enabling the Automatic Identification and Verification of Claims in Social Media" (see citation below) at this link.

You can find the CLEF2020-CheckThat! Task 2 details published in the paper "Overview of the CLEF-2020 CheckThat! Lab on Automatic Identification and Verification of Claims in Social Media: English tasks" (see citation below).

Further work on the task using the dataset released in the CLEF2020-CheckThat! Task 2 was published in "That is a Known Lie: Detecting Previously Fact-Checked Claims". You can find the paper in this link, https://arxiv.org/pdf/2005.06058.pdf.

@InProceedings{clef-checkthat:2020,
 author = "Barr\'{o}n-Cede{\~n}o, Alberto and
    Elsayed, Tamer and
    Nakov, Preslav and
    {Da San Martino}, Giovanni and
    Hasanain, Maram and   
    Suwaileh, Reem and
    Haouari, Fatima and
    Babulkov, Nikolay and
    Hamdan, Bayan and
    Nikolov, Alex and   
    Shaar, Shaden and
    Ali, {Zien Sheikh}",
 title  = "{Overview of CheckThat! 2020} --- Automatic Identification and
Verification of Claims in Social Media",
 year = {2020},
 booktitle = "Proceedings of the 11th International Conference of the CLEF Association: Experimental IR Meets Multilinguality, Multimodality, and Interaction",
 series = {CLEF~'2020},
 address = {Thessaloniki, Greece},
 nopages="--",
}

@InProceedings{clef-checkthat-en:2020,
 author = "Shaar, Shaden and
    Nikolov, Alex and
    Babulkov, Nikolay and
    Alam, Firoj and  
    Barr\'{o}n-Cede{\~n}o, Alberto and
    Elsayed, Tamer and
    Hasanain, Maram and    
    Suwaileh, Reem and
    Haouari, Fatima and
    {Da San Martino}, Giovanni and
    Nakov, Preslav",
 title = "Overview of {CheckThat!} 2020 {E}nglish: Automatic Identification and Verification of Claims in Social Media",
  booktitle = "Working Notes of CLEF 2020---Conference and Labs of the Evaluation Forum",
  series = {CLEF~'2020},
  address = {Thessaloniki, Greece},
  year = {2020}
}

@InProceedings{CheckThat:ECIR2020,
  author    = {Alberto Barr{\'{o}}n{-}Cede{\~{n}}o and
               Tamer Elsayed and
               Preslav Nakov and
               Giovanni Da San Martino and
               Maram Hasanain and
               Reem Suwaileh and
               Fatima Haouari},
  title     = {CheckThat! at {CLEF} 2020: Enabling the Automatic Identification and Verification of Claims in Social Media},
    booktitle = {Proceedings of the 42nd European Conference on Information Retrieval},
    series = {ECIR~'20},
    pages = {499--507},
    address   = {Lisbon, Portugal},
    month     = {April},
    year      = {2020},
}

@inproceedings{shaar-etal-2020-known,
    title = "That is a Known Lie: Detecting Previously Fact-Checked Claims",
    author = "Shaar, Shaden  and
      Babulkov, Nikolay  and
      Da San Martino, Giovanni  and
      Nakov, Preslav",
    booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
    series = {ACL~'20},
    month = jul,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.acl-main.332",
    pages = "3607--3618",
}    
