
CLEF2020-CheckThat! Task 5: Check-worthiness for Political Debates

This repository contains the dataset for the CLEF2020-CheckThat! Task 5 on check-worthiness estimation for political debates. It also contains the format checker, scorer, and baselines for the task.

FCPD corpus for the CLEF-2020 Lab on "Automatic Identification and Verification of Claims"
Version 1.0: March 16, 2020 (Data and Baseline Release)

The task is part of the CLEF2020-CheckThat! Lab on "Automatic Identification and Verification of Claims". The current version includes the training dataset, the evaluation scripts, and the baselines. The test set will be provided in future versions.

Table of contents:

  • Evaluation Results
  • List of Versions
  • Contents of the Repository
  • Task Definition
  • Data Format
  • Results File Format
  • Format checker
  • Scorer
  • Evaluation metrics
  • Baselines
  • Licensing
  • Citation
  • Previous Editions
  • Credits

Evaluation Results

TBA

List of Versions

  • v1.0 [2020/03/16]: initial data and baseline release. The training data for Task 5 contains 50 fact-checked documents: debates, speeches, press conferences, etc.

Contents of the Repository

We provide the training data, together with the format checker, the scorer, and the baselines described in the sections below.

Task Definition

The "Check-worthines for debates" task is defined as "predicting which claim in a political debate should be prioritized for fact-checking". In particular, given a debate, speech or a press conference the goal is to produce a ranked list of its sentences based on their worthiness for fact checking.

NOTE: You can use data from the CLEF-2018 and the CLEF-2019 editions of this task

Data Format

The input files are tab-separated values (TSV) files with four fields:

line_number speaker text label

Where:

  • line_number: the line number (starting from 1)
  • speaker: the person speaking (a candidate, the moderator, or "SYSTEM"; the latter is used for the audience reaction)
  • text: a sentence that the speaker said
  • label: 1 if this sentence is to be fact-checked, and 0 otherwise

The text encoding is UTF-8.

Example:

...
65 TRUMP So we're losing our good jobs, so many of them. 0
66 TRUMP When you look at what's happening in Mexico, a friend of mine who builds plants said it's the eighth wonder of the world. 0
67 TRUMP They're building some of the biggest plants anywhere in the world, some of the most sophisticated, some of the best plants. 0
68 TRUMP With the United States, as he said, not so much. 0
69 TRUMP So Ford is leaving. 1
70 TRUMP You see that, their small car division leaving. 1
71 TRUMP Thousands of jobs leaving Michigan, leaving Ohio. 1
72 TRUMP They're all leaving. 0
...
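
For reference, here is a minimal sketch of how such a file can be loaded in Python (the file path is hypothetical; adjust the header handling to match your copy of the data):

import csv

# Field names as described in the 'Data Format' section above.
FIELDS = ["line_number", "speaker", "text", "label"]

# "debate.tsv" is a hypothetical path; point it at one of the training files.
with open("debate.tsv", encoding="utf-8") as f:
    reader = csv.DictReader(f, fieldnames=FIELDS, delimiter="\t")
    sentences = list(reader)

# Example: drop audience reactions, which are attributed to "SYSTEM".
spoken = [s for s in sentences if s["speaker"] != "SYSTEM"]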

Results File Format

For this task, the expected results file is a list of claims with their estimated check-worthiness scores. Each row contains two tab-separated fields:

line_number score

Where line_number is the number of the claim in the debate and score is a number indicating the priority of the claim for fact-checking (a higher score means a higher priority). For example:

1 0.9056
2 0.6862
3 0.7665
4 0.9046
5 0.2598
6 0.6357
7 0.9049
8 0.8721
9 0.5729
10 0.1693
11 0.4115
...

Your result file MUST contain scores for all lines of the input file. Otherwise the scorer will return an error and no score will be computed.
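
As an illustration, a complete predictions file can be written with a few lines of Python (a minimal sketch; the helper name is hypothetical):

# Hypothetical helper: writes one "line_number<TAB>score" row per input
# sentence, covering every line of the input as the scorer requires.
def write_predictions(scores, out_path):
    """scores: one float per input sentence, in the original order."""
    with open(out_path, "w", encoding="utf-8") as f:
        for line_number, score in enumerate(scores, start=1):
            f.write(f"{line_number}\t{score:.4f}\n")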

Format checker

The checker for the task is located in the format_checker module of the project. The format checker verifies that your generated results file complies with the expected format. To launch it, run:

python3 format_checker/main.py --pred_file_path=<path_to_your_results_file>

run_format_checker.sh includes examples of the output of the checker when dealing with an ill-formed results file; its output can be seen in run_format_checker_out.txt. Note that the checker cannot verify whether the prediction file you submit contains all lines/claims, because it does not have access to the corresponding gold file.

The script used is adapted from the one for the CLEF2019-CheckThat! Lab Task 1 (check-worthiness).

Scorer

Launch the scorer for the task as follows:

python3 scorer/main.py --gold_file_path="<path_to_gold_file_1>,...,<path_to_gold_file_k>" --pred_file_path="<path_to_predictions_file_1>,...,<path_to_predictions_file_k>"

Both --gold_file_path and --pred_file_path take a single string that contains a comma-separated list of file paths. The lists may be of arbitrary positive length (so even a single file path is fine), but their lengths must match.

<path_to_gold_file_n> is the path to the file containing the gold annotations for debate n and <path_to_predictions_file_n> is the path to the corresponding file with the participants' predictions for debate n, which must follow the format described in the 'Results File Format' section.
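
For example, scoring predictions for two debates at once might look as follows (the file names are hypothetical):

python3 scorer/main.py --gold_file_path="gold/debate1.tsv,gold/debate2.tsv" --pred_file_path="predictions/debate1.tsv,predictions/debate2.tsv"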

The scorer invokes the format checker for the task to verify that the output is properly formatted. It also checks that each predictions file contains all lines/claims from the corresponding gold file.

run_scorer.sh provides examples of using the scorer; the results can be viewed in the run_scorer_out.txt file.

The script used is adapted from the one for the CLEF2019-CheckThat! Lab Task 1 (check-worthiness).

Evaluation metrics

The official evaluation measure is Mean Average Precision (MAP). We also report R-Precision, Average Precision, Reciprocal Rank, and Precision@k, all averaged over multiple debates.
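
To make the ranking measure concrete, the following is a minimal sketch of Average Precision for one debate and MAP over several, using the standard definitions rather than the scorer's exact code:

def average_precision(ranked_labels):
    """ranked_labels: gold 0/1 labels sorted by descending predicted score."""
    hits, precisions = 0, []
    for k, label in enumerate(ranked_labels, start=1):
        if label == 1:
            hits += 1
            precisions.append(hits / k)  # precision at each relevant rank
    # Since every sentence is ranked, all check-worthy sentences appear,
    # so this averages over the total number of positives.
    return sum(precisions) / len(precisions) if precisions else 0.0

def mean_average_precision(debates):
    """debates: one ranked label list per debate; MAP is the mean AP."""
    return sum(average_precision(d) for d in debates) / len(debates)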

Baselines

The baselines module contains a random baseline and a simple n-gram baseline for the task. To launch the baseline script, first install the package dependencies found in requirement.txt:

pip3 install -r requirement.txt

To launch the baseline script run the following:

python3 baselines/baselines.py

Both baselines are trained on all but the most recent 20% of the debates, which are held out as the development set. The performance of both baselines on the development set is then displayed:
Random Baseline AVGP: 0.02098366142405398
Ngram Baseline AVGP: 0.09456735615609717

The scripts used are adapted from the ones for the CLEF2019-CheckThat! Lab Task 1 (check-worthiness).
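
For intuition, the core idea behind an n-gram baseline can be sketched as follows (a hedged illustration using scikit-learn, not the repository's exact implementation):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

def train_ngram_baseline(train_texts, train_labels):
    # Bag-of-n-grams features over the sentence text.
    vectorizer = TfidfVectorizer(ngram_range=(1, 2))
    features = vectorizer.fit_transform(train_texts)
    classifier = LogisticRegression(max_iter=1000).fit(features, train_labels)
    return vectorizer, classifier

def score_sentences(vectorizer, classifier, texts):
    # The positive-class probability serves as the check-worthiness
    # score by which the sentences are ranked.
    return classifier.predict_proba(vectorizer.transform(texts))[:, 1]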

Licensing

These datasets are free for general research use.

Citation

Previous Editions

For information about the previous editions of the shared task, refer to CLEF2019-CheckThat! and CLEF2018-CheckThat!.

Credits

Task 5 Organizers:

  • Shaden Shaar, Qatar Computing Research Institute, HBKU

  • Giovanni Da San Martino, Qatar Computing Research Institute, HBKU

  • Preslav Nakov, Qatar Computing Research Institute, HBKU

Task website: https://sites.google.com/view/clef2020-checkthat/tasks/tasks-1-5-check-worthiness?authuser=0

Contact: clef-factcheck@googlegroups.com
