Refute or Support: AI-Powered Fact Checking for Scientific Citations
Studies suggest that 10-20% of scientific citations may be inaccurate, meaning the cited source does not actually support the claim being made. Such errors can undermine research credibility, propagate misinformation, and waste researchers' time tracing citation origins. By automatically detecting inaccurate citations, this project aims to enhance the integrity and reliability of academic publications.
This project has several components. At its core is a claim verification system that takes a claim statement and evidence sentences and assigns a label denoting their relationship: Refute, Support, or Not Enough Information (NEI). The claim verification system is based on the MultiVerS model, which uses the Longformer transformer to handle long input sequences. It is trained on diverse datasets, including FEVER, SciFact, and Citation-Integrity, for applicability to scientific texts.
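As a rough illustration of the verification step, the sketch below pairs a claim with its evidence sentences and runs a Longformer sequence classifier with three output labels. The checkpoint name, label order, and `verify` helper are assumptions for illustration; in practice the system would load fine-tuned MultiVerS weights rather than the untrained classification head shown here.

```python
from transformers import LongformerTokenizer, LongformerForSequenceClassification
import torch

# Assumed base checkpoint; a fine-tuned MultiVerS model would be loaded in practice
MODEL_NAME = "allenai/longformer-base-4096"

tokenizer = LongformerTokenizer.from_pretrained(MODEL_NAME)
# num_labels=3 for Refute / Support / NEI; this head is untrained and for illustration only
model = LongformerForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=3)
model.eval()

LABELS = ["REFUTE", "SUPPORT", "NEI"]  # hypothetical label order

def verify(claim, evidence_sentences):
    """Concatenate the claim with its evidence sentences and predict a relationship label."""
    text = claim + tokenizer.sep_token + " ".join(evidence_sentences)
    inputs = tokenizer(text, truncation=True, max_length=4096, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return LABELS[int(logits.argmax(dim=-1))]

print(verify(
    "Vitamin D supplementation reduces the risk of fractures.",
    ["The trial found no significant reduction in fracture risk with vitamin D."],
))
```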
Additional components, currently in development, include an atomic claim generator that uses an LLM to identify the individual claims in a complex input text; such a system was recently demonstrated as part of the Search-Augmented Factuality Evaluator (SAFE). An evidence retrieval module, such as BM25 (sketched below), will be used to extract evidence sentences from PDFs stored in a database or provided by the user.
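A minimal sketch of the retrieval step is shown below, assuming candidate sentences have already been extracted from a PDF and split. It uses the rank_bm25 package to rank sentences against a claim; the example corpus, whitespace tokenization, and package choice are assumptions, not the project's final design.

```python
from rank_bm25 import BM25Okapi

# Candidate evidence sentences, e.g. extracted from a cited PDF (illustrative data)
corpus = [
    "Vitamin D supplementation did not reduce fracture risk in the trial.",
    "Participants were followed for a median of 5.3 years.",
    "The study enrolled 25,871 adults across the United States.",
]
tokenized_corpus = [sentence.lower().split() for sentence in corpus]
bm25 = BM25Okapi(tokenized_corpus)

claim = "Vitamin D supplementation reduces the risk of fractures."
# Return the top-ranked sentences as candidate evidence for the claim
top_evidence = bm25.get_top_n(claim.lower().split(), corpus, n=2)
print(top_evidence)
```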
Training data comes from three datasets:
- SciFact: 1,409 scientific claims verified against 5,183 abstracts
- Citation-Integrity: 3,063 citation instances from biomedical publications
- FEVER: 185,445 claims from Wikipedia (pre-training)
Project deliverables:
- Web application for citation accuracy verification
- Open-source machine learning models
- Detailed research and system documentation
See the doc directory for write-ups and Jupyter notebooks:
- Motivation and project proposal
- Reproduction of the Citation-Integrity model
- Data wrangling: Citation-Integrity and SciFact
- Data exploration: Citation-Integrity and SciFact
- Model baselines and starting checkpoints