📝📊 CVPR 2024 Accepted Papers Dataset

This code creates a FiftyOne dataset containing the accepted papers for the 2024 Conference on Computer Vision and Pattern Recognition (CVPR).

CVPR 2024 received 11,532 valid paper submissions, of which 2,719 were accepted.

This results in an overall acceptance rate of about 23.6%. However, the dataset currently includes only 2,379 papers: the subset of accepted papers that we could readily locate.
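As a quick sanity check on the figures above, the arithmetic can be reproduced in a few lines. The variable names are illustrative, and the coverage figure is derived here from the quoted counts rather than stated by the author.

```python
# Counts quoted in this README.
submitted = 11_532   # valid submissions to CVPR 2024
accepted = 2_719     # accepted papers
in_dataset = 2_379   # papers currently in the dataset

acceptance_rate = accepted / submitted  # share of submissions that were accepted
coverage = in_dataset / accepted        # share of accepted papers present in the dataset

print(f"Acceptance rate:  {acceptance_rate:.1%}")
print(f"Dataset coverage: {coverage:.1%}")
```

With these counts, the dataset covers roughly seven in every eight accepted papers.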

If you're going to be at CVPR 2024, be sure to come say "Hi!". Here's where you can find me 👇🏼

📄 Dataset Details

📚 Dataset Description

Curated by: Harpreet Sahota, Hacker-in-Residence at Voxel51
Language(s): English (en)
License: CC-BY-ND-4.0

🗂️ Dataset Structure

The dataset includes the following information for each paper:

🖼️ Image of the first page of the paper
📌 title: The title of the paper
👨‍🔬👩‍🔬 authors_list: The list of authors
📄 abstract: The abstract of the paper
🔗 arxiv_link: Link to the paper on arXiv
🔗 other_link: Link to the project page, if available
🏷️ category_name: The primary category of the paper according to the arXiv taxonomy
🏷️ all_categories: All categories the paper falls into, as per the arXiv taxonomy
🔑 keywords: Keywords extracted using GPT-4o
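For orientation, the fields above can be pictured as a single record per paper. The sketch below is a hypothetical plain-Python container, not the FiftyOne schema itself: the class name `PaperRecord`, the `image_path` field, and all of the types (for example, lists for `authors_list` and `keywords`) are assumptions, and the example values are placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PaperRecord:
    """Hypothetical record mirroring the fields listed above (types are assumed)."""

    image_path: str                  # image of the paper's first page
    title: str
    authors_list: List[str]
    abstract: str
    arxiv_link: str
    category_name: str               # primary arXiv category
    all_categories: List[str] = field(default_factory=list)
    keywords: List[str] = field(default_factory=list)
    other_link: Optional[str] = None  # project page, if available


# Placeholder values for illustration only; not taken from the dataset.
example = PaperRecord(
    image_path="images/example_paper.png",
    title="Example Paper Title",
    authors_list=["First Author", "Second Author"],
    abstract="Placeholder abstract text.",
    arxiv_link="https://arxiv.org/abs/0000.00000",
    category_name="cs.CV",
    all_categories=["cs.CV", "cs.LG"],
    keywords=["placeholder keyword"],
)

print(example.title, example.category_name, example.other_link)
```

Making `other_link` optional reflects the "if available" wording above; every other field is treated as required in this sketch.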

🎯 Uses

This dataset can be used for various purposes, including:

Analyzing trends in research presented at CVPR 2024
Studying the distribution of topics and methods in computer vision research
And more!
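As one illustration of the trend analysis mentioned above, category and keyword frequencies can be tallied with the standard library. This is a minimal sketch over made-up records; it assumes each paper exposes `category_name` as a string and `keywords` as a list, which may differ from how the fields are actually stored.

```python
from collections import Counter

# Made-up records for illustration only; not drawn from the actual dataset.
papers = [
    {"category_name": "cs.CV", "keywords": ["diffusion", "image generation"]},
    {"category_name": "cs.CV", "keywords": ["3d reconstruction", "diffusion"]},
    {"category_name": "cs.LG", "keywords": ["self-supervised learning"]},
]

# Count how many papers fall under each primary category.
category_counts = Counter(p["category_name"] for p in papers)

# Count how often each keyword appears across all papers.
keyword_counts = Counter(kw for p in papers for kw in p["keywords"])

print(category_counts.most_common(1))
print(keyword_counts.most_common(1))
```

The same counting pattern would apply to `all_categories` if you want to include secondary categories.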

🛠️ Dataset Creation

The dataset was created using the following steps:

Scrape the CVPR 2024 website for the list of accepted papers.
Search for each paper's abstract on arXiv using DuckDuckGo.
Retrieve abstracts and categories, and download PDFs, using arxiv.py, a Python wrapper for the arXiv API.
Convert the first page of each paper to an image using pdf2image.
Extract keywords from each abstract using GPT-4o.
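One small piece of glue between the search step and the arXiv lookup is turning a search-result URL into an arXiv identifier. The helper below is a hypothetical sketch of that step, not code from this repository: the function name `extract_arxiv_id` and the regular expression are assumptions, it handles only new-style identifiers (four digits, a dot, then four or five digits), and the URLs shown are placeholders.

```python
import re
from typing import Optional

# Matches new-style arXiv identifiers inside /abs/ or /pdf/ URLs,
# with an optional version suffix such as "v2" that is not captured.
_ARXIV_ID = re.compile(
    r"arxiv\.org/(?:abs|pdf)/(\d{4}\.\d{4,5})(?:v\d+)?", re.IGNORECASE
)


def extract_arxiv_id(url: str) -> Optional[str]:
    """Return the version-less arXiv ID found in `url`, or None if there is none."""
    match = _ARXIV_ID.search(url)
    return match.group(1) if match else None


# Placeholder URLs for illustration only.
print(extract_arxiv_id("https://arxiv.org/abs/0000.00000v2"))
print(extract_arxiv_id("https://arxiv.org/pdf/0000.00000.pdf"))
print(extract_arxiv_id("https://example.com/some-paper"))
```

Returning `None` for non-arXiv URLs makes it easy to skip papers with no arXiv match, which is consistent with the dataset covering only a subset of accepted papers.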

🚀 Code Execution Order

To build this dataset, run the following scripts in order:

scrape_cvpr_for_papers.py
get_arxiv_data.py
pre_process_csv.py
create_fiftyone_dataset.py
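The order above could be automated with a small driver script. This is a sketch under assumptions: it supposes the four scripts live in a `src` directory and take no command-line arguments, neither of which is confirmed here, and the helper names `build_commands` and `run_all` are hypothetical.

```python
import subprocess
import sys
from pathlib import Path
from typing import List

# Scripts in the order given above.
SCRIPTS = [
    "scrape_cvpr_for_papers.py",
    "get_arxiv_data.py",
    "pre_process_csv.py",
    "create_fiftyone_dataset.py",
]


def build_commands(script_dir: str = "src") -> List[List[str]]:
    """Build one command per script, using the current Python interpreter."""
    # `script_dir` defaults to "src", which is an assumption about the repo layout.
    return [[sys.executable, str(Path(script_dir) / name)] for name in SCRIPTS]


def run_all(script_dir: str = "src") -> None:
    """Run each script in order, stopping at the first failure."""
    for cmd in build_commands(script_dir):
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, check=True)  # raises CalledProcessError on a non-zero exit
```

Calling `run_all()` from the repository root would execute the pipeline; `check=True` ensures a later step never runs on the output of a failed earlier one.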

🙏 Acknowledgements

We gratefully acknowledge the support from the CVPR conference organizers, arXiv for providing an accessible API, and the contributors who made this dataset possible.

