rmattila/arXamination

arXamination

arXamination is a tool powered by a Large Language Model (LLM) that streamlines the initial review of academic papers, including but not limited to arXiv papers. It surfaces key aspects of a paper so that researchers, students, and professionals can quickly gauge its quality and relevance, saving time and effort during paper selection.

Usage

Run the arxamination tool with an arXiv article ID, a URL to a PDF, or a path to a local PDF file as a command-line argument. For example:

arxamination 1706.03762                     # For an arXiv article
arxamination http://example.com/paper.pdf   # For a paper available via URL
arxamination /path/to/your/file.pdf         # For a local PDF file

The tool fetches the specified article if necessary and then analyzes it:

Screenshot of arXamination analyzing the Transformer paper
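
As a rough illustration of how the three kinds of argument could be told apart, here is a minimal Python sketch. It is not taken from the arXamination source: the helper names `classify_source` and `arxiv_pdf_url` and the ID pattern are assumptions made for this example.

```python
import re
from urllib.parse import urlparse

# Assumed pattern: modern IDs such as 1706.03762 (optionally with a version
# suffix like v5) and old-style IDs such as hep-th/9901001.
_ARXIV_ID = re.compile(r"^(\d{4}\.\d{4,5}|[a-z\-]+(\.[A-Z]{2})?/\d{7})(v\d+)?$")


def classify_source(arg: str) -> str:
    """Return "url", "arxiv", or "file" for a command-line argument.

    Hypothetical helper; the real tool may decide differently.
    """
    if urlparse(arg).scheme in ("http", "https"):
        return "url"
    if _ARXIV_ID.match(arg):
        return "arxiv"
    # Anything else is treated as a path to a local PDF.
    return "file"


def arxiv_pdf_url(arxiv_id: str) -> str:
    """Build the PDF URL for an arXiv ID."""
    return f"https://arxiv.org/pdf/{arxiv_id}"


for arg in ("1706.03762", "http://example.com/paper.pdf", "/path/to/your/file.pdf"):
    print(arg, "->", classify_source(arg))
```

One caveat of this ordering: a relative path that happens to look like an old-style ID would be classified as an arXiv article, so a real implementation might check whether the file exists first.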

Installation

Clone the repository:

git clone https://github.com/rmattila/arXamination.git 

Navigate to the project directory:

cd arXamination 

Create a virtual environment (optional but recommended):

conda create -n arxamination-env python   # include python so the environment has its own pip
conda activate arxamination-env

Install the project's dependencies:

pip install -r requirements.txt

Next, install the arxamination package itself. This step is necessary for users who want to run the command-line tool:

pip install .

What LLM is used? Do I need an API key?

By default, the tool uses GPT4All, which runs LLMs locally (no GPU required) and therefore incurs no API costs. Model settings are adjusted in the config.toml file.

The tool also supports OpenAI's API for models such as GPT-3.5 and GPT-4; this implementation is already included. The architecture is designed for easy extension to other LLM services: to integrate a new service, extend the BaseLLM class and implement the get_LLM_response method.
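
A minimal sketch of such an extension might look like the following. The `BaseLLM` defined here is only a stand-in so that the example runs on its own; the project's real class, and the exact signature of `get_LLM_response` (assumed here to take a prompt string and return a string), may differ. `EchoLLM` is a hypothetical toy backend.

```python
from abc import ABC, abstractmethod


class BaseLLM(ABC):
    """Stand-in for the project's BaseLLM; the real interface may differ."""

    @abstractmethod
    def get_LLM_response(self, prompt: str) -> str:
        """Return the model's answer to the given prompt (assumed signature)."""


class EchoLLM(BaseLLM):
    """Toy backend that echoes the prompt instead of calling a service."""

    def __init__(self, prefix: str = "echo: "):
        self.prefix = prefix

    def get_LLM_response(self, prompt: str) -> str:
        # A real backend would send the prompt to its service here.
        return f"{self.prefix}{prompt}"


llm = EchoLLM()
answer = llm.get_LLM_response("What is the main contribution?")
print(answer)
```

A backend for an actual service would replace the body of `get_LLM_response` with a call to that service's client library.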

To use OpenAI's models through their API, specify your preferences in the config.toml file. To avoid accidentally exposing the API key in your configuration file, set it via the OPENAI_API_KEY environment variable instead.
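
The idea of preferring the environment variable over the configuration file can be sketched as below. The helper `resolve_api_key` and the `api_key` config entry are assumptions made for illustration; only the OPENAI_API_KEY variable name comes from this README.

```python
import os
from typing import Mapping


def resolve_api_key(config: Mapping[str, str],
                    env: Mapping[str, str] = os.environ) -> str:
    """Return the API key, preferring the environment over the config.

    Hypothetical helper; "api_key" is an assumed config entry.
    """
    key = env.get("OPENAI_API_KEY") or config.get("api_key")
    if not key:
        raise RuntimeError(
            "No API key found: set the OPENAI_API_KEY environment variable."
        )
    return key


# The environment variable wins even if the config also holds a key.
print(resolve_api_key({"api_key": "from-config"}, env={"OPENAI_API_KEY": "from-env"}))
```

Keeping the key out of config.toml means the file can be committed or shared without leaking credentials.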

Ideas for future improvements

  • Implement retrieval-augmented generation (RAG) to reduce the number of LLM queries
  • Improve the prompt templates and the set of questions
    • Add more "sanity check" questions to evaluate the research's soundness
    • Develop questions aimed at generating new ideas or insights from the article
  • Integrate with reference managers (e.g., Zotero and Mendeley)
  • Generate reports in PDF and HTML formats for better documentation and sharing options
  • Analyze, compare, and synthesize insights from multiple articles, identifying commonalities, differences, and generating novel ideas that integrate findings across papers
