Chronos is a baseline framework for question answering under continuous knowledge drift. It provides a unified pipeline for evaluating how large language models (LLMs) adapt to evolving, time-dependent knowledge using structured retrieval and reasoning.
chronos.py # Main Chronos pipeline
baselines/ # Baseline implementations
├── direct.py # Direct generation baseline
├── rag.py # Retrieval-Augmented Generation baseline
└── react.py # ReAct-style reasoning baseline
data/ # Dataset directory (must be prepared before running)
llm_api.py # Unified LLM / embedding API wrapper (OpenAI / Azure / Claude / DeepInfra)
utils.py # Utility functions (EM evaluation, time parsing, quad serialization)

Python 3.10+ is recommended.
Install dependencies with:
pip install openai anthropic langchain-core langchain-huggingface sentence-transformers tqdm

Before running the code, select the desired api_source in llm_api.py and place the corresponding API key file in the project root directory.
Run Chronos
python chronos.py

Run Baselines
python baselines/direct.py
python baselines/rag.py
python baselines/react.py

All experiment outputs are written to:
outputs/{method}/{model_id}/{task}/record_*.json

Each script automatically reports Exact Match (EM) accuracy for each task upon completion.
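For readers unfamiliar with the metric, the following is a minimal sketch of how Exact Match accuracy is commonly computed. It is not taken from utils.py: the function names (normalize_answer, exact_match, em_accuracy) and the normalization steps (lowercasing, stripping punctuation and English articles) are assumptions borrowed from common QA evaluation practice, and the actual Chronos implementation may normalize answers differently.

```python
import re
import string


def normalize_answer(text: str) -> str:
    """Lowercase, strip punctuation, drop English articles, collapse whitespace.

    Assumed normalization; the real utils.py may differ.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold_answers: list[str]) -> bool:
    """True if the normalized prediction equals any normalized gold answer."""
    pred = normalize_answer(prediction)
    return any(pred == normalize_answer(gold) for gold in gold_answers)


def em_accuracy(predictions: list[str], references: list[list[str]]) -> float:
    """Fraction of predictions that exactly match one of their gold answers."""
    if not predictions:
        return 0.0
    hits = sum(exact_match(p, refs) for p, refs in zip(predictions, references))
    return hits / len(predictions)


if __name__ == "__main__":
    # Hypothetical predictions and gold answers, for illustration only.
    preds = ["Paris.", "Berlin"]
    golds = [["paris"], ["Munich"]]
    print(f"EM accuracy: {em_accuracy(preds, golds):.2f}")
```

Accepting a list of gold answers per question lets a prediction count as correct if it matches any acceptable surface form, which matters when the correct answer changes over time or has aliases.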