Implementation and reproduction code for:
Towards Efficient and Evidence-grounded Mobility Prediction with LLM-Driven Agents
AgentMob is a training-free, tool-based agent framework for individual next-location prediction. It analyzes historical mobility records, summarizes temporal/spatial/transition evidence, asks an LLM to generate and select candidate locations when configured, and falls back to deterministic scoring when an LLM backend is unavailable.
This artifact is prepared for double-blind review. Please do not infer author identity from repository metadata, commit history, package namespaces, local paths, or account names. Any author-identifying information has been removed or replaced with neutral placeholders where possible.
If you discover identity-revealing metadata in the artifact, please ignore it for the purpose of review and report it as an artifact issue.
This repository is intended to let reviewers inspect the implementation and run the released evaluation pipeline when licensed mobility data are available.
| Included | Not included |
|---|---|
| Agent, tool, scoring, reranking, tracing, and visualization code | Raw mobility records |
| Configuration templates and environment variable template | Dataset redistribution or download credentials |
| Batch and single-user evaluation entry points | Private API keys or model-provider credentials |
| Data-independent unit tests and integration tests | Paper result files generated from restricted data |
Without the restricted data files, reviewers can still inspect all source code, install dependencies, run syntax checks, and run the data-independent tests. Full metric reproduction requires placing the licensed datasets in the expected local format.
```text
.
|-- README.md
|-- pyproject.toml
|-- requirements.txt
|-- config/
|   |-- config.yaml
|   `-- config_yjmob.yaml
|-- core/
|   |-- agent.py
|   |-- reasoning_agent.py
|   |-- candidate_scoring.py
|   `-- candidate_reranker.py
|-- tools/
|   |-- base.py
|   `-- mobility_tools.py
|-- utils/
|   |-- litellm_client.py
|   |-- llm_client.py
|   |-- vllm_client.py
|   |-- llm_tracer.py
|   `-- experiment_manager.py
|-- prompts/
|-- scripts/
|-- exp_scripts/
|-- tests/
|-- visualization/
`-- docs/
```
The code package's original command-level README is preserved at docs/CODE_USAGE.md. Additional implementation notes, tool-log docs, visualization notes, and batch-evaluation notes are under docs/.
MobilityPredictionAgent in core/agent.py orchestrates the default tool-based prediction workflow.
ReasoningMobilityAgent in core/reasoning_agent.py provides an alternative per-prediction reasoning workflow.
tools/mobility_tools.py contains the mobility tools for historical retrieval, pattern analysis, candidate generation, and final prediction.
core/candidate_reranker.py and core/candidate_scoring.py implement candidate refinement and fallback scoring.
utils/litellm_client.py provides a LiteLLM-backed client for OpenAI-compatible, Azure OpenAI, and vLLM endpoints. Legacy Azure/vLLM clients are also included.
utils/experiment_manager.py and utils/llm_tracer.py organize batch outputs and LLM traces.
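The LLM-then-fallback control flow described above can be sketched as follows. This is only an illustration of the pattern, not the code in core/candidate_scoring.py: `frequency_ranking`, `predict`, and the `llm_select` callable are hypothetical names, and visit-frequency ranking is a stand-in for whatever deterministic scoring the repository actually uses.

```python
from collections import Counter


def frequency_ranking(history, k=5):
    """Stand-in deterministic scorer (assumption): rank locations by visit count."""
    counts = Counter(history)
    return [location for location, _ in counts.most_common(k)]


def predict(history, llm_select=None, k=5):
    """Return ranked candidates, using llm_select when it is provided and works.

    llm_select is a hypothetical callable that receives the candidate list
    and returns the chosen location.
    """
    candidates = frequency_ranking(history, k)
    if llm_select is not None:
        try:
            choice = llm_select(candidates)
            if choice in candidates:
                # Promote the LLM's choice to rank 1, keep the rest in order.
                return [choice] + [c for c in candidates if c != choice]
        except Exception:
            pass  # backend unavailable or failed: fall through to the fallback
    return candidates


def broken_backend(candidates):
    """Simulates an unreachable LLM endpoint."""
    raise ConnectionError("LLM backend unavailable")


history = ["home", "work", "home", "gym", "home", "work"]
print(predict(history))                                     # no LLM configured
print(predict(history, llm_select=lambda cands: "work"))    # LLM picks "work"
print(predict(history, llm_select=broken_backend))          # LLM call fails
```

In this sketch a failed or invalid LLM response never prevents a prediction; the deterministic ranking is always computed first and returned unchanged when the LLM step cannot be used.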
Use Python 3.10 or newer.
```bash
python -m venv .venv
source .venv/bin/activate
pip install -U pip
pip install -r requirements.txt
```
For tests, install the development dependencies:
```bash
pip install -e ".[dev]"
```
Copy the environment template if you will use an LLM backend:
```bash
cp .env.example .env
```
Then fill in the relevant variables for the provider selected in config/config.yaml.
These checks do not require private mobility files:
```bash
python -m compileall core tools utils prompts scripts tests main_evaluation.py
python -m pytest tests/test_candidate_reranker.py tests/test_imports.py
```
The full test suite includes integration tests that load mobility CSV files. Those tests are expected to fail with FileNotFoundError until the reviewer provides the restricted data under ./data or ./data/origin_data.
The default config is config/config.yaml. Select one provider:
```yaml
provider: "openai"
providers:
  openai:
    model: "openai/gpt-4.1-mini"
  azure:
    model: "azure/gpt-4"
    api_version: "2025-04-01-preview"
  vllm:
    model: "openai/Qwen/Qwen3-8B"
    api_base: "http://localhost:22002/v1"
```
Credentials are loaded from .env or the shell environment:
| Provider | Environment variables |
|---|---|
| OpenAI-compatible | OPENAI_API_KEY, OPENAI_API_BASE |
| Azure OpenAI | AZURE_API_KEY, AZURE_API_BASE, AZURE_API_VERSION |
| vLLM | VLLM_API_BASE, VLLM_API_KEY optional |
You can override the provider at runtime with --provider openai, --provider azure, or --provider vllm.
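Before launching a run, it can help to confirm that the variables for the chosen provider are set. The sketch below is not part of the repository; `REQUIRED_ENV` and `missing_env_vars` are hypothetical names. It assumes every variable listed in the table is required except VLLM_API_KEY, and it inspects only the current process environment, so values that exist only in .env are not seen unless they have been exported.

```python
import os

# Variables per provider, copied from the table above.
# Assumption: all listed variables are required except VLLM_API_KEY.
REQUIRED_ENV = {
    "openai": ["OPENAI_API_KEY", "OPENAI_API_BASE"],
    "azure": ["AZURE_API_KEY", "AZURE_API_BASE", "AZURE_API_VERSION"],
    "vllm": ["VLLM_API_BASE"],
}


def missing_env_vars(provider, env=None):
    """Return the required variables that are unset or empty for `provider`."""
    env = os.environ if env is None else env
    if provider not in REQUIRED_ENV:
        raise ValueError(f"unknown provider: {provider!r}")
    return [name for name in REQUIRED_ENV[provider] if not env.get(name)]


if __name__ == "__main__":
    for provider in REQUIRED_ENV:
        missing = missing_env_vars(provider)
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"{provider}: {status}")
```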
Raw and processed mobility data are not included because of licensing and privacy restrictions. The default configuration expects data under:
```text
data/origin_data/
```
For the origin dataset type, files should be named:
```text
data/origin_data/{user_id}_hour.csv
```
The expected columns include time, polygon_id, lat, and lon. Optional columns such as city_name and description are used when present.
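A quick header check can catch format problems before a full run. The following is a minimal sketch rather than the repository's loader: `check_columns` is a hypothetical helper, the sample row is synthetic, and the timestamp format shown is an assumption. Since the text says the expected columns "include" these four, real files may carry additional columns, which this check ignores.

```python
import csv
import io

REQUIRED_COLUMNS = {"time", "polygon_id", "lat", "lon"}
OPTIONAL_COLUMNS = {"city_name", "description"}


def check_columns(csv_file):
    """Read the header row and report missing required and present optional columns."""
    header = next(csv.reader(csv_file), [])
    present = {name.strip() for name in header}
    return {
        "missing_required": sorted(REQUIRED_COLUMNS - present),
        "optional_present": sorted(OPTIONAL_COLUMNS & present),
    }


# Synthetic example; the values and the timestamp format are made up.
sample = io.StringIO(
    "time,polygon_id,lat,lon,city_name\n"
    "2023-03-01 08:00:00,42,0.0,0.0,ExampleCity\n"
)
report = check_columns(sample)
print(report)
```

For a file on disk, pass an open handle instead, for example `open(path, newline="", encoding="utf-8")`.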
Other supported formats:
```text
split_data:
  data/<dataset>/train/{user_id}_train.csv
  data/<dataset>/validation/{user_id}_validation.csv
  data/<dataset>/test/{user_id}_test.csv

yjmob100k:
  data/<dataset>/train/{user_id}.csv
  data/<dataset>/test/{user_id}.csv
```
Set the data directory and dataset format in config/config.yaml or via batch-evaluation arguments.
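The three layouts above can be expressed as a small path helper, which is useful for checking that a user's files are in place. This is a sketch, not repository code: `expected_files` is a hypothetical name, and using the strings split_data and yjmob100k as dataset-type values is an assumption (only origin appears as a command-line value in this README).

```python
from pathlib import Path


def expected_files(data_dir, dataset_type, user_id):
    """Return the per-user CSV paths implied by the layouts listed above."""
    root = Path(data_dir)
    if dataset_type == "origin":
        return [root / f"{user_id}_hour.csv"]
    if dataset_type == "split_data":
        splits = ("train", "validation", "test")
        return [root / split / f"{user_id}_{split}.csv" for split in splits]
    if dataset_type == "yjmob100k":
        return [root / split / f"{user_id}.csv" for split in ("train", "test")]
    raise ValueError(f"unsupported dataset type: {dataset_type!r}")


def missing_files(data_dir, dataset_type, user_id):
    """Return the expected paths that do not exist on disk."""
    return [p for p in expected_files(data_dir, dataset_type, user_id) if not p.is_file()]


for path in expected_files("data/origin_data", "origin", "user1"):
    print(path.as_posix())
```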
Replace <user_id> with an ID present in your local data directory.
Run the default tools-LLM agent:
```bash
python main_evaluation.py \
  --config config/config.yaml \
  --user-ids <user_id> \
  --knowledge-cutoff 2023-02-28 \
  --prediction-start 2023-03-01 \
  --prediction-end 2023-03-07 \
  --agent-mode tools-llm \
  --provider openai \
  --output-dir evaluation_results
```
Run the reasoning agent:
```bash
python main_evaluation.py \
  --config config/config.yaml \
  --user-ids <user_id> \
  --knowledge-cutoff 2023-02-28 \
  --prediction-start 2023-03-01 \
  --prediction-end 2023-03-07 \
  --agent-mode reasoning
```
Run deterministic fallback logic without an LLM:
```bash
python run_local_evaluation.py \
  --config config/config.yaml \
  --user-id <user_id> \
  --knowledge-cutoff 2023-02-28 \
  --prediction-start 2023-03-01 \
  --prediction-end 2023-03-07
```
Run all users in a data directory:
```bash
python run_batch_evaluation_parallel.py \
  --all \
  --data-dir ./data/origin_data \
  --dataset-type origin \
  --cutoff 2023-06-30 \
  --start 2023-07-01 \
  --end 2023-07-07 \
  --agent-mode tools-llm \
  --provider openai \
  --parallel 5
```
Run a selected subset:
```bash
python run_batch_evaluation_parallel.py \
  --users user1 user2 user3 \
  --data-dir ./data/origin_data \
  --dataset-type origin \
  --parallel 3 \
  --experiment-name subset_test
```
Batch outputs are written under batch_results/ by default:
```text
batch_results/
`-- exp_YYYYMMDD_HHMMSS_<agent_mode>/
    |-- config.json
    |-- summary.json
    |-- batch_results.csv
    |-- progress.json
    |-- users/
    `-- llm_traces/
```
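To pick up the most recent run programmatically, the timestamped directory names can be sorted lexicographically, since exp_YYYYMMDD_HHMMSS orders chronologically as plain text. The sketch below assumes the default naming shown above (a custom --experiment-name may produce different directory names) and makes no assumption about the fields inside summary.json; `latest_experiment` and `load_summary` are hypothetical helpers.

```python
import json
from pathlib import Path


def latest_experiment(results_root="batch_results"):
    """Return the exp_* directory with the latest timestamp in its name, or None."""
    root = Path(results_root)
    if not root.is_dir():
        return None
    runs = sorted(p for p in root.glob("exp_*") if p.is_dir())
    return runs[-1] if runs else None


def load_summary(run_dir):
    """Parse summary.json from an experiment directory."""
    with open(Path(run_dir) / "summary.json", encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    run = latest_experiment()
    if run is None:
        print("no experiment directories found")
    else:
        summary = load_summary(run)
        print(run.name, type(summary).__name__)
```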
Start a vLLM OpenAI-compatible server, then run:
```bash
python run_vllm_evaluation.py \
  --data-dir ./data/origin_data \
  --vllm-base-url http://localhost:22002/v1 \
  --model Qwen/Qwen3-8B \
  --user-id <user_id> \
  --knowledge-cutoff 2023-02-28 \
  --prediction-start 2023-03-01 \
  --prediction-end 2023-03-01T05:00:00
```
Helper scripts for vLLM setup and related notes are in scripts/start_vllm_server.py, scripts/serve_qwen_vllm.sh, and docs/README_VLLM.md.
Run syntax checks:
```bash
python -m compileall core tools utils prompts scripts tests main_evaluation.py
```
Run the test suite:
```bash
python -m pytest tests --asyncio-mode=auto
```
Several integration tests expect mobility CSV files under ./data or ./data/origin_data. Without the restricted data files, only the data-independent tests will pass. The remaining failures reflect missing data, not a defect in the code.
Evaluation outputs include prediction CSV files, per-user JSON metrics, aggregated batch CSV files, progress metadata, and optional LLM trace JSONL files.
The evaluator reports exact-match accuracy, top-5 accuracy, stay/move accuracy, prediction time, and representative error cases.
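For readers who want to sanity-check reported numbers, the three accuracy metrics can be computed as in the sketch below. This is not the repository's evaluator. It assumes each prediction is a ranked list of candidate location IDs, and it assumes "stay" means the location equals the previously observed one; the evaluator's actual stay/move definition may differ. All function names and the toy data are hypothetical.

```python
def _check(truth, *others):
    """Guard against empty or misaligned inputs."""
    if not truth or any(len(o) != len(truth) for o in others):
        raise ValueError("inputs must be non-empty and of equal length")


def exact_match_accuracy(truth, predictions):
    """Fraction of steps whose top-ranked candidate equals the true location."""
    _check(truth, predictions)
    hits = sum(1 for t, ranked in zip(truth, predictions) if ranked and ranked[0] == t)
    return hits / len(truth)


def top_k_accuracy(truth, predictions, k=5):
    """Fraction of steps whose true location appears in the first k candidates."""
    _check(truth, predictions)
    hits = sum(1 for t, ranked in zip(truth, predictions) if t in ranked[:k])
    return hits / len(truth)


def stay_move_accuracy(truth, predictions, previous):
    """Fraction of steps where the predicted stay/move label matches the true one.

    Assumption: a step is a "stay" when the location equals the previous location.
    """
    _check(truth, predictions, previous)
    hits = sum(
        1
        for t, ranked, prev in zip(truth, predictions, previous)
        if ranked and (ranked[0] == prev) == (t == prev)
    )
    return hits / len(truth)


# Toy data: four prediction steps for one user.
truth = ["A", "B", "B", "C"]
previous = ["A", "A", "B", "B"]
predictions = [["A", "B"], ["A", "B"], ["C", "B"], ["C", "A"]]

print(exact_match_accuracy(truth, predictions))          # 0.5
print(top_k_accuracy(truth, predictions, k=5))           # 1.0
print(stay_move_accuracy(truth, predictions, previous))  # 0.5
```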
All preprocessing and inference must respect chronological validity. Tools should only access training history and observations available before the target timestamp.
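The chronological constraint amounts to filtering history by timestamp before any tool sees it. A minimal sketch, assuming records are dictionaries with a `time` field holding a `datetime`; `history_before` is a hypothetical helper and the records are synthetic:

```python
from datetime import datetime


def history_before(records, target_time):
    """Return records strictly earlier than target_time, sorted by time."""
    visible = [r for r in records if r["time"] < target_time]
    return sorted(visible, key=lambda r: r["time"])


# Synthetic records; polygon IDs are made up.
records = [
    {"time": datetime(2023, 3, 1, 9, 0), "polygon_id": 12},
    {"time": datetime(2023, 2, 27, 18, 0), "polygon_id": 7},
    {"time": datetime(2023, 3, 1, 8, 0), "polygon_id": 7},
]

target = datetime(2023, 3, 1, 9, 0)
for record in history_before(records, target):
    print(record["time"].isoformat(), record["polygon_id"])
```

The comparison is strict, so the record at the target timestamp itself is excluded; including it would leak the label being predicted.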
Location descriptions, when used, should be generated once and shared across LLM-based methods to keep comparisons fair.
For privacy, logs should not include raw GPS coordinates unless the corresponding dataset license allows redistribution.
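One way to follow this guideline is to strip coordinate fields from a record before it is written to a log or trace. The sketch below is illustrative only and is not the repository's tracer: `redact` is a hypothetical helper, and it assumes coordinates are stored under the keys lat and lon, matching the CSV column names.

```python
import json

# Assumption: raw coordinates appear under these keys, as in the CSV columns.
SENSITIVE_KEYS = {"lat", "lon"}


def redact(value):
    """Return a copy of `value` with coordinate fields removed at any nesting depth."""
    if isinstance(value, dict):
        return {k: redact(v) for k, v in value.items() if k not in SENSITIVE_KEYS}
    if isinstance(value, list):
        return [redact(v) for v in value]
    return value


# Synthetic log entry.
entry = {
    "user": "user1",
    "candidates": [{"polygon_id": 7, "lat": 0.0, "lon": 0.0}],
}
print(json.dumps(redact(entry)))
```

The original entry is left unmodified, so the full record remains available in memory for prediction while only the redacted copy is serialized.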
For double-blind review, do not include author names, institution names, personal account names, acknowledgements, or non-anonymous cloud storage links in config files, logs, notebook metadata, or commit metadata.
This code is released for anonymous academic review. A full license will be provided in the camera-ready or public release.
Dataset access is governed by the original dataset providers. Users are responsible for obtaining permission to use BW, YJMob100K, Shanghai ISP, and any other mobility data, and for complying with all applicable privacy and redistribution terms.
For double-blind review, please use the conference review system for questions and artifact issues. Do not contact the authors directly during the anonymous review period.