SussLLM: Simulated User Search Sessions with Large Language Models

Source code for our paper:
SussLLM: Simulated User Search Sessions with Large Language Models

🚧 Repository Update In Progress 🚧

We are currently refactoring the repository to improve its usability. Soon, you will be able to install SussLLM via pip install suss-llm and use enhanced command-line functionality for easier access and operation.

🚨 Upcoming Release Alert 🚨

SussData, the comprehensive dataset generated by SussLLM, will soon be published on Zenodo for public access and research use.

Overview

SussLLM is a Python-based simulation framework designed to model user search behavior using advanced large language models (LLMs). It generates realistic user profiles and search sessions to better understand and analyze how users interact with search engines.

Quick Start

Structure

sussllm/
│
├── sussllm/                              # Main package directory
│   ├── __init__.py                       # Initialization script
│   ├── profiles.py                       # User profile construction
│   ├── simulation.py                     # Simulation process management
│   ├── interaction.py                    # User interaction simulation
│   ├── reasoning.py                      # LLM reasoning and action decision module
│   ├── dataset.py                        # Dataset construction and management
│   ├── train.py                          # Training the models
│   ├── search/interfaces                 # Search interfaces and indexing
│   │   ├── __init__.py
│   │   ├── lucene/                       # Lucene indexing and BM25 search
│   │   ├── bing_search_engine.py         # Bing API search
│   │   └── google_search_engine.py       # Google API search
│   ├── utils/                            # Utility functions and classes
│   │   ├── __init__.py
│   │   ├── logger.py                     # Logging utility
│   │   ├── evaluation_metrics.py         # Evaluation utility
│   │   └── helpers.py                    # Helper functions
│   ├── llm_agent/                        # LLM-agents acting as users
│   ├── user_profile/                     # User profile simulation
│   │   ├── behavioral/                   # User profile simulation using the behavioral-oriented approach
│   │   └── component/                    # User profile simulation using the component-oriented approach
│
├── scripts/                              # Scripts for running simulations and analyses
│   ├── run_simulation.py                 # Script to run simulations
│   └── evaluate_model.py                 # Evaluation script for IR tasks
│
├── models/                               # Directory for ML models
│   ├── bm25_baseline.py                 
│   └── ranking_model.py
│
├── datasets/                             # Directory for storing datasets
│   ├── aol/                
│   └── trec/
│
├── simulations/                          # Simulation output for User profiles and sessions
│   ├── sussdata/                
│   └── user_profiles/
│
├── pyproject.toml                        # Poetry package and dependency management
└── README.md                             # Project overview and usage instructions

Features

User Profile Simulation: Generate detailed user profiles based on behavioral and component-oriented attributes.
Search Session Simulation: Simulate dynamic search sessions including queries, clicks, and decisions.
Reasoning with LLMs: Integrate large language models to enhance simulation realism by generating user reasoning and decision-making processes.
Dataset Management: Handle and manipulate datasets of simulated search sessions and real-world data.
Extensible: Easily extendable for various simulation scenarios and different LLMs.
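
To make the session features above concrete, here is a minimal sketch of how one simulated session (queries, clicks, and decisions) could be represented and serialized. The class names, field names, and decision labels are assumptions made for illustration; they are not taken from the SussLLM code base or from the SussData format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Interaction:
    """One step of a session: a query plus the results clicked for it."""
    query: str
    clicked_urls: List[str] = field(default_factory=list)
    # Hypothetical decision labels: "continue", "reformulate", "stop".
    decision: str = "continue"


@dataclass
class SearchSession:
    """A simulated session tied to one user profile (field names are assumptions)."""
    user_id: str
    interactions: List[Interaction] = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() converts nested dataclasses (including those in lists) to dicts.
        return json.dumps(asdict(self), indent=2)


session = SearchSession(user_id="user-001")
session.interactions.append(
    Interaction(
        query="bm25 ranking",
        clicked_urls=["https://example.org/bm25"],
        decision="reformulate",
    )
)
session.interactions.append(Interaction(query="bm25 parameters k1 b", decision="stop"))
print(session.to_json())
```

Keeping each step as its own record makes it straightforward to log sessions as JSON and to replay them later when evaluating a ranking model.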

Prerequisites

Python 3.8 or higher
Poetry for dependency management

Configuration

Set your API keys as environment variables:

export OPENAI_API_KEY='your-openai-api-key'
export BING_API_KEY='your-bing-api-key'
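
As a minimal sketch of how a script might read these keys and fail early when one is missing, assuming only the two variable names shown above. The helper load_api_keys is hypothetical and not part of the SussLLM package.

```python
import os

# Variable names taken from the export lines above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "BING_API_KEY")


def load_api_keys(env=None):
    """Return the required API keys, raising if any is missing or empty.

    Hypothetical helper for illustration; not part of SussLLM.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_KEYS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_KEYS}
```

Calling load_api_keys() at the start of a run reports a missing key immediately instead of failing later, in the middle of a simulation.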

Installation

First, clone the repository:

git clone https://github.com/saberzerhoudi/suss-llm.git
cd suss-llm

Install the project dependencies using Poetry:

poetry install

Usage

Running Simulations

To run simulations, use the run_simulation.py script:

poetry run python scripts/run_simulation.py

This script generates user profiles, simulates their search sessions, and logs the output.

Evaluating Models

To evaluate your IR models with the simulated data, use the evaluate_model.py script:

poetry run python scripts/evaluate_model.py

Before running it, make sure you have a trained model and that the dataset path is set correctly in the script.
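
This README does not state which metrics evaluate_model.py computes. As an illustration of the kind of evaluation it describes, the sketch below computes mean reciprocal rank (MRR) for a ranking model, treating simulated clicks as relevance signals. The function name, the input layout, and the use of clicks as relevance labels are all assumptions.

```python
from typing import Dict, List, Set


def mean_reciprocal_rank(
    rankings: Dict[str, List[str]], clicks: Dict[str, Set[str]]
) -> float:
    """Average of 1/rank of the first clicked document per query.

    A query whose ranking contains no clicked document contributes 0.
    """
    if not rankings:
        return 0.0
    total = 0.0
    for query, ranked_docs in rankings.items():
        relevant = clicks.get(query, set())
        for rank, doc_id in enumerate(ranked_docs, start=1):
            if doc_id in relevant:
                total += 1.0 / rank
                break
    return total / len(rankings)


# Toy data: model rankings per query and the documents clicked in simulation.
rankings = {"q1": ["d3", "d1", "d2"], "q2": ["d5", "d4"]}
clicks = {"q1": {"d1"}, "q2": {"d5"}}
print(mean_reciprocal_rank(rankings, clicks))  # (1/2 + 1/1) / 2 = 0.75
```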
