# 🌌 LLM Thought Landscape Visualization
**Understanding Reasoning Patterns in Large Language Models**  

*Visual Diagnostics for Chain-of-Thought Reasoning*   

[![github badge](https://camo.githubusercontent.com/cf783a33cd08f7ba7e9fb758137fc9ebd587f4d9119b93ba4f1933afa23406ec/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4769746875622d3138313731373f7374796c653d666c61742d737175617265266c6f676f3d676974687562266c6f676f436f6c6f723d7768697465)](https://github.com/tmlr-group/landscape-of-thoughts)



<!-- ![demo](./imgs/demo.png) -->

---

## Table of Contents
- [**Background & Motivation**](#1)

- **1. Plot With Your Own Data**
   
   1.1 [Constructing Data](#2)

   1.2 [Calculate the Data for Visualization](#3)
   
   1.3 [Visualize the Data](#4)

- **2. Reproduce the Plot in the Paper**
   
   2.1 [Data Preparation](#5)
   
   2.2 [Visualization](#6)


## Clone the Repo

In [None]:
!git clone https://github.com/tmlr-group/landscape-of-thoughts
%cd landscape-of-thoughts

<a id="1"></a>
## Background & Motivation

### Why Thought Landscapes?
Modern LLMs demonstrate impressive reasoning capabilities through techniques like chain-of-thought prompting, but their internal decision-making processes remain opaque. This notebook implements our novel visualization method that:

1. **Maps reasoning paths** as 2D landscapes using semantic distance metrics
2. **Identifies critical patterns**:
   - Model capability boundaries
   - Answer confidence levels
   - Consistency between reasoning steps
3. **Supports model diagnostics** through:
   - Weak/Strong model differentiation
   - Error cluster detection
   - Uncertainty quantification

**Key Innovation**: The first method to enable direct visual comparison of different reasoning strategies (CoT, LeastToMost, ToT, and MCTS) on multiple-choice questions.

## 1. Plot With Your Own Data

<a id="2"></a>
### 1.1 Constructing Data

Format your data as follows. Each line of the `.jsonl` file holds one such record (shown pretty-printed here); an example can be found in `lot/data/dummy.jsonl`:
```json
{
    "question": "20 marbles were pulled out of a bag of only white marbles, painted black, and then put back in. Then, another 20 marbles were pulled out, of which 1 was black, after which they were all returned to the bag. If the percentage of black marbles pulled out the second time represents their percentage in the bag, how many marbles in total Q does the bag currently hold?",
    "options": [
        "A)40",
        "B)200",
        "C)380",
        "D)400",
        "E)3200"
    ],
    "correct": "D"
}
```
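Before handing a file to the loader, it can help to check that each record matches the format above. The following is a minimal sketch using only the standard library; `validate_record` and `REQUIRED_FIELDS` are hypothetical names introduced here for illustration and are not part of the `lot` package. It assumes option strings follow the `"D)400"` pattern shown above, with the label before the `)`.

```python
import json

# Field names taken from the record format shown above.
REQUIRED_FIELDS = ("question", "options", "correct")


def validate_record(line: str) -> dict:
    """Parse one JSONL line and check it against the expected record format."""
    record = json.loads(line)
    missing = [field for field in REQUIRED_FIELDS if field not in record]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    # Assumption: each option looks like "D)400", so the label precedes ")".
    labels = [option.split(")", 1)[0].strip() for option in record["options"]]
    if record["correct"] not in labels:
        raise ValueError(
            f"answer {record['correct']!r} is not among option labels {labels}"
        )
    return record


# A small made-up record, serialized onto a single line as JSONL requires.
example = json.dumps({
    "question": "What is 2 + 2?",
    "options": ["A)3", "B)4", "C)5"],
    "correct": "B",
})
record = validate_record(example)
print(record["correct"])
```

Running the check over every line of your own file before sampling surfaces malformed records early, rather than partway through an LLM run.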

You can use the following code to load the data and format the prompt.

In [None]:
from lot.datasets import load_dataset

dataset = load_dataset(
    dataset_name="dummy", 
    data_path="lot/data/dummy.jsonl",
    answer_field="correct",
    options_field="options",
    question_field="question"
)

question = dataset.get_query(0)
answer = dataset.get_answer(0)
prompt = dataset.format_prompt(idx=0, method="cot")
print("Question: ", question)
print("Answer: ", answer)
print("Prompt: ", prompt)

<a id="3"></a>
### 1.2 Calculate the Data for Visualization

You can use the following code to calculate the data for visualization. You need to set up an LLM first, either local (vLLM) or remote (Together.ai). See [setup_model.md](doc/setup_model.md) for more details.


This example uses a remote LLM (Together.ai) with `Meta-Llama-3-8B-Instruct-Lite`, the `CoT` method, and the `dummy` dataset.

In [None]:
import os

from lot import sample, calculate

# Replace the placeholder with your own Together.ai API key
os.environ["TOGETHERAI_API_KEY"] = "YOUR KEY HERE"

# Sample reasoning traces
features, metrics = sample(
    model_name="meta-llama/Meta-Llama-3-8B-Instruct-Lite",
    dataset_name="dummy",
    data_path="lot/data/dummy.jsonl",
    method="cot",
    num_samples=10,
    start_index=0,
    end_index=5,
    save_root="./exp-data"
)

# Calculate distance matrices
distance_matrices = calculate(
    model_name="meta-llama/Meta-Llama-3-8B-Instruct-Lite",
    dataset_name="dummy",
    data_path="lot/data/dummy.jsonl",
    method="cot",
    start_index=0,
    end_index=5,
    save_root="./exp-data"
)
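The cell above writes the API key directly into the notebook. If you would rather export the key in your shell and keep it out of the notebook, a small guard like the one below can fail early with a clear message. This is a sketch only: `require_api_key` is a hypothetical helper, not part of `lot`, and it assumes the `TOGETHERAI_API_KEY` variable name used above.

```python
import os


def require_api_key(var_name: str = "TOGETHERAI_API_KEY") -> str:
    """Return the key stored in `var_name`, or raise if it is unset or a placeholder."""
    key = os.environ.get(var_name, "").strip()
    # Treat the notebook's placeholder string as "not configured".
    if not key or key == "YOUR KEY HERE":
        raise RuntimeError(
            f"Set {var_name} in your environment before sampling."
        )
    return key
```

Calling `require_api_key()` at the top of the notebook, before `sample(...)`, turns a missing key into an immediate error instead of a failure on the first remote request.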

<a id="4"></a>
### 1.3 Visualize the Data

You can use the following code to visualize the data calculated in the previous step.

In [None]:
from lot import plot

plot(
    model_name="Meta-Llama-3-8B-Instruct-Lite",
    dataset_name="dummy",
    method="cot",
    save_root="./exp-data",
    output_dir="figures/landscape"
)

## 2. Reproduce the Plot in the Paper
<a id="5"></a>
### 2.1 Data Preparation

We provide the exact data used to produce the plots in our paper. You can download it with the following command.


In [None]:
!git lfs clone git@hf.co:datasets/GazeEzio/Landscape-Data

The expected file tree is:

```
Landscape-of-Thoughts/
│    ...
├──  Landscape-Data/
│    ├── aqua
│    │   ├── distance_matrix
│    │   └── thoughts
│    └──...
│   ...
```
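To confirm the download matches the tree above before loading, a quick check along these lines may help. It is a minimal sketch: `missing_subdirs` and `EXPECTED_SUBDIRS` are hypothetical names introduced here, and the check covers only the two subdirectories shown in the tree for a single dataset.

```python
from pathlib import Path
from typing import List

# Subdirectory names taken from the expected file tree above.
EXPECTED_SUBDIRS = ("distance_matrix", "thoughts")


def missing_subdirs(root: str, dataset: str) -> List[str]:
    """Return the expected subdirectories that are absent under root/dataset."""
    base = Path(root) / dataset
    return [name for name in EXPECTED_SUBDIRS if not (base / name).is_dir()]


missing = missing_subdirs("./Landscape-Data", "aqua")
if missing:
    print(f"Missing under ./Landscape-Data/aqua: {missing}")
```

An empty list means both directories are present; anything else usually points to an incomplete clone or a different working directory.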

In [None]:
from step_3_plot_landscape import draw
from utils.visual_utils import *
ROOT="./Landscape-Data"

print("==> Loading data...")

list_all_T_2D, A_matrix_2D, list_plot_data, list_num_all_thoughts_w_start_list = process_data(
    model='Meta-Llama-3.1-70B-Instruct-Turbo',
    dataset='aqua',
    plot_type='method',
    total_sample=50,
    ROOT=ROOT
)

<a id="6"></a>
### 2.2 Visualization

In [None]:
print("==> Drawing...")
for plot_datas, splited_T_2D, num_all_thoughts_w_start_list in zip(list_plot_data, list_all_T_2D, list_num_all_thoughts_w_start_list):
    fig = draw(
        dataset_name='aqua',
        plot_datas=plot_datas,
        splited_T_2D=splited_T_2D,
        A_matrix_2D=A_matrix_2D,
        num_all_thoughts_w_start_list=num_all_thoughts_w_start_list,
    )
    fig.show()