# 🌌 LLM Thought Landscape Visualization
**Understanding Reasoning Patterns in Large Language Models**  
*Visual Diagnostics for Chain-of-Thought Reasoning*  

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)]()  

## Table of Contents
1. [Background & Motivation](#1)
2. [Environment Setup](#2)
3. [Data Preparation](#3)
4. [Visualization](#4)

- Miscellaneous:
    - [Reasoning Trace Collection](#4.1)
    - [Semantic Distance Computation](#4.2)

<a id="1"></a>
## 1. Background & Motivation

### Why Thought Landscapes?
Modern LLMs demonstrate impressive reasoning capabilities through techniques like chain-of-thought prompting, but their internal decision-making processes remain opaque. This notebook implements our novel visualization method that:

1. **Maps reasoning paths** as 2D landscapes using semantic distance metrics
2. **Identifies critical patterns**: 
   - Model capability boundaries
   - Answer confidence levels
   - Consistency between reasoning steps
3. **Supports model diagnostics** through:
   - Weak/Strong model differentiation
   - Error cluster detection
   - Uncertainty quantification

**Key Innovation**: This is the first method to enable direct visual comparison of different reasoning strategies (CoT, Least-to-Most, ToT, and MCTS) on multiple-choice questions.
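
To make the idea of a 2D landscape concrete, here is a minimal sketch. It assumes each intermediate thought is described by its distances to the answer choices, and it substitutes plain PCA for the projection step; the plotting script in this repository may use a different representation and a different projection method. The random `distances` array is hypothetical stand-in data, not output from the pipeline.

```python
import numpy as np

def project_to_2d(features: np.ndarray) -> np.ndarray:
    """Project row vectors to 2D with PCA computed via SVD (illustrative only)."""
    centered = features - features.mean(axis=0, keepdims=True)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

# Hypothetical data: 6 thoughts, each described by its distance to 4 answer choices.
rng = np.random.default_rng(0)
distances = rng.random((6, 4))
coords = project_to_2d(distances)
print(coords.shape)  # (6, 2)
```

Each row of `coords` is one thought placed on the plane, so thoughts with similar distance profiles land near each other.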

<a id="2"></a>
## 2. Environment Setup


In [None]:
!pip install -q -r https://raw.githubusercontent.com/your_repo/main/requirements.txt
!apt-get install -y git-lfs  # For handling large distance matrices; -y avoids the interactive prompt

import os
TOGETHER_API = "your_key_here" 
os.environ['TOGETHERAI_API_KEY'] = TOGETHER_API
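
Before running the later cells, it can help to confirm that the key is no longer the placeholder. The helper below is a hypothetical convenience, not part of the repository; it only inspects the `TOGETHERAI_API_KEY` variable set above.

```python
import os

PLACEHOLDER = "your_key_here"  # placeholder value used in the setup cell

def api_key_is_configured(env=os.environ, var="TOGETHERAI_API_KEY"):
    """Return True if `var` is set to something other than the placeholder."""
    value = env.get(var, "").strip()
    return bool(value) and value != PLACEHOLDER

if not api_key_is_configured():
    print("TOGETHERAI_API_KEY is missing or still set to the placeholder.")
```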

<a id="3"></a>
## 3. Data Preparation

We provide the same data used to produce the plots in our paper. Download it with the following command:


In [None]:
!git lfs install
!git clone https://huggingface.co/datasets/GazeEzio/Landscape-Data  # HTTPS avoids needing SSH keys; `git lfs clone` is deprecated

The expected file tree is:

```
Landscape-of-Thoughts/
│    ...
├──  Landscape-Data/
│    ├── aqua
│    │   ├── distance_matrix
│    │   └── thoughts
│    └──...
│   ...
```
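
A quick way to check the download against the tree above is sketched here. The function name is hypothetical, and the check covers only the two sub-directories shown for `aqua`; other datasets are assumed to follow the same layout.

```python
from pathlib import Path

def missing_dataset_dirs(data_root, dataset_name):
    """List the expected sub-directories of one dataset that do not exist."""
    dataset_dir = Path(data_root) / dataset_name
    expected = [dataset_dir / "distance_matrix", dataset_dir / "thoughts"]
    return [str(p) for p in expected if not p.is_dir()]

# Run from inside Landscape-of-Thoughts/ after the download finishes.
missing = missing_dataset_dirs("Landscape-Data", "aqua")
if missing:
    print("Missing directories:", missing)
```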

<a id="4"></a>
## 4. Visualization

In [None]:
!python PLOT-landscape.py --model_name {MODEL_NAME} --dataset_name {DATASET_NAME}
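
In a notebook, IPython expands `{MODEL_NAME}` and `{DATASET_NAME}` from Python variables, so both must be defined before this cell runs. Outside a notebook, roughly the same call can be made with `subprocess`. The sketch below uses only the two flags shown above; the helper names are hypothetical, and the example values are the ones this notebook uses elsewhere.

```python
import subprocess
import sys

def build_plot_command(model_name, dataset_name, script="PLOT-landscape.py"):
    """Build the argument list equivalent to the notebook `!python` line."""
    return [
        sys.executable, script,
        "--model_name", model_name,
        "--dataset_name", dataset_name,
    ]

def run_plot(model_name, dataset_name):
    """Run the plotting script; raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_plot_command(model_name, dataset_name), check=True)

cmd = build_plot_command("meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo", "aqua")
print(" ".join(cmd))
```

Passing the arguments as a list avoids shell quoting issues with model names that contain slashes or other special characters.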

<a id="4.2"></a>
### 4.2 Semantic Distance Computation
```python
# @title Compute Distance Matrix
!python step-2-compute-distance-matrix.py \
  --model_name {MODEL_NAME} \
  --dataset_name {DATASET_NAME} \
  --embedding_method semantic  # @param ["semantic", "syntactic"]
```

<a id="4.1"></a>
### 4.1 Reasoning Trace Collection
```python
# @title Generate Reasoning Paths
MODEL_NAME = "meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo"  # @param ["meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo", "gpt-4"]
DATASET_NAME = "aqua"  # @param {type:"string"}

!python step1-sample-reasoning-trace.py \
  --model_name {MODEL_NAME} \
  --dataset_name {DATASET_NAME} \
  --method cot  # @param ["cot", "tot", "pot"]
```