tot

Here is 1 public repository matching this topic...

The-Swarm-Corporation / DPO-MCTS-ToT-Training

This module implements a post-training mechanism that lets a language model explore multiple reasoning branches (chains of thought) using a Monte Carlo Tree Search (MCTS) framework. It then selects the branch whose answer scores highest under a cosine-similarity evaluator, which compares each candidate answer to a known correct answer.

multi-agent swarms tot agents dpo

Updated Feb 11, 2025
Python
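
The selection step in the description above can be illustrated with a minimal sketch. It is not taken from the repository: the function names `cosine_similarity` and `select_best_branch` are hypothetical, and bag-of-words count vectors stand in for whatever text representation the project actually uses. The MCTS search that produces the candidate branches is left out; only the final scoring against a known correct answer is shown.

```python
import math
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between two texts, using word-count vectors.

    Assumption: whitespace-split, lowercased tokens. A real evaluator
    would more likely compare embedding vectors.
    """
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    # Dot product over the tokens the two texts share.
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


def select_best_branch(candidates: list[str], reference: str) -> str:
    """Return the candidate answer most similar to the reference answer."""
    return max(candidates, key=lambda ans: cosine_similarity(ans, reference))


if __name__ == "__main__":
    # Hypothetical final answers from three reasoning branches.
    branches = [
        "the total is 12 apples",
        "there are 7 apples left",
        "I am not sure",
    ]
    print(select_best_branch(branches, "7 apples are left"))
```

Because the evaluator needs a known correct answer, a scheme like this fits training or evaluation data with labels, not open-ended inference.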
