System Design Document

Agentic Reasoning System

Challenge: Saptang Labs – Machine Learning Challenge
Author: Turing Machines
Date: October 2025

1. Introduction

1.1 Purpose

This document outlines the design and architecture of an Agentic Reasoning System (ARS) — an AI system capable of autonomously decomposing, planning, executing, and verifying solutions for logic-based reasoning tasks. Unlike monolithic large language models, ARS performs structured multi-step reasoning by integrating lightweight models, symbolic tools, and rule-based planning.

1.2 Scope

The system aims to:

Decompose complex logic problems into solvable subproblems.
Dynamically select the most appropriate solver or tool.
Verify intermediate and final results.

The final deliverable is a modular, interpretable pipeline optimized for accuracy, transparency, and reproducibility.

2. Objectives

Objective	Description
Problem Decomposition	Identify subcomponents and logical relations within complex problems.
Tool Selection	Match subproblems with suitable solvers (symbolic, numeric, code-based).
Execution	Perform computations, symbolic manipulations, or simulations.
Verification	Check for consistency, dimensional correctness, or logical coherence.
Reasoning Trace Generation	Maintain a full record of all reasoning steps, justifications, and verification outcomes.

3. Restrictions

To ensure innovation in system design rather than LLM dependency:

Prohibited: GPT-4, GPT-5, Claude-3, Gemini Ultra, and equivalent reasoning-heavy APIs.
Permitted:
- Small or base open models (e.g., Phi-3-mini, Mistral-7B, Llama-3-8B-Instruct)
- Symbolic tools: SymPy, Z3, PrologPy, MiniKanren
- Algorithmic and rule-based reasoning components

4. System Architecture

4.1 Overview

The system follows a four-phase architecture integrating planning, reasoning, and verification:

┌──────────────────────────┐
│     Input Interface       │
│ (Natural Language Query)  │
└────────────┬──────────────┘
             ▼
┌──────────────────────────┐
│  1. Problem Decomposer    │
│  - LLM hybrid (T5-small)  │
│  - Generates subproblems  │
└────────────┬──────────────┘
             ▼
┌──────────────────────────┐
│  2. Planner & Tool Mapper │
│  - Builds reasoning graph │
│  - Assigns solvers/tools  │
└────────────┬──────────────┘
             ▼
┌──────────────────────────┐
│  3. Executor & Verifier   │
│  - Runs subtasks          │
│  - Cross-verifies results │
└────────────┬──────────────┘
             ▼
┌──────────────────────────┐
│  4. Reasoning Trace Gen.  │
│  - Logs all steps         │
│  - Produces final answer  │
└──────────────────────────┘

5. Module Descriptions

5.1 Problem Decomposer

Goal: Convert a raw natural language problem into atomic subproblems.

Technique: A T5-small model is trained on GSM8K, which is available at https://raw.githubusercontent.com/openai/grade-school-math/refs/heads/master/grade_school_math/data/train.jsonl

Output Example:

{"question": "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?",
"answer": "Natalia sold 48/2 = <<48/2=24>>24 clips in May.\nNatalia sold 48+24 = <<48+24=72>>72 clips altogether in April and May.\n#### 72"}

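The record above follows the GSM8K convention of inline calculator annotations (`<<expression=result>>`) and a final answer after `####`. As a minimal sketch of how such a record could be split into atomic steps, the snippet below parses one JSONL line; the function name `parse_gsm8k_record` and the returned field names are illustrative assumptions, not part of the repository.

```python
import json
import re

# Matches GSM8K calculator annotations such as <<48/2=24>>.
ANNOTATION = re.compile(r"<<([^=<>]+)=([^<>]+)>>")


def parse_gsm8k_record(line: str) -> dict:
    """Split one train.jsonl line into steps, annotations and final answer."""
    record = json.loads(line)
    # Everything before "####" is the worked solution, everything after is the answer.
    body, _, final = record["answer"].partition("####")
    steps = [s.strip() for s in body.split("\n") if s.strip()]
    return {
        "question": record["question"],
        "steps": steps,
        "annotations": ANNOTATION.findall(body),  # (expression, result) pairs
        "final_answer": final.strip(),
    }


sample = json.dumps({
    "question": "Natalia sold clips to 48 of her friends in April, and then "
                "she sold half as many clips in May. How many clips did "
                "Natalia sell altogether in April and May?",
    "answer": "Natalia sold 48/2 = <<48/2=24>>24 clips in May.\n"
              "Natalia sold 48+24 = <<48+24=72>>72 clips altogether in April and May.\n"
              "#### 72",
})
parsed = parse_gsm8k_record(sample)
print(parsed["annotations"])
print(parsed["final_answer"])
```

Each `(expression, result)` pair is a natural candidate for one subproblem, and the final answer can serve as the training target.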
5.2 Planner & Tool Mapper

Goal: Assign each subproblem to the most efficient solving mechanism. A T5-small model is trained for this step on the MathQA dataset, which is available at https://math-qa.github.io/

Subproblem Type	Tool/Method	Example
Arithmetic	Internal calculator	`200 / (60+40)`
Algebraic	SymPy symbolic solver	Solve for x in `2x + 3 = 7`
Logical	Rule-based inference / Prolog	Deduce from premises
Algorithmic	Python code executor	Simulation or iteration tasks
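The "Internal calculator" row could be realized in several ways. The sketch below is one possibility, assuming the calculator only needs the four basic operators and parentheses: it walks the Python AST instead of calling `eval()`, so anything other than plain arithmetic is rejected. The function name `calculate` is hypothetical.

```python
import ast
import operator

# Whitelisted operators; anything else is rejected.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def calculate(expression: str) -> float:
    """Evaluate a plain arithmetic expression without calling eval()."""
    tree = ast.parse(expression, mode="eval")

    def _eval(node: ast.AST) -> float:
        if isinstance(node, ast.Constant):
            # bool is a subclass of int, so reject it explicitly.
            if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
                return node.value
        elif isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        elif isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"unsupported expression: {expression!r}")

    return _eval(tree.body)


print(calculate("200 / (60+40)"))  # 2.0
```

Function calls, names, and attribute access all raise `ValueError`, which keeps model-generated expressions from running arbitrary code.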

Output Example:

{
  "plan": [
    {"step": "relative_speed", "tool": "calculator"},
    {"step": "meeting_time", "tool": "symbolic_solver"}
  ]
}
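A plan in this shape can be executed by looking up each step's tool in a registry. The following is a minimal sketch under stated assumptions: the `execute_plan` function, the registry layout, and the two stand-in tools are hypothetical, and the stand-ins hard-code the numbers from the `200 / (60+40)` example rather than calling a real solver.

```python
import json
from typing import Callable, Dict

# A tool receives the step name and the results computed so far.
Tool = Callable[[str, dict], float]


def calculator(step: str, context: dict) -> float:
    # Stand-in: adds the two speeds from the example, 60 and 40.
    return 60 + 40


def symbolic_solver(step: str, context: dict) -> float:
    # Stand-in: solves 200 = relative_speed * t for t.
    return 200 / context["relative_speed"]


def execute_plan(plan: dict, registry: Dict[str, Tool]) -> dict:
    """Run the plan's steps in order, passing earlier results forward."""
    results: dict = {}
    for item in plan["plan"]:
        tool = registry.get(item["tool"])
        if tool is None:
            raise KeyError(f"no tool registered as {item['tool']!r}")
        results[item["step"]] = tool(item["step"], results)
    return results


plan = json.loads("""
{
  "plan": [
    {"step": "relative_speed", "tool": "calculator"},
    {"step": "meeting_time", "tool": "symbolic_solver"}
  ]
}
""")
registry = {"calculator": calculator, "symbolic_solver": symbolic_solver}
results = execute_plan(plan, registry)
print(results)  # {'relative_speed': 100, 'meeting_time': 2.0}
```

Keeping the registry separate from the plan means a new tool type can be added without touching the planner's output format.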

5.3 Executor & Verifier

Goal: Execute subtasks, verify results, and check intermediate consistency. This module uses the following dataset: https://huggingface.co/datasets/D3xter1922/proofwriter-dataset

Verification Strategies:

Redundant evaluation: Use both numeric and symbolic solvers.
Tolerance-based check: abs(result_1 - result_2) < ε.
Dimensional analysis: Ensure unit consistency.
Logic equivalence: Validate logical expressions via Z3 or truth tables.
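Two of these strategies are simple enough to sketch with the standard library alone. The snippet below shows a tolerance-based check and a truth-table equivalence test; the function names and the default `eps` value are illustrative assumptions, and a Z3-based check would replace the exhaustive enumeration for larger formulas.

```python
import itertools
from typing import Callable


def within_tolerance(result_1: float, result_2: float, eps: float = 1e-9) -> bool:
    """Tolerance-based check: abs(result_1 - result_2) < eps."""
    return abs(result_1 - result_2) < eps


def logically_equivalent(f: Callable[..., bool], g: Callable[..., bool],
                         num_vars: int) -> bool:
    """Compare two boolean functions on every row of the truth table."""
    return all(
        f(*values) == g(*values)
        for values in itertools.product([False, True], repeat=num_vars)
    )


# Floating-point results from two solvers rarely match bit for bit.
print(within_tolerance(0.1 + 0.2, 0.3))  # True

# De Morgan's law: not (a and b) is equivalent to (not a) or (not b).
print(logically_equivalent(
    lambda a, b: not (a and b),
    lambda a, b: (not a) or (not b),
    num_vars=2,
))  # True
```

The truth table grows as 2^n rows, so this approach only suits expressions with a handful of variables.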

Output Example:

{
  "execution_results": {
    "relative_speed": "100 km/h",
    "meeting_time": "2 hours"
  },
  "verification": "passed"
}

6. Tool Registry

Tool Name	Type	Library	Capability
SymPy	Symbolic	`sympy`	Algebraic & calculus-based reasoning
Z3 Solver	Logical	`z3-solver`	Logic and constraint satisfaction
Python Executor	Code	`exec()` sandbox	Algorithmic subtask execution
NumPy/Math	Numeric	`numpy`, `math`	Fast computation and array logic
MiniProlog	Rule-based	`prologpy`	Deductive inference tasks

7. 📊 Data Flow

Input: Problem in text form.
Decomposition: Identify structure and subproblems.
Planning: Select sequence and tools.
Execution: Perform calculations/symbolic solutions.
Verification: Validate correctness and consistency.
Reasoning Trace: Construct human-readable explanation.
Output: Final answer + reasoning trace.
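The flow above can be pictured as a chain of stages that each append to a shared trace. This is a minimal sketch, assuming every stage is a function of the previous stage's output; `Trace`, `run_pipeline`, and the toy lambda stages are hypothetical placeholders for the real modules.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Tuple


@dataclass
class Trace:
    """Ordered record of (stage name, stage output) pairs."""
    steps: List[Tuple[str, Any]] = field(default_factory=list)

    def log(self, stage: str, output: Any) -> None:
        self.steps.append((stage, output))


def run_pipeline(problem: str,
                 stages: List[Tuple[str, Callable[[Any], Any]]]) -> Tuple[Any, Trace]:
    """Feed the problem through each stage in order, logging every output."""
    trace = Trace()
    trace.log("input", problem)
    value: Any = problem
    for name, stage in stages:
        value = stage(value)
        trace.log(name, value)
    return value, trace


# Toy stages with hard-coded behaviour, standing in for the real modules.
stages = [
    ("decomposition", lambda text: [("april", 48), ("may", 48 // 2)]),
    ("planning", lambda subs: [(name, value, "calculator") for name, value in subs]),
    ("execution", lambda plan: sum(value for _, value, _ in plan)),
    ("verification", lambda total: total if total == 48 + 24 else None),
]
answer, trace = run_pipeline("How many clips did Natalia sell altogether?", stages)
print(answer)  # 72
for stage_name, output in trace.steps:
    print(stage_name, "->", output)
```

The trace doubles as the human-readable explanation: printing it stage by stage gives the reasoning record that accompanies the final answer.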

8. Example Walkthrough

Input:

"A box contains 5 red, 3 blue, and 2 green balls. If one ball is drawn at random, what is the probability it is not green?"

Pipeline Trace:

Decompose:
- Identify total balls = 5 + 3 + 2 = 10.
- Identify favorable = not green → 8.
- Apply probability formula P = favorable / total.
Select Tools:
- Use symbolic/numeric calculator.
Execute:
- P = 8 / 10 = 0.8.
Verify:
- Alternate check via complementary probability: 1 - (2/10) = 0.8 → matches.
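The same trace can be reproduced in a few lines. This sketch uses exact fractions so the direct and complementary results can be compared without a tolerance; the variable names are illustrative only.

```python
from fractions import Fraction

counts = {"red": 5, "blue": 3, "green": 2}

# Decompose: total balls and favorable (not green) outcomes.
total = sum(counts.values())
favorable = total - counts["green"]

# Execute: P = favorable / total.
p_direct = Fraction(favorable, total)

# Verify: complementary probability, 1 - P(green).
p_complement = 1 - Fraction(counts["green"], total)
assert p_direct == p_complement

print(p_direct, float(p_direct))  # 4/5 0.8
```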

9. Implementation Plan

Phase	Deliverable	Description
Phase 1	Core architecture	Implement decomposer, planner, and executor modules.
Phase 2	Tool integration	Add symbolic, logical, and numerical solvers.
Phase 3	Verification logic	Implement redundancy checks and tolerance metrics.

10. Evaluation Metrics

Metric	Definition
Accuracy	Correct final answers on dataset
Verification Score	% of outputs validated successfully
Interpretability	Clarity of reasoning trace
Modularity	Ease of extension to new tool types
Reproducibility	Ease of running pipeline from scratch
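The two quantitative metrics can be computed directly from pipeline outputs. The sketch below assumes answers are compared as whitespace-trimmed strings and that each run reports a status string such as "passed"; both conventions, and the function names, are assumptions rather than the project's actual evaluation code.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of final answers that exactly match the reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references differ in length")
    if not references:
        return 0.0
    correct = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return correct / len(references)


def verification_score(statuses: Sequence[str]) -> float:
    """Percentage of outputs whose verification status is 'passed'."""
    if not statuses:
        return 0.0
    return 100.0 * sum(s == "passed" for s in statuses) / len(statuses)


print(accuracy(["72", "10", "5"], ["72", "10", "6"]))
print(verification_score(["passed", "passed", "failed", "passed"]))  # 75.0
```

Interpretability, modularity, and reproducibility are qualitative and would need a separate rubric.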

11. Innovation Highlights

Hybrid rule-based + symbolic + neural reasoning.
Self-verifying computation via dual-tool crosschecks.
Transparent reasoning graph instead of hidden chains.
Designed for explainability and scientific rigor.

12. Folder Structure

agentic_reasoning_system/
├── decomposer/
│   └── llm_based.py
├── Tool Mapper/
│   └── tool_selector.py
├── Verification & Logical Reasoning/
│   └── consistency_checker.py
├── datasets/
├── main.py
└── README.md

13. Future Extensions

Integration with graph-based reasoning memory (storing solved subpatterns).
Adaptive tool learning: system updates tool-selection heuristics from past success rates.
Expansion to multi-agent reasoning: planner + verifier + critic agents.

14. Conclusion

This Agentic Reasoning System bridges symbolic, algorithmic, and lightweight neural reasoning to produce reliable, interpretable, and verifiable solutions to logical problems. It emphasizes planning, transparency, and modularity — fulfilling the core objectives of the Saptang Labs Machine Learning Challenge while adhering to its restrictions on pre-trained reasoning-heavy LLMs.
