<table style="width:100%">
<tr>
<td style="vertical-align:middle; text-align:left;">
<font size="2">
Supplementary code for the <a href="https://mng.bz/lZ5B">Build a Reasoning Model (From Scratch)</a> book by <a href="https://sebastianraschka.com">Sebastian Raschka</a><br>
<br>Code repository: <a href="https://github.com/rasbt/reasoning-from-scratch">https://github.com/rasbt/reasoning-from-scratch</a>
</font>
</td>
<td style="vertical-align:middle; text-align:left;">
<a href="https://mng.bz/lZ5B"><img src="https://sebastianraschka.com/images/reasoning-from-scratch-images/cover-small.webp" width="100px"></a>
</td>
</tr>
</table>

# Advanced Parser Comparison

- This notebook compares the current parser from chapter 3 against the hybrid LaTeX parser from issue [#133](https://github.com/rasbt/reasoning-from-scratch/issues/133) on selected MATH-500-style answers.

In [1]:
import reasoning_from_scratch.ch03 as ch03
import reasoning_from_scratch.bonus.parser as bonus

backend_ok = bonus.normalize_text_hybrid(r'\frac{1}{2}') == '1/2'
print('LaTeX backend ready:', backend_ok)

LaTeX backend ready: True


- If the above prints `False`, install the ANTLR4 Python runtime by uncommenting and executing the following line
- You will then need to restart the kernel and rerun the notebook

In [2]:
# !uv pip install "antlr4-python3-runtime==4.11.*"
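
- Before installing anything, it can help to see whether the runtime is already importable. The following is a minimal sketch, assuming the `antlr4-python3-runtime` package exposes the import name `antlr4`; the helper `antlr_runtime_status` is a hypothetical name and not part of the book's code.

```python
import importlib.util
from importlib import metadata


def antlr_runtime_status():
    """Return (installed, version) for the ANTLR4 Python runtime.

    Hypothetical helper for illustration; not part of reasoning_from_scratch.
    """
    # find_spec returns None when the top-level module cannot be found
    installed = importlib.util.find_spec("antlr4") is not None
    version = None
    if installed:
        try:
            # Distribution name as used in the pip install line above
            version = metadata.version("antlr4-python3-runtime")
        except metadata.PackageNotFoundError:
            version = None
    return installed, version


installed, version = antlr_runtime_status()
print("antlr4 importable:", installed, "| version:", version)
```

- If the module is importable but the backend check still prints `False`, the installed version may differ from the `4.11.*` pin shown above.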

&nbsp;
## Self-parse check

- First, we check that each parser grades the reference answers as correct when they are compared against themselves

In [3]:
math_data = ch03.load_math500_test()

In [4]:
# Original parser

correct = 0

for entry in math_data:
    boxed = f"\\boxed{{{entry['answer']}}}"
    extract = ch03.extract_final_candidate(boxed)
    correct += ch03.grade_answer(extract, entry["answer"])

print(f"{correct} out of {len(math_data)} correct")

500 out of 500 correct
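
- The loop above wraps each reference answer in `\boxed{...}` and then extracts it again. As a rough illustration of what such an extraction step involves, here is a brace-matching sketch. It is an assumption for illustration only, not the actual implementation of `ch03.extract_final_candidate`, and `extract_last_boxed` is a hypothetical name.

```python
def extract_last_boxed(text):
    """Return the contents of the last \\boxed{...} in text, or None.

    Hypothetical stand-in for illustration; not the chapter 3 implementation.
    """
    marker = r"\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None

    depth = 1  # we are inside the opening brace of \boxed{
    chars = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                # Matching close brace of \boxed{ found
                return "".join(chars)
        chars.append(ch)
    return None  # unbalanced braces


example = r"Therefore the answer is \boxed{\frac{14}{3}}."
print(extract_last_boxed(example))
```

- Counting brace depth matters because answers such as `\frac{14}{3}` contain nested braces, so stopping at the first `}` would cut the answer short.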


In [5]:
# Hybrid parser

correct = 0

for entry in math_data:
    boxed = f"\\boxed{{{entry['answer']}}}"
    extract = ch03.extract_final_candidate(boxed)

    pred = bonus.normalize_text_hybrid(extract)
    gold = bonus.normalize_text_hybrid(entry["answer"])
    correct += int(pred == gold)

print(f"{correct} out of {len(math_data)} correct")

500 out of 500 correct


&nbsp;
## Answer-solution check

- Same as above, but extracts the final answer from the provided solution text and checks it against the reference answer, instead of checking the reference answers against themselves

In [15]:
# Original parser

correct = 0

for entry in math_data:
    boxed = f"{entry['solution']}"
    extract = ch03.extract_final_candidate(boxed)
    correct += ch03.grade_answer(extract, entry["answer"])

print(f"{correct} out of {len(math_data)} correct")

500 out of 500 correct


In [16]:
# Hybrid parser

correct = 0

for entry in math_data:
    boxed = f"{entry['solution']}"
    extract = ch03.extract_final_candidate(boxed)

    pred = bonus.normalize_text_hybrid(extract)
    gold = bonus.normalize_text_hybrid(entry["answer"])
    correct += int(pred == gold)

print(f"{correct} out of {len(math_data)} correct")

500 out of 500 correct


&nbsp;
## Answer-prediction check

- Same as above, but checks LLM-generated predictions against the reference answers

In [27]:
math500_llm_answers = ch03.load_math500_test(local_path="math500_llm_answer.json")

In [28]:
some_math_samples = [
    {"answer": r"\left( 3, \frac{\pi}{2} \right)", "prediction": r"\boxed{(3,pi/2)}"},
    {"answer": "p - q", "prediction": r"\boxed{p-q}"},
    {"answer": r"\frac{14}{3}", "prediction": r"\boxed{14/3}"},
    {"answer": "9", "prediction": r"\boxed{9}"},
    {"answer": r"\text{Evelyn}", "prediction": r"\boxed{Evelyn}"},
    {"answer": "42", "prediction": r"\boxed{42}"},
    {"answer": "27", "prediction": r"\boxed{27}"},
    {"answer": r"90^\circ", "prediction": r"\boxed{90}"},
    {"answer": r"3\sqrt{13}", "prediction": r"\boxed{3*sqrt(13)}"},
    {"answer": "4", "prediction": r"\boxed{4}"},
    {"answer": "2220", "prediction": r"\boxed{2220}"},
    {"answer": r"\frac{3}{56}", "prediction": r"\boxed{3/56}"},
    {"answer": "284", "prediction": r"\boxed{284}"},
    {"answer": "5", "prediction": r"\boxed{5}"},
    {"answer": r"\sqrt{51}", "prediction": r"\boxed{sqrt(51)}"},
    {"answer": "6 - 5i", "prediction": r"\boxed{6-5i}"},
    {"answer": "-50", "prediction": r"\boxed{-50}"},
    {"answer": r"\pi", "prediction": r"\boxed{pi}"},
    {"answer": "28", "prediction": r"\boxed{28}"},
    {"answer": "3", "prediction": r"\boxed{3}"},
    {"answer": "6+9i", "prediction": r"\boxed{6+9i}"},
    {"answer": "13535", "prediction": r"\boxed{13535}"},
    {"answer": "5", "prediction": r"\boxed{5}"},
    {"answer": "x=5", "prediction": r"\boxed{x=5}"},
    {"answer": "10", "prediction": r"\boxed{10}"},
    {"answer": "1,-2", "prediction": r"\boxed{1,-2}"},
    {"answer": "144", "prediction": r"\boxed{144}"}
]

In [29]:
# Original parser

correct = 0

for entry in math500_llm_answers:
    boxed = f"{entry['prediction']}"
    extract = ch03.extract_final_candidate(boxed)
    correct += ch03.grade_answer(extract, entry["answer"])

print(f"{correct} out of {len(math500_llm_answers)} correct")

457 out of 500 correct


In [30]:
# Hybrid parser

correct = 0

for entry in math500_llm_answers:
    boxed = f"{entry['prediction']}"
    extract = ch03.extract_final_candidate(boxed)
    
    pred = bonus.normalize_text_hybrid(extract)
    gold = bonus.normalize_text_hybrid(entry["answer"])
    correct += int(pred == gold)

print(f"{correct} out of {len(math500_llm_answers)} correct")

458 out of 500 correct
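
- The two totals differ by one, but the totals alone do not show which entries the parsers grade differently. A minimal sketch for listing such entries follows; `find_disagreements` and the two toy graders are hypothetical stand-ins so that the example runs without the book's package.

```python
def find_disagreements(samples, grade_a, grade_b):
    """Return the samples on which two graders give different verdicts."""
    rows = []
    for idx, entry in enumerate(samples):
        a = bool(grade_a(entry["prediction"], entry["answer"]))
        b = bool(grade_b(entry["prediction"], entry["answer"]))
        if a != b:
            rows.append({"index": idx, "a": a, "b": b, **entry})
    return rows


# Toy graders (assumptions for illustration, not the notebook's parsers)
def exact_match(pred, gold):
    return pred == gold


def whitespace_insensitive(pred, gold):
    return "".join(pred.split()) == "".join(gold.split())


toy_samples = [
    {"answer": "p - q", "prediction": "p-q"},
    {"answer": "42", "prediction": "42"},
    {"answer": "6 - 5i", "prediction": "6+5i"},
]

diffs = find_disagreements(toy_samples, exact_match, whitespace_insensitive)
for row in diffs:
    print(row)
```

- To apply this to the notebook's data, one could pass two small wrapper functions: one that calls `ch03.extract_final_candidate` followed by `ch03.grade_answer`, and one that compares the `bonus.normalize_text_hybrid` outputs of the extracted prediction and the reference answer.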


&nbsp;
## LaTeX edge case check

- These are tricky edge cases with LaTeX-heavy answers

In [31]:
edge_cases = [
    {"answer": r"52_8", "prediction": r"\boxed{52_8}"},
    {"answer": r"11,\! 111,\! 111,\! 100", "prediction": r"\boxed{11,\!111,\!111,\!100}"},
    {"answer": r"x=5", "prediction": r"\boxed{x=5}"},
    {"answer": r"11\sqrt2", "prediction": r"\boxed{11\cdot\sqrt{2}}"},
    {"answer": r"(0,9) \cup (9,36)", "prediction": r"\boxed{(0,9)\cup(9,36)}"},
    {"answer": r"\begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}", "prediction": r"\boxed{\begin{pmatrix}-1&0\\0&-1\end{pmatrix}}"},
    {"answer": r"\pi", "prediction": r"\boxed{\pi}"},
    {"answer": r"\frac{14}{3}", "prediction": r"\boxed{\dfrac{14}{3}}"},
]
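
- To see why pairs like these are tricky, consider a deliberately tiny string-based normalizer. This is a toy sketch under stated assumptions: `toy_normalize` is hypothetical and is neither the chapter 3 parser nor the hybrid parser.

```python
import re


def toy_normalize(text):
    """Hypothetical, deliberately small normalizer for illustration only."""
    text = text.replace(r"\dfrac", r"\frac")  # display-style fraction -> plain fraction
    text = text.replace(r"\!", "")            # negative thin space used between digit groups
    text = re.sub(r"\s+", "", text)           # drop all whitespace
    return text


pairs = [
    (r"\frac{14}{3}", r"\dfrac{14}{3}"),
    (r"11,\! 111,\! 111,\! 100", r"11,\!111,\!111,\!100"),
    (r"(0,9) \cup (9,36)", r"(0,9)\cup(9,36)"),
    (r"11\sqrt2", r"11\cdot\sqrt{2}"),
]

results = []
for gold, pred in pairs:
    raw_equal = gold == pred
    normalized_equal = toy_normalize(gold) == toy_normalize(pred)
    results.append((raw_equal, normalized_equal))
    print(f"{gold!r} vs {pred!r}: raw={raw_equal}, normalized={normalized_equal}")
```

- The first three pairs match after simple string cleanup, but `11\sqrt2` and `11\cdot\sqrt{2}` only match once the expressions are interpreted structurally, which is the kind of case a LaTeX-aware parser is meant to handle.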

In [32]:
# Original parser

correct = 0

for entry in edge_cases:
    boxed = f"{entry['prediction']}"
    extract = ch03.extract_final_candidate(boxed)
    correct += ch03.grade_answer(extract, entry["answer"])

print(f"{correct} out of {len(edge_cases)} correct")

4 out of 8 correct


In [33]:
# Hybrid parser

correct = 0

for entry in edge_cases:
    boxed = f"{entry['prediction']}"
    extract = ch03.extract_final_candidate(boxed)
    
    pred = bonus.normalize_text_hybrid(extract)
    gold = bonus.normalize_text_hybrid(entry["answer"])
    correct += int(pred == gold)

print(f"{correct} out of {len(edge_cases)} correct")

8 out of 8 correct
