# llm-reasoning

Here are 20 public repositories matching this topic...

[AAAI 2025] ORQA is a new QA benchmark designed to assess the reasoning capabilities of LLMs in the specialized technical domain of Operations Research (OR). The benchmark evaluates whether LLMs can emulate the knowledge and reasoning skills of OR experts when presented with complex optimization modeling tasks.
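To make the evaluation idea concrete, here is a minimal sketch of how answers to a multiple-choice QA benchmark like this might be scored. ORQA's actual data format and metric are not specified here; the function name and the exact-match metric are illustrative assumptions.

```python
# Hypothetical sketch: scoring a model's multiple-choice answers against a
# gold answer key. This is not ORQA's actual evaluation code.

def exact_match_accuracy(predictions, gold):
    """Fraction of questions where the model's chosen option matches the key."""
    assert len(predictions) == len(gold), "one prediction per question"
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)

# Example: a model answering 3 of 4 optimization-modeling questions correctly.
print(exact_match_accuracy(["B", "D", "A", "C"], ["B", "D", "A", "A"]))  # 0.75
```

Accuracy over a fixed answer key is the simplest headline metric for such a benchmark; reasoning-focused benchmarks often also inspect the intermediate reasoning steps, which this sketch does not attempt.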

  • Updated Mar 20, 2025

We introduce a benchmark for testing how well LLMs can find vulnerabilities in cryptographic protocols. By combining LLMs with symbolic reasoning tools like Tamarin, we aim to improve the efficiency and thoroughness of protocol analysis, paving the way for future AI-powered cybersecurity defenses.

  • Updated Nov 4, 2024
  • Haskell
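The combination described above, an LLM proposing candidate vulnerabilities and a symbolic tool confirming them, can be sketched as a simple propose-and-verify loop. The `propose` and `check` callables below are stand-ins for an LLM call and a verifier invocation (e.g. running Tamarin on a candidate trace); they are assumptions for illustration, not the repository's actual API.

```python
# Hypothetical sketch of an LLM + symbolic-checker loop. `propose` and
# `check` are illustrative stand-ins, not this project's real interface.

def analyze(protocol, propose, check, max_rounds=3):
    """Ask the LLM for candidate vulnerabilities and keep only those the
    symbolic checker confirms, feeding verifier feedback back to the LLM."""
    confirmed = []
    feedback = None
    for _ in range(max_rounds):
        candidate = propose(protocol, feedback)    # LLM suggests an attack
        ok, feedback = check(protocol, candidate)  # symbolic tool verifies it
        if ok:
            confirmed.append(candidate)
    return confirmed
```

The point of the loop is that the symbolic checker supplies ground truth: the LLM's suggestions gain soundness only once a tool like Tamarin confirms them, which is what makes the combination more thorough than either component alone.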
