StructFact: Benchmarking Structured Factual Reasoning in Large Language Models

Introduction

Large Language Models (LLMs) demonstrate remarkable capabilities on NLP tasks but face significant challenges when reasoning over structured factual knowledge. Structured data introduces several characteristics that complicate LLM reasoning:

  1. Heterogeneity - Mixed data types (text, numbers, dates)
  2. Topological Interdependencies - Complex structural relationships
  3. Order Invariance - Permutation-invariant semantics
  4. Sparsity - Handling missing values
  5. Lack of Prior Knowledge - Domain-specific context sensitivity

To address these challenges, we present StructFact - a comprehensive benchmark with:

  • 📊 13,407 factual queries across diverse structures (tables/lists/graphs)
  • 🌍 Multi-domain coverage with temporal/regional variations
  • 🧩 5 reasoning tasks: Arithmetic Calculation, Geography-Time Reasoning, Multi-hop Reasoning, Composition Understanding, and Combining Structured and Unstructured Reasoning
  • 🆕 StructFact-Unseen subset for testing generalization on fresh knowledge

File Structure

├── data/
│   └── dataset_demo.json    # Sample dataset entries
├── src/
│   ├── cal_option.py        # Metric calculation script
│   └── run_llm.py           # Model inference script
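
To see what an entry looks like, you can print one record from the demo file. The snippet below assumes only that dataset_demo.json holds a JSON array, and lets the file itself reveal the actual field names:

    import json

    # Inspect a sample record from the demo file. Assumes the file is a
    # JSON array of entries; adjust the loading if it is keyed differently.
    with open("data/dataset_demo.json", "r", encoding="utf-8") as f:
        entries = json.load(f)

    print(f"{len(entries)} demo entries")
    print(json.dumps(entries[0], indent=2, ensure_ascii=False))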

Usage

  1. Run Inference
    Configure your LLM in run_llm.sh:
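
    For example, with hypothetical placeholder variables (run_llm.sh defines its own actual options; open the script to see what it expects):

    # Hypothetical placeholders -- these names are illustrative, not the
    # script's real settings; set whatever run_llm.sh actually uses.
    MODEL_NAME="your-model-name"        # model to evaluate
    API_KEY="..."                       # credential, if the model is API-served
    DATA_PATH="data/dataset_demo.json"  # input queries
    OUTPUT_DIR="outputs"                # where predictions are written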

    Then execute:

    chmod +x run_llm.sh
    ./run_llm.sh
  2. Calculate Metrics
    Generate accuracy and task-specific metrics:

    python src/cal_option.py /path/to/your_llm_output
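
    To sanity-check the result, the sketch below shows one way option-style accuracy could be computed. The "prediction" and "answer" keys are assumptions about the output format, not necessarily the schema cal_option.py reads:

    import json
    import sys

    # Minimal accuracy sketch over a JSON list of result records.
    # The "prediction"/"answer" key names are hypothetical; rename them
    # to match the fields your run actually produces.
    def accuracy(path: str) -> float:
        with open(path, "r", encoding="utf-8") as f:
            records = json.load(f)
        correct = sum(
            1 for r in records
            if str(r.get("prediction", "")).strip().upper()
            == str(r.get("answer", "")).strip().upper()
        )
        return correct / len(records) if records else 0.0

    if __name__ == "__main__":
        print(f"accuracy = {accuracy(sys.argv[1]):.4f}")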
