# Mappr

> Scale up evaluation report mapping against evaluation frameworks using agentic workflows


::: {.callout-warning}
This notebook is a work in progress.
:::

## Approach

**Problem**: Manually mapping evaluation reports against IOM's Strategic Results Framework (SRF) is time-consuming and resource-intensive, since each report must be checked against ~150 outputs.

**Solution**: A multi-stage pipeline that uses the Global Compact for Migration (GCM) objectives as a pruning mechanism:

### Stage 1: Sequential Classification
1. **Enablers & Cross-cutting Priorities** (11 items) - Fast parallel analysis to identify meta-evaluation nature
2. **GCM Objectives** (23 items) - Informed by the results of step 1, uses condensed representations for efficiency

### Stage 2: Targeted SRF Analysis
- Use GCM results + `gcm_srf_lut` to prune ~150 SRF outputs to ~20-50 relevant ones
- Deep analysis with full hierarchy context (objective → outcome → output → indicators)
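
The pruning step can be sketched as follows. This is only an illustration under stated assumptions: the lookup table is assumed to map a GCM objective id to a list of SRF output ids, and both `prune_srf_outputs` and the ids in `toy_lut` are hypothetical. The actual shape of `gcm_srf_lut` may differ.

```python
def prune_srf_outputs(gcm_hits, gcm_srf_lut):
    "Collect the SRF output ids linked to any matched GCM objective (order-preserving, no duplicates)"
    seen = {}  # dict used as an ordered set
    for gcm_id in gcm_hits:
        # Objectives missing from the table contribute nothing
        for srf_id in gcm_srf_lut.get(gcm_id, []):
            seen.setdefault(srf_id, None)
    return list(seen)

# Hypothetical lookup table: GCM objective id -> SRF output ids (ids are invented)
toy_lut = {
    "7": ["1a11", "1a12"],
    "21": ["1a12", "3b21"],
    "23": ["2c11"],
}

# Suppose Stage 1 flagged GCM objectives 7 and 21 as relevant
candidates = prune_srf_outputs(["7", "21"], toy_lut)
print(candidates)
```

Taking the union rather than the intersection keeps the candidate set generous, which matches the stated preference for over-inclusion.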

### Key Features
- **Agentic workflow**: LLM navigates document headings, explores sections iteratively until confident
- **DSPy signatures**: Structured reasoning with built-in tracing for evaluation
- **Rate-limited parallelization**: Respects API constraints (15 RPM) using fastcore
- **False positive bias**: Better to over-include than miss relevant mappings
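
One way to combine parallel calls with a requests-per-minute cap is sketched below. It uses only the standard library rather than fastcore, and `RateLimiter` and `rate_limited_map` are hypothetical names introduced for illustration, not part of this module.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

class RateLimiter:
    "Hypothetical helper: space calls at least 60/rpm seconds apart across threads"
    def __init__(self, rpm):
        self.interval = 60.0 / rpm
        self.lock = threading.Lock()
        self.next_slot = time.monotonic()

    def wait(self):
        # Reserve the next free time slot under the lock, then sleep outside it
        with self.lock:
            now = time.monotonic()
            slot = max(now, self.next_slot)
            self.next_slot = slot + self.interval
        time.sleep(max(0.0, slot - now))

def rate_limited_map(fn, items, rpm=15, n_workers=4):
    "Apply `fn` to `items` in parallel threads without exceeding `rpm` call starts per minute"
    limiter = RateLimiter(rpm)
    def call(item):
        limiter.wait()
        return fn(item)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(call, items))  # results keep the input order

# Demo with a high rate so it finishes quickly; an LLM call would replace str.upper
demo = rate_limited_map(str.upper, ["a", "b", "c"], rpm=6000)
print(demo)
```

The limiter only spaces out when calls start, so slow responses still overlap across workers while the request rate stays under the cap.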

In [None]:
#| default_exp mappr

In [None]:

#| exports
from pathlib import Path
from functools import reduce
from toolslm.md_hier import *
from rich import print
import json
from fastcore.all import *

from typing import List
import dspy

from evaluatr.frameworks import IOMEvalData

In [None]:
#| exports
from dotenv import load_dotenv
import os

load_dotenv()
GEMINI_API_KEY = os.getenv('GEMINI_API_KEY')

In [None]:
#| exports
cfg = AttrDict({
    'lm': 'gemini/gemini-2.0-flash-exp',
    'api_key': GEMINI_API_KEY,
    'max_tokens': 8192,
    'track_usage': False,
})

In [None]:
#| eval: false
doc = Path("../_data/md_library/49d2fba781b6a7c0d94577479636ee6f/abridged_evaluation_report_final_olta_ndoja_pdf/enriched")
pages = doc.ls(file_exts=".md").sorted(key=lambda p: int(p.stem.split('_')[1]))
report = '\n\n---\n\n'.join(page.read_text() for page in pages)
print(report[:1000])

## Hierarchical report navigation

Because the `report` markdown has a clean heading structure, `toolslm.md_hier` can turn it into a nested dictionary of sections, subsections, and so on:

In [None]:
#| eval: false
hdgs = create_heading_dict(report); hdgs

{'PPMi .... page 1': {},
 'CONTENTS .... page 3': {},
 '1. Introduction .... page 4': {},
 '2. Background of the JI-HoA .... page 5': {'2.1. Context and design of the JI-HoA .... page 5': {},
  '2.2. External factors affecting the implementation of the JI .... page 7': {}},
 '3. Methodology .... page 8': {},
 '4. Findings .... page 10': {'4.1. Relevance .... page 10': {'4.1.1. Relevance of programme activities for migrants, returnees, and communities .... page 10': {}},
  'Overall performance score for relevance: $3.9 / 5$ <br> Robustness score for the evidence: $4.5 / 5$': {'4.1.1.1 Needs of migrants .... page 10': {},
   '4.1.1.2 Needs of returnees .... page 10': {},
   '4.1.1.3 Needs of community members .... page 12': {},
   "4.1.2. Programme's relevance to the needs of stakeholders .... page 12": {'4.1.2.1 Needs of governments .... page 12': {},
    '4.1.2.2 Needs of other stakeholders .... page 13': {}},
   '4.2. Coherence .... page 13': {"4.2.1. The JI-HoA's alignment with the o

In [None]:
#| exports
def find_section_path(
    hdgs: dict, # The nested dictionary structure
    target_section: str # The section name to find
):
    "Find the nested key path for a given section name"
    def search_recursive(current_dict, path=()):
        for key, value in current_dict.items():
            current_path = [*path, key]  # new list per level, so the default is never mutated
            if key == target_section:
                return current_path
            if isinstance(value, dict):
                result = search_recursive(value, current_path)
                if result:
                    return result
        return None
    
    return search_recursive(hdgs)

We can then retrieve the path to a subsection, i.e. the list of nested headings that lead to it in the `hdgs` dict:

In [None]:
#| eval: false
path = find_section_path(hdgs, "4.1.1.1 Needs of migrants .... page 10"); path

['4. Findings .... page 10',
 'Overall performance score for relevance: $3.9 / 5$ <br> Robustness score for the evidence: $4.5 / 5$',
 '4.1.1.1 Needs of migrants .... page 10']

Then retrieve the specific subsection content:

In [None]:
#| exports
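def get_content_tool(
    hdgs: dict, # The nested dictionary structure
    keys_list: list, # Ordered heading keys leading to the section, e.g. from `find_section_path`
):
    "Navigate through nested levels using the exact key strings and return the section text"
    return reduce(lambda current, key: current[key], keys_list, hdgs).text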

In [None]:
#| eval: false
content = get_content_tool(hdgs, path)
print(content[:500])
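
The two steps above (find the path, then walk it) can be tried without a real report. In the sketch below, `Node` is a hypothetical stand-in for a heading node, assumed to behave like a dict of children that also carries a `text` attribute. `find_path` is a compact re-statement of the search for illustration only.

```python
from functools import reduce

class Node(dict):
    "Hypothetical heading node: a dict of child headings plus the section `text`"
    def __init__(self, text, children=None):
        super().__init__(children or {})
        self.text = text

toy = Node("", {
    "1. Intro": Node("# 1. Intro\nWelcome."),
    "2. Findings": Node("# 2. Findings", {
        "2.1 Relevance": Node("## 2.1 Relevance\nHighly relevant."),
    }),
})

def find_path(tree, target, path=()):
    "Depth-first search returning the list of keys that leads to `target`, or None"
    for key, child in tree.items():
        here = (*path, key)
        if key == target:
            return list(here)
        found = find_path(child, target, here)
        if found:
            return found
    return None

path = find_path(toy, "2.1 Relevance")
# Walk the keys one level at a time, then read the text of the final node
text = reduce(lambda node, key: node[key], path, toy).text
print(path)
print(text)
```

Lookups use exact key strings, so a heading that differs by a single character (for example a page suffix) will not be found and the search returns `None`.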

## Signatures

A [DSPy signature](https://dspy.ai/learn/programming/signatures) is a declarative specification of the input/output behavior of a DSPy module. A signature tells the LM (Language Model) what it needs to do, rather than specifying how to ask it.

### Stage 1

In [None]:
#| exports
class EnablersCrossCuttingOverview(dspy.Signature):
    """Identify sections relevant to enabler/cross-cutting category"""
    category_title: str = dspy.InputField(desc="Category title")
    category_description: str = dspy.InputField(desc="Category description")
    all_headings: str = dspy.InputField(desc="Complete document structure")
    priority_sections: List[str] = dspy.OutputField(desc="Ordered list of section keys to explore first")
    strategy: str = dspy.OutputField(desc="Reasoning for this exploration strategy")
    # confidence: float = dspy.OutputField(desc="Confidence 0-1")


For instance, applied to "Data and evidence" (SRF Enabler number 4):

In [None]:
#| eval: false
lm = dspy.LM(cfg.lm, api_key=cfg.api_key)
dspy.configure(lm=lm)
overview_analyzer = dspy.ChainOfThought(EnablersCrossCuttingOverview)

eval_data = IOMEvalData()
data_evidence = eval_data.srf_enablers[3]  # "Data and evidence" is at index 3
print(f"Analyzing: {data_evidence.title}")
print(f"Description: {data_evidence.description}\n")

In [None]:
#| eval: false
result = overview_analyzer(
    category_title=data_evidence.title,
    category_description=data_evidence.description,
    all_headings=str(hdgs),
)

print(f'Priority sections: {result.priority_sections}')
print(f'Strategy: {result.strategy}')

In [None]:
#| eval: false
condensed_gcm = {
    "7": {
        "title": "Address and reduce vulnerabilities in migration",
        "core_theme": "Protect migrants in vulnerable situations through comprehensive support and rights-based approaches",
        "key_principles": ["Human rights-based approach", "Best interests of the child", "Gender-responsive policies", "Non-discrimination"],
        "target_groups": ["Unaccompanied children", "Women at risk", "Trafficking victims", "Workers facing exploitation", "Persons with disabilities"],
        "main_activities": ["Identification and assistance", "Legal protection and remedies", "Child protection systems", "Status regularization procedures", "Crisis response"]
    },
    "21": {
        "title": "Cooperate in facilitating safe and dignified return and readmission, as well as sustainable reintegration",
        "core_theme": "Safe and dignified return, readmission, and sustainable reintegration of migrants",
        "key_principles": ["Due process and individual assessment", "Prohibition of collective expulsion", "Non-refoulement", "Human right to return"],
        "target_groups": ["Returning migrants", "Children in return processes", "Communities of origin"],
        "main_activities": ["Cooperation frameworks", "Travel documents and identification", "Consular assistance", "Reintegration support", "Monitoring mechanisms"]
    }
}
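
A condensed representation like the one above can be flattened into a compact text block before being passed to an LM. The sketch below is one possible rendering, not the notebook's method. `render_condensed` is a hypothetical helper, and `sample` is a shortened copy of the GCM 21 entry used only to keep the example small.

```python
def render_condensed(entries):
    "Render condensed GCM entries as a compact, human-readable text block"
    lines = []
    for gcm_id, entry in entries.items():
        lines.append(f"GCM {gcm_id}: {entry['title']}")
        lines.append(f"  Theme: {entry['core_theme']}")
        for field in ("key_principles", "target_groups", "main_activities"):
            label = field.replace("_", " ").capitalize()
            lines.append(f"  {label}: {'; '.join(entry[field])}")
    return "\n".join(lines)

# Shortened sample with the same fields as `condensed_gcm`
sample = {
    "21": {
        "title": "Cooperate in facilitating safe and dignified return and readmission, as well as sustainable reintegration",
        "core_theme": "Safe and dignified return, readmission, and sustainable reintegration of migrants",
        "key_principles": ["Non-refoulement", "Human right to return"],
        "target_groups": ["Returning migrants", "Communities of origin"],
        "main_activities": ["Reintegration support", "Monitoring mechanisms"],
    }
}

rendered = render_condensed(sample)
print(rendered)
```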