# Priority View Architecture — Depth as Multiplier

## Change summary

**Conceptual model** (now consistent across both views):
- `base_score` = pure signal scores — "how much work does this person need / how much potential do they have?"
- `× proximity_multiplier` — ancestral line centrality: impacts blast radius and family interest
- `× depth_multiplier` — generational closeness: closer to root = more impact + more family interest

**Specific changes:**

### gold_person_integrity_priority
- **Before**: `base = evidence + completeness + depth_context * 0.10` then `× structural_multiplier`
- **After**: `base = evidence + completeness` then `× structural_multiplier × depth_multiplier`
- Removes depth from the signal sum entirely; Cell 1 weights the two remaining terms as evidence × 0.60 and completeness × 0.40
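
As a worked illustration of the integrity formula, the sketch below applies the Cell 1 weights and multiplier tiers to one made-up person. The inline `VALUES` row and its scores are hypothetical; only the 0.60 / 0.40 weights and the multiplier tiers come from Cell 1. It assumes a Spark SQL style dialect that accepts inline `VALUES` tables.

```sql
-- Hypothetical person: evidence 80, completeness 50,
-- direct ancestor (proximity 0) at depth 3
WITH sample AS (
  SELECT * FROM VALUES (80, 50, 0, 3)
    AS t(evidence_fragility_score, completeness_risk_score, proximity, depth)
),
scored AS (
  SELECT
    -- Same weighted base as Cell 1
    LEAST(100, evidence_fragility_score) * 0.60
      + LEAST(100, completeness_risk_score) * 0.40        AS integrity_base_score,
    -- Same proximity tiers as Cell 1
    CASE
      WHEN proximity = 0 THEN 1.50
      WHEN proximity = 1 THEN 1.25
      WHEN proximity = 2 THEN 1.00
      ELSE                    0.80
    END                                                   AS proximity_multiplier,
    -- Same integrity depth tiers as Cell 1
    CASE
      WHEN depth <= 4            THEN 1.40
      WHEN depth BETWEEN 5 AND 6 THEN 1.20
      WHEN depth BETWEEN 7 AND 8 THEN 1.10
      ELSE                            1.00
    END                                                   AS depth_multiplier
  FROM sample
)
SELECT
  integrity_base_score,        -- 80 * 0.60 + 50 * 0.40 = 68
  proximity_multiplier,        -- 1.50
  depth_multiplier,            -- 1.40
  ROUND(integrity_base_score * proximity_multiplier * depth_multiplier, 2)
                               AS integrity_priority_score  -- 68 * 1.50 * 1.40 = 142.8
FROM scored;
```

Because the multipliers are applied after each category score is capped at 100, the final priority score can exceed 100.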

### gold_person_narrative_priority
- **Before**: `base = texture*0.45 + family*0.30 + depth_context*0.25` then `× structural_multiplier`
- **After**: `base = texture*0.45 + family*0.30 + context*0.25` then `× structural_multiplier × depth_multiplier`
- Weights now sum to exactly 1.0 and match `ref_intent_category_weights`
- Removes depth from the signal sum; context correctly fills the 0.25 slot
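
The narrative base can be checked the same way. This is a minimal sketch with hypothetical category scores (the `sample` row is invented for illustration); the weights and the `LEAST(100, ...)` cap are the ones used in Cell 2. It assumes a Spark SQL style dialect that accepts inline `VALUES` tables.

```sql
-- Hypothetical category sums; texture deliberately exceeds the 100 cap
WITH sample AS (
  SELECT * FROM VALUES (120, 40, 20)
    AS t(life_texture_score, family_drama_score, context_score)
)
SELECT
  0.45 + 0.30 + 0.25                            AS weight_total,          -- 1.00
  LEAST(100, life_texture_score)  * 0.45
    + LEAST(100, family_drama_score) * 0.30
    + LEAST(100, context_score)      * 0.25     AS narrative_base_score   -- 45 + 12 + 5 = 62
FROM sample;
```

With weights that sum to 1.0 and every category capped at 100, the base score itself cannot exceed 100; only the multiplier stage can push the final score above that.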

### Depth multiplier values (originally identical for both views, now higher for integrity)
| depth | narrative multiplier | integrity multiplier |
|---|---|---|
| ≤ 4 | 1.20 | 1.40 |
| 5–6 | 1.10 | 1.20 |
| 7–8 | 1.05 | 1.10 |
| 9+  | 1.00 | 1.00 |

Combined with the proximity multiplier (1.5 / 1.25 / 1.0 / 0.8), the maximum boost
for a direct ancestor at shallow depth is 1.5 × 1.20 = **1.80×** for narrative and 1.5 × 1.40 = **2.10×** for integrity.
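
To see the full range rather than only the maximum, the sketch below cross-joins the proximity tiers with the depth tiers from Cells 1 and 2. The tier labels and CTE names are illustrative assumptions; the multiplier values are copied from the view definitions. It assumes a Spark SQL style dialect that accepts inline `VALUES` tables.

```sql
WITH proximity_tiers AS (
  -- 'other' stands for the ELSE branch of the proximity CASE
  SELECT * FROM VALUES
    ('0', 1.50), ('1', 1.25), ('2', 1.00), ('other', 0.80)
    AS t(proximity_tier, proximity_multiplier)
),
depth_tiers AS (
  SELECT * FROM VALUES
    ('<= 4', 1.40, 1.20),
    ('5-6',  1.20, 1.10),
    ('7-8',  1.10, 1.05),
    ('9+',   1.00, 1.00)
    AS t(depth_tier, integrity_depth_multiplier, narrative_depth_multiplier)
)
SELECT
  p.proximity_tier,
  d.depth_tier,
  p.proximity_multiplier * d.integrity_depth_multiplier AS integrity_combined,
  p.proximity_multiplier * d.narrative_depth_multiplier AS narrative_combined
FROM proximity_tiers p
CROSS JOIN depth_tiers d
ORDER BY integrity_combined DESC, narrative_combined DESC;
```

The query returns 16 rows, one per tier pair. The top row is the direct-ancestor, shallow-depth case (2.10 for integrity, 1.80 for narrative), and the smallest combined value in both columns is 0.80, for an 'other' proximity at depth 9+.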

**Run order**: Cell 1 (integrity), Cell 2 (narrative), Cell 3 (verification).
All cells are safe to rerun: the view cells use `CREATE OR REPLACE VIEW`, and the verification cells are read-only.

In [0]:
-- Cell 1: gold_person_integrity_priority — depth moved to multiplier stage
-- Note: the intent category weights should come from ref_intent_category_weights table rather than being hard-coded here
-- =====================================================================

CREATE OR REPLACE VIEW genealogy.gold_person_integrity_priority AS

WITH aggregated AS (
  SELECT
    person_gedcom_id,
    SUM(CASE WHEN category = 'evidence'     THEN weighted_signal_score ELSE 0 END) AS evidence_fragility_score,
    SUM(CASE WHEN category = 'completeness' THEN weighted_signal_score ELSE 0 END) AS completeness_risk_score
  FROM genealogy.gold_person_integrity_signal_scores
  GROUP BY person_gedcom_id
),

-- Pre-compute both multiplier components (proximity and depth) for clarity
multipliers AS (
  SELECT
    person_gedcom_id,
    -- Proximity multiplier: ancestral line centrality
    CASE
      WHEN proximity = 0           THEN 1.50
      WHEN proximity = 1           THEN 1.25
      WHEN proximity = 2           THEN 1.00
      ELSE                              0.80
    END AS proximity_multiplier,
    -- Depth multiplier: generational closeness to root
    CASE
      WHEN depth <= 4             THEN 1.40
      WHEN depth BETWEEN 5 AND 6  THEN 1.20
      WHEN depth BETWEEN 7 AND 8  THEN 1.10
      ELSE                             1.00
    END AS depth_multiplier
  FROM genealogy.gold_research_person_signals
)

SELECT
  s.person_gedcom_id,
  b.branch,

  -- ---------------------------------------------------------------
  -- Evidence Fragility Score (pure signal sum)
  -- ---------------------------------------------------------------
  LEAST(100, a.evidence_fragility_score)                    AS evidence_fragility_score,

  CONCAT_WS(', ',
    CASE WHEN s.SIGNAL_VERY_LOW_EVIDENCE_DENSITY  THEN 'very low source density (mostly unsourced)' END,
    CASE WHEN s.SIGNAL_LOW_EVIDENCE_DENSITY       THEN 'low source density'                         END,
    CASE WHEN s.SIGNAL_FACT_CONFLICT              THEN 'transcript conflicts tree'                   END,
    CASE WHEN s.SIGNAL_SINGLE_SOURCE_DEPENDENCE   THEN 'single-source reliance'                     END,
    CASE WHEN s.SIGNAL_UNSOURCED_FAMILY_EVENTS    THEN 'unsourced family events'                    END,
    CASE WHEN s.SIGNAL_DOCS_NOT_TRANSCRIBED       THEN 'documents not transcribed'                  END,
    CASE WHEN s.SIGNAL_IMPRECISE_DATES            THEN 'imprecise dates'                            END,
    CASE WHEN s.SIGNAL_IMPRECISE_PLACES           THEN 'imprecise places'                           END,
    CASE WHEN s.SIGNAL_INCOMPLETE_NAME            THEN 'incomplete name'                            END
  )                                                         AS integrity_evidence_reasons,

  -- ---------------------------------------------------------------
  -- Completeness Risk Score (pure signal sum)
  -- ---------------------------------------------------------------
  LEAST(100, a.completeness_risk_score)                     AS completeness_risk_score,

  CONCAT_WS(', ',
    CASE WHEN s.SIGNAL_NO_BIRTH_RECORDED        THEN 'birth not recorded'        END,
    CASE WHEN s.SIGNAL_MISSING_PARENT           THEN 'missing parent'            END,
    CASE WHEN s.SIGNAL_MISSING_CENSUS_COVERAGE  THEN 'missing expected census'   END,
    CASE WHEN s.SIGNAL_NO_DEATH_RECORDED        THEN 'death not recorded'        END,
    CASE WHEN s.SIGNAL_UNCOVERED_SOURCES        THEN 'uncovered cited sources'   END,
    CASE WHEN s.SIGNAL_NO_MARRIAGES             THEN 'no marriage recorded'      END,
    CASE WHEN s.SIGNAL_LATE_LIFE_GAP            THEN 'late-life records gap'     END,
    CASE WHEN s.SIGNAL_NO_CHILDREN              THEN 'no children recorded'      END,
    CASE WHEN s.SIGNAL_CHILD_GAPS               THEN 'gaps between child births' END,
    CASE WHEN s.SIGNAL_EARLY_LIFE_ONLY          THEN 'early-life records only'   END
  )                                                         AS integrity_structure_reasons,

  -- ---------------------------------------------------------------
  -- Tree position factors (not signals — applied at multiplier stage)
  -- ---------------------------------------------------------------
  m.depth_multiplier,
  CONCAT_WS(', ',
    CASE WHEN s.depth <= 4            THEN 'shallow generation depth' END,
    CASE WHEN s.depth BETWEEN 5 AND 6 THEN 'mid-depth generation'     END
  )                                                         AS integrity_lineage_reasons,

  m.proximity_multiplier                                    AS structural_multiplier,

  -- ---------------------------------------------------------------
  -- Base Integrity Score — pure signals only, no depth term
  -- ---------------------------------------------------------------
  ROUND(
    LEAST(100, a.evidence_fragility_score) * 0.60 +
    LEAST(100, a.completeness_risk_score) * 0.40
  , 2)                                                      AS integrity_base_score,

  -- ---------------------------------------------------------------
  -- Final Integrity Priority Score
  -- base × proximity_multiplier × depth_multiplier
  -- ---------------------------------------------------------------
  ROUND(
    (
      LEAST(100, a.evidence_fragility_score) * 0.60 +
      LEAST(100, a.completeness_risk_score) * 0.40
    )
    * m.proximity_multiplier
    * m.depth_multiplier
  , 2)                                                      AS integrity_priority_score,

  -- ---------------------------------------------------------------
  -- Priority Summary
  -- ---------------------------------------------------------------
  CASE
    -- Thresholds use the same weighted base (0.60 / 0.40) as integrity_priority_score
    WHEN ROUND((LEAST(100, a.evidence_fragility_score) * 0.60 + LEAST(100, a.completeness_risk_score) * 0.40)
               * m.proximity_multiplier * m.depth_multiplier, 2) >= 80
      THEN 'High-risk foundational profile: weak evidence in a key ancestral position.'
    WHEN ROUND((LEAST(100, a.evidence_fragility_score) * 0.60 + LEAST(100, a.completeness_risk_score) * 0.40)
               * m.proximity_multiplier * m.depth_multiplier, 2) >= 60
      THEN 'Moderate integrity risk due to gaps or weak sourcing in core relationships.'
    WHEN ROUND((LEAST(100, a.evidence_fragility_score) * 0.60 + LEAST(100, a.completeness_risk_score) * 0.40)
               * m.proximity_multiplier * m.depth_multiplier, 2) >= 40
      THEN 'Some integrity issues present; suitable for routine housekeeping.'
    ELSE 'Low structural risk; no immediate integrity concerns.'
  END                                                       AS integrity_priority_summary,

  nba.next_best_actions

FROM aggregated a
JOIN  genealogy.gold_research_person_signals          s   ON s.person_gedcom_id = a.person_gedcom_id
JOIN  multipliers                                     m   ON m.person_gedcom_id = a.person_gedcom_id
LEFT JOIN genealogy.gold_person_branch                b   ON b.person_gedcom_id = a.person_gedcom_id
LEFT JOIN genealogy.gold_person_integrity_next_best_actions nba
                                                          ON nba.person_gedcom_id = a.person_gedcom_id

In [0]:
-- Cell 2: gold_person_narrative_priority — depth moved to multiplier stage,
--         context correctly fills the 0.25 slot in the weighted sum
-- Note: the intent category weights should come from ref_intent_category_weights table rather than being hard-coded here
-- =====================================================================

CREATE OR REPLACE VIEW genealogy.gold_person_narrative_priority AS

WITH aggregated AS (
  SELECT
    person_gedcom_id,
    SUM(CASE WHEN category = 'texture'  THEN weighted_signal_score ELSE 0 END) AS life_texture_score,
    SUM(CASE WHEN category = 'family'   THEN weighted_signal_score ELSE 0 END) AS family_drama_score,
    SUM(CASE WHEN category = 'context'  THEN weighted_signal_score ELSE 0 END) AS context_score
  FROM genealogy.gold_person_narrative_signal_scores
  GROUP BY person_gedcom_id
),

multipliers AS (
  SELECT
    person_gedcom_id,
    CASE
      WHEN proximity = 0 THEN 1.50
      WHEN proximity = 1 THEN 1.25
      WHEN proximity = 2 THEN 1.00
      ELSE                    0.80
    END AS proximity_multiplier,
    CASE
      WHEN depth <= 4             THEN 1.20
      WHEN depth BETWEEN 5 AND 6  THEN 1.10
      WHEN depth BETWEEN 7 AND 8  THEN 1.05
      ELSE                             1.00
    END AS depth_multiplier
  FROM genealogy.gold_research_person_signals
)

SELECT
  s.person_gedcom_id,
  b.branch,

  -- ---------------------------------------------------------------
  -- Life Texture Score
  -- ---------------------------------------------------------------
  LEAST(100, a.life_texture_score)                          AS life_texture_score,

  CONCAT_WS(', ',
    CASE WHEN s.SIGNAL_MIGRANT        THEN 'geographical migrant'     END,
    CASE WHEN s.SIGNAL_MILITARY       THEN 'military experience'       END,
    CASE WHEN s.SIGNAL_POSSIBLE_WWI   THEN 'possible WWI involvement'  END,
    CASE WHEN s.SIGNAL_POSSIBLE_WWII  THEN 'possible WWII involvement' END,
    CASE WHEN s.SIGNAL_YOUNG_DEATH    THEN 'died in early adulthood'   END
  )                                                         AS life_texture_reasons,

  -- ---------------------------------------------------------------
  -- Family Drama Score
  -- ---------------------------------------------------------------
  LEAST(100, a.family_drama_score)                          AS family_drama_score,

  CONCAT_WS(', ',
    CASE WHEN NOT s.SIGNAL_NO_MARRIAGES THEN 'has marriage recorded'  END,
    CASE WHEN NOT s.SIGNAL_NO_CHILDREN  THEN 'has children recorded'  END,
    CASE WHEN s.SIGNAL_MULTIPLE_SPOUSES THEN 'multiple spouses'       END
  )                                                         AS family_drama_reasons,

  -- ---------------------------------------------------------------
  -- Context Score (fills the 0.25 slot previously occupied by depth)
  -- ---------------------------------------------------------------
  LEAST(100, a.context_score)                               AS context_score,

  CONCAT_WS(', ',
    CASE WHEN s.SIGNAL_POSSIBLE_OCCUPATION  THEN 'occupation records likely available'   END,
    CASE WHEN s.SIGNAL_VARIED_OCCUPATIONS   THEN 'varied occupational history'           END,
    CASE WHEN s.SIGNAL_TRANSCRIPT_AVAILABLE THEN 'transcribed documents available'       END,
    CASE WHEN s.SIGNAL_HIGH_FAMILY_PAYOFF   THEN 'high-leverage under-researched family' END,
    CASE WHEN s.SIGNAL_POSSIBLE_RESIDENCE   THEN 'residence records likely findable'     END,
    CASE WHEN s.SIGNAL_POSSIBLE_MARRIAGE    THEN 'possible unrecorded marriage'          END,
    CASE WHEN s.SIGNAL_POSSIBLE_CHILDREN    THEN 'possible unrecorded children'          END
  )                                                         AS context_reasons,

  -- ---------------------------------------------------------------
  -- Tree position factors (not signals)
  -- ---------------------------------------------------------------
  m.depth_multiplier,
  CONCAT_WS(', ',
    CASE WHEN s.depth <= 4            THEN 'shallow generation depth' END,
    CASE WHEN s.depth BETWEEN 5 AND 6 THEN 'mid-depth generation'     END
  )                                                         AS narrative_lineage_reasons,

  m.proximity_multiplier                                    AS structural_multiplier,

  -- ---------------------------------------------------------------
  -- Base Narrative Score
  -- Weights sum to exactly 1.0, matching ref_intent_category_weights:
  --   texture 0.45 | family 0.30 | context 0.25
  -- Depth is no longer in this sum — it is a tree-position multiplier
  -- ---------------------------------------------------------------
  ROUND(
    LEAST(100, a.life_texture_score) * 0.45 +
    LEAST(100, a.family_drama_score) * 0.30 +
    LEAST(100, a.context_score)      * 0.25
  , 2)                                                      AS narrative_base_score,

  -- ---------------------------------------------------------------
  -- Final Narrative Priority Score
  -- base × proximity_multiplier × depth_multiplier
  -- ---------------------------------------------------------------
  ROUND(
    (
      LEAST(100, a.life_texture_score) * 0.45 +
      LEAST(100, a.family_drama_score) * 0.30 +
      LEAST(100, a.context_score)      * 0.25
    )
    * m.proximity_multiplier
    * m.depth_multiplier
  , 2)                                                      AS narrative_priority_score,

  -- ---------------------------------------------------------------
  -- Priority Summary
  -- Adjust the thresholds higher as better signals are built for narrative potential
  -- ---------------------------------------------------------------
  CASE
    WHEN ROUND((LEAST(100, a.life_texture_score)*0.45 + LEAST(100, a.family_drama_score)*0.30
                + LEAST(100, a.context_score)*0.25) * m.proximity_multiplier * m.depth_multiplier, 2) >= 20
      THEN 'High value: strong narrative potential in a key ancestral position.'
    WHEN ROUND((LEAST(100, a.life_texture_score)*0.45 + LEAST(100, a.family_drama_score)*0.30
                + LEAST(100, a.context_score)*0.25) * m.proximity_multiplier * m.depth_multiplier, 2) >= 15
      THEN 'Moderate value due to good narrative potential and relevance.'
    WHEN ROUND((LEAST(100, a.life_texture_score)*0.45 + LEAST(100, a.family_drama_score)*0.30
                + LEAST(100, a.context_score)*0.25) * m.proximity_multiplier * m.depth_multiplier, 2) >= 10
      THEN 'Some value; further discoveries may lead to good story.'
    ELSE 'Low value; unlikely to result in a relevant story.'
  END                                                       AS narrative_priority_summary,

  nba.next_best_actions

FROM aggregated a
JOIN  genealogy.gold_research_person_signals               s   ON s.person_gedcom_id = a.person_gedcom_id
JOIN  multipliers                                          m   ON m.person_gedcom_id = a.person_gedcom_id
LEFT JOIN genealogy.gold_person_branch                     b   ON b.person_gedcom_id = a.person_gedcom_id
LEFT JOIN genealogy.gold_person_narrative_next_best_actions nba ON nba.person_gedcom_id = a.person_gedcom_id

In [0]:
-- Cell 3a: Sanity check — confirm scores have shifted as expected
-- Direct ancestors at shallow depth should score higher than before
-- (previously max boost was 1.5×; the integrity max is now 1.5 × 1.4 = 2.1×)
-- Distant collaterals at deep depth: unchanged (0.8 × 1.0 = 0.8×)

SELECT
  p.given_name,
  p.surname,
  p.birth_year,
  i.branch,
  s.depth,
  s.proximity,
  i.evidence_fragility_score,
  i.completeness_risk_score,
  i.integrity_base_score,
  i.structural_multiplier   AS proximity_multiplier,
  i.depth_multiplier,
  i.integrity_priority_score,
  i.integrity_priority_summary
FROM genealogy.gold_person_integrity_priority i
JOIN genealogy.gold_person_life p ON p.person_gedcom_id = i.person_gedcom_id
JOIN genealogy.gold_research_person_signals s ON s.person_gedcom_id = i.person_gedcom_id
ORDER BY i.integrity_priority_score DESC
LIMIT 15

In [0]:
-- Cell 3b: Narrative scores — check context_score is now contributing
-- and top scorers make intuitive sense

SELECT
  p.given_name,
  p.surname,
  p.birth_year,
  n.branch,
  s.depth,
  s.proximity,
  n.life_texture_score,
  n.family_drama_score,
  n.context_score,
  n.narrative_base_score,
  n.structural_multiplier   AS proximity_multiplier,
  n.depth_multiplier,
  n.narrative_priority_score,
  n.narrative_priority_summary
FROM genealogy.gold_person_narrative_priority n
JOIN genealogy.gold_person_life p ON p.person_gedcom_id = n.person_gedcom_id
JOIN genealogy.gold_research_person_signals s ON s.person_gedcom_id = n.person_gedcom_id
ORDER BY n.narrative_priority_score DESC
LIMIT 30

In [0]:
-- Cell 3c: Multiplier breakdown — spot check the full range
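-- Verify that the four proximity tiers × four depth tiers produce sensible
-- combined multipliers. Rows are grouped by raw proximity and depth values.
-- This check uses the integrity depth multipliers only; the narrative view
-- uses lower depth multipliers (1.20 / 1.10 / 1.05 / 1.00).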

SELECT
  s.proximity,
  s.depth,
  CASE WHEN s.proximity = 0           THEN 1.50
       WHEN s.proximity = 1           THEN 1.25
       WHEN s.proximity = 2           THEN 1.00
       ELSE 0.80 END                                          AS proximity_mult,
  CASE WHEN s.depth <= 4            THEN 1.40
       WHEN s.depth BETWEEN 5 AND 6 THEN 1.20
       WHEN s.depth BETWEEN 7 AND 8 THEN 1.10
       ELSE 1.00 END                                          AS depth_mult,
  ROUND(
    CASE WHEN s.proximity = 0           THEN 1.50
         WHEN s.proximity = 1           THEN 1.25
         WHEN s.proximity = 2           THEN 1.00
         ELSE 0.80 END
    *
    CASE WHEN s.depth <= 4            THEN 1.40
         WHEN s.depth BETWEEN 5 AND 6 THEN 1.20
         WHEN s.depth BETWEEN 7 AND 8 THEN 1.10
         ELSE 1.00 END
  , 2)                                                        AS combined_mult,
  COUNT(*)                                                    AS n_people
FROM genealogy.gold_research_person_signals s
GROUP BY 1,2,3,4,5
ORDER BY combined_mult DESC
LIMIT 20