# Likely activation potential: SPEC

This notebook will implement the *likely co-occurrence* component and likely activation potential from `main.tex`, using the same ontology input and operational choices as `typical_proof.ipynb`.


## Goal
Compute likely activation potential for resources used in each proof `N`, using:
- context `C` (same as in `typical_proof.ipynb`),
- a *salient set* `S` defined by one of two variants (selectable by a flag),
- likely co-occurrence `Φ_{L_c}(r, C, S)` and likely activation `Φ_L(r, C, S)`.

Output one CSV per analysis; filenames must encode key parameters and include a timestamp.


## Inputs / Parameters
- `START_PROPOSITION`, `END_PROPOSITION`: proof range (same semantics as `typical_proof.ipynb`).
- `EPSILON`: weight for `Φ_L = ε * Φ_h + (1 - ε) * Φ_{L_c}` (parallel to `DELTA` in typical).
- `HISTORY_WEIGHTS`: 3-tuple `(α, β, γ)` for direct/hierarchical/mereological histories (same validation rules).
- `TYPE_SELECTION`: boolean, if `True` use relation/operation types in proposition/proof queries; if `False` use direct concepts.
- `S_VARIANT`: enum flag in `{"statement_only", "statement_plus_related_chunks"}` selecting the salient set definition.
- `EXCLUDED_CONCEPT_IRIS`, `EXCLUDED_CONCEPT_IRI_SUBSTRINGS`: same filtering behavior as in `typical_proof.ipynb`.
- Input TTL: use the same selection logic as `typical_proof.ipynb` (latest TTL in `ontologies/`).


## Context `C` (same as `typical_proof.ipynb`)
For each proof `N`:
- `C` includes resources from definitions, postulates, common notions, and propositions up to and including `N`, plus proofs up to and including `N-1`.
- Use the same query families as `typical_proof.ipynb` for direct / hierarchical / mereological histories and Hebbian co-occurrence.
- Apply the same exclusion filters after each query materializes.
- Build `hebb_C` using the same operational definition of “together” as `typical_proof.ipynb`:
  resources co-occur if they are used in the same definition, postulate, common notion, proposition, or proof (via `refers_to` / `contains_concept`).
- Preserve the same ordered-pair caveat used in `typical_proof.ipynb` (queries return `(o1, o2)` pairs).


## Salient set `S` (two variants)
Let `last_proposition` be proposition `N` (the statement immediately preceding proof `N`).

**Variant 1: `S_VARIANT = "statement_only"`**
- `S` = resources in the *statement* of `last_proposition`.
- Implement via a dedicated SPARQL query (to be written) that extracts resources from the statement, with `TYPE_SELECTION` applied.

**Variant 2: `S_VARIANT = "statement_plus_related_chunks"`**
- Let `S0` be the resources in the statement of `last_proposition` (as above).
- Let `R` be the set of resources that occur in any chunk (definition, postulate, common notion, proposition, or proof)
  that shares at least one resource with `S0`.
- Then `S = S0 ∪ R`.
- Implement via a dedicated SPARQL query (to be written) that
  (a) identifies chunks sharing at least one resource with `S0`, and
  (b) returns all resources from those chunks, with `TYPE_SELECTION` applied.

For each variant, apply the same exclusion filters as in `typical_proof.ipynb`.
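
The two variants can be sketched in plain Python once the SPARQL results are available. This is a minimal sketch, not the notebook's implementation: the function name `salient_set` and the `chunk_resources` mapping (chunk identifier to the set of resources it uses) are assumptions introduced for illustration, and exclusion filtering is assumed to have been applied already.

```python
from typing import Dict, Set


def salient_set(
    s0: Set[str],
    chunk_resources: Dict[str, Set[str]],
    s_variant: str,
) -> Set[str]:
    """Return S from S0 and a hypothetical chunk -> resources mapping."""
    if s_variant == "statement_only":
        return set(s0)
    if s_variant == "statement_plus_related_chunks":
        related: Set[str] = set()
        for resources in chunk_resources.values():
            # A chunk is related if it shares at least one resource with S0.
            if resources & s0:
                related |= resources
        return set(s0) | related
    raise ValueError(f"unknown S_VARIANT: {s_variant!r}")
```

Raising on an unknown flag value keeps a misspelled `S_VARIANT` from silently falling back to one of the variants.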


## Likely co-occurrence (`Φ_{L_c}`)
- Build `hebb_C` exactly as in the typical analysis, using the same “together” definition.
- Compute `deg_C^S(r) = sum_{r' in S, r' != r} hebb_C(r, r')`.
- Define:
  - `Φ_{L_c}(r, C, S) = deg_C^S(r) / sum_u deg_C^S(u)` if denominator ≠ 0, else 0.
- This is computed only for resources *used in proof `N`* (same as typical).
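
A minimal sketch of the normalization above follows. It makes two assumptions that the spec leaves to `main.tex` and `typical_proof.ipynb`: `hebb_C` is held as a dict keyed by ordered pairs `(r, r')`, and the index `u` in the denominator ranges over the resources used in proof `N`. The function name `likely_cooccurrence` is hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple


def likely_cooccurrence(
    hebb_c: Dict[Tuple[str, str], float],
    salient: Set[str],
    proof_resources: Iterable[str],
) -> Dict[str, float]:
    """Return phi_lc for each resource used in the proof."""
    resources = list(dict.fromkeys(proof_resources))  # de-duplicate, keep order
    # deg_C^S(r): sum of hebb_C(r, r') over salient r' different from r.
    deg = {
        r: sum(hebb_c.get((r, s), 0.0) for s in salient if s != r)
        for r in resources
    }
    total = sum(deg.values())
    if total == 0:
        # Covers an empty S and an empty hebb_C: every value is 0.
        return {r: 0.0 for r in resources}
    return {r: d / total for r, d in deg.items()}
```

If `u` should instead range over every resource in `C`, only the computation of `total` changes.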

## Likely activation potential (`Φ_L`)
- Compute historical component `Φ_h(r, C)` using the same direct/hierarchical/mereological histories and weights as typical.
- Combine with likely co-occurrence:
  `Φ_L(r, C, S) = ε * Φ_h(r, C) + (1 - ε) * Φ_{L_c}(r, C, S)`
  where `ε = EPSILON` in `[0, 1]`.
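
The combination is a convex mix of the two components. As a sketch (the function name `likely_activation` is an assumption, not taken from the notebook):

```python
def likely_activation(phi_h: float, phi_lc: float, epsilon: float) -> float:
    """Combine the historical and likely co-occurrence components."""
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError(f"EPSILON must lie in [0, 1], got {epsilon}")
    return epsilon * phi_h + (1.0 - epsilon) * phi_lc
```

With `epsilon = 1` the result is purely historical, and with `epsilon = 0` it is purely the likely co-occurrence.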


## Four analyses (S variant × TYPE_SELECTION)
Run all four combinations:
1. `S_VARIANT=statement_only`, `TYPE_SELECTION=False`
2. `S_VARIANT=statement_only`, `TYPE_SELECTION=True`
3. `S_VARIANT=statement_plus_related_chunks`, `TYPE_SELECTION=False`
4. `S_VARIANT=statement_plus_related_chunks`, `TYPE_SELECTION=True`

Each run should iterate proofs `N` in `[START_PROPOSITION, END_PROPOSITION]`, compute
`Φ_h`, `Φ_{L_c}`, `Φ_L`, and the set of new resources (same definition as in typical).
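
One way to enumerate the four combinations in the order listed above is `itertools.product`. The names `S_VARIANTS`, `TYPE_SELECTIONS`, and `analysis_grid` are illustrative assumptions; the per-proof computation itself is left out.

```python
from itertools import product
from typing import Dict, List, Union

S_VARIANTS = ("statement_only", "statement_plus_related_chunks")
TYPE_SELECTIONS = (False, True)


def analysis_grid() -> List[Dict[str, Union[str, bool]]]:
    """Return one configuration dict per (S_VARIANT, TYPE_SELECTION) pair."""
    return [
        {"s_variant": s_variant, "type_selection": type_selection}
        for s_variant, type_selection in product(S_VARIANTS, TYPE_SELECTIONS)
    ]
```

Each configuration would then drive one loop over `range(START_PROPOSITION, END_PROPOSITION + 1)`, assuming the proof range is inclusive at both ends.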


## Outputs
For each analysis, write one CSV with rows per `(proof_n, resource)` that include at least:
- `proof_n`
- `resource_used_in_proof`
- `phi_h`
- `phi_lc` (likely co-occurrence)
- `phi_l` (likely activation)
- `is_new_resource` (or a separate list/count of new resources, mirroring typical output)
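
A sketch of writing these columns with the standard `csv` module is shown below. `FIELDNAMES` and `write_rows` are hypothetical names, and the sketch assumes the `is_new_resource` column option rather than a separate list of new resources.

```python
import csv
from typing import Dict, Iterable, TextIO

FIELDNAMES = [
    "proof_n",
    "resource_used_in_proof",
    "phi_h",
    "phi_lc",
    "phi_l",
    "is_new_resource",
]


def write_rows(handle: TextIO, rows: Iterable[Dict[str, object]]) -> None:
    """Write a header plus one CSV row per (proof_n, resource)."""
    writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)
```

When writing to a real file, open it with `newline=""` as the `csv` module documentation recommends.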

**Filename requirements**
- Must include: `S_VARIANT`, `TYPE_SELECTION`, proof range, and a timestamp.
- Example pattern: `likely_{s_variant}_type_{type_flag}_p{start}-{end}_{YYYYmmdd_HHMMSS}.csv`
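
The example pattern can be filled in as follows. This is a sketch: `output_filename` is a hypothetical helper, and the optional `now` argument exists only so the timestamp can be fixed when testing.

```python
from datetime import datetime
from typing import Optional


def output_filename(
    s_variant: str,
    type_selection: bool,
    start: int,
    end: int,
    now: Optional[datetime] = None,
) -> str:
    """Build a filename that encodes the key parameters and a timestamp."""
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    return f"likely_{s_variant}_type_{type_selection}_p{start}-{end}_{stamp}.csv"
```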

The output directory should mirror `typical_proof.ipynb` (e.g., `output/`).


## SPARQL queries


**CASE 1: `salient_statement_resources`**

For `salient_statement_resources(last_proposition, type_selection=False)` use the following SPARQL queries:
- `queries.direct_template_propositions_proofs(last_proposition)`
- `queries.hierarchical_template_propositions_proofs(last_proposition)` [super-concepts of statement resources are statement resources]
- `queries.mereological_template_propositions_proofs(last_proposition)` [components of statement resources are statement resources]

For `salient_statement_resources(last_proposition, type_selection=True)` use the following SPARQL queries:
- `queries.direct_template_last_item_types(last_proposition)`
- `queries.hierarchical_template_propositions_proofs(last_proposition)` [super-concepts of statement resources are statement resources]
- `queries.mereological_template_propositions_proofs(last_proposition)` [components of statement resources are statement resources]

These SPARQL queries return resources and counts, which the notebook then uses for the required calculations.

**CASE 2: `salient_statement_plus_related_chunks`**

For `salient_statement_plus_related_chunks(last_proposition, type_selection=False)` use the following SPARQL queries:
- (a) `queries.direct_template_propositions_proofs(last_proposition)`
- (b) `queries.hierarchical_template_propositions_proofs(last_proposition)` [super-concepts of statement resources are statement resources]
- (c) `queries.mereological_template_propositions_proofs(last_proposition)` [components of statement resources are statement resources]
- (d) `queries.find_salient_resources_in_definitions_postulates_common_notions(resource_iris)`
- (e) `queries.direct_definitions()`
- (f) `queries.direct_postulates()`
- (g) `queries.direct_common_notions()`
- (h) `queries.hierarchical_definitions()`
- (i) `queries.hierarchical_postulates()`
- (j) `queries.hierarchical_common_notions()`
- (k) `queries.mereological_definitions()`
- (l) `queries.mereological_postulates()`
- (m) `queries.mereological_common_notions()`

For `salient_statement_plus_related_chunks(last_proposition, type_selection=True)` use the following SPARQL queries:
- (a) `queries.direct_template_last_item_types(last_proposition)`
- (b) `queries.hierarchical_template_propositions_proofs(last_proposition)` [super-concepts of statement resources are statement resources]
- (c) `queries.mereological_template_propositions_proofs(last_proposition)` [components of statement resources are statement resources]
- (d) `queries.find_salient_resources_in_definitions_postulates_common_notions(resource_iris)`
- (e) `queries.direct_definitions()`
- (f) `queries.direct_postulates()`
- (g) `queries.direct_common_notions()`
- (h) `queries.hierarchical_definitions()`
- (i) `queries.hierarchical_postulates()`
- (j) `queries.hierarchical_common_notions()`
- (k) `queries.mereological_definitions()`
- (l) `queries.mereological_postulates()`
- (m) `queries.mereological_common_notions()`

TO DO: add new queries to modules/queries.py that are like queries (e)-(m) but parametrized via `VALUES {{ {iri_of_salient_resources} }}`.

Queries (a), (b), and (c) return both the resources that are salient in the last proposition and their counts.
Reuse the counts for later calculations. Use the resources to build the string `resource_iris` (adding angle brackets as needed) for `queries.find_salient_resources_in_definitions_postulates_common_notions` (query (d) in the list above), which finds the definitions, postulates, and common notions to be considered.
Use queries (e) through (m) to find the additional counts needed for later calculations.
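
Building the `resource_iris` string for a `VALUES` clause could look like the sketch below. The helper name `to_values_clause` is an assumption, as is the choice to sort and de-duplicate the IRIs so that the generated query text is deterministic.

```python
from typing import Iterable


def to_values_clause(iris: Iterable[str]) -> str:
    """Return space-separated IRIs, each wrapped in angle brackets."""
    normalized = set()
    for iri in iris:
        iri = iri.strip()
        # Add angle brackets only when they are missing.
        if not (iri.startswith("<") and iri.endswith(">")):
            iri = f"<{iri}>"
        normalized.add(iri)
    return " ".join(sorted(normalized))
```

The returned string is meant to be substituted into a template such as `VALUES ?o { ... }`. An empty input yields an empty string, so the caller should skip the query in that case rather than send an empty `VALUES` block.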

The spec requires that the queries for both cases:
- return resources as `?o` with optional counts,
- respect `TYPE_SELECTION`, and
- are filtered by `EXCLUDED_CONCEPT_IRIS` and `EXCLUDED_CONCEPT_IRI_SUBSTRINGS`
  immediately after each query result is materialized.
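
The filtering step might be sketched as below. The authoritative behavior is whatever `typical_proof.ipynb` does; this sketch assumes exact matching against `EXCLUDED_CONCEPT_IRIS` and substring containment against `EXCLUDED_CONCEPT_IRI_SUBSTRINGS`, and `apply_exclusions` is a hypothetical name.

```python
from typing import Iterable, List


def apply_exclusions(
    resources: Iterable[str],
    excluded_iris: Iterable[str],
    excluded_substrings: Iterable[str],
) -> List[str]:
    """Drop resources that match an excluded IRI or contain an excluded substring."""
    excluded = set(excluded_iris)
    substrings = list(excluded_substrings)
    return [
        r
        for r in resources
        if r not in excluded and not any(sub in r for sub in substrings)
    ]
```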


## Validation / Edge cases
- If `S` is empty, then all `deg_C^S` are 0 and `Φ_{L_c}` must be 0.
- If `hebb_C` is empty, `Φ_{L_c}` must be 0.
- Ensure `EPSILON ∈ [0,1]` and `HISTORY_WEIGHTS` sum to 1.
- For `N=1`, context contains definitions, postulates, common notions, and proposition 1; there are no prior proofs.
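
The parameter checks can be run once, before any proof is processed. This is a sketch with a hypothetical `validate_parameters` helper; it uses a small tolerance for the weight sum, which is an assumption about how strict the check in `typical_proof.ipynb` is.

```python
import math
from typing import Sequence


def validate_parameters(epsilon: float, history_weights: Sequence[float]) -> None:
    """Raise ValueError if EPSILON or HISTORY_WEIGHTS are out of range."""
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError(f"EPSILON must lie in [0, 1], got {epsilon}")
    if len(history_weights) != 3:
        raise ValueError("HISTORY_WEIGHTS must be a 3-tuple (alpha, beta, gamma)")
    # Tolerate floating-point rounding in the sum of the weights.
    if not math.isclose(sum(history_weights), 1.0, abs_tol=1e-9):
        raise ValueError(f"HISTORY_WEIGHTS must sum to 1, got {sum(history_weights)}")
```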
