# Relational Analysis

## Logic

The analysis builds a matrix with ontologies in rows and segments in columns. The cell value for an ontology-segment pair is the number of topics in that ontology that are semantically similar to that segment, where similarity is decided by a score threshold.

Query logic can then be applied to the columns of this matrix for topic discovery and for concept mapping between ontologies.
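A minimal sketch of this logic on toy data may help. The ontology names `ONT-A` and `ONT-B` and all similarity scores are hypothetical; only the threshold value and the topics-in-rows, segments-in-columns layout follow the notebook cells further down.

```python
import numpy as np

# Hypothetical similarity scores: topics in rows, segments in columns
similarity = {
    'ONT-A': np.array([[0.80, 0.10, 0.75],
                       [0.90, 0.20, 0.30]]),
    'ONT-B': np.array([[0.76, 0.79, 0.10],
                       [0.20, 0.78, 0.20],
                       [0.30, 0.95, 0.40]]),
}
threshold = 0.74  # same value as used in the Concept Mapping cell

labels = sorted(similarity)

# One row per ontology: number of topics at or above the threshold per segment
counts = np.array([(similarity[label] >= threshold).sum(axis=0)
                   for label in labels])
print(counts)
# [[2 0 1]
#  [1 3 0]]

# Example query: segments matched by at least one topic in every ontology
shared = np.flatnonzero((counts != 0).all(axis=0))
print(shared)  # [0]
```

Segments returned by a query like `shared` are candidates for concept mapping, because each ontology has at least one topic close to them.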

## Data Sources

Data sources for this analysis are described below.

### Corpus

The [corpus](https://www.constitutueproject.org) comprises the set of in-force national constitutions compiled by the Comparative Constitutions Project (CCP).

### Ontologies


- CCP-FACET: A faceted version of the [CCP ontology](https://www.constitutueproject.org).
- IDEA-GLO: [International IDEA Database Glossary](https://www.idea.int/data-tools)
- NC-DCC: Núcleo Constituyente, Diccionario Constitucional Chileno, Second Edition

All ontologies were formatted to conform to the Sartori Network's ontology specification:

- One topic per row
- A minimum column set comprising the following fields:
    - key: a short topic identifier. If this is not provided by the ontology owner, then an integer is used.
    - label: a short human-readable text label.
    - description: a longer descriptive text.
- A first row containing the column names: `key`, `label`, `description`.
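As an illustration, a file in this shape could be checked and loaded roughly as follows. This is a sketch under assumptions, not the project's loader: it assumes the ontology is stored as CSV, the function name `load_ontology` is hypothetical, the sample topics are made up, and using the row number as the fallback integer key is one possible reading of the rule above.

```python
import csv
import io

REQUIRED_COLUMNS = ['key', 'label', 'description']

def load_ontology(handle):
    """Read an ontology CSV (assumed format) into a dict keyed by topic key."""
    reader = csv.DictReader(handle)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f'Missing required columns: {missing}')
    topics = {}
    for row_number, row in enumerate(reader, start=1):
        # Assumption: fall back to the row number when no key is supplied
        key = (row['key'] or '').strip() or row_number
        topics[key] = {'label': row['label'],
                       'description': row['description']}
    return topics

# Hypothetical two-topic ontology; the second topic has no key
sample = io.StringIO(
    'key,label,description\n'
    'T1,Right to vote,Provisions on who may vote.\n'
    ',Judicial review,Power of courts to review legislation.\n'
)
topics = load_ontology(sample)
print(list(topics))  # ['T1', 2]
```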





## Initialisation

### Load code and model

In [None]:
__author__      = 'Roy Gardner'
__copyright__   = 'Copyright 2025, Roy and Sally Gardner'

%run ./_library/packages.py
%run ./_library/utilities.py



In [None]:
model_path = '../model/'

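# Collect every file in the model directory whose name contains
# '_encodings.json' and pass them to initialise() as the exclusion list
exclusion_list = []
_,_,files = next(os.walk(model_path))
for file in files:
    if '_encodings.json' in file:
        exclusion_list.append(file)

model_dict = initialise(model_path,exclusion_list=exclusion_list)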


## Concept Mapping

In [None]:
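# Minimum similarity score for a topic to count as matching a segment
threshold = 0.74

ont_count = len(model_dict['ontologies_dict'])
ont_labels = sorted(list(model_dict['ontologies_dict'].keys()))

matrix_dict = {}         # ontology label -> binary topic-segment matrix
ont_segment_matrix = []  # one row of per-segment topic counts per ontology

for ont_label in ont_labels:
    # Similarity scores with topics in rows and segments in columns
    ont_matrix = np.array(model_dict[f'{ont_label}_topic_segment_matrix'])
    # 1 where the score meets the threshold, 0 otherwise
    M = np.where(ont_matrix>=threshold,1,0).astype(int)
    # Column sums: number of matching topics for each segment
    marginal = np.sum(M,axis=0)
    ont_segment_matrix.append(marginal)
    matrix_dict[ont_label] = M

# Ontology-segment matrix: ontologies in rows, segments in columns
U = np.array(ont_segment_matrix)
print(U.shape)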


    

In [None]:
class MatrixColumnQuery:
    def __init__(self,matrix):
        self.matrix = matrix
        self.n_cols = matrix.shape[1]
    
    def all_nonzero(self):
        """Find columns where all elements are non-zero"""
        return [col for col in range(self.n_cols) 
                if np.all(self.matrix[:, col] != 0)]
    
    def any_nonzero(self,positions):
        """Find columns with non-zero values at ANY of the specified positions"""
        return [col for col in range(self.n_cols)
                if np.any(self.matrix[positions, col] != 0)]
    
    def all_positions_nonzero(self,positions):
        """Find columns with non-zero values at ALL specified positions"""
        return [col for col in range(self.n_cols)
                if np.all(self.matrix[positions, col] != 0)]
    
    def exactly_nonzero(self,positions):
        """Find columns with non-zero ONLY at specified positions (zeros elsewhere)"""
        return [col for col in range(self.n_cols)
                if (np.all(self.matrix[positions, col] != 0) and 
                    # Use the matrix's actual row count rather than a hard-coded 7
                    np.all(self.matrix[np.setdiff1d(np.arange(self.matrix.shape[0]), positions), col] == 0))]
    
    def count_nonzero(self,count):
        """Find columns with exactly 'count' non-zero elements"""
        return [col for col in range(self.n_cols)
                if np.count_nonzero(self.matrix[:, col]) == count]
    
    def query(self,condition):
        """Flexible query with lambda function"""
        return [col for col in range(self.n_cols)
                if condition(self.matrix[:, col])]

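# Example matrix for trying out the queries, e.g. MatrixColumnQuery(matrix).
# The analysis below queries U, not this example.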
matrix = np.array([
    [1, 0, 2, 1, 8],
    [2, 1, 0, 1, 3], 
    [0, 2, 3, 1, 5],
    [1, 0, 1, 0, 2],
    [0, 1, 0, 1, 1],
    [2, 0, 2, 1, 5],
    [1, 1, 1, 0, 3]
])


q = MatrixColumnQuery(U)
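For comparison, the same kinds of query can be written with boolean masks instead of a Python loop over columns. This is an alternative sketch, not the class above: it reuses the 7 x 5 example matrix from the cell, and the variable names are illustrative only.

```python
import numpy as np

matrix = np.array([
    [1, 0, 2, 1, 8],
    [2, 1, 0, 1, 3],
    [0, 2, 3, 1, 5],
    [1, 0, 1, 0, 2],
    [0, 1, 0, 1, 1],
    [2, 0, 2, 1, 5],
    [1, 1, 1, 0, 3]
])

nonzero = matrix != 0  # True where a cell is non-zero

# Columns where every row is non-zero
all_nonzero = np.flatnonzero(nonzero.all(axis=0))
print(all_nonzero)  # [4]

# Columns with exactly four non-zero rows
count_4 = np.flatnonzero(nonzero.sum(axis=0) == 4)
print(count_4)  # [1]

# Columns that are non-zero at rows 1, 2, 4, 6 and zero in every other row
positions = [1, 2, 4, 6]
row_mask = np.zeros(matrix.shape[0], dtype=bool)
row_mask[positions] = True
exact = np.flatnonzero(nonzero[row_mask].all(axis=0)
                       & ~nonzero[~row_mask].any(axis=0))
print(exact)  # [1]
```

The mask form avoids the per-column loop, which may matter when the matrix has many segment columns. The class form is easier to extend with arbitrary conditions through `query`.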




In [None]:
print('All non-zero segments:',q.all_nonzero())

count = 2
results = q.count_nonzero(count)
print(f'Segments with exactly {count} non-zero elements:',len(results))
print()


for segment_index in results:
    segment_id = model_dict['encoded_segments'][segment_index]
    segment_text = model_dict['segments_dict'][segment_id]['text']
    print(segment_text)
    print(U[:,segment_index])
    
    #print([ont_labels[i] for i,v in enumerate(U[:,segment_index]) if v > 0])
    for i,v in enumerate(U[:,segment_index]):
        if v == 0:
            continue
        ont_label = ont_labels[i]
        ont_matrix = matrix_dict[ont_label]
        # Recover the topics
        topic_indices = np.nonzero(ont_matrix[:,segment_index])[0]
        topic_ids = [model_dict[f'{ont_label}_encoded_topics'][index] for index in topic_indices]
        topic_labels = [model_dict[f'{ont_label}_topics_dict'][topic_id]['Label'] for topic_id in topic_ids]
        print(ont_labels[i],topic_labels)
        
    
    
    
    print()

