# FCM Scoring Walkthrough

This notebook demonstrates how to use the FCM scoring utility to compare two Fuzzy Cognitive Maps.

## 1. Setup and Imports

In [1]:
import sys
import os
import pandas as pd
import json

# Add parent directory to path for imports
sys.path.insert(0, '..')

from score_fcms import score_fcm, load_matrix_from_file, matrix_to_json

print("✓ Imports successful")

✓ Imports successful


## 2. Load and Explore Example Data

In [2]:
# Path to example files (in same directory)
fcm1_path = 'fcm1.csv'
fcm2_path = 'fcm2.csv'

print(f"Loading FCM1 from: {fcm1_path}")
print(f"Loading FCM2 from: {fcm2_path}")

# Load the matrices
fcm1_matrix = load_matrix_from_file(fcm1_path)
fcm2_matrix = load_matrix_from_file(fcm2_path)

print(f"\nFCM1 shape: {fcm1_matrix.shape}")
print(f"FCM2 shape: {fcm2_matrix.shape}")

Loading FCM1 from: fcm1.csv
Loading FCM2 from: fcm2.csv

FCM1 shape: (5, 5)
FCM2 shape: (6, 6)


In [3]:
# Explore FCM1
print("FCM1 Matrix:")
print(fcm1_matrix)
# Count every nonzero entry: FCM edges are directed, so A->B and B->A are separate edges
print(f"\nNumber of edges: {(fcm1_matrix != 0).sum().sum()}")
print(f"Number of nodes: {len(fcm1_matrix)}")

FCM1 Matrix:
                        climate_impact  resource_availability  \
climate_impact                       0                      1   
resource_availability                1                      0   
stakeholder_engagement              -1                      1   
policy_implementation                1                      1   
environmental_health                 1                     -1   

                        stakeholder_engagement  policy_implementation  \
climate_impact                              -1                      1   
resource_availability                        1                      1   
stakeholder_engagement                       0                      1   
policy_implementation                        1                      0   
environmental_health                         1                      1   

                        environmental_health  
climate_impact                             1  
resource_availability                     -1  
stakeholder_enga

In [4]:
# Explore FCM2
print("FCM2 Matrix:")
print(fcm2_matrix)
# Count every nonzero entry: FCM edges are directed, so A->B and B->A are separate edges
print(f"\nNumber of edges: {(fcm2_matrix != 0).sum().sum()}")
print(f"Number of nodes: {len(fcm2_matrix)}")

FCM2 Matrix:
                        climate_impact  resource_availability  \
climate_impact                    0.00                   0.85   
resource_availability             0.80                   0.00   
stakeholder_engagement            0.85                   0.78   
policy_implementation             0.81                   0.79   
environmental_health              0.91                   0.75   
employment                        0.91                   0.75   

                        stakeholder_engagement  policy_implementation  \
climate_impact                            0.90                   0.75   
resource_availability                     0.88                   0.82   
stakeholder_engagement                    0.00                   0.92   
policy_implementation                     0.88                   0.00   
environmental_health                      0.87                   0.80   
employment                                0.87                   0.80   

                   

In [5]:
# Show file information
print("FCM1 (CSV Matrix Format):")
print(f"  - File: {fcm1_path}")
print("  - Format: Adjacency matrix (rows/columns = nodes)")
print("\nFCM2 (CSV Matrix Format):")
print(f"  - File: {fcm2_path}")
print("  - Format: Adjacency matrix (rows/columns = nodes)")
print("\nNote: Both CSV and JSON formats are supported!")
print("  - CSV: Adjacency matrix format")
print("  - JSON: Edge list format with 'edges' array")

FCM1 (CSV Matrix Format):
  - File: fcm1.csv
  - Format: Adjacency matrix (rows/columns = nodes)

FCM2 (CSV Matrix Format):
  - File: fcm2.csv
  - Format: Adjacency matrix (rows/columns = nodes)

Note: Both CSV and JSON formats are supported!
  - CSV: Adjacency matrix format
  - JSON: Edge list format with 'edges' array
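
To make the difference between the two layouts concrete, the sketch below converts a small adjacency matrix into an edge list wrapped in an `edges` array. The key names `source`, `target`, and `weight`, the row-to-column edge direction, and the helper `matrix_to_edge_list` are assumptions made for illustration; the actual schema is whatever `matrix_to_json` in `score_fcms` produces.

```python
import json

def matrix_to_edge_list(nodes, matrix):
    """Convert a square adjacency matrix (list of rows) into an edge-list dict.

    Hypothetical helper: assumes rows are sources, columns are targets,
    and a zero entry means "no edge".
    """
    edges = []
    for i, source in enumerate(nodes):
        for j, target in enumerate(nodes):
            weight = matrix[i][j]
            if weight != 0:
                edges.append({"source": source, "target": target, "weight": weight})
    return {"edges": edges}

nodes = ["variable_A", "variable_B", "variable_C"]
matrix = [
    [0, 0.7, -0.4],
    [0.8, 0, 0.9],
    [-0.5, 0.6, 0],
]

edge_list = matrix_to_edge_list(nodes, matrix)
print(json.dumps(edge_list, indent=2))
```

The matrix form keeps every node pair, including the zeros, while the edge list stores only the nonzero connections (six here).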


## 3. Basic Scoring with Default Parameters

In [6]:
# Score with default parameters
print("Scoring with default parameters...\n")

results = score_fcm(
    fcm1_path=fcm1_path,
    fcm2_path=fcm2_path,
    verbose=True
)

Scoring with default parameters...

Loading FCM data from fcm1.csv and fcm2.csv...
Loading FCM data from fcm1.csv and fcm2.csv...
FCM1 matrix shape: (5, 5)
FCM2 matrix shape: (6, 6)
Initializing scorer...


`torch_dtype` is deprecated! Use `dtype` instead!


Computing embeddings and calculating scores...
TP, PP, FP, FN: 16 4 9 0

FCM SCORING RESULTS
Dataset: fcm2
Model: Qwen/Qwen3-Embedding-0.6B
Threshold: 0.6

Scores:
  F1 Score:      0.8018
  Jaccard Score: 0.6786

Edge Matching:
  True Positives:    16
  Partial Positives: 4
  False Positives:   9
  False Negatives:   0

Graph Statistics:
  FCM1 Nodes:  5
  FCM1 Edges:  20
  FCM2 Nodes:  6
  FCM2 Edges:  29
Results saved to: .\fcm2_scoring_results.csv
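
A quick sanity check on the report is to confirm that the match counts add up to the edge totals. The sketch below assumes that FCM1 plays the reference role and FCM2 the predicted one, so that every FCM1 edge is counted as TP, PP, or FN and every FCM2 edge as TP, PP, or FP; this is consistent with the numbers above but is not confirmed by the scorer's source. Both helpers are hypothetical and not part of `score_fcms`.

```python
def count_directed_edges(matrix):
    """Count nonzero entries; each one is a directed edge (row -> column assumed)."""
    return sum(1 for row in matrix for weight in row if weight != 0)

def counts_are_consistent(tp, pp, fp, fn, ref_edges, pred_edges):
    """Check that matched and unmatched edges add up to both edge totals."""
    return tp + pp + fn == ref_edges and tp + pp + fp == pred_edges

# A fully connected 5-node map with a zero diagonal has 5 * 4 = 20 directed edges
full_5 = [[0 if i == j else 1 for j in range(5)] for i in range(5)]
print(count_directed_edges(full_5))

# Counts and edge totals copied from the report above
default_ok = counts_are_consistent(tp=16, pp=4, fp=9, fn=0, ref_edges=20, pred_edges=29)
print(default_ok)
```

If the check fails for a new pair of maps, the usual suspects are duplicate node labels or edges that were dropped while loading.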



## 4. Parameter Tuning: Testing Different Thresholds

In [7]:
# Test different thresholds
thresholds = [0.5, 0.6, 0.7, 0.8, 0.9]
results_list = []

print("Testing different threshold values...\n")

for threshold in thresholds:
    print(f"Testing threshold={threshold}...")
    result = score_fcm(
        fcm1_path=fcm1_path,
        fcm2_path=fcm2_path,
        threshold=threshold,
        verbose=False
    )
    results_list.append(result)

print("Complete")

Testing different threshold values...

Testing threshold=0.5...
Loading FCM data from fcm1.csv and fcm2.csv...
TP, PP, FP, FN: 16 4 9 0
Testing threshold=0.6...
Loading FCM data from fcm1.csv and fcm2.csv...
TP, PP, FP, FN: 16 4 9 0
Testing threshold=0.7...
Loading FCM data from fcm1.csv and fcm2.csv...
TP, PP, FP, FN: 16 4 9 0
Testing threshold=0.8...
Loading FCM data from fcm1.csv and fcm2.csv...
TP, PP, FP, FN: 0 0 29 20
Testing threshold=0.9...
Loading FCM data from fcm1.csv and fcm2.csv...
TP, PP, FP, FN: 0 0 29 20
Complete


In [8]:
# Combine results and display
threshold_results = pd.concat(results_list, ignore_index=True)

print("\nScoring Results for Different Thresholds:")
print("="*80)
display_cols = ['threshold', 'F1', 'Jaccard', 'TP', 'PP', 'FP', 'FN']
print(threshold_results[display_cols].to_string(index=False))


Scoring Results for Different Thresholds:
 threshold       F1  Jaccard  TP  PP  FP  FN
       0.5 0.801762 0.678571  16   4   9   0
       0.6 0.801762 0.678571  16   4   9   0
       0.7 0.801762 0.678571  16   4   9   0
       0.8 0.000000 0.000000   0   0  29  20
       0.9 0.000000 0.000000   0   0  29  20
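
The table shows F1 falling from 0.8018 to 0 somewhere between 0.7 and 0.8. Since every `score_fcm` call reloads the data, bisection is one way to locate the drop with few calls instead of a dense grid. The sketch below is illustrative only: `find_drop` is a hypothetical helper, and `fake_f1` is a stand-in with a made-up cutoff of 0.75; in the notebook, the callable would wrap `score_fcm` and return its F1 value.

```python
def find_drop(score_at, lo, hi, tol=0.01):
    """Bisect [lo, hi] for the point where score_at switches from > 0 to 0.

    Assumes score_at(lo) > 0, score_at(hi) == 0, and a single step in between.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if score_at(mid) > 0:
            lo = mid   # still matching: the drop is to the right
        else:
            hi = mid   # already zero: the drop is to the left
    return lo, hi

def fake_f1(threshold):
    # Stand-in for score_fcm; the 0.75 cutoff is invented for this demo
    return 0.8018 if threshold <= 0.75 else 0.0

lo, hi = find_drop(fake_f1, 0.7, 0.8)
print(f"F1 drops between {lo:.4f} and {hi:.4f}")
```

With a tolerance of 0.01 this takes four evaluations on the 0.7 to 0.8 interval, compared with ten for a grid at the same resolution.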


In [9]:
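# Find the best threshold by F1. idxmax() returns the first row on ties,
# so equal scores at several thresholds resolve to the lowest one tested.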
best_idx = threshold_results['F1'].idxmax()
best_threshold = threshold_results.loc[best_idx, 'threshold']
best_f1 = threshold_results.loc[best_idx, 'F1']

print(f"\nBest F1 Score: {best_f1:.4f}")
print(f"Achieved at threshold: {best_threshold}")


Best F1 Score: 0.8018
Achieved at threshold: 0.5
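
Because `idxmax()` returns the first row when several rows share the maximum, the reported 0.5 is only one of three equally good thresholds. A small sketch that lists all of them, using the values from the results table and plain Python rather than pandas so that it runs standalone:

```python
# (threshold, F1) pairs copied from the results table
rows = [(0.5, 0.801762), (0.6, 0.801762), (0.7, 0.801762), (0.8, 0.0), (0.9, 0.0)]

best_f1 = max(f1 for _, f1 in rows)

# Keep every threshold whose F1 equals the maximum, not just the first one
tied = [threshold for threshold, f1 in rows if f1 == best_f1]
print(tied)  # prints [0.5, 0.6, 0.7]
```

When scores tie like this, any threshold in the tied range gives the same result on this pair of maps; choosing among them needs additional data.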


## 5. Interpreting Results

In [None]:
# Understanding the metrics
best_result = results_list[thresholds.index(best_threshold)]

tp = int(best_result['TP'].iloc[0])
pp = int(best_result['PP'].iloc[0])
fp = int(best_result['FP'].iloc[0])
fn = int(best_result['FN'].iloc[0])

print("Understanding the Metrics:")
print("="*50)
print(f"True Positives (TP):      {tp:3d} - Correct edge matches")
print(f"Partial Positives (PP):   {pp:3d} - Edge matches with sign disagreement")
print(f"False Positives (FP):     {fp:3d} - Predicted edges not in reference")
print(f"False Negatives (FN):     {fn:3d} - Reference edges not predicted")
print("="*50)
print(f"\nF1 Score:      {best_result['F1'].iloc[0]:.4f}")
print(f"Jaccard Score: {best_result['Jaccard'].iloc[0]:.4f}")

# Textbook F1 from the raw counts. It ignores partial positives, so it can
# differ from the F1 reported by score_fcm above.
print("\nTP-only F1 = 2*TP / (2*TP + FP + FN)")
print(f"           = 2*{tp} / (2*{tp} + {fp} + {fn})")
print(f"           = {2*tp} / {2*tp + fp + fn}")
print(f"           = {2*tp / (2*tp + fp + fn):.4f}")
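
As a cross-check, the textbook count-based definitions of F1 and Jaccard can be evaluated directly on the counts from Section 3. The sketch below leaves partial positives out of both formulas, which is an assumption. For TP=16, FP=9, FN=0 these formulas give lower values than the 0.8018 and 0.6786 the scorer reported, so the scorer does not use exactly these TP-only formulas; how partial positives enter its scores is defined in `score_fcms` and is not shown in this notebook.

```python
def f1_from_counts(tp, fp, fn):
    """Standard F1 from raw counts; returns 0.0 when there is nothing to match."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def jaccard_from_counts(tp, fp, fn):
    """Standard Jaccard index from raw counts; returns 0.0 on an empty union."""
    denom = tp + fp + fn
    return tp / denom if denom else 0.0

# Counts at the default threshold (partial positives deliberately ignored)
f1 = f1_from_counts(tp=16, fp=9, fn=0)
jaccard = jaccard_from_counts(tp=16, fp=9, fn=0)
print(f"TP-only F1:      {f1:.4f}")       # 32 / 41
print(f"TP-only Jaccard: {jaccard:.4f}")  # 16 / 25

# Counts at threshold 0.8 and above: no matches at all
print(f1_from_counts(tp=0, fp=29, fn=20))
```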

## 6. Saving Results in Different Formats

In [None]:
# Save results in both CSV and JSON formats
output_dir = 'results'
os.makedirs(output_dir, exist_ok=True)

print(f"Saving results to {output_dir}...\n")

results_both = score_fcm(
    fcm1_path=fcm1_path,
    fcm2_path=fcm2_path,
    output_dir=output_dir,
    output_format='both',
    verbose=False
)

print("Results saved in both CSV and JSON formats")

In [None]:
# List saved files
import glob

print("Saved files:")
for filepath in glob.glob(os.path.join(output_dir, '*_scoring_results*')):
    filename = os.path.basename(filepath)
    file_size = os.path.getsize(filepath)
    print(f"  - {filename} ({file_size} bytes)")

## 7. Working with Custom FCM Data

In [None]:
# Example: Create a simple custom FCM
custom_fcm = pd.DataFrame(
    {
        'variable_A': [0, 0.8, -0.5],
        'variable_B': [0.7, 0, 0.6],
        'variable_C': [-0.4, 0.9, 0]
    },
    index=['variable_A', 'variable_B', 'variable_C']
)

print("Custom FCM:")
print(custom_fcm)

# Save it as CSV
custom_csv_path = 'custom_fcm.csv'
custom_fcm.to_csv(custom_csv_path)
print(f"\nSaved to {custom_csv_path}")

In [None]:
# Convert FCM matrix to JSON format
custom_json = matrix_to_json(custom_fcm)

print("\nCustom FCM as JSON:")
print(json.dumps(custom_json, indent=2))

# Save it as JSON
custom_json_path = 'custom_fcm.json'
with open(custom_json_path, 'w') as f:
    json.dump(custom_json, f, indent=2)
print(f"\nSaved to {custom_json_path}")
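
Before scoring a hand-built map like `custom_fcm`, it can help to run a few sanity checks on the matrix. The sketch below is a hypothetical validator, not part of `score_fcms`: it checks that the matrix is square and matches the node list, and that every weight lies in [-1, 1], the usual range for FCM edge weights. Whether `load_matrix_from_file` enforces any of this is not shown in this notebook.

```python
def validate_fcm(nodes, matrix):
    """Return a list of problems found in an adjacency matrix (empty if none)."""
    problems = []
    n = len(nodes)
    if len(matrix) != n or any(len(row) != n for row in matrix):
        problems.append("matrix is not square or does not match the node list")
        return problems
    for i, row in enumerate(matrix):
        for j, weight in enumerate(row):
            if not -1 <= weight <= 1:
                problems.append(
                    f"weight {weight} for {nodes[i]} -> {nodes[j]} is outside [-1, 1]"
                )
    return problems

nodes = ["variable_A", "variable_B", "variable_C"]
good = [[0, 0.7, -0.4], [0.8, 0, 0.9], [-0.5, 0.6, 0]]
bad = [[0, 1.5, -0.4], [0.8, 0, 0.9], [-0.5, 0.6, 0]]

print(validate_fcm(nodes, good))
print(validate_fcm(nodes, bad))
```

With a pandas DataFrame such as `custom_fcm`, the same function can be called as `validate_fcm(list(custom_fcm.index), custom_fcm.values.tolist())`.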

## 8. Summary

This walkthrough demonstrated:

1. **Loading FCM data** from CSV adjacency matrices (JSON edge lists are also supported)
2. **Basic scoring** with default parameters
3. **Parameter tuning** by testing different thresholds
4. **Result interpretation** - understanding TP, PP, FP, FN metrics
5. **Flexible output** - saving results in CSV and/or JSON
6. **Format conversion** - working with custom FCM data

### Key Takeaways:
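- **Threshold tuning** matters: here F1 holds at 0.8018 for thresholds 0.5 to 0.7 and falls to 0 at 0.8 and 0.9
- **F1 and Jaccard scores** summarize the same edge matches on different scales: 0.8018 versus 0.6786 at the default threshold
- **Edge counts matter** - FCM2 has 29 edges against FCM1's 20, and the 9 unmatched FCM2 edges are counted as false positives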
- **Flexible I/O** - use CSV for matrices, JSON for edge lists