# Phase 3: Semantic Subspace Analysis

This notebook analyzes the outputs of the Phase 3 pipeline, focusing on:
1. **Dimensionality (K)**: Evolution of intrinsic semantic complexity.
2. **Semantic Drift**: Grassmannian distance between temporal subspaces.
3. **Semantic Entropy**: Internal diversity of meanings.
4. **Frame Projections**: Alignment with sociological anchors (Function, Affect, Social).

**Input**: `data/phase3_redesign` (Parquet files).
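
For readers unfamiliar with the drift metric in item 2, the following is a minimal sketch of one common Grassmannian (geodesic) distance, computed from the principal angles between two subspaces. It is an illustration only: the function name `grassmann_distance` and the `(n_features, k)` basis layout are assumptions, and the Phase 3 pipeline may use a different variant (for example a chordal or projection distance).

```python
import numpy as np


def grassmann_distance(A: np.ndarray, B: np.ndarray) -> float:
    """Geodesic distance between the column spaces of A and B.

    A and B are (n_features, k) matrices with linearly independent columns
    (a hypothetical layout; the pipeline's own storage format is not shown here).
    """
    # Orthonormalize each basis so the singular values below are cosines.
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    # Singular values of Qa^T Qb are the cosines of the principal angles.
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    # Clip to guard against round-off pushing values slightly outside [0, 1].
    angles = np.arccos(np.clip(cosines, 0.0, 1.0))
    # The geodesic distance is the Euclidean norm of the principal angles.
    return float(np.linalg.norm(angles))
```

If two windows have different K, the SVD returns only `min(k_a, k_b)` principal angles, so the distance ignores the extra dimensions of the larger subspace. Whether that is appropriate depends on how the pipeline treats changes in K.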

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import os

# Config
DATA_DIR = '../data/phase3_redesign'
SUFFIX = '_v3_final'

# Load Data
try:
    df_k = pd.read_parquet(os.path.join(DATA_DIR, f'window_dimensionality{SUFFIX}.parquet'))
    df_drift = pd.read_parquet(os.path.join(DATA_DIR, f'semantic_drift_timeseries{SUFFIX}.parquet'))
    df_entropy = pd.read_parquet(os.path.join(DATA_DIR, f'semantic_entropy_timeseries{SUFFIX}.parquet'))
    df_proj = pd.read_parquet(os.path.join(DATA_DIR, f'frame_projections{SUFFIX}.parquet'))
    print("Data loaded successfully.")
except FileNotFoundError as e:
    print(f"Error loading data: {e}")
    print("Ensure the pipeline has finished running.")
    raise  # Stop here: the cells below need all four DataFrames

## 1. Dimensionality Evolution (K)

In [None]:
plt.figure(figsize=(12, 4))
plt.plot(pd.to_datetime(df_k['window_start']), df_k['k_optimal'], marker='o', label='Optimal K (Horn)')
plt.xlabel('Date')
plt.ylabel('K (Dimensionality)')
plt.title('Evolution of Semantic Dimensionality')
plt.grid(True, alpha=0.3)
plt.legend()
plt.show()
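
The `Optimal K (Horn)` label suggests that K is chosen with Horn's parallel analysis. As a rough sketch of that technique (not the pipeline's actual code), the function below keeps the leading components whose correlation-matrix eigenvalues exceed those obtained from random noise of the same shape. The name `parallel_analysis_k` and the defaults for `n_iter`, `percentile`, and `seed` are assumptions made for illustration.

```python
import numpy as np


def parallel_analysis_k(X: np.ndarray, n_iter: int = 100,
                        percentile: float = 95.0, seed: int = 0) -> int:
    """Estimate how many components to retain via Horn's parallel analysis.

    X is an (n_samples, n_features) matrix. n_iter, percentile and seed are
    illustrative defaults, not values taken from the pipeline.
    """
    n, p = X.shape
    rng = np.random.default_rng(seed)

    # Eigenvalues of the observed correlation matrix, largest first.
    observed = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))[::-1]

    # Eigenvalues from uncorrelated Gaussian noise of the same shape.
    random_eigs = np.empty((n_iter, p))
    for i in range(n_iter):
        noise = rng.standard_normal((n, p))
        random_eigs[i] = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))[::-1]
    threshold = np.percentile(random_eigs, percentile, axis=0)

    # Retain leading components until one fails to beat its noise threshold.
    exceeds = observed > threshold
    return p if exceeds.all() else int(np.argmin(exceeds))
```

Using a high percentile of the noise eigenvalues rather than their mean is a common, slightly more conservative variant. Which one produced `k_optimal` here is not stated in this notebook.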

## 2. Semantic Drift & Entropy

In [None]:
fig, ax1 = plt.subplots(figsize=(12, 5))

color = 'tab:red'
ax1.set_xlabel('Date')
ax1.set_ylabel('Semantic Drift', color=color)
ax1.plot(pd.to_datetime(df_drift['window_start']), df_drift['drift'], color=color, label='Drift')
ax1.tick_params(axis='y', labelcolor=color)

ax2 = ax1.twinx()  
color = 'tab:blue'
ax2.set_ylabel('Semantic Entropy', color=color)  
ax2.plot(pd.to_datetime(df_entropy['window_start']), df_entropy['entropy'], color=color, linestyle='--', label='Entropy')
ax2.tick_params(axis='y', labelcolor=color)

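# Merge handles from both axes so the 'Drift' and 'Entropy' labels are actually shown
lines1, labels1 = ax1.get_legend_handles_labels()
lines2, labels2 = ax2.get_legend_handles_labels()
ax2.legend(lines1 + lines2, labels1 + labels2, loc='upper left')

plt.title('Semantic Drift vs Entropy')
ax1.grid(True, alpha=0.3)
plt.show()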

## 3. Sociological Frame Projections

In [None]:
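# Melt to long format for plotting (assumes every column other than window_start is a frame projection)
df_proj_melt = df_proj.melt(id_vars='window_start', var_name='Dimension', value_name='Projection')
# Parse dates so the x-axis is temporal, as in the plots above
df_proj_melt['window_start'] = pd.to_datetime(df_proj_melt['window_start'])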

plt.figure(figsize=(14, 6))
sns.lineplot(data=df_proj_melt, x='window_start', y='Projection', hue='Dimension')
plt.title('Projection of Yape Meaning onto Sociological Frames')
plt.xticks(rotation=45)
plt.grid(True, alpha=0.3)
plt.show()