Semantic Tensor Analysis (STA) sits on top of your embeddings or vector store and gives you temporal drift, trajectories, and token-level alignment, with no custom infra required. It keeps token-level detail alongside session-level summaries so you can inspect drift without losing context.
- Token- and session-level embeddings (BERT + Sentence-BERT) in one pipeline
- Drift metrics and clustering for ordered text sessions (CSV/JSON/TXT)
- Visual explanations (PCA, heatmaps, trajectories) tailored to time-ordered data
- Domain presets for clinical notes, learning progress, research logs, and conversations
- Researchers tracking concept drift over time
- Clinicians / ABA teams monitoring progress across notes
- Anyone with time-stamped text who wants more than cosine similarity
- Dual-resolution memory: token-level (BERT) + sequence-level (SBERT) stored together for Hungarian token alignment, token drift heatmaps, and session trajectories without re-embedding.
- Ragged, mask-aware analytics: pad/stack/flatten utilities consistently handle variable-length sessions across PCA, clustering, and trajectories, with no silent truncation.
- Temporal semantics first: velocity/acceleration of meaning, inflection-point cues, and multi-view trajectories for ordered text (not just static similarity).
- Concept evolution with alignment: session clustering + transition graphs plus token alignment to show what moved and how.
- Vision grounding for charts: server-side Plotly→PNG snapshots fed to a local vision GGUF (llama.cpp); graceful fallback to text-only if vision isn't available.
- Storage hygiene: built-in storage stats/cleanup (sidebar + CLI), CPU-portable persistence.
- Grounded LLM context: prompts reuse analysis context (clusters, PCA axes, drift) instead of generic summaries.
STA tracks meaning at two resolutions:
- Token-level (BERT): follow individual concept drift
- Session-level (Sentence-BERT): follow overall semantic movement
Both are kept so you can align tokens while also inspecting higher-level trajectories.
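For intuition, here is a minimal sketch of how the two resolutions can be produced for a single session. The model checkpoints (`bert-base-uncased`, `all-mpnet-base-v2`) and the use of the `sentence-transformers` package are assumptions for illustration, not necessarily what STA ships with.

```python
# Sketch (not STA internals): dual-resolution embeddings for one session.
# Model choices and the sentence-transformers dependency are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModel
from sentence_transformers import SentenceTransformer

text = "Client showed fewer tantrums and initiated two peer interactions."

# Token-level: one 768-dim vector per token (BERT last hidden state)
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")
with torch.no_grad():
    inputs = tokenizer(text, return_tensors="pt")
    token_embeddings = bert(**inputs).last_hidden_state[0]   # [n_tokens, 768]

# Session-level: a single 768-dim vector for the whole note (Sentence-BERT)
sbert = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")
sequence_embedding = sbert.encode(text)                       # [768]

print(token_embeddings.shape, sequence_embedding.shape)
```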
Scenario: ABA Therapy Progress Tracking
You have 30 therapy session notes for a patient over 6 months:
```python
# Load sessions
store = UniversalMemoryStore()
for note in therapy_notes:
    store.add_session(note)

# Run analysis
```

STA automatically reveals:
- **Semantic Trajectory** (`evolution_tab`)
  - "Patient meaning shifted from cluster 'behavioral challenges' (weeks 1-8) to 'skill acquisition' (weeks 9-20) to 'generalization' (weeks 21-30)"
  - Velocity graph shows rapid progress in weeks 12-15, plateau in weeks 22-26
- **Token-Level Drift** (`token_alignment_heatmap`)
  - Words that appeared/disappeared: "tantrum" (high early, faded), "independence" (emerged week 10), "peer interaction" (emerged week 18)
  - Optimal alignment shows which specific concepts persisted vs. transformed
- **Concept Evolution** (`concepts_tab`)
  - KMeans identifies 4 semantic clusters: "regulation struggles", "skill building", "social engagement", "mastery"
  - Transition graph shows the patient moved through clusters sequentially, with a brief regression in week 23
- **Inflection Points** (`trajectory_computation`)
  - Week 12: acceleration spike (breakthrough moment)
  - Week 23: temporary deceleration (regression or plateau)
  - Week 28: final acceleration (consolidation phase)
- **PCA Narrative** (`dimensionality_tab` + LLM)
  - "PC1 (43% variance) represents 'independence vs. support needs'"
  - "PC2 (28% variance) represents 'emotional regulation vs. dysregulation'"
  - "Patient trajectory: moved positively along PC1 while PC2 oscillated, then stabilized"
- **Domain-Aware Insights** (`AI_insights_tab`)
  - "Based on the 6-month span, this represents a typical ABA intensive phase"
  - "The regression in week 23 aligns with expected variance in skill acquisition"
  - "Recommend: continue current approach, monitor for sustained generalization"
All of this from just uploading a CSV. No custom code, no manual analysis.
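As a rough end-to-end sketch of that workflow: the `text`/`date` column names and the import path below are assumptions for illustration, while `UniversalMemoryStore.add_session` is used exactly as in the snippet above.

```python
# Sketch only: load session notes from a CSV in date order and add them to the store.
# Import path and column names are assumptions, not the documented API.
import pandas as pd
from semantic_tensor_analysis.memory.store import UniversalMemoryStore

notes = pd.read_csv("aba_therapy_dataset.csv").sort_values("date")

store = UniversalMemoryStore()
for text in notes["text"]:
    store.add_session(text)  # dual-resolution embedding + persistence happen here
```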
- (Optional) Create and activate a venv:

  ```bash
  python -m venv venv
  source venv/bin/activate
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```
- Additional system requirement: for the CLI CSV import feature, ensure your Python installation includes `tkinter` (standard on most desktop Python distributions).
- Start the Streamlit app:

  ```bash
  streamlit run app.py
  ```

  - On first load, the sidebar opens to let you upload a CSV. After upload, the sidebar stays minimized for more canvas space.
  - Try it with `ultimate_demo_dataset.csv` or `aba_therapy_dataset.csv` in the repo root.
- Interactive CLI demo (optional):

  ```bash
  python demo.py
  ```

  - Type sentences to build memory; use `import` to load a CSV (requires `tkinter`), `plot` for PCA/heatmap, `drift` for metrics, `tokens` for token-level drift, and `exit` to quit.
- `app.py`: Streamlit web app (tabs: Overview, Evolution, Patterns, Dimensionality, Concepts, Explain, AI Insights); wires sidebar chat and loaders.
- `src/semantic_tensor_analysis/app/`: App modules (`main.py`, `tabs/`, `sidebar.py`, `sidebar_chat.py`, `temporal_visualizations.py`, assets/config).
- `src/semantic_tensor_analysis/memory/`: Core STM types (`universal_core.py`), text embedder (`text_embedder.py`), drift (`drift.py`, `sequence_drift.py`), storage (`store.py`), legacy shim modules that forward to `archive/legacy_embedders/` when explicitly enabled.
- `src/semantic_tensor_analysis/storage/`: Storage manager/stats/cleanup utilities (`manager.py`).
- `src/semantic_tensor_analysis/streamlit/`: Streamlit helpers (`utils.py`, `plots.py`) used across tabs.
- `src/semantic_tensor_analysis/analytics/`: Tensor batching, dimensionality, trajectories, and concept analytics.
- `src/semantic_tensor_analysis/visualization/`: Plotting backends (`viz/`, `tools/`, Streamlit-facing `plots.py`).
- `src/semantic_tensor_analysis/chat/`: LLM integration (`llama_cpp_analyzer.py`, `unified_analyzer.py`, insights in `analysis.py`, history parsing).
- `src/semantic_tensor_analysis/demos/`: CLI demos.
- `archive/legacy_embedders/`: Archived embedders kept for compatibility only.
- `data/`: Demo CSVs (`ultimate_demo_dataset.csv`, `aba_therapy_dataset.csv`).
- `tests/`: Test suite.
- `pyproject.toml`: Package metadata/dependencies.
- `examples/aba_progress.ipynb`: Load ABA demo CSV → embed via API → quick PCA view.
- `examples/finance_narrative.ipynb`: Embed narrative CSV → run concept clustering → inspect clusters.
Open in Jupyter/VS Code and run locally; both use the STA API (no Streamlit).
- Mask-aware batching for variable-length sessions (`pad_and_stack`, `masked_session_means`)
- Token-level drift with Hungarian alignment and token importance drift
- Trajectories with velocity/acceleration to spot rapid semantic shifts
- PCA + clustering over ordered sessions for broad patterns and transitions (see the sklearn sketch after these lists)
- PCA timelines and 3D trajectories
- Similarity and token-alignment heatmaps
- Concept evolution and transition graphs
- Ridgeline/distribution views
- Trajectory tunnel (experimental) for long-run drift
- Token + sentence embeddings kept together for downstream prompts
- Domain-aware summaries (clinical, learning, research, conversations)
- Axis interpretation for PCA dimensions
- Clinical progress tracking
- Learning/journey mapping
- Research note evolution
- Conversation/topic drift
- Draft/version comparison
- CSV/JSON/TXT ingestion
- Persistent storage (CPU-portable)
- Session state management in Streamlit
- Test suite coverage across embedding, storage, and viz
- CLI demo for fast iteration
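To ground the PCA + clustering bullets above, here is a minimal sklearn sketch over per-session mean embeddings; random placeholders stand in for real data, and it mirrors the idea rather than STA's internal code.

```python
# Sketch: PCA timeline coordinates and KMeans concept clusters from session means.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

session_means = np.random.rand(30, 768)  # placeholder: 30 sessions x 768 dims

coords = PCA(n_components=2).fit_transform(session_means)            # 2-D timeline positions
labels = KMeans(n_clusters=4, n_init=10).fit_predict(session_means)  # concept cluster per session

for i, ((x, y), c) in enumerate(zip(coords, labels)):
    print(f"session {i:02d}: PC1={x:+.2f} PC2={y:+.2f} cluster={c}")
```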
- `ultimate_demo_dataset.csv`: High-quality demo with clear trajectories and richer, longer texts.
- `aba_therapy_dataset.csv`: ABA-specific schema/content; extended to a larger set for the same client.
Upload either via the Streamlit sidebar to explore the full suite of analyses.
Expected columns (typical): `session_id`, `date`, `title` (optional), `text`.
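For example, a minimal conforming CSV could be produced like this (the content is invented purely to illustrate the schema):

```python
# Sketch: write a tiny CSV in the expected session schema.
import pandas as pd

pd.DataFrame({
    "session_id": [1, 2],
    "date": ["2024-01-05", "2024-01-12"],
    "title": ["Intake", "Week 2"],
    "text": [
        "Client struggled with transitions and had two tantrums.",
        "Fewer tantrums; client initiated one peer interaction.",
    ],
}).to_csv("my_sessions.csv", index=False)
```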
STA uses llama.cpp as the default backend (the sidebar is auto-configured to `http://localhost:8080` with model `local`); the Ollama UI is deprecated.
Advantages: Faster inference, lower memory footprint, vision support with the right GGUF.
- Install llama-cpp-python:

  ```bash
  pip install llama-cpp-python
  ```

- Download a GGUF model:
  - Vision (Apple M4/16GB): `Qwen/Qwen3-VL-4B-Instruct-GGUF` (e.g., Q4_0 or Q4_K_M).
  - Text-only: 4-8B Q4/Q5 GGUFs (Mistral-7B, Llama-3-8B, Qwen2-7B, Phi-3-Mini) work well.

- Run `llama-server`:

  ```bash
  ./server -m /path/to/model.gguf -c 4096 --host 0.0.0.0 --port 8080
  ```

- In the app:
  - The sidebar auto-uses llama.cpp at `http://localhost:8080` with model `local`.
  - The vision snapshot button will leverage a vision-capable GGUF if provided.
  - For vision models (e.g., Qwen3-VL), start llama-server with both model and projector, e.g.:

    ```bash
    llama-server \
      -m /path/to/Qwen3VL-8B-Instruct-Q4_K_M.gguf \
      --mmproj /path/to/mmproj-Qwen3-VL-8B-Instruct-Q8_0.gguf \
      --port 8080 --ctx-size 5000
    ```
You can use STA without any LLM backend. The core analysis and visualizations work independently. You'll just miss the AI-generated narrative insights.
- Drift alerts
- Sentence search
- HTML dashboard
- Enhanced multimodal support
- Clinical applications
- The `venv/` directory is excluded from git and should not be committed.
- LLM Integration: STA supports two LLM backends:
  - llama.cpp (recommended): use local GGUF models for faster, memory-efficient inference
  - Ollama: traditional Ollama server with model management
- The Streamlit app renders Matplotlib figures inline; no external windows will block interaction.
- Key dependencies: `torch`, `transformers`, `scikit-learn`, `plotly`, `streamlit`, `pandas`, `numpy`, `rich`, `requests`, `llama-cpp-python`.
- tkinter (for file browser): usually pre-installed with Python. On Linux, install with `sudo apt-get install python3-tk` if needed.
- Storage: session files are stored under `data/universal/`. Check sidebar storage stats and use the cleanup expander to prune old sessions; a CLI is available via `python -m semantic_tensor_analysis.storage.manager --stats` plus cleanup options.
If you use this codebase or ideas in your research, please cite the accompanying paper or link to this repository.
This section maps the semantic-tensor-memory.tex write-up (and associated PDF) to the codebase. It documents feature completeness and correspondence.
- The paper/TeX describes the motivation, architecture, algorithms, applications, and limitations of STM.
- The codebase implements STA with ragged tensor handling, dual-resolution embeddings, token alignment, and domain-aware LLM interpretation.
| Area | Paper Coverage | Codebase Coverage | Notes |
|---|---|---|---|
| STA Architecture | Yes | Yes | Aligned; dynamic dims and ragged sequences implemented. |
| Data Import | Yes | Yes | CSV upload in Streamlit; CLI import with tkinter. |
| Visualization | Yes | Yes | PCA, heatmaps, token alignment, token trajectories. |
| LLM Integration | Yes | Yes | Axis Explainer; domain-aware insights with time scale. |
| Applications | Yes | Yes | ABA and general datasets provided. |
| Example Analysis | Yes | Yes | Demo datasets included. |
| Limitations/Future | Yes | Partial | Multimodal audio, alerts, streaming, storage optimizations. |
| UI/CLI Details | Brief | Yes | More detail in codebase/README than in paper. |
| Figures | Yes | Yes | All figures rendered inline in app; assets can be saved. |
- All major features and analyses described in the paper are implemented.
- The code includes practical details (CLI commands, Streamlit UI) beyond the paper.
- Remaining roadmap items: audio modality, drift alerts/governance, streaming ingestion, storage efficiency, and expanded tests/CI.
STA operates on sessions (temporal snapshots containing variable-length sequences), not individual vectors:
```python
# A session is a variable-length sequence
session = UniversalEmbedding(
    event_embeddings=[token_1_emb, token_2_emb, ..., token_n_emb],  # n varies per session
    sequence_embedding=session_mean,  # Holistic meaning
    events=[EventDescriptor(...), ...]  # Token metadata
)

# Sessions vary in length:
# session_1: [100 tokens × 768 dims]
# session_2: [237 tokens × 768 dims]
# session_3: [89 tokens × 768 dims]
```

This enables dual-resolution analysis: zoom into token-level details or analyze session-level trends.
The key innovation for handling variable-length sessions:
```python
from semantic_tensor_analysis.analytics.tensor_batching import (
    pad_and_stack,
    masked_session_means,
    flatten_with_mask,
)

# Convert ragged sequences to a batched tensor
sessions_tensor, mask = pad_and_stack(sessions)
# Shape: [3, 237, 768] (padded to max length = 237)
# Mask:  [3, 237] boolean (False = padding, ignored in computation)

# Compute session-level statistics (ignoring padding)
session_means = masked_session_means(sessions_tensor, mask)
# Shape: [3, 768] - one mean per session

# Flatten to token level with provenance tracking
flat_tokens, session_ids, token_ids = flatten_with_mask(sessions_tensor, mask)
# flat_tokens: [426, 768] (100 + 237 + 89 tokens total)
# session_ids: [426] (which session each token came from)
# token_ids:   [426] (position within session)
```

Why this matters:
- Padding doesn't corrupt statistics (masked operations)
- Can analyze at session OR token granularity seamlessly
- Enables optimal token alignment across sessions (Hungarian algorithm; sketched below)
- PCA can operate on all tokens while preserving session boundaries
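The Hungarian alignment step can be sketched with SciPy's `linear_sum_assignment` (SciPy is pulled in alongside scikit-learn); this mirrors the idea rather than reproducing the implementation in `sequence_drift.py`.

```python
# Sketch: align tokens of two sessions by maximizing total cosine similarity.
import numpy as np
from scipy.optimize import linear_sum_assignment

tokens_a = np.random.rand(100, 768)  # placeholder token embeddings, session A
tokens_b = np.random.rand(89, 768)   # placeholder token embeddings, session B

# Cosine similarity matrix [100, 89]
a = tokens_a / np.linalg.norm(tokens_a, axis=1, keepdims=True)
b = tokens_b / np.linalg.norm(tokens_b, axis=1, keepdims=True)
sim = a @ b.T

# The Hungarian algorithm minimizes cost, so negate the similarities
rows, cols = linear_sum_assignment(-sim)
for i, j in zip(rows[:5], cols[:5]):
    print(f"token {i} in A <-> token {j} in B (cos={sim[i, j]:.2f})")
```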
```
CSV/JSON/TXT
    ↓
Text Embedding (dual-resolution)
    ├─ Token embeddings [n_tokens, 768] via BERT
    └─ Sequence embedding [768] via Sentence-BERT
    ↓
Session Creation (UniversalEmbedding)
    ↓
Storage (UniversalMemoryStore)
    ↓
Ragged Tensor Batching (pad_and_stack)
    ↓
Global Analysis
    ├─ PCA across all sessions/tokens
    ├─ Concept clustering (KMeans)
    ├─ Token alignment (Hungarian)
    └─ Drift computation (cosine distance)
    ↓
Visualization
    ├─ Temporal trajectories (velocity, acceleration)
    ├─ Heatmaps (session similarity, token alignment)
    ├─ 3D semantic space (PCA projection)
    └─ Concept evolution graphs
    ↓
Optional: LLM narrative generation (Ollama)
```
The key insight: Operations are across sessions (temporal), not within a database (spatial).
You can! STA essentially packages what you'd build in a research notebook into a reusable framework:
Without STA:
```python
# You'd need to implement:
# - Dual BERT + S-BERT embedding pipeline
# - Ragged tensor padding and masking logic
# - Hungarian algorithm for token alignment
# - Drift velocity/acceleration computation
# - 10+ specialized visualization functions
# - Domain-adaptive prompts for LLM analysis
# - Streamlit UI for interactive exploration
```

With STA:
- Streamlit UI for interactive explorationWith STA:
```python
# Just load your data
store = UniversalMemoryStore()
for session in sessions:
    store.add_session(session)

# Everything else is ready to use
```

STA saves you from re-implementing this infrastructure for every temporal semantic analysis project.
Great tools, different purposes:
| Feature | LangSmith | W&B | STA |
|---|---|---|---|
| Conversation tracking | ✅ Excellent | ❌ | ❌ |
| Metric dashboards | ❌ | ✅ Excellent | ❌ |
| Semantic drift analysis | ❌ | ❌ | ✅ Token + session level |
| Token alignment | ❌ | ❌ | ✅ Hungarian algorithm |
| Trajectory computation | ❌ | ❌ | ✅ Velocity, acceleration |
| Domain-specific workflows | ❌ | ❌ | ✅ Clinical, learning, research |
Use LangSmith/W&B for production monitoring. Use STA for deep temporal semantic analysis.
Simple similarity misses temporal patterns:
```python
# Simple approach: pairwise similarity
similarity(session_1, session_2)  # → 0.87
similarity(session_2, session_3)  # → 0.82

# STA approach: temporal dynamics
velocity = compute_drift_velocity([session_1, session_2, session_3])
# → [0.13, 0.18] (change is accelerating)

inflection_points = detect_rapid_shifts(velocity)
# → [session_5, session_12] (when meaning changed rapidly)

token_drift = token_importance_drift(session_1, session_3)
# → ["anxiety": high drift, "coping": low drift] (which concepts changed)
```

STA provides the calculus of semantic change, not just static snapshots.
Session-based is intentional for certain domains:
- Clinical notes: Each therapy session is a natural boundary
- Learning journeys: Each lesson/assignment is discrete
- Research evolution: Each draft/experiment is a snapshot
- Meeting summaries: Each meeting is a unit of analysis
Future work: STA could support streaming by defining windows, but sessions align with how many domains naturally structure temporal data.
- Consecutive/session-pair alignment via Hungarian algorithm (in `sequence_drift.py`).
- Visualize with `viz.heatmap.token_alignment_heatmap` (returns a Matplotlib Figure; rendered inline in Streamlit).
- Prompts in `chat_analysis.py` infer domain and an appropriate time scale (days/weeks/months/quarters) from the dataset date span.
- Explain tab uses `AnalysisExplanation` fields: `what_it_means`, `why_these_results`, `what_to_do_next` (sketched below).
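The real `AnalysisExplanation` lives in the chat module; as a sketch, its shape is roughly a record with the three fields named above (the class below is hypothetical and only illustrates the structure).

```python
# Sketch only: the shape of an explanation record (field names match the bullets above).
from dataclasses import dataclass

@dataclass
class ExplanationSketch:
    what_it_means: str
    why_these_results: str
    what_to_do_next: str

example = ExplanationSketch(
    what_it_means="Sessions drifted from 'regulation struggles' toward 'skill building'.",
    why_these_results="PC1 separates support needs; later sessions score higher on it.",
    what_to_do_next="Review the week-23 dip and confirm the plateau with the care team.",
)
```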
- Port 8501 in use: `lsof -ti:8501 | xargs -r kill -9`
- Ollama not running: install/start Ollama and pull a model (e.g., `qwen3:latest`).
- PyTorch view/reshape error: the PCA pipeline uses `.reshape(...)` and contiguous tensors in `tensor_batching.py`.
- `pytest` not found: install via `pip install pytest` or use the app directly.