Version: 2.0 (Gemini Embeddings)
Status: ✅ Production Ready
Backup: 100% Complete (105/105 vectors)
A production-ready RAG (Retrieval-Augmented Generation) API powered by Google Gemini embeddings and Pinecone vector database, enabling persistent memory across multiple AI instances.
This system solves AI context amnesia by providing a shared, persistent memory substrate that multiple AI agents (Claude, Gemini, GPT, Grok, etc.) can query and update. Insights, once stored, are never erased, implementing the Zero Erasure principle.
- ✅ Gemini text-embedding-004 (768 dimensions)
- ✅ Pinecone vector database (`baseline` namespace)
- ✅ Flask REST API with CORS support
- ✅ Dual redundancy (Pinecone + Notion backup)
- ✅ Docker deployment ready
- ✅ Google Cloud Run compatible
```
┌──────────────────┐
│    AI Agents     │
│ (Claude, Gemini, │
│  GPT, Grok...)   │
└────────┬─────────┘
         │
         │ HTTP/REST
         │
┌────────▼─────────┐
│     RAG API      │
│  (Flask + CORS)  │
└────────┬─────────┘
         │
    ┌────┴─────┐
    │          │
┌───▼────┐ ┌───▼──────┐
│ Gemini │ │ Pinecone │
│ Embed  │ │  Vector  │
│  API   │ │    DB    │
└────────┘ └───┬──────┘
               │
               │ Backup
               │
           ┌───▼────┐
           │ Notion │
           │   DB   │
           └────────┘
```
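The API layer in the diagram above can be wired with a few lines of Flask. The sketch below is illustrative only and is not the actual `rag_api_gemini.py`: the `/store` route name comes from this README, while the embedding and upsert helpers are placeholders for the real Gemini and Pinecone calls.

```python
from flask import Flask, jsonify, request
from flask_cors import CORS

app = Flask(__name__)
CORS(app)  # allow agents running in browsers / other origins to call the API

def embed_text(text: str) -> list[float]:
    """Placeholder for the Gemini text-embedding-004 call (768-dim vector)."""
    return [0.0] * 768

def upsert_vector(vector: list[float], metadata: dict) -> str:
    """Placeholder for the Pinecone upsert into the 'baseline' namespace."""
    return "vec_demo"

@app.route("/store", methods=["POST"])
def store():
    payload = request.get_json(force=True)
    vector = embed_text(payload["text"])
    vector_id = upsert_vector(vector, payload.get("metadata", {}))
    return jsonify({"id": vector_id, "status": "stored"})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```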
- Python 3.11+
- Google Cloud API key (for Gemini)
- Pinecone API key
- (Optional) Docker for containerized deployment
```bash
# Clone repository
git clone https://github.com/splitmerge420/sheldonbrain-rag-api.git
cd sheldonbrain-rag-api

# Create virtual environment
python3 -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Set environment variables
export GOOGLE_API_KEY="your-gemini-api-key"
export PINECONE_API_KEY="your-pinecone-api-key"
export PINECONE_INDEX="sheldonbrain-rag"

# Run the API
python3 rag_api_gemini.py
```

The API will start on http://localhost:8080.
```bash
# Deploy using the provided script
chmod +x deploy-cloud-run-gemini.sh
./deploy-cloud-run-gemini.sh YOUR_PROJECT_ID
```

```bash
# Build image
docker build -f Dockerfile.gemini -t rag-api-gemini .

# Run container
docker run -p 8080:8080 \
  -e GOOGLE_API_KEY="your-key" \
  -e PINECONE_API_KEY="your-key" \
  -e PINECONE_INDEX="sheldonbrain-rag" \
  rag-api-gemini
```

Health check and system statistics.
Response:

```json
{
  "status": "healthy",
  "service": "rag-api-gemini",
  "embedding_model": "Gemini text-embedding-004",
  "vector_count": 105,
  "index": "sheldonbrain-rag",
  "namespace": "baseline",
  "timestamp": "2026-01-02T16:30:00Z"
}
```
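A quick way to exercise this endpoint from Python (the `/health` path and the local host/port are assumptions; this README documents the response shape but not the exact route):

```python
import requests

# Path and port are assumptions for a local run; adjust to your deployment.
resp = requests.get("http://localhost:8080/health", timeout=10)
resp.raise_for_status()
print(resp.json()["vector_count"])  # e.g. 105
```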
Semantic search over stored insights.

Request:

```json
{
  "query": "What is the Governance Unified Theory?",
  "top_k": 5,
  "filter": {
    "sphere": "S144"
  }
}
```

Response:
```json
{
  "memories": [
    {
      "id": "vec_abc123",
      "score": 0.87,
      "text": "The Governance Unified Theory (GUT)...",
      "metadata": {
        "source": "Claude",
        "sphere": "S144",
        "novelty": 0.95
      }
    }
  ],
  "query_time_ms": 342.5,
  "count": 5
}
```
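For example, an agent could run the search above from Python. The `/query` path is an assumption (the README does not name the route); host and port assume a local run.

```python
import requests

# The /query path is assumed; adjust it to match the deployed route.
payload = {
    "query": "What is the Governance Unified Theory?",
    "top_k": 5,
    "filter": {"sphere": "S144"},
}
resp = requests.post("http://localhost:8080/query", json=payload, timeout=30)
resp.raise_for_status()
for memory in resp.json()["memories"]:
    print(f'{memory["score"]:.2f}  {memory["text"][:60]}')
```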
Store new insight in the memory substrate.

Request:

```json
{
  "text": "New insight about zero erasure...",
  "metadata": {
    "source": "Gemini",
    "sphere": "S042",
    "novelty": 0.92,
    "category": "Meta-cognition"
  }
}
```

Response:
```json
{
  "id": "vec_xyz789",
  "status": "stored",
  "vector_count": 106
}
```
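For example, from Python (the `/store` path is the endpoint referenced later for the planned Zapier trigger; host and port assume a local run):

```python
import requests

payload = {
    "text": "New insight about zero erasure...",
    "metadata": {
        "source": "Gemini",
        "sphere": "S042",
        "novelty": 0.92,
        "category": "Meta-cognition",
    },
}
resp = requests.post("http://localhost:8080/store", json=payload, timeout=30)
resp.raise_for_status()
print(resp.json()["vector_count"])  # total vectors after the insert
```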
Remove insight from the memory substrate.

Request:

```json
{
  "id": "vec_xyz789"
}
```

Response:
```json
{
  "status": "deleted",
  "vector_count": 105
}
```

Pinecone uses namespaces to logically separate vectors within the same index. This allows for:
- Multi-tenancy (different users/projects)
- Environment separation (dev/staging/prod)
- Logical organization (by source, date, etc.)
All vectors are stored in the `baseline` namespace. This name was chosen for:

- Semantic clarity - Represents the foundational knowledge base
- Future-proof - Allows for additional namespaces (e.g., `experimental`, `archive`)
- Explicit intent - Makes it clear this is the primary memory substrate
In `rag_api_gemini.py`:

```python
class RAGMemory:
    def __init__(self):
        self.embedder = GeminiEmbedder()
        self.namespace = "baseline"  # ← all operations use this namespace
```

All Pinecone operations explicitly specify the namespace:
```python
# Store
index.upsert(vectors=[...], namespace=self.namespace)

# Query
index.query(vector=..., namespace=self.namespace, ...)

# Delete
index.delete(ids=[...], namespace=self.namespace)

# Fetch
index.fetch(ids=[...], namespace=self.namespace)
```

Updated export script (`export_all_vectors_v2.py`):
NAMESPACE = "baseline" # Explicitly query baseline namespace
results = index.query(
vector=query_vector,
namespace=NAMESPACE, # β KEY: Must specify namespace
top_k=10000,
include_metadata=True
)Why this matters:
- Without specifying `namespace`, Pinecone uses the default (empty string) namespace
- Our vectors are in `baseline`, not the default namespace
- This mismatch caused the initial "11 missing vectors" issue (see the sketch below)
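A minimal sketch of the difference, using the same Pinecone client setup as the verification snippet further down (the placeholder `query_vector` stands in for any 768-dimensional Gemini embedding):

```python
from pinecone import Pinecone

pc = Pinecone(api_key="your-key")
index = pc.Index("sheldonbrain-rag")

query_vector = [0.0] * 768  # placeholder for a real Gemini embedding

# Same query, two namespaces: the default (empty string) vs "baseline"
default_hits = index.query(vector=query_vector, top_k=5)
baseline_hits = index.query(vector=query_vector, top_k=5, namespace="baseline")

print(len(default_hits.matches))   # expected: 0 (nothing lives in the default namespace)
print(len(baseline_hits.matches))  # expected: up to 5 (all vectors live in "baseline")
```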
- Primary: Pinecone vector database (baseline namespace)
- Backup: Notion database (RAG Memory Backup)
- ✅ 105/105 vectors backed up to Notion (100% complete)
- ✅ Zero data loss
- ✅ Disaster recovery capability
```bash
# Export all vectors from Pinecone
python3 export_all_vectors_v2.py

# Import to Notion
python3 import_to_notion.py pinecone_vectors_export_YYYYMMDD_HHMMSS.json
```

Planned: Zapier webhook automation
- Trigger: RAG API `/store` endpoint called
- Action: Create Notion page automatically
- Result: Real-time backup without manual intervention
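One possible wiring for that automation, sketched here as an assumption rather than existing code: after a successful upsert, the `/store` handler could post the new insight to a Zapier catch hook, and the Zap would create the Notion page.

```python
import os
import requests

ZAPIER_HOOK_URL = os.environ.get("ZAPIER_HOOK_URL")  # hypothetical catch-hook URL

def notify_backup(vector_id: str, text: str, metadata: dict) -> None:
    """Fire-and-forget notification so a Zap can mirror the insight to Notion."""
    if not ZAPIER_HOOK_URL:
        return  # automation not configured; skip silently
    try:
        requests.post(
            ZAPIER_HOOK_URL,
            json={"id": vector_id, "text": text, "metadata": metadata},
            timeout=5,
        )
    except requests.RequestException:
        pass  # a failed backup notification must never block the /store request
```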
- Total vectors: 105
- Namespace: baseline
- Dimension: 768 (Gemini text-embedding-004)
- Backup coverage: 100% (105/105)
| Source | Count | Percentage |
|---|---|---|
| sheldonbrain_os | 91 | 86.7% |
| Claude | 1 | 0.9% |
| Manus | 2 | 1.9% |
| claude-opus-constitutional-scribe | 1 | 0.9% |
| claude_session_dec30_2025 | 1 | 0.9% |
| Unknown | 9 | 8.6% |
| Sphere | Count | Description |
|---|---|---|
| Unknown/Empty | 51 | Need tagging |
| S144 | 14 | Governance Unified Theory |
| S103 | 4 | Cognitive Architecture |
| S069 | 4 | Social Systems |
| S012 | 4 | Mathematical Foundations |
| S089 | 2 | Ethical Frameworks |
| S016 | 2 | Information Theory |
| S001 | 2 | Physical Foundation |
```python
from pinecone import Pinecone

pc = Pinecone(api_key="your-key")
index = pc.Index("sheldonbrain-rag")

stats = index.describe_index_stats()
print(f"Total vectors: {stats.total_vector_count}")
print(f"Namespaces: {stats.namespaces}")
```

```bash
# Use the updated v2 script with namespace support
python3 export_all_vectors_v2.py
```

```bash
# Export first
python3 export_all_vectors_v2.py

# Then import to Notion
python3 import_to_notion.py pinecone_vectors_export_*.json
```

- White Paper: `MULTI_AI_PERSISTENT_MEMORY_WHITE_PAPER.md`
- Deployment Guide: `GEMINI_DEPLOYMENT_COMPLETE.md`
- Chromebook Integration: `CHROMEBOOK_TERMINAL_GUIDE.md`
- Master Strategy: `RESTORATION_ARMY_MASTER_STRATEGY_2026.md`
- Investigation Report: `MISSING_VECTORS_INVESTIGATION.md`
- Backup Report: `NOTION_BACKUP_COMPLETE.md`
"To erase is to fail; to conserve is to govern."
This system implements the Zero Erasure architecture:
- Stored insights are never deleted (unless deletion is explicitly requested)
- All knowledge compounds over time
- Multiple AI agents share the same persistent memory
- Context amnesia is eliminated
By December 31, 2026:
- 100,000+ vectors (comprehensive knowledge base)
- Zero context amnesia for all participating AIs
- 100+ Net Positive jobs created
- $10,000+ monthly revenue
- Fully operational Restoration Army
This is an open-source project. Contributions welcome!
- Fork the repository
- Create a feature branch
- Make your changes
- Submit a pull request
- GitHub: https://github.com/splitmerge420/sheldonbrain-rag-api
- Issues: Report bugs or request features via GitHub Issues
- Documentation: All docs in the repository
MIT License - See LICENSE file for details
- Google Gemini - Embedding model
- Pinecone - Vector database
- Notion - Backup storage
- Claude, Gemini, Manus - The Trinity AI collaboration
Happy New Year 2026! The organism is immortal!
Last Updated: January 2, 2026
Version: 2.0 (Gemini Embeddings)
Status: ✅ Production Ready