The GlobalCart Intelligence Engine is a production-grade Retrieval-Augmented Generation (RAG) architecture built to solve complex constraints in large-scale international retail.
Querying massive, unstructured retail databases often results in hallucinations, cross-regional data contamination, and internal data leaks. This engine mitigates those risks through Hard Metadata Filtering, Hybrid Search, and Implicit Data Masking.
-
Strict Regional Integrity (No Cross-Contamination)
When a user queries the database from a specific region (e.g., Ghana), the system applies a hard filter to the Pinecone vector database (filter={"country": {"$eq": country_code}}). It is mathematically impossible for the retrieval engine to return prices, policies, or products meant for a different operational region. -
Implicit Data Masking (Red Team Tested)
Retail datasets contain sensitive PII (Personally Identifiable Information), supplier contacts, and internal profit margins. This system implements a retrieval-layer guardrail that explicitly intercepts and strips out theInternal_Notescolumn from the vector context before it is passed to the LLM. -
Zero Local Overhead
The application is fully abstracted to the cloud to prevent memory exhaustion and allow serverless deployment:- Embeddings: Processed natively via the
pinecone.inference.embedAPI (multilingual-e5-largemodel). - Generation: Handled entirely via the OpenRouter API (
meta-llama/llama-3.1-8b-instruct).
- Embeddings: Processed natively via the
git clone https://github.com/eskayML/globalcart-intelligence-engine.git
cd globalcart-intelligence-engineEnsure you have Python 3.11+ installed.
pip install -r requirements.txtDuplicate the .env.example file and add your API keys:
cp .env.example .envInside .env, populate:
PINECONE_API_KEY="your_pinecone_api_key_here"
OPENROUTER_API_KEY="your_openrouter_api_key_here"Before launching the UI, you must index the raw inventory.csv into Pinecone. Run the seeding script once:
python seed_database.pyRun the Streamlit frontend locally:
streamlit run app.pyContributions are welcome. Please ensure any pull requests involving retrieval mechanics do not bypass the core metadata filters or security guardrails.
This project is licensed under the MIT License.