
Create Threat Model and Discuss RAG with its security risks for LLM #241

jsotiro opened this issue Nov 4, 2023 · 1 comment

jsotiro (Collaborator) commented Nov 4, 2023

Retrieval augmented generation (RAG) is a technique for enriching LLMs with an app's or organization's own data. It has become very popular because it lowers the barrier to entry for enriching input in LLM apps, allows for better access controls than fine-tuning, and is known to reduce hallucination (see https://www.securityweek.com/vector-embeddings-antidote-to-psychotic-llms-and-a-cure-for-alert-fatigue/); see also the excellent Samsung paper on enterprise use of GenAI and the role of RAG.

Currently we have occasional references to RAG, but we should create a threat model, discuss RAG in its own right, and assess its impact on the LLM Top 10. As RAG becomes a distinct enterprise pattern, it also creates its own security risks and expands the attack surface with the following (a minimal flow sketch follows the list):

  • a second process for generating embeddings, which can vary in complexity and can use the main or a secondary LLM
  • a vector database
  • the use of embeddings in prompts and responses
  • the use of other services (e.g. Azure Cognitive Services) or specialised plugins, for instance the OpenAI ChatGPT retrieval plugin for semantic search (https://github.com/openai/chatgpt-retrieval-plugin)
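
To make that attack surface concrete, here is a minimal, hypothetical sketch of a RAG flow (not based on any particular vendor SDK): `embed` and `llm_complete` are placeholder callables standing in for the embedding process and the main LLM call, and the in-memory list stands in for the vector database. The point is to show where retrieved, potentially untrusted data enters the prompt.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Standard cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve(query: str, index: list[tuple[str, np.ndarray]], embed, k: int = 3) -> list[str]:
    # "Vector database" step: rank stored chunks by similarity to the query embedding.
    # `index` is a list of (chunk_text, chunk_embedding) pairs produced by the
    # separate embedding-generation process mentioned above.
    q_vec = embed(query)
    ranked = sorted(index, key=lambda item: cosine_similarity(q_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

def rag_answer(query: str, index, embed, llm_complete) -> str:
    # Retrieved chunks are spliced directly into the prompt, so stored data
    # (which may come from untrusted sources) reaches the model here; this is
    # an indirect prompt-injection path that a RAG threat model should cover.
    context = "\n".join(retrieve(query, index, embed))
    prompt = (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )
    return llm_complete(prompt)
```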

Some useful links

Architectural approaches
Azure: https://github.com/Azure/GPT-RAG
AWS SageMaker: https://docs.aws.amazon.com/sagemaker/latest/dg/jumpstart-foundation-models-customize-rag.html
AWS Bedrock RAG workshop: https://github.com/aws-samples/amazon-bedrock-rag-workshop

Security concerns
Security of AI Embeddings explained
Anonymity at Risk? Assessing Re-Identification Capabilities of Large Language Models
Embedding Layer: AI (Brace For These Hidden GPT Dangers)

@jsotiro added the discuss and v2 labels on Nov 4, 2023
@GangGreenTemperTatum self-assigned this on Apr 14, 2024
@GangGreenTemperTatum (Collaborator) commented:
heya @jsotiro, doing some housekeeping on the repo. Did you get a chance to bring this up, and is anything still outstanding? I also have this issue, which I believe basically supersedes this one, since I also mention different API architectures and mediums there.

@GangGreenTemperTatum added the diagram label on Apr 14, 2024