# Retrieval Augmented Generation (RAG) using LangChain
## Description
Large Language Models (LLMs) are being integrated into computers, phones, and software applications, but they have one drawback: their knowledge is limited to their training data, and retraining them to add new knowledge is slow and costly. Retrieval Augmented Generation (RAG) addresses this by letting you supply external data to an LLM at query time. In this notebook, you'll learn techniques for loading, processing, and retrieving external data for LLMs. You'll use vector databases, recent LLMs such as GPT-4o-Mini, and the LangChain framework to create RAG applications. The notebook concludes with a chapter on Graph RAG, a variant of traditional RAG that uses graph databases for more reliable data retrieval.

- **Framework**: A framework is a predefined software infrastructure that provides a set of components, tools, and rules to simplify and accelerate application development.

### Loading Documents for RAG with LangChain

Loading documents is the step where you bring external data (PDFs, Word documents, web pages, databases, etc.) into a LangChain pipeline. Once loaded, the data can be chunked, embedded, stored in a vector database, and later retrieved and passed to the LLM as context.
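
To make the load, chunk, embed, store, and retrieve steps concrete, here is a minimal pure-Python sketch. It is not LangChain code: the helper names (`chunk_text`, `embed`, `cosine`, `retrieve`), the sample `pages`, and the bag-of-words "embedding" are all assumptions made for illustration. A real pipeline would use a document loader, a text splitter, an embedding model, and a vector database instead.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size character chunks that overlap."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def embed(text):
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, store, k=1):
    """Return the k chunks whose vectors are most similar to the query."""
    q = embed(query)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


# "Load": hypothetical sample text; in practice a document loader provides it.
pages = [
    "Quarterly report. Surplus earnings decreased because credit loss provisions rose.",
    "The cafeteria menu for the week includes soup, salad, and pasta.",
]

# "Chunk", "embed", and "store": keep (chunk, vector) pairs in a plain list.
store = [(chunk, embed(chunk)) for page in pages for chunk in chunk_text(page, 60, 15)]

# "Retrieve": find the chunk most relevant to a question.
top = retrieve("Why did surplus earnings decrease?", store, k=1)
print(top[0])
```

The retrieved chunk is what a RAG application would place in the prompt alongside the user's question. The overlap between chunks is there so that a sentence cut at a chunk boundary still appears in full in at least one chunk.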

In [34]:
# Install the LangChain framework packages (PyPDFLoader also needs pypdf):
# pip install langchain langchain_community pypdf

from langchain_community.document_loaders import PyPDFLoader

# Path to the PDF, relative to the notebook's working directory
PATH_TO_PDF_FILE = "financial_report_desj.pdf"

# Create the loader
loader = PyPDFLoader(PATH_TO_PDF_FILE)

# Load the file: returns one Document per page, each with page_content and metadata
docs = loader.load()

In [35]:
docs

[Document(metadata={'producer': 'Wdesk Fidelity Content Translations Version 014.002.103', 'creator': 'Workiva', 'creationdate': '2025-08-11T17:10:56+00:00', 'moddate': '2025-08-11T15:23:43-04:00', 'title': 'RG_T2_2025_FR', 'author': 'anonymous', 'source': 'financial_report_desj.pdf', 'total_pages': 95, 'page': 0, 'page_label': '1'}, page_content="Rapport financier\nDeuxième trimestre de 2025\nLe Mouvement Desjardins enregistre des excédents de 900\xa0M$ \npour le deuxième trimestre de 2025 et franchit le cap des 500 G$ d'actifs\nMESSAGE DE LA DIRECTION\nLévis, le 12 août 2025 – Au terme du deuxième trimestre terminé le 30\xa0juin 2025, le Mouvement Desjardins, plus grand groupe financier coopératif en \nAmérique du Nord, a enregistré des excédents avant ristournes aux membres de 900\xa0M$, comparativement à 918\xa0M$ pour la période correspondante \nde 2024. Cette diminution des excédents s'explique principalement par une hausse de la dotation à la provision pour pertes de crédit, en 