# Loading PDFs

## Using PyPDF

In [1]:
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader("documents/langchain-report.pdf")
pages = loader.load()

In [32]:
pages[0].__dict__

{'id': None,
 'metadata': {'source': 'documents/langchain-report.pdf', 'page': 0},
 'page_content': 'LangChain State of AI 2024 ReportDive into LangSmith product usage patterns that show how the AI ecosystemand the way people are building LLM apps is evolving.BY LANGCHAIN6 MIN READDEC 19, 2024\nSubscribeLangChain State of AI 2024 Reporthttps://blog.langchain.dev/langchain-state-of-ai-2024/\n1 of 1230/01/25, 11:58 AM',
 'type': 'Document'}

In [17]:
content = "\n"
for page in pages:
    content = content + page.page_content

In [18]:
print(content)


LangChain State of AI 2024 ReportDive into LangSmith product usage patterns that show how the AI ecosystemand the way people are building LLM apps is evolving.BY LANGCHAIN6 MIN READDEC 19, 2024
SubscribeLangChain State of AI 2024 Reporthttps://blog.langchain.dev/langchain-state-of-ai-2024/
1 of 1230/01/25, 11:58 AMAnother year of building with LLMs is coming to an end — and 2024 didn’tdisappoint. With nearly 30k users signing up for LangSmith every month, we’relucky to have front row seats to what’s happening in the industry. As we did last year, we want to share some product usage patterns thatshowcase how the AI ecosystem and practice of building LLM apps are evolving.As folks have traced, evaluated, and iterated their way around LangSmith,we’ve seen a few notable changes. These include the dramatic rise of open-source model adoption and a shift from predominantly retrieval workflows to AIagent applications with multi-step, agentic workflows. Dive into the stats below to learn exact

In [19]:
total_content = "\n".join(document.page_content for document in pages)

In [20]:
print(total_content)

LangChain State of AI 2024 ReportDive into LangSmith product usage patterns that show how the AI ecosystemand the way people are building LLM apps is evolving.BY LANGCHAIN6 MIN READDEC 19, 2024
SubscribeLangChain State of AI 2024 Reporthttps://blog.langchain.dev/langchain-state-of-ai-2024/
1 of 1230/01/25, 11:58 AM
Another year of building with LLMs is coming to an end — and 2024 didn’tdisappoint. With nearly 30k users signing up for LangSmith every month, we’relucky to have front row seats to what’s happening in the industry. As we did last year, we want to share some product usage patterns thatshowcase how the AI ecosystem and practice of building LLM apps are evolving.As folks have traced, evaluated, and iterated their way around LangSmith,we’ve seen a few notable changes. These include the dramatic rise of open-source model adoption and a shift from predominantly retrieval workflows to AIagent applications with multi-step, agentic workflows. Dive into the stats below to learn exact
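
The two outputs above differ only at page boundaries: the loop in cell 17 glues pages together with no separator ("11:58 AMAnother year"), while the `join` in cell 19 puts a newline between them. A minimal sketch of that difference, using a hypothetical `Page` stand-in (not the LangChain `Document` class) and made-up page text:

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical stand-in for a loaded page; only page_content is modelled."""
    page_content: str


# Made-up page text, for illustration only.
pages = [Page("first page text"), Page("second page text")]

# Same pattern as cell 17: start with a newline, then append each page.
# Nothing is inserted between consecutive pages.
content = "\n"
for page in pages:
    content = content + page.page_content

# Same pattern as cell 19: exactly one newline between consecutive pages.
total_content = "\n".join(page.page_content for page in pages)

print(repr(content))
print(repr(total_content))
```

The `join` form is usually the safer choice when the text is later split into sentences or chunks, since the last word of one page is not fused with the first word of the next.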

## Using Unstructured

In [21]:
from langchain_community.document_loaders import UnstructuredPDFLoader

loader = UnstructuredPDFLoader("documents/langchain-report.pdf", mode="elements")

data = loader.load()





In [31]:
data[0].__dict__

{'id': None,
 'metadata': {'source': 'documents/langchain-report.pdf',
  'coordinates': {'points': ((0.0, 0.0),
    (0.0, 10.0),
    (143.019, 10.0),
    (143.019, 0.0)),
   'system': 'PixelSpace',
   'layout_width': 612,
   'layout_height': 792},
  'file_directory': 'documents',
  'filename': 'langchain-report.pdf',
  'languages': ['eng'],
  'last_modified': '2025-01-30T11:58:07',
  'page_number': 1,
  'filetype': 'application/pdf',
  'category': 'Header',
  'element_id': '1da0382d9d88b2293bc96c4a59931468'},
 'page_content': 'LangChain State of AI 2024 Report',
 'type': 'Document'}
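
Each element carries a `category` in its metadata (`Header` in the output above), which can be used to drop page furniture before joining the text. A minimal sketch, assuming plain dictionaries as stand-ins for the loaded elements; the sample text and the `SKIPPED` set are illustrative choices, and real `Document` objects expose `page_content` and `metadata` as attributes rather than keys:

```python
# Made-up elements that mirror the metadata shape printed above.
elements = [
    {"page_content": "LangChain State of AI 2024 Report",
     "metadata": {"category": "Header", "page_number": 1}},
    {"page_content": "Another year of building with LLMs is coming to an end",
     "metadata": {"category": "NarrativeText", "page_number": 1}},
    {"page_content": "1 of 12",
     "metadata": {"category": "Footer", "page_number": 1}},
]

# Categories treated as page furniture in this sketch (an assumption).
SKIPPED = {"Header", "Footer"}

# Keep only elements whose category is not in SKIPPED, then join their text.
body_text = "\n".join(
    el["page_content"]
    for el in elements
    if el["metadata"].get("category") not in SKIPPED
)

print(body_text)
```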

In [33]:
# Join the text of the elements returned by UnstructuredPDFLoader
total_content = "\n".join(document.page_content for document in data)

In [34]:
print(total_content)

# Loading Markdown

In [35]:
from langchain_community.document_loaders import UnstructuredMarkdownLoader

loader = UnstructuredMarkdownLoader("documents/readme-langchain.md", mode="elements")

data = loader.load()

In [36]:
print("\n".join(element.page_content for element in data))

🦜️🔗 LangChain
⚡ Build context-aware reasoning applications ⚡
Looking for the JS/TS library? Check out LangChain.js.
To help you ship LangChain apps to production faster, check out LangSmith. LangSmith is a unified developer platform for building, testing, and monitoring LLM applications. Fill out this form to speak with our sales team.
Quick Install
With pip:
bash pip install langchain
With conda:
bash conda install langchain -c conda-forge
🤔 What is LangChain?
LangChain is a framework for developing applications powered by large language models (LLMs).
For these applications, LangChain simplifies the entire application lifecycle:
Open-source libraries: Build your applications using LangChain's open-source components and third-party integrations. Use LangGraph to build stateful agents with first-class streaming and human-in-the-loop support.
Productionization: Inspect, monitor, and evaluate your apps with LangSmith so that you can constantly optimize and deploy with confidence.
Deploym

# Loading web pages

## Simple and fast

In [37]:
import bs4
from langchain_community.document_loaders import WebBaseLoader

page_url = "https://python.langchain.com/docs/how_to/chatbots_memory/"

loader = WebBaseLoader(web_paths=[page_url])
docs = loader.load()


USER_AGENT environment variable not set, consider setting it to identify your requests.
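
The warning above asks for a `USER_AGENT` environment variable. A minimal sketch of setting one from inside the notebook; the value `my-notebook/0.1` is a hypothetical placeholder, and the assumption here is that the variable has to be set before the loader module is first imported for the warning to go away:

```python
import os

# Set a user agent only if one is not already defined in the environment.
# "my-notebook/0.1" is a placeholder; use a string that identifies your client.
os.environ.setdefault("USER_AGENT", "my-notebook/0.1")

user_agent = os.environ["USER_AGENT"]
print(user_agent)
```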


In [43]:
print(docs[0].page_content)






How to add memory to chatbots | 🦜️🔗 LangChain






Skip to main contentIntegrationsAPI ReferenceMoreContributingPeopleError referenceLangSmithLangGraphLangChain HubLangChain JS/TSv0.3v0.3v0.2v0.1💬SearchIntroductionTutorialsBuild a Question Answering application over a Graph DatabaseTutorialsBuild a simple LLM application with chat models and prompt templatesBuild a ChatbotBuild a Retrieval Augmented Generation (RAG) App: Part 2Build an Extraction ChainBuild an AgentTaggingBuild a Retrieval Augmented Generation (RAG) App: Part 1Build a semantic search engineBuild a Question/Answering system over SQL dataSummarize TextHow-to guidesHow-to guidesHow to use tools in a chainHow to use a vectorstore as a retrieverHow to add memory to chatbotsHow to use example selectorsHow to add a semantic layer over graph databaseHow to invoke runnables in parallelHow to stream chat model responsesHow to add default invocation args to a RunnableHow to add retrieval to chatbotsHow to use few
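
The raw page text above contains long runs of blank lines left over from the HTML layout. A minimal post-processing sketch using only the standard library; `squeeze_blank_lines` is a hypothetical helper and the `raw` string is a shortened, made-up sample of such output:

```python
import re


def squeeze_blank_lines(text: str) -> str:
    """Collapse every run of blank lines into a single blank line and trim both ends."""
    return re.sub(r"\n\s*\n", "\n\n", text).strip()


# Shortened, made-up sample resembling the loader output.
raw = "\n\n\n\nHow to add memory to chatbots | LangChain\n\n\n\n\nSkip to main content"
cleaned = squeeze_blank_lines(raw)

print(cleaned)
```

This only tidies whitespace; the navigation text fused into one line ("Skip to main contentIntegrations...") is still present afterwards.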

## Advanced parsing

In [42]:
from langchain_unstructured import UnstructuredLoader

page_url = "https://python.langchain.com/docs/how_to/chatbots_memory/"
loader = UnstructuredLoader(web_url=page_url)

docs = []
async for doc in loader.alazy_load():
    docs.append(doc)

In [None]:
docs
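
The top-level `async for` in cell 42 works because a notebook already runs inside an event loop. In a plain Python script the same loop has to live inside a coroutine driven by `asyncio.run`. A minimal sketch of that pattern, where `fake_alazy_load` is a hypothetical stand-in for `loader.alazy_load()` that yields strings instead of documents:

```python
import asyncio
from typing import AsyncIterator, List


async def fake_alazy_load() -> AsyncIterator[str]:
    """Hypothetical stand-in for loader.alazy_load(); yields made-up strings."""
    for text in ("first element", "second element"):
        await asyncio.sleep(0)  # give control back to the event loop
        yield text


async def collect() -> List[str]:
    """Gather everything the async generator yields into a list."""
    docs = []
    async for doc in fake_alazy_load():
        docs.append(doc)
    return docs


# asyncio.run starts an event loop, runs the coroutine, and closes the loop.
docs = asyncio.run(collect())
print(docs)
```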

# Loading PPTs

In [44]:
from langchain_community.document_loaders.powerpoint import UnstructuredPowerPointLoader


loader = UnstructuredPowerPointLoader("documents/prompts.pptx")

data = loader.load()

In [46]:
print(data[0].page_content)

Creación de Aplicaciones con IA



James Espichan Vilca

Staff Machine Learning Engineer @ Latam Airlines

Building Generative AI Application from PoC to Production

Langchain Contributor

jamesev15



Introducción al desarrollo con IA Generativa: 

Tema:

Buenas prácticas de Prompts

Zero-shot, one-shot y Few-shot prompting

CoT Prompting

Iteraciones

Roleplay

Knowledge Generation-Integration

Hiperparámetros



Buenas Prácticas de Prompts



Buenas prácticas de Prompts

Siempre es mejor escribir instrucciones claras

Si la información que se busca obtener no está dentro de los datos de entrenamiento del modelo, es mejor ofrecer un texto de referencia al modelo

Los modelos son buenos resolviendo problemas complejos, pero todo tiene su límite. Si el problema es muy complejo, se sugiere dividir el problema en sub-problemas más sencillos

Si la respuesta involucra pasos lógicos complejos, se le puede instruir a la LLM como debe razonar, de esa forma "se le da tiempo a la LLM para pens

# Loading YouTube videos

In [47]:
from langchain_community.document_loaders import YoutubeLoader

loader = YoutubeLoader.from_youtube_url(
    "https://www.youtube.com/watch?v=QsYGlZkevEg", add_video_info=False
)

In [48]:
data = loader.load()

In [49]:
print(data[0].page_content)

LADIES AND GENTLEMEN, PEDRO PASCAL! [ CHEERS AND APPLAUSE ] >> THANK YOU, THANK YOU. THANK YOU VERY MUCH. I'M SO EXCITED TO BE HERE. THANK YOU. I SPENT THE LAST YEAR SHOOTING A SHOW CALLED "THE LAST OF US" ON HBO. FOR SOME HBO SHOES, YOU GET TO SHOOT IN A FIVE STAR ITALIAN RESORT SURROUNDED BY BEAUTIFUL PEOPLE, BUT I SAID, NO, THAT'S TOO EASY. I WANT TO SHOOT IN A FREEZING CANADIAN FOREST WHILE BEING CHASED AROUND BY A GUY WHOSE HEAD LOOKS LIKE A GENITAL WART. IT IS AN HONOR BEING A PART OF THESE HUGE FRANCHISEs LIKE "GAME OF THRONES" AND "STAR WARS," BUT I'M STILL GETTING USED TO PEOPLE RECOGNIZING ME. THE OTHER DAY, A GUY STOPPED ME ON THE STREET AND SAYS, MY SON LOVES "THE MANDALORIAN" AND THE NEXT THING I KNOW, I'M FACE TIMING WITH A 6-YEAR-OLD WHO HAS NO IDEA WHO I AM BECAUSE MY CHARACTER WEARS A MASK THE ENTIRE SHOW. THE GUY IS LIKE, DO THE MANDO VOICE, BUT IT'S LIKE A BEDROOM VOICE. WITHOUT THE MASK, IT JUST SOUNDS PORNY. PEOPLE WALKING BY ON THE STREET SEE ME WHISPERING TO A 6-