# LangChain tutorial

In [74]:
import os
from dotenv import load_dotenv

# Load the environment variables
load_dotenv()

from langchain_openai import ChatOpenAI
llm = ChatOpenAI(api_key=os.getenv("OPENAI_API_KEY"))

## 1 - Most basic model query

In [75]:
# 1 - Invoke the model
llm.invoke("how can langsmith help with testing?")

AIMessage(content='Langsmith can help with testing by providing automated testing tools and frameworks that can be used to quickly and efficiently test code for bugs and errors. It can also assist in creating test cases, running tests, and analyzing the results to identify areas of improvement. Additionally, Langsmith can help with performance testing, security testing, and regression testing to ensure that the software is functioning as expected and meeting the requirements. Overall, Langsmith can streamline the testing process and help developers deliver high-quality, reliable software.')

## 2 - Chains

In [76]:
# 2.1 - Use a prompt template
from langchain_core.prompts import ChatPromptTemplate
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a world class technical documentation writer."),
    ("user", "{input}")
])

In [77]:
# 2.2 - (Optional) Output parser
from langchain_core.output_parsers import StrOutputParser
output_parser = StrOutputParser()

In [78]:
chain = prompt | llm | output_parser
chain.invoke({"input": "how can langsmith help with testing?"})

"Langsmith is a versatile tool that can greatly assist with testing in a number of ways. Here are some key ways in which Langsmith can be useful in the testing process:\n\n1. **Automated Testing**: Langsmith can be used to automate various testing tasks, such as running test suites, performing regression testing, and executing test scripts. By automating these repetitive tasks, Langsmith can help improve efficiency and accuracy in the testing process.\n\n2. **Test Data Generation**: Langsmith can generate synthetic test data that can be used to simulate different scenarios and edge cases during testing. This can help ensure thorough test coverage and enhance the reliability of the testing process.\n\n3. **Test Case Management**: Langsmith can assist in organizing and managing test cases effectively. It allows testers to create, track, and maintain test cases in a structured manner, making it easier to execute and monitor tests.\n\n4. **Integration Testing**: Langsmith can be integrated

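The `|` operator in the chain above feeds each step's output into the next step. The sketch below imitates that flow in plain Python so it can be run without an API key. `Step`, `toy_prompt`, `fake_llm`, and `toy_parser` are hypothetical stand-ins and do not reflect how LangChain implements its runnables.

```python
class Step:
    """Toy stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # Compose left to right: run self first, then feed the result to other.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)


# Hypothetical stand-ins for the prompt, the model, and the output parser.
toy_prompt = Step(lambda inputs: f"system: You are a writer.\nuser: {inputs['input']}")
fake_llm = Step(lambda text: {"content": f"(model reply to {len(text)} characters of prompt)"})
toy_parser = Step(lambda message: message["content"])

toy_chain = toy_prompt | fake_llm | toy_parser
print(toy_chain.invoke({"input": "how can langsmith help with testing?"}))
```

Each stage only needs to accept what the previous stage returns, which is why the output parser can be swapped or dropped without touching the prompt.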
## 3 - Retrieval chains

In [80]:
# 3.1 - Load a web page with WebBaseLoader, which parses the HTML with BeautifulSoup (single URL)
from langchain_community.document_loaders import WebBaseLoader
loader = WebBaseLoader("https://docs.smith.langchain.com/user_guide")
docs = loader.load()

In [112]:
import requests
from bs4 import BeautifulSoup

# Define the URL of the Google News page
url = "https://news.google.com/search?q=bitcoin&hl=en-US&gl=US&ceid=US%3Aen"

# Fetch the HTML content of the page (the timeout keeps the request from hanging)
response = requests.get(url, timeout=10)
response.raise_for_status()
html_content = response.content

# Parse the HTML content with BeautifulSoup
soup = BeautifulSoup(html_content, 'html.parser')

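# Find the headline links. 'JtKRv' is the CSS class observed on the headline
# anchors when this notebook was run; it may change without notice.
headlines = soup.find_all('a', class_='JtKRv')

# Collect the text of every headline
all_headlines = [headline.text for headline in headlines]

all_headlines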
# Optionally, extract URLs of the news articles
#for headline in headlines:
#    print(f"https://news.google.com{headline['href'][1:]}")

['Traders say Bitcoin price fights “last resistance” at $69K before new all-time highs',
 'Bitcoin price today: steady at $68k as inflation, rate jitters weigh on sentiment By Investing.com',
 "Goldman Sachs Issues 'Astonishing' Bitcoin And Ethereum ETF Prediction After Price 'Turning Point'",
 "Is The MEV Monster Under Bitcoin's Bed?",
 'Bitcoin, Ether Prices Ease as SHIB Drives Gains in Meme Tokens',
 'Paradigm leads $70 million raise for Bitcoin staking protocol Babylon',
 "BlackRock's $20 Billion IBIT Fund Is World's Biggest Bitcoin (BTC) ETF",
 'BlackRock’s IBIT continues to lead net inflows in spot bitcoin ETFs',
 'BlackRock’s bitcoin ETF on verge of eclipsing Grayscale’s fund',
 'Risky New Experiments Attract Billions of Dollars in Bitcoin',
 "Researchers 'hack time' to recover $3 million bitcoin wallet",
 'Researchers find lost password to crypto wallet holding 43.6 BTC: Wired',
 'Hackers finally unlock $3 million Bitcoin wallet after man forgot password for 11 years',
 'Predic
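A Python list placed into a prompt template is formatted like `str(list)`, brackets and quotes included. The following is a minimal sketch of turning the scraped headlines into a cleaner bulleted string first. `format_headlines` and its `limit` parameter are hypothetical helpers, and the sample list reuses headlines from the output above.

```python
def format_headlines(headlines, limit=20):
    """Return up to `limit` unique, non-empty headlines as a bulleted string."""
    seen = set()
    lines = []
    for headline in headlines:
        text = " ".join(headline.split())  # collapse stray whitespace
        if not text or text in seen:
            continue
        seen.add(text)
        lines.append(f"- {text}")
        if len(lines) == limit:
            break
    return "\n".join(lines)


sample = [
    "Is The MEV Monster Under Bitcoin's Bed?",
    "Risky New Experiments Attract Billions of Dollars in Bitcoin",
    "Is The MEV Monster Under Bitcoin's Bed?",  # duplicate, dropped
]
print(format_headlines(sample))
```

The resulting string could be passed to a prompt in place of the raw list.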

In [132]:
from langchain_core.prompts.prompt import PromptTemplate

llm = ChatOpenAI(api_key=os.getenv("OPENAI_API_KEY"))

template = """
You are the best financial analyst in the world. You were educated at Harvard and have been working in the industry for 20 years.
These are the headlines of Google News articles about Bitcoin:
Headlines: {string}
Task: Provide me with an analysis of how Bitcoin is going to move in the next three days."""

prompt_custom = PromptTemplate.from_template(template)

from langchain_core.runnables import RunnablePassthrough

rag_chain = (
    {"string": RunnablePassthrough()}
    | prompt_custom
    | llm
    | StrOutputParser()
)
# RunnablePassthrough forwards the whole input to {string}, so pass the list itself
result_1 = rag_chain.invoke(all_headlines)

# for chunk in rag_chain.stream(all_headlines):
#     print(chunk, end="", flush=True)



In [133]:
result_1

"Based on the headlines provided, it is clear that the Bitcoin market is currently experiencing a mix of positive and negative news. Traders are optimistic about new all-time highs, while concerns about inflation, rate jitters, and Mt. Gox transfers are weighing on sentiment. Additionally, there are discussions around the potential impact of macro data, ETF predictions, and staking protocols on the price of Bitcoin.\n\nTaking all these factors into consideration, it is likely that Bitcoin will continue to face volatility in the coming days. The resistance at $69K may be a key level to watch, as traders assess the market sentiment and news developments. The outcome of the U.S. inflation data, Mt. Gox transfers, and ETF predictions could have a significant impact on Bitcoin's price movement.\n\nOverall, it is important to monitor the market closely and stay informed about any new developments that could influence the price of Bitcoin. The next three days are likely to be crucial in deter

# Bitcoin

In [134]:
import yfinance as yf
import pandas as pd
from datetime import datetime, timedelta

def download_bitcoin_data():
    # Define the ticker symbol for Bitcoin
    ticker_symbol = 'BTC-USD'

    # Calculate the start and end dates
    end_date = datetime.now()
    start_date = end_date - timedelta(days=3)

    # Download the data
    bitcoin_data = yf.download(ticker_symbol, start=start_date, end=end_date, interval='1h')

    return bitcoin_data

bitcoin_data = download_bitcoin_data()

# Display the first few rows of the dataframe
print(bitcoin_data.head())

[*********************100%%**********************]  1 of 1 completed

                                   Open          High           Low  \
Datetime                                                              
2024-05-27 21:00:00+00:00  69587.390625  69768.046875  69545.406250   
2024-05-27 22:00:00+00:00  69731.265625  69731.265625  69498.507812   
2024-05-27 23:00:00+00:00  69515.000000  69515.000000  69273.187500   
2024-05-28 00:00:00+00:00  69382.226562  69397.531250  69082.804688   
2024-05-28 01:00:00+00:00  69312.070312  69500.710938  68548.710938   

                                  Close     Adj Close     Volume  
Datetime                                                          
2024-05-27 21:00:00+00:00  69768.046875  69768.046875          0  
2024-05-27 22:00:00+00:00  69512.945312  69512.945312  141010944  
2024-05-27 23:00:00+00:00  69392.781250  69392.781250   53606400  
2024-05-28 00:00:00+00:00  69279.367188  69279.367188  311988224  
2024-05-28 01:00:00+00:00  68548.710938  68548.710938  535746560  
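With its default display settings, pandas abbreviates long frames when they are converted to text, so a prompt built from the printed form of roughly 72 hourly bars may not contain every row. One option is to send a compact summary instead. The sketch below uses only the standard library; `summarize_closes` is a hypothetical helper, and the sample rows are the closing prices from the five rows printed above.

```python
def summarize_closes(rows):
    """Summarize (timestamp, close) pairs as one short line of text."""
    if not rows:
        raise ValueError("rows must not be empty")
    closes = [close for _, close in rows]
    first, last = closes[0], closes[-1]
    change_pct = (last - first) / first * 100
    return (
        f"{len(rows)} bars from {rows[0][0]} to {rows[-1][0]}: "
        f"first close {first:.2f}, last close {last:.2f}, "
        f"low {min(closes):.2f}, high {max(closes):.2f}, "
        f"change {change_pct:+.2f}%"
    )


# Closing prices copied from the first five rows printed above.
sample_rows = [
    ("2024-05-27 21:00", 69768.046875),
    ("2024-05-27 22:00", 69512.945312),
    ("2024-05-27 23:00", 69392.781250),
    ("2024-05-28 00:00", 69279.367188),
    ("2024-05-28 01:00", 68548.710938),
]
print(summarize_closes(sample_rows))
```

A summary like this trades detail for brevity. Whether that suits a chart-analysis prompt is a judgment call.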





In [135]:
from langchain_core.prompts.prompt import PromptTemplate

llm = ChatOpenAI(api_key=os.getenv("OPENAI_API_KEY"))

template = """
You are the best chart analyst in the world. You were educated at Harvard and have been working in the industry for 20 years.
This is the market data for Bitcoin at an hourly interval for the past three days:
Data: {string}
Task: Provide me with a chart analysis of how Bitcoin is going to move in the next three days."""

prompt_custom = PromptTemplate.from_template(template)

from langchain_core.runnables import RunnablePassthrough

rag_chain = (
    {"string": RunnablePassthrough()}
    | prompt_custom
    | llm
    | StrOutputParser()
)
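# Pass the table as text: RunnablePassthrough forwards the whole input to {string},
# and to_string() keeps every row instead of the abbreviated default printout
result_2 = rag_chain.invoke(bitcoin_data.to_string())

# for chunk in rag_chain.stream(bitcoin_data.to_string()):
#     print(chunk, end="", flush=True)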

In [139]:
result_2

"Based on the market data provided for Bitcoin at an hourly interval for the past three days, it appears that Bitcoin has been experiencing some volatility. \n\nLooking at the chart, we can see that Bitcoin's price has been fluctuating between highs and lows, with some periods of consolidation. The volume also seems to vary, indicating fluctuations in trading activity.\n\nIn the next three days, based on the current trend, it is likely that Bitcoin will continue to experience volatility. However, there are a few key levels to watch for potential price movements. \n\nIf Bitcoin manages to break above the recent high levels, it could indicate a bullish trend continuation, potentially leading to further price increases. On the other hand, if Bitcoin fails to sustain its momentum and breaks below recent support levels, it could signal a bearish trend reversal, leading to price declines.\n\nOverall, it is important to closely monitor Bitcoin's price movements and key support/resistance leve

# Manager

In [140]:
from langchain_core.prompts.prompt import PromptTemplate

llm = ChatOpenAI(api_key=os.getenv("OPENAI_API_KEY"))

template = """
You are the best financial analyst in the world. You were educated at Harvard and have been working in the industry for 60 years.
You have been a math prodigy as a kid. You did research with Daniel Kahneman and won the Nobel Prize in Economics.
Two of your analysts have come up with predictions for the Bitcoin price in the next three days.
Analyst 1: {string_1}
Analyst 2: {string_2}
Task: You have to decide if we are going to buy or sell Bitcoin in the next three days."""

prompt_custom = PromptTemplate.from_template(template)

from langchain_core.runnables import RunnablePassthrough

# The prompt already expects a dict with both keys, so no input-mapping step is needed.
# (Mapping each key to RunnablePassthrough() would copy the whole input dict into both slots.)
rag_chain = prompt_custom | llm | StrOutputParser()
result = rag_chain.invoke({"string_1": result_1, "string_2": result_2})

# for chunk in rag_chain.stream({"string_1": result_1, "string_2": result_2}):
#     print(chunk, end="", flush=True)

In [141]:
result

'As the best financial analyst in the world with 60 years of experience in the industry, I would recommend closely monitoring the market data and key support/resistance levels in the next three days before making a decision to buy or sell Bitcoin. Both analysts have highlighted the potential for continued volatility in the market, with key levels to watch for potential price movements.\n\nGiven the mixed sentiments in the market and the uncertainty surrounding factors such as inflation, rate jitters, Mt. Gox transfers, and ETF predictions, it may be prudent to wait and observe how the market evolves in the next few days before making a decision.\n\nI would advise exercising caution and not rushing into any buying or selling decisions without a clear understanding of the market trends and potential price movements. It is important to stay informed, analyze the data carefully, and adjust trading strategies accordingly based on the evolving market conditions.\n\nIn conclusion, I would rec

In [41]:
# Multiple URLs
website_urls = [
    "https://docs.smith.langchain.com/user_guide",
    "https://docs.smith.langchain.com",
]

# Initialize an empty list to store all documents
all_documents = []

# Loop through website URLs and use WebBaseLoader for each
for url in website_urls:
    loader = WebBaseLoader(url)
    website_documents = loader.load()
    all_documents.extend(website_documents)

# Process the all_documents list further (e.g., vectorization)

print(all_documents)

[Document(page_content="\n\n\n\n\nLangSmith User Guide | 🦜️🛠️ LangSmith\n\n\n\n\n\n\n\nSkip to main contentLangSmith API DocsSearchGo to AppQuick StartUser GuideTracingEvaluationProduction Monitoring & AutomationsPrompt HubProxyPricingSelf-HostingCookbookThis is outdated documentation for 🦜️🛠️ LangSmith, which is no longer actively maintained.For up-to-date documentation, see the latest version.User GuideOn this pageLangSmith User GuideLangSmith is a platform for LLM application development, monitoring, and testing. In this guide, we’ll highlight the breadth of workflows LangSmith supports and how they fit into each stage of the application development lifecycle. We hope this will inform users how to best utilize this powerful platform or give them something to consider if they’re just starting their journey.Prototyping\u200bPrototyping LLM applications often involves quick experimentation between prompts, model types, retrieval strategy and other parameters.\nThe ability to rapidly un

In [42]:
# 3.2 - Load the OpenAI embedding model used to embed the documents
from langchain_openai import OpenAIEmbeddings
embeddings = OpenAIEmbeddings()

In [43]:
# 3.3 Add vector store
from langchain_community.vectorstores import FAISS
from langchain_text_splitters import RecursiveCharacterTextSplitter


text_splitter = RecursiveCharacterTextSplitter()
documents = text_splitter.split_documents(all_documents)
vector_store = FAISS.from_documents(documents, embeddings)
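`RecursiveCharacterTextSplitter` breaks long documents into chunks before they are embedded. The sketch below is a deliberately simpler stand-in, a fixed-size window with overlap, to show why consecutive chunks can share some text. It does not reproduce the recursive separator logic, and `chunk_text` with its parameters is hypothetical.

```python
def chunk_text(text, chunk_size=10, overlap=3):
    """Split `text` into windows of `chunk_size` characters that share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


print(chunk_text("abcdefghijklmnopqrstuvwxyz"))
```

The overlap means a sentence cut at a chunk boundary still appears whole in at least one neighbouring chunk, provided it is shorter than the overlap.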

In [44]:
# Retrieve vectors from the vector store

# This is the underlying FAISS index
faiss_index = vector_store.index
print(faiss_index)

# reconstruct_n(start, count): begin at vector 0 and reconstruct faiss_index.ntotal vectors
vectors = faiss_index.reconstruct_n(0, faiss_index.ntotal)
for i, vector in enumerate(vectors):
    print(f"Vector {i}: {vector}")

<faiss.swigfaiss.IndexFlatL2; proxy of <Swig Object of type 'faiss::IndexFlatL2 *' at 0x7fc2726fb360> >
Vector 0: [-0.0080638   0.01916084  0.01222096 ... -0.00213274  0.02612103
 -0.01153036]
Vector 1: [-0.02374188  0.01507666  0.00888921 ...  0.01401275  0.00637293
 -0.01765243]
Vector 2: [-0.01742077  0.01093862  0.01232167 ...  0.00759277 -0.00432376
 -0.01816119]
Vector 3: [-0.021696    0.01003824  0.01190209 ... -0.00271375 -0.0079929
 -0.03096637]
Vector 4: [ 0.00756025  0.01780557  0.01517474 ...  0.00688559  0.0135542
 -0.00869937]
Vector 5: [ 0.00599423  0.00548193  0.00797372 ...  0.01548741  0.00085122
 -0.02121678]
Vector 6: [-0.00539542  0.00660179  0.00760884 ... -0.00329915 -0.01941373
 -0.05169536]
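The printed index type, `IndexFlatL2`, is FAISS's flat index: it stores the vectors unchanged and compares a query against every one of them by squared Euclidean distance. Below is a pure-Python sketch of that brute-force search, with hypothetical helpers `squared_l2` and `nearest` and made-up 3-dimensional vectors in place of real embeddings.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(query, vectors, k=2):
    """Return the k (distance, index) pairs closest to `query`, nearest first."""
    scored = [(squared_l2(query, vec), idx) for idx, vec in enumerate(vectors)]
    return sorted(scored)[:k]


# Made-up toy vectors; real embeddings have far more dimensions.
toy_vectors = [
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
]
print(nearest([0.9, 0.1, 0.0], toy_vectors))
```

A retriever built on such an index returns the documents whose embeddings sit closest to the embedded question.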


In [88]:
from langchain import hub

# A PromptTemplate combines the instructions, the context returned by the retriever,
# and the user's question into a single input for the LLM
from langchain_core.prompts.prompt import PromptTemplate

template = """Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer as concise as possible.
Always say "thanks for asking!" at the end of the answer.
{context}
Question: {question}
Helpful Answer:"""
prompt_custom = PromptTemplate.from_template(template)

print(prompt_custom)

input_variables=['context', 'question'] template='Use the following pieces of context to answer the question at the end.\nIf you don\'t know the answer, just say that you don\'t know. Use three sentences maximum and keep the answer as concise as possible.\nAlways say "thanks for asking!" at the end of the answer.\n{context}\nQuestion: {question}\nHelpful Answer:'


In [73]:
retriever = vector_store.as_retriever()

from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough


def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt_custom  # the context/question template; the chat prompt from section 2 only accepts {input}
    | llm
    | StrOutputParser()
)

#print(rag_chain)

for chunk in rag_chain.stream("What is Task Decomposition?"):
    print(chunk, end="", flush=True)



Task Decomposition is the process of breaking down a task into smaller, more manageable subtasks. It helps in organizing and tracking the performance of an application across multiple interactions. This approach can assist in identifying areas for improvement and enhancing overall efficiency.
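The retrieval chain does three things before the model is called: it fetches documents for the question, joins their text with `format_docs`, and fills the prompt. Below is a sketch of that data flow with stand-ins that need no API key. `Doc`, `toy_retriever`, and the word-overlap matching are hypothetical and are not how a vector-store retriever ranks results.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Minimal stand-in for a document with a `page_content` field."""
    page_content: str


def format_docs(docs):
    # Same joining logic as in the notebook cell.
    return "\n\n".join(doc.page_content for doc in docs)


def toy_retriever(question, corpus):
    """Return documents sharing at least one lowercase word with the question."""
    words = set(question.lower().split())
    return [doc for doc in corpus if words & set(doc.page_content.lower().split())]


corpus = [
    Doc("LangSmith is a platform for LLM application development, monitoring, and testing."),
    Doc("Is The MEV Monster Under Bitcoin's Bed?"),
]
template = "{context}\nQuestion: {question}\nHelpful Answer:"

question = "Tell me about LangSmith"
context = format_docs(toy_retriever(question, corpus))
print(template.format(context=context, question=question))
```

Only the first document shares a word with the question, so only its text reaches the prompt.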

In [None]:
import matplotlib.pyplot as plt

# Assumption: plot the hourly data downloaded earlier; the original cell left the data as a placeholder
data = bitcoin_data

plt.figure(figsize=(12, 6))
plt.plot(data.index, data['Close'], color='blue', label='Closing Price')
plt.title('Bitcoin Price Analysis for the Next Three Days')
plt.xlabel('Date')
plt.ylabel('Price (USD)')
plt.legend()
plt.grid()
plt.show()