# Assignment 2: LangChain

## Install the following packages first (you may need them)

<mark>Please install more packages if needed.</mark>

In [1]:
!pip install -U -q python-dotenv
!pip install -U -q openai
!pip install -U -q langchain
!pip install -U -q langchain_openai
!pip install -U -q langchain_community
!pip install -U -q langchain_experimental

In [2]:
!pip install -U -q langchain-text-splitters
!pip install -U -q chromadb

In [3]:
from google.colab import drive
drive.mount('/content/gdrive', force_remount=True)

Mounted at /content/gdrive


In [4]:
# Set the working directory to the folder that contains the .env file
import os
os.chdir('/content/gdrive/MyDrive')

In [5]:
from dotenv import load_dotenv
import os

load_dotenv()

os.environ["HUGGINGFACEHUB_API_TOKEN"] = os.getenv('HUGGINGFACEHUB_API_TOKEN', 'HUGGINGFACEHUB API key not found')

# If loading fails, this prints: 'HUGGINGFACEHUB API key not found'
api_token = os.environ["HUGGINGFACEHUB_API_TOKEN"]

# Extra check so the API key value itself is never printed
if api_token != 'HUGGINGFACEHUB API key not found':
    print("HUGGINGFACEHUB API key loaded")
else:
    print(api_token)

HUGGINGFACEHUB API key loaded
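
The cell above detects a missing key by comparing against a sentinel string. As an alternative, the sketch below fails fast when the variable is missing and prints only a masked form of the key. The helper names `require_env` and `mask` are hypothetical and are not part of python-dotenv or LangChain.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty.

    Hypothetical helper for illustration only.
    """
    value = os.environ.get(name, "")
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


def mask(secret: str, visible: int = 4) -> str:
    """Hide all but the last `visible` characters so the key is never printed in full."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]
```

After `load_dotenv()`, a call such as `print(mask(require_env("HUGGINGFACEHUB_API_TOKEN")))` would confirm that the key was loaded without revealing it.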




---



## Part 1: Building a Simple QA App [25 points]

🔺Follow the instructions below to create one LLM chain and use it to answer the specified question:

1. **Install and import necessary libraries**
2. **Define an LLM** <5 points>
  - Define the model you want to use
  - Either an LLM model or a Chat model
3. **Define a Prompt** <5 points>
  - Use `PromptTemplate` or `ChatPromptTemplate` to create a prompt
  - The prompt must contain the variable `question`
4. **Define a Chain** <5 points>
  - Create a chain from the LLM and prompt defined above
5. **Invoke the Chain & Print the Answer** <10 points>
  - <mark>Use this question: `"Are there aliens on the Moon?"`</mark>
  - You must use `.invoke()`
  - Replace the question marks and use the following code to print the result:

    ```
    answer = ???????
    print(answer.content)
    ```



In [6]:
# 1. Install and import necessary libraries
from langchain_community.llms import HuggingFaceEndpoint
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

In [7]:
# 2. Define the model
repo_id = "mistralai/Mistral-7B-Instruct-v0.2"
llm_try = HuggingFaceEndpoint(repo_id=repo_id, temperature=0.7)

Token will not been saved to git credential helper. Pass `add_to_git_credential=True` if you want to set the git credential as well.
Token is valid (permission: read).
Your token has been saved to /root/.cache/huggingface/token
Login successful


In [8]:
# 3. Define the prompt
template = """
You are now an extraterrestrial enthusiast, firmly believing in the existence of aliens on the moon. Please reply to "{question}"
"""
prompt = PromptTemplate.from_template(template)
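
To see what the model receives, it can help to fill the `{question}` placeholder by hand. The sketch below uses Python's built-in `str.format` as a stand-in for the template step; it is not `PromptTemplate` itself, and `demo_template` is a name chosen here for illustration.

```python
# Plain-Python illustration of filling a template variable.
# This mirrors the idea of PromptTemplate but does not use LangChain.
demo_template = """
You are now an extraterrestrial enthusiast, firmly believing in the existence of aliens on the moon. Please reply to "{question}"
"""

rendered = demo_template.format(question="Are there aliens on the Moon?")
print(rendered)
```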

In [9]:
# 4. Define the chain
chain = prompt | llm_try
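
The `|` operator composes two steps so that the output of the first becomes the input of the second. The toy class below sketches that idea in plain Python. `Step`, `fake_prompt`, and `fake_llm` are hypothetical stand-ins, and this is not how LangChain implements its runnables.

```python
class Step:
    """A toy runnable: wraps a function and supports `a | b` composition."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other: "Step") -> "Step":
        # `a | b` builds a new step that feeds a's output into b.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)


# Hypothetical stand-ins for the prompt and the model.
fake_prompt = Step(lambda inputs: f'Please reply to "{inputs["question"]}"')
fake_llm = Step(lambda text: f"[model output for: {text}]")

toy_chain = fake_prompt | fake_llm
print(toy_chain.invoke({"question": "Are there aliens on the Moon?"}))
```

The dictionary passed to `invoke` plays the same role as the `{"question": ...}` input used with the real chain.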

In [10]:
# 5. Invoke the chain and print the result
answer = chain.invoke({"question": "Are there aliens on the Moon?"})
print(answer)


Subject: 🌕 Extraterrestrial Moon Discoveries 👽

Hi there! I'm thrilled that you've asked about the possibility of aliens on the moon. Given my newfound interest in all things extraterrestrial, I've been keeping a close eye on the latest discoveries and theories.

There have indeed been several intriguing findings over the years that have fueled speculation about lunar life. Here are a few:

1. Anomalous Lunar Transient Phenomena (ALTP): These are unexplained bright spots or flashes observed on the lunar surface by various observatories. Some researchers believe they could be signs of subsurface water or even artificial structures.
2. The Far Side Moon Mystery: The far side of the moon, which is always facing away from Earth, has no permanent human presence. However, there have been reports of strange structures and formations visible only from this side. Some theories suggest these could be remnants of an ancient lunar civilization.
3. The Cydonia Region: This area on the moon is famo

### References
- [[D15] LangChain 專案實做 - Hello LangChain](https://ithelp.ithome.com.tw/articles/10322165?fbclid=IwZXh0bgNhZW0CMTAAAR3uGLgZxX6ZUKXFzGHlCflUcIXDk3AWqjNLtICRzgqkGEmisQd4I76ar9k_aem_AYLjV9jct7fZVFaRjypjts1SfF1rnX5W4sd_cjLeAktZeNtxyxGfUv1mksXGgLXcvN0C29sJuAkXiIIH13RXXULE)
- [【初學筆記】取得Hugging Face API Tokens 到執行簡單LLM問答內容生成(Inference)](https://medium.com/@Ting_YTK/%E5%88%9D%E5%AD%B8%E7%AD%86%E8%A8%98-%E5%8F%96%E5%BE%97hugging-face-api-tokens-%E5%88%B0%E5%9F%B7%E8%A1%8C%E7%B0%A1%E5%96%AEllm%E5%95%8F%E7%AD%94%E5%85%A7%E5%AE%B9%E7%94%9F%E6%88%90-inference-3473baa347f6)
- [LLM Note Day 23 - LangChain 中二技能翻譯](https://ithelp.ithome.com.tw/articles/10335721)



---



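## Part 2: Customize 3 Math Tools [30 points]



🔺Follow the instructions below to create three LangChain tools for basic arithmetic:

1. **Install and import necessary libraries**
2. **Define Tools**

    - Name the three tools `divide`, `minus`, and `exponentiate`
    - Define each tool with the `@tool` decorator from the `langchain_core.tools` module; each tool must declare its parameter types and return type <18 points>
    - Write the code inside each tool function that performs the corresponding operation <6 points>

        🌟HINTS🌟
      - divide: the first integer divided by the second integer
      - minus: the first integer minus the second integer
      - exponentiate: the first integer raised to the power of the second integer

3. **Store the Tools** <6 points>
  - Store the three tools as a list in the variable `tools`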

In [11]:
## 1. Install and import necessary libraries
from langchain_core.tools import tool

In [12]:
## 2. Define Tools
@tool
def divide(a: int, b: int) -> float:
  """Divide the first integer by the second integer."""
  return a / b

@tool
def minus(a: int, b: int) -> int:
  """Subtract the second integer from the first integer."""
  return a - b

@tool
def exponentiate(a: int, b: int) -> float:
  """Raise the first integer to the power of the second integer."""
  return a ** b

In [13]:
## 3. Store the Tools
tools = [divide, minus, exponentiate]
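
Before handing the tools to an agent, it is worth checking the arithmetic with plain functions. The sketch below repeats the three operations without the `@tool` decorator (the `_plain` names are hypothetical) and works through the Part 3 question by hand.

```python
def divide_plain(a: int, b: int) -> float:
    """Divide the first integer by the second integer."""
    return a / b


def minus_plain(a: int, b: int) -> int:
    """Subtract the second integer from the first integer."""
    return a - b


def exponentiate_plain(a: int, b: int) -> float:
    """Raise the first integer to the power of the second integer."""
    return a ** b


# Work through the Part 3 question step by step.
power = exponentiate_plain(2, 6)            # 2 to the power of 6
difference = minus_plain(24, 8)             # 24 minus 8
quotient = divide_plain(power, difference)  # divide the first result by the second
expected = quotient ** 2                    # square the whole result
print(power, difference, quotient, expected)
```

This gives a reference value to compare against the agent's final answer. The decorated tools themselves are normally called with a dictionary of arguments, for example `divide.invoke({"a": 64, "b": 16})`.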

### References

- [最全面又最浅显易懂的Langchain快速上手教程（下）](https://juejin.cn/post/7345869813329936399)



---



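## Part 3: Combine Tools with Agent [45 points]

🔺Follow the instructions below to create one agent that uses the `tools` from Part 2
0. **Keep the `tools` stored in Part 2**
  - <mark>If you do Part 2 and Part 3 in separate sessions, run the Part 2 code once before continuing with the steps below</mark>
1. **Install and import necessary libraries**
2. **Define an LLM** <5 points>
  - Choose a HuggingFace model


3. **Define Prompt** <20 points>
  - Use `ChatPromptTemplate.from_messages()` to create a prompt
  - The code for this cell will look something like: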
    ```python
      prompt = ChatPromptTemplate.from_messages(
        [
          ("¿¿", "❓❓❓",),
          ("placeholder", "❓❓❓"),
          ("¿¿", "❓❓❓"),
          ("placeholder", "{agent_scratchpad}"),
        ]
      )
    ```
  - Of course, you are also free to improvise🐳
  
4. **Define Agent Chain** <15分>
  - Use `initialize_agent()` to define an agent chain
  - <mark>Modules and functions you may need</mark>
    ```python
      from langchain.agents import AgentType
      from langchain.agents import initialize_agent
    ```
5. **Invoke Agent Executor** <5分>
  - Use the following code to print the model's response
    ```python
      ## Invoke agent_executor and print the model's response
      result = agent_executor.invoke({
      "input": "Take 2 to the power of 6, then divide this result obtained by the result of 24 minus 8, then square the whole result"
      })

      print(result)
    ```

In [14]:
# 1. Install and import necessary libraries
from langchain.prompts import ChatPromptTemplate
from langchain.agents import AgentExecutor
from langchain.agents import AgentType
from langchain.agents import initialize_agent

In [15]:
# 2. Define the LLM
repo_id = "mistralai/Mistral-7B-Instruct-v0.2"
llm = HuggingFaceEndpoint(repo_id=repo_id)

Token will not been saved to git credential helper. Pass `add_to_git_credential=True` if you want to set the git credential as well.
Token is valid (permission: read).
Your token has been saved to /root/.cache/huggingface/token
Login successful


In [16]:
# 3. Define the prompt
prompt = ChatPromptTemplate.from_messages(
   [
     ("user", "I have a question. {input}. Please carefully find the correct algorithm."),
     ("assistant", "Okay, let me tell you the answer.")
   ]
 )

In [17]:
# 4. Define the agent chain
agent_chain = initialize_agent(
    tools,
    llm,
    agent=AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION,
    handle_parsing_errors=True,
    verbose=True
)
agent_executor = prompt | agent_chain

  warn_deprecated(


In [18]:
# 5. Print the model's response
result = agent_executor.invoke({
    "input": "Take 2 to the power of 6, then divide this result obtained by the result of 24 minus 8, then square the whole result"
 })

print(result)



> Entering new AgentExecutor chain...
Action:
```json
{
  "action": "divide",
  "action_input": {
    "a": 64,
    "b": 16
  }
}
```

Observation: 4.0
Thought: I need to subtract 8 from 24 first.
Action:
```json
{
  "action": "minus",
  "action_input": {
    "a": 24,
    "b": 8
  }
}
```

Observation: 16
Thought: Now I can find 2 raised to the power of 6 and divide it by 16.
Action:
```json
{
  "action": "exponentiate",
  "action_input": {
    "a": 2,
    "b": 6
  }
}
```

Observation: 64
Thought: Now I can divide 64 by 16.
Action:
```json
{
  "action": "divide",
  "action_input": {
    "a": 64,
    "b": 16
  }
}
```

Observation: 4.0
Thought: Now I need to square 4.
Action:
```json
{
  "action": "exponentiate",
  "action_input": {
    "a": 4,
    "b": 2
  }
}
```

Observation: 16
Thought:
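
Each `Action` in the trace is a JSON blob that names a tool and its arguments, and each `Observation` is that tool's return value. The sketch below replays this dispatch step in plain Python. `TOOL_REGISTRY` and `run_action` are hypothetical names, and this is a simplified illustration, not LangChain's agent implementation. Note that the trace's first action already passes 64 and 16 to `divide` before any tool has computed them; the replay below runs the steps in dependency order instead.

```python
import json

# Hypothetical plain-function stand-ins for the three tools.
TOOL_REGISTRY = {
    "divide": lambda a, b: a / b,
    "minus": lambda a, b: a - b,
    "exponentiate": lambda a, b: a ** b,
}


def run_action(action_json: str):
    """Parse one JSON action blob and call the matching tool with its arguments."""
    action = json.loads(action_json)
    tool_fn = TOOL_REGISTRY[action["action"]]
    return tool_fn(**action["action_input"])


# Replay the actions from the trace in dependency order.
steps = [
    '{"action": "exponentiate", "action_input": {"a": 2, "b": 6}}',
    '{"action": "minus", "action_input": {"a": 24, "b": 8}}',
    '{"action": "divide", "action_input": {"a": 64, "b": 16}}',
    '{"action": "exponentiate", "action_input": {"a": 4, "b": 2}}',
]
observations = [run_action(step) for step in steps]
print(observations)
```

The replayed observations match the values shown in the trace (64, 16, 4.0, 16).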

### References

- [[LangChain for LLM Application Development] 課程筆記- Agents](https://hackmd.io/@YungHuiHsu/rkBMDgRM6?utm_source=preview-mode&utm_medium=rec)
- [LangChain 9 模型Model I/O 聊天提示词ChatPromptTemplate, 少量样本提示词FewShotPrompt](https://blog.csdn.net/zgpeace/article/details/134608897)

## 🌟BONUS: Ask Questions About a YouTube Video [20 points]🌟

1. Use the [YouTube transcripts Loaders](https://python.langchain.com/v0.1/docs/integrations/document_loaders/youtube_transcript/) to fetch the closed captions of any YouTube video longer than 10 minutes; they are loaded as one long text
2. Embed this text as vectors and store them in a VectorStore DB ([Chroma](https://python.langchain.com/v0.1/docs/integrations/vectorstores/chroma/) is recommended)
3. Build a retriever and an LLM chain based on that retriever (the [TA session Colab](https://colab.research.google.com/drive/1T7Q4X9qeppvMuYI3pJLN4OfD9L_F6YMx?usp=sharing) covers this🐳)
4. Design a question related to the video
5. Finally, **print the result of `chain.invoke('paste your own question here')`**




In [19]:
!pip install --upgrade --quiet  youtube-transcript-api

In [20]:
# Install and import necessary libraries
from langchain_community.document_loaders import YoutubeLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceInferenceAPIEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

In [21]:
# YouTube transcripts Loaders
loader = YoutubeLoader.from_youtube_url(
    "https://www.youtube.com/watch?v=lYtdZo0aW50", add_video_info=False
)
film = loader.load()

# Split the text
text_splitter = RecursiveCharacterTextSplitter(chunk_size=3000, chunk_overlap=100)
sp_film = text_splitter.split_documents(film)

num_total_characters = sum([len(x.page_content) for x in sp_film])

print(f"We have {len(sp_film)} split documents")
print(f"Each split document has {num_total_characters / len(sp_film):,.0f} characters on average")

We have 4 split documents
Each split document has 2,586 characters on average
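
The `chunk_size` and `chunk_overlap` parameters control how long each piece is and how many characters neighbouring pieces share. The sketch below shows the idea with fixed-size windows on a short string. It is a simplified illustration under that assumption: `RecursiveCharacterTextSplitter` additionally tries to break on separators such as paragraph and line boundaries, so its chunks will generally differ. `split_with_overlap` is a hypothetical helper.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=3)
print(chunks)
```

With these toy settings, the last three characters of each chunk reappear at the start of the next one, which is what keeps a sentence that straddles a boundary retrievable from either side.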


In [22]:
# Get the embeddings engine ready
embeddings = HuggingFaceInferenceAPIEmbeddings(
    model_name="sentence-transformers/all-MiniLM-l6-v2",
    api_key=api_token,
)

# Set the vectorstore
vectorstoreDB = Chroma(embedding_function = embeddings,
                        persist_directory = './Chroma_DATABASE')
vectorstoreDB.add_documents(sp_film)

['459a7d18-5de8-417b-804d-786cba33fcc0',
 'b8763bde-3966-46d4-b0f4-40614cbaf8de',
 '70188cc7-b140-4e18-b7c6-fad7e3075d54',
 'a171c430-cc42-425a-a37d-4e7e5a1cc8f3']

In [23]:
# Define retriever
retriever = vectorstoreDB.as_retriever(search_kwargs={"k": 1})

## Define the prompt
template = """Please answer the following question based on the context provided:

{context}

Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)

# Set the LLM chain
chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
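
The retriever with `k=1` hands the chain only the single chunk whose embedding is closest to the question's embedding. The sketch below illustrates nearest-neighbour lookup with cosine similarity on made-up three-dimensional vectors. The vectors, the helper names, and the choice of cosine similarity are all assumptions for illustration; the metric actually used depends on the Chroma collection settings.

```python
import math


def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def top_k(query_vec, doc_vecs, k=1):
    """Return the indices of the k document vectors most similar to the query."""
    ranked = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine_similarity(query_vec, doc_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


# Made-up embeddings for four chunks (real embeddings have hundreds of dimensions).
toy_doc_vecs = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
    [0.0, 0.0, 1.0],
]
toy_query_vec = [0.9, 0.1, 0.0]
print(top_k(toy_query_vec, toy_doc_vecs, k=1))
```

Raising `k` passes more chunks into `{context}`, which can help when the answer is spread across several parts of the transcript, at the cost of a longer prompt.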

In [24]:
# Question
user_query = "How did the speaker rebel when he was 14?"

In [25]:
# Invoke the chain
chain.invoke(user_query)

'\nAnswer: The speaker rebelled by doing his math homework in secret, locking himself in a room, and hiding it from his father. When his father discovered him, the speaker lied and claimed he was masturbating instead, hoping his father would not be disappointed in him for doing homework. The speaker also mentioned that his parents are very traditional and hard to please, making it difficult for him to bond with them or enjoy simple pleasures like watching TV.'