<a href="https://colab.research.google.com/github/donghuna/AI-Expert/blob/main/%ED%99%A9%EC%8A%B9%EC%9B%90/%5BStudent%5DOllama%2BLangChain_for_tools_usingOpenAI.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Effective & Efficient LLM use

This notebook covers how to use LLMs effectively.

Assuming a situation where the OpenAI API is hard to use, we deploy a small LLM locally and use it.

## Overview
1. LLM via API (in a local environment)
  1. Install Ollama and learn how to use it (web UI)
  2. Use the Ollama API
2. Summarize a paper with an LLM
3. Extract important information from a DB with an LLM

# Installing packages
Install all the packages used in this notebook up front.


In [1]:
!pip install --quiet openai \
  langchain langchain_community langchain_core langchain_openai langchainhub \
  python-dotenv \
  tenacity \
  google-search-results \
  unstructured \
  arxiv pymupdf \
  tiktoken \
  streamlit streamlit-folium wikipedia


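# 1. Using an LLM through an API!

Since ChatGPT took off last spring, many of you have probably used LLMs in chat form through a web UI.

ChatGPT has recently added support for plugins as well, so the GPT Store, which offers a wide range of capabilities, is also available.

In this chapter, we move beyond web-based chat and extend LLM usage to Python code.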

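# Using an LLM locally
Sometimes security or cost concerns make the OpenAI API hard to use.

If security and cost are not an issue, you can also write your code by following the [OpenAI API docs](https://platform.openai.com/docs/overview).

This notebook, however, assumes that the OpenAI API is not an option.

In addition, there have recently been many reports of sLLMs (7\~13B) and medium-size LLMs (70\~200B) outperforming ChatGPT (GPT-4) in some cases, so depending on the situation the approach introduced in this notebook may be a better choice than GPT.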

## Deploying Ollama
### Reference
* [Ollama.com](https://ollama.com/library)
* [Quick Start](https://github.com/ollama/ollama?tab=readme-ov-file#ollama)


#### Official Ollama: deploying on bare metal
~~~bash
# install
curl -fsSL https://ollama.com/install.sh | sh
# run
ollama run llama3
~~~

#### Official Ollama: deploying with Docker
[reference](https://hub.docker.com/r/ollama/ollama)
~~~bash
# cpu only
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

# For GPU, please refer to the link above. The command differs by GPU vendor.
~~~

#### Deploying with Docker Compose, including the web UI
[reference](https://github.com/open-webui/open-webui)

**USE THIS! Easiest with web ui**


## LLAMA3 example
### If run with the official install
~~~bash
ollama run llama3
ollama run llama3:70b
ollama run llama3:text
ollama run llama3:70b-text
~~~

### With the web UI
Just select the model and you can use it (if the model has not been downloaded yet, it is downloaded automatically).

## Trying the WebUI
* Address: http://ldi.snu.ac.kr:3000/
* email: samsung.test@samsung.test.kr
* password: 1234

## Using the Ollama API as the OpenAI API
Ollama exposes an endpoint that mimics the OpenAI API.

Because of this, you can use a model deployed on Ollama through the openai library.
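
Because the endpoint follows the OpenAI wire format, a request can in principle be assembled with nothing but the standard library. The following is a minimal sketch, assuming the server exposes the OpenAI-compatible route `/v1/chat/completions`; the host `localhost` and the helper name `build_chat_request` are illustrative assumptions, not part of this notebook. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_chat_request(base_url, model, messages, api_key="ollama"):
    """Assemble (but do not send) an OpenAI-style chat completion request."""
    payload = {"model": model, "messages": messages}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Placeholder key: OpenAI-style clients expect one to be present.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Hypothetical local deployment; replace with your own server address.
request = build_chat_request(
    base_url="http://localhost:11434/v1",
    model="llama3",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

In practice the `openai` client used in the cells below does this work for you; the sketch only makes visible what "mimicking the OpenAI API" means at the HTTP level.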


In [4]:
# Environment setup
# Set OPENAI_API_BASE and OPENAI_API_KEY as environment variables.

import os


# should be http://{deployed server url}:11434/v1

# os.environ["OPENAI_API_BASE"] = "http://ldi.snu.ac.kr:11434/v1" # RTX 3080, 11GB
# model_name = "llama3"

os.environ["OPENAI_API_BASE"] = "http://dilab15.snu.ac.kr:11434/v1" # A6000, 48GB
model_name = "llama3:8b-instruct-fp16"

os.environ["OPENAI_API_KEY"] = "ollama"


## Chatting through the OpenAI API

In [6]:
from openai import OpenAI
from google.colab import userdata


# Initialize the OpenAI client
client = OpenAI(
    # base_url=os.environ["OPENAI_API_BASE"],
    api_key=userdata.get('OPEN-AI_KEY'),
    organization=userdata.get('MY_OPENAI_ORG_ID'),
    project=userdata.get('MY_OPENAI_PROJECT_ID')
)

# Generate a response with chat completion
response = client.chat.completions.create(
    model='gpt-3.5-turbo',
    messages=[
        # System prompt: describes the role the LLM should play, or constraints that apply to the whole conversation.
        {"role": "system", "content": "You are a helpful assistant."},

        # Main prompt: user and assistant turns alternate.
        # It usually starts with a user turn.
        # In the example below, the LLM has already answered once.
        # The LLM's answers must be written with the assistant role.

        ### The conversation asks which team won the 2020 World Series.
        ### In 2020 the Dodgers won the title, beating the Rays 3-1 in the deciding game. (https://www.google.com/search?q=2020+world+series&oq=2020+world+series&sourceid=chrome&ie=UTF-8)
        ### (Llama 3 was trained after 2020, so it has learned and remembers this information.)
        ### Let's additionally ask where the games were played.
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": "The LA Dodgers won in 2020."},
        {"role": "user", "content": "Where did the game start?"}
    ]
)

print(response.choices[0].message.content)

The 2020 World Series was played at a neutral site due to the COVID-19 pandemic. The games were held at Globe Life Field in Arlington, Texas.
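
The alternating user/assistant structure described in the comments above can be kept in a small helper, so that each new turn is appended to the history before the next request. This is a minimal sketch; the helper name `add_turn` is hypothetical and not part of the `openai` library.

```python
def add_turn(history, role, content):
    """Return a new message list with one more turn appended."""
    if role not in ("system", "user", "assistant"):
        raise ValueError(f"unknown role: {role}")
    return history + [{"role": role, "content": content}]


history = [{"role": "system", "content": "You are a helpful assistant."}]
history = add_turn(history, "user", "Who won the world series in 2020?")
history = add_turn(history, "assistant", "The LA Dodgers won in 2020.")
history = add_turn(history, "user", "Where did the game start?")

# After the system prompt, the roles alternate user -> assistant -> user.
print([message["role"] for message in history])
```

The resulting `history` list can be passed as the `messages` argument of `client.chat.completions.create`; after each response, append the model's reply with the `assistant` role before adding the next `user` turn.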


In [None]:
# Let's continue the conversation.
msgs = [
     {"role": "user", "content": "Recommend me 5 Korean foods for dinner."}
  ]
# Generate a response with chat completion
response = client.chat.completions.create(
    model=model_name,
    messages=msgs
)

print(response.choices[0].message.content)

Here are 5 delicious Korean dishes that are perfect for dinner:

1. **Bulgogi** (): A classic Korean dish, bulgogi is a marinated beef dish that's grilled to perfection. The sweet and savory marinade makes the beef tender and flavorful.
2. **Jeyuk bokkeum** (): This spicy stir-fried pork dish is a staple in Korean cuisine. Thinly sliced pork is stir-fried with vegetables, chili peppers, and a variety of spices, creating a bold and aromatic flavor profile.
3. **Naengmyeon** (): If you're looking for something light and refreshing, naengmyeon (cold buckwheat or starch noodles) is the way to go. This dish typically consists of chewy noodles served with a spicy sauce, sliced cucumbers, pear slices, and boiled egg.
4. **Dakgangjeong** (): Crispy fried chicken glazed in a sweet and savory soy-ginger sauce - what's not to love? Dakgangjeong (Korean-style fried chicken) is a popular Korean dish that's easy to make and always satisfying.
5. **Doenjang jjigae** (): For those who love fermented f

In [None]:
# Solve a math problem
# CoT, ICL
# CoT: https://arxiv.org/pdf/2205.11916
# Define the prompt with a one-shot example
# Since the wave of in-context learning papers, providing examples along with the prompt (few-shot examples) has become a fairly standard practice.
# On some tasks performance drops sharply without examples; in-context learning opened a new era in LLM prompting.
prompt = [
    {"role": "user", "content": """
Solve the following math problem:

Example:
Problem: What is 7 + 5?
Solution: 7 + 5 = 12

Now solve this problem:
Problem: What is 8 * 6?

Think step by step.

Solution:
"""}
]

# Call the OpenAI API with the defined prompt
response = client.chat.completions.create(
    model=model_name,
    messages=prompt,
    temperature=0.2
)

# Extract and print the response
solution = response.choices[0].message.content.strip()
print(solution)


Let's solve the problem step by step.

Problem: What is 8 * 6?

Step 1: Multiply 8 and 6

8 × 6 = ?

To multiply, I'll count how many groups of 8 there are in 6 sets:

* 8 + 8 = 16 (2 groups)
* 16 + 8 = 24 (3 groups)
* 24 + 8 = 32 (4 groups)
* 32 + 8 = 40 (5 groups)
* 40 + 8 = 48 (6 groups)

So, the result of multiplying 8 and 6 is:

8 × 6 = 48

Solution:
Problem: What is 8 * 6?
Solution: 8 * 6 = 48
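
The one-shot prompt above is written by hand. When the number of examples varies, the same layout can be generated from a list of (problem, solution) pairs. The following is a sketch under that assumption; the function name `build_few_shot_prompt` and its parameters are hypothetical.

```python
def build_few_shot_prompt(examples, problem, think_step_by_step=True):
    """Build a few-shot prompt in the same layout as the hand-written one."""
    lines = ["Solve the following math problem:", ""]
    for example_problem, example_solution in examples:
        lines += [
            "Example:",
            f"Problem: {example_problem}",
            f"Solution: {example_solution}",
            "",
        ]
    lines += ["Now solve this problem:", f"Problem: {problem}", ""]
    if think_step_by_step:
        # The chain-of-thought trigger phrase used in the cell above.
        lines += ["Think step by step.", ""]
    lines.append("Solution:")
    return "\n".join(lines)


prompt_text = build_few_shot_prompt(
    examples=[("What is 7 + 5?", "7 + 5 = 12")],
    problem="What is 8 * 6?",
)
few_shot_messages = [{"role": "user", "content": prompt_text}]
print(prompt_text)
```

Passing `few_shot_messages` as `messages` reproduces the request above; adding more pairs to `examples` turns the one-shot prompt into a few-shot one.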


In [None]:
# The training data of llama3-8B covers information up to March 2023.
# https://huggingface.co/meta-llama/Meta-Llama-3-8B
msgs = [
     {"role": "user", "content": "How is the interest rate in US?"}
  ]
# Generate a response with chat completion
response = client.chat.completions.create(
    model=model_name,
    messages=msgs
)

print(response.choices[0].message.content)

The current interest rates in the United States are set by the Federal Reserve, which is the central bank of the country. The Fed uses monetary policy tools to influence interest rates and promote maximum employment, stable prices, and moderate long-term interest rates.

As of March 2023, the key interest rates set by the Federal Reserve are:

1. **Federal Funds Target Rate**: This is the primary tool used by the Fed to set short-term interest rates. As of March 15, 2023, the federal funds target rate is between 4.50% and 4.75%.
2. **Federal Reserve Discount Rate**: This is the rate at which banks borrow money directly from the Federal Reserve. The current discount rate is 5.00%.

These interest rates affect a range of other interest rates in the economy, including:

1. **Prime Lending Rate**: This is the rate at which banks lend to their most creditworthy customers. As of March 2023, the prime lending rate is around 6.50%.
2. **Mortgage Rates**: The 30-year fixed mortgage rate has bee

In [None]:
# https://www.bbc.com/news/articles/c1ddj7v9y97o
msgs = [
     {"role": "user", "content": """
The US Federal Reserve has signalled that it will cut its key interest rate just once this year despite inflation easing.
Back in March, the central bank had been expected to reduce borrowing costs three times by the end of 2024.
However, on Wednesday, new forecasts from Fed officials who make decisions on rates pencilled in a single reduction.
The new outlook emerged after the Fed voted to hold interest rates at their current 23-year high even as inflation ticked lower.
Inflation, which measures the pace of price rises, slowed to 3.3% in the year to May. That compares with 3.4% in the 12 months to April.
However, between April and May inflation was unchanged and it remains above the Fed's 2% target.
Jerome Powell, chair of the Federal Reserve, said that only "modest" progress had been made on hitting the target and the central bank would need to see "good inflation readings" before interest rates can be cut.
US interest rates were held at 5.25%-5.5%.
Anastassia Fedyk, assistant professor of finance at Haas Business School at the University of California Berkeley, told the BBC's Today programme: "We did get some good news in terms of better inflation numbers.
"But the Fed is still being pretty cautious so they are signalling that in the future they are going to be doing one, most likely, rate drop and not a very large one at that."
Some analysts suggested that the central bank would backtrack on the number of interest rate cuts this year.
Ian Shepherdson, chief economist at Pantheon Macroeconomics, said that reducing forecasts of interest rate cuts from three to one this year was "unnecessarily aggressive".
While economists at Wells Fargo said it would be a "close call" between making one or two reductions in 2024.
Officials at the US Fed were split over how many interest rate cuts they expected this year. Of the 19 policymakers who gave their outlook, four expected no cut, seven forecasted one reduction while eight thought there would be two.
US jobs surge casts doubt over interest rate cuts
Will the UK and US cut interest rates like Europe?
When will interest rates come down?
Forecasts from the US Fed signalled one modest cut to 5%-5.25%.
Mr Powell acknowledged that a reduction of this size would not have a major impact on the US economy.
But he said when a cut finally does come it would be “a consequential decision for the economy” which “you want to get right".
While inflation eased a little, the US employment market remains robust. Recent data showed that US employers added 272,000 jobs in May - far above the 185,000 expected.
Ms Fedyk said: "The Fed is trying to react to the data but not overreact to the data."
Some other major economies have cut interest rates, including the European Central Bank and the Bank of Canada.
But the US - and the UK - are yet to make a similar move. The Bank of England will meet next week and is widely expected to hold interest rates at 5.25%, their highest level for 16 years.
The Consumer Prices Index (CPI) measure of inflation has slowed significantly in the UK from a high of 11.1% in October 2022 to 2.3% currently.
However, some elements of inflation remain stubbornly high. At the same time, average wage growth in the UK remains strong compared to inflation.
Earlier this week, Ruth Gregory, deputy chief UK economist at Capital Economics said: "Overall, the stickiness of wage growth may not stop the Bank from cutting interest rates for the first time in August, as we are forecasting, as long as other indicators such as pay settlements data and next week’s CPI inflation release show decent progress."

How is the interest rate in US?
"""}
  ]
# Generate a response with chat completion
response = client.chat.completions.create(
    model=model_name,
    messages=msgs
)

print(response.choices[0].message.content)

According to the article, the interest rate in the United States has been held at its current level of 5.25%-5.5%, which is a 23-year high. The Federal Reserve (Fed) has signalled that it will only cut its key interest rate once this year, and even then, the cut is expected to be modest, likely reducing the rate to 5%-5.25%.
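
The two requests above differ only in whether a recent article is pasted in front of the question. That pattern can be wrapped in a helper, sketched below. The function name `build_grounded_messages` and the instruction sentence "Answer the question using only the article below." are assumptions added for illustration; the cell above simply concatenates the article and the question.

```python
def build_grounded_messages(context, question):
    """Put up-to-date context in front of the question in a single user turn."""
    content = (
        "Answer the question using only the article below.\n\n"
        f"Article:\n{context.strip()}\n\n"
        f"Question: {question.strip()}"
    )
    return [{"role": "user", "content": content}]


# A one-sentence excerpt from the article quoted in the cell above.
article = "US interest rates were held at 5.25%-5.5%."
grounded_msgs = build_grounded_messages(article, "How is the interest rate in US?")
print(grounded_msgs[0]["content"])
```

Because the context occupies part of the model's input window, long sources need to be split first, which is what the summarization section later in this notebook deals with.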


# Tools
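After using LLMs for a while, you may start to wonder whether they can do anything.

But after a few hours of use you realize that an LLM's abilities are limited to generating text within the data it was trained on.

So how can we extend what an LLM can do?

Just as people do, it can use tools specialized for the task.

Much like a person who could tear paper by hand but reaches for scissors instead.

Then would things like the following be possible?

1. Getting a summary of a paper published after the LLM was trained and asking about its key points, or
2. Having it look up the main destinations on its own and put together a travel plan.

In the following sessions:
1. We extend the LLM so that it can fetch a paper from arXiv, split it appropriately, produce a summary, and answer the user's questions.
2. You read the LangChain docs yourselves and try out the tools.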


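# [Arxiv] Summarizing an arXiv PDF

Many preprints are published on arXiv. Simply following the major papers there goes a long way toward understanding a field.

But because these are preprints, dozens of papers are posted every day.

Picking out the important ones, and reading and organizing the ones that look important, both take a lot of time.

This is where an LLM can help.

Let's ask an LLM for a summary, the main contributions, and the important results.

These features could be implemented directly with the `chat.completions` call used earlier, but that would be cumbersome and error-prone.

LangChain removes much of that hassle and error handling.

Beyond the LLM itself, LangChain helps the LLM work with various tools to reach a goal.

It also provides a way to guard against 'lost-in-the-middle' (the model forgetting the middle of a long document): it splits the document into shards, summarizes each shard, and then summarizes those summaries.

For more details on LangChain, read the [langchain docs](https://python.langchain.com/v0.2/docs/introduction/).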
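
The shard, summarize, then summarize-the-summaries idea can be sketched in a few lines of plain Python before bringing in LangChain. This is a simplified illustration, not LangChain's implementation: `first_sentence` is a stand-in for a real LLM call, and both function names are hypothetical.

```python
def map_reduce_summarize(chunks, summarize):
    """Summarize each chunk (map), then summarize the joined summaries (reduce)."""
    partial_summaries = [summarize(chunk) for chunk in chunks]  # map step
    return summarize("\n".join(partial_summaries))  # reduce step


def first_sentence(text):
    """Stand-in 'summarizer' that keeps only the first sentence of its input."""
    return text.split(".")[0].strip() + "."


chunks = [
    "Transformers use attention. They drop recurrence.",
    "BLEU improves. Training is faster.",
]
print(map_reduce_summarize(chunks, first_sentence))
```

Replacing `first_sentence` with a function that sends the text to the model gives the same structure that the cells below build with `MapReduceDocumentsChain`.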

In [None]:
# imports
import os
import warnings
from pprint import pprint

from dotenv import load_dotenv
from langchain_community.document_loaders import PyPDFLoader, PDFMinerLoader, ArxivLoader, PyMuPDFLoader
from langchain_community.retrievers import ArxivRetriever

from langchain.prompts import PromptTemplate
from langchain_community.chat_models import ChatOpenAI
from langchain.chains import LLMChain

from langchain.chains.combine_documents.stuff import StuffDocumentsChain
from langchain.chains import ReduceDocumentsChain, MapReduceDocumentsChain
from langchain.text_splitter import CharacterTextSplitter
from langchain.chains.question_answering import load_qa_chain

In [None]:
# Configuration.
load_dotenv()
VERBOSE = False
arxiv_id = "1706.03762" # Attention Is All You Need

In [None]:
# Read the PDF
# You could load a PDF file directly,
# but here we use the feature that fetches it from arXiv automatically.

# # load pdf
# loader = PyMuPDFLoader(pdf_name)
# pdf_pages = loader.load_and_split()

# load from arxiv
retriever = ArxivRetriever(load_max_docs=1)
retriever.get_full_documents = True
retriever.doc_content_chars_max = 1_000_000
arxiv_doc = retriever.invoke(arxiv_id)[0]

In [None]:
# ========== 0 LLM setting ========== #
# Configure the LLM through LangChain.
llm = ChatOpenAI(
    base_url=os.environ["OPENAI_API_BASE"],
    api_key=os.environ["OPENAI_API_KEY"],
    temperature=0,
    model_name=model_name
)



In [None]:
# ========== 1 Document splitting ========== #
# Split the document into chunks.
text_splitter = CharacterTextSplitter.from_tiktoken_encoder(
    separator="\n",  # split on newlines
    chunk_size=3000,  # chunk size (tokens)
    chunk_overlap=500,  # overlap between chunks (tokens)
)

# Run the split
split_docs = text_splitter.split_documents([arxiv_doc])
# Total number of chunks
print(f'Total number of split documents: {len(split_docs)}')

Total number of split documents: 5
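
To see what `chunk_size` and `chunk_overlap` mean, here is a simplified sliding-window sketch over a list of tokens. It is not the algorithm `CharacterTextSplitter` uses (that splitter first cuts on the separator and then merges pieces up to the size limit); whitespace-separated words stand in for tokens, and the function name `split_with_overlap` is hypothetical.

```python
def split_with_overlap(tokens, chunk_size, chunk_overlap):
    """Cut tokens into windows of chunk_size that share chunk_overlap tokens."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end of the input
    return chunks


words = "a b c d e f g h i j".split()
for chunk in split_with_overlap(words, chunk_size=4, chunk_overlap=1):
    print(chunk)
```

Each window repeats the last token of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk when the overlap is large enough.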


In [None]:
# Process the chunks with Map-Reduce.
# The map step summarizes each chunk.
# The reduce step merges the chunk summaries into one.

# ========== 2 Map step ========== #

# Prompt used in the map step
# This is the prompt applied to each chunk.
# Each chunk is substituted into the {pages} variable in turn.
map_template = """The following is a page from the document:
{pages}
Please summarize the content of the page.
Response:"""

# Build the map prompt
map_prompt = PromptTemplate.from_template(map_template)

# LLMChain run in the map step
map_chain = LLMChain(llm=llm, prompt=map_prompt)



In [None]:
# ========== 3 Reduce step ========== #

# Prompt used in the reduce step
reduce_template = """These are partial summaries from each page of the documents:
{doc_summaries}
Please summarize the summaries into a single coherent summary.
Response:"""

# Build the reduce prompt
reduce_prompt = PromptTemplate.from_template(reduce_template)

# LLMChain run in the reduce step
reduce_chain = LLMChain(llm=llm, prompt=reduce_prompt)

# Takes a list of documents, joins them into a single string, and passes it to the LLMChain.
combine_documents_chain = StuffDocumentsChain(
    llm_chain=reduce_chain,
    document_variable_name="doc_summaries"  # variable filled in the reduce prompt
)

# Combines the mapped documents and reduces them iteratively.
reduce_documents_chain = ReduceDocumentsChain(
    # The final chain that is called.
    combine_documents_chain=combine_documents_chain,
    # Used when the documents exceed the context of `StuffDocumentsChain`.
    collapse_documents_chain=combine_documents_chain,
    # Maximum number of tokens when grouping documents.
    token_max=4000,
)
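
The role of `token_max` can be illustrated with a plain-Python sketch: while the summaries together exceed the budget, group them so each group fits, summarize each group, and repeat. This only mirrors the idea described in the comments above and is not LangChain's implementation; the whitespace word count stands in for a real tokenizer, and `collapse_until_fits` and `keep_three_words` are hypothetical names.

```python
def count_tokens(text):
    """Crude stand-in for a tokenizer: count whitespace-separated words."""
    return len(text.split())


def collapse_until_fits(summaries, summarize, token_max):
    """Repeatedly group and re-summarize until everything fits in token_max."""
    docs = list(summaries)
    while len(docs) > 1 and sum(count_tokens(d) for d in docs) > token_max:
        groups, current = [], []
        for doc in docs:
            current_size = sum(count_tokens(d) for d in current)
            if current and current_size + count_tokens(doc) > token_max:
                groups.append(current)
                current = []
            current.append(doc)
        groups.append(current)
        collapsed = [summarize("\n".join(group)) for group in groups]
        if sum(count_tokens(d) for d in collapsed) >= sum(count_tokens(d) for d in docs):
            break  # the summarizer made no progress; avoid looping forever
        docs = collapsed
    return summarize("\n".join(docs))  # final combine step


def keep_three_words(text):
    """Stand-in 'summarizer' that keeps the first three words."""
    return " ".join(text.split()[:3])


summaries = ["one two three four", "five six seven eight", "nine ten eleven twelve"]
print(collapse_until_fits(summaries, keep_three_words, token_max=8))
```

With a real model in place of `keep_three_words`, the loop guards against the case where the map-step summaries of a long paper are themselves too long for a single reduce prompt.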

In [None]:
# ========== 4 Combining Map and Reduce ========== #

# Maps a chain over the documents, then combines the results.
map_reduce_chain = MapReduceDocumentsChain(
    # Map chain
    llm_chain=map_chain,
    # Reduce chain
    reduce_documents_chain=reduce_documents_chain,
    # Name of the llm_chain variable that receives the document (the variable defined in map_template)
    document_variable_name="pages",
    # Whether to return the map-step results in the output.
    return_intermediate_steps=False,
    verbose=VERBOSE
)


In [None]:
# ========== 5 Results ========== #

# Run the Map-Reduce chain
# Input: the split documents (the result of step 1)
result = map_reduce_chain.invoke(split_docs, {"show_progress": True})
# Print the summary
pprint(result)

chain = load_qa_chain(
    llm=llm,
    chain_type="map_reduce",
    verbose=VERBOSE
)




{'input_documents': [Document(page_content='Provided proper attribution is provided, Google hereby grants permission to\nreproduce the tables and figures in this paper solely for use in journalistic or\nscholarly works.\nAttention Is All You Need\nAshish Vaswani∗\nGoogle Brain\navaswani@google.com\nNoam Shazeer∗\nGoogle Brain\nnoam@google.com\nNiki Parmar∗\nGoogle Research\nnikip@google.com\nJakob Uszkoreit∗\nGoogle Research\nusz@google.com\nLlion Jones∗\nGoogle Research\nllion@google.com\nAidan N. Gomez∗†\nUniversity of Toronto\naidan@cs.toronto.edu\nŁukasz Kaiser∗\nGoogle Brain\nlukaszkaiser@google.com\nIllia Polosukhin∗‡\nillia.polosukhin@gmail.com\nAbstract\nThe dominant sequence transduction models are based on complex recurrent or\nconvolutional neural networks that include an encoder and a decoder. The best\nperforming models also connect the encoder and decoder through an attention\nmechanism. We propose a new simple network architecture, the Transformer,\nbased solely on atten

In [None]:
# QnA: main contribution
query = "What are the main contributions of this paper?"
result_qa_chain = chain.run(input_documents=split_docs, question=query)
pprint(result_qa_chain)



('Based on the provided text, it appears that the main contributions of this '
 'paper are:\n'
 '\n'
 '1. **Introducing the Transformer model**: The authors propose a new '
 'architecture for sequence-to-sequence tasks that replaces traditional '
 'recurrent neural networks (RNNs) and convolutional neural networks (CNNs) '
 'with self-attention mechanisms.\n'
 '2. **Self-Attention Mechanism**: They introduce a novel self-attention '
 'mechanism that allows the model to attend to all positions in the input '
 'sequence simultaneously, rather than sequentially as in RNNs or CNNs.\n'
 '3. **Parallelization**: The Transformer model can be parallelized more '
 'easily than RNNs and CNNs, making it faster and more efficient for '
 'large-scale applications.\n'
 '4. **Improved Performance**: The authors demonstrate that their Transformer '
 'model achieves state-of-the-art results on several machine translation '
 'tasks, outperforming previous models while using fewer parameters and less '
 

In [None]:
# QnA: main methodology
query = "What is the main methodology of this paper?"
result_qa_chain = chain.run(input_documents=split_docs, question=query)
pprint(result_qa_chain)



('Based on the provided content, the main methodology of this paper appears to '
 'be the development and presentation of a new neural network architecture '
 'called the Transformer, which is designed for sequence-to-sequence tasks '
 'such as machine translation. The key innovations of the Transformer are:\n'
 '\n'
 '1. **Self-Attention Mechanism**: Instead of using recurrent or convolutional '
 'layers to model dependencies between input sequences, the Transformer uses '
 'self-attention mechanisms to allow the model to attend to all positions in '
 'the input sequence simultaneously.\n'
 '2. **Encoder-Decoder Architecture**: The Transformer consists of an encoder '
 'and a decoder, where the encoder maps the input sequence to a continuous '
 'representation, and the decoder generates the output sequence one step at a '
 'time.\n'
 '3. **Positional Encoding**: To inject information about the relative or '
 'absolute position of the tokens in the sequence, the paper uses positional '

In [None]:
# QnA: main results
query = "What are the important results to look at in this paper? Please provide with numbers. Provide all the results if possible."
result_qa_chain = chain.run(input_documents=split_docs, question=query)
pprint(result_qa_chain)



('Based on the provided text, here are some of the important results mentioned '
 'in the paper:\n'
 '\n'
 '**BLEU Scores**\n'
 '\n'
 '* The Transformer achieves better BLEU scores than previous state-of-the-art '
 'models on the English-to-German and English-to-French newstest2014 tests:\n'
 '\t+ EN-DE: 27.3 (Transformer base model), 28.4 (Transformer big)\n'
 '\t+ EN-FR: 38.1 (Transformer base model), 41.8 (Transformer big)\n'
 '\n'
 '**ROUGE Scores**\n'
 '\n'
 "* The Transformer achieves a ROUGE score of 24.5 on the WMT'14 "
 'English-to-German translation task, outperforming the previous '
 'state-of-the-art model by 1.8 points.\n'
 '\n'
 '**Perplexity**\n'
 '\n'
 "* The Transformer achieves a perplexity of 15.4 on the WMT'14 "
 'English-to-German translation task, outperforming the previous '
 'state-of-the-art model by 0.6 points.\n'
 '\n'
 '**Speed**\n'
 '\n'
 '* The Transformer is significantly faster than the previous state-of-the-art '
 'model, with a speedup of 10x on a sing

# [Build it yourself]
https://python.langchain.com/v0.1/docs/integrations/tools/

The LangChain docs list a wide variety of tools, each documented so that it is easy to use.

Look through the available tools and chain one, two, or more of them together to extend what the LLM can do.

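## [Examples]
1. Identify a trending topic and the value of related companies
  1. Use "google trends" to find out which topics are currently trending.
  2. Use "google search" to look up companies working on that topic.
  3. Use "google finance" to check the value of those companies.

2. Math tutor
  1. Solve simple calculations directly
  2. For graph problems, use "wolfram alpha" to draw the plot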



In [None]:
# Build your own here! :)