<a target="_blank" href="https://colab.research.google.com/github/UpstageAI/cookbook/blob/main/Solar-LLM-ZeroToAll/01_hello_solar.ipynb">
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

In [1]:
! pip3 install -qU langchain-upstage python-dotenv

In [2]:
import os
import getpass
import warnings
warnings.filterwarnings("ignore")

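# Prompt for the key without echoing it to the notebook output.
UPSTAGE_API_KEY = getpass.getpass("Enter your Upstage API key: ")
# setdefault keeps a key that is already exported and only fills in a missing one.
_ = os.environ.setdefault("UPSTAGE_API_KEY", UPSTAGE_API_KEY)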
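
The cell above prompts on every run, even when the key is already exported. A minimal sketch of an environment-first lookup follows, assuming the variable name `UPSTAGE_API_KEY` used above. The helper name `get_api_key` and its `prompt` parameter are hypothetical and not part of `langchain-upstage`.

```python
import getpass
import os


def get_api_key(env_var="UPSTAGE_API_KEY", prompt=getpass.getpass):
    """Return the key from the environment, prompting only if it is missing."""
    key = os.environ.get(env_var)
    if not key:
        # Hypothetical fallback: ask interactively and cache the answer for later cells.
        key = prompt(f"Enter your {env_var}: ")
        os.environ[env_var] = key
    return key
```

Passing a different `prompt` callable makes the helper usable in non-interactive runs, where `getpass.getpass` would block.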

In [3]:
from langchain_upstage import ChatUpstage
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder


llm = ChatUpstage()

qa_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are an assistant for question-answering tasks."),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}"),
    ]
)

chain = qa_prompt | llm | StrOutputParser()

In [4]:
# With an empty chat history, the follow-up question has no earlier turn to refer to.
question = "How about Korea?"
ai_msg_1 = chain.invoke({"input": question, "chat_history": []})
print(ai_msg_1)

Ah, Korea! A fascinating place with a rich history and vibrant culture. It's divided into two distinct regions: North Korea and South Korea. 

South Korea, officially the Republic of Korea, is a peninsula country in East Asia. It's known for its high-tech cities like Seoul, its beautiful landscapes, and its unique blend of traditional and modern culture. The capital, Seoul, is a bustling metropolis with skyscrapers, ancient palaces, and lively street markets. 

North Korea, officially the Democratic People's Republic of Korea, is a country in East Asia, constituting the northern part of the Korean Peninsula. It's a mysterious and isolated country, often in the news for political reasons. Despite this, it's also home to stunning natural beauty and a unique culture.

Both Koreas have a long history, with a shared heritage that dates back thousands of years. Their cuisine, language, and customs have many similarities, but they've developed differently due to their separate political paths

In [5]:
from langchain_core.messages import HumanMessage, AIMessage

chat_history = []

# First turn: ask, then record the question and the answer in the history.
question = "Where is the capital of France?"
ai_msg_1 = chain.invoke({"input": question, "chat_history": chat_history})
print(ai_msg_1)
chat_history.extend([HumanMessage(question), AIMessage(ai_msg_1)])

# Second turn: the recorded first turn is sent along with the follow-up question.
second_question = "How about Korea?"
ai_msg_2 = chain.invoke({"input": second_question, "chat_history": chat_history})
print(ai_msg_2)
chat_history.extend([HumanMessage(second_question), AIMessage(ai_msg_2)])

Paris is the capital of France.
Seoul is the capital of South Korea.
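
The invoke-then-extend pattern in the cell above repeats on every turn, so it can be wrapped in one function. This is a sketch only: `ask` and `fake_invoke` are hypothetical names, and plain `(role, text)` tuples stand in for `HumanMessage`/`AIMessage` so the block runs without LangChain installed.

```python
from typing import Callable, Dict, List, Tuple

# Stand-in message type: (role, text) pairs instead of HumanMessage/AIMessage.
Message = Tuple[str, str]


def ask(invoke: Callable[[Dict], str], question: str, history: List[Message]) -> str:
    """Send one question with the history so far, then record the new turn."""
    answer = invoke({"input": question, "chat_history": list(history)})
    history.extend([("human", question), ("ai", answer)])
    return answer


def fake_invoke(payload: Dict) -> str:
    """Hypothetical stub for chain.invoke: reports how much history it received."""
    return f"saw {len(payload['chat_history'])} earlier messages"


history: List[Message] = []
print(ask(fake_invoke, "Where is the capital of France?", history))
print(ask(fake_invoke, "How about Korea?", history))
```

With the real chain, `invoke` would presumably be `chain.invoke`, and the history entries would be the message objects used in the notebook.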


In [6]:
llm = ChatUpstage()

qa_system_prompt = """You are an assistant for question-answering tasks. \
Use the following pieces of retrieved context to answer the question. \
If you don't know the answer, just say that you don't know. \
Use three sentences maximum and keep the answer concise.\

{context}"""

qa_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", qa_system_prompt),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}"),
    ]
)

chain = qa_prompt | llm | StrOutputParser()
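
The trailing backslashes in `qa_system_prompt` are Python line continuations inside a triple-quoted string, so the four instruction lines collapse into one. The sketch below shows the effect on a shortened stand-in template, not the notebook's actual prompt. `str.format` stands in here for the `{context}` substitution that the prompt template performs.

```python
# Shortened stand-in for qa_system_prompt; the wording here is illustrative only.
template = """Answer from the context. \
Keep it short.\

{context}"""

# Backslash-newline pairs vanish, so only the blank line's newline survives.
assert template == "Answer from the context. Keep it short.\n{context}"

filled = template.format(context="SOLAR 10.7B has 10.7 billion parameters.")
print(filled)
```

The model therefore receives the instructions as a single line, followed by the context on the next line.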

In [7]:
context = """
We introduce SOLAR 10.7B, a large language model (LLM) with 10.7 billion parameters,
demonstrating superior performance in various natural language processing (NLP) tasks.
Inspired by recent efforts to efficiently up-scale LLMs,
we present a method for scaling LLMs called depth up-scaling (DUS),
which encompasses depthwise scaling and continued pretraining.
In contrast to other LLM up-scaling methods that use mixture-of-experts,
DUS does not require complex changes to train and inference efficiently.
We show experimentally that DUS is simple yet effective
in scaling up high-performance LLMs from small ones.
Building on the DUS model, we additionally present SOLAR 10.7B-Instruct,
a variant fine-tuned for instruction-following capabilities,
surpassing Mixtral-8x7B-Instruct.
SOLAR 10.7B is publicly available under the Apache 2.0 license,
promoting broad access and application in the LLM field.
"""

In [8]:
from langchain_core.messages import HumanMessage, AIMessage

chat_history = []

# The same context is passed on every turn; only chat_history grows.
question = "What is DUS?"
ai_msg_1 = chain.invoke({"input": question, "chat_history": chat_history, "context": context})
chat_history += [HumanMessage(question), AIMessage(ai_msg_1)]
print("A1", ai_msg_1)

# The follow-up never names DUS; the recorded first turn supplies the referent.
second_question = "What's the benefit?"
ai_msg_2 = chain.invoke({"input": second_question, "chat_history": chat_history, "context": context})
chat_history += [HumanMessage(second_question), AIMessage(ai_msg_2)]
print("A2", ai_msg_2)

A1 DUS stands for depth up-scaling. It is a method for scaling large language models (LLMs) that encompasses depthwise scaling and continued pretraining. Unlike other LLM up-scaling methods that use mixture-of-experts, DUS does not require complex changes to train and inference efficiently. It is a simple yet effective approach to scaling up high-performance LLMs from smaller ones.
A2 The benefit of DUS is that it provides a simple and efficient way to scale up high-performance large language models (LLMs) from smaller ones. This scaling process, which involves depthwise scaling and continued pretraining, does not require complex changes to the training and inference processes, unlike other LLM up-scaling methods that use mixture-of-experts. As a result, DUS can help improve the performance of LLMs in various natural language processing (NLP) tasks.


In [9]:
for chat in chat_history:
    print(chat)

content='What is DUS?'
content='DUS stands for depth up-scaling. It is a method for scaling large language models (LLMs) that encompasses depthwise scaling and continued pretraining. Unlike other LLM up-scaling methods that use mixture-of-experts, DUS does not require complex changes to train and inference efficiently. It is a simple yet effective approach to scaling up high-performance LLMs from smaller ones.'
content="What's the benefit?"
content='The benefit of DUS is that it provides a simple and efficient way to scale up high-performance large language models (LLMs) from smaller ones. This scaling process, which involves depthwise scaling and continued pretraining, does not require complex changes to the training and inference processes, unlike other LLM up-scaling methods that use mixture-of-experts. As a result, DUS can help improve the performance of LLMs in various natural language processing (NLP) tasks.'
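
As the printout shows, the history grows by two messages per turn, and the whole list is sent again on each call. Below is a sketch of capping it to the most recent turns, assuming every turn adds exactly one human and one AI message. The helper name `last_turns` is hypothetical and not a LangChain function.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def last_turns(history: Sequence[T], max_turns: int) -> List[T]:
    """Keep the most recent max_turns question/answer pairs (two messages each)."""
    if max_turns <= 0:
        return []
    return list(history[-2 * max_turns:])
```

A call such as `last_turns(chat_history, 3)` could be passed as the `chat_history` value in place of the full list.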
