<a target="_blank" href="https://colab.research.google.com/github/UpstageAI/cookbook/blob/main/Solar-Fullstack-LLM-101/11_summary_writing_translation.ipynb">
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

# 11. Summary, Writing, Translation

## Overview  
In this exercise, we use the Solar API to summarize, rewrite, and translate text. Using the LLM Chain techniques from the previous exercises, we generate summaries in a desired format and translate English to Korean with the Solar translation model (`solar-1-mini-translate-enko`). We also show how a few example turns in the prompt can steer the tone of the translation.
 
## Purpose of the Exercise
The purpose of this exercise is to become proficient with the LLM Chain and to control output formats through prompt engineering. By the end of this tutorial, you will be able to summarize, rewrite, and translate text with the Solar API, which helps when managing multilingual content.


In [2]:
!pip3 install -qU langchain-upstage requests python-dotenv

In [4]:
# @title set API key
import os
import getpass
from pprint import pprint
import warnings

warnings.filterwarnings("ignore")

from IPython import get_ipython

if "google.colab" in str(get_ipython()):
    # Running in Google Colab. Please set the UPSTAGE_API_KEY in the Colab Secrets
    from google.colab import userdata
    os.environ["UPSTAGE_API_KEY"] = userdata.get("UPSTAGE_API_KEY")
else:
    # Running locally. Please set the UPSTAGE_API_KEY in the .env file
    from dotenv import load_dotenv

    load_dotenv()

if "UPSTAGE_API_KEY" not in os.environ:
    os.environ["UPSTAGE_API_KEY"] = getpass.getpass("Enter your Upstage API key: ")


In [5]:
solar_text = """
SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling

We introduce SOLAR 10.7B, a large language model (LLM) with 10.7 billion parameters, 
demonstrating superior performance in various natural language processing (NLP) tasks. 
Inspired by recent efforts to efficiently up-scale LLMs, 
we present a method for scaling LLMs called depth up-scaling (DUS), 
which encompasses depthwise scaling and continued pretraining.
In contrast to other LLM up-scaling methods that use mixture-of-experts, 
DUS does not require complex changes to train and inference efficiently. 
We show experimentally that DUS is simple yet effective 
in scaling up high-performance LLMs from small ones. 
Building on the DUS model, we additionally present SOLAR 10.7B-Instruct, 
a variant fine-tuned for instruction-following capabilities, 
surpassing Mixtral-8x7B-Instruct. 
SOLAR 10.7B is publicly available under the Apache 2.0 license, 
promoting broad access and application in the LLM field.
"""

In [6]:
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_upstage import ChatUpstage


llm = ChatUpstage()

prompt_template = PromptTemplate.from_template(
    """
    Please do {action} on the following text: 
    ---
    TEXT: {text}
    """
)
chain = prompt_template | llm | StrOutputParser()
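
Before invoking the chain, it can help to look at the exact string the model will receive. The sketch below is a plain-Python stand-in that applies `str.format` to the same template text, so it runs without an API key; `render_prompt` is a hypothetical helper, not part of LangChain. With LangChain itself, `prompt_template.format(action=..., text=...)` returns the same kind of string.

```python
# Plain-Python stand-in for the prompt template above.
# render_prompt is a hypothetical helper used only for inspection.
TEMPLATE = """
    Please do {action} on the following text:
    ---
    TEXT: {text}
    """


def render_prompt(action: str, text: str) -> str:
    """Fill the two placeholders the way the first step of the chain would."""
    return TEMPLATE.format(action=action, text=text)


rendered = render_prompt("three line summarize", "SOLAR 10.7B is an LLM.")
print(rendered)
```

Reading the rendered prompt shows that `{action}` lands in the middle of the sentence "Please do ... on the following text", so a short noun phrase tends to read more naturally there than a full sentence.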

In [10]:
chain.invoke({"action": "three line summarize", "text": solar_text})

'\nSOLAR 10.7B, a 10.7 billion parameter large language model, demonstrates superior performance in various NLP tasks. The model uses depth up-scaling (DUS) method, which includes depthwise scaling and continued pretraining. DUS is a simple yet effective way to scale up high-performance LLMs without requiring complex changes to training and inference. Additionally, SOLAR 10.7B-Instruct, a variant fine-tuned for instruction-following capabilities, is available. SOLAR 10.7B is publicly available under the Apache 2.0 license.'
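
The response above comes back as a single paragraph rather than three separate lines. A minimal sketch of a format check, assuming we want one sentence or bullet per line; `nonempty_lines` and `is_n_line_summary` are hypothetical helper names, and the sample strings are made up for illustration.

```python
def nonempty_lines(text: str) -> list[str]:
    """Return the lines of text that contain something other than whitespace."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def is_n_line_summary(text: str, n: int = 3) -> bool:
    """True when the text has exactly n non-empty lines."""
    return len(nonempty_lines(text)) == n


# Illustrative strings, not real model output.
one_paragraph = "\nSOLAR 10.7B is an LLM. It uses DUS. It is Apache 2.0 licensed."
three_lines = "- SOLAR 10.7B is an LLM.\n- It uses DUS.\n- It is Apache 2.0 licensed."

print(is_n_line_summary(one_paragraph))  # False
print(is_n_line_summary(three_lines))    # True
```

If the check fails, one option is to make the action more explicit, for example `"a summary in exactly three lines, one sentence per line"`, and invoke the chain again.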

In [12]:
chain.invoke(
    {
        "action": "a rewrite so that children and grandma can understand it easily",
        "text": solar_text,
    }
)

'We created a new computer program called SOLAR 10.7B that is very good at understanding and speaking different languages. This program can do many things, like answering questions, writing stories, and even helping with homework.\n\nTo make this program better, we used a special way called "depth up-scaling" or DUS. This means we made the program bigger and stronger, but we did it in a simple way that is easy to understand and use. We also made sure that the program can learn new things on its own, just like a child learning from their experiences.\n\nWe showed that our simple way of making the program bigger and stronger works really well. It is much better than other ways that use more complicated ideas.\n\nWe also made a special version of the program called SOLAR 10.7B-Instruct. This version is even better at following instructions and doing what it\'s told. It\'s so good that it\'s better than other programs that use even more complicated ideas.\n\nNow, everyone can use our progr

In [14]:
from langchain_core.prompts import ChatPromptTemplate

# English-to-Korean translation model
enko_translation = ChatUpstage(model="solar-1-mini-translate-enko")

chat_prompt = ChatPromptTemplate.from_messages(
    [
        ("human", "{text}"),
    ]
)

chain = chat_prompt | enko_translation | StrOutputParser()

In [16]:
chain.invoke(
    {
        "text": """We introduce SOLAR 10.7B, a large language model (LLM) with 10.7 billion parameters, 
demonstrating superior performance in various natural language processing (NLP) tasks. """
    }
)

'우리는 107억 개의 매개변수를 가진 대규모 언어 모델 (LLM)인 SOLAR 10.7B를 소개합니다.'
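
The translation above covers only the first clause of the input; the part about performance on NLP tasks does not appear in the Korean output. For longer inputs such as `solar_text`, one way to make gaps like this easier to spot is to translate sentence by sentence and keep each source sentence next to its translation. The following is a sketch under assumptions: `split_sentences` uses a naive regex, and `translate` stands for any callable that maps a string to a string, such as `lambda s: chain.invoke({"text": s})`. Both helper names are hypothetical.

```python
import re
from typing import Callable


def split_sentences(text: str) -> list[str]:
    """Naively split on whitespace that follows '.', '!' or '?'."""
    normalized = " ".join(text.split())  # collapse newlines and repeated spaces
    return [s for s in re.split(r"(?<=[.!?])\s+", normalized) if s]


def translate_pairs(
    text: str, translate: Callable[[str], str]
) -> list[tuple[str, str]]:
    """Translate each sentence separately and keep it next to its source."""
    return [(sentence, translate(sentence)) for sentence in split_sentences(text)]


def fake_translate(sentence: str) -> str:
    """Offline stand-in so the sketch runs without an API key."""
    return f"<ko>{sentence}</ko>"


pairs = translate_pairs(
    "We introduce SOLAR 10.7B. DUS is simple\nyet effective.", fake_translate
)
for source, target in pairs:
    print(source, "->", target)
```

Because the split only happens where whitespace follows the punctuation, the period inside `10.7B` does not break the sentence.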

In [28]:
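# Few-shot examples: the "ai" turns are written in informal Korean (banmal),
# which nudges the translation model toward the same tone.
style_prompt = ChatPromptTemplate.from_messages(
    [
        ("human", "we present a method for scaling LLMs called depth up-scaling (DUS)"),
        ("ai", "DUS라는 방법에 대해 알려줄게. DUS는 LLMs를 확장하는 방법이야."),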
        (
            "human",
            "In contrast to other LLM up-scaling methods that use mixture-of-experts, DUS does not require complex changes to train and inference efficiently.",
        ),
        (
            "ai",
            "다른 LLM 확장 방법과는 달리, DUS는 복잡한 변경 사항 없이 효율적으로 훈련하고 추론할 수 있어.",
        ),
        ("human", "{text}"),
    ]
)

chain = style_prompt | enko_translation | StrOutputParser()
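
The few-shot pattern above can also be generated from a list of (English, Korean) pairs, which is convenient when trying out different styles. A minimal sketch; `build_style_messages` and `escape_braces` are hypothetical helpers. Literal braces in the examples are doubled on the assumption that the result is passed to `ChatPromptTemplate.from_messages`, which reads `{...}` in message strings as template variables.

```python
def escape_braces(text: str) -> str:
    """Double literal braces so they are not read as template variables."""
    return text.replace("{", "{{").replace("}", "}}")


def build_style_messages(examples: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Turn (source, styled translation) pairs into alternating human/ai turns."""
    messages = []
    for source, styled in examples:
        messages.append(("human", escape_braces(source)))
        messages.append(("ai", escape_braces(styled)))
    # The final human turn carries the text to translate.
    messages.append(("human", "{text}"))
    return messages


examples = [
    (
        "we present a method for scaling LLMs called depth up-scaling (DUS)",
        "DUS라는 방법에 대해 알려줄게. DUS는 LLMs를 확장하는 방법이야.",
    ),
]
messages = build_style_messages(examples)
print(messages)
```

The returned list can be used as `ChatPromptTemplate.from_messages(messages)` in place of the hand-written one.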

In [29]:
chain.invoke(
    {
        "text": """We introduce SOLAR 10.7B, a large language model (LLM) with 10.7 billion parameters, demonstrating superior performance in various natural language processing (NLP) tasks. """
    }
)

'우리는 107억 개의 매개 변수를 가진 대규모 언어 모델(LLM)인 SOLAR 10.7B를 소개하며, 다양한 자연어 처리(NLP) 작업에서 우수한 성능을 보여줘.'
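
Note that the name `chain` is reassigned in each cell above, so the summarization chain is no longer reachable once the translation chain is defined. To combine the two steps, keep them under separate names and feed the summary into the translator. The sketch below uses plain callables so it runs offline; `summarize_then_translate`, `summary_chain`, `translation_chain`, and both stand-in functions are hypothetical names.

```python
from typing import Callable


def summarize_then_translate(
    text: str,
    summarize: Callable[[str], str],
    translate: Callable[[str], str],
) -> dict[str, str]:
    """Run the two steps in order and return both the summary and its translation."""
    summary = summarize(text)
    return {"summary": summary, "translation": translate(summary)}


# Offline stand-ins. With real chains these could look like:
#   lambda t: summary_chain.invoke({"action": "three line summarize", "text": t})
#   lambda t: translation_chain.invoke({"text": t})
def fake_summarize(text: str) -> str:
    """Keep only the first sentence."""
    return text.split(".")[0] + "."


def fake_translate(text: str) -> str:
    """Tag the text instead of translating it."""
    return f"[ko] {text}"


result = summarize_then_translate(
    "DUS is simple. It is also effective.", fake_summarize, fake_translate
)
print(result)
```

Returning the intermediate summary alongside the translation makes it easier to tell whether a problem comes from the summarization step or the translation step.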