In [2]:
from langchain_ollama import OllamaLLM
from langchain_core.prompts import PromptTemplate

# Assumes a local Ollama server is running and the "llama3" model has been pulled.
llm = OllamaLLM(model="llama3")

summary_prompt = PromptTemplate(
    input_variables=["content", "max_words"],
    template="""
    Please summarize the following content in {max_words} words or fewer.
    Focus on the main points and key takeaways.

    Content: {content}

    Summary:
    """
)

content = """
The ability to learn effectively from raw text is crucial to alleviating the dependence on supervised
learning in natural language processing (NLP). Most deep learning methods require substantial
amounts of manually labeled data, which restricts their applicability in many domains that suffer
from a dearth of annotated resources. In these situations, models that can leverage linguistic
information from unlabeled data provide a valuable alternative to gathering more annotation, which
can be time-consuming and expensive. Further, even in cases where considerable supervision
is available, learning good representations in an unsupervised fashion can provide a significant
performance boost. The most compelling evidence for this so far has been the extensive use of
pre-trained word embeddings to improve performance on a range of NLP tasks.

Leveraging more than word-level information from unlabeled text, however, is challenging for two
main reasons. First, it is unclear what type of optimization objectives are most effective at learning
text representations that are useful for transfer. Recent research has looked at various objectives
such as language modeling, machine translation, and discourse coherence, with each
method outperforming the others on different tasks. Second, there is no consensus on the most
effective way to transfer these learned representations to the target task. Existing techniques involve
a combination of making task-specific changes to the model architecture, using intricate
learning schemes and adding auxiliary learning objectives. These uncertainties have made
it difficult to develop effective semi-supervised learning approaches for language processing.

In this paper, we explore a semi-supervised approach for language understanding tasks using a
combination of unsupervised pre-training and supervised fine-tuning. Our goal is to learn a universal
representation that transfers with little adaptation to a wide range of tasks. We assume access to
a large corpus of unlabeled text and several datasets with manually annotated training examples
(target tasks). Our setup does not require these target tasks to be in the same domain as the unlabeled
corpus. We employ a two-stage training procedure. First, we use a language modeling objective on
the unlabeled data to learn the initial parameters of a neural network model. Subsequently, we adapt
these parameters to a target task using the corresponding supervised objective.
"""

# Fill in the template placeholders to build the final prompt string.
formatted_prompt = summary_prompt.format(
    content=content,
    max_words=50
)

response = llm.invoke(formatted_prompt)
print(response)


Here is a summary of the content in 50 words or fewer:

The ability to learn from raw text without manual labeling is crucial for natural language processing (NLP). This paper explores a semi-supervised approach combining unsupervised pre-training and supervised fine-tuning to develop a universal representation that transfers well across various NLP tasks.
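
The prompt asks for a word limit, but the model is not guaranteed to respect it, so it can be worth checking the response length after the call. The cell below is a minimal sketch of such a check. The helper names `count_words` and `within_limit` are hypothetical (they are not part of LangChain), and splitting on whitespace is only an approximation of how a reader would count words.

```python
def count_words(text: str) -> int:
    """Count whitespace-separated tokens as a rough word count."""
    return len(text.split())


def within_limit(summary: str, max_words: int) -> bool:
    """Return True if the summary has at most max_words words."""
    return count_words(summary) <= max_words


# Hypothetical stand-in for a model response; in the notebook this would be `response`.
sample = (
    "This paper explores a semi-supervised approach combining "
    "unsupervised pre-training and supervised fine-tuning."
)

print(count_words(sample), within_limit(sample, 50))
```

In the notebook itself, `within_limit(response, 50)` could be called right after `llm.invoke(...)`. Note that a preamble such as "Here is a summary of the content in 50 words or fewer:" counts toward the total unless it is stripped first.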
