# **Scholar me up**

This notebook shows how you can use large language models to learn about virtually any topic, including large language models themselves. It uses Llama 2 7B-chat (accessed from Hugging Face) and Google Colab's free GPU access to list relevant academic papers on a given topic and then summarize them. You can choose the language style, paper type, number of papers, period and word count. You can then get a more detailed summary of specific papers.

## **Define parameters**

Here we define the parameters we want our model to use.

In [1]:
topic = 'transformers in LLMs for NLP'  

language_style = 'easy to understand'  

paper_type = 'most influential'  

numer_of_papers = '12'  

period = '2010-2024'  

summary_words = '200-300'  
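
The parameters above are stored as strings so they can be concatenated straight into the prompts. Where a number is needed later, they can be converted explicitly rather than evaluated. The following is a minimal sketch of that idea; the helper names `parse_count` and `parse_range` are hypothetical and are not part of the notebook.

```python
def parse_count(value):
    """Convert a string such as '12' into a positive integer."""
    count = int(value)  # raises ValueError for non-numeric input
    if count < 1:
        raise ValueError("expected a positive number, got " + value)
    return count


def parse_range(value):
    """Convert a string such as '2010-2024' into a (start, end) tuple of integers."""
    start, end = (int(part) for part in value.split("-"))
    if start > end:
        raise ValueError("range start must not exceed range end: " + value)
    return start, end


print(parse_count("12"))
print(parse_range("2010-2024"))
```

Converting with `int` instead of `eval` means a mistyped parameter fails with a clear error instead of being executed as code.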

## **Load model**

### **Import libraries**

First we import the libraries we need.

In [2]:
import locale
def getpreferredencoding(do_setlocale = True):
    return "UTF-8"
locale.getpreferredencoding = getpreferredencoding

!pip install -q transformers einops accelerate langchain bitsandbytes


### **Login to Hugging Face**

Then we log in to Hugging Face. You first need permission from Meta to use Llama 2 7B-chat, and then from Hugging Face to use the version converted for Hugging Face Transformers. When running this cell, paste your Hugging Face token to log in, then enter 'n' when asked whether to add the token as a git credential.

In [3]:
!huggingface-cli login



    To login, `huggingface_hub` requires a token generated from https://huggingface.co/settings/tokens .
Token: 
Add token as git credential? (Y/n) n
Token is valid (permission: read).
Your token has been saved to /root/.cache/huggingface/token
Login successful


### **Download Llama 2 7B-chat**

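Next we download the model and instantiate it as 'llm', setting the temperature parameter to 0 so that answers are as deterministic as possible. A low temperature reduces randomness in the output; it does not by itself make answers more truthful.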

In [4]:
from langchain import HuggingFacePipeline
from transformers import AutoTokenizer
import transformers
import torch
import warnings
warnings.filterwarnings("ignore")

model = "meta-llama/Llama-2-7b-chat-hf"

tokenizer = AutoTokenizer.from_pretrained(model)

pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
    max_length=1000,
    eos_token_id=tokenizer.eos_token_id
)

llm = HuggingFacePipeline(pipeline = pipeline, model_kwargs = {'temperature':0})
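
To see what the temperature parameter does, here is a minimal, library-free sketch of temperature-scaled softmax. It is a generic illustration with made-up scores, not the code Hugging Face runs internally, and the function names are hypothetical. Whether a temperature passed through `model_kwargs` actually reaches generation can depend on the LangChain version, so treat this only as an explanation of the concept.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive; use greedy_choice for the limit at 0")
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


def greedy_choice(logits):
    """Index of the highest score, which is what sampling approaches as temperature goes to 0."""
    return max(range(len(logits)), key=lambda i: logits[i])


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for temperature in (1.0, 0.5, 0.1):
    probabilities = softmax_with_temperature(logits, temperature)
    print(temperature, [round(p, 3) for p in probabilities])
print("greedy pick:", greedy_choice(logits))
```

As the temperature shrinks, almost all probability mass moves to the highest-scoring token, so repeated runs give the same text.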


## **Get papers**

### **Define a prompt to get the papers**

Here we define our prompt format. The system section tells the model how to respond, while the text consists of a question about our chosen topic.

In [5]:
from langchain import PromptTemplate
import numpy as np

# works out how many of the papers should be from 2010 onwards and from 2023 onwards respectively
# this is because the model sometimes tends to focus more on older papers
half_papers = int(numer_of_papers) // 2
quarter_papers = int(numer_of_papers) // 4

# strings to be combined to make our prompt template
start = """<s>[INST] <<SYS>>\n\n"""
line_1 = "You return " + numer_of_papers + " academic papers on a topic between " + period + " in ascending order of publication date.\n"
line_2 = str(half_papers) + " of the papers must be from 2010 onwards.\n"
line_3 = str(quarter_papers) + " of the papers must be from 2023 onwards.\n"
line_4 = "You return academic papers in 'name / author / date' format, nothing else.\n"
line_5 = "You do not describe the papers in any way, you only return the author, date and paper name.\n\n"
end = """<</SYS>>\n\n {text} [/INST]"""

template = start + line_1 + line_2 + line_3 + line_4 + line_5 + end

prompt = PromptTemplate(
    input_variables=["text"],
    template=template,
)

text = 'What are the ' + paper_type + ' academic papers on ' + topic + '?'
print(prompt.format(text=text))

<s>[INST] <<SYS>>

You return 12 academic papers on a topic between 2010-2024 in ascending order of publication date.
6 of the papers must be from 2010 onwards.
3 of the papers must be from 2023 onwards.
You return academic papers in 'name / author / date' format, nothing else.
You do not describe the papers in any way, you only return the author, date and paper name.

<</SYS>>

 What are the most influential academic papers on transformers in LLMs for NLP? [/INST]
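
The prompt layout printed above can also be assembled without LangChain. The sketch below mirrors the notebook's spacing between the `<<SYS>>` block and the user message; `build_llama2_prompt` and `paper_quotas` are hypothetical helper names introduced only for illustration.

```python
def paper_quotas(number_of_papers):
    """Return how many papers to request for the 'half' and 'quarter' quotas."""
    return number_of_papers // 2, number_of_papers // 4


def build_llama2_prompt(system_lines, user_message):
    """Wrap system instructions and a user message in the chat layout used in this notebook."""
    system_block = "\n".join(system_lines)
    return (
        "<s>[INST] <<SYS>>\n\n"
        + system_block
        + "\n\n<</SYS>>\n\n "
        + user_message
        + " [/INST]"
    )


half, quarter = paper_quotas(12)
system_lines = [
    "You return 12 academic papers on a topic between 2010-2024 in ascending order of publication date.",
    str(half) + " of the papers must be from 2010 onwards.",
    str(quarter) + " of the papers must be from 2023 onwards.",
]
print(build_llama2_prompt(system_lines, "What are the most influential academic papers on transformers in LLMs for NLP?"))
```

Keeping the system lines in a list makes it easy to add or remove an instruction without worrying about a missing newline between two of them.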


### **Run a query to get the papers**

Now we run our query to get a list of papers on our topic.

In [6]:
from langchain.chains import LLMChain

chain = LLMChain(llm=llm, prompt=prompt)
result = chain.invoke(text)

print()
print(result['text'])


  Sure, here are 12 academic papers on transformers in LLMs for NLP, sorted in ascending order of publication date:

1. Vaswani et al. (2017) - "Attention is All You Need"
2. Parikh et al. (2016) - "A Survey on Attention Mechanisms for NLP"
3. Wu et al. (2016) - "Sequence to Sequence Learning with Neural Networks"
4. Liu et al. (2019) - "A Survey of Transformer-Based Language Models for Natural Language Processing"
5. Auli et al. (2018) - "On the Importance of Initialization for Transformer-Based Language Models"
6. Chen et al. (2020) - "Reinforcement Learning for Language Modeling: A Survey"
7. Zhang et al. (2020) - "Understanding the Transformer"
8. Li et al. (2020) - "A Survey of Transformer-Based Models for NLP"
9. Zhao et al. (2020) - "Transformers for Natural Language Processing: A Survey"
10. Wang et al. (2023) - "Transformers for Language Modeling: A Survey"
11. Liu et al. (2023) - "A Survey of Transformer-Based Language Models for Low-Resource Languages"
12. Zhang et al. (202

### **Store the papers in a dataframe**

We then extract and store our papers in a dataframe, ordering again by date in case the model didn't return the papers in the desired date order.

In [7]:
import pandas as pd
import re

# extracts the date with regex by returning the first standalone 4 digit number in a list of tokens
def extract_date(tokens):

    pattern = r'\b\d{4}\b'

    for token in tokens:
        match = re.search(pattern, token)
        if match:
            return match.group()

    return None

# creates a pandas dataframe where the papers are sorted by date
separator = ' '
lines = result['text'].split('\n')
papers = []
dates = []
for paper in range(int(numer_of_papers)):
  index = 2 + paper  # the numbered list starts on the third line of the response
  tokens = lines[index].split(' ')[1:]  # drops the leading list number, e.g. '1.'
  papers.append( separator.join(tokens) )
  dates.append( extract_date(tokens) )


papers_df = pd.DataFrame({'paper': papers,
                          'date': dates}).sort_values('date').reset_index(drop=True)
papers_df

| | paper | date |
|---|---|---|
| 0 | Parikh et al. (2016) - "A Survey on Attention ... | 2016 |
| 1 | Wu et al. (2016) - "Sequence to Sequence Learn... | 2016 |
| 2 | Vaswani et al. (2017) - "Attention is All You ... | 2017 |
| 3 | Auli et al. (2018) - "On the Importance of Ini... | 2018 |
| 4 | Liu et al. (2019) - "A Survey of Transformer-B... | 2019 |
| 5 | Chen et al. (2020) - "Reinforcement Learning f... | 2020 |
| 6 | Zhang et al. (2020) - "Understanding the Trans... | 2020 |
| 7 | Li et al. (2020) - "A Survey of Transformer-Ba... | 2020 |
| 8 | Zhao et al. (2020) - "Transformers for Natural... | 2020 |
| 9 | Wang et al. (2023) - "Transformers for Languag... | 2023 |
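
The extraction step relies on each paper sitting at a fixed line position in the response. A possible alternative, sketched below under the assumption that the model keeps the `Author (year) - "Title"` layout seen in the output above, is to match each line against a regular expression and skip anything that does not fit, such as the greeting or a truncated last entry. The names `LINE_PATTERN` and `parse_papers` are hypothetical.

```python
import re

LINE_PATTERN = re.compile(
    r'^\s*\d+\.\s*'          # list number, e.g. "1."
    r'(?P<authors>.+?)\s*'   # authors, e.g. "Vaswani et al."
    r'\((?P<year>\d{4})\)'   # four-digit year in parentheses
    r'\s*-\s*'               # dash between the year and the title
    r'"(?P<title>.+)"\s*$'   # quoted title
)


def parse_papers(response_text):
    """Return one record per well-formed list line, sorted by year (ties keep their order)."""
    records = []
    for line in response_text.splitlines():
        match = LINE_PATTERN.match(line)
        if match:
            records.append({
                "authors": match["authors"],
                "year": int(match["year"]),
                "title": match["title"],
            })
    return sorted(records, key=lambda record: record["year"])


sample = '''  Sure, here are some papers:

1. Vaswani et al. (2017) - "Attention is All You Need"
2. Parikh et al. (2016) - "A Survey on Attention Mechanisms for NLP"
12. Zhang et al. (202'''

for record in parse_papers(sample):
    print(record["year"], record["authors"], record["title"])
```

The resulting list of dictionaries can be passed directly to `pd.DataFrame` if a dataframe is preferred.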


## **Summarize papers**

### **Define a prompt to summarize the papers**

Now we create another prompt template, this time asking the model to summarize a given paper.

In [10]:
# strings to be combined to make our prompt template
start = """<s>[INST] <<SYS>>\n\n"""
line_1 = "You summarize an academic paper in " + summary_words + " words.\n"
line_2 = "Your response only includes a summary and a detailed, bullet pointed list of what the article says which is different from other papers, nothing else.\n"
line_3 = "You use the titles '📝 Summary:', '💡 What's new:' for the headings with a newline between each.\n"
line_4 = "Do not use any other headings, only '📝 Summary:' and '💡 What's new:'.\n"
line_5 = "Use '*' for bullet points in the '💡 What's new:' section, do not write the '💡 What's new:' section as a paragraph.\n"
line_6 = "Write the summary on the same line as the '📝 Summary:' section, not on a newline.\n"
line_7 = "Do not repeat information from the '📝 Summary:' in the '💡 What's new:' section, instead go into more detail about what the paper says.\n"
line_8 = "You do not use emojis for anything else.\n"
line_9 = "You write in " + language_style + " language.\n"
line_10 = "You begin your response with 'Paper:' followed by the name of the paper.\n\n"

end = """<</SYS>>\n\n {text} [/INST]"""

template = start + line_1 + line_2 + line_3 + line_4 + line_5 + line_6 + line_7 + line_8 + line_9 + line_10 + end
prompt = PromptTemplate(
    input_variables=["text"],
    template=template,
)

paper = papers_df.loc[0, 'paper']
text = 'Give me a summary of the paper: "' + paper + '"'
print(prompt.format(text=text))

<s>[INST] <<SYS>>

You summarize an academic paper in 200-300 words.
Your response only includes a summary and a detailed, bullet pointed list of what the article says which is different from other papers, nothing else.
You use the titles '📝 Summary:', '💡 What's new:' for the headings with a newline between each.
Do not use any other headings, only '📝 Summary:' and '💡 What's new:'.
Use '*' for bullet points in the '💡 What's new:' section, do not write the '💡 What's new:' section as a paragraph.
Do not repeat information from the '📝 Summary:' in the '💡 What's new:' section, instead go into more detail about what the paper says.
You do not use emojis for anything else.
You write in easy to understand language.
You begin your response with 'Paper:' followed by the name of the paper.

<</SYS>>

 Give me a summary of the paper: "Parikh et al. (2016) - "A Survey on Attention Mechanisms for NLP"" [/INST]


### **Run a query to summarize the papers**

We again use our prompt to run a query, this time returning broad summaries of our papers.

In [11]:
results = []
chain = LLMChain(llm=llm, prompt=prompt)  # the chain fills in the prompt template itself, so it is built once
for paper in papers_df['paper']:
  text = 'Give me a summary of the paper: "' + paper + '"'
  result = chain.invoke(text)

  print()
  print()
  print()
  print(result['text'])
  results.append(result)




  Paper: Parikh et al. (2016) - "A Survey on Attention Mechanisms for NLP"

📝 Summary: This paper provides a comprehensive survey of attention mechanisms in natural language processing (NLP). The authors discuss the different types of attention mechanisms, including hard attention, soft attention, and multi-head attention, and their applications in various NLP tasks such as language translation, question answering, and text summarization. They also highlight the benefits of attention mechanisms, including improved performance and interpretability, and discuss some of the challenges and limitations of using attention in NLP.

💡 What's new:

* The authors provide a detailed overview of the different attention mechanisms, including their mathematical formulations and the techniques used to implement them.
* They discuss the recent advances in attention mechanisms, such as the use of multi-head attention and the development of attention-based neural machine translation.
* The authors hi
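
Because the prompt asks for fixed headings, each response can be split back into its parts for later use. The sketch below assumes the model follows the requested 'Paper:', '📝 Summary:' and '💡 What's new:' layout; `split_summary` is a hypothetical helper, and the sample text is made up for illustration.

```python
def split_summary(response_text):
    """Split a response into title, summary and bullet points using the requested headings."""
    title, summary, bullets = None, "", []
    section = None
    for raw_line in response_text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith("Paper:"):
            title = line[len("Paper:"):].strip()
        elif line.startswith("📝 Summary:"):
            section = "summary"
            summary = line[len("📝 Summary:"):].strip()
        elif line.startswith("💡 What's new:"):
            section = "new"
        elif section == "new" and line.startswith("*"):
            bullets.append(line.lstrip("* ").strip())
        elif section == "summary":
            # the summary may continue over several lines
            summary = (summary + " " + line).strip()
    return {"title": title, "summary": summary, "whats_new": bullets}


sample = """Paper: Example et al. (2020) - "A Made-up Paper"

📝 Summary: First sentence.
Second sentence.

💡 What's new:

* Point one.
* Point two."""

print(split_summary(sample))
```

A response that is cut off mid-sentence still parses; the last bullet is simply incomplete.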

## **Detailed summary**

### **Define a prompt to get a detailed summary**

We now turn our attention to a specific paper of interest by defining a prompt asking for a more detailed summary.

In [14]:
# strings to be combined to make our prompt template
start = """<s>[INST] <<SYS>>\n\n"""
line_1 = "You summarize an academic paper in 800-1000 words and go into significant technical detail in your response.\n"
line_2 = "You begin your response with 'Paper:' followed by the name of the paper and then, on a newline, '📝 Summary:' followed by a summary, nothing else.\n\n"

end = """<</SYS>>\n\n {text} [/INST]"""

template = start + line_1 + line_2 + end
prompt = PromptTemplate(
    input_variables=["text"],
    template=template,
)

paper = papers_df.loc[2, 'paper']
text = 'Give me a summary of the paper: "' + paper + '"'
print(prompt.format(text=text))

<s>[INST] <<SYS>>

You summarize an academic paper in 800-1000 words and go into significant technical detail in your response.
You begin your response with 'Paper:' followed by the name of the paper and then, on a newline, '📝 Summary:' followed by a summary, nothing else.

<</SYS>>

 Give me a summary of the paper: "Vaswani et al. (2017) - "Attention is All You Need"" [/INST]


### **Run a query to get a detailed summary**

Finally, we run a query to get our more detailed summary on a specific paper.

In [15]:
chain = LLMChain(llm=llm, prompt=prompt)
result = chain.invoke(text)

print()
print(result['text'])


  Paper: Vaswani et al. (2017) - "Attention is All You Need"

📝 Summary:

In this seminal paper, Vaswani et al. propose a new neural network architecture for sequence-to-sequence tasks, called the Transformer model. Unlike traditional recurrent neural networks (RNNs) and convolutional neural networks (CNNs), which rely on recurrence and convolution respectively, the Transformer model relies entirely on self-attention mechanisms to process input sequences. This allows the model to parallelize the computation of self-attention across all positions in a sequence, making it much faster and more scalable than previous models.

The Transformer model is composed of an encoder and a decoder, each comprising multiple identical layers. Each layer in the encoder and decoder consists of a self-attention mechanism, followed by a feed-forward neural network (FFNN). The self-attention mechanism computes the weighted sum of the input tokens, where the weights are learned during training. The FFNN pro