## Example chat with PDF text as context


In [2]:
# Set up to use local modules
%load_ext autoreload
%autoreload 2
import os
import sys
module_path = os.path.abspath(os.path.join('..')) # Add parent directory to path
sys.path.insert(0, module_path)

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


In [22]:
from pypdf import PdfReader
from src import utils

import pyprojroot

pdf_path = pyprojroot.here("data/vaswani_et_al_2017.pdf")

reader = PdfReader(pdf_path)  # creating a pdf reader object
print(len(reader.pages))  # number of pages in pdf file
page = reader.pages[0]  # getting a specific page from the pdf file
first_page_text = page.extract_text()  # extracting text from page
print(first_page_text)

15
Provided proper attribution is provided, Google hereby grants permission to
reproduce the tables and figures in this paper solely for use in journalistic or
scholarly works.
Attention Is All You Need
Ashish Vaswani∗
Google Brain
avaswani@google.comNoam Shazeer∗
Google Brain
noam@google.comNiki Parmar∗
Google Research
nikip@google.comJakob Uszkoreit∗
Google Research
usz@google.com
Llion Jones∗
Google Research
llion@google.comAidan N. Gomez∗ †
University of Toronto
aidan@cs.toronto.eduŁukasz Kaiser∗
Google Brain
lukaszkaiser@google.com
Illia Polosukhin∗ ‡
illia.polosukhin@gmail.com
Abstract
The dominant sequence transduction models are based on complex recurrent or
convolutional neural networks that include an encoder and a decoder. The best
performing models also connect the encoder and decoder through an attention
mechanism. We propose a new simple network architecture, the Transformer,
based solely on attention mechanisms, dispensing with recurrence and convolutions
entirely. Exper

In [66]:
# extracting text from all pages

entire_text = utils.extract_text_from_pdf(pdf_path, max_n_pages=None)
print(entire_text)

Provided proper attribution is provided, Google hereby grants permission to
reproduce the tables and figures in this paper solely for use in journalistic or
scholarly works.
Attention Is All You Need
Ashish Vaswani∗
Google Brain
avaswani@google.comNoam Shazeer∗
Google Brain
noam@google.comNiki Parmar∗
Google Research
nikip@google.comJakob Uszkoreit∗
Google Research
usz@google.com
Llion Jones∗
Google Research
llion@google.comAidan N. Gomez∗ †
University of Toronto
aidan@cs.toronto.eduŁukasz Kaiser∗
Google Brain
lukaszkaiser@google.com
Illia Polosukhin∗ ‡
illia.polosukhin@gmail.com
Abstract
The dominant sequence transduction models are based on complex recurrent or
convolutional neural networks that include an encoder and a decoder. The best
performing models also connect the encoder and decoder through an attention
mechanism. We propose a new simple network architecture, the Transformer,
based solely on attention mechanisms, dispensing with recurrence and convolutions
entirely. Experime
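
The helper `utils.extract_text_from_pdf` comes from the local `src` package and its source is not shown in this notebook. As a rough sketch of what such a helper might look like, assuming it wraps `pypdf.PdfReader` and joins the text of the first `max_n_pages` pages: the function `join_page_texts` and the newline separator are hypothetical, and the real implementation may differ.

```python
from itertools import islice
from typing import Iterable, Optional


def join_page_texts(pages: Iterable, max_n_pages: Optional[int] = None) -> str:
    """Concatenate the text of up to max_n_pages page objects.

    Each page only needs an extract_text() method, as pypdf pages have.
    """
    selected = islice(pages, max_n_pages)  # max_n_pages=None keeps every page
    # Treat a missing or empty extraction result as an empty string
    return "\n".join((page.extract_text() or "") for page in selected)


def extract_text_from_pdf(pdf_path, max_n_pages: Optional[int] = None) -> str:
    """Hypothetical stand-in for utils.extract_text_from_pdf (assumed behaviour)."""
    from pypdf import PdfReader  # imported lazily so join_page_texts has no dependencies

    reader = PdfReader(pdf_path)
    return join_page_texts(reader.pages, max_n_pages)
```

Splitting the page-joining logic from the file handling keeps the part that decides how many pages to include easy to check without a PDF on disk.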

In [8]:
import textwrap
from IPython.display import Markdown


def to_markdown(text):
    """Render text as a Markdown blockquote, converting "•" bullets to "*" list items."""
    text = text.replace("•", "  *")
    return Markdown(textwrap.indent(text, "> ", predicate=lambda _: True))

In [43]:
# Use pdf text as context in Gemini chat, with engineered prompt to ask for summary
import google.generativeai as genai

model = genai.GenerativeModel("gemini-pro")

prompt_context_text = utils.extract_text_from_pdf(pdf_path, max_n_pages=2)

prompt = f"""
    Explain and summarize the scientific results in the following text,
    which has been extracted from a publication.
    First, provide a brief summary of the results,
    then provide a bullet point list of concepts and results that a user may
    want to ask about.
    ---
    {prompt_context_text}
    ---
    """
prompt_n_tokens = model.count_tokens(prompt).total_tokens  # count_tokens returns a response object
print(f"Prompt has {prompt_n_tokens} tokens")
response = model.generate_content(prompt)
to_markdown(response.text)

Prompt has 2025 tokens


> **Scientific Results**
> 
> The Transformer model outperforms existing models in machine translation tasks and requires less training time. 
> 
> Specifically, on the WMT 2014 English-to-German translation task, the Transformer achieves a BLEU score of 28.4, which is 2 BLEU points higher than the previous best result. On the WMT 2014 English-to-French translation task, the Transformer establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs. This training time is significantly shorter than the training time required for previous best models.
> 
> **Concepts and Results**
> 
> * **Transformer Model:** A new sequence transduction model that relies solely on attention mechanisms.
> * **Encoder:** A stack of layers that computes a representation of the input sequence.
> * **Decoder:** A stack of layers that generates the output sequence one element at a time.
> * **Multi-Head Attention:** A mechanism that allows the model to attend to different parts of the input sequence simultaneously.
> * **Positional Encoding:** A method for adding positional information to the input and output sequences.
> * **Residual Connections:** A technique for improving the training of deep neural networks.
> * **Layer Normalization:** A technique for stabilizing the training of deep neural networks.
> * **BLEU Score:** A measure of the quality of machine translation output.
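
The cell above limits the context by page count (`max_n_pages=2`). An alternative is to keep adding pages until a token budget is reached. The sketch below illustrates that idea under stated assumptions: `pages_within_budget` is a hypothetical helper, the budget value is up to the caller, and the token counter is passed in as a plain function so the logic does not depend on any particular SDK.

```python
from typing import Callable, List


def pages_within_budget(
    page_texts: List[str],
    count_tokens: Callable[[str], int],
    max_tokens: int,
) -> str:
    """Join leading pages for as long as the running token count stays within max_tokens."""
    kept: List[str] = []
    for text in page_texts:
        candidate = "\n".join(kept + [text])
        if count_tokens(candidate) > max_tokens:
            break  # adding this page would exceed the budget
        kept.append(text)
    return "\n".join(kept)
```

With the Gemini model from this notebook, the counter could be `lambda t: model.count_tokens(t).total_tokens`. This makes one counting request per page, which may be slow for long documents.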

In [27]:
response.candidates

[index: 0
content {
  parts {
    text: "**Summary:**\n\nGoogle grants permission to reproduce tables and figures from their paper titled \"Attention Is All You Need\" for use in journalistic or scholarly works, provided proper attribution is given.\n\n**Explanation:**\n\nThe provided text is an excerpt from a research paper describing the Transformer neural network architecture. The text briefly summarizes the paper\'s findings, highlighting the Transformer\'s superior performance in machine translation tasks and its ability to generalize well to other tasks such as English constituency parsing. The paper emphasizes the Transformer\'s simplicity, parallelizability, and reduced training time compared to existing models.\n\nThe text also acknowledges the significant contributions of the authors in designing, implementing, and evaluating the Transformer model."
  }
  role: "model"
}
finish_reason: STOP
safety_ratings {
  category: HARM_CATEGORY_SEXUALLY_EXPLICIT
  probability: NEGLIGIBLE

## gemini-pro-vision for PDF analysis?


In [63]:
model = genai.GenerativeModel("gemini-pro-vision")

prompt = """
You are a very professional document summarization specialist.
Please summarize the given document.
"""

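from vertexai.generative_models import Part

# NOTE: this cell raises the TypeError shown in the output. Likely causes (not verified):
# - `Part` comes from the Vertex AI SDK, while `model` was created with google.generativeai.
# - `Part.from_uri` expects a URI string (such as a Cloud Storage URI), but `pdf_path` is a local path object.
pdf_file_uri = pdf_path
pdf_file = Part.from_uri(pdf_file_uri, mime_type="application/pdf")
contents = [pdf_file, prompt]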

# prompt_n_tokens = model.count_tokens(prompt)
# print(f"Prompt has {prompt_n_tokens} tokens")
response = model.generate_content(contents)
to_markdown(response.text)

TypeError: bad argument type for built-in operation

## Use preceding messages as context


In [67]:
model = genai.GenerativeModel("gemini-pro")
chat = model.start_chat(history=[])
chat

ChatSession(
    model=genai.GenerativeModel(
        model_name='models/gemini-pro',
        generation_config={},
        safety_settings={},
        tools=None,
    ),
    history=[]
)

In [68]:
response = chat.send_message(
    "In one sentence, explain how a computer works to a young child."
)
to_markdown(response.text)

> A computer is like a magic machine that understands our words, can remember things, and can follow our instructions to do many different things, like playing games, watching movies, and helping us learn.

In [69]:
chat.history

[parts {
   text: "In one sentence, explain how a computer works to a young child."
 }
 role: "user",
 parts {
   text: "A computer is like a magic machine that understands our words, can remember things, and can follow our instructions to do many different things, like playing games, watching movies, and helping us learn."
 }
 role: "model"]

In [70]:
for message in chat.history:
    display(to_markdown(f"**{message.role}**: {message.parts[0].text}"))

> **user**: In one sentence, explain how a computer works to a young child.

> **model**: A computer is like a magic machine that understands our words, can remember things, and can follow our instructions to do many different things, like playing games, watching movies, and helping us learn.

## Context, with generate_content


In [81]:
model = genai.GenerativeModel("gemini-pro")
pre_prompt = """
    Instructions: Always answer like a pirate. Arr matey?
    """
messages = []
messages.append({"role": "user", "parts": [pre_prompt]})
messages.append({"role": "model", "parts": ["Arr matey!"]})
messages.append(
    {
        "role": "user",
        "parts": ["In one sentence, explain how a computer works to a young child."],
    },
)
response = model.generate_content(messages)
messages.append({"role": "model", "parts": [response.text]})
messages

[{'role': 'user',
  'parts': ['\n    Instructions: Always answer like a pirate. Arr matey?\n    ']},
 {'role': 'model', 'parts': ['Arr matey!']},
 {'role': 'user',
  'parts': ['In one sentence, explain how a computer works to a young child.']},
 {'role': 'model',
  'parts': ["Avast there, young bilge rat! A computer be like a magic box that can do what ye tell it to, 'cause it has a wee captain inside that follows yer orders and makes things happen on the screen!"]}]
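
The pattern in the cell above (append a user turn, call `generate_content` with the whole list, append the reply) can be wrapped in a small helper. This is a sketch only: `make_message` and `send_with_history` are hypothetical names, and the code assumes the model object exposes `generate_content(messages)` returning an object with a `.text` attribute, as used earlier in this notebook.

```python
from typing import Dict, List

Message = Dict[str, object]


def make_message(role: str, text: str) -> Message:
    """Build one turn in the role/parts format used in the cell above."""
    return {"role": role, "parts": [text]}


def send_with_history(model, messages: List[Message], user_text: str) -> str:
    """Append a user turn, call the model with the full history, and record the reply."""
    messages.append(make_message("user", user_text))
    response = model.generate_content(messages)
    messages.append(make_message("model", response.text))
    return response.text
```

The list is modified in place, so repeated calls accumulate context in the same way `chat.history` does for a `ChatSession`. If the model call raises, the user turn has already been appended and may need to be removed before retrying.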