# Temperature Testing with Google AI Studio

This notebook is set up to test two different PDFs with `gemini-1.5-flash-002`, using a fixed JSON schema for knowledge extraction.

In [3]:
pip install --upgrade --quiet google-generativeai

[notice] A new release of pip is available: 24.2 -> 24.3.1
[notice] To update, run: pip install --upgrade pip
Note: you may need to restart the kernel to use updated packages.


In [4]:
pip install --upgrade --quiet google-ai-generativelanguage

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-generativeai 0.8.3 requires google-ai-generativelanguage==0.6.10, but you have google-ai-generativelanguage 0.6.11 which is incompatible.


In [5]:
pip install --upgrade --quiet pypdf


In [6]:
import google.generativeai as genai
import time
import json
import re
from google.generativeai.types import HarmCategory, HarmBlockThreshold
from google.ai.generativelanguage_v1beta.types import content
from pypdf import PdfReader

In [7]:
# Add paths to PDFs here
path1 = "Challenger-Report-Vol1.pdf"
path2 = "Columbia-Options-Assessment.pdf"

In [8]:
reader1 = PdfReader(path1)
reader2 = PdfReader(path2)

In [9]:
pages1 = reader1.pages
pages2 = reader2.pages

In [10]:
# Concatenate the extracted text of every page into one string per document
text1 = "".join(page.extract_text() for page in pages1)
text2 = "".join(page.extract_text() for page in pages2)

In [11]:
# Check to make sure the PDF extraction worked
print(len(text1), len(text2))

668029 85270
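
Total character counts can hide individual pages where extraction returned nothing, for example scanned pages without a text layer. A minimal sketch of a per-page check follows. The helper name `empty_page_indices` and the sample data are hypothetical and not part of pypdf; in this notebook the input would be something like `[p.extract_text() for p in pages1]`.

```python
def empty_page_indices(page_texts, min_chars=1):
    """Return the indices of pages whose extracted text is shorter than min_chars.

    `page_texts` is a list of strings (one per page). This helper is a
    hypothetical illustration and is not part of pypdf.
    """
    return [
        i for i, text in enumerate(page_texts)
        if len((text or "").strip()) < min_chars
    ]


# Stand-in data; in the notebook this would be the per-page extraction results.
sample_pages = ["Report of the Commission", "", "   \n", "Chapter 1"]
print(empty_page_indices(sample_pages))
```
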


In [12]:
# Temperature can be set between 0.0 and 2.0
temp = 1.5
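
To give a feel for what the temperature value changes, here is a minimal sketch of the textbook formulation, in which raw token scores are divided by the temperature before a softmax. This is an illustration under that assumption, not a description of how Gemini samples internally; the function name and the toy scores are made up, and the sketch does not handle a temperature of exactly 0.0.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive for this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


toy_logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate tokens
for t in (0.5, 1.0, 1.5, 2.0):
    probs = softmax_with_temperature(toy_logits, t)
    print(t, [round(p, 3) for p in probs])
```

Under this formulation, higher temperatures flatten the distribution, so lower-scoring tokens are picked more often.
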

In [None]:
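# Don't forget to set your Google AI Studio API key here.
# Assumption: the key is stored in the GOOGLE_API_KEY environment variable.
import os
api_key = os.environ["GOOGLE_API_KEY"]
genai.configure(api_key=api_key)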

In [13]:

generation_config = {
  "temperature": temp,
  "top_p": 1.0,
  "top_k": 40,
  "max_output_tokens": 8192,
  "response_schema": content.Schema(
    type = content.Type.OBJECT,
    properties = {
      "documentType": content.Schema(
        type = content.Type.STRING,
      ),
      "source": content.Schema(
        type = content.Type.OBJECT,
        properties = {
          "name": content.Schema(
            type = content.Type.STRING,
          ),
          "authors": content.Schema(
            type = content.Type.ARRAY,
            items = content.Schema(
              type = content.Type.STRING,
            ),
          ),
          "date": content.Schema(
            type = content.Type.STRING,
          ),
        },
      ),
      "description": content.Schema(
        type = content.Type.STRING,
      ),
      "keywords": content.Schema(
        type = content.Type.ARRAY,
        items = content.Schema(
          type = content.Type.STRING,
        ),
      ),
      "entities": content.Schema(
        type = content.Type.ARRAY,
        items = content.Schema(
          type = content.Type.OBJECT,
          properties = {
            "name": content.Schema(
              type = content.Type.STRING,
            ),
            "type": content.Schema(
              type = content.Type.STRING,
            ),
            "context": content.Schema(
              type = content.Type.STRING,
            ),
          },
        ),
      ),
      "topics": content.Schema(
        type = content.Type.ARRAY,
        items = content.Schema(
          type = content.Type.OBJECT,
          properties = {
            "topic": content.Schema(
              type = content.Type.STRING,
            ),
            "importance": content.Schema(
              type = content.Type.STRING,
            ),
          },
        ),
      ),
    },
  ),
  "response_mime_type": "application/json",
}

block_level = HarmBlockThreshold.BLOCK_ONLY_HIGH

safety_settings={
    HarmCategory.HARM_CATEGORY_HATE_SPEECH: block_level,
    HarmCategory.HARM_CATEGORY_HARASSMENT: block_level,
    HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT: block_level,
    HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: block_level,
    # HarmCategory.HARM_CATEGORY_CIVIC_INTEGRITY: block_level,
}

In [14]:
model = genai.GenerativeModel(
    model_name="gemini-1.5-flash-002",
    generation_config=generation_config,
    safety_settings=safety_settings,
    system_instruction=(
        "You are a helpful AI assistant that performs information retrieval. "
        "You extract knowledge from a provided text into the defined JSON schema. "
        "Entities could be people, places, organizations, groups, events, locations, and things. "
        "The context for entities is a brief description of the entity as it is described in the provided text. "
        "Topics are intangible concepts."
    ),
)

In [228]:
# The full extracted text of the first PDF is sent as the prompt
prompt = text1

In [229]:
start_time = time.time()
chat_session = model.start_chat(history=[])
response = chat_session.send_message(prompt)
end_time = time.time()
total_time = end_time - start_time
print(f"Total time taken: {total_time:.2f} seconds")

Total time taken: 144.30 seconds
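
When the same request is repeated at several temperatures, the timing code can be factored into a small wrapper. The sketch below is one way to do that; `timed_call` and `fake_send_message` are hypothetical names, and the fake function stands in for `chat_session.send_message` so the example runs without an API key. It uses `time.perf_counter`, which is intended for measuring elapsed intervals.

```python
import time


def timed_call(fn, *args, **kwargs):
    """Run fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Stand-in for chat_session.send_message so the sketch runs offline.
def fake_send_message(prompt):
    return f"received {len(prompt)} characters"


reply, seconds = timed_call(fake_send_message, "example prompt")
print(reply)
print(f"Total time taken: {seconds:.2f} seconds")
```
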


In [230]:
resp = response.text
inputtokens = int(response.usage_metadata.prompt_token_count)
outputtokens = int(response.usage_metadata.candidates_token_count)

In [231]:
print(f"Schema Response:\n{resp}\n\nInput Tokens: {inputtokens}\nOutput Tokens: {outputtokens}")

Schema Response:
{"description": "Report of the Presidential Commission on the Space Shuttle Challenger Accident.", "documentType": "report", "entities": [{"context": "President of the United States who appointed the commission.", "name": "Ronald Reagan", "type": "person"}, {"context": "Spacecraft that suffered an accident.", "name": "Challenger", "type": "spacecraft"}, {"context": "Organization that managed the Space Shuttle program.", "name": "NASA", "type": "organization"}, {"context": "Company that manufactured the Solid Rocket Boosters.", "name": "Morton Thiokol", "type": "organization"}, {"context": "Location where the Challenger accident occured.", "name": "Cape Canaveral", "type": "location"}, {"context": "Location where the Challenger investigation took place.", "name": "Washington, D.C.", "type": "location"}, {"context": "Person who chaired the Commission.", "name": "William P. Rogers", "type": "person"}, {"context": "Former astronaut and vice chairman of Commission.", "name"

In [232]:
output = json.loads(resp)
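
`json.loads` raises `json.JSONDecodeError` if the text is not valid JSON, which can happen when generation stops early, for example at `max_output_tokens`. The schema above also does not list any required fields, so some keys may be absent. A minimal sketch of a defensive parse, assuming a hypothetical helper `parse_response` and an `EXPECTED_KEYS` tuple copied by hand from the schema:

```python
import json

# Top-level keys defined in the response schema above (copied by hand).
EXPECTED_KEYS = ("documentType", "source", "description", "keywords", "entities", "topics")


def parse_response(raw_text):
    """Return (parsed_dict_or_None, list_of_problems) for a raw JSON string."""
    try:
        parsed = json.loads(raw_text)
    except json.JSONDecodeError as err:
        return None, [f"invalid JSON: {err.msg} at position {err.pos}"]
    if not isinstance(parsed, dict):
        return None, ["top-level value is not a JSON object"]
    missing = [key for key in EXPECTED_KEYS if key not in parsed]
    return parsed, [f"missing key: {key}" for key in missing]


# Made-up inputs: one cut off mid-object, one valid but incomplete.
truncated = '{"description": "Report", "entities": [{"name"'
partial = '{"documentType": "report", "entities": []}'
print(parse_response(truncated))
print(parse_response(partial))
```

In the notebook, `parse_response(resp)` could replace the bare `json.loads(resp)` call when comparing many runs, so that one malformed response does not stop the loop.
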

In [233]:
print(re.search('Reagan', resp))

<re.Match object; span=(230, 236), match='Reagan'>


In [234]:
person_entities = []
for entity in output['entities']:
    if entity.get('type', '').lower() == 'person':
        person_entities.append({
            'name': entity.get('name', ''),
            'context': entity.get('context', '')
        })

print(f"Number of Persons Found: {len(person_entities)}")
print("---")
for person in person_entities:
    print(f"Name: {person['name']}")
    print(f"Context: {person['context']}")
    print("---")

Number of Persons Found: 7
---
Name: Ronald Reagan
Context: President of the United States who appointed the commission.
---
Name: William P. Rogers
Context: Person who chaired the Commission.
---
Name: Neil A. Armstrong
Context: Former astronaut and vice chairman of Commission.
---
Name: Sally K. Ride
Context: Mission specialist on STS-7  American woman in space. Also flew on mission 41-G.
---
Name: Richard P. Feynman
Context: Physicist and Nobel Laureate on the Commission.
---
Name: Lawrence B. Mulloy
Context: Person responsible for the Solid Rocket Booster launch decision.
---
Name: Roger Boisjoly
Context: Thiokol engineer deeply concerned about o-king performance before the Challenger launch.
---
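
Since the goal is to compare runs at different temperatures, one simple measure is how much the sets of extracted person names overlap between two runs. The sketch below uses Jaccard similarity for this. The helper names and the two sample outputs are made up for illustration; in the notebook, the inputs would be the parsed `output` dictionaries from two separate runs.

```python
def person_names(output):
    """Collect the names of entities typed as 'person' from a parsed response."""
    return {
        entity.get("name", "")
        for entity in output.get("entities", [])
        if entity.get("type", "").lower() == "person"
    }


def jaccard(a, b):
    """Jaccard similarity of two sets: shared items divided by all items."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Made-up outputs standing in for two runs at different temperatures.
run_low = {"entities": [
    {"name": "Ronald Reagan", "type": "person"},
    {"name": "William P. Rogers", "type": "person"},
    {"name": "NASA", "type": "organization"},
]}
run_high = {"entities": [
    {"name": "Ronald Reagan", "type": "Person"},
    {"name": "Sally K. Ride", "type": "person"},
]}

overlap = jaccard(person_names(run_low), person_names(run_high))
print(f"Person-name overlap: {overlap:.2f}")
```

Exact string matching is strict: "Neil Armstrong" and "Neil A. Armstrong" would count as different people, so name normalization may be needed before drawing conclusions.
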
