<a href="https://www.kaggle.com/code/gpreda/gemini-1-5-q-a-from-a-large-romanian-book?scriptVersionId=209251317" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

# Load a PDF from a dataset on Kaggle

## Step 1: Import Python Packages

In [1]:
import os
from time import time  # the function time.time(), used to time queries

import google.generativeai as genai
from kaggle_secrets import UserSecretsClient

## Step 2: Authenticate with Google Generative AI

In [2]:
user_secrets = UserSecretsClient()
ai_studio_token = user_secrets.get_secret("GEMINI_API_KEY_SECOND")
genai.configure(api_key=ai_studio_token)
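
The cell above only runs inside a Kaggle session, because `kaggle_secrets` is not available elsewhere. A minimal sketch of a fallback, assuming that outside Kaggle the key is exported as an environment variable with the same name as the secret. Both that convention and the helper name `load_api_key` are assumptions, not part of the original notebook.

```python
import os


def load_api_key(secret_name="GEMINI_API_KEY_SECOND"):
    """Return the API key from Kaggle secrets, or from the environment."""
    try:
        # Only importable inside a Kaggle notebook session.
        from kaggle_secrets import UserSecretsClient
    except ImportError:
        # Assumed convention: same name, exported as an environment variable.
        key = os.environ.get(secret_name)
        if not key:
            raise RuntimeError(f"Set the {secret_name} environment variable")
        return key
    return UserSecretsClient().get_secret(secret_name)
```

The returned value could then be passed to `genai.configure(api_key=...)` exactly as in the cell above.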

## Step 3: Define helper functions

In [3]:
from time import sleep  # imported by name: Step 1 binds `time` to the function time.time()

def upload_to_gemini(path, mime_type=None):
  """Uploads the given file to Gemini.

  See https://ai.google.dev/gemini-api/docs/prompting_with_media
  """
  file = genai.upload_file(path, mime_type=mime_type)
  print(f"Uploaded file '{file.display_name}' as: {file.uri}")
  return file

def wait_for_files_active(files):
  """Waits for the given files to be active.

  Some files uploaded to the Gemini API need to be processed before they can be
  used as prompt inputs. The status can be seen by querying the file's "state"
  field.

  This implementation uses a simple blocking polling loop. Production code
  should probably employ a more sophisticated approach.
  """
  print("Waiting for file processing...")
  for name in (file.name for file in files):
    file = genai.get_file(name)
    while file.state.name == "PROCESSING":
      print(".", end="", flush=True)
      sleep(10)  # poll every 10 seconds
      file = genai.get_file(name)
    if file.state.name != "ACTIVE":
      raise RuntimeError(f"File {file.name} failed to process")
  print("...all files ready")
  print()
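
The docstring of `wait_for_files_active` notes that production code should use something more robust than a fixed 10-second polling loop. One possible direction, as a sketch only: poll with a capped exponential backoff and give up after a deadline. The function name `wait_until_active`, its parameters, and the `get_state` callable are hypothetical; the state strings are the ones used in the cell above.

```python
import time


def wait_until_active(get_state, name, timeout_s=600.0, first_delay_s=1.0,
                      max_delay_s=30.0, sleep=time.sleep, clock=time.monotonic):
    """Poll get_state(name) until it stops returning "PROCESSING".

    get_state is a hypothetical callable returning the state name as a string.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    delay = first_delay_s
    while True:
        state = get_state(name)
        if state == "ACTIVE":
            return
        if state != "PROCESSING":
            raise RuntimeError(f"File {name} failed to process (state={state})")
        if clock() >= deadline:
            raise TimeoutError(f"File {name} still processing after {timeout_s} s")
        sleep(delay)
        delay = min(delay * 2, max_delay_s)  # exponential backoff, capped
```

In this notebook it could be called as `wait_until_active(lambda n: genai.get_file(n).state.name, file.name)` for each uploaded file.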

## Step 4: Load the Gemini 1.5 model

In [4]:
# Create the model
generation_config = {
  "temperature": 1,
  "top_p": 0.95,
  "top_k": 64,
  "max_output_tokens": 8192,
  "response_mime_type": "text/plain",
}

model = genai.GenerativeModel(
  model_name="gemini-1.5-flash",
  generation_config=generation_config,
)
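
The `generation_config` above sets `top_k` and `top_p` without saying what they do. As a rough illustration, here is a generic sketch of top-k filtering followed by nucleus (top-p) filtering over a toy distribution. It is not Gemini's actual sampler, whose internals this notebook does not show; `filter_candidates` and the probabilities are made up for the example.

```python
def filter_candidates(probs, top_k, top_p):
    """Keep the top_k most likely tokens, then the shortest prefix whose
    cumulative probability reaches top_p, and renormalize what is left."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


# Toy next-token distribution (illustrative values only).
toy = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
print(filter_candidates(toy, top_k=3, top_p=0.7))  # only "a" and "b" survive
```

With `top_k=64` and `top_p=0.95`, as configured above, far fewer candidates would be discarded than in this toy case.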

## Step 5: Upload your large file to Gemini 1.5

In [5]:
romanian_book = "/kaggle/input/scrisori-ctre-vasile-alecsandri-de-ion-ghica/Scrisori_catre_Vasile_Alecsandri.pdf"

In [6]:
files = [
  upload_to_gemini(romanian_book, mime_type="application/pdf"),
]

wait_for_files_active(files)

chat_session = model.start_chat(
  history=[
    {
      "role": "user",
      "parts": [
        files[0],
      ],
    }
  ]
)

Uploaded file 'Scrisori_catre_Vasile_Alecsandri.pdf' as: https://generativelanguage.googleapis.com/v1beta/files/njl6x85wzdf8
Waiting for file processing...
...all files ready



## Step 6: Ask Gemini 1.5 questions about your large file

In [7]:
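def query_model(prompt):
    """Send a prompt to the chat session; print the answer, token usage, and latency."""
    start_time = time()
    response = chat_session.send_message(prompt)
    end_time = time()  # stop the clock before printing
    print(response.text)
    print(response.usage_metadata)
    print(f"Total query time: {round(end_time - start_time, 2)} sec.")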
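
`query_model` only prints its results, which makes it hard to compare latency and token counts across queries. A possible variant, sketched under the assumption that the response exposes `text` and a `usage_metadata` object with the `prompt_token_count` and `candidates_token_count` fields seen in the outputs below. `QueryResult` and `timed_query` are hypothetical names.

```python
from dataclasses import dataclass
from time import perf_counter


@dataclass
class QueryResult:
    text: str
    prompt_tokens: int
    output_tokens: int
    seconds: float


def timed_query(chat, prompt):
    """Send prompt via chat.send_message; return text, token counts, latency."""
    start = perf_counter()
    response = chat.send_message(prompt)
    seconds = perf_counter() - start  # measured before any further processing
    usage = response.usage_metadata
    return QueryResult(response.text, usage.prompt_token_count,
                       usage.candidates_token_count, seconds)
```

Calling `timed_query(chat_session, "...")` for each prompt would give a list of results that can be tabulated afterwards instead of being read off the printed output.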

In [8]:
query_model("Make a summary of this book, and write this summary in French. Do it in less than 1000 typographical signs")

Voici un résumé du livre "Scrisori către Vasile Alecsandri" de Ion Ghica, en français et en moins de 1000 signes :

Ce recueil de lettres, adressées à Vasile Alecsandri, couvre la période 1835-1880.  Ghica évoque la vie politique et sociale de la Roumanie,  son évolution entre l'influence ottomane et russe. Il décrit des anecdotes, les personnalités rencontrées, et reflète les bouleversements de son époque. L’ouvrage aborde des thèmes comme la liberté, l'égalité, les luttes pour l'indépendance et les difficultés d'une transition politique.  On y trouve aussi des observations sur l'éducation et la société roumaine.

prompt_token_count: 421698
candidates_token_count: 159
total_token_count: 421857

Total query time: 80.12 sec.


In [9]:
query_model("Make the summary of the first chapter in the book. Do it in maximum 3 phrases. Write this answer in German.")

Das erste Kapitel der Briefsammlung von Ion Ghica an Vasile Alecsandri beschreibt die Einführung des 19. Jahrhunderts in Rumänien, die Modernisierung des Landes und die damit verbundenen gesellschaftlichen und politischen Veränderungen.  Es wird die Entwicklung der Bildung und die Ablösung der Phanarioten hervorgehoben.  Der Autor beleuchtet die schwierige Übergangszeit des Landes zwischen dem alten und neuen System.

prompt_token_count: 421884
candidates_token_count: 84
total_token_count: 421968

Total query time: 39.19 sec.


In [10]:
query_model("Summarize the chapter where is described a journey from Bucharest, the capital of Valachia to Iassi, the capital of Moldavia. Do it in maximum 10 phrases. Do it in Italian.")

Il capitolo descrive un viaggio da Bucarest a Iași, prima del 1848.  Il percorso era lungo e faticoso,  attraversando  zone paludose e impervie. L'autore incontra truppe ottomane di stanza lungo il Danubio.  Le strade erano in pessime condizioni.  Il viaggio mette in luce la presenza di guarnigioni turche. Si descrivono le fortificazioni lungo il fiume. Si notano le diverse denominazioni dei villaggi, derivanti dalla presenza turca. Il racconto evidenzia le differenze tra insegnanti greci e romeni. Il viaggio evidenzia le difficoltà e i pericoli del viaggio nell'epoca pre-moderna.  Infine, il capitolo descrive la presenza di fortezze turche lungo il Danubio.

prompt_token_count: 422012
candidates_token_count: 172
total_token_count: 422184

Total query time: 39.49 sec.
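
The token counts reported above are consistent with the chat session resending the whole conversation, including the uploaded book, on every turn: each `prompt_token_count` equals the previous `total_token_count` plus a few dozen tokens for the new question. A small worked check, using only the figures printed above (the helper name `new_prompt_tokens` is made up for this sketch):

```python
# (prompt_token_count, candidates_token_count, total_token_count) per turn,
# copied from the three outputs above.
turns = [
    (421698, 159, 421857),
    (421884, 84, 421968),
    (422012, 172, 422184),
]


def new_prompt_tokens(turns):
    """Tokens added to the prompt by each turn, relative to the previous total."""
    added = []
    previous_total = 0
    for prompt, candidates, total in turns:
        if prompt + candidates != total:
            raise ValueError("inconsistent usage figures")
        added.append(prompt - previous_total)
        previous_total = total
    return added


print(new_prompt_tokens(turns))    # [421698, 27, 44]
print(sum(t[0] for t in turns))    # 1265594 input tokens over three turns
```

The first entry is the book plus the first question; the follow-up questions add only 27 and 44 tokens, yet each turn still processes more than 420,000 input tokens.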


Credit:
 - Adapted from https://aistudio.google.com/app/prompts/video-qa