<a href="https://www.kaggle.com/code/gpreda/gemini-1-5-q-a-from-a-large-romanian-book?scriptVersionId=209250499" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

# Load a PDF from a dataset on Kaggle

## Step 1: Import Python Packages

In [1]:
import os
import time  # the module itself is needed: time.sleep() is called while polling
import google.generativeai as genai
from kaggle_secrets import UserSecretsClient

## Step 2: Authenticate with Google Generative AI

In [2]:
user_secrets = UserSecretsClient()
ai_studio_token = user_secrets.get_secret("GEMINI_API_KEY_SECOND")
genai.configure(api_key=ai_studio_token)
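
The cell above only works inside Kaggle, because `kaggle_secrets` is not importable elsewhere. A minimal sketch of a fallback, assuming the key may also be exported as an environment variable with the same name; `resolve_api_key` is a hypothetical helper, not part of either library:

```python
import os

def resolve_api_key(secret_name, env=None):
    """Return the key from Kaggle Secrets, else from an environment variable.

    Hypothetical helper for illustration; not part of any library.
    """
    env = os.environ if env is None else env
    try:
        # Only importable inside a Kaggle notebook.
        from kaggle_secrets import UserSecretsClient
        return UserSecretsClient().get_secret(secret_name)
    except Exception:
        # Import failed or the secret is not attached: try the environment.
        key = env.get(secret_name)
        if not key:
            raise RuntimeError(f"No API key found for {secret_name!r}")
        return key
```

The result could then be passed to `genai.configure(api_key=...)` exactly as in the cell above.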

## Step 3: Define helper functions

In [3]:
def upload_to_gemini(path, mime_type=None):
  """Uploads the given file to Gemini.

  See https://ai.google.dev/gemini-api/docs/prompting_with_media
  """
  file = genai.upload_file(path, mime_type=mime_type)
  print(f"Uploaded file '{file.display_name}' as: {file.uri}")
  return file

def wait_for_files_active(files):
  """Waits for the given files to be active.

  Some files uploaded to the Gemini API need to be processed before they can be
  used as prompt inputs. The status can be seen by querying the file's "state"
  field.

  This implementation uses a simple blocking polling loop. Production code
  should probably employ a more sophisticated approach.
  """
  print("Waiting for file processing...")
  for name in (file.name for file in files):
    file = genai.get_file(name)
    while file.state.name == "PROCESSING":
      print(".", end="", flush=True)
      time.sleep(10)
      file = genai.get_file(name)
    if file.state.name != "ACTIVE":
      raise Exception(f"File {file.name} failed to process")
  print("...all files ready")
  print()
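
The docstring notes that production code should use something more sophisticated than a fixed 10-second polling loop. One possible direction is sketched below: polling with a deadline and capped exponential backoff. `poll_until` and all of its parameters are hypothetical names chosen for illustration; `sleep` and `clock` are injectable so the logic can be exercised without waiting.

```python
import time

def poll_until(fetch, is_done, timeout=600.0, initial_delay=1.0, max_delay=30.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call fetch() until is_done(result) is true, backing off between tries.

    Hypothetical helper: raises TimeoutError if the next wait would pass
    the deadline.
    """
    deadline = clock() + timeout
    delay = initial_delay
    while True:
        result = fetch()
        if is_done(result):
            return result
        if clock() + delay > deadline:
            raise TimeoutError(f"Gave up after {timeout} seconds")
        sleep(delay)
        delay = min(delay * 2, max_delay)  # exponential backoff, capped
```

In this notebook, `fetch` could be `lambda: genai.get_file(name)` and `is_done` could be `lambda f: f.state.name != "PROCESSING"`, followed by the same `"ACTIVE"` check as above.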

## Step 4: Load the Gemini 1.5 model

In [4]:
# Create the model
generation_config = {
  "temperature": 1,
  "top_p": 0.95,
  "top_k": 64,
  "max_output_tokens": 8192,
  "response_mime_type": "text/plain",
}

model = genai.GenerativeModel(
  model_name="gemini-1.5-flash",
  generation_config=generation_config,
)
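
The three sampling settings in `generation_config` control how candidate tokens are narrowed down before one is drawn. The sketch below shows the common textbook interpretation on a toy distribution: temperature rescales the logits, `top_k` keeps the k most probable tokens, and `top_p` keeps the smallest set whose cumulative probability reaches p. It is an illustration only, assuming that ordering; it makes no claim about how Gemini implements sampling internally, and `filter_candidates` is a hypothetical name.

```python
import math

def filter_candidates(logits, temperature=1.0, top_k=64, top_p=0.95):
    """Return {token: probability} after temperature, top-k and top-p filtering.

    Illustrative only; temperature must be greater than 0.
    """
    # Temperature scaling followed by a numerically stable softmax.
    scaled = {tok: value / temperature for tok, value in logits.items()}
    peak = max(scaled.values())
    weights = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(weights.values())
    ranked = sorted(((w / total, tok) for tok, w in weights.items()), reverse=True)

    # top-k: keep only the k most probable tokens.
    ranked = ranked[:top_k]

    # top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for prob, tok in ranked:
        kept.append((prob, tok))
        cumulative += prob
        if cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1.
    norm = sum(prob for prob, _ in kept)
    return {tok: prob / norm for prob, tok in kept}
```

Lower temperatures concentrate probability on the top token, so fewer candidates survive the `top_p` cut; smaller `top_k` values cap the candidate set directly.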

## Step 5: Upload your large file to Gemini 1.5

In [5]:
romanian_book = "/kaggle/input/scrisori-ctre-vasile-alecsandri-de-ion-ghica/Scrisori_catre_Vasile_Alecsandri.pdf"

In [6]:
files = [
  upload_to_gemini(romanian_book, mime_type="application/pdf"),
]

wait_for_files_active(files)

chat_session = model.start_chat(
  history=[
    {
      "role": "user",
      "parts": [
        files[0],
      ],
    }
  ]
)

Uploaded file 'Scrisori_catre_Vasile_Alecsandri.pdf' as: https://generativelanguage.googleapis.com/v1beta/files/ebb7xlmsuqim
Waiting for file processing...
...all files ready



## Step 6: Ask Gemini 1.5 questions about your large file

In [7]:
def query_model(prompt):
    response = chat_session.send_message(prompt)
    print(response.text)
    print(response.usage_metadata)
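
`query_model` only prints, which makes it awkward to reuse an answer or to measure latency. A minimal sketch of a variant that returns the text, the usage metadata, and the elapsed time; `ask` is a hypothetical helper, and its `send` parameter stands in for `chat_session.send_message`:

```python
import time

def ask(send, prompt):
    """Send a prompt and return (text, usage_metadata, seconds_elapsed)."""
    start = time.perf_counter()
    response = send(prompt)
    elapsed = time.perf_counter() - start
    return response.text, response.usage_metadata, elapsed
```

A call would look like `text, usage, seconds = ask(chat_session.send_message, "...")`.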

In [8]:
query_model("Make a summary of this book, and write this summary in French. Do it in less than 1000 typographical signs")

Bien sûr, voici un résumé du livre « Scrisori către Vasile Alecsandri » d'Ion Ghica, en français et en moins de 1000 signes :

Ce recueil de lettres, adressées à Vasile Alecsandri par Ion Ghica, offre un témoignage précieux sur la Roumanie du XIXe siècle.  On y retrouve des portraits de personnages clés, des anecdotes sur la vie quotidienne, des réflexions sur la politique et la société de l’époque,  ainsi que des descriptions pittoresques de voyages.  Ghica évoque notamment les défis de la modernisation et de la lutte pour l’indépendance nationale. L’ouvrage aborde des thèmes comme la liberté, l’égalité et l’impact de la Révolution française.  Il est ponctué d’anecdotes savoureuses et de réflexions personnelles de Ghica sur son temps.

prompt_token_count: 421698
candidates_token_count: 181
total_token_count: 421879



In [9]:
query_model("Summarize the first chapter of the book. Do it in at most 3 sentences. Write the answer in German.")

Das erste Kapitel bietet eine Einleitung zu Ion Ghicas Briefen an Vasile Alecsandri, geschrieben von Alecsandri selbst. Es skizziert den historischen Kontext der Korrespondenz, beginnend mit dem Ende der Phanariotenherrschaft und der Entstehung des modernen Rumänien.  Schliesslich werden die Themen der Briefe, die soziale und politische Situation Rumäniens, vorgestellt.

prompt_token_count: 421906
candidates_token_count: 79
total_token_count: 421985



In [10]:
query_model("Summarize the chapter that describes a journey from Bucharest, the capital of Wallachia, to Iași, the capital of Moldavia. Do it in at most 10 sentences. Do it in Italian.")

Il capitolo descrive un viaggio da Bucarest a Iași, prima del 1848.  Il viaggio è lungo e faticoso, con strade dissestate e pericoli vari.  L'autore incontra truppe ottomane di stanza in varie località.  Si descrivono dettagliatamente le condizioni delle strade e le difficoltà incontrate.  Il viaggio evidenzia il contrasto tra le aree urbane e quelle rurali.  Si sottolinea la presenza di fortificazioni lungo il Danubio.  Vengono menzionate le diverse tipologie di insegnanti presenti nel paese.  Il viaggio termina a Iași, sottolineando il contrasto con Bucarest.  La descrizione offre uno spaccato della vita e della geografia della Romania dell'epoca. Si mette in evidenza la presenza di guarnigioni turche nel paese.

prompt_token_count: 422029
candidates_token_count: 177
total_token_count: 422206
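
The usage figures show that the book is counted again on every turn: each `prompt_token_count` is roughly 422,000, which is consistent with the chat history, including the PDF, being sent with each message. A small sketch that tallies the numbers reported above; `summarize_usage` is a hypothetical helper:

```python
# (prompt_token_count, candidates_token_count) copied from the three outputs above.
turns = [(421698, 181), (421906, 79), (422029, 177)]

def summarize_usage(turns):
    """Return total input tokens, total output tokens, and per-turn prompt growth."""
    total_input = sum(prompt for prompt, _ in turns)
    total_output = sum(output for _, output in turns)
    growth = [later[0] - earlier[0] for earlier, later in zip(turns, turns[1:])]
    return total_input, total_output, growth

total_input, total_output, growth = summarize_usage(turns)
print(total_input, total_output, growth)  # 1265633 437 [208, 123]
```

Three short questions therefore consume about 1.27 million input tokens in total, while the prompt grows by only a few hundred tokens per turn (the previous answer plus the new question).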



Credit:
 - Adapted from https://aistudio.google.com/app/prompts/video-qa