# Generating questions via Gemini

## **Loading Gemini**

In [None]:
!pip install -q -U google-generativeai


In [None]:
import google.generativeai as genai

# Used to securely store your API key
from google.colab import userdata

In [None]:
# Or use `os.getenv('GOOGLE_API_KEY')` to fetch an environment variable.
GOOGLE_API_KEY=userdata.get('GOOGLE_API_KEY')

genai.configure(api_key=GOOGLE_API_KEY)
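
Outside Colab, `userdata` is not available, so the comment above suggests reading the key from an environment variable instead. A minimal sketch of that alternative follows. The helper name `get_api_key` is hypothetical (it is not part of any library), and failing early on a missing key is a design choice of this sketch.

```python
import os

def get_api_key(env_var="GOOGLE_API_KEY"):
    """Return the key stored in `env_var`, or raise if it is missing or empty.

    Hypothetical helper for illustration; not part of google-generativeai.
    """
    key = os.getenv(env_var)
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key
```

The result could then be passed to `genai.configure(api_key=...)` in place of the `userdata.get(...)` value.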

In [None]:
for m in genai.list_models():
  if 'generateContent' in m.supported_generation_methods:
    print(m.name)

models/gemini-1.0-pro
models/gemini-1.0-pro-001
models/gemini-1.0-pro-latest
models/gemini-1.0-pro-vision-latest
models/gemini-1.5-pro-latest
models/gemini-pro
models/gemini-pro-vision


In [None]:
safety_settings = [
  {
    "category": "HARM_CATEGORY_HARASSMENT",
    "threshold": "BLOCK_NONE"
  },
  {
    "category": "HARM_CATEGORY_HATE_SPEECH",
    "threshold": "BLOCK_NONE"
  },
  {
    "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
    "threshold": "BLOCK_NONE"
  },
  {
    "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
    "threshold": "BLOCK_NONE"
  },
]

model = genai.GenerativeModel(model_name="gemini-pro",
                              #generation_config=generation_config,
                              safety_settings=safety_settings)

## **Book name + Summary**

In [None]:
import pandas as pd

In [None]:
df = pd.read_csv("/content/drive/MyDrive/Coding/AI-Builder/TheActualProject/DataCollection/booksummaries.txt",
                              header=None,sep="\t",
                              names=["ID", "Freebase ID", "Book Name", "Book Author", "Pub date","Genres","Summary"])
df = df[['ID', 'Book Name', 'Summary']]
df

Unnamed: 0,ID,Book Name,Summary
0,620,Animal Farm,"Old Major, the old boar on the Manor Farm, ca..."
1,843,A Clockwork Orange,"Alex, a teenager living in near-future Englan..."
2,986,The Plague,The text of The Plague is divided into five p...
3,1756,An Enquiry Concerning Human Understanding,The argument of the Enquiry proceeds by a ser...
4,2080,A Fire Upon the Deep,The novel posits that space around the Milky ...
...,...,...,...
16554,36934824,Under Wildwood,"Prue McKeel, having rescued her brother from ..."
16555,37054020,Transfer of Power,The reader first meets Rapp while he is doing...
16556,37122323,Decoded,The book follows very rough chronological ord...
16557,37132319,America Again: Re-becoming The Greatness We Ne...,Colbert addresses topics including Wall Stree...


## **Asking questions & collecting data**

### **Test set: Unseen + seen (100-100)(head(202))**

In [None]:
%%time

limited_df = df.head(202)  # Try a smaller df first

data = []
error = []
for index, row in limited_df.iterrows():
  ID = row['ID']
  name = row['Book Name']
  summary = row['Summary']
  try:
    response = model.generate_content(f"""Book name: {name}
      Summary: {summary}
      If you were to ask some question to get a librarian to know what book you wanted to buy, without mentioning the title or any specific detail. (ask only 8 questions)""")
    data.append([ID, name, response.text])
  except Exception as e:
    print(f"Error for book {name}: {e}")
    error.append(name)

new_df = pd.DataFrame(data, columns=['ID', 'Book Name', 'Questions'])
new_df.to_csv('TESTUNCLEANbookquestions.csv', index=False)

Error for book A Clockwork Orange: The `response.parts` quick accessor only works for a single candidate, but none were returned. Check the `response.prompt_feedback` to see if the prompt was blocked.
Error for book The World According to Garp: The `response.parts` quick accessor only works for a single candidate, but none were returned. Check the `response.prompt_feedback` to see if the prompt was blocked.
CPU times: user 13.8 s, sys: 1.44 s, total: 15.3 s
Wall time: 16min 27s
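
The two errors above say that no candidate was returned and point to `response.prompt_feedback`. One way to handle this without relying on a broad `except` is to check for candidates before reading `.text`. The sketch below assumes the response object exposes `candidates` and `text` as the error message implies, and that `.text` raises `ValueError` when nothing usable came back. The helper name `extract_text` is hypothetical.

```python
def extract_text(response):
    """Return the generated text, or None if the response has no usable candidate.

    Hypothetical helper; assumes `response` has `candidates` and `text`
    attributes, as the error message in the output above suggests.
    """
    if not getattr(response, "candidates", None):
        # Nothing was returned, e.g. because the prompt was blocked.
        return None
    try:
        return response.text
    except ValueError:
        # Assumption: the quick accessor raises ValueError when it cannot
        # produce text from the returned candidate.
        return None
```

In the loop, a `None` result could be logged together with `response.prompt_feedback` so the reason for the block is kept alongside the book name.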


In [None]:
new_df

Unnamed: 0,ID,Book Name,Questions
0,620,Animal Farm,1. Do you have any books that explore themes o...
1,986,The Plague,1. Can you recommend a book that explores the ...
2,1756,An Enquiry Concerning Human Understanding,1. Can you recommend a book on the philosophy ...
3,2080,A Fire Upon the Deep,1. Does the book explore the impact of technol...
4,2152,All Quiet on the Western Front,1. Can you recommend a book that portrays the ...
...,...,...,...
195,58622,Consider Phlebas,1. Can you recommend a science fiction novel t...
196,58665,Inversions,1. Can you recommend a captivating historical ...
197,58888,An Inspector Calls,1. Is there a play that explores the themes of...
198,58901,Ender's Game,1. Is there a book that explores the complexit...


In [None]:
error

['A Clockwork Orange', 'The World According to Garp']

### **Cleaning Test Set**

In [None]:
# pd.read_csv already returns a DataFrame, so no extra conversion is needed
testdf = pd.read_csv('/content/drive/MyDrive/Coding/AI-Builder/TheActualProject/DataCollection/TESTUNCLEANbookquestions.csv')
testdf

Unnamed: 0,ID,Book Name,Questions
0,620,Animal Farm,1. Do you have any books that explore themes o...
1,986,The Plague,1. Can you recommend a book that explores the ...
2,1756,An Enquiry Concerning Human Understanding,1. Can you recommend a book on the philosophy ...
3,2080,A Fire Upon the Deep,1. Does the book explore the impact of technol...
4,2152,All Quiet on the Western Front,1. Can you recommend a book that portrays the ...
...,...,...,...
195,58622,Consider Phlebas,1. Can you recommend a science fiction novel t...
196,58665,Inversions,1. Can you recommend a captivating historical ...
197,58888,An Inspector Calls,1. Is there a play that explores the themes of...
198,58901,Ender's Game,1. Is there a book that explores the complexit...


In [None]:
%%time

data = []
for index, row in testdf.iterrows():
  ID = row['ID']
  name = row['Book Name']
  question = row['Questions']
  response = model.generate_content(f"""From these choices, give me one that most resembles a customer buying a book.
  {question}""")
  data.append([ID, name, response.text])

new_testdf = pd.DataFrame(data, columns=['ID', 'Book Name', 'Questions'])
new_testdf.to_csv('TESTCLEANbookquestions.csv', index=False)

CPU times: user 5.63 s, sys: 549 ms, total: 6.18 s
Wall time: 5min 55s


In [None]:
new_testdf

Unnamed: 0,ID,Book Name,Questions
0,620,Animal Farm,3. Which book features a character named Old M...
1,986,The Plague,1. Can you recommend a book that explores the ...
2,1756,An Enquiry Concerning Human Understanding,1. Can you recommend a book on the philosophy ...
3,2080,A Fire Upon the Deep,None of the choices most resembles a customer ...
4,2152,All Quiet on the Western Front,1. Can you recommend a book that portrays the ...
...,...,...,...
195,58622,Consider Phlebas,1. Can you recommend a science fiction novel t...
196,58665,Inversions,1. Can you recommend a captivating historical ...
197,58888,An Inspector Calls,2. Do you have any books about a family gather...
198,58901,Ender's Game,1. Is there a book that explores the complexit...
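
The cleaned output above still carries the list numbering (`3. Which book...`), and at least one row is a non-answer (`None of the choices...`). A small post-processing sketch for both cases follows. The function names are hypothetical, and the non-answer check only matches the one phrasing visible in the output, so other refusal wordings would need their own patterns.

```python
import re

# Matches a leading list number such as "3. " at the start of the answer.
_NUMBER_PREFIX = re.compile(r"^\s*\d+\.\s*")

def strip_numbering(answer):
    """Remove a leading 'N. ' prefix, if present, and trim whitespace."""
    return _NUMBER_PREFIX.sub("", answer).strip()

def is_non_answer(answer):
    """Flag replies that decline to pick any of the questions.

    Only covers the phrasing seen in the output above (an assumption).
    """
    return answer.strip().lower().startswith("none of the choices")
```

With pandas, these could be applied to the `Questions` column, for example via `new_testdf['Questions'].map(strip_numbering)`, after dropping the rows that `is_non_answer` flags.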


## **Train set (run in VS Code)**

In [None]:
%%time

limited_df = df[202:]  # Rows after the test set

data = []
error = []
for index, row in limited_df.iterrows():
  ID = row['ID']
  name = row['Book Name']
  summary = row['Summary']
  try:
    response = model.generate_content(f"""Book name: {name}
      Summary: {summary}
      If you were to ask some question to get a librarian to know what book you wanted to buy, without mentioning the title or any specific detail. (ask only 8 questions)""")
    data.append([ID, name, response.text])
  except Exception as e:
    print(f"Error for book {name}: {e}")
    error.append(name)

new_df = pd.DataFrame(data, columns=['ID', 'Book Name', 'Question'])
new_df.to_csv('TRAINbookquestions.csv', index=False)

Error for book Choke: The `response.parts` quick accessor only works for a single candidate, but none were returned. Check the `response.prompt_feedback` to see if the prompt was blocked.
Error for book Lost Girls: The `response.parts` quick accessor only works for a single candidate, but none were returned. Check the `response.prompt_feedback` to see if the prompt was blocked.
Error for book The Tale of Genji: The `response.parts` quick accessor only works for a single candidate, but none were returned. Check the `response.prompt_feedback` to see if the prompt was blocked.
Error for book East of Eden: The `response.parts` quick accessor only works for a single candidate, but none were returned. Check the `response.prompt_feedback` to see if the prompt was blocked.
Error for book Equus: The `response.parts` quick accessor only works for a single candidate, but none were returned. Check the `response.prompt_feedback` to see if the prompt was blocked.
Error for book Fanny Hill: The `resp
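
The loop above writes the CSV only once, after every book has been processed, so an interruption during a long run loses everything collected so far. A hedged sketch of incremental checkpointing with the standard `csv` module follows. Both helper names are hypothetical, and the file layout (ID in the first column) simply mirrors the columns used above.

```python
import csv
import os

def append_row(path, row, header):
    """Append one row to a CSV file, writing the header first if the file is new."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        if is_new:
            writer.writerow(header)
        writer.writerow(row)

def done_ids(path):
    """Return the set of IDs (first column, as strings) already saved to `path`."""
    if not os.path.exists(path):
        return set()
    with open(path, newline="", encoding="utf-8") as f:
        rows = list(csv.reader(f))
    # Skip the header row and any empty lines.
    return {r[0] for r in rows[1:] if r}
```

Inside the loop, `append_row` would replace `data.append(...)`, and books whose `str(ID)` is already in `done_ids(path)` could be skipped when the run is restarted.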

## **Making it asynchronous (didn't work 💀)**

In [None]:
import asyncio

limited_df = df.head(10)

# Define a coroutine to generate content for a single book
async def generate_content(name, summary):
  try:
    result = await model.generate_content_async(f"""Book name: {name}
      Summary: {summary}
      From this information if you were to buy this book but you don't know the name of this book. How would you ask questions to get other people to know what book you wanted.(You can ask only 8 question)""")
    return [name, result.text]
  except Exception as e:
    print(f"Error for book {name}: {e}")
    return [name, None]

# Create a list of coroutines
coroutines = [generate_content(name, summary) for name, summary in zip(limited_df['Book Name'], limited_df['Summary'])]

# Run the coroutines concurrently using asyncio.gather
new_data = await asyncio.gather(*coroutines)

# Filter out any errors
new_df = pd.DataFrame([d for d in new_data if d[1] is not None], columns=['Book Name', 'Summary'])

Error for book Animal Farm: Unexpected type of call <class 'google.ai.generativelanguage_v1beta.types.generative_service.GenerateContentResponse'>
Error for book A Clockwork Orange: Unexpected type of call <class 'google.ai.generativelanguage_v1beta.types.generative_service.GenerateContentResponse'>
Error for book The Plague: Unexpected type of call <class 'google.ai.generativelanguage_v1beta.types.generative_service.GenerateContentResponse'>
Error for book An Enquiry Concerning Human Understanding: Unexpected type of call <class 'google.ai.generativelanguage_v1beta.types.generative_service.GenerateContentResponse'>
Error for book A Fire Upon the Deep: Unexpected type of call <class 'google.ai.generativelanguage_v1beta.types.generative_service.GenerateContentResponse'>
Error for book All Quiet on the Western Front: Unexpected type of call <class 'google.ai.generativelanguage_v1beta.types.generative_service.GenerateContentResponse'>
Error for book A Wizard of Earthsea: Unexpected type o

In [None]:
import asyncio

limited_df = df.head(10)
sem = asyncio.Semaphore(10)

async def generate_content(name, summary):
  async with sem:
    try:
      result = await model.generate_content_async(f"""Book name: {name}
        Summary: {summary}
        From this information if you were to buy this book but you don't know the name of this book. How would you ask questions to get other people to know what book you wanted.(You can ask only 8 question)""")
      return [name, result.text]
    except Exception as e:
      print(f"Error for book {name}: {e}")
      return [name, None]

async def main():
  tasks = [generate_content(name, summary) for name, summary in zip(limited_df['Book Name'], limited_df['Summary'])]
  result = await asyncio.gather(*tasks)
  # Return the DataFrame: assigning new_df inside main() would only create a local variable
  return pd.DataFrame([d for d in result if d[1] is not None], columns=['Book Name', 'Summary'])

new_df = await main()

Error for book Animal Farm: Unexpected type of call <class 'google.ai.generativelanguage_v1beta.types.generative_service.GenerateContentResponse'>
Error for book A Clockwork Orange: Unexpected type of call <class 'google.ai.generativelanguage_v1beta.types.generative_service.GenerateContentResponse'>
Error for book The Plague: Unexpected type of call <class 'google.ai.generativelanguage_v1beta.types.generative_service.GenerateContentResponse'>
Error for book An Enquiry Concerning Human Understanding: Unexpected type of call <class 'google.ai.generativelanguage_v1beta.types.generative_service.GenerateContentResponse'>
Error for book A Fire Upon the Deep: Unexpected type of call <class 'google.ai.generativelanguage_v1beta.types.generative_service.GenerateContentResponse'>
Error for book All Quiet on the Western Front: Unexpected type of call <class 'google.ai.generativelanguage_v1beta.types.generative_service.GenerateContentResponse'>
Error for book A Wizard of Earthsea: Unexpected type o

In [None]:
new_df

Unnamed: 0,Book Name,Summary
0,Animal Farm,1. Is there a book where animals rebel against...
1,The Plague,1. Is there a book where a town is quarantined...
2,An Enquiry Concerning Human Understanding,1. Is it a philosophy book?\n2. Does it discus...
3,A Fire Upon the Deep,1. Can you recommend a science fiction novel a...
4,All Quiet on the Western Front,1. Can you recommend a book that portrays the ...
5,A Wizard of Earthsea,1. Is there a novel that features a young boy ...
6,Anyone Can Whistle,1. Can you recommend a play set in an imaginar...
7,Blade Runner 3: Replicant Night,1. Is it a sci-fi book set in a futuristic wor...
8,Blade Runner 2: The Edge of Human,1. Is there a book that continues the story of...
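
Since both `generate_content_async` attempts failed, one alternative worth trying is to keep the blocking call and run it in a small thread pool. The sketch below uses only the standard library and takes the request function as a parameter `ask`, which is a hypothetical stand-in for something like `lambda name, summary: model.generate_content(...).text`. Whether the Gemini client is safe to call from several threads, and what rate limits apply, are not verified here.

```python
from concurrent.futures import ThreadPoolExecutor

def generate_all(books, ask, max_workers=4):
    """Call ask(name, summary) for each (name, summary) pair using a thread pool.

    Returns (results, failed): results is a list of (name, text) pairs in
    input order, failed is a list of names whose call raised an exception.
    """
    def worker(book):
        name, summary = book
        try:
            return name, ask(name, summary), None
        except Exception as e:  # keep going when a single book fails
            return name, None, e

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the input order of the books.
        outcomes = list(pool.map(worker, books))

    results = [(name, text) for name, text, err in outcomes if err is None]
    failed = [name for name, text, err in outcomes if err is not None]
    return results, failed
```

A call could look like `generate_all(zip(limited_df['Book Name'], limited_df['Summary']), ask)`. Keeping `max_workers` small is a cautious default given the unknown rate limits.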
