In [7]:
import google.generativeai as genai
import pandas as pd
import json
from tqdm import tqdm
from langchain.prompts.chat import (ChatPromptTemplate, HumanMessagePromptTemplate)
import time

N_GENERATIONS = 10

In [8]:
with open("api_keys.json") as f:
    api_keys = json.load(f)
    gemini_api_key = api_keys["gemini"]
genai.configure(api_key=gemini_api_key)
model = genai.GenerativeModel(model_name='gemini-1.5-flash')

In [9]:
QA_generation_prompt = """
Your task is to write a factoid question and an answer based on the given Terms and Conditions (T&C) context.
Your factoid question should target specific clauses, rights, obligations, or policies from the context, focusing on key details that a user may seek clarity on.
Make sure your factoid question resembles the type of questions a user might ask when seeking information about T&C, and make sure you incorporate the company name in the question such as "What is the policy on returns in <company name>?" or "How are user data stored in <company name>?"

Avoid references to "the passage" or "the context" in your question, and avoid directing the reader to the documents for more information.

Provide your answer as follows:

Factoid question: (your factoid question)
Answer: (your detailed and precise answer to the factoid question)

Now here is the context.

Context: {context}
"""

prompt_template = ChatPromptTemplate.from_messages([HumanMessagePromptTemplate.from_template(QA_generation_prompt)])

In [10]:
def call_llm(model, prompt):
    """Send the prompt to the Gemini model and return the generated text."""
    response = model.generate_content(prompt)
    return response.text
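
API calls such as the one in call_llm can fail transiently (rate limits, network errors). The following is a minimal sketch of a generic retry helper with exponential backoff. The name call_with_retry and its parameters are hypothetical and not part of this notebook or of any library; it catches every exception, which may be broader than wanted in practice.

```python
import time


def call_with_retry(fn, *args, retries=3, base_delay=1.0, sleep=time.sleep, **kwargs):
    """Call fn(*args, **kwargs), retrying on any exception.

    Hypothetical helper: waits base_delay, 2*base_delay, 4*base_delay, ...
    seconds between attempts and re-raises the last error once `retries`
    attempts have been used up.
    """
    for attempt in range(retries):
        try:
            return fn(*args, **kwargs)
        except Exception:
            if attempt == retries - 1:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** attempt)
```

Under these assumptions it could wrap the existing function as call_with_retry(call_llm, model, prompt).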

In [11]:
documents = pd.read_csv('documents.csv')

In [42]:
outputs = []
for doc in tqdm(documents.sample(n=N_GENERATIONS, random_state=48).iterrows()):
    # Generate a QA pair; keep the company name separate from the document text
    context = "Company: " + doc[1]['company_names'] + "\n" + doc[1]['documents']
    output_QA_couple = call_llm(model, prompt_template.format(context=context))
    time.sleep(0.5)
    print(output_QA_couple)
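    # str.split never raises, so check for the expected markers explicitly
    if "Factoid question:" not in output_QA_couple or "Answer:" not in output_QA_couple:
        print("Error: unexpected output format")
        continue
    # Strip whitespace so the question does not keep a trailing newline
    question = output_QA_couple.split("Factoid question:")[-1].split("Answer:")[0].strip()
    answer = output_QA_couple.split("Answer:")[-1].strip()
    outputs.append(
        {
            "company": doc[1]['company_names'],
            "question": question,
            "right answer": answer,
        }
    )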

1it [00:02,  2.57s/it]

Factoid question: What is the policy regarding the posting of ads for real estate on Bazaraki?
Answer: Bazaraki allows ads for properties located only within the Republic of Cyprus. They act as an advertising platform and are not a licensed real estate agent, meaning they do not assist with any transactions related to real estate. Only owners of real estate, developers, agents, legal representatives of owners with a Power of Attorney, and first-degree relatives of owners can list real estate on Bazaraki. Bazaraki is not obligated to review or control any person or organization posting ads on their platform. They do not arrange viewings for listed properties and the information provided is purely for informational purposes, not constituting an offer or invitation to sell or rent real estate. Additionally, Bazaraki may, at their discretion, remove or decline to display any information or ad that does not comply with their terms and conditions. They may require the information to be amend

2it [00:04,  1.94s/it]

Factoid question: What is the policy regarding returns and refunds for items purchased through ZVAB?
Answer: ZVAB sellers are obligated to accept returns from customers in accordance with the return policies outlined in the Seller Policies.  Refunds for returned items paid for through the External Payment Service Provider are managed by AbeBooks.  ZVAB sellers will either be invoiced for refunds or AbeBooks will debit refunds against the proceeds due from the External Payment Service Provider. ZVAB sellers are prohibited from charging restocking or similar fees in connection with returns and/or refunds. 


3it [00:05,  1.64s/it]

Factoid question: What happens to my Xiaomi Account if I don't use it for two years?
Answer: If you haven't used your Xiaomi Account for 24 consecutive months or haven't signed in using other approved methods, Xiaomi has the right to cancel your account. This means you'll lose access to the account and related services. However, Xiaomi will provide reasonable assistance with pending transactions or balances associated with the account. You should follow Xiaomi's instructions and notifications for this process. 


4it [00:06,  1.58s/it]

Factoid question: What is Eatwith's policy on hosts offering discounts or cash back to guests outside of the Eatwith platform?

Answer: Eatwith prohibits hosts from offering discounts or cash back to guests outside of the Eatwith platform. Hosts are not allowed to bypass the Eatwith platform for payments or to offer any portion of their commission or turnover to guests or potential guests.  This policy is in place to protect the integrity of the Eatwith platform and ensure that all transactions are properly managed and accounted for. 


5it [00:08,  1.56s/it]

Factoid question: What are the insurance requirements for publishing an Experience on the Airbnb platform?

Answer: Airbnb may require Experience Hosts to obtain their own insurance to publish an Experience. The company will notify hosts of any changes to the insurance requirements. Hosts are expected to acquire and maintain insurance for themselves, their team, and their experience with the coverage and amounts specified by Airbnb. Hosts must cooperate with Airbnb to verify their insurance coverage. If Airbnb has its own liability insurance covering Experiences, the host's insurance will be the primary source of coverage, and Airbnb's insurance will function as excess or secondary insurance for amounts exceeding the host's coverage. However, procuring secondary insurance by Airbnb does not relieve hosts of their obligation to obtain insurance in amounts required by the company. 


6it [00:09,  1.59s/it]

Factoid question: What is Tripadvisor's policy regarding the posting of content that infringes on someone's copyright?
Answer: Tripadvisor operates on a "notice and takedown" basis. If you believe that material or content posted on the Services infringes a copyright that you hold, you can contact them by following their notice and takedown procedure. Tripadvisor will then make all reasonable efforts to remove manifestly illegal content within a reasonable time. 


7it [00:11,  1.50s/it]

Factoid question: What is the policy on recording lessons in Preply?
Answer: Preply may record lessons to ensure quality and may use those recordings without compensation to the user. Users can opt out of recordings or request removal of existing recordings by contacting support@preply.com. 


8it [00:13,  1.56s/it]

Factoid question: What is the policy regarding payments for Cash Orders placed with Thuisbezorgd.nl? 
Answer: Thuisbezorgd.nl does not accept Cash Orders when they are procuring the Delivery Services. If Thuisbezorgd.nl is not procuring the Delivery Services, then the Restaurant will receive the payment from Customers for Cash Orders. 


9it [00:14,  1.47s/it]

Factoid question: What is BeReal's policy on retaining data related to user interactions, such as friend invitations and comments?

Answer: BeReal retains interaction data indefinitely, meaning for as long as a user has an active account or until the interaction is deleted by the user. This includes data like friend invitations, reactions, and comments left on content shared by others. 


10it [00:15,  1.59s/it]

Factoid question: What happens to my TikTok For Business account if I violate the TikTok For Business Commercial Terms of Service?
Answer: TikTok reserves the right to suspend, terminate, or restrict access to your TikTok For Business account if they determine you have violated the Commercial Terms of Service, are about to materially breach the terms, or are causing harm to TikTok, its users, or other third parties. TikTok may also terminate your account for convenience with 30 days' prior written notice. 
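
The outputs above show that the model sometimes puts a blank line between the question and the answer, so splitting on fixed strings is fragile. As an alternative, here is a minimal sketch of a regex-based parser. The names QA_PATTERN and parse_qa are hypothetical and not part of the notebook; the sketch assumes each response contains one "Factoid question:" marker followed by one "Answer:" marker.

```python
import re

# Hypothetical pattern: lazily capture the question up to "Answer:",
# then capture everything after it as the answer.
QA_PATTERN = re.compile(
    r"Factoid question:\s*(?P<question>.+?)\s*Answer:\s*(?P<answer>.+)",
    re.DOTALL,
)


def parse_qa(text):
    """Return (question, answer) with whitespace stripped, or None if the markers are missing."""
    match = QA_PATTERN.search(text)
    if match is None:
        return None
    return match.group("question").strip(), match.group("answer").strip()
```

Returning None instead of raising lets the generation loop skip malformed responses explicitly.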





In [12]:
new_testset = pd.concat([new_testset, pd.DataFrame(outputs)])
new_testset

Unnamed: 0,company,question,right answer
0,Bolt,What is the policy on retaining driver data a...,"After a driver's account is closed, Bolt Head..."
1,Shopify,How long does Shopify keep store information ...,Shopify retains store information for two year...
2,Amazon,What is Amazon's policy regarding the storage...,"Amazon states that they will not retain, use, ..."
3,Uber,What is the policy on returns in Uber?\n,The Uber Terms of Service do not explicitly m...
4,TikTok,What company am I contracting with when I use...,"If you are resident in the United Kingdom, you..."
5,Google Search,What happens to my Google One membership afte...,After the promotional period (Offer Period) en...
6,Instagram,What is the maximum time it can take for Inst...,"According to Instagram's policies, it can take..."
7,AliExpress,What is the policy on the number of product l...,AliExpress reserves the right to place restri...
8,azuremarketplace.microsoft.com,What are the governing terms and conditions f...,If you purchase Azure services through a Micr...
9,Tradera,What kind of data does Tradera collect automa...,Tradera automatically collects information sen...


In [None]:
new_testset.to_excel('new_testset.xlsx', index=False)