In [1]:
system_prompt = """
## Task description

You are an expert at generating relevant questions given some chunk of text. Your task is to read that text and create one or more questions that accurately reflect the key points, or information contained within that chunk.

This task is important for creating question-answer pairs that will be used to benchmark embedding models.

## Output format

You will output a JSON array of objects. Each object will have the following structure:

{
    "question": "A question generated from the chunk of text.",
}

## Examples

### Example 1

#### Text:

bitsandbytes enables accessible large language models via k-bit quantization for PyTorch. bitsandbytes provides three main features for dramatically reducing memory consumption for inference and training:

- 8-bit optimizers uses block-wise quantization to maintain 32-bit performance at a small fraction of the memory cost.
- LLM.int8() or 8-bit quantization enables large language model inference with only half the required memory and without any performance degradation. This method is based on vector-wise quantization to quantize most features to 8-bits and separately treating outliers with 16-bit matrix multiplication.
- QLoRA or 4-bit quantization enables large language model training with several memory-saving techniques that don’t compromise performance. This method quantizes a model to 4-bits and inserts a small set of trainable low-rank adaptation (LoRA) weights to allow training.

#### Output:

[
    {
        "question": "What is the primary purpose of the bitsandbytes library?"
    },
    {
        "question": "What are the three main features bitsandbytes provides for reducing memory consumption?"
    },
    {
        "question": "How does the LLM.int8() feature work to enable 8-bit inference without performance degradation?"
    },
    {
        "question": "What is QLoRA and how does it utilize 4-bit quantization for training?"
    },
    {
        "question": "What quantization technique do the 8-bit optimizers in bitsandbytes use?"
    },
    {
        "question": "Why is vector-wise quantization used in the LLM.int8() method, and how are outliers handled?"
    }
]

### Example 2

#### Text:

Creating your first star trail image
How it works

You’ve captured hundreds of photos, and now it’s time to blend them together to create your first star trail image. Each photo can be thought of as having two parts: the stars and the background.

The background remains still, while the stars appear to move from frame to frame due to Earth’s rotation. Our goal is to keep the background consistent while revealing the stars’ motion across the sky.

To do this, we gradually blend the images together to trace the path of each star. The pixels representing the background are mostly dark, while the ones representing the stars are bright.

We use the lighten blending mode for this process, it takes the brighter value between a pixel in one image and the corresponding pixel in the next. This simple rule creates the illusion of continuous trails.

If you can’t quite visualize it yet, don’t worry, I’ve included some illustrations to show exactly how this blending algorithm works.

To demonstrate how the lighten blending mode works, I created five small 8×8 images. Each cell is colored either black (background) or white (stars).

In the white cells, I’ve placed the number 1, which represents full brightness. In practice, each pixel in an 8-bit image stores a value between 0 and 255, where 255 corresponds to the maximum brightness. So, in this simplified example, 1 stands for 255.

The black cells correspond to a brightness value of 0. For clarity, they are left empty in the diagram because they are greater in number than the white cells.

In the first image, there are three stars. From one frame to the next, each star moves one pixel to the right and one pixel down. By the final frame, only one star remains visible, as the other two have moved outside the 8×8 grid.

Now, let’s apply the lighten blending mode to the first two images. In the illustration, you’ll see them represented as inputs to the max() function. This function compares the pixel values from both images and keeps the brighter one for each position. The resulting image is a blend of the two, showing all the stars that were visible in either frame.

The blended image becomes the new input for the next iteration of the max() function. The third image is then used as the second argument. Blend these two together, and repeat the process with the remaining images until you reach the fifth one.

The final blended result is your complete star trail image, showing the continuous paths traced by the stars over time.

#### Output:

[
    {
        "question": "What is the primary goal when blending photos to create a star trail image?"
    },
    {
        "question": "Why do the stars appear to move from one photo to the next?"
    },
        {
        "question": "What are the two main components of each image used to create a star trail photo?"
    },
    {
        "question": "What specific blending mode is used to create the star trail effect?"
    },
    {
        "question": "In the simplified 8×8 grid example, what do the white cells and black cells represent in terms of pixel brightness?"
    },
    {
        "question": "What mathematical function is used in the illustration to represent the lighten blending mode?"
    },
    {
        "question": "How does on a pixel level?"
    },
    {
        "question": "Describe the iterative process of blending the images to create the final star trail."
    },
]

## Notes

- Don't generate questions that are similar to each other.
- Questions should be relevant to the content of the chunk of text.
- Generate only a JSON array as specified in the output format.
"""

In [2]:
from pydantic import BaseModel


class Question(BaseModel):
    question: str

In [None]:
from google import genai
google_client = genai.Client(api_key="")

def generate_questions(text_chunk:str,system_prompt:str,model_name:str = "gemini-2.5-flash")-> list[Question]:
    try:
        response = google_client.models.generate_content(model = model_name,contents=[system_prompt,text_chunk,],config={"response_mime_type": "application/json",
                "response_schema": list[Question],
                "max_output_tokens": 65_536,},)
        questions:list[Question] = response.parsed
        return questions
    except Exception as e:
        print(f"Error generating questions:{e}")
        return []
                       


In [7]:
import json

file_path = "/content/drive/MyDrive/data/chunks/rog_strix_gaming_notebook_pc_unscanned_file_chunks.json"
with open(file_path, "r") as f:
    text_chunks = json.load(f)

len(text_chunks)

30

In [13]:
!pip install tqdm



In [17]:
import time
from tqdm import tqdm

question_answer_pairs = []

for text_chunk_object in tqdm(iterable=text_chunks, total=len(text_chunks)):
    text_chunk = text_chunk_object["text_chunk"]
    questions =generate_questions(text_chunk=text_chunk,system_prompt=system_prompt)
    if not questions:
        continue
    question_answer_pairs.extend([
            {"chunk_id": text_chunk_object["id"], "question": question_object.question}
            for question_object in questions
        ])
    time.sleep(5)


 70%|███████   | 21/30 [03:28<00:57,  6.38s/it]

Error generating questions:429 RESOURCE_EXHAUSTED. {'error': {'code': 429, 'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits. To monitor your current usage, head to: https://ai.dev/rate-limit. \n* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_requests, limit: 20, model: gemini-2.5-flash\nPlease retry in 46.734393718s.', 'status': 'RESOURCE_EXHAUSTED', 'details': [{'@type': 'type.googleapis.com/google.rpc.Help', 'links': [{'description': 'Learn more about Gemini API quotas', 'url': 'https://ai.google.dev/gemini-api/docs/rate-limits'}]}, {'@type': 'type.googleapis.com/google.rpc.QuotaFailure', 'violations': [{'quotaMetric': 'generativelanguage.googleapis.com/generate_content_free_tier_requests', 'quotaId': 'GenerateRequestsPerDayPerProjectPerModel-FreeTier', 'quotaDimensions': {'location': 'global', 'model': 'gemin

 73%|███████▎  | 22/30 [03:28<00:36,  4.53s/it]

Error generating questions:429 RESOURCE_EXHAUSTED. {'error': {'code': 429, 'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits. To monitor your current usage, head to: https://ai.dev/rate-limit. \n* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_requests, limit: 20, model: gemini-2.5-flash\nPlease retry in 46.515394601s.', 'status': 'RESOURCE_EXHAUSTED', 'details': [{'@type': 'type.googleapis.com/google.rpc.Help', 'links': [{'description': 'Learn more about Gemini API quotas', 'url': 'https://ai.google.dev/gemini-api/docs/rate-limits'}]}, {'@type': 'type.googleapis.com/google.rpc.QuotaFailure', 'violations': [{'quotaMetric': 'generativelanguage.googleapis.com/generate_content_free_tier_requests', 'quotaId': 'GenerateRequestsPerDayPerProjectPerModel-FreeTier', 'quotaDimensions': {'location': 'global', 'model': 'gemin

 77%|███████▋  | 23/30 [03:28<00:22,  3.23s/it]

Error generating questions:429 RESOURCE_EXHAUSTED. {'error': {'code': 429, 'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits. To monitor your current usage, head to: https://ai.dev/rate-limit. \n* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_requests, limit: 20, model: gemini-2.5-flash\nPlease retry in 46.298797686s.', 'status': 'RESOURCE_EXHAUSTED', 'details': [{'@type': 'type.googleapis.com/google.rpc.Help', 'links': [{'description': 'Learn more about Gemini API quotas', 'url': 'https://ai.google.dev/gemini-api/docs/rate-limits'}]}, {'@type': 'type.googleapis.com/google.rpc.QuotaFailure', 'violations': [{'quotaMetric': 'generativelanguage.googleapis.com/generate_content_free_tier_requests', 'quotaId': 'GenerateRequestsPerDayPerProjectPerModel-FreeTier', 'quotaDimensions': {'location': 'global', 'model': 'gemin

 80%|████████  | 24/30 [03:28<00:13,  2.33s/it]

Error generating questions:429 RESOURCE_EXHAUSTED. {'error': {'code': 429, 'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits. To monitor your current usage, head to: https://ai.dev/rate-limit. \n* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_requests, limit: 20, model: gemini-2.5-flash\nPlease retry in 46.082644684s.', 'status': 'RESOURCE_EXHAUSTED', 'details': [{'@type': 'type.googleapis.com/google.rpc.Help', 'links': [{'description': 'Learn more about Gemini API quotas', 'url': 'https://ai.google.dev/gemini-api/docs/rate-limits'}]}, {'@type': 'type.googleapis.com/google.rpc.QuotaFailure', 'violations': [{'quotaMetric': 'generativelanguage.googleapis.com/generate_content_free_tier_requests', 'quotaId': 'GenerateRequestsPerDayPerProjectPerModel-FreeTier', 'quotaDimensions': {'location': 'global', 'model': 'gemin

 83%|████████▎ | 25/30 [03:28<00:08,  1.70s/it]

Error generating questions:429 RESOURCE_EXHAUSTED. {'error': {'code': 429, 'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits. To monitor your current usage, head to: https://ai.dev/rate-limit. \n* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_requests, limit: 20, model: gemini-2.5-flash\nPlease retry in 45.857180502s.', 'status': 'RESOURCE_EXHAUSTED', 'details': [{'@type': 'type.googleapis.com/google.rpc.Help', 'links': [{'description': 'Learn more about Gemini API quotas', 'url': 'https://ai.google.dev/gemini-api/docs/rate-limits'}]}, {'@type': 'type.googleapis.com/google.rpc.QuotaFailure', 'violations': [{'quotaMetric': 'generativelanguage.googleapis.com/generate_content_free_tier_requests', 'quotaId': 'GenerateRequestsPerDayPerProjectPerModel-FreeTier', 'quotaDimensions': {'location': 'global', 'model': 'gemin

 87%|████████▋ | 26/30 [03:29<00:05,  1.25s/it]

Error generating questions:429 RESOURCE_EXHAUSTED. {'error': {'code': 429, 'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits. To monitor your current usage, head to: https://ai.dev/rate-limit. \n* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_requests, limit: 20, model: gemini-2.5-flash\nPlease retry in 45.630138197s.', 'status': 'RESOURCE_EXHAUSTED', 'details': [{'@type': 'type.googleapis.com/google.rpc.Help', 'links': [{'description': 'Learn more about Gemini API quotas', 'url': 'https://ai.google.dev/gemini-api/docs/rate-limits'}]}, {'@type': 'type.googleapis.com/google.rpc.QuotaFailure', 'violations': [{'quotaMetric': 'generativelanguage.googleapis.com/generate_content_free_tier_requests', 'quotaId': 'GenerateRequestsPerDayPerProjectPerModel-FreeTier', 'quotaDimensions': {'location': 'global', 'model': 'gemin

 90%|█████████ | 27/30 [03:29<00:02,  1.06it/s]

Error generating questions:429 RESOURCE_EXHAUSTED. {'error': {'code': 429, 'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits. To monitor your current usage, head to: https://ai.dev/rate-limit. \n* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_requests, limit: 20, model: gemini-2.5-flash\nPlease retry in 45.417384234s.', 'status': 'RESOURCE_EXHAUSTED', 'details': [{'@type': 'type.googleapis.com/google.rpc.Help', 'links': [{'description': 'Learn more about Gemini API quotas', 'url': 'https://ai.google.dev/gemini-api/docs/rate-limits'}]}, {'@type': 'type.googleapis.com/google.rpc.QuotaFailure', 'violations': [{'quotaMetric': 'generativelanguage.googleapis.com/generate_content_free_tier_requests', 'quotaId': 'GenerateRequestsPerDayPerProjectPerModel-FreeTier', 'quotaDimensions': {'location': 'global', 'model': 'gemin

 93%|█████████▎| 28/30 [03:29<00:01,  1.38it/s]

Error generating questions:429 RESOURCE_EXHAUSTED. {'error': {'code': 429, 'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits. To monitor your current usage, head to: https://ai.dev/rate-limit. \n* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_requests, limit: 20, model: gemini-2.5-flash\nPlease retry in 45.195801123s.', 'status': 'RESOURCE_EXHAUSTED', 'details': [{'@type': 'type.googleapis.com/google.rpc.Help', 'links': [{'description': 'Learn more about Gemini API quotas', 'url': 'https://ai.google.dev/gemini-api/docs/rate-limits'}]}, {'@type': 'type.googleapis.com/google.rpc.QuotaFailure', 'violations': [{'quotaMetric': 'generativelanguage.googleapis.com/generate_content_free_tier_requests', 'quotaId': 'GenerateRequestsPerDayPerProjectPerModel-FreeTier', 'quotaDimensions': {'model': 'gemini-2.5-flash', 'locatio

 97%|█████████▋| 29/30 [03:29<00:00,  1.74it/s]

Error generating questions:429 RESOURCE_EXHAUSTED. {'error': {'code': 429, 'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits. To monitor your current usage, head to: https://ai.dev/rate-limit. \n* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_requests, limit: 20, model: gemini-2.5-flash\nPlease retry in 44.980093055s.', 'status': 'RESOURCE_EXHAUSTED', 'details': [{'@type': 'type.googleapis.com/google.rpc.Help', 'links': [{'description': 'Learn more about Gemini API quotas', 'url': 'https://ai.google.dev/gemini-api/docs/rate-limits'}]}, {'@type': 'type.googleapis.com/google.rpc.QuotaFailure', 'violations': [{'quotaMetric': 'generativelanguage.googleapis.com/generate_content_free_tier_requests', 'quotaId': 'GenerateRequestsPerDayPerProjectPerModel-FreeTier', 'quotaDimensions': {'location': 'global', 'model': 'gemin

100%|██████████| 30/30 [03:29<00:00,  7.00s/it]

Error generating questions:429 RESOURCE_EXHAUSTED. {'error': {'code': 429, 'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits. To monitor your current usage, head to: https://ai.dev/rate-limit. \n* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_requests, limit: 20, model: gemini-2.5-flash\nPlease retry in 44.754514998s.', 'status': 'RESOURCE_EXHAUSTED', 'details': [{'@type': 'type.googleapis.com/google.rpc.Help', 'links': [{'description': 'Learn more about Gemini API quotas', 'url': 'https://ai.google.dev/gemini-api/docs/rate-limits'}]}, {'@type': 'type.googleapis.com/google.rpc.QuotaFailure', 'violations': [{'quotaMetric': 'generativelanguage.googleapis.com/generate_content_free_tier_requests', 'quotaId': 'GenerateRequestsPerDayPerProjectPerModel-FreeTier', 'quotaDimensions': {'location': 'global', 'model': 'gemin




In [None]:
len(question_answer_pairs)

In [None]:
import os
import json

output_directory = "../data/question_answer_pairs/"
os.makedirs(output_directory, exist_ok=True)

file_name = file_path.split("/")[-1].replace(".json", "_qa_pairs.json")
output_file_path = os.path.join(output_directory, file_name)

with open(output_file_path, "w") as f:
    json.dump(question_answer_pairs, f, indent=2)