In [67]:
import requests
from loguru import logger
from pathlib import Path
import os
import uuid
from typing import Generator
import numpy as np
import simplejson as json
import torch
from justatom.tooling.dataset import source_from_dataset
from justatom.etc.schema import Document
from more_itertools import chunked
import json_repair
import polars as pl

from justatom.storing.weaviate import Finder as WeaviateApi

from tqdm import tqdm

### ✔️ANN document store backed by <a href="https://github.com/weaviate/weaviate">weaviate</a>

> First, let's make sure Docker is up and running. From the project root, run:
```bash
docker-compose up -d
```

❗️ By default, Weaviate will run on port `2211`.
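
Before connecting, it can help to confirm that the container is actually accepting requests. The sketch below polls Weaviate's readiness endpoint (`/v1/.well-known/ready`) using only the standard library. The helper name `weaviate_is_ready` and the timeout value are illustrative assumptions and are not part of `justatom`.

```python
import http.client
import urllib.error
import urllib.request


def weaviate_is_ready(host: str = "localhost", port: int = 2211, timeout: float = 2.0) -> bool:
    """Return True if the Weaviate readiness probe answers with HTTP 200."""
    url = f"http://{host}:{port}/v1/.well-known/ready"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx answer.
        return False
```

If this returns `False` shortly after `docker-compose up -d`, the container may still be starting, so retrying after a few seconds is usually enough.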

In [68]:
collection_name = "JUSTATOM_COLLECTION"
weaviate_host, weaviate_port, weaviate_grpc_port = "localhost", 2211, 50051

In [69]:
store = await WeaviateApi.find(collection_name, WEAVIATE_HOST=weaviate_host, WEAVIATE_PORT=weaviate_port, WEAVIATE_GRPC_PORT=weaviate_grpc_port)

2025-04-01 20:02:19.254 | INFO     | justatom.storing.weaviate:find:597 - FINDER | collection_name=[JUSTATOM_COLLECTION]
2025-04-01 20:02:19.254 | INFO     | justatom.storing.weaviate:connect:185 - FINDER | collection_schema_name=[JUSTATOM_COLLECTION]


In [70]:
n_docs = await store.count_documents()

In [71]:
logger.info(f"For the collection=[{collection_name}] you have N=[{n_docs}] documents")

2025-04-01 20:02:22.208 | INFO     | __main__:<module>:1 - For the collection=[JUSTATOM_COLLECTION] you have N=[4992] documents


In [72]:
async with store._client:
    all_collections = await store._client.collections.list_all(simple=True)

In [73]:
logger.info(f"COLLECTION | [{', '.join(all_collections)}]")

2025-04-01 20:02:29.437 | INFO     | __main__:<module>:1 - COLLECTION | [Sber_collection, X, Just_collection, Justatom_collection]


### ✔️ Prepare datasets

> For this tutorial we will use the built-in dataset `polaroids.ai`. It contains paragraphs taken from various moments in movies, games and books.

In [74]:
dataset_name_or_path = Path(os.getcwd()) / ".data" / "polaroids.ai.data.json"

In [75]:
pl_docs = source_from_dataset(dataset_name_or_path)

In [76]:
logger.info(f"Columns=[{' | '.join(pl_docs.columns)}]")

2025-04-01 20:02:39.638 | INFO     | __main__:<module>:1 - Columns=[title | author | type | has_image | img_path | speaker | keywords_or_phrases | chunk_id | content | queries | answers | are_contexts_present | content_right]


>❗️Note that the `content` and `id` columns are required: together they describe each "chunk". All other fields are optional and will be added to `meta`.

In [77]:
# We have `chunk_id` but not `id`. Let's add it as well.

pl_docs = pl_docs.with_columns([
    pl.col("chunk_id").alias("id")
])

> ❗️Let's filter out chunks that have `null` in any of the required columns; otherwise the pipeline will fail.

In [78]:
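# Use explicit null checks: comparing a column with `None` via `!=` is not a reliable null test in polars.
pl_docs = pl_docs.filter(pl.col("content").is_not_null() & pl.col("id").is_not_null())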

In [79]:
logger.info(f"There are D=[{pl_docs.shape[0]}] unique documents")

2025-04-01 20:02:43.992 | INFO     | __main__:<module>:1 - There are D=[4992] unique documents


>❗️We would like to keep `keywords_or_phrases` and the relevant `queries` for each chunk. Let's declare them, along with the original `chunk_id`, so that the dataset's own identifiers are preserved alongside the UUIDs that Weaviate generates internally.

> ❗️❗️ Each chunk is associated with a list of relevant queries that describe it.

<small>

|  queries: list[str]   |     content: str     |   chunk_id: str   |
|:---------------------:|:--------------------:|:-----------------:|
| 1. ...thinking about 'The Hunger Games' mechanics, if you were in the same shoes as Gale, entering your name forty-two times to feed your fam, how would you strategize your game in the actual Arena? Would you team up or go solo based on these high stakes? <br><br>2.In the universe of 'The Hunger Games', what are tesserae and what do they offer to the participants in the Harvest?    | And here's where the real interest begins. Suppose you're poor and starving. Then you can ask to be included in the Harvest more times than you're entitled to, and in return you'd get tesserae. They give you grain and oil for a whole year for one tessera per person. You won't be full, but it's better than nothing. You can take tesserae for the whole family. When I was twelve, my name was entered four times. Once by law, and once more for tesserae for Prim, my mother, and myself. The next years had to do the same. And since the price of a tessera increases by one entry each year, now that I've turned sixteen, my name will be on twenty cards. Gale is eighteen, and he's been feeding a family of five for seven years. His name will be entered forty two times! It's clear that people like Madge, who has never had to risk because of tesserae, annoy Gale. Next to us, the inhabitants of the slag heap, she simply has no chance of getting into the games. Well, almost no chance. Of course, the rules are set by the Capitol, not the districts, let alone Madge's relatives, and it's still hard to sympathize with those who, like you, don't have to trade their own skin for a piece of bread.  | 80504cd8-9b21-514c-b001-4761d8c71044         |
| 1.In 'Harry Potter and the Philosopher's Stone', what misconception had Harry and Hermione initially had about Snape's intentions before learning the truth? <br><br>2. Hey peeps, why is Harry all jittery and pacing around the room even after telling Hermione about the whole Snape and Voldemort situation?        | Ron was asleep in the common room - apparently, he had been waiting for their return and had dozed off unnoticed. When Harry roughly shook him, Ron began to yell something about breaking the rules of a game, as if he were dreaming about a Quidditch match. However, after a few seconds, Ron completely woke up and, with his eyes wide open, listened to the story of Hermione and Harry. Harry was so excited that he could not sit still and paced back and forth across the room, trying to stay as close to the fireplace as possible. He was still shaking with cold. 'Snape wants to steal the stone for Voldemort. And Voldemort is waiting in the forest... And all this time we thought Snape wanted to steal the stone to become rich... And Voldemort...'  | 5ad25a92-28d9-5971-a81b-4f795898eeab         |
| 1. Hey fellow gamers, in The Hunger Games universe, if you were in a match where your ally was taken down first like Rue, how would you strategize your next move to survive against top opponents like Cato?<br><br> 2. In the 'Hunger Games' novel, why does Cato decide to spare Katniss's life after their encounter?    | What was she babbling about? You're Rue's ally? - I... I... we teamed up. We blew up the food of the Pros. I wanted to save her. Really did. But he found her first, the guy from District One - I say. Perhaps if Cato knows I helped Rue, he will kill me quickly and painlessly. - Did you kill him? - he asks grimly. - Yes. I killed him. And I covered her body with flowers. I sang to her till she fell asleep. Tears well up in my eyes. Will and strength are leaving me. There's only Rue, the pain in my head, fear of Cato and the moan of the dying girl. - Fell asleep? - mocks Cato. - Died. I sang to her till she died - I say. - Your district... sent me bread. I raise my hand - not for an arrow; I won't have time anyway. I just blow my nose. - Cato, make it quick, okay? His face shows conflicting emotions. Cato puts down the rock and says with almost a reproach: - This time, only this time, I'm letting you go. For the girl. We are even. No one owes anything to anyone anymore, understand? I nod, because I do understand. Understand about debts. About how bad it is to have them. Understand that if Cato wins, he will return to a district that has forgotten the rules to thank me. And Cato is neglecting them, too. Right now, he's not going to crack my head with a stone.  | b317200c-7fd3-5804-bbe4-bff33432ad0e         |

</small>

In [80]:
columns_to_include = [
    "keywords_or_phrases",
    "chunk_id",
    "queries",
]

In [81]:
def wrapper_for_docs(
    pl_data: pl.DataFrame,
    content_field: str,
    keywords_or_phrases_field: str | None = None,
    batch_size: int = 128,
    dataframe_field: str | None = None,
    id_field: str | None = None,
    columns_to_include: list[str] | None = None,
    filters: dict | None = None,
) -> Generator[dict, None, None]:
    """Yield one dict per chunk: `content`, optional `id` / `dataframe`, and `meta`."""
    # NOTE: `keywords_or_phrases_field`, `batch_size` and `filters` are accepted but unused here.
    columns_to_include = columns_to_include or []  # `None` means "no extra meta fields"
    for js_chunk in tqdm(pl_data.to_dicts()):
        js_doc = dict(content=js_chunk[content_field])
        if id_field is not None:
            js_doc["id"] = js_chunk[id_field]
        if dataframe_field is not None:
            js_doc["dataframe"] = js_chunk[dataframe_field]
        js_doc["meta"] = {k: js_chunk[k] for k in columns_to_include}
        yield js_doc

In [82]:
js_docs = list(wrapper_for_docs(
    pl_docs,
    content_field="content",
    dataframe_field="title",
    id_field="chunk_id",
    columns_to_include=columns_to_include,
))

100%|██████████| 4992/4992 [00:00<00:00, 1890390.54it/s]


### Modeling

> See <a href="https://huggingface.co/intfloat/multilingual-e5-large">E5 large</a> , <a href="https://huggingface.co/intfloat/multilingual-e5-base">E5 base</a>, <a href="https://huggingface.co/intfloat/multilingual-e5-small">E5 small</a> family of encoder models. More coming soon
 
> 📎 <a href="https://arxiv.org/abs/2212.03533">paper</a>

> ❗️For this tutorial we pick the base model, `intfloat/multilingual-e5-base`, as a trade-off between speed and retrieval quality.

In [83]:
model_name_or_path = "intfloat/multilingual-e5-base"

from justatom.modeling.mask import ILanguageModel
from justatom.running.m1 import M1LMRunner
from justatom.processing import INFERProcessor, ITokenizer
lm_model = ILanguageModel.load(model_name_or_path)

2025-04-01 20:02:55.625 | INFO     | justatom.modeling.mask:load:144 - Loading from huggingface hub via "intfloat/multilingual-e5-base"


In [84]:
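def maybe_cuda_or_mps():
    # `is_available()` checks that an MPS device can actually be used;
    # `is_built()` only reports whether PyTorch was compiled with MPS support.
    if torch.backends.mps.is_available():
        return "mps"
    elif torch.cuda.is_available():
        return "cuda:0"
    else:
        return "cpu"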

In [85]:
device = maybe_cuda_or_mps()

In [86]:
runner = M1LMRunner(model=lm_model, prediction_heads=[], device=device)

2025-04-01 20:03:00.030 | INFO     | justatom.running.m1:to:33 - Moving to device mps


In [87]:
processor = INFERProcessor(ITokenizer.from_pretrained(model_name_or_path))

❗️According to the <a href="https://arxiv.org/abs/2212.03533">paper</a>, the E5 family is trained asymmetrically, which means:

> Use `"query: "` and `"passage: "` correspondingly for asymmetric tasks such as passage retrieval in open QA, ad-hoc information retrieval.

> Use `"query: "` prefix for symmetric tasks such as semantic similarity, bitext mining, paraphrase retrieval.

> Use `"query: "` prefix if you want to use embeddings as features, such as linear probing classification, clustering.
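
To make the prefix rule concrete, here is a minimal sketch of what it does to raw strings. In this notebook the prefix is applied through `processor.prefix`; the helper `with_e5_prefix` below is hypothetical and only illustrates the effect, and the example sentences are made up.

```python
def with_e5_prefix(texts: list[str], kind: str) -> list[str]:
    """Prepend the E5 prefix ("query: " or "passage: ") to each text."""
    prefixes = {"query": "query: ", "passage": "passage: "}
    if kind not in prefixes:
        raise ValueError(f"kind must be one of {sorted(prefixes)}, got {kind!r}")
    prefix = prefixes[kind]
    # Avoid double-prefixing text that already carries the prefix.
    return [t if t.startswith(prefix) else prefix + t for t in texts]


docs = with_e5_prefix(["Gale is eighteen."], kind="passage")
questions = with_e5_prefix(["How old is Gale?"], kind="query")
print(docs[0])       # passage: Gale is eighteen.
print(questions[0])  # query: How old is Gale?
```

Indexed chunks get `"passage: "` and search queries get `"query: "`, which is why the prefix is switched between indexing and retrieval.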

In [88]:
processor.prefix = "passage: "

> Let's put everything together in one simple abstraction - `Indexer`

In [89]:
from justatom.running.indexer import API as IndexerAPI

# 1. "embedding" is the way to index the given ANN store (weaviate)
# 2. runner is responsible for mapping docs to embeddings
# 3. processor is responsible for tokenizing given chunks
# 4. device - compute everything on selected `device`

ix_runner = IndexerAPI.named("embedding", runner=runner, store=store, processor=processor, device=device)

2025-04-01 20:03:03.239 | INFO     | justatom.running.m1:to:33 - Moving to device mps


In [30]:
async for js_batch_docs in ix_runner.index(js_docs, batch_size=512, batch_size_per_request=128):
    pass

Preprocessing dataset:   0%|          | 0/10 [00:00<?, ? Dicts/s]

0it [00:00, ?it/s]

❗️Because the E5 family is trained asymmetrically (see the <a href="https://arxiv.org/abs/2212.03533">paper</a>), we have to set `prefix` back to `query: ` before encoding queries.

In [90]:
processor.prefix = "query: "

In [91]:
queries = [
    "thinking about 'The Hunger Games' mechanics, if you were in the same shoes as Gale, entering your name forty-two times to feed your fam, how would you strategize your game in the actual Arena? Would you team up or go solo based on these high stakes?",
    "In the universe of 'The Hunger Games', what are tesserae and what do they offer to the participants in the Harvest?",
    "In 'Harry Potter and the Philosopher's Stone', what misconception had Harry and Hermione initially had about Snape's intentions before learning the truth?",
    "Hey peeps, why is Harry all jittery and pacing around the room even after telling Hermione about the whole Snape and Voldemort situation?",
    "Hey fellow gamers, in The Hunger Games universe, if you were in a match where your ally was taken down first like Rue, how would you strategize your next move to survive against top opponents like Cato?",
    "In the 'Hunger Games' novel, why does Cato decide to spare Katniss's life after their encounter?"
]

In [92]:
from justatom.running.retriever import API as RetrieverApi

#### Pure keywords search

In [93]:
retriever = RetrieverApi.named("keywords", store=store)

In [95]:
filters = {
    "operator": "AND",
    "conditions": [
        {
            "field": "dataframe",
            "operator": "==",
            # Russian title of "Mockingjay"; kept verbatim so it matches the stored value.
            "value": "Сойка-пересмешница"
        }
    ]
}
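
To show how a nested filter of this shape can be read, here is a small, purely illustrative matcher over plain dictionaries. It is an assumption about the semantics (logical `AND`/`OR` over field comparisons) and not how `justatom` or Weaviate evaluates filters internally. The function `matches` and the sample documents are hypothetical.

```python
import operator

# Comparison operators supported by this illustrative matcher.
_COMPARATORS = {"==": operator.eq, "!=": operator.ne}


def matches(doc: dict, flt: dict) -> bool:
    """Evaluate a nested filter dict against a flat document dict."""
    if "conditions" in flt:
        results = (matches(doc, cond) for cond in flt["conditions"])
        if flt["operator"] == "AND":
            return all(results)
        if flt["operator"] == "OR":
            return any(results)
        raise ValueError(f"unsupported logical operator: {flt['operator']}")
    compare = _COMPARATORS[flt["operator"]]
    return compare(doc.get(flt["field"]), flt["value"])


example_filter = {
    "operator": "AND",
    "conditions": [
        {"field": "dataframe", "operator": "==", "value": "Сойка-пересмешница"}
    ],
}
print(matches({"dataframe": "Сойка-пересмешница"}, example_filter))  # True
print(matches({"dataframe": "some other title"}, example_filter))    # False
```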

In [96]:
await store.count_documents()

4992

In [97]:
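for pos, query in enumerate(queries):
    response = await retriever.retrieve_topk(query, top_k=1, filters=filters)
    # Guard against an empty result set (e.g. when the filter matches nothing).
    content = response[0][0].content if len(response[0]) > 0 else "<EMPTY>"
    print(content)
    if pos < len(queries) - 1:
        print("\n")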

2025-04-01 20:03:35.795 | INFO     | justatom.storing.weaviate:search_by_keywords:506 - SEARCH | algo=[BM25] | collection_name=[Justatom_collection]
2025-04-01 20:03:35.968 | INFO     | justatom.storing.weaviate:search_by_keywords:506 - SEARCH | algo=[BM25] | collection_name=[Justatom_collection]
2025-04-01 20:03:36.131 | INFO     | justatom.storing.weaviate:search_by_keywords:506 - SEARCH | algo=[BM25] | collection_name=[Justatom_collection]


Wearing headphones, I heard Gale's voice telling me to come back. However, the Hunger Games backpack reminded me of something else. Hooking the bag's strap over the back of the chair, I sprinted up the steps to my bedroom. Inside the closet, there was my father's hunting jacket. Before the Suppression, I had brought it here from our old home, thinking that its presence would calm my mother and sister when I die.


Now, after all the hustle and bustle is over and we've reached our goal, I realize I have no idea what I'm going to face in District 8. In fact, I know nothing about the state of the war or what victory will cost. Or, what will happen if we win. Plutarch tries to explain everything to me in simple words. First of all, every district is now at war with the Capitol, except for the second one, which has always been under the patronage of our enemies, despite their participation in the Hunger Games. They received more food and better living conditions. After the Dark Days and the

2025-04-01 20:03:36.403 | INFO     | justatom.storing.weaviate:search_by_keywords:506 - SEARCH | algo=[BM25] | collection_name=[Justatom_collection]
2025-04-01 20:03:36.557 | INFO     | justatom.storing.weaviate:search_by_keywords:506 - SEARCH | algo=[BM25] | collection_name=[Justatom_collection]


Extreme measures were applied in District Twelve after I intervened in Gale's punishment. My stylist, Cinna, covered in blood, beaten and unaware of what was happening, in the Launch Room before the start of the Games. Plutarch's sources believed he had been killed during interrogation. The beautiful, enigmatic, and wonderful Cinna was dead because of me. I pushed this thought away, as it was too painful to continue thinking about it without losing my fragile composure. What am I going to do?


What am I going to do?- I whisper to the walls. Because I really don't know that. People keep telling me, telling, telling, telling. Plutarch Heavensbee. His right hand Flavia Cardia. All the gathering of District leaders. Military officials. But not Alma Coin, the president of Thirteen, who is always as punctual as a clock. She was about fifty, and she had gray hair that fell freely onto her shoulders. I'm somewhat charmed by her hair, as she can arrange it into a very tidy bun. Her eyes are gr

2025-04-01 20:03:36.816 | INFO     | justatom.storing.weaviate:search_by_keywords:506 - SEARCH | algo=[BM25] | collection_name=[Justatom_collection]


The melody of her death, playing in the background of her murder. The birds didn’t know this. They immediately picked up the simple tune and started hopping back and forth, creating a sweet harmony. Like they did during the Hunger Games, before the killer bees were dropped from the tree, before the race to the Cornucopia, and before Cato was slowly and bloodily gnawed away... "Want to hear them sing a real song?" I blurted out. Nothing can stop these memories. I’m on my feet, moving back into the forest, placing my hand on the rough trunk of the maple where the birds are sitting.


The melody of her death, playing in the background of her murder. The birds didn’t know this. They immediately picked up the simple tune and started hopping back and forth, creating a sweet harmony. Like they did during the Hunger Games, before the killer bees were dropped from the tree, before the race to the Cornucopia, and before Cato was slowly and bloodily gnawed away... "Want to hear them sing a real s

#### Search by embedding

In [98]:
retriever = RetrieverApi.named("embedding", store=store, runner=runner, processor=processor, device=device)

In [99]:
len(queries)

6

In [100]:
for pos, query in enumerate(queries):
    response = await retriever.retrieve_topk(query, top_k=1)
    content = response[0][0].content if len(response[0]) > 0 else "<EMPTY>"
    print(content)
    if pos < len(queries) - 1:
        print("\n")

Preprocessing dataset:   0%|          | 0/1 [00:00<?, ? Dicts/s]

And here's where the real interest begins. Suppose you're poor and starving. Then you can ask to be included in the Harvest more times than you're entitled to, and in return you'd get tesserae. They give you grain and oil for a whole year for one tessera per person. You won't be full, but it's better than nothing. You can take tesserae for the whole family. When I was twelve, my name was entered four times. Once by law, and once more for tesserae for Prim, my mother, and myself. The next years had to do the same. And since the price of a tessera increases by one entry each year, now that I've turned sixteen, my name will be on twenty cards. Gale is eighteen, and he's been feeding a family of five for seven years. His name will be entered forty two times! It's clear that people like Madge, who has never had to risk because of tesserae, annoy Gale. Next to us, the inhabitants of the slag heap, she simply has no chance of getting into the games. Well, almost no chance. Of course, the rule

Preprocessing dataset:   0%|          | 0/1 [00:00<?, ? Dicts/s]

And here's where the real interest begins. Suppose you're poor and starving. Then you can ask to be included in the Harvest more times than you're entitled to, and in return you'd get tesserae. They give you grain and oil for a whole year for one tessera per person. You won't be full, but it's better than nothing. You can take tesserae for the whole family. When I was twelve, my name was entered four times. Once by law, and once more for tesserae for Prim, my mother, and myself. The next years had to do the same. And since the price of a tessera increases by one entry each year, now that I've turned sixteen, my name will be on twenty cards. Gale is eighteen, and he's been feeding a family of five for seven years. His name will be entered forty two times! It's clear that people like Madge, who has never had to risk because of tesserae, annoy Gale. Next to us, the inhabitants of the slag heap, she simply has no chance of getting into the games. Well, almost no chance. Of course, the rule

Preprocessing dataset:   0%|          | 0/1 [00:00<?, ? Dicts/s]

There have been many rumors over the centuries that the Philosopher's Stone has already been created, but the only existing stone today belongs to Mr. Nicholas Flamel, a distinguished alchemist and opera fanatic. Mr. Flamel, who celebrated his six hundred and sixty-fifth birthday last year, enjoys the peace and solitude in Devon with his wife Perenelle (six hundred and fifty-eight years old). 
'Understood?' Hermione asked when Harry and Ron finished reading. 'It must be, the dog safeguards Flamel's philosopher's stone! I have no doubt that he asked Dumbledore to do this, because they are friends and also because Flamel knew that someone was hunting for his stone. That's why he wanted the stone to be withdrawn from Gringotts! 
'The stone that turns everything into gold and guarantees you immortality!' Harry exclaimed. 'No wonder Snape wants to steal it. Anyone would want such a stone.




Preprocessing dataset:   0%|          | 0/1 [00:00<?, ? Dicts/s]

Ron was asleep in the common room - apparently, he had been waiting for their return and had dozed off unnoticed. When Harry roughly shook him, Ron began to yell something about breaking the rules of a game, as if he were dreaming about a Quidditch match. However, after a few seconds, Ron completely woke up and, with his eyes wide open, listened to the story of Hermione and Harry. Harry was so excited that he could not sit still and paced back and forth across the room, trying to stay as close to the fireplace as possible. He was still shaking with cold. 'Snape wants to steal the stone for Voldemort. And Voldemort is waiting in the forest... And all this time we thought Snape wanted to steal the stone to become rich... And Voldemort...'




Preprocessing dataset:   0%|          | 0/1 [00:00<?, ? Dicts/s]

What was she babbling about? You're Rue's ally? - I... I... we teamed up. We blew up the food of the Pros. I wanted to save her. Really did. But he found her first, the guy from District One - I say. Perhaps if Cato knows I helped Rue, he will kill me quickly and painlessly. - Did you kill him? - he asks grimly. - Yes. I killed him. And I covered her body with flowers. I sang to her till she fell asleep. Tears well up in my eyes. Will and strength are leaving me. There's only Rue, the pain in my head, fear of Cato and the moan of the dying girl. - Fell asleep? - mocks Cato. - Died. I sang to her till she died - I say. - Your district... sent me bread. I raise my hand - not for an arrow; I won't have time anyway. I just blow my nose. - Cato, make it quick, okay? His face shows conflicting emotions. Cato puts down the rock and says with almost a reproach: - This time, only this time, I'm letting you go. For the girl. We are even. No one owes anything to anyone anymore, understand? I nod,

Preprocessing dataset:   0%|          | 0/1 [00:00<?, ? Dicts/s]

What was she babbling about? You're Rue's ally? - I... I... we teamed up. We blew up the food of the Pros. I wanted to save her. Really did. But he found her first, the guy from District One - I say. Perhaps if Cato knows I helped Rue, he will kill me quickly and painlessly. - Did you kill him? - he asks grimly. - Yes. I killed him. And I covered her body with flowers. I sang to her till she fell asleep. Tears well up in my eyes. Will and strength are leaving me. There's only Rue, the pain in my head, fear of Cato and the moan of the dying girl. - Fell asleep? - mocks Cato. - Died. I sang to her till she died - I say. - Your district... sent me bread. I raise my hand - not for an arrow; I won't have time anyway. I just blow my nose. - Cato, make it quick, okay? His face shows conflicting emotions. Cato puts down the rock and says with almost a reproach: - This time, only this time, I'm letting you go. For the girl. We are even. No one owes anything to anyone anymore, understand? I nod,

#### Search by embedding AND keywords
> ❓How do we combine them? First, introduce a parameter called `alpha`, which can be any value from 0.0 to 1.0. 

> When `alpha = 0.0`, the search relies entirely on keywords (pure keyword search). 

> When `alpha = 1.0`, it uses only semantic embeddings.
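
The effect of `alpha` can be sketched with one common fusion scheme: min-max normalise each score list, then take a convex combination. This is an illustration under stated assumptions; the fusion actually applied by the store may differ. The functions `min_max` and `hybrid_scores` and all scores below are hypothetical.

```python
def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale scores to [0, 1]; a constant score set maps to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def hybrid_scores(keyword: dict[str, float], vector: dict[str, float], alpha: float) -> dict[str, float]:
    """Blend normalised scores: alpha weights the vector side, (1 - alpha) the keyword side."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0.0, 1.0]")
    kw, vec = min_max(keyword), min_max(vector)
    # A document missing from one result list contributes 0 for that side.
    return {
        doc_id: alpha * vec.get(doc_id, 0.0) + (1.0 - alpha) * kw.get(doc_id, 0.0)
        for doc_id in kw.keys() | vec.keys()
    }


bm25 = {"a": 12.0, "b": 7.0, "c": 2.0}      # hypothetical BM25 scores
cosine = {"a": 0.71, "b": 0.93, "c": 0.80}  # hypothetical cosine similarities

fused = hybrid_scores(bm25, cosine, alpha=0.78)
best = max(fused, key=fused.get)
print(best)  # b
```

With `alpha=0.0` the same data ranks `a` first (the keyword winner), and with `alpha=1.0` it ranks `b` first (the embedding winner), matching the two extremes described above.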

In [101]:
retriever = RetrieverApi.named("hybrid", store=store, processor=processor, runner=runner)

2025-04-01 20:03:55.691 | INFO     | justatom.running.retriever:__init__:212 - Moving [M1LMRunner] to the new device = cpu. Old device = mps
2025-04-01 20:03:55.692 | INFO     | justatom.running.m1:to:33 - Moving to device cpu


In [102]:
for pos, query in enumerate(queries):
    response = await retriever.retrieve_topk(query, top_k=1, alpha=0.78)
    content = response[0][0].content if len(response[0]) > 0 else "<EMPTY>"
    print(content)
    if pos < len(queries) - 1:
        print("\n")

Preprocessing dataset:   0%|          | 0/1 [00:00<?, ? Dicts/s]

And here's where the real interest begins. Suppose you're poor and starving. Then you can ask to be included in the Harvest more times than you're entitled to, and in return you'd get tesserae. They give you grain and oil for a whole year for one tessera per person. You won't be full, but it's better than nothing. You can take tesserae for the whole family. When I was twelve, my name was entered four times. Once by law, and once more for tesserae for Prim, my mother, and myself. The next years had to do the same. And since the price of a tessera increases by one entry each year, now that I've turned sixteen, my name will be on twenty cards. Gale is eighteen, and he's been feeding a family of five for seven years. His name will be entered forty two times! It's clear that people like Madge, who has never had to risk because of tesserae, annoy Gale. Next to us, the inhabitants of the slag heap, she simply has no chance of getting into the games. Well, almost no chance. Of course, the rule

Preprocessing dataset:   0%|          | 0/1 [00:00<?, ? Dicts/s]

And here's where the real interest begins. Suppose you're poor and starving. Then you can ask to be included in the Harvest more times than you're entitled to, and in return you'd get tesserae. They give you grain and oil for a whole year for one tessera per person. You won't be full, but it's better than nothing. You can take tesserae for the whole family. When I was twelve, my name was entered four times. Once by law, and once more for tesserae for Prim, my mother, and myself. The next years had to do the same. And since the price of a tessera increases by one entry each year, now that I've turned sixteen, my name will be on twenty cards. Gale is eighteen, and he's been feeding a family of five for seven years. His name will be entered forty two times! It's clear that people like Madge, who has never had to risk because of tesserae, annoy Gale. Next to us, the inhabitants of the slag heap, she simply has no chance of getting into the games. Well, almost no chance. Of course, the rule

Preprocessing dataset:   0%|          | 0/1 [00:00<?, ? Dicts/s]

There have been many rumors over the centuries that the Philosopher's Stone has already been created, but the only existing stone today belongs to Mr. Nicholas Flamel, a distinguished alchemist and opera fanatic. Mr. Flamel, who celebrated his six hundred and sixty-fifth birthday last year, enjoys the peace and solitude in Devon with his wife Perenelle (six hundred and fifty-eight years old). 
'Understood?' Hermione asked when Harry and Ron finished reading. 'It must be, the dog safeguards Flamel's philosopher's stone! I have no doubt that he asked Dumbledore to do this, because they are friends and also because Flamel knew that someone was hunting for his stone. That's why he wanted the stone to be withdrawn from Gringotts! 
'The stone that turns everything into gold and guarantees you immortality!' Harry exclaimed. 'No wonder Snape wants to steal it. Anyone would want such a stone.




Preprocessing dataset:   0%|          | 0/1 [00:00<?, ? Dicts/s]

Ron was asleep in the common room - apparently, he had been waiting for their return and had dozed off unnoticed. When Harry roughly shook him, Ron began to yell something about breaking the rules of a game, as if he were dreaming about a Quidditch match. However, after a few seconds, Ron completely woke up and, with his eyes wide open, listened to the story of Hermione and Harry. Harry was so excited that he could not sit still and paced back and forth across the room, trying to stay as close to the fireplace as possible. He was still shaking with cold. 'Snape wants to steal the stone for Voldemort. And Voldemort is waiting in the forest... And all this time we thought Snape wanted to steal the stone to become rich... And Voldemort...'




Preprocessing dataset:   0%|          | 0/1 [00:00<?, ? Dicts/s]

What was she babbling about? You're Rue's ally? - I... I... we teamed up. We blew up the food of the Pros. I wanted to save her. Really did. But he found her first, the guy from District One - I say. Perhaps if Cato knows I helped Rue, he will kill me quickly and painlessly. - Did you kill him? - he asks grimly. - Yes. I killed him. And I covered her body with flowers. I sang to her till she fell asleep. Tears well up in my eyes. Will and strength are leaving me. There's only Rue, the pain in my head, fear of Cato and the moan of the dying girl. - Fell asleep? - mocks Cato. - Died. I sang to her till she died - I say. - Your district... sent me bread. I raise my hand - not for an arrow; I won't have time anyway. I just blow my nose. - Cato, make it quick, okay? His face shows conflicting emotions. Cato puts down the rock and says with almost a reproach: - This time, only this time, I'm letting you go. For the girl. We are even. No one owes anything to anyone anymore, understand? I nod,

Preprocessing dataset:   0%|          | 0/1 [00:00<?, ? Dicts/s]

What was she babbling about? You're Rue's ally? - I... I... we teamed up. We blew up the food of the Pros. I wanted to save her. Really did. But he found her first, the guy from District One - I say. Perhaps if Cato knows I helped Rue, he will kill me quickly and painlessly. - Did you kill him? - he asks grimly. - Yes. I killed him. And I covered her body with flowers. I sang to her till she fell asleep. Tears well up in my eyes. Will and strength are leaving me. There's only Rue, the pain in my head, fear of Cato and the moan of the dying girl. - Fell asleep? - mocks Cato. - Died. I sang to her till she died - I say. - Your district... sent me bread. I raise my hand - not for an arrow; I won't have time anyway. I just blow my nose. - Cato, make it quick, okay? His face shows conflicting emotions. Cato puts down the rock and says with almost a reproach: - This time, only this time, I'm letting you go. For the girl. We are even. No one owes anything to anyone anymore, understand? I nod,