## This notebook demonstrates how to perform bulk inference of [DeepSeek R1 Zero](https://github.com/deepseek-ai/DeepSeek-R1) on the [Tracto.ai](https://tracto.ai/) platform.

In [1]:
import yt.wrapper as yt
import uuid

In [2]:
yt.config["pickling"]["dynamic_libraries"]["enable_auto_collection"] = False
yt.config["pickling"]["ignore_system_modules"] = True
yt.config["pickling"]["safe_stream_mode"] = False

In [3]:
working_dir = f"//tmp/examples/tractorun-deepseek_{uuid.uuid4()}"
yt.create("map_node", working_dir, recursive=True)
print(working_dir)

//tmp/examples/tractorun-deepseek_6cd7600c-0f63-4bbd-a3ea-b7f5d7f45642


Prepare data for inference as an YTSaurus table.

In [5]:
from datasets import load_dataset

dataset = load_dataset("Rapidata/Other-Animals-10")

table_path = f"{working_dir}/questions"
yt.create("table", table_path, force=True)

questions = [
    {"question": f"Can {animal} fly?"}
    for animal in set(dataset["train"].features["label"].int2str(dataset["train"]["label"]))
]

yt.write_table(table_path, questions)

Run bulk inference of DeepSeek R1 Zero on 2 nodes.

In [7]:
from typing import Iterable
import logging
import sys
import random

@yt.aggregator
def bulk_inference(records: Iterable[dict[str, str]]) -> dict[str, str]:
    from vllm import LLM, SamplingParams

    # yt job have to write all logs to stderr
    vllm_logger = logging.getLogger("vllm")
    vllm_logger.handlers.clear()
    vllm_logger.addHandler(logging.StreamHandler(sys.stderr))

    llm = LLM(model="deepseek-ai/DeepSeek-R1-Zero", tensor_parallel_size=8, seed=random.randint(0, 1000000), trust_remote_code=True)
    sampling_params = SamplingParams(
        temperature=0.6,
        top_p=0.9,
        max_tokens=32000,
    )

    for record in records:
        conversations = [
            [
                {
                    "role": "user",
                    "content": record["question"],
                },
            ]
            for record in records
        ]
        outputs = llm.chat(
            messages=conversations,
            sampling_params=sampling_params,
        )
        for output in outputs:
            yield {
                "prompt": record["question"],
                "text": outputs[0].outputs[0].text,
            }

In [8]:
result_path = f"{working_dir}/result"

yt.run_map(
    bulk_inference,
    table_path,
    result_path,
    job_count=2,
    spec={
        "pool": "fifo",
        "pool_trees": ["gpu_h200"],
        "mapper": {
            "gpu_limit": 8,
            "memory_limit": 322122547200,
            "cpu_limit": 64,
        },
    },
)

2025-02-06 21:51:43,630	INFO	Operation started: https://playground.yt.nebius.yt/playground/operations/b43c7979-5e5e23e9-270703e8-307715f8/details


2025-02-06 21:51:43,659	INFO	( 0 min) operation b43c7979-5e5e23e9-270703e8-307715f8 starting


2025-02-06 21:51:44,191	INFO	( 0 min) operation b43c7979-5e5e23e9-270703e8-307715f8 initializing


2025-02-06 21:51:44,760	INFO	( 0 min) Unrecognized spec: {'enable_partitioned_data_balancing': false, 'mapper': {'title': 'bulk_inference'}}


2025-02-06 21:51:45,366	INFO	( 0 min) operation b43c7979-5e5e23e9-270703e8-307715f8: running=0     completed=0     pending=2     failed=0     aborted=0     lost=0     total=2     blocked=0    


2025-02-06 21:51:48,110	INFO	( 0 min) operation b43c7979-5e5e23e9-270703e8-307715f8: running=2     completed=0     pending=0     failed=0     aborted=0     lost=0     total=2     blocked=0    


2025-02-06 22:31:55,174	INFO	(40 min) operation b43c7979-5e5e23e9-270703e8-307715f8 completed


2025-02-06 22:31:55,207	INFO	(40 min) Alerts: {'low_cpu_usage': {'code': 1, 'message': "Average CPU usage of some of your job types is significantly lower than requested 'cpu_limit'. Consider decreasing cpu_limit in spec of your operation", 'attributes': {'pid': 1, 'tid': 5679524652574218927, 'thread': 'Controller:8', 'fid': 18446264713697520871, 'host': 'man0-0460.hw.nebius.yt', 'datetime': '2025-02-06T22:31:52.351045Z', 'trace_id': 'b65e8d5e-3600afb3-7cbbaf46-4b644334', 'span_id': 15531647196190902352}, 'inner_errors': [{'code': 1, 'message': 'Jobs of task "map" use 2.14% of requested cpu limit', 'attributes': {'pid': 1, 'tid': 5679524652574218927, 'thread': 'Controller:8', 'fid': 18446264713697520871, 'host': 'man0-0460.hw.nebius.yt', 'datetime': '2025-02-06T22:31:52.351029Z', 'trace_id': 'b65e8d5e-3600afb3-7cbbaf46-4b644334', 'span_id': 15531647196190902352, 'cpu_time': 6587388, 'cpu_limit': 64.0, 'exec_time': 4813296}}]}}


<yt.wrapper.operation_commands.Operation at 0x7f83c6005940>

In [9]:
for record in yt.read_table(result_path):
    print(record)

{'prompt': 'Can hare fly?', 'text': '<think>\nTo answer the question "Can a mosquito fly?" we need to understand what a mosquito is and what its capabilities are.\n\nA mosquito is a small flying insect that belongs to the family Culicidae. Female mosquitoes are well-known for biting humans and animals to feed on their blood, which they need for egg production. However, it is important to clarify that both male and female mosquitoes have the ability to fly.\n\nMosquitoes have two wings (which is a characteristic of the order Diptera, which means "two wings") that enable them to fly. Their flight capabilities include the ability to hover, maneuver through the air, and travel considerable distances depending on the species. Some mosquitoes can fly several miles from their breeding site in search of food or a suitable place to lay their eggs.\n\nBased on this information, the answer is:\nYes, mosquitoes can fly.\nHowever, it seems that the question might be a bit ambiguous or perhaps a pla