A lightweight Python helper library for day-to-day work with YTsaurus (https://ytsaurus.tech), YQL, CHYT, and pandas DataFrames.
The project wraps common analytics workflows into a small, readable API:
- run YQL queries and return results as `pandas.DataFrame`
- start long-running YQL queries without blocking the notebook
- write YQL results directly into YTsaurus tables
- read large query outputs through temporary YTsaurus tables with progress reporting
- execute CHYT queries through HTTP or the YTsaurus CLI
- upload pandas DataFrames into YTsaurus tables
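The chunked-read item in the list above can be illustrated with a small planning sketch. This is not the library's implementation, which is not shown here. It only assumes that a large result has been materialized into a temporary table and is then read in row ranges using the YTsaurus `[#start:#end]` path suffix. The helper names `row_ranges`, `ranged_path`, and `plan_chunks` are hypothetical.

```python
from typing import Iterator, List, Tuple


def row_ranges(row_count: int, chunksize: int) -> Iterator[Tuple[int, int]]:
    """Yield half-open (start, end) row ranges that together cover row_count rows."""
    if chunksize <= 0:
        raise ValueError("chunksize must be positive")
    for start in range(0, row_count, chunksize):
        yield start, min(start + chunksize, row_count)


def ranged_path(table_path: str, start: int, end: int) -> str:
    """Append a row-index range to a table path, e.g. //tmp/t[#0:#10]."""
    return f"{table_path}[#{start}:#{end}]"


def plan_chunks(table_path: str, row_count: int, chunksize: int) -> List[str]:
    """Build the list of ranged paths to read, printing simple progress (hypothetical helper)."""
    paths = []
    for start, end in row_ranges(row_count, chunksize):
        paths.append(ranged_path(table_path, start, end))
        print(f"planned rows {end}/{row_count} ({end / row_count:.0%})")
    return paths
```

Each ranged path could then be read separately and the pieces concatenated into one DataFrame, which keeps memory use per request bounded by the chunk size.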
This repository is designed as a clean, portfolio-friendly version of the client: no company-specific hosts, pools, paths, tokens, or internal links are hardcoded.
Install from PyPI:

    pip install ytsaurus_python_client

Install from a local checkout in editable mode:

    pip install -e .

For production packaging:

    python -m build
    pip install dist/ytsaurus_python_client-*.whl

Requirements:

- Python 3.9+
- pandas
- requests
- numpy
- YTsaurus Python client with `yt.wrapper`
- Optional: YTsaurus CLI binary `yt` for CLI-based CHYT helpers

The library is configured through environment variables or explicit constructor arguments.
| Variable | Purpose | Default |
|---|---|---|
| `YT_PROXY` | YTsaurus proxy host | empty |
| `YT_TOKEN` | OAuth/token value used by HTTP CHYT helpers | read from `YT_TOKEN_PATH` |
| `YT_TOKEN_PATH` | Path to a local token file | `~/.yt/token` |
| `YT_DEFAULT_TEMP_DIR` | Temp folder for large YQL result materialization | `//tmp/ytsaurus-python-client` |
| `YT_POOL` | Optional YQL pool pragma | unset |
| `YT_UI_BASE_URL` | Optional web UI base URL used only for printed links | unset |
| `CHYT_HOST` | CHYT HTTP host | `YT_PROXY` |
| `CHYT_PORT` | CHYT HTTP port | `8123` |
| `CHYT_CLIQUE_ALIAS` | Default CHYT clique alias | `ch_public` |
| `YT_BINARY` | YTsaurus CLI binary name/path | `yt` |
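
The fallback rules in the table can be made concrete with a short sketch. It only mirrors the documented defaults; the function names `resolve_token` and `resolve_settings` and the returned dictionary keys are hypothetical and are not part of the library's API.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_token(env: Mapping[str, str]) -> Optional[str]:
    """Return YT_TOKEN if set, otherwise the contents of the token file, otherwise None."""
    token = env.get("YT_TOKEN")
    if token:
        return token.strip()
    # Documented default location of the token file.
    token_path = Path(env.get("YT_TOKEN_PATH", "~/.yt/token")).expanduser()
    if token_path.is_file():
        return token_path.read_text(encoding="utf-8").strip()
    return None


def resolve_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Apply the defaults from the table to a mapping of environment variables."""
    yt_proxy = env.get("YT_PROXY", "")
    return {
        "yt_proxy": yt_proxy,
        "temp_dir": env.get("YT_DEFAULT_TEMP_DIR", "//tmp/ytsaurus-python-client"),
        "pool": env.get("YT_POOL"),                    # unset by default
        "ui_base_url": env.get("YT_UI_BASE_URL"),      # unset by default
        "chyt_host": env.get("CHYT_HOST") or yt_proxy,  # falls back to YT_PROXY
        "chyt_port": int(env.get("CHYT_PORT", "8123")),
        "clique_alias": env.get("CHYT_CLIQUE_ALIAS", "ch_public"),
        "yt_binary": env.get("YT_BINARY", "yt"),
    }
```

Passing a plain dictionary instead of `os.environ` makes the resolution easy to check in isolation, for example `resolve_settings({"YT_PROXY": "proxy.example.com"})`.
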
Example environment setup:

    export YT_PROXY="your-ytsaurus-proxy.example.com"
    export YT_TOKEN_PATH="$HOME/.yt/token"
    export YT_DEFAULT_TEMP_DIR="//home/your-login/tmp"
    export CHYT_CLIQUE_ALIAS="ch_public"

Run a YQL query and get the result as a pandas DataFrame:

    from ytsaurus_python_client import YTsaurusHook

    hook = YTsaurusHook(
        yt_proxy="your-ytsaurus-proxy.example.com",
        yt_query_result_temp_dir="//home/your-login/tmp",
    )

    df = hook.yql("""
    SELECT
        1 AS id,
        "hello" AS value;
    """)
    print(df)

Start a long-running query without blocking the notebook (this one writes its result directly into a YTsaurus table):

    query_id = hook.yql(
        """
        INSERT INTO `//home/your-login/output_table`
        SELECT *
        FROM `//home/your-login/source_table`;
        """,
        wait=False,  # do not block; the call returns a query id
    )
    print(query_id)

Run a statement with `yql_wait`, for example to create a table:

    query_id = hook.yql_wait("""
    CREATE TABLE `//home/your-login/example_table` (
        id Int64,
        value String
    );
    """)

Read a large query result with `yql_unlim`:

    df = hook.yql_unlim(
        """
        SELECT *
        FROM `//home/your-login/large_table`;
        """,
        chunksize=500_000,
    )

Upload a pandas DataFrame into a YTsaurus table:

    import pandas as pd
    from ytsaurus_python_client import YTsaurusHook

    hook = YTsaurusHook(yt_proxy="your-ytsaurus-proxy.example.com")

    df = pd.DataFrame({"id": [1, 2], "name": ["Alice", "Bob"]})
    schema = hook.generate_yt_schema(df)  # derive a table schema from the DataFrame

    hook.upload_df_to_yt(
        df=df,
        yt_path="//home/your-login/users",
        schema=schema,
        overwrite=True,
    )

Run a CHYT query over HTTP:

    from ytsaurus_python_client import chyt_df

    df = chyt_df(
        """
        SELECT 1 AS ok
        """,
        host="your-chyt-host.example.com",
        clique_alias="ch_public",
    )

Run a CHYT query through the YTsaurus CLI:

    from ytsaurus_python_client import chyt_df_cli

    df = chyt_df_cli(
        "SELECT 1 AS ok",
        yt_proxy="your-ytsaurus-proxy.example.com",
        clique_alias="ch_public",
    )

Public API:

    from ytsaurus_python_client import (
        YTsaurusHook,
        DOYTHook,  # backward-compatible alias
        chyt_df,
        chyt_raw,
        chyt_to_yt,
        chyt_df_cli,
        chyt_raw_cli,
        chyt_to_yt_cli,
        chyt_check_cli,
    )

Notes:

- Defaults are intentionally generic and safe for public repositories.
- Secrets are never hardcoded. Use `YT_TOKEN`, `YT_TOKEN_PATH`, or explicit arguments.
- Printed YTsaurus UI links are optional and controlled by `YT_UI_BASE_URL`.
- YQL pragmas can be provided through `query_pragma_config` or environment variables such as `YT_POOL`.
- `DOYTHook` is kept as a backward-compatible alias; new code should prefer `YTsaurusHook`.

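
To illustrate the pragma note, here is a minimal sketch of how a pool setting and extra pragmas might be prepended to a query. The shape of `query_pragma_config` is not documented above, so this assumes a plain mapping of pragma name to string value, and it assumes the pool is expressed as the YQL pragma `yt.Pool`. The helpers `build_pragma_prefix` and `with_pragmas` are hypothetical, not library functions.

```python
from typing import Mapping, Optional


def build_pragma_prefix(pragmas: Mapping[str, str], pool: Optional[str] = None) -> str:
    """Render PRAGMA lines; an explicit yt.Pool entry wins over the pool argument."""
    merged = dict(pragmas)
    if pool and "yt.Pool" not in merged:
        merged["yt.Pool"] = pool  # assumed pragma name for the YT pool
    lines = []
    for name, value in merged.items():
        # Escape backslashes and double quotes so the value stays one string literal.
        escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'PRAGMA {name} = "{escaped}";')
    return "\n".join(lines)


def with_pragmas(query: str, pragmas: Mapping[str, str], pool: Optional[str] = None) -> str:
    """Return the query with pragma lines in front, or unchanged if there are none."""
    prefix = build_pragma_prefix(pragmas, pool)
    return f"{prefix}\n{query}" if prefix else query
```

In this sketch the `pool` argument would typically come from `os.environ.get("YT_POOL")`, so an unset variable leaves the query untouched.
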
Before publishing, the following were removed from the project:
- macOS metadata files
- Python cache files
- internal company hosts and UI links
- internal pools and temp paths
- Russian comments and runtime messages
- local tokens or secret values
MIT © 2026 Alexey Voronko