Python client library for the Datacore API — supports two access modes:
- Demo: Preview datasets without an API key
- Paid: Full access with an API key
From PyPI:
pip install datacore # core (pandas-only)
pip install "datacore[polars]" # also installs polars
pip install "datacore[all]" # all optional extras

Or with uv:
uv add datacore
uv add "datacore[polars]"

From GitHub directly:
pip install git+https://github.com/DataCore-VietNam/datacore-python.git
pip install "datacore[polars] @ git+https://github.com/DataCore-VietNam/datacore-python.git"

For contributors only (you do not need this to use the library; the commands above are all an end user needs):
git clone https://github.com/DataCore-VietNam/datacore-python.git
cd datacore-python
pip install -e ".[dev]"

Create a .env file in your working directory (see .env.example):
X_API_KEY=your-api-key-here

Never commit your real API key. `.env` is git-ignored. Use `.env.example` as a template.
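The key lookup described above can be sketched with the standard library alone. This is a minimal illustration, not the library's actual implementation: the helper names `parse_env_text`, `resolve_api_key`, and `access_mode` are hypothetical, and the precedence shown (explicit argument, then process environment, then `.env` file) is an assumption.

```python
import os
from pathlib import Path


def parse_env_text(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_api_key(explicit_key=None, env_path=".env"):
    """Assumed precedence: explicit argument, then environment, then .env file."""
    if explicit_key:
        return explicit_key
    from_environment = os.environ.get("X_API_KEY", "").strip()
    if from_environment:
        return from_environment
    env_file = Path(env_path)
    if env_file.is_file():
        file_values = parse_env_text(env_file.read_text(encoding="utf-8"))
        return file_values.get("X_API_KEY") or None
    return None


def access_mode(api_key):
    """Map a resolved key to the two access modes named in this README."""
    return "paid" if api_key else "demo"
```

With no key available anywhere, `access_mode(resolve_api_key())` evaluates to `"demo"`.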
from datacore import Datacore
# Demo mode (no API key required)
client = Datacore()
# Paid mode — pass the key explicitly...
client = Datacore(api_key="your-api-key")
# ...or rely on X_API_KEY from .env / environment
client = Datacore()
# Enable request/response debug logging
client = Datacore(api_key="your-api-key", debug=True)

Preview data without an API key.
# All columns
df = client.preview("dataset_historical_price")
print(df.head())
# Filter specific columns
df = client.preview("dataset_historical_price", columns=["symbol", "date", "close_price"])
print(df.head())

# All columns: returns {"data": DataFrame, "info": str}
result = client.get_data("dataset_historical_price")
print(result["data"].head())
print(result["info"])
# num: 3760607, totalPage: 37607, currentPage: 1, queried_rows: 100
# Filter specific columns
result = client.get_data(
    "dataset_historical_price",
    columns=["symbol", "date", "close_price"],
)
print(result["data"].head())

Full parameters:
result = client.get_data(
    dataset_code="dataset_historical_price",
    columns=["symbol", "date", "close_price"],  # client-side column filter (optional)
    conditions=None,  # EXPERIMENTAL server-side row filter -- see note below
    select_fields=None,  # server-side field selection (optional)
    page=1,
    limit=100,  # max 100 server-side (HTTP 400 if higher)
    return_type="dataframe",  # "dataframe" | "polars" | "json" | "dict"
    include_info=True,  # True: returns {"data": ..., "info": ...} | False: data only
)

Page size: the gateway currently caps `limit` at 100 rows per request. Passing a larger value returns `HTTP 400: Invalid request content`. For larger downloads, paginate with `download_data` or `paginate`.
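Because each request returns at most 100 rows, the number of requests needed for a full download follows from ceiling division. The sketch below shows that arithmetic; `pages_needed` is a hypothetical helper, not part of the library, and the row counts are the ones quoted in this README.

```python
MAX_LIMIT = 100  # per-request cap described above


def pages_needed(total_rows, limit=MAX_LIMIT):
    """Number of pages required to read total_rows at a given page size."""
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}, got {limit}")
    if total_rows < 0:
        raise ValueError("total_rows must be non-negative")
    # Integer-only ceiling division, so large row counts stay exact.
    return (total_rows + limit - 1) // limit


print(pages_needed(3_760_607))  # 37607, the totalPage value in the info string above
print(pages_needed(7_551))      # 76
```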
⚠️ `conditions` is experimental. The server-side `conditions` row filter is forwarded to the gateway verbatim, but the accepted JSON shape is not yet finalised: every shape tried so far is rejected by `gateway.datacore.vn/data/ds/search` with `HTTP 400`. Do not rely on `conditions` in production yet. Until the gateway schema is confirmed, fetch unfiltered data and filter the returned DataFrame client-side. This parameter may change in a future release.
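Until `conditions` is usable, row filtering can be done with ordinary pandas boolean indexing on the returned frame. In this sketch a small hand-built DataFrame stands in for `result["data"]`; the column names come from the examples above, while the row values are made up purely for illustration.

```python
import pandas as pd

# Stand-in for result["data"]; these rows are invented sample values.
df = pd.DataFrame(
    {
        "symbol": ["AAA", "BBB", "AAA"],
        "date": ["2024-01-02", "2024-01-02", "2024-01-03"],
        "close_price": [10.5, 20.0, 11.0],
    }
)

# Keep rows for one symbol above a price threshold.
mask = (df["symbol"] == "AAA") & (df["close_price"] > 10.7)
filtered = df.loc[mask].reset_index(drop=True)

print(filtered)
```

The same mask works on every page yielded by `paginate`, so the filter can be applied page by page instead of after loading everything.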
A convenience wrapper that returns the DataFrame directly (no info dict):
df = client.get_dataframe("dataset_historical_price", limit=100)

If you installed with `pip install "datacore[polars]"`, you can ask for a
polars DataFrame instead of pandas:
# Via return_type
result = client.get_data(
    "dataset_historical_price",
    columns=["symbol", "date", "close_price"],
    limit=100,
    return_type="polars",
)
print(type(result["data"])) # <class 'polars.DataFrame'>
print(result["data"].head())
# Convenience method (no info dict, just the polars frame)
df_pl = client.get_polars("dataset_historical_price", limit=100)
# Preview supports polars too
df_pl = client.preview("dataset_historical_price", return_type="polars")

Pandas is the default for backwards compatibility; polars is purely opt-in.
The same `columns=` filter works for both backends.
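Since polars is an optional extra, calling code may want to fall back to pandas when it is not installed. One way to sketch that check with the standard library is shown here; `pick_return_type` is a hypothetical helper and the fallback policy is an assumption, not library behaviour.

```python
import importlib.util


def pick_return_type(preferred="polars"):
    """Return the preferred backend if usable, else the pandas default."""
    # find_spec returns None when the package cannot be imported.
    if preferred == "polars" and importlib.util.find_spec("polars") is None:
        return "dataframe"
    return preferred


return_type = pick_return_type()
print(return_type)  # "polars" if the extra is installed, otherwise "dataframe"
```

The resulting string can then be passed as `return_type=` to `get_data` or `preview`.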
for page_df in client.paginate("dataset_historical_price", limit=100, max_pages=5):
    print(page_df.shape)

# Download all pages of a small dataset (76 pages, ~7.5k rows)
download_result = client.download_data(
    dataset_code="gross_domestic_product_dataset_ds",
    output_path="data.csv",
    file_format="csv",  # "csv" or "json"
    start_page=1,
    end_page=None,  # None = download until the last page
    limit=100,  # max per-request page size (see note above)
    show_progress=True,
)
print(download_result)
# {"output_path": "data.csv", "pages_downloaded": 76, "rows_downloaded": 7551, ...}
# `dataset_historical_price` is large (~3.7M rows / 37k pages at limit=100);
# expect it to take a long time and a lot of network.
# Download only first 3 pages, filtered to specific columns (CSV only)
download_result = client.download_data(
    dataset_code="dataset_historical_price",
    columns=["symbol", "date", "close_price"],
    output_path="data_page1_3.csv",
    file_format="csv",
    start_page=1,
    end_page=3,
    show_progress=True,
)
`columns` filtering only applies to `file_format="csv"`. JSON output preserves the full raw API response.
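For a dataset the size of `dataset_historical_price`, one option is to split the download into several page ranges and call `download_data` once per range using `start_page` and `end_page`. The sketch below only computes the ranges; `page_chunks` is a hypothetical helper, and treating `end_page` as inclusive is inferred from the 3-page example above rather than confirmed.

```python
def page_chunks(total_pages, pages_per_chunk, start_page=1):
    """Yield inclusive (start_page, end_page) tuples up to total_pages."""
    if pages_per_chunk < 1:
        raise ValueError("pages_per_chunk must be at least 1")
    if start_page < 1:
        raise ValueError("start_page must be at least 1")
    first = start_page
    while first <= total_pages:
        last = min(first + pages_per_chunk - 1, total_pages)
        yield first, last
        first = last + 1


# 76 pages (the small GDP dataset above) in chunks of 30 pages
print(list(page_chunks(76, 30)))  # [(1, 30), (31, 60), (61, 76)]
```

Each tuple can be passed as `start_page` and `end_page` with its own `output_path`, so an interrupted run only needs to repeat the ranges that are missing.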
| Method | Description | Requires API key |
|---|---|---|
| `preview(dataset_code, columns, return_type)` | Preview a dataset (pandas or polars) | No |
| `preview_raw(dataset_code)` | Preview a dataset, raw dict response | No |
| `get_data(dataset_code, ...)` | Fetch data, returns `{"data", "info"}` by default | Yes |
| `get_dataframe(dataset_code, ...)` | Fetch data, returns pandas DataFrame directly | Yes |
| `get_polars(dataset_code, ...)` | Fetch data, returns polars DataFrame directly (needs `[polars]` extra) | Yes |
| `get_data_info(dataset_code, ...)` | Get dataset metadata summary | Yes |
| `paginate(dataset_code, ...)` | Generator yielding one pandas DataFrame per page | Yes |
| `download_data(dataset_code, output_path, ...)` | Download data to CSV/JSON file | Yes |
| `set_api_key(api_key)` | Set / replace the API key on an existing client | N/A |
| `is_authenticated()` | Returns `True` if an API key is configured | N/A |
| Error | Cause | Solution |
|---|---|---|
| `AuthenticationError` | Missing or invalid API key (HTTP 401 or `httpCode:401` in body) | Pass `api_key=` or set `X_API_KEY` in `.env` |
| `PermissionDeniedError` | No access to dataset (HTTP 403) | Check your subscription plan |
| `APIRequestError` | Server error, invalid request, or unknown dataset | Check `dataset_code` and `conditions` |
| `ValueError` | Bad argument (e.g. unknown column, `page < 1`, bad `file_format`) | Check the error message |
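The table above suggests handling the more specific errors before any generic fallback. The sketch below uses stand-in exception classes with the same names so that it runs without the package or network access; in real code they would come from `datacore`. That the specific classes can be imported under these names, and that `ValueError` is Python's built-in, are assumptions, and `call_with_hint` is a hypothetical helper.

```python
class DatacoreError(Exception):
    """Stand-in base class; the real one lives in the datacore package."""


class AuthenticationError(DatacoreError):
    """Stand-in for a missing or invalid API key (HTTP 401)."""


class PermissionDeniedError(DatacoreError):
    """Stand-in for no access to a dataset (HTTP 403)."""


class APIRequestError(DatacoreError):
    """Stand-in for server errors, invalid requests, unknown datasets."""


def call_with_hint(call):
    """Run call() and return (result, hint); hint is None on success."""
    try:
        return call(), None
    except AuthenticationError:
        return None, "Pass api_key= or set X_API_KEY in .env"
    except PermissionDeniedError:
        return None, "Check your subscription plan"
    except APIRequestError:
        return None, "Check dataset_code and conditions"
    except ValueError as exc:
        return None, f"Bad argument: {exc}"


def fake_call():
    # Simulates a request rejected with HTTP 403.
    raise PermissionDeniedError("no access to dataset")


print(call_with_hint(fake_call))  # (None, 'Check your subscription plan')
```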
All exceptions inherit from DatacoreError, so you can catch them generically:
from datacore import Datacore, DatacoreError

client = Datacore()  # relies on X_API_KEY from .env / environment

try:
    client.get_data("dataset_historical_price")
except DatacoreError as e:
    print(f"Datacore call failed: {e}")

MIT; see LICENSE.