datacore-python

Python client library for the Datacore API. It supports two access modes:

Demo: Preview datasets without an API key
Paid: Full access with an API key

Installation

From PyPI:

pip install datacore                # core (pandas-only)
pip install "datacore[polars]"      # also installs polars
pip install "datacore[all]"         # all optional extras

Or with uv:

uv add datacore
uv add "datacore[polars]"

From GitHub directly:

pip install git+https://github.com/DataCore-VietNam/datacore-python.git
pip install "datacore[polars] @ git+https://github.com/DataCore-VietNam/datacore-python.git"

For contributors only (you do not need this to use the library — the commands above are all an end user needs):

git clone https://github.com/DataCore-VietNam/datacore-python.git
cd datacore-python
pip install -e ".[dev]"

Configuration (`.env`, optional)

Create a .env file in your working directory (see .env.example):

X_API_KEY=your-api-key-here

Never commit your real API key. .env is git-ignored. Use .env.example as a template.

Usage

1. Initialize the client

from datacore import Datacore

# Demo mode (no API key required)
client = Datacore()

# Paid mode — pass the key explicitly...
client = Datacore(api_key="your-api-key")

# ...or rely on X_API_KEY from .env / environment
client = Datacore()

# Enable request/response debug logging
client = Datacore(api_key="your-api-key", debug=True)

2. Preview a dataset (Demo mode)

Preview data without an API key.

# All columns
df = client.preview("dataset_historical_price")
print(df.head())

# Filter specific columns
df = client.preview("dataset_historical_price", columns=["symbol", "date", "close_price"])
print(df.head())

3. Fetch data (Paid mode)

# All columns — returns {"data": DataFrame, "info": str}
result = client.get_data("dataset_historical_price")
print(result["data"].head())
print(result["info"])
# num: 3760607, totalPage: 37607, currentPage: 1, queried_rows: 100

# Filter specific columns
result = client.get_data(
    "dataset_historical_price",
    columns=["symbol", "date", "close_price"],
)
print(result["data"].head())
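
Because info is a plain string, reading the total page count from it means parsing it first. The following is a minimal sketch, assuming the string always follows the comma-separated `key: value` layout shown in the sample output above and that every value is an integer. `parse_info` is a hypothetical helper, not part of the library.

```python
def parse_info(info: str) -> dict[str, int]:
    """Split 'key: value, key: value' into a dict of ints.

    Assumes integer values, as in the sample info string.
    """
    parsed = {}
    for part in info.split(","):
        key, _, value = part.partition(":")
        parsed[key.strip()] = int(value.strip())
    return parsed


# Sample string copied from the output shown above
sample = "num: 3760607, totalPage: 37607, currentPage: 1, queried_rows: 100"
info = parse_info(sample)
print(info["totalPage"])  # 37607
```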

Full parameters:

result = client.get_data(
    dataset_code="dataset_historical_price",
    columns=["symbol", "date", "close_price"],   # client-side column filter (optional)
    conditions=None,         # EXPERIMENTAL server-side row filter -- see note below
    select_fields=None,      # server-side field selection (optional)
    page=1,
    limit=100,               # max 100 server-side (HTTP 400 if higher)
    return_type="dataframe", # "dataframe" | "polars" | "json" | "dict"
    include_info=True,       # True: returns {"data": ..., "info": ...} | False: data only
)

Page size: the gateway currently caps limit at 100 rows per request. Passing a larger value returns HTTP 400: Invalid request content. To fetch more than 100 rows, request multiple pages with download_data or paginate.
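
With the 100-row cap, the number of requests a full download needs follows directly from the row count. This sketch shows the arithmetic; `pages_needed` and `MAX_LIMIT` are hypothetical names used for illustration only.

```python
MAX_LIMIT = 100  # current gateway cap, per the note above


def pages_needed(total_rows: int, limit: int = MAX_LIMIT) -> int:
    """Return how many pages cover total_rows at the given page size."""
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}, got {limit}")
    # Integer ceiling division: a partly filled last page still counts
    return (total_rows + limit - 1) // limit


print(pages_needed(3_760_607))  # 37607
print(pages_needed(7_551))      # 76
```

Both results match the page counts quoted in this README for dataset_historical_price and gross_domestic_product_dataset_ds.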

⚠️ conditions is experimental. The server-side conditions row filter is forwarded to the gateway verbatim, but the accepted JSON shape is not yet finalised — every shape tried so far is rejected by gateway.datacore.vn/data/ds/search with HTTP 400. Do not rely on conditions in production yet. Until the gateway schema is confirmed, fetch unfiltered data and filter the returned DataFrame client-side. This parameter may change in a future release.
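
Client-side filtering of the returned pandas DataFrame might look like the sketch below. It assumes the date column holds ISO-format strings (so plain string comparison orders them correctly); check the actual dtype before relying on this. `filter_rows` is a hypothetical helper, and the sample rows are made up to stand in for result["data"].

```python
import pandas as pd


def filter_rows(df: pd.DataFrame, symbol: str, start_date: str) -> pd.DataFrame:
    """Keep rows for one symbol dated on or after start_date."""
    mask = (df["symbol"] == symbol) & (df["date"] >= start_date)
    return df.loc[mask].reset_index(drop=True)


# Made-up rows standing in for result["data"]
df = pd.DataFrame(
    {
        "symbol": ["AAA", "AAA", "BBB"],
        "date": ["2024-01-02", "2024-01-03", "2024-01-03"],
        "close_price": [10.0, 10.5, 20.0],
    }
)
filtered = filter_rows(df, "AAA", "2024-01-03")
print(len(filtered))  # 1
```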

A convenience wrapper that returns the DataFrame directly (no info dict):

df = client.get_dataframe("dataset_historical_price", limit=100)

3b. Polars output (optional)

If you installed with pip install "datacore[polars]", you can ask for a polars DataFrame instead of pandas:

# Via return_type
result = client.get_data(
    "dataset_historical_price",
    columns=["symbol", "date", "close_price"],
    limit=100,
    return_type="polars",
)
print(type(result["data"]))     # <class 'polars.DataFrame'>
print(result["data"].head())

# Convenience method (no info dict, just the polars frame)
df_pl = client.get_polars("dataset_historical_price", limit=100)

# Preview supports polars too
df_pl = client.preview("dataset_historical_price", return_type="polars")

Pandas is the default for backwards compatibility; polars is purely opt-in. The same columns= filter works for both backends.

4. Iterate all pages (Paid mode)

for page_df in client.paginate("dataset_historical_price", limit=100, max_pages=5):
    print(page_df.shape)
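
When only a fixed number of rows is needed, max_pages can be derived from that row budget. The sketch below uses only the limit and max_pages arguments shown above; `collect_pages` is a hypothetical helper, and it relies on each page exposing a `.shape` tuple, as pandas DataFrames do.

```python
def collect_pages(client, dataset_code, row_budget, limit=100):
    """Collect just enough pages to cover row_budget rows.

    Returns the list of page frames and the number of rows received.
    """
    max_pages = (row_budget + limit - 1) // limit  # ceiling division
    pages, rows = [], 0
    for page_df in client.paginate(dataset_code, limit=limit, max_pages=max_pages):
        pages.append(page_df)
        rows += page_df.shape[0]
    return pages, rows
```

The collected pages can then be joined into one frame with pandas.concat(pages, ignore_index=True).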

5. Download data to file

# Download all pages of a small dataset (76 pages, ~7.5k rows)
download_result = client.download_data(
    dataset_code="gross_domestic_product_dataset_ds",
    output_path="data.csv",
    file_format="csv",     # "csv" or "json"
    start_page=1,
    end_page=None,         # None = download until the last page
    limit=100,             # max per-request page size (see note above)
    show_progress=True,
)
print(download_result)
# {"output_path": "data.csv", "pages_downloaded": 76, "rows_downloaded": 7551, ...}

# `dataset_historical_price` is large (~3.7M rows / 37k pages at limit=100);
# expect a full download to take a long time and use a lot of bandwidth.

# Download only first 3 pages, filtered to specific columns (CSV only)
download_result = client.download_data(
    dataset_code="dataset_historical_price",
    columns=["symbol", "date", "close_price"],
    output_path="data_page1_3.csv",
    file_format="csv",
    start_page=1,
    end_page=3,
    show_progress=True,
)

columns filtering only applies to file_format="csv". JSON output preserves the full raw API response.
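
After a CSV download, the file can be checked against the returned summary. This is a minimal sketch, assuming the CSV is UTF-8 and starts with a single header row; it reads only the output_path and rows_downloaded keys shown in the sample result above. `verify_csv_download` is a hypothetical helper, not part of the library.

```python
import csv


def verify_csv_download(download_result: dict) -> bool:
    """Return True if the CSV on disk holds rows_downloaded data rows."""
    path = download_result["output_path"]
    with open(path, newline="", encoding="utf-8") as fh:
        reader = csv.reader(fh)
        next(reader, None)  # skip the header row (assumed present)
        rows_on_disk = sum(1 for _ in reader)
    return rows_on_disk == download_result["rows_downloaded"]
```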

Method Summary

| Method | Description | Requires API key |
| --- | --- | --- |
| `preview(dataset_code, columns, return_type)` | Preview a dataset (pandas or polars) | No |
| `preview_raw(dataset_code)` | Preview a dataset, raw dict response | No |
| `get_data(dataset_code, ...)` | Fetch data, returns `{"data", "info"}` by default | Yes |
| `get_dataframe(dataset_code, ...)` | Fetch data, returns pandas DataFrame directly | Yes |
| `get_polars(dataset_code, ...)` | Fetch data, returns polars DataFrame directly (needs `[polars]` extra) | Yes |
| `get_data_info(dataset_code, ...)` | Get dataset metadata summary | Yes |
| `paginate(dataset_code, ...)` | Generator yielding one pandas DataFrame per page | Yes |
| `download_data(dataset_code, output_path, ...)` | Download data to CSV/JSON file | Yes |
| `set_api_key(api_key)` | Set / replace the API key on an existing client | N/A |
| `is_authenticated()` | Returns `True` if an API key is configured | N/A |
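
The methods above can be combined so the same code works in both access modes. This sketch uses is_authenticated() to choose between a paid fetch and a demo preview; `fetch_sample` is a hypothetical helper, and the client is passed in so the function makes no assumptions about how it was constructed.

```python
def fetch_sample(client, dataset_code):
    """Return the first page when a key is configured, else a preview."""
    if client.is_authenticated():
        # Paid mode: first page at the maximum page size
        return client.get_dataframe(dataset_code, limit=100)
    # Demo mode: preview works without an API key
    return client.preview(dataset_code)
```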

Error Handling

| Error | Cause | Solution |
| --- | --- | --- |
| `AuthenticationError` | Missing or invalid API key (HTTP 401 or `httpCode:401` in body) | Pass `api_key=` or set `X_API_KEY` in `.env` |
| `PermissionDeniedError` | No access to dataset (HTTP 403) | Check your subscription plan |
| `APIRequestError` | Server error, invalid request, or unknown dataset | Check `dataset_code` and `conditions` |
| `ValueError` | Bad argument (e.g. unknown column, `page < 1`, bad `file_format`) | Check the error message |

All exceptions inherit from DatacoreError, so you can catch them generically:

from datacore import Datacore, DatacoreError

try:
    client.get_data("dataset_historical_price")
except DatacoreError as e:
    print(f"Datacore call failed: {e}")
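
To react differently to each error, list the specific exceptions before the DatacoreError base class, since Python uses the first matching except clause. The sketch below defines stand-in classes so it runs without the package installed; in real code you would import them from datacore instead, on the assumption that they are exported alongside DatacoreError. `describe_failure` is a hypothetical helper.

```python
# Stand-ins mirroring the hierarchy described above (assumption: the real
# classes are importable from datacore, like DatacoreError).
class DatacoreError(Exception): ...
class AuthenticationError(DatacoreError): ...
class PermissionDeniedError(DatacoreError): ...
class APIRequestError(DatacoreError): ...


def describe_failure(fetch):
    """Run fetch() and return a short hint, or None if it succeeded."""
    try:
        fetch()
    except AuthenticationError:
        return "check api_key or X_API_KEY"
    except PermissionDeniedError:
        return "check your subscription plan"
    except APIRequestError:
        return "check dataset_code and conditions"
    except DatacoreError as exc:  # any other library error
        return f"unexpected Datacore error: {exc}"
    return None
```

A call would pass a zero-argument callable, for example describe_failure(lambda: client.get_data("dataset_historical_price")).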

License

MIT — see LICENSE.
