Python version 3.0.0
This is a major release.
It brings many new features, but also includes breaking changes.
Breaking Changes
- The minimum supported Python version is now Python 3.7. Python 3.5's EOL was a year ago, so you should upgrade regardless of this release.
- Exception messages may have changed. Test your code if it relies on the exact message text.
- Stricter checks on formats and mapping types. If your code breaks due to this, you had a bug.
- Renamed `IngestionMappingType` to `IngestionMappingKind` to align with other SDKs.
- Renamed the `ingestion_mapping` parameter to `column_mappings` for clarity and alignment.
- `KustoMissingMappingReferenceError` and `KustoMappingAndMappingReferenceError` are now `KustoMissingMappingError`.
Data Format
- `DataFormat` is now exported directly from `azure-kusto-data`, and no longer exported from `azure-kusto-ingest` (use `azure-kusto-data`).
- `DataFormat`'s internal representation changed from a string to an enum, including the Kusto service's formatting, whether a mapping is required, the relevant `IngestionMappingKind`, and whether it's compressible.
- Added missing formats and mappings (`SStream`, `ApacheAvro`).
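To make the string-to-enum change concrete, here is a minimal sketch of an enum whose members bundle the four pieces of information listed above. `DataFormatSketch`, `MappingKindSketch`, the attribute names, and the per-format values are all hypothetical stand-ins for illustration. They are not the SDK's actual definitions.

```python
from enum import Enum


class MappingKindSketch(Enum):
    # Hypothetical stand-in; only a few kinds are listed for illustration.
    CSV = "Csv"
    JSON = "Json"
    AVRO = "Avro"


class DataFormatSketch(Enum):
    # Each member bundles (service name, mapping kind, mapping required, compressible).
    # The values below are illustrative assumptions, not the SDK's real table.
    CSV = ("csv", MappingKindSketch.CSV, False, True)
    JSON = ("json", MappingKindSketch.JSON, True, True)
    AVRO = ("avro", MappingKindSketch.AVRO, True, False)

    def __init__(self, kusto_value, mapping_kind, mapping_required, compressible):
        # Enum unpacks the tuple value into these arguments for every member.
        self.kusto_value = kusto_value
        self.mapping_kind = mapping_kind
        self.mapping_required = mapping_required
        self.compressible = compressible


def needs_mapping(fmt: DataFormatSketch) -> bool:
    # Callers can ask the enum member directly instead of comparing strings.
    return fmt.mapping_required
```

With this shape, a check such as `needs_mapping(DataFormatSketch.JSON)` replaces ad hoc string comparisons, which is presumably what allows the stricter format and mapping checks mentioned under Breaking Changes.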
Major Features
Quickstart App
- Inside the repo there is now a new module for a Quickstart app. It serves as a sample and a tutorial for the entire flow for working with the SDK, including creating tables, ingesting data and querying.
Streaming Query
- A new API was added to `KustoClient` (sync and async) - `execute_streaming_query`.
- The regular `execute` call waits for all the data to arrive, parses it in memory, and provides Python objects to access it.
- Streaming query lets you work with the data as it arrives, saving time and memory.
- For usage, see the streaming query section in `azure-kusto-data/tests/sample.py`.
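The difference between the two styles can be illustrated without the SDK. The sketch below uses a plain generator as a hypothetical stand-in for rows arriving from the service. `fake_row_source`, `buffered_sum`, and `streaming_sum` are illustrative names only and do not reflect how `execute_streaming_query` is implemented.

```python
from typing import Iterable, Iterator, List


def fake_row_source(n: int) -> Iterator[dict]:
    """Hypothetical stand-in for rows arriving from the service one at a time."""
    for i in range(n):
        yield {"id": i, "value": i * i}


def buffered_sum(rows: Iterable[dict]) -> int:
    # Buffered style: materialize every row in memory first, then process.
    all_rows: List[dict] = list(rows)
    return sum(row["value"] for row in all_rows)


def streaming_sum(rows: Iterable[dict]) -> int:
    # Streaming style: handle each row as it arrives; only one row is held at a time.
    total = 0
    for row in rows:
        total += row["value"]
    return total
```

Both functions return the same result. The streaming variant can start work before the last row arrives and never holds the full result set in memory.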
Managed Streaming Ingestion
- Adds a new type of ingestion client - `ManagedStreamingIngestClient`.
- The client will default to using streaming ingestion, but will be more robust to errors.
- It falls back to queued ingestion when:
  - Retrying streaming ingest fails multiple times with transient errors
  - The size of the ingested object is more than 4MB
  - Using `ingest_from_blob`
- For usage, see the managed streaming section in `azure-kusto-ingest/tests/sample.py`.
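The fallback rules above can be sketched as a small decision function. This is an illustration of the described behavior, not the client's actual code: `ingest_with_fallback`, `TransientError`, the retry count of 3, and treating 4MB as 4 * 1024 * 1024 bytes are all assumptions.

```python
from typing import Callable

MAX_STREAMING_SIZE = 4 * 1024 * 1024  # 4MB limit from the notes; binary MB is assumed
MAX_STREAMING_ATTEMPTS = 3  # assumed; the notes only say "multiple times"


class TransientError(Exception):
    """Hypothetical marker for a retryable streaming failure."""


def ingest_with_fallback(
    data: bytes,
    stream: Callable[[bytes], None],
    queue: Callable[[bytes], None],
    from_blob: bool = False,
) -> str:
    """Return "streaming" or "queued" depending on the path taken."""
    # Blobs and oversized payloads skip streaming entirely.
    if from_blob or len(data) > MAX_STREAMING_SIZE:
        queue(data)
        return "queued"
    for _ in range(MAX_STREAMING_ATTEMPTS):
        try:
            stream(data)
            return "streaming"
        except TransientError:
            continue  # retry transient errors only; other errors propagate
    # Every streaming attempt hit a transient error, so fall back to queued.
    queue(data)
    return "queued"
```

The `stream` and `queue` callables are passed in so the decision logic can be exercised with simple fakes, independent of any real ingestion client.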
Ingest Clients Changes
- All of the ingestion clients now inherit from `BaseIngestClient`.
- All of the ingestion clients now support the following methods - `ingest_from_file`, `ingest_from_stream`, `ingest_from_dataframe` (only Queued and ManagedStreaming support `ingest_from_blob`).
- `IngestionResult` is now returned from `ingest_*` methods. The result contains some details about the ingestion.
- For `ManagedStreamingIngestClient`, you can query whether the data was ingested via queued or streaming using `IngestionStatus` (== `QUEUED` for queued, == `SUCCESS` for streaming).
More Detailed Exceptions
- `KustoServiceError` now has two new subclasses - `KustoApiError`, `KustoMultiApiError`.
  - They both utilize the new `OneApiError` class, which has detailed information about the exception, for a consistent structure between SDKs.
  - `KustoApiError` is raised on a server error in the web request, and contains a single `OneApiError`.
  - `KustoMultiApiError` is raised while reading the data, and can contain multiple `OneApiError`s.
- `KustoClientError` has a new subclass `KustoBlobError`, for when uploading a blob fails.
- Since these classes are subclasses of exceptions that already exist, they will not break existing code unless it parses and relies on the exact message.
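The backward-compatibility point can be demonstrated with local stand-in classes. Everything below is hypothetical: the `*Sketch` classes mirror only the hierarchy described above, and their constructors and attributes are assumptions, not the SDK's signatures.

```python
from typing import List


class OneApiErrorSketch:
    """Hypothetical stand-in holding the details of a single service error."""

    def __init__(self, code: str, message: str) -> None:
        self.code = code
        self.message = message


class KustoServiceErrorSketch(Exception):
    """Stand-in for the pre-existing base class."""


class KustoApiErrorSketch(KustoServiceErrorSketch):
    """Stand-in for the subclass that carries exactly one error."""

    def __init__(self, error: OneApiErrorSketch) -> None:
        super().__init__(error.message)
        self.error = error


class KustoMultiApiErrorSketch(KustoServiceErrorSketch):
    """Stand-in for the subclass that can carry several errors."""

    def __init__(self, errors: List[OneApiErrorSketch]) -> None:
        super().__init__("; ".join(e.message for e in errors))
        self.errors = errors


def handle(exc: Exception) -> str:
    # A handler written against the base class still catches the new subclasses.
    try:
        raise exc
    except KustoServiceErrorSketch as caught:
        return type(caught).__name__
```

Code that catches only the base class keeps working, while newer code can catch the subclasses to reach the structured error details.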
Proxy
- Kusto Client now supports connecting with a proxy using the `set_proxy(proxy_url: str)` method.
Minor Features and Bug Fixes
- More type annotations in the code, to help provide completions and error detection.
- Binary types in ingestions will no longer be compressed.
- Providing the size for blobs for ingestion is now optional.
- Dataframe - support numpy types, including int, long, real and decimal.
- Dataframe - handle invalid date types.
- Relax Azure Identity version to support more dependencies.
- The client now sends a specific API version to avoid conflicts.
- Send correct scopes to managed identity and AZ CLI providers.
- Only retrieve `CloudInfo` when needed.
- Source IDs are always generated, and are always the correct type for improved telemetry.
- Added advanced checks for column mappings.
- Improved E2E test performance by only clearing cache once.