title | description | ms.reviewer | ms.topic | ms.date |
---|---|---|---|---|
Kusto.ingest into command (pull data from storage) | This article describes the .ingest into command (pull data from storage) in Azure Data Explorer. | orspodek | reference | 06/26/2024 |
The `.ingest into` command ingests data into a table by "pulling" the data from one or more cloud storage files. For example, the command can retrieve 1000 CSV-formatted blobs from Azure Blob Storage, parse them, and ingest them together into a single target table. Data is appended to the table without affecting existing records, and without modifying the table's schema.
[!INCLUDE direct-ingestion-note]
You must have at least Table Ingestor permissions to run this command.
`.ingest` [`async`] `into` `table` *TableName* *SourceDataLocator* [`with` `(` *IngestionPropertyName* `=` *IngestionPropertyValue* [`,` ...] `)`]
[!INCLUDE syntax-conventions-note]
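As a hedged illustration of how the syntax elements fit together, the following sketch ingests one blob and passes two ingestion properties in the `with` clause. The table name `T` and the `contoso` URI are placeholders, the `...` stands for a SAS token that is deliberately left elided, and `ignoreFirstRecord` is assumed here as the property for skipping a CSV header row.

```kusto
// Ingest one CSV blob into table T, skipping its header row.
// The h prefix obfuscates the string so that the SAS isn't recorded.
.ingest into table T (
    h'https://contoso.blob.core.windows.net/container/file1.csv?...'
)
with (format='csv', ignoreFirstRecord=true)
```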
Name | Type | Required | Description |
---|---|---|---|
async | string | | If specified, the command returns immediately and continues ingestion in the background. The results of the command include an OperationId value that can then be used with the .show operation command to retrieve the ingestion completion status and results. |
TableName | string | ✔️ | The name of the table into which to ingest data. The table name is always relative to the database in context. If no schema mapping object is provided, the schema of the database in context is used. |
SourceDataLocator | string | ✔️ | A single or comma-separated list of storage connection strings. A single connection string must refer to a single file hosted by a storage account. Ingestion of multiple files can be done by specifying multiple connection strings, or by ingesting from a query of an external table. |
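A minimal sketch of the `async` flow described in the table: the first command starts ingestion in the background, and the second checks on it afterward. The table name, URI, and the operation ID are placeholders. The GUID shown is made up for illustration, so substitute the OperationId value returned by your own command. Run the two commands separately.

```kusto
// Start ingestion in the background; the result includes an OperationId.
.ingest async into table T (
    h'https://contoso.blob.core.windows.net/container/file1.csv?...'
)
with (format='csv')

// Later, check the completion status by using the returned OperationId.
// The GUID below is a hypothetical placeholder.
.show operations 3827def6-0773-4f2a-859e-c02cf395deaf
```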
Note
We recommend using obfuscated string literals for the SourceDataLocators. The service will scrub credentials in internal traces and error messages.
[!INCLUDE ingestion-properties]
Each storage connection string indicates the authorization method to use for access to the storage. Depending on the authorization method, the principal may need to be granted permissions on the external storage to perform the ingestion.
The following table lists the supported authentication methods and the permissions needed for ingesting data from external storage.
Authentication method | Azure Blob Storage / Data Lake Storage Gen2 | Data Lake Storage Gen1 |
---|---|---|
Impersonation | Storage Blob Data Reader | Reader |
Shared Access (SAS) token | List + Read | This authentication method isn't supported in Gen1. |
Microsoft Entra access token | | |
Storage account access key | | This authentication method isn't supported in Gen1. |
Managed identity | Storage Blob Data Reader | Reader |
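To connect the table to a concrete command, here is a sketch that uses impersonation against Azure Blob Storage. It assumes the `;impersonate` suffix (the same form the Data Lake Storage example in this article uses) and that the principal running the command holds the Storage Blob Data Reader role on the storage account. The table name and URI are placeholders.

```kusto
// Access the blob with the caller's own identity.
// The URI contains no secret, so no h prefix is needed.
.ingest into table T (
    'https://contoso.blob.core.windows.net/container/file1.csv;impersonate'
)
with (format='csv')
```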
The result of the command is a table with as many records as there are data shards ("extents") generated by the command. If no data shards have been generated, a single record is returned with an empty (zero-valued) extent ID.
Name | Type | Description |
---|---|---|
ExtentId | guid | The unique identifier for the data shard that was generated by the command. |
ItemLoaded | string | One or more storage files that are related to this record. |
Duration | timespan | How long it took to perform ingestion. |
HasErrors | bool | Whether this record represents an ingestion failure or not. |
OperationId | guid | A unique ID representing the operation. Can be used with the .show operation command. |
Note
This command doesn't modify the schema of the table being ingested into. If necessary, the data is "coerced" into this schema during ingestion, not the other way around (extra columns are ignored, and missing columns are treated as null values).
The following example instructs your cluster to read two blobs from Azure Blob Storage as CSV files, and ingest their contents into table `T`. The `...` represents an Azure Storage shared access signature (SAS) that gives read access to each blob. Note also the use of obfuscated strings (the `h` in front of the string values) to ensure that the SAS is never recorded.
.ingest into table T (
h'https://contoso.blob.core.windows.net/container/file1.csv?...',
h'https://contoso.blob.core.windows.net/container/file2.csv?...'
)
The following example shows how to read a CSV file from Azure Blob Storage and ingest its contents into table `T` using managed identity authentication. For more information about the managed identity authentication method, see Managed Identity Authentication Overview.
.ingest into table T ('https://StorageAccount.blob.core.windows.net/Container/file.csv;managed_identity=802bada6-4d21-44b2-9d15-e66b29e4d63e')
The following example ingests data from Azure Data Lake Storage Gen2 (ADLSv2). The credentials used here (`...`) are the storage account credentials (shared key), and string obfuscation is used only for the secret part of the connection string.
.ingest into table T (
'abfss://myfilesystem@contoso.dfs.core.windows.net/path/to/file1.csv;...'
)
The following example ingests a single file from Azure Data Lake Storage (ADLS). It uses the user's credentials to access ADLS (so there's no need to treat the storage URI as containing a secret). It also shows how to specify ingestion properties.
.ingest into table T ('adl://contoso.azuredatalakestore.net/Path/To/File/file1.ext;impersonate')
with (format='csv')
The following example ingests a single file from Amazon S3 using an access key ID and a secret access key.
.ingest into table T ('https://bucketname.s3.us-east-1.amazonaws.com/path/to/file.csv;AwsCredentials=AKIAIOSFODNN7EXAMPLE,wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY')
with (format='csv')
The following example ingests a single file from Amazon S3 using a presigned URL.
.ingest into table T ('https://bucketname.s3.us-east-1.amazonaws.com/file.csv?<<pre signed string>>')
with (format='csv')