title | titleSuffix | description | author | ms.author | ms.service | ms.date | ms.devlang | ms.topic | ms.reviewer | ms.custom |
---|---|---|---|---|---|---|---|---|---|---|
Use Java to manage data in Azure Data Lake Storage Gen2 | Azure Storage | Use Azure Storage libraries for Java to manage directories and files in storage accounts that have a hierarchical namespace enabled. | pauljewellmsft | pauljewell | azure-data-lake-storage | 08/08/2023 | java | how-to | prishet | devx-track-java, devx-track-extended-java |
This article shows you how to use Java to create and manage directories and files in storage accounts that have a hierarchical namespace.
To learn how to get, set, and update the access control lists (ACLs) of directories and files, see Use Java to manage ACLs in Azure Data Lake Storage Gen2.
- An Azure subscription. For more information, see Get Azure free trial.
- A storage account that has hierarchical namespace enabled. Follow these instructions to create one.
To get started, open this page and find the latest version of the Java library. Then, open the pom.xml file in your text editor. Add a dependency element that references that version.
If you plan to authenticate your client application by using Microsoft Entra ID, then add a dependency to the Azure Identity library. For more information, see Azure Identity client library for Java.
Next, add these import statements to your code file.
import com.azure.identity.*;
import com.azure.storage.common.StorageSharedKeyCredential;
import com.azure.core.http.rest.PagedIterable;
import com.azure.core.util.BinaryData;
import com.azure.storage.file.datalake.*;
import com.azure.storage.file.datalake.models.*;
import com.azure.storage.file.datalake.options.*;
[!INCLUDE data-lake-storage-sdk-note]
To work with the code examples in this article, you need to create an authorized DataLakeServiceClient instance that represents the storage account. You can authorize a DataLakeServiceClient object using Microsoft Entra ID, an account access key, or a shared access signature (SAS).
You can use the Azure identity client library for Java to authenticate your application with Microsoft Entra ID.
Create a DataLakeServiceClient instance and pass in a new instance of the DefaultAzureCredential class.
:::code language="java" source="~/azure-storage-snippets/blobs/howto/Java/Java-v12/src/main/java/com/datalake/manage/Authorize_DataLake.java" id="Snippet_AuthorizeWithAzureAD":::
To learn more about using DefaultAzureCredential to authorize access to data, see Azure Identity client library for Java.
To use a shared access signature (SAS) token, provide the token as a string and initialize a DataLakeServiceClient object. If your account URL includes the SAS token, omit the credential parameter.
:::code language="java" source="~/azure-storage-snippets/blobs/howto/Java/Java-v12/src/main/java/com/datalake/manage/Authorize_DataLake.java" id="Snippet_AuthorizeWithSAS":::
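The two ways of supplying a SAS token (as a separate credential, or embedded in the account URL) can be illustrated with a small sketch that uses only the Java standard library. The class name, method names, and sample values are hypothetical and not part of the Azure SDK. The host pattern `https://<account>.dfs.core.windows.net` is the standard Data Lake Storage endpoint format, and the check for a `sig` query parameter relies on the fact that SAS tokens carry their signature in that parameter.

```java
import java.net.URI;

/** Hypothetical helper for illustration; not part of the Azure SDK. */
final class DataLakeEndpointSketch {

    /** Builds the account endpoint, for example https://myaccount.dfs.core.windows.net. */
    static String accountEndpoint(String accountName) {
        return "https://" + accountName + ".dfs.core.windows.net";
    }

    /** Appends a SAS token as the query string, tolerating a leading '?'. */
    static String endpointWithSas(String accountName, String sasToken) {
        String token = sasToken.startsWith("?") ? sasToken.substring(1) : sasToken;
        return accountEndpoint(accountName) + "/?" + token;
    }

    /** Returns true when the URL already carries a SAS signature ("sig" query parameter). */
    static boolean hasSasToken(String url) {
        String query = URI.create(url).getRawQuery();
        if (query == null) {
            return false;
        }
        for (String pair : query.split("&")) {
            if (pair.startsWith("sig=")) {
                return true;
            }
        }
        return false;
    }
}
```

If `hasSasToken` returns true for the URL you plan to use, the URL alone would be enough and the separate credential could be omitted; otherwise you would pass the token separately.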
To learn more about generating and managing SAS tokens, see the following article:
You can authorize access to data using your account access keys (Shared Key). This example creates a DataLakeServiceClient instance that is authorized with the account key.
:::code language="java" source="~/azure-storage-snippets/blobs/howto/Java/Java-v12/src/main/java/com/datalake/manage/Authorize_DataLake.java" id="Snippet_AuthorizeWithKey":::
[!INCLUDE storage-shared-key-caution]
A container acts as a file system for your files. You can create a container by using the following method:
The following code example creates a container and returns a DataLakeFileSystemClient object for later use:
:::code language="java" source="~/azure-storage-snippets/blobs/howto/Java/Java-v12/src/main/java/com/datalake/manage/CRUD_DataLake.java" id="Snippet_CreateFileSystem":::
You can create a directory reference in the container by using the following method:
The following code example adds a directory to a container, then adds a subdirectory and returns a DataLakeDirectoryClient object for later use:
:::code language="java" source="~/azure-storage-snippets/blobs/howto/Java/Java-v12/src/main/java/com/datalake/manage/CRUD_DataLake.java" id="Snippet_CreateDirectory":::
You can rename or move a directory by using the following method:
Pass the path of the desired directory as a parameter. The following code example shows how to rename a subdirectory:
:::code language="java" source="~/azure-storage-snippets/blobs/howto/Java/Java-v12/src/main/java/com/datalake/manage/CRUD_DataLake.java" id="Snippet_RenameDirectory":::
The following code example shows how to move a subdirectory from one directory to a different directory:
:::code language="java" source="~/azure-storage-snippets/blobs/howto/Java/Java-v12/src/main/java/com/datalake/manage/CRUD_DataLake.java" id="Snippet_MoveDirectory":::
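Rename and move differ only in how the destination path is built: a rename keeps the parent path and changes the last segment, while a move keeps the last segment and changes the parent. The following standard-library sketch shows that path arithmetic in isolation. The class and method names are hypothetical, and the sketch assumes `/`-separated paths relative to the container, with no leading or trailing slash.

```java
/** Hypothetical helper for illustration; not part of the Azure SDK. */
final class DirectoryPathSketch {

    /** Rename: keep the parent path and replace the last segment. */
    static String renamedPath(String currentPath, String newName) {
        int slash = currentPath.lastIndexOf('/');
        return slash < 0 ? newName : currentPath.substring(0, slash + 1) + newName;
    }

    /** Move: keep the last segment and replace the parent path. */
    static String movedPath(String currentPath, String newParent) {
        String name = currentPath.substring(currentPath.lastIndexOf('/') + 1);
        return newParent.isEmpty() ? name : newParent + "/" + name;
    }
}
```

For example, `renamedPath("my-directory/my-subdirectory", "my-subdirectory-renamed")` yields `my-directory/my-subdirectory-renamed`, and `movedPath("my-directory/my-subdirectory", "my-directory-2")` yields `my-directory-2/my-subdirectory`. Either result is the kind of destination path you would pass to the rename call.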
You can upload content to a new or existing file by using the following method:
The following code example shows how to upload a local file to a directory using the uploadFromFile method:
:::code language="java" source="~/azure-storage-snippets/blobs/howto/Java/Java-v12/src/main/java/com/datalake/manage/CRUD_DataLake.java" id="Snippet_UploadFile":::
You can use this method to create and upload content to a new file, or you can set the overwrite parameter to true to overwrite an existing file.
You can upload data to be appended to a file by using the following method:
The following code example shows how to append data to the end of a file using these steps:
- Create a DataLakeFileClient object to represent the file resource you're working with.
- Upload data to the file using the DataLakeFileClient.append method.
- Complete the upload by calling the DataLakeFileClient.flush method to write the previously uploaded data to the file.
:::code language="java" source="~/azure-storage-snippets/blobs/howto/Java/Java-v12/src/main/java/com/datalake/manage/CRUD_DataLake.java" id="Snippet_AppendDataToFile":::
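The append-then-flush sequence depends on offset bookkeeping: each append is written at the offset where the previous data ended, and the flush position equals the total file length after all appended data. The following standard-library sketch models only that arithmetic. The class and method names are hypothetical; in a real application, the computed values would correspond to the offset passed to DataLakeFileClient.append and the position passed to DataLakeFileClient.flush.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of the Azure SDK. */
final class AppendOffsetSketch {

    /**
     * Returns one {offset, length} pair per chunk, starting at the current
     * file length so that each chunk lands directly after the previous one.
     */
    static List<long[]> planAppends(long existingLength, List<byte[]> chunks) {
        List<long[]> plan = new ArrayList<>();
        long offset = existingLength;
        for (byte[] chunk : chunks) {
            plan.add(new long[] {offset, chunk.length});
            offset += chunk.length;
        }
        return plan;
    }

    /** Returns the flush position: the file length after all chunks are appended. */
    static long flushPosition(long existingLength, List<byte[]> chunks) {
        long position = existingLength;
        for (byte[] chunk : chunks) {
            position += chunk.length;
        }
        return position;
    }
}
```

For two chunks of 10 and 7 bytes appended to an empty file, the plan is offset 0 with length 10, then offset 10 with length 7, followed by a flush at position 17.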
The following code example shows how to download a file from a directory to a local file using these steps:
- Create a DataLakeFileClient object to represent the file that you want to download.
- Use the DataLakeFileClient.readToFile method to read the file. This example sets the overwrite parameter to true, which overwrites an existing file.
:::code language="java" source="~/azure-storage-snippets/blobs/howto/Java/Java-v12/src/main/java/com/datalake/manage/CRUD_DataLake.java" id="Snippet_DownloadFile":::
You can list directory contents by using the following method and enumerating the result:
Enumerating the paths in the result may make multiple requests to the service while fetching the values.
The following code example prints the name of each file in a directory:
:::code language="java" source="~/azure-storage-snippets/blobs/howto/Java/Java-v12/src/main/java/com/datalake/manage/CRUD_DataLake.java" id="Snippet_ListFilesInDirectory":::
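To see why enumerating a listing can trigger multiple service requests, consider a simplified model of a paged result. The following standard-library sketch is not the SDK's implementation: the class name is hypothetical, an in-memory list of pages stands in for the service, and a page index stands in for the continuation mechanism. It shows only that a page is fetched lazily, when the iterator runs out of items from the previous page.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical model of a lazily paged listing; not part of the Azure SDK. */
final class PagedListingSketch implements Iterable<String> {
    private final List<List<String>> pages; // stands in for the remote service
    private int requestCount = 0;           // number of simulated page fetches

    PagedListingSketch(List<List<String>> pages) {
        this.pages = pages;
    }

    int requestCount() {
        return requestCount;
    }

    @Override
    public Iterator<String> iterator() {
        return new Iterator<String>() {
            private int nextPage = 0;
            private Iterator<String> current = Collections.emptyIterator();

            @Override
            public boolean hasNext() {
                // Fetch the next page only when the current one is exhausted.
                while (!current.hasNext() && nextPage < pages.size()) {
                    current = pages.get(nextPage++).iterator();
                    requestCount++;
                }
                return current.hasNext();
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current.next();
            }
        };
    }
}
```

In this model, no request is made until iteration starts, and iterating over a listing that spans two pages results in two simulated requests.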
You can delete a directory by using one of the following methods:
- DataLakeDirectoryClient.delete
- DataLakeDirectoryClient.deleteIfExists
- DataLakeDirectoryClient.deleteWithResponse
The following code example uses deleteWithResponse to delete a nonempty directory and all paths beneath the directory:
:::code language="java" source="~/azure-storage-snippets/blobs/howto/Java/Java-v12/src/main/java/com/datalake/manage/CRUD_DataLake.java" id="Snippet_DeleteDirectory":::
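The effect of deleting a nonempty directory together with all paths beneath it can be modeled with a small in-memory sketch. This is a simplified illustration using only the Java standard library, not the service's behavior in full: the class name and the exception type are hypothetical, and the model assumes that a nonrecursive delete of a nonempty directory is rejected. Note that matching on the prefix `directory + "/"` avoids removing sibling paths that merely share a name prefix.

```java
import java.util.TreeSet;

/** Hypothetical in-memory model of recursive delete; not part of the Azure SDK. */
final class RecursiveDeleteSketch {
    private final TreeSet<String> paths = new TreeSet<>();

    void add(String path) {
        paths.add(path);
    }

    boolean exists(String path) {
        return paths.contains(path);
    }

    /**
     * Deletes a directory. A nonempty directory is removed, along with
     * everything beneath it, only when recursive is true.
     */
    boolean delete(String directory, boolean recursive) {
        String prefix = directory + "/";
        boolean hasChildren = paths.stream().anyMatch(p -> p.startsWith(prefix));
        if (hasChildren && !recursive) {
            throw new IllegalStateException("Directory is not empty: " + directory);
        }
        paths.removeIf(p -> p.startsWith(prefix));
        return paths.remove(directory);
    }
}
```

With paths `dir`, `dir/sub`, `dir/sub/file.txt`, and `dir2`, a recursive delete of `dir` removes the first three and leaves `dir2` in place.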