
services: data-factory
platforms: python
author: spelluru

Sample: copy data from one folder to another in Azure Blob Storage

In this sample, you perform the following steps by using the Python SDK (step 1 is sketched just after this list):

  1. Create a data factory.
  2. Create a linked service to link your Azure Storage account to the data factory.
  3. Create a dataset that represents input/output data used by the copy activity.
  4. Create a pipeline with a copy activity that copies data.
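
For orientation, step 1 might look like the sketch below, assuming the classic (0.x) azure-mgmt-datafactory SDK that this sample targets; the region 'eastus' is an example value. Steps 2 through 4 are sketched after the placeholder values later in this README.

from azure.common.credentials import ServicePrincipalCredentials
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import Factory

# Authenticate as the Azure AD application created in the prerequisites.
credentials = ServicePrincipalCredentials(
    client_id='<AAD client ID>',
    secret='<AAD app authentication key>',
    tenant='<AAD tenant ID>')
adf_client = DataFactoryManagementClient(credentials, '<Azure subscription ID>')

# Step 1: create the data factory in an existing resource group.
df = adf_client.factories.create_or_update(
    '<Azure resource group name>', '<Data factory name>',
    Factory(location='eastus'))
print(df.provisioning_state)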

Prerequisites

  • Azure subscription. If you don't have a subscription, you can create a free trial account.
  • Azure Storage account. You use Blob storage as both the source and sink data store. If you don't have an Azure Storage account, see the Create a storage account article for steps to create one.
  • Create an application in Azure Active Directory by following these instructions. Make note of the following values, which you use in later steps: application ID, authentication key, and tenant ID. Assign the application to the "Contributor" role by following the instructions in the same article.
  • Create a blob container in Blob storage, create an input folder in the container, and upload some files to the folder. The sample code uses inputpy and outputpy as the input and output folder names; if you use different folders, update these values in the source code. (A sketch for setting up the container follows this list.)
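
One way to set this up programmatically is sketched below, assuming the separate azure-storage-blob package (v12 or later, installed with pip install azure-storage-blob); the container name 'adfv2tutorial' and the file contents are illustrative, not part of the sample:

from azure.storage.blob import BlobServiceClient

conn_str = ('DefaultEndpointsProtocol=https;AccountName=<Azure storage account>;'
            'AccountKey=<Azure storage authentication key>')
service = BlobServiceClient.from_connection_string(conn_str)

# Create the container. Blob storage has no true folders; a name prefix
# such as 'inputpy/' acts as the input folder.
container = service.create_container('adfv2tutorial')
container.upload_blob('inputpy/input.txt', b'Hello Data Factory!')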

Install the Python packages

  1. Open a terminal or command prompt with administrator privileges. 

  2. Install the Python package for Azure resource management:

    pip install azure-mgmt-resource
    
  3. To install the Python package for Data Factory, run the following command:

    pip install azure-mgmt-datafactory
    

    The Python SDK for Data Factory supports Python 2.7, 3.3, 3.4, 3.5, and 3.6.
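
To confirm that both packages installed correctly, one quick check is to import them:

    python -c "import azure.mgmt.resource, azure.mgmt.datafactory; print('SDK packages found')"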

Set values for placeholders

In the source code, replace the following placeholders with your own values:

# Imports needed by the snippets in this README (classic SDK; ServicePrincipalCredentials
# ships with the azure-common package pulled in by azure-mgmt-resource)
from azure.common.credentials import ServicePrincipalCredentials
from azure.mgmt.datafactory.models import SecureString

# Specify your Azure subscription ID
subscription_id = '<Azure subscription ID>'

# Specify a name for the Azure resource group.
rg_name = '<Azure resource group name>'

# Specify a name for the data factory. It must be globally unique.
df_name = '<Data factory name>'

# Specify your Active Directory application ID, application authentication key, and tenant ID
credentials = ServicePrincipalCredentials(
    client_id='<AAD client ID>',
    secret='<AAD app authentication key>',
    tenant='<AAD tenant ID>')

# Specify your Azure storage account name and key
storage_string = SecureString(
    'DefaultEndpointsProtocol=https;AccountName=<Azure storage account>;'
    'AccountKey=<Azure storage authentication key>')
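
With these values set, the rest of the script walks through steps 2 through 4 from the list above. The following is a condensed sketch, again assuming the classic (0.x) SDK; the resource names 'storageLinkedService', 'ds_in', 'ds_out', 'copyPipeline', and the '<container>' path are illustrative:

from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    AzureStorageLinkedService, LinkedServiceReference, AzureBlobDataset,
    DatasetReference, CopyActivity, BlobSource, BlobSink, PipelineResource)

adf_client = DataFactoryManagementClient(credentials, subscription_id)

# Step 2: linked service that points the data factory at the storage account.
adf_client.linked_services.create_or_update(
    rg_name, df_name, 'storageLinkedService',
    AzureStorageLinkedService(connection_string=storage_string))

# Step 3: datasets over the input and output blob folders.
ds_ls = LinkedServiceReference('storageLinkedService')
adf_client.datasets.create_or_update(
    rg_name, df_name, 'ds_in',
    AzureBlobDataset(ds_ls, folder_path='<container>/inputpy',
                     file_name='<file name>'))
adf_client.datasets.create_or_update(
    rg_name, df_name, 'ds_out',
    AzureBlobDataset(ds_ls, folder_path='<container>/outputpy'))

# Step 4: pipeline with a single copy activity, then trigger a run.
copy = CopyActivity('copyBlobToBlob',
                    inputs=[DatasetReference('ds_in')],
                    outputs=[DatasetReference('ds_out')],
                    source=BlobSource(), sink=BlobSink())
adf_client.pipelines.create_or_update(
    rg_name, df_name, 'copyPipeline', PipelineResource(activities=[copy]))
run = adf_client.pipelines.create_run(rg_name, df_name, 'copyPipeline', {})
print(run.run_id)

The run ID printed at the end can be passed to adf_client.pipeline_runs.get to poll the status of the copy.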

See Also

For step-by-step instructions to create this sample from scratch, see Quickstart: Create a data factory and pipeline using Python.
