The main goal of this repository is to demonstrate how you can use Azure Functions to ingest data into your Azure Data Lake.
- Terraform (for infra setup)
- Azure CLI (for infra setup)
- VS Code with the Azure Functions extension
- Python 3.8
Note: You could also use the Azure Functions Core Tools for local testing, but many people aren't allowed to install them on their company machines.
The `requests` and `azure.storage.filedatalake` libraries are used to read from the API and write the data to the Data Lake.
For more information, please check the ingestion code.
Note: The Azure Functions context is read-only, so the `io` library is used to handle the data in memory.
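A minimal sketch of that flow, assuming a helper shaped roughly like the one below; the function name, parameters, and the `raw/...` path scheme are illustrative, inferred from the sample response further down, not the repository's exact code:

```python
import io
import json
import time

import requests
from azure.storage.filedatalake import DataLakeServiceClient


def ingest(api_name: str, url: str,
           service_client: DataLakeServiceClient, container: str) -> str:
    # Read the payload from the source API.
    data = requests.get(url).json()

    # The Functions context is read-only, so keep the payload in memory.
    buffer = io.BytesIO(json.dumps(data).encode("utf-8"))

    # Illustrative path scheme matching the sample response:
    # raw/<api_name>/<api_name>_<epoch-timestamp>.json
    file_path = f"raw/{api_name}/{api_name}_{int(time.time())}.json"

    file_client = (service_client
                   .get_file_system_client(container)
                   .get_file_client(file_path))
    file_client.upload_data(buffer, overwrite=True)
    return file_path
```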
The Azure Function exposes an endpoint that receives requests and triggers the function; this is a good way to integrate with Azure Data Factory.
GET request

```
GET <Az-Function-endpoint>
```

Response

```json
{
    "docs": "check how to use on https://github.com/otacilio-psf/azure-functions-ingestion#sample-request"
}
```
POST request

```
POST <Az-Function-endpoint>
```

Body

```json
{
    "api_name": "nubank",
    "url": "https://dadosabertos.nubank.com.br/taxasCartoes/itens"
}
```

*`api_name` must be path-like; it is used to build the folder and file names.

Response

```json
{
    "status": "success",
    "message": "raw/nubank/nubank_1662750895.json"
}
```
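Outside Azure Data Factory, you can trigger the function from any HTTP client. A minimal sketch using `requests`; the endpoint URL is a placeholder, and depending on the function's auth level you may also need to append a `?code=<function-key>` query parameter:

```python
import requests

# Placeholder: replace with your deployed Function endpoint.
ENDPOINT = "https://<function-app-name>.azurewebsites.net/api/<function-name>"

payload = {
    "api_name": "nubank",  # path-like; used to build the folder and file name
    "url": "https://dadosabertos.nubank.com.br/taxasCartoes/itens",
}

response = requests.post(ENDPOINT, json=payload)
response.raise_for_status()
print(response.json())  # e.g. {"status": "success", "message": "raw/nubank/..."}
```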
```
git clone https://github.com/otacilio-psf/azure-functions-ingestion
cd azure-functions-ingestion
```
Follow the README inside the IaC folder.
The following requirements are expected and created by the IaC routine:
- Service Principal with Storage Blob Data Contributor on your Data Lake
- Azure Functions resource
- Tenant ID, SP ID, SP secret, Storage Account name, and Container name as environment variables
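A sketch of how those environment variables typically feed a service-principal login for the Data Lake client; the exact variable names here are an assumption, so check the IaC README for the ones it actually exports:

```python
import os

from azure.identity import ClientSecretCredential
from azure.storage.filedatalake import DataLakeServiceClient

# Assumed variable names; the IaC routine defines the real ones.
credential = ClientSecretCredential(
    tenant_id=os.environ["TENANT_ID"],
    client_id=os.environ["SP_ID"],
    client_secret=os.environ["SP_SECRET"],
)

service_client = DataLakeServiceClient(
    account_url=f"https://{os.environ['STORAGE_ACC_NAME']}.dfs.core.windows.net",
    credential=credential,
)
```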
- Open VS Code in the Functions folder
- Click on the Azure extension > Local Project > Deploy
- Follow the options:
    - Log in if you haven't already
    - Select the Function App
    - Confirm the deployment
We can use GitHub Actions for continuous delivery of the Azure Function; please check the docs.