Running Selenium web scraping in containerized Azure Functions
Tech Stack: .NET Core 3.1 | Docker | Terraform | Azure Functions with Containers on Linux | Azure DevOps
For local development and testing.
Note: This code is developed and run locally on a Windows OS, but containerized in a Linux OS.
If pulling the code from the repository, create a local.settings.json file in the root directory.
Add the following block to the local.settings.json file:
{
  "IsEncrypted": false,
  "Values": {
    "FUNCTIONS_WORKER_RUNTIME": "dotnet",
    "AzureWebJobsStorage": "ConnectionString to your storage account",
    "ENVIRONMENT": "local"
  }
}
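As a convenience, the settings file above could be generated from a shell. The following is a minimal sketch, assuming a POSIX shell (for example Git Bash or WSL on Windows); `write_local_settings` is a hypothetical helper and is not part of the repository.

```shell
# Hypothetical helper: writes local.settings.json unless it already exists.
# Assumption: the connection string contains no double quotes or backslashes,
# since it is pasted into the JSON without escaping.
write_local_settings() (
  conn="$1"
  target="${2:-local.settings.json}"
  if [ -z "$conn" ]; then
    echo "usage: write_local_settings <connection-string> [target-file]" >&2
    return 2
  fi
  if [ -e "$target" ]; then
    echo "$target already exists; leaving it untouched" >&2
    return 0
  fi
  cat > "$target" <<EOF
{
  "IsEncrypted": false,
  "Values": {
    "FUNCTIONS_WORKER_RUNTIME": "dotnet",
    "AzureWebJobsStorage": "$conn",
    "ENVIRONMENT": "local"
  }
}
EOF
)

# Example call (placeholder value, replace with your own connection string):
#   write_local_settings "<StorageAccountConnectionString>"
```

The helper refuses to overwrite an existing file, so a locally edited local.settings.json is not lost when the command is repeated.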
Update the appConfig.json file with your desired search query parameters and the target site to look for in the search results.
For local development and testing
All infrastructure code resides in the Terraform directory.
Create an envVariables.tfvars file in the root directory with the following variables:
ARM_CLIENT_ID="<YourPrincipalAppID>"
ARM_CLIENT_SECRET="<YourAppsSecret>"
ARM_SUBSCRIPTION_ID="<SubscriptionID>"
ARM_TENANT_ID="<AzureActiveDirectoryTenantID>"
cd .\Terraform\
terraform init
terraform plan -var-file="envVariables.tfvars"
terraform apply -var-file="envVariables.tfvars"
terraform destroy -var-file="envVariables.tfvars"
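Before running the Terraform commands above, it can help to confirm that the variables file defines all four credentials. The following is a minimal sketch, assuming a POSIX shell rather than the PowerShell prompt used above; `check_tfvars` is a hypothetical helper, not something the repository provides.

```shell
# Hypothetical helper: reports which of the four ARM_* variables are
# missing from a .tfvars file. Returns 0 only when all four are assigned.
check_tfvars() (
  file="$1"
  missing=0
  for name in ARM_CLIENT_ID ARM_CLIENT_SECRET ARM_SUBSCRIPTION_ID ARM_TENANT_ID; do
    # -s keeps grep quiet when the file does not exist; the variable is
    # then reported as missing like any other.
    if ! grep -Eqs "^[[:space:]]*${name}[[:space:]]*=" "$file"; then
      echo "missing in $file: $name" >&2
      missing=1
    fi
  done
  return "$missing"
)

# Example: only plan when the file is complete.
#   check_tfvars envVariables.tfvars && terraform plan -var-file="envVariables.tfvars"
```

The check only looks for an assignment line per name; it does not validate the values themselves.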
Using Docker Desktop for local environment
- Build the image:
docker build --build-arg StorageConnectionString="<StorageAccountConnectionString>" -t seo-bot:latest .
- Launch from the built image:
docker run -d --name seo-bot -p 80:80 seo-bot:latest
- Check the status:
docker ps
- Check logs:
docker logs seo-bot
- Remote shell into the running container:
docker exec -it seo-bot /bin/bash
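A container started with `docker run -d` may need a few seconds before it responds. The following is a minimal sketch of a retry loop, assuming a POSIX shell with curl installed and assuming the Functions host answers HTTP on the mapped port 80; `wait_for` is a hypothetical helper, not part of the repository.

```shell
# Hypothetical helper: retries a command until it succeeds.
# Usage: wait_for <attempts> <delay-seconds> <command> [args...]
# Returns 0 on the first success, 1 if every attempt fails.
wait_for() (
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Output of the probed command is discarded; only its exit status matters.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
)

# Example (assumes the seo-bot container from the steps above is running):
#   wait_for 15 2 curl -fsS http://localhost:80/ || docker logs seo-bot
```

If the probe never succeeds, the example falls through to `docker logs seo-bot`, which is usually the quickest way to see why the host did not start.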
The azure-pipelines.yml file is used to configure and run the pipeline from the GitHub source to Azure.
- Before creating the pipeline, create a new Azure Active Directory Service Principal following this guide. Use the option to authenticate using a secret.
- Create an Azure DevOps project and set up a Service Connection of type Azure Resource Manager to connect to your Azure subscription. Use the Service Principal Client ID, secret, and Azure Subscription details created in the first step.
- The pipeline relies on a Service Connection of type Docker Registry to connect to the container registry that stores the image. In my example, Terraform creates the container registry, but the pipeline cannot run successfully until the required Azure Container Registry exists. In my case, I use Terraform locally to create the initial Azure resources and then create a Service Connection to the Azure Container Registry.
  NOTE: If the Azure Container Registry is ever removed and recreated, a new Service Connection is required to re-establish credentials.
- Update the azure-pipelines.yml variables azureServiceConnection and dockerRegistryServiceConnection to match the names defined when creating the Service Connections.
- Update the Terraform variables TF_* to match the shared state file stored in the Azure Storage Account.
Follow the new pipeline creation process to connect to your repository and select the existing azure-pipelines.yml file.
Before saving the pipeline, create four new variables to store your Azure Active Directory Service Principal credentials:
ARM_CLIENT_ID="<YourPrincipalAppID>"
ARM_CLIENT_SECRET="<YourAppsSecret>"
ARM_SUBSCRIPTION_ID="<SubscriptionID>"
ARM_TENANT_ID="<AzureActiveDirectoryTenantID>"
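A pipeline script step can fail fast when one of these four variables did not reach the job. The following is a minimal sketch, assuming a POSIX shell on the build agent; `require_arm_vars` is a hypothetical helper and does not appear in the repository's azure-pipelines.yml. Note that Azure Pipelines does not expose secret variables to scripts automatically, so a secret such as ARM_CLIENT_SECRET has to be mapped explicitly in the step's `env:` section.

```shell
# Hypothetical guard: succeeds only if all four ARM_* variables are set
# and non-empty in the current shell.
require_arm_vars() (
  missing=0
  for name in ARM_CLIENT_ID ARM_CLIENT_SECRET ARM_SUBSCRIPTION_ID ARM_TENANT_ID; do
    # Indirect lookup of the variable named by $name; the names come from
    # the fixed list above, so eval never sees external input.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
)

# Example: stop the step before Terraform runs with incomplete credentials.
#   require_arm_vars || exit 1
```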
Save and run the pipeline.
Navigate to the function app created by Terraform. From Deployment > Deployment Center, configure the Container Registry settings for a Single Container.