nkcrawler is a containerized batch application for collecting data from netkeiba.com. It depends on nkparser, a Python library.
This application executes the following steps:
- Download the SQLite database file from Azure Blob Storage, if it already exists there.
- Collect netkeiba.com data for the year defined by the YEAR environment variable.
- Insert the collected data into the SQLite database.
- Upload the SQLite database file back to Azure Blob Storage.
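The steps above can be sketched in Python. This is a minimal illustration, not the actual run.py: the container name, database file name, and table schema are assumptions, and the nkparser collection step is left as a comment.

```python
import os
import sqlite3

DB_PATH = "netkeiba.db"   # assumed local database file name
CONTAINER = "nkcrawler"   # assumed blob container name

def download_if_exists(blob_client, path):
    """Step 1: fetch the existing database file from Blob Storage, if any."""
    if blob_client.exists():
        with open(path, "wb") as f:
            f.write(blob_client.download_blob().readall())

def insert_rows(conn, rows):
    """Step 3: insert collected entries into SQLite (hypothetical schema)."""
    conn.execute("CREATE TABLE IF NOT EXISTS race (race_id TEXT PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT OR REPLACE INTO race VALUES (?, ?)", rows)
    conn.commit()

def main():
    # Lazy import so the sketch can be read without azure-storage-blob installed.
    from azure.storage.blob import BlobServiceClient
    service = BlobServiceClient.from_connection_string(os.environ["CONNECTION_STRING"])
    blob = service.get_blob_client(container=CONTAINER, blob=DB_PATH)

    download_if_exists(blob, DB_PATH)                  # step 1
    rows = []  # step 2: collect data for os.environ["YEAR"] via nkparser
    conn = sqlite3.connect(DB_PATH)
    insert_rows(conn, rows)                            # step 3
    conn.close()
    with open(DB_PATH, "rb") as f:                     # step 4
        blob.upload_blob(f, overwrite=True)

if __name__ == "__main__":
    main()
```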
RECOMMENDED: Run the application in a detached session such as tmux, because execution takes a long time (over 10 hours per month of data).
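For example, with tmux the crawl can keep running after you disconnect (the session name here is arbitrary):

```shell
$ tmux new -s nkcrawler        # start a named session, then run python run.py inside it
$ tmux attach -t nkcrawler     # reattach later; detach with Ctrl-b d
```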
$ export CONNECTION_STRING="YOUR CONNECTION STRING"
$ pip install -U pip
$ pip install -r requirements.txt
$ python run.py
$ docker build -t nkcrawler .
$ docker run --rm -e CONNECTION_STRING=${CONNECTION_STRING} -it nkcrawler
- Push Image to Azure Container Registry
After running the commands below, you can see the pushed container image under Azure Container Registry > Repositories.
$ ACR_USERNAME="AZURE CONTAINER REGISTRY USERNAME"
$ ACR_PASSWORD="AZURE CONTAINER REGISTRY PASSWORD"
$ ACR_REGISTRY="YOUR AZURE CONTAINER REGISTRY LOGIN SERVER"
$ docker login -u ${ACR_USERNAME} -p ${ACR_PASSWORD} ${ACR_REGISTRY}
$ docker tag nkcrawler ${ACR_REGISTRY}/nkcrawler
$ docker push ${ACR_REGISTRY}/nkcrawler
- Run Container on Azure Container Instances
Open Azure Container Instances in the Azure portal and select Create. In the Advanced tab, set the restart policy to None, and set the CONNECTION_STRING and YEAR environment variables. For details, see "Set environment variables in container instances" in the Azure documentation.
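If you prefer the CLI over the portal, the same instance can be created with az container create; the resource group name is a placeholder, and note that the CLI spells the no-restart policy as Never:

```shell
$ az container create \
    --resource-group myResourceGroup \
    --name nkcrawler \
    --image ${ACR_REGISTRY}/nkcrawler \
    --registry-username ${ACR_USERNAME} \
    --registry-password ${ACR_PASSWORD} \
    --restart-policy Never \
    --secure-environment-variables CONNECTION_STRING=${CONNECTION_STRING} \
    --environment-variables YEAR=2023
```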