Skip to content

nkcrawler is a container batch application for collecting netkeiba.com data. nkcrawler depend on nkparser which is a python library.

License

Notifications You must be signed in to change notification settings

new-village/nkcrawler

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

17 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

nkcrawler

nkcrawler is a container batch application for collecting netkeiba.com data. nkcrawler depend on nkparser which is a python library.

Description

This application execute below:

  1. Download SQLite database file from Azure Blob Storage, if the database file is exists on Azure.
  2. Collect netkeiba.com data by defined year of environment variables.
  3. Insert collected data to SQLite Database.
  4. Upload SQLite database file to Azure Blob Storage.

Usage

RECCOMEND: You shoud use async execution like tmux due to long time execution (over 10 hours per month).

$ export CONNECTION_STRING="YOUR CONNECTION STRING"
  
$ pip install -U pip
$ pip install -r requirements.txt
$ python run.py

Docker Container

$ docker build -t nkcrawler .
$ docker run --rm -e CONNECTION_STRING=${CONNECTION_STRING} -it nkcrawler

Azure Docker Registry

  1. Push Image to Azure Container Registry
    You can see pushed contaier image on Azure Container Registry > Repository after below commands.
$ ACR_USERNAME="AZURE CONTAINER REGISTRY USERNAME"
$ ACR_PASSWORD="AZURE CONTAINER REGISTRY PASSWORD"
$ ACR_REGISTRY="YOUR AZURE CONTAINER REGISTRY REGISTRY"
  
$ docker login -u ${ACR_USERNAME} -p ${ACR_PASSWORD} ${ACR_REGISTRY}
$ docker tag nkcrawler ${ACR_REGISTRY}/nkcrawler
$ docker push ${ACR_REGISTRY}/nkcrawler
  1. Run Container by Azure Container Instance Access to Azure Container Instance and then you select Create. See Set environment variables in container instances
    In Advance Tab, you might set restart policy to None. and you also set CONNECTION_STRING and YEAR environment variables.

About

nkcrawler is a container batch application for collecting netkeiba.com data. nkcrawler depend on nkparser which is a python library.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published