
hurshd0/train_ml_with_github_actions


A demo showcasing a simple MLOps workflow with GitHub Actions

Follow the instructions below to try it out 👇

  1. Fork this repo 🍴

  2. Sign in to your AWS account, create an S3 bucket (in N. Virginia, us-east-1), and create some folders in it

Follow this guide if you're not sure how: How do I create an S3 Bucket?

It should look exactly like this 👇
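If you'd rather script this step once the AWS CLI is installed (step 4), a rough equivalent is below; your-titanic-bucket is a placeholder name, since S3 bucket names are globally unique:

$ aws s3 mb s3://your-titanic-bucket --region us-east-1
$ aws s3api put-object --bucket your-titanic-bucket --key data/

The second command creates an empty data/ key, which the S3 console displays as a folder.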

  3. Go to the IAM console, create AWS access keys, and store them in a safe place

How do I set up an IAM user and sign in to the AWS Management Console using IAM credentials?

How do I create an access key for an existing IAM user?

Some tips:

  4. Install the AWS CLI. I'm using WSL2 on Windows, so I ran python -m pip install --user awscli to install it as a global package

For more detailed instructions, see https://github.com/aws/aws-cli
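For example, to install and verify from a terminal:

$ python -m pip install --user awscli
$ aws --version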

  5. Configure AWS credentials
$ aws configure
AWS Access Key ID: MYACCESSKEY
AWS Secret Access Key: MYSECRETKEY
Default region name [us-east-1]: us-east-1
Default output format [None]: json
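To sanity-check that the credentials are picked up, you can ask AWS who you are:

$ aws sts get-caller-identity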
  6. Download the raw dataset

a. Dataset: https://titanic-model.s3.amazonaws.com/raw_titanic.csv

b. Create a folder called data inside titanic_model.

The project structure should then look like the following (a command-line sketch for this step appears after the tree):

.
├── notebooks
└── titanic_model
    ├── data
    ├── config
    ├── processing
    └── trained_model_artifacts

5 directories
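A command-line sketch of steps a and b; the destination filename inside data is an assumption:

$ mkdir -p titanic_model/data
$ curl -o titanic_model/data/raw_titanic.csv https://titanic-model.s3.amazonaws.com/raw_titanic.csv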
  7. Install package dependencies and run the project locally to verify that it works

Pre-requisites

  • Python 3
  • Conda [Optional, but recommended]

a. If you have conda, then install pipenv via conda install pipenv; if you don't, just do pip install pipenv

b. To install the dependencies, run pipenv install

c. To activate the virtual environment, run pipenv shell

d. cd into titanic_model and run dvc remote add data, which adds the data folder so it can be tracked by DVC (see the consolidated command sketch after step e)

NOTE: If you have any issues visit: https://dvc.org/doc/user-guide/external-dependencies

e. Run tox to train the ML model and generate reports; the pickled model is saved in titanic_model/trained_model_artifacts
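Putting steps a-e together, a rough terminal session might look like this; the DVC remote URL is a placeholder for the S3 bucket you created in step 2:

$ pipenv install
$ pipenv shell
$ cd titanic_model
$ dvc remote add -d data s3://your-titanic-bucket/data
$ tox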

  8. Check out a branch and test out a different ML model via git checkout -b random_forest (a terminal sketch of steps 8-12 follows the list)

  9. Add an ML classifier of your choice to titanic_model/pipeline.py

  10. Add your AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY to GitHub Secrets

  11. Create a Pull Request to master

  12. Go get a sip of ☕ while your model trains. Once training is completed, it should look like this
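From the terminal, steps 8-12 might look roughly like this; the secrets can also be added with the optional GitHub CLI instead of the repo's Settings page:

$ git checkout -b random_forest
# edit titanic_model/pipeline.py to swap in your classifier, then:
$ git add titanic_model/pipeline.py
$ git commit -m "Try a different classifier"
$ git push origin random_forest
$ gh secret set AWS_ACCESS_KEY_ID        # optional: GitHub CLI alternative to the web UI
$ gh secret set AWS_SECRET_ACCESS_KEY

Finally, open a Pull Request against master on GitHub and the Actions workflow will train the model.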

