
hurshd0/train_ml_with_github_actions


A demo showcasing a simple MLOps workflow with GitHub Actions

Follow the instructions below to try it out 👇

  1. Fork this repo 🍴

  2. Sign in to your AWS account, create an S3 bucket (in N. Virginia, us-east-1), and create some folders in it

Follow this guide if you're not sure how: How do I create an S3 Bucket?

It should look exactly like this 👇
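If you'd rather script this step once the AWS CLI is installed (step 4), a rough equivalent is below; your-titanic-bucket is a placeholder name, since S3 bucket names are globally unique:

$ aws s3 mb s3://your-titanic-bucket --region us-east-1
$ aws s3api put-object --bucket your-titanic-bucket --key data/

The second command creates an empty data/ key, which the S3 console displays as a folder.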

  3. Go to the IAM console, create AWS access keys, and store them in a safe place

How do I set up an IAM user and sign in to the AWS Management Console using IAM credentials?

How do I create an access key for an existing IAM user?

Some tips:

  4. Install the AWS CLI. I'm using WSL2 on Windows, so I ran python -m pip install --user awscli to install it as a global package

For more detailed instructions, see https://github.com/aws/aws-cli
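For example, to install and verify from a terminal:

$ python -m pip install --user awscli
$ aws --version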

  5. Configure AWS credentials
$ aws configure
AWS Access Key ID: MYACCESSKEY
AWS Secret Access Key: MYSECRETKEY
Default region name [us-east-1]: us-east-1
Default output format [None]: json
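To sanity-check that the credentials are picked up, you can ask AWS who you are:

$ aws sts get-caller-identity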
  6. Download the raw dataset

a. Dataset: https://titanic-model.s3.amazonaws.com/raw_titanic.csv

b. Create a folder called data inside titanic_model.

The project structure should then look like the following (a command-line sketch for this step appears after the tree):

.
├── notebooks
└── titanic_model
    ├── data
    ├── config
    ├── processing
    └── trained_model_artifacts

5 directories
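A command-line sketch of steps a and b; the destination filename inside data is an assumption:

$ mkdir -p titanic_model/data
$ curl -o titanic_model/data/raw_titanic.csv https://titanic-model.s3.amazonaws.com/raw_titanic.csv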
  7. Install package dependencies and run the project locally to verify that it works

Pre-requisites

  • Python 3
  • Conda [Optional, but recommended]

a. If you have conda, then install pipenv via conda install pipenv; if you don't, just do pip install pipenv

b. To install the dependencies, run pipenv install

c. To activate the virtual environment, run pipenv shell

d. cd into titanic_model and run dvc remote add data, which adds the data folder so it can be tracked by DVC (see the consolidated command sketch after step e)

NOTE: If you have any issues visit: https://dvc.org/doc/user-guide/external-dependencies

e. Run tox to train the ML model and generate reports; the pickled model is saved in titanic_model/trained_model_artifacts
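Putting steps a-e together, a rough terminal session might look like this; the DVC remote URL is a placeholder for the S3 bucket you created in step 2:

$ pipenv install
$ pipenv shell
$ cd titanic_model
$ dvc remote add -d data s3://your-titanic-bucket/data
$ tox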

  8. Check out a branch and test out a different ML model via git checkout -b random_forest (a terminal sketch of steps 8-12 follows the list)

  9. Add an ML classifier of your choice to titanic_model/pipeline.py

  10. Add your AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY to GitHub Secrets

  11. Create a Pull Request to master

  12. Go get a sip of ☕ while your model trains. Once training is completed, it should look like this
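From the terminal, steps 8-12 might look roughly like this; the secrets can also be added with the optional GitHub CLI instead of the repo's Settings page:

$ git checkout -b random_forest
# edit titanic_model/pipeline.py to swap in your classifier, then:
$ git add titanic_model/pipeline.py
$ git commit -m "Try a different classifier"
$ git push origin random_forest
$ gh secret set AWS_ACCESS_KEY_ID        # optional: GitHub CLI alternative to the web UI
$ gh secret set AWS_SECRET_ACCESS_KEY

Finally, open a Pull Request against master on GitHub and the Actions workflow will train the model.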

