mustafamadraswala/simple-dvc-demo

Open Git Bash in the project folder, launch VS Code, and activate the conda base environment -

code .
source C:/Users/Owner/anaconda3/Scripts/activate base

Create environment -

conda create -n liveapp python=3.8.8 -y

Activate environment -

conda activate liveapp

Create and install requirements -

touch requirements.txt
pip install -r requirements.txt

Create README.md and add the commands to it -

touch README.md
history

Copy and paste the commands from the history output into the README.md file.

Create the Python file and add the code according to the video - https://www.youtube.com/watch?v=n4sz9cG_B7k&list=PLZoTAELRMXVOk1pRcOCaG5xtXxgMalpIe&index=5

touch template.py
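The video's template.py is not reproduced in this README. As a rough sketch of what a scaffolding script of this kind might do, the following creates the folders and empty files that the later steps refer to. The exact layout and the `scaffold` helper name are assumptions based on paths mentioned in this README, not the video's code.

```python
from pathlib import Path

# Assumed layout, taken from paths mentioned in this README; adjust to match the video.
DIRS = ["data_given", "data/raw", "src", "report"]
FILES = ["params.yaml", "dvc.yaml", "src/__init__.py"]


def scaffold(root="."):
    """Create the project folders and empty placeholder files under root."""
    root = Path(root)
    for d in DIRS:
        (root / d).mkdir(parents=True, exist_ok=True)
    for f in FILES:
        path = root / f
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch(exist_ok=True)  # contents of existing files are left as they are
    return root
```

Calling `scaffold()` from the project root would create the structure in the current folder. Running it twice is harmless because existing folders and files are kept.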

Create the data folder and paste the winequality.csv file into it -

mkdir data_given

Download the data from https://drive.google.com/drive/folders/1xw0XX-WK74uxtFFLySbtnX-ODdmdK5Ec

git init
dvc init
dvc add data_given/winequality.csv # Track the data file and all its changes with DVC
git add . #To add the data in the staging area
git config user.email "museychamp@gmail.com"
git config user.name "mustafamadraswala"
git commit -m "first commit"
git add . && git commit -m "update README.md"

Create a repo on GitHub named simple-dvc-demo -

Push an existing repository from the command line -

git remote add origin https://github.com/mustafamadraswala/simple-dvc-demo.git
git branch -M main
git push -u origin main

link - https://github.com/mustafamadraswala/simple-dvc-demo

If the README is updated, commit the change -

git add . && git commit -m "update README.md"

And then push it to the repository -

git push origin main

Add the code for params.yaml, then commit and push -

git add . && git commit -m "params added"
git push origin main

Create src/get_data.py to get the data - add the code according to the video

Create src/load_data.py to load the data - add the code according to the video
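The code for these two scripts comes from the video and is not included in this README. As a rough sketch of a load step, the function below reads the given CSV and writes a copy to another location, replacing spaces in the column names with underscores. The name `load_and_save`, the column cleanup, and the use of the standard csv module are illustrative assumptions; the video's version may differ.

```python
import csv
from pathlib import Path


def load_and_save(source_path, raw_path):
    """Copy a CSV to raw_path, replacing spaces in header names with underscores.

    Hypothetical helper for illustration only.
    """
    raw_path = Path(raw_path)
    raw_path.parent.mkdir(parents=True, exist_ok=True)
    with open(source_path, newline="") as src, open(raw_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        header = next(reader)
        writer.writerow([name.strip().replace(" ", "_") for name in header])
        writer.writerows(reader)  # data rows are copied unchanged
    return raw_path
```

With the paths used in this README, a call might look like `load_and_save("data_given/winequality.csv", "data/raw/winequality.csv")`.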

In dvc.yaml, add stage 1, then stop tracking the raw data file in Git -

git rm -r --cached 'data/raw/winequality.csv' # Stop tracking in Git

Run the pipeline, then commit and push to GitHub -

dvc repro # Runs all the stages
git add . && git commit -m "stage 1 added"
git push origin main # Push to GitHub

Add stage 2 to dvc.yaml, then stop tracking the raw data file in Git -

git rm -r --cached 'data/raw/winequality.csv' # Stop tracking in Git

Run the pipeline, then commit and push -

dvc repro # Runs all the stages
git add . && git commit -m "stage 2 added"
git push origin main

Create a report directory and, inside it, the files params.json and scores.json.
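These two files are presumably what `dvc metrics diff` compares later on. A minimal sketch of how a training script might fill them, assuming RMSE, MAE and R² as the scores: the metric names and the `eval_metrics` and `write_report` helpers are assumptions, not taken from the video.

```python
import json
import math
from pathlib import Path


def eval_metrics(actual, predicted):
    """Return RMSE, MAE and R^2 for two equal-length sequences of numbers.

    Assumes the actual values are not all identical (otherwise R^2 is undefined).
    """
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    ss_res = sum(e * e for e in errors)
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "r2": 1 - ss_res / ss_tot,
    }


def write_report(scores, params, report_dir="report"):
    """Write scores.json and params.json into report_dir (created if missing)."""
    report_dir = Path(report_dir)
    report_dir.mkdir(parents=True, exist_ok=True)
    (report_dir / "scores.json").write_text(json.dumps(scores, indent=4))
    (report_dir / "params.json").write_text(json.dumps(params, indent=4))
```

A training script could call `write_report(eval_metrics(y_test, y_pred), model_params)` after fitting, where `y_test`, `y_pred` and `model_params` stand for whatever the script produces.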

Make changes to train_test_evaluate.py, dvc.yaml and params.yaml.

Add stage 3 to dvc.yaml, then stop tracking the raw data file in Git -

git rm -r --cached 'data/raw/winequality.csv' # Stop tracking in Git

Run the pipeline, then commit and push -

dvc repro # Runs all the stages
git add . && git commit -m "stage 3 added"
git push origin main

To check the difference between metrics -

dvc metrics diff
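To illustrate the kind of comparison this command reports, here is a small sketch that compares two versions of a scores file by hand. It is not DVC's implementation, only an illustration of old value, new value and change per metric. The `metrics_diff` helper and the sample numbers are made up.

```python
def metrics_diff(old, new):
    """Return {metric: (old_value, new_value, change)} for metrics present in both dicts."""
    return {
        name: (old[name], new[name], new[name] - old[name])
        for name in old
        if name in new
    }


old_scores = {"rmse": 0.75, "r2": 0.25}  # made-up values from an earlier run
new_scores = {"rmse": 0.5, "r2": 0.5}    # made-up values from the current run

for name, (before, after, change) in metrics_diff(old_scores, new_scores).items():
    print(f"{name}: {before} -> {after} ({change:+.2f})")
```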

About

Creating a Machine Learning Pipeline using Git, DVC, MLflow
