DVC Demo

A demonstration of Data Version Control (DVC) for managing ML pipelines and data versioning.

What is DVC?

DVC is an open-source version control system for machine learning projects. It helps you:

Version large files, datasets, machine learning models, and metrics
Track ML experiments
Create reproducible ML pipelines
Collaborate with team members

Project Structure

.
├── data/              # Raw and processed data files
│   └── raw.dvc        # DVC file for raw data
├── src/               # Source code for data processing and model training
├── config/            # Configuration files
├── .dvc/              # DVC internal files
├── dvc.yaml           # DVC pipeline definition
├── dvc.lock           # DVC lock file for reproducible pipelines
└── .dvcignore         # Files/directories to be ignored by DVC

Setup

Install project dependencies using uv:

uv sync

Pull the data from remote storage:

dvc pull

Run the pipeline to reproduce all stages:

dvc repro
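
dvc repro runs the stages declared in dvc.yaml. As a rough sketch of how one such stage could be registered from the command line, the function below wraps dvc stage add. The stage name (prepare), the script src/prepare.py, and the output directory data/processed are hypothetical and not taken from this repository; only data/raw corresponds to the tracked data shown in the project structure.

```shell
# Sketch: register one pipeline stage in dvc.yaml via the DVC CLI.
# Assumption: the stage name, script, and output path are illustrative only.
add_prepare_stage() {
    # --deps lists inputs whose changes should trigger a re-run;
    # --outs lists the outputs DVC caches and tracks for this stage.
    dvc stage add \
        --name prepare \
        --deps src/prepare.py \
        --deps data/raw \
        --outs data/processed \
        python src/prepare.py data/raw data/processed
}
```

After a stage like this is added, dvc repro would run it only when one of its dependencies has changed, and record the resulting hashes in dvc.lock.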

Version Control

Track data files: dvc add <file>
Push data to remote storage: dvc push
Pull data from remote storage: dvc pull
Check status: dvc status
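
The commands above are usually combined with Git, since DVC stores the data while Git stores the small pointer file. A minimal sketch of that workflow follows, assuming the current directory is a Git repository with DVC initialized and a default remote configured. The function name track_and_push is hypothetical.

```shell
# Sketch: track one data path with DVC and commit its pointer file to Git.
# Assumption: DVC is initialized here and a default remote is configured.
track_and_push() {
    data_path="$1"   # path to a file or directory, e.g. data/raw

    # Store the data in the DVC cache and write <path>.dvc next to it.
    dvc add "$data_path" || return 1

    # Commit the pointer file; `dvc add` also writes a .gitignore entry
    # beside the data so the data itself stays out of Git.
    git add "${data_path}.dvc" "$(dirname "$data_path")/.gitignore" || return 1
    git commit -m "Track ${data_path} with DVC" || return 1

    # Upload the cached data to the configured remote.
    dvc push
}
```

For example, track_and_push data/raw would produce the data/raw.dvc file listed in the project structure. Collaborators can then run dvc pull after checking out the commit to fetch the matching data.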
