Commit

Reorganize tutorials into their own section
dashohoxha committed Oct 4, 2019
1 parent 613e1c9 commit ef5373f
Showing 21 changed files with 145 additions and 41 deletions.
42 changes: 22 additions & 20 deletions src/Documentation/sidebar.json
@@ -24,14 +24,29 @@
{
"label": "Get Older Files",
"slug": "older-versions"
},
{
"label": "Example: Versioning",
"slug": "example-versioning"
},
}
]
},
{
"slug": "tutorials",
"source": "tutorials/index.md",
"children": [
"basics",
"interactive",
"versioning",
"pipelines",
{
"label": "Example: Pipelines",
"slug": "example-pipeline"
"slug": "tutorial",
"source": "tutorial/index.md",
"children": [
"preparation",
{
"label": "Define ML Pipeline",
"slug": "define-ml-pipeline"
},
"reproducibility",
"sharing-data"
]
}
]
},
@@ -284,19 +299,6 @@
}
]
},
{
"slug": "tutorial",
"source": "tutorial/index.md",
"children": [
"preparation",
{
"label": "Define ML Pipeline",
"slug": "define-ml-pipeline"
},
"reproducibility",
"sharing-data"
]
},
{
"label": "Understanding DVC",
"slug": "understanding-dvc",
2 changes: 1 addition & 1 deletion static/docs/command-reference/add.md
@@ -241,7 +241,7 @@ $ dvc run -f train.dvc \
```

To see this whole example go to
[Example: Versioning](/doc/get-started/example-versioning).
[Tutorial: Versioning](/doc/tutorials/versioning).

Since no top-level DVC-file is generated with the `--recursive` option we cannot
use the directory structure as a whole.
2 changes: 1 addition & 1 deletion static/docs/command-reference/checkout.md
@@ -125,7 +125,7 @@ $ cd example-get-started
</details>

The workspace looks almost like in this
[pipeline setup](/doc/get-started/example-pipeline):
[pipeline setup](/doc/tutorials/pipelines):

```dvc
.
2 changes: 1 addition & 1 deletion static/docs/command-reference/fetch.md
@@ -136,7 +136,7 @@ $ cd example-get-started
</details>

The workspace looks almost like in this
[pipeline setup](/doc/get-started/example-pipeline):
[pipeline setup](/doc/tutorials/pipelines):

```dvc
.
3 changes: 1 addition & 2 deletions static/docs/command-reference/repro.md
@@ -118,8 +118,7 @@ specified), and updates stage files with the new checksum information.

For simplicity, let's build a pipeline defined below. (If you want get your
hands-on something more real, see this shot
[pipeline tutorial](/doc/get-started/example-pipeline)). It takes this
`text.txt` file:
[pipeline tutorial](/doc/tutorials/pipelines)). It takes this `text.txt` file:

```
dvc
5 changes: 2 additions & 3 deletions static/docs/command-reference/run.md
@@ -46,11 +46,10 @@ creating a new stage. For example, for every output there should be only one
stage that explicitly specifies it. There should be no cycles, etc.

Note that `dvc repro` provides an interface to check state and reproduce this
graph (pipeline) later. This concept is similar to that of
graph (pipeline) later. This concept is similar to the one of the
[Make](https://www.gnu.org/software/make/) in software build automation, but DVC
captures data and caches <abbr>data artifacts</abbr> along the way. See this
[example](/doc/get-started/example-pipeline) to learn more and try to create a
pipeline.
[example](/doc/tutorials/pipelines) to learn more and try to create a pipeline.

## Options

2 changes: 1 addition & 1 deletion static/docs/get-started/connect-code-and-data.md
@@ -60,7 +60,7 @@ $ git commit -m "Add source code files to repo"
Having installed the `src/prepare.py` script in your repo, the following command
transforms it into a reproducible [stage](/doc/command-reference/run) for the ML
pipeline we're building (described in the
[next chapter](/doc/get-started/example-pipeline)).
[next chapter](/doc/tutorials/pipelines)).

```dvc
$ dvc run -f prepare.dvc \
4 changes: 2 additions & 2 deletions static/docs/get-started/index.md
@@ -5,8 +5,8 @@ go into details much, but provides links and expandable sections to learn more.

At the very end there are a few complete examples to give you more hands-on
experience with real life scenarios. The first one is about model and dataset
[versioning](/doc/get-started/example-versioning), and the second one is focused
on [pipelines and reproducibility](/doc/get-started/example-pipeline).
[versioning](/doc/tutorials/versioning), and the second one is focused on
[pipelines and reproducibility](/doc/tutorials/pipelines).

✅ Please, join our [community](/chat) or see these [support](/support) options
if you have any questions or need any help. We are very responsive ⚡.
2 changes: 1 addition & 1 deletion static/docs/get-started/older-versions.md
@@ -50,4 +50,4 @@ $ dvc checkout
```

Read the `dvc checkout` command reference and a dedicated data versioning
[example](/doc/get-started/example-versioning) for more information.
[example](/doc/tutorials/versioning) for more information.
2 changes: 1 addition & 1 deletion static/docs/get-started/pipeline.md
Original file line number Diff line number Diff line change
Expand Up @@ -36,7 +36,7 @@ $ dvc push
```

This example is simplified just to show you a basic pipeline, see a more
advanced [example](/doc/get-started/example-pipeline) or complete
advanced [example](/doc/tutorials/pipelines) or complete
[tutorial](/doc/tutorial) to create a
[NLP](https://en.wikipedia.org/wiki/Natural_language_processing) pipeline
end-to-end.
22 changes: 22 additions & 0 deletions static/docs/tutorials/basics.md
@@ -0,0 +1,22 @@
# DVC Basic Concepts and Features

Learn basic concepts and features of DVC with interactive lessons:

1. [Data Management](https://katacoda.com/dvc/courses/basics/data) <br/> The
core function of DVC is data tracking and management. Let's see how to do it.

2. [Getting the Best Performance](https://katacoda.com/dvc/courses/basics/performance)
<br/> It is important to optimize the DVC setup for having the best
performance with handling big data files.

3. [Tracking Data Versions](https://katacoda.com/dvc/courses/basics/versioning)
<br/> DVC takes advantage of GIT's versioning features to keep track of the
data versions.

4. [Sharing Data](https://katacoda.com/dvc/courses/basics/sharing) <br/> DVC
facilitates sharing of data between different people that work on the same
project.

5. [Stages And Pipelines](https://katacoda.com/dvc/courses/basics/pipelines)
<br/> DVC has a built-in way to connect ML steps into a DAG and run the full
pipeline end-to-end.
70 changes: 70 additions & 0 deletions static/docs/tutorials/index.md
@@ -0,0 +1,70 @@
# DVC Basic Concepts and Features

Learn basic concepts and features of DVC with interactive lessons:

1. [Data Management](https://katacoda.com/dvc/courses/basics/data) <br/> The
core function of DVC is data tracking and management. Let's see how to do it.

2. [Getting the Best Performance](https://katacoda.com/dvc/courses/basics/performance)
<br/> It is important to optimize the DVC setup for having the best
performance with handling big data files.

3. [Tracking Data Versions](https://katacoda.com/dvc/courses/basics/versioning)
<br/> DVC takes advantage of GIT's versioning features to keep track of the
data versions.

4. [Sharing Data](https://katacoda.com/dvc/courses/basics/sharing) <br/> DVC
facilitates sharing of data between different people that work on the same
project.

5. [Stages And Pipelines](https://katacoda.com/dvc/courses/basics/pipelines)
<br/> DVC has a built-in way to connect ML steps into a DAG and run the full
pipeline end-to-end.

# Interactive Tutorials

1. [Data Versioning](https://katacoda.com/dvc/courses/tutorials/versioning)
<br/> Using DVC commands to work with multiple versions of datasets and ML
models.

2. [Pipelines](https://katacoda.com/dvc/courses/tutorials/pipelines) <br/> Using
DVC commands to build a simple ML pipeline.

3. [dvc fetch](https://katacoda.com/dvc/courses/examples) <br/> We will use an
example project with some data, code, ML models, pipeline stages, as well as
a few Git tags. Then we will see what happens with dvc fetch as we switch
from tag to tag.

# Native Tutorials

1. [Versioning](/docs/tutorials/versioning) <br/> Using DVC commands to work
with multiple versions of datasets and ML models.

2. [Pipelines](/docs/tutorials/pipelines) <br/> Using DVC commands to build a
simple ML pipeline.

3. [Longer Tutorial](/docs/tutorials/tutorial) <br/> Introduces DVC
step-by-step, while additionally explaining in great detail the motivation
and what's happening internally.

# Community Blogs and Tutorials

- [Data Version Control Tutorial](https://blog.dataversioncontrol.com/data-version-control-tutorial-9146715eda46)

- [Creating an awesome project using DVC and DAGsHub](https://dagshub.com/docs/overview/)

- [Using DVC to create an efficient version control system for data projects](https://medium.com/qonto-engineering/using-dvc-to-create-an-efficient-version-control-system-for-data-projects-96efd94355fe)

- [Introduction to using DVC to manage machine learning project datasets](https://techsparx.com/software-development/ai/dvc/simple-example.html)

- [Managing versioned machine learning datasets in DVC, and easily share ML projects with colleagues](https://techsparx.com/software-development/ai/dvc/versioning-example.html)

- [A walkthrough of DVC](https://blog.codecentric.de/en/2019/03/walkthrough-dvc/)

- [DVC dependency management](https://blog.codecentric.de/en/2019/08/dvc-dependency-management/)

- [How to use data version control (dvc) in a machine learning project](https://towardsdatascience.com/how-to-use-data-version-control-dvc-in-a-machine-learning-project-a78245c0185)

- [My first try at DVC](https://stdiff.net/MB2019051301.html)

- [Effective Management of your Machine Learning Laboratory](https://www.linkedin.com/pulse/effective-management-your-machine-learning-laboratory-ulaganathan/)
13 changes: 13 additions & 0 deletions static/docs/tutorials/interactive.md
@@ -0,0 +1,13 @@
# Interactive Tutorials

1. [Data Versioning](https://katacoda.com/dvc/courses/tutorials/versioning)
<br/> Using DVC commands to work with multiple versions of datasets and ML
models.

2. [Pipelines](https://katacoda.com/dvc/courses/tutorials/pipelines) <br/> Using
DVC commands to build a simple ML pipeline.

3. [dvc fetch](https://katacoda.com/dvc/courses/examples) <br/> We will use an
example project with some data, code, ML models, pipeline stages, as well as
a few Git tags. Then we will see what happens with dvc fetch as we switch
from tag to tag.
@@ -1,4 +1,4 @@
# Example: Pipelines
# Tutorial: Pipelines

To show DVC in action, let's play with an actual machine learning scenario.
Let's explore the natural language processing
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
@@ -1,4 +1,4 @@
# Example: Versioning
# Tutorial: Versioning

> Reading time is 10-13 minutes. Running the training is 30-40 minutes
> (including downloading the dataset). Running the code is optional, and reading
@@ -362,8 +362,8 @@ Here's where the [pipelines](/doc/command-reference/pipeline) feature of DVC
comes very handy and was designed for. We touched it briefly when we described
`dvc run` and `dvc repro` at the very end. The next step here would be splitting
the script into two parts, and utilizing pipelines. See
[this example](/doc/get-started/example-pipeline) to get a hands-on experience
with pipelines and try to apply it here. Don't hesitate to join our
[this example](/doc/tutorials/pipelines) to get a hands-on experience with
pipelines and try to apply it here. Don't hesitate to join our
[community](/chat) to ask any questions!

Another detail we only brushed on here is the way we captured the `metrics.json`
7 changes: 3 additions & 4 deletions static/docs/use-cases/data-and-model-files-versioning.md
@@ -2,7 +2,7 @@

> This document provides an overview the file versioning workflow with DVC. To
> get more hands-on experience on this we recommend following along the
> [Versioning](/doc/get-started/example-versioning) example.
> [Versioning](/doc/tutorials/versioning) tutorial.
DVC allows versioning data files and directories, intermediate results, and ML
models using Git, but without storing the file contents in the repository. It's
@@ -117,6 +117,5 @@ To share your data with others you need to setup a
[Share Data And Model Files](/doc/use-cases/share-data-and-model-files) use case
to get an overview on how to do this.

Please also don't forget to see the
[Versioning](/doc/get-started/example-versioning) example to get a hands-on
experience with datasets and models versioning.
Please also don't forget to see the [Versioning](/doc/tutorials/versioning)
example to get a hands-on experience with datasets and models versioning.
