## Setting up DVC

Data Version Control (DVC) provides a structured way to version data, a step that is often neglected. With DVC you can track changes to your datasets, which supports reproducibility, collaboration, and easier troubleshooting. This guards against data-related problems and makes data-centric work more reliable and efficient.

In this exercise, you will practice initializing a DVC project and checking how DVC is installed. Git has already been initialized for this project.

### IDE Exercise Instructions
- Initialize DVC in the workspace.
- Verify that the .dvcignore file and the .dvc folder are present.
- Check the DVC version; the output of dvc version also shows how DVC was installed.
- Commit the changes using Git with the commit message "initial commit".

In [None]:
#$ dvc init
#$ ls -a    # confirm that .dvc/ and .dvcignore now exist
#$ dvc version
#$ git add .
#$ git commit -m "initial commit"
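
The second instruction can also be checked programmatically. The following is a minimal Python sketch, assuming it runs from the workspace root after dvc init; the helper name dvc_init_artifacts is hypothetical and not part of DVC.

```python
from pathlib import Path


def dvc_init_artifacts(workspace="."):
    """Report whether the items that `dvc init` creates are present.

    `dvc_init_artifacts` is a hypothetical helper for illustration,
    not part of the DVC API.
    """
    root = Path(workspace)
    return {
        # DVC's internal folder
        ".dvc": (root / ".dvc").is_dir(),
        # File holding the ignore patterns
        ".dvcignore": (root / ".dvcignore").is_file(),
    }


if __name__ == "__main__":
    for name, present in dvc_init_artifacts().items():
        print(f"{name}: {'found' if present else 'missing'}")
```

Both entries should be reported as found once DVC has been initialized in the workspace.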

## .dvcignore Patterns

The .dvcignore file lists patterns and paths for the files and directories that DVC should skip when it traverses a project. Anything that matches is excluded from DVC operations.

In this exercise, you will modify the contents of a .dvcignore file to set file patterns that DVC should ignore during operation. You will also use the dvc check-ignore command to verify whether specific targets are ignored by DVC according to the .dvcignore file.

### IDE Exercise Instructions
- Ignore all files in the dataset directory.
- Make an exception for dataset/myData.csv so that it is still tracked by DVC.
- Ignore all JSON files in the current workspace.
- Using the dvc check-ignore -d <file_name or file_pattern> command, check whether JSON files are actually ignored.

In [1]:
#### ========> .dvcignore
# # Add patterns of files dvc should ignore, which could improve
# # the performance. Learn more at
# # https://dvc.org/doc/user-guide/dvcignore
# 
# # Ignore all files in the 'dataset' directory
# dataset/*
# 
# # But don't ignore 'dataset/myData.csv'
# !dataset/myData.csv
# 
# # Ignore all .json files
# *.json

#### ========> Command-line
#$ dvc check-ignore -d dataset/myData.csv
#$ dvc check-ignore -d *.json
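
DVC documents .dvcignore patterns as following .gitignore rules, where the last matching pattern decides the outcome and a leading ! re-includes a path. The sketch below approximates that rule for the three patterns used above with Python's fnmatch module. It is a simplified illustration, not DVC's actual matcher; the function name is_ignored and every sample path except dataset/myData.csv are hypothetical.

```python
from fnmatch import fnmatchcase
from posixpath import basename

# The three patterns from the .dvcignore file above, in order.
PATTERNS = ["dataset/*", "!dataset/myData.csv", "*.json"]


def is_ignored(path, patterns=PATTERNS):
    """Approximate gitignore-style matching: the last matching pattern wins.

    Simplifications (assumptions of this sketch): patterns without a slash
    are matched against the file name only, patterns with a slash against
    the whole relative path; other gitignore features are not handled.
    """
    ignored = False
    for pattern in patterns:
        negated = pattern.startswith("!")
        body = pattern[1:] if negated else pattern
        target = path if "/" in body else basename(path)
        if fnmatchcase(target, body):
            # A later match overrides any earlier decision.
            ignored = not negated
    return ignored


if __name__ == "__main__":
    # Sample paths are hypothetical except dataset/myData.csv.
    for p in ["dataset/raw.csv", "dataset/myData.csv", "config.json", "train.py"]:
        print(p, "->", "ignored" if is_ignored(p) else "kept")
```

The order of the patterns matters in this sketch: placing !dataset/myData.csv before dataset/* would leave the file ignored, because the later pattern overrides the earlier exception.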