In [1]:
# Xvc

This is the Python port of [Xvc](https://github.com/iesahin/xvc). Xvc's goal is to let users perform all ML operations regarding data, files, models and pipelines from the command line, and version all of these on top of Git. Xvc.py extends this goal by allowing all commands to be run from Python shells, code and Jupyter notebooks.

In [2]:
!chmod -R +w .xvc && rm -rf .xvc

chmod: .xvc: No such file or directory


In [3]:
!rm -rf .git test-data

!pip install xvc

In [4]:
import xvc

In [5]:
xvc.version()

'v0.6.3-30-gdb6ebaf'

In [6]:
from xvc import Xvc

Let's create a directory tree with the data files we want to track.

In [7]:
!xvc-test-helper create-directory-tree --directories 3 --files 3 --root test-data --seed 20240101

In [8]:
!tree

.
├── Readme.ipynb
├── Untitled.ipynb
└── test-data
    ├── dir-0001
    │   ├── file-0001.bin
    │   ├── file-0002.bin
    │   └── file-0003.bin
    ├── dir-0002
    │   ├── file-0001.bin
    │   ├── file-0002.bin
    │   └── file-0003.bin
    └── dir-0003
        ├── file-0001.bin
        ├── file-0002.bin
        └── file-0003.bin

5 directories, 11 files


Before initializing the Xvc repository, we initialize Git in the directory. Xvc tracks binary files with their metadata (size, timestamp and digest) and this metadata is kept in text files. You can use Xvc without Git, but adding versioning to your data is usually a good choice. By default, Xvc automatically manages Git operations on the text files it creates, so it won't add any extra commands to the workflow.
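
To make the metadata concrete, here is a minimal sketch, not Xvc's implementation, that collects the size, modification timestamp and a content digest of a file using only the Python standard library. The `file_metadata` helper is hypothetical, and `hashlib.blake2b` is only a stand-in for whichever digest Xvc actually computes.

```python
import hashlib
import os
import tempfile


def file_metadata(path):
    """Return size, mtime and a content digest for `path` (illustrative helper)."""
    stat = os.stat(path)
    digest = hashlib.blake2b(digest_size=32)  # stand-in digest, not Xvc's
    with open(path, "rb") as f:
        # Read in chunks so large binary files don't need to fit in memory.
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return {
        "size": stat.st_size,
        "modified": stat.st_mtime,
        "digest": digest.hexdigest(),
    }


with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "file-0001.bin")
    with open(sample, "wb") as f:
        f.write(b"\x00" * 2001)
    before = file_metadata(sample)

    # Append one byte: both the size and the digest change.
    with open(sample, "ab") as f:
        f.write(b"\x01")
    after = file_metadata(sample)

changed = before["digest"] != after["digest"]
print(before["size"], after["size"], changed)
```

Because records like these are small and textual, they can be versioned by Git while the binary content itself stays out of the Git history.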

In [9]:
!git init test-data/

Initialized empty Git repository in /Users/iex/github.com/iesahin/xvc-notebooks/Readme/test-data/.git/


Now we can initialize the Xvc repository.

First we create an Xvc instance. You can specify the workdir and this instance will work on that directory. You can use multiple such instances to manage multiple directories.

In [10]:
xvc_debug = Xvc(debug=True, verbosity=3)
xvc_test_data = Xvc()

XvcConfigInitParams { default_configuration: "\n[core]\n# The repository id. Please do not delete or change it.\n# This is used to identify the repository and generate paths in storages.\n# In the future it may be used to in other ways.\nguid = \"b536ddec2f775716\"\n# Default verbosity level.\n# One of \"error\", \"warn\", \"info\"\nverbosity = \"error\"\n\n[git]\n# Automate git operations.\n# Turning this off leads Xvc to behave as if it's not in a Git repository.\n# Not recommended unless you're really not using Git\nuse_git = true\n# Command to run Git process.\n# You can set this to an absolute path to specify an executable\n# If set to a non-absolute path, the executable will be searched in $PATH.\ncommand = \"git\"\n\n# Commit changes in .xvc/ directory after commands.\n# You can set this to false if you want to commit manually.\nauto_commit = true\n\n# Stage changes in .xvc/ directory without committing.\n# auto_commit implies auto_stage.\n# If you want to commit manually but do

[/Users/iex/github.com/iesahin/xvc/core/src/types/xvcroot.rs:74:5] path = "/Users/iex/github.com/iesahin/xvc-notebooks/Readme"
[/Users/iex/github.com/iesahin/xvc/core/src/types/xvcroot.rs:262:9] path = "/Users/iex/github.com/iesahin/xvc-notebooks/Readme"
[/Users/iex/github.com/iesahin/xvc/core/src/types/xvcroot.rs:266:9] &abs_path = "/Users/iex/github.com/iesahin/xvc-notebooks/Readme"
[/Users/iex/github.com/iesahin/xvc/core/src/types/xvcroot.rs:269:13] parent = "/Users/iex/github.com/iesahin/xvc-notebooks/Readme"
[/Users/iex/github.com/iesahin/xvc/core/src/types/xvcroot.rs:271:13] &xvc_candidate = "/Users/iex/github.com/iesahin/xvc-notebooks/Readme/.xvc"
[/Users/iex/github.com/iesahin/xvc/core/src/types/xvcroot.rs:269:13] parent = "/Users/iex/github.com/iesahin/xvc-notebooks"
[/Users/iex/github.com/iesahin/xvc/core/src/types/xvcroot.rs:271:13] &xvc_candidate = "/Users/iex/github.com/iesahin/xvc-notebooks/.xvc"
[/Users/iex/github.com/iesahin/xvc/core/src/types/xvcroot.rs:269:13] parent 

ote that some of the operations are implemented in parallel by default, and this option affects some heavier operations.\nno_parallel = false\n\n[file.list]\n\n# Format for `xvc file list` rows. You can reorder or remove columns.\n# The following are the keys for each row:\n# - {acd64}:  actual content digest. All 64 digits from the workspace file's content.\n# - {acd8}:  actual content digest. First 8 digits the file content digest.\n# - {aft}:  actual file type. Whether the entry is a file (F), directory (D),\n#   symlink (S), hardlink (H) or reflink (R).\n# - {asz}:  actual size. The size of the workspace file in bytes. It uses MB,\n#   GB and TB to represent sizes larger than 1MB.\n# - {ats}:  actual timestamp. The timestamp of the workspace file.\n# - {cst}:  cache status. One of \"=\", \">\", \"<\", \"X\", or \"?\" to show\n#   whether the file timestamp is the same as the cached timestamp, newer,\n#   older, not cached or not tracked.\n# - {name}: The name of the file or directo

s:269:13] parent = "/Users/iex/github.com/iesahin"
[/Users/iex/github.com/iesahin/xvc/core/src/types/xvcroot.rs:271:13] &xvc_candidate = "/Users/iex/github.com/iesahin/.xvc"
[/Users/iex/github.com/iesahin/xvc/core/src/types/xvcroot.rs:269:13] parent = "/Users/iex/github.com"
[/Users/iex/github.com/iesahin/xvc/core/src/types/xvcroot.rs:271:13] &xvc_candidate = "/Users/iex/github.com/.xvc"
[/Users/iex/github.com/iesahin/xvc/core/src/types/xvcroot.rs:269:13] parent = "/Users/iex"
[/Users/iex/github.com/iesahin/xvc/core/src/types/xvcroot.rs:271:13] &xvc_candidate = "/Users/iex/.xvc"
[/Users/iex/github.com/iesahin/xvc/core/src/types/xvcroot.rs:269:13] parent = "/Users"
[/Users/iex/github.com/iesahin/xvc/core/src/types/xvcroot.rs:271:13] &xvc_candidate = "/Users/.xvc"
[/Users/iex/github.com/iesahin/xvc/core/src/types/xvcroot.rs:269:13] parent = "/"
[/Users/iex/github.com/iesahin/xvc/core/src/types/xvcroot.rs:271:13] &xvc_candidate = "/.xvc"
[/Users/iex/github.com/iesahin/xvc/core/src/types/x

s.yaml\"\n# Number of command processes to run concurrently\nprocess_pool_size = 4\n# \n\n", current_dir: AbsolutePath("/Users/iex/github.com/iesahin/xvc-notebooks/Readme"), include_system_config: true, include_user_config: true, project_config_path: None, local_config_path: None, include_environment_config: true, command_line_config: None }


Let's initialize Xvc in the directory. This creates a `.xvc` directory, puts several initial files in it, and adds a `.gitignore` entry for the elements that shouldn't be tracked by Git.

In [11]:
xvc_test_data.init()

XvcCLI { verbosity: 0, quiet: false, debug: false, workdir: ".", config: None, no_system_config: false, no_user_config: false, no_project_config: false, no_local_config: false, no_env_config: false, skip_git: false, from_ref: None, to_branch: None, command: Init(InitCLI { path: None, no_git: false, force: false }), command_string: "xvc init" }
Dispatching command: Init(InitCLI { path: None, no_git: false, force: false })
workdir: "."
Running command: Init(InitCLI { path: None, no_git: false, force: false })
Running command: Init(InitCLI { path: None, no_git: false, force: false })


[/Users/iex/github.com/iesahin/xvc/core/src/types/xvcroot.rs:262:9] path = "/Users/iex/github.com/iesahin/xvc-notebooks/Readme"
[/Users/iex/github.com/iesahin/xvc/core/src/types/xvcroot.rs:266:9] &abs_path = "/Users/iex/github.com/iesahin/xvc-notebooks/Readme"
[/Users/iex/github.com/iesahin/xvc/core/src/types/xvcroot.rs:269:13] parent = "/Users/iex/github.com/iesahin/xvc-notebooks/Readme"
[/Users/iex/github.com/iesahin/xvc/core/src/types/xvcroot.rs:271:13] &xvc_candidate = "/Users/iex/github.com/iesahin/xvc-notebooks/Readme/.xvc"
[/Users/iex/github.com/iesahin/xvc/core/src/types/xvcroot.rs:269:13] parent = "/Users/iex/github.com/iesahin/xvc-notebooks"
[/Users/iex/github.com/iesahin/xvc/core/src/types/xvcroot.rs:271:13] &xvc_candidate = "/Users/iex/github.com/iesahin/xvc-notebooks/.xvc"
[/Users/iex/github.com/iesahin/xvc/core/src/types/xvcroot.rs:269:13] parent = "/Users/iex/github.com/iesahin"
[/Users/iex/github.com/iesahin/xvc/core/src/types/xvcroot.rs:271:13] &xvc_candidate = "/Users

''

In [12]:
xvc_test_data.root(absolute=True)

["xvc", "root", "--absolute"]
XvcCLI { verbosity: 0, quiet: false, debug: false, workdir: ".", config: None, no_system_config: false, no_user_config: false, no_project_config: false, no_local_config: false, no_env_config: false, skip_git: false, from_ref: None, to_branch: None, command: Root(RootCLI { absolute: true }), command_string: "xvc root --absolute" }
Dispatching command: Root(RootCLI { absolute: true })
workdir: "."
Running command: Root(RootCLI { absolute: true })
Running command: Root(RootCLI { absolute: true })
/Users/iex/github.com/iesahin/xvc-notebooks/Readme


''

In [13]:
xvc_test_data.file().track("test-data/dir-0001/")

XvcCLI { verbosity: 0, quiet: false, debug: false, workdir: ".", config: None, no_system_config: false, no_user_config: false, no_project_config: false, no_local_config: false, no_env_config: false, skip_git: false, from_ref: None, to_branch: None, command: File(XvcFileCLI { verbosity: 0, quiet: false, workdir: ".", config: None, no_system_config: false, no_user_config: false, no_project_config: false, no_local_config: false, no_env_config: false, subcommand: Track(TrackCLI { recheck_method: None, no_commit: false, text_or_binary: None, force: false, no_parallel: false, targets: Some(["test-data/dir-0001/"]) }) }), command_string: "xvc file track test-data/dir-0001/" }
Dispatching command: File(XvcFileCLI { verbosity: 0, quiet: false, workdir: ".", config: None, no_system_config: false, no_user_config: false, no_project_config: false, no_local_config: false, no_env_config: false, subcommand: Track(TrackCLI { recheck_method: None, no_commit: false, text_or_binary: None, force: false, no

[src/file.rs:38:9] "{:?}" = "{:?}"
[src/file.rs:38:9] &cli_opts = [
    "xvc",
    "file",
]
[src/file.rs:41:9] "{:?}" = "{:?}"
[src/file.rs:41:9] &cli_opts = [
    "xvc",
    "file",
    "track",
]
[src/file.rs:43:9] "{:?}" = "{:?}"
[src/file.rs:43:9] &cli_opts = [
    "xvc",
    "file",
    "track",
]
[src/file.rs:50:9] "{:?}" = "{:?}"
[src/file.rs:50:9] &cli_opts = [
    "xvc",
    "file",
    "track",
]
[src/file.rs:57:9] "{:?}" = "{:?}"
[src/file.rs:57:9] &cli_opts = [
    "xvc",
    "file",
    "track",
]
[src/file.rs:64:9] "{:?}" = "{:?}"
[src/file.rs:64:9] &cli_opts = [
    "xvc",
    "file",
    "track",
]
[src/file.rs:66:9] "{:?}" = "{:?}"
[src/file.rs:66:9] &cli_opts = [
    "xvc",
    "file",
    "track",
]
[src/file.rs:73:9] "{:?}" = "{:?}"
[src/file.rs:73:9] &cli_opts = [
    "xvc",
    "file",
    "track",
]


''

In [14]:
list_result = xvc_test_data.file().list()

XvcCLI { verbosity: 0, quiet: false, debug: false, workdir: ".", config: None, no_system_config: false, no_user_config: false, no_project_config: false, no_local_config: false, no_env_config: false, skip_git: false, from_ref: None, to_branch: None, command: File(XvcFileCLI { verbosity: 0, quiet: false, workdir: ".", config: None, no_system_config: false, no_user_config: false, no_project_config: false, no_local_config: false, no_env_config: false, subcommand: List(ListCLI { format: None, sort: None, no_summary: false, show_dot_files: false, targets: None }) }), command_string: "xvc file list" }
Dispatching command: File(XvcFileCLI { verbosity: 0, quiet: false, workdir: ".", config: None, no_system_config: false, no_user_config: false, no_project_config: false, no_local_config: false, no_env_config: false, subcommand: List(ListCLI { format: None, sort: None, no_summary: false, show_dot_files: false, targets: None }) })
workdir: "."
Running command: File(XvcFileCLI { verbosity: 0, quiet:

In [24]:
print(list_result)

FX        2003 2024-05-26 07:41:51          41e16be7 test-data/dir-0003/file-0003.bin
FX        2002 2024-05-26 07:41:51          27f0efd0 test-data/dir-0003/file-0002.bin
FX        2001 2024-05-26 07:41:51          66de5084 test-data/dir-0003/file-0001.bin
DX         160 2024-05-26 07:41:51                   test-data/dir-0003
FX        2003 2024-05-26 07:41:51          41e16be7 test-data/dir-0002/file-0003.bin
FX        2002 2024-05-26 07:41:51          27f0efd0 test-data/dir-0002/file-0002.bin
FX        2001 2024-05-26 07:41:51          66de5084 test-data/dir-0002/file-0001.bin
DX         160 2024-05-26 07:41:51                   test-data/dir-0002
FC        2003 2024-05-26 07:41:51 41e16be7 41e16be7 test-data/dir-0001/file-0003.bin
FC        2002 2024-05-26 07:41:51 27f0efd0 27f0efd0 test-data/dir-0001/file-0002.bin
FC        2001 2024-05-26 07:41:51 66de5084 66de5084 test-data/dir-0001/file-0001.bin
DX         160 2024-05-26 07:41:51                   test-data/dir-0001
FX        
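
Since the listing is printed as plain text, it can be post-processed in Python. The sketch below is one way to split rows like the ones above into fields. `parse_list_rows` is a hypothetical helper, and the names given to the columns (flags, size, timestamp, digests, path) are assumptions read off this output, not a documented contract. It also assumes paths contain no spaces.

```python
import re

# Sample rows copied from the output above.
SAMPLE = """\
FX        2003 2024-05-26 07:41:51          41e16be7 test-data/dir-0002/file-0003.bin
DX         160 2024-05-26 07:41:51                   test-data/dir-0002
FC        2003 2024-05-26 07:41:51 41e16be7 41e16be7 test-data/dir-0001/file-0003.bin
"""

# Two flag characters, size, timestamp, zero to two 8-digit digests, then the path.
ROW = re.compile(
    r"^(?P<flags>\S{2})\s+(?P<size>\d+)\s+"
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+"
    r"(?P<digests>(?:[0-9a-f]{8}\s+){0,2})(?P<path>\S+)$"
)


def parse_list_rows(text):
    """Parse listing rows into dictionaries; unparseable lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.rstrip())
        if match is None:
            continue  # e.g. summary or truncated lines
        row = match.groupdict()
        row["size"] = int(row["size"])
        row["digests"] = row["digests"].split()
        rows.append(row)
    return rows


rows = parse_list_rows(SAMPLE)
# Rows that show two digests, as the tracked files in dir-0001 do above.
with_two = [row["path"] for row in rows if len(row["digests"]) == 2]
print(with_two)
```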

## 🏃🏾 Quickstart

Xvc seamlessly monitors your files and directories on top of Git. To commence, execute the following command within the repository:

```console
$ git init # if you're not already in a Git repository
Initialized empty Git repository in [CWD]/.git/

$ xvc init
```

This command initializes the `.xvc/` directory and adds a `.xvcignore` file for specifying paths you wish to conceal from Xvc.

Include your data files and directories for tracking:

```shell
$ xvc file track my-data/ --as symlink
```

This command calculates content hashes for data (using BLAKE3, by default) and logs them. The changes are committed to Git, and the files are copied to content-addressed directories within `.xvc/b3`. Additionally, read-only symbolic links to these directories are created.
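
The mechanism described above can be illustrated with a small sketch. It is not Xvc's implementation: `track_as_symlink` and the cache directory layout are hypothetical, and SHA-256 from the standard library stands in for BLAKE3. It assumes a POSIX-like system where creating symlinks is permitted.

```python
import hashlib
import os
import shutil
import stat
import tempfile


def track_as_symlink(path, cache_root):
    """Copy `path` into a content-addressed cache and replace it with a symlink."""
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()  # stand-in for BLAKE3
    # Hypothetical layout: <cache_root>/<first 3 hex digits>/<remaining digits>/content
    cache_dir = os.path.join(cache_root, digest[:3], digest[3:])
    os.makedirs(cache_dir, exist_ok=True)
    cached = os.path.join(cache_dir, "content")
    if not os.path.exists(cached):
        shutil.copyfile(path, cached)
        # Make the cached copy read-only so it can't be modified through the link.
        os.chmod(cached, stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
    os.remove(path)
    os.symlink(cached, path)
    return digest


with tempfile.TemporaryDirectory() as tmp:
    cache_root = os.path.join(tmp, "cache")
    first = os.path.join(tmp, "a.bin")
    second = os.path.join(tmp, "b.bin")
    for name in (first, second):
        with open(name, "wb") as f:
            f.write(b"same content")

    digests = {track_as_symlink(p, cache_root) for p in (first, second)}
    is_link = os.path.islink(first)
    with open(first, "rb") as f:
        roundtrip = f.read()
    cached_count = sum(len(files) for _, _, files in os.walk(cache_root))

print(is_link, len(digests), cached_count)
```

Two workspace files with identical content end up pointing at a single cached copy, which is the main benefit of addressing the cache by content digest.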

You can specify different [recheck (checkout) methods](https://docs.xvc.dev/ref/xvc-file-recheck/) for files and directories, depending on your use case.
If you need to track model files that change frequently, you can set recheck method `--as copy` (the default).

```shell
$ xvc file track my-models/ --as copy
```

Configure a cloud storage to share the files you added.

```shell
$ xvc storage new s3 --name my-remote --region us-east-1 --bucket-name my-xvc-remote
```

You can send the files to this storage.

```shell
$ xvc file send --to my-remote
```

When you (or someone else) want to access these files later, you can clone the Git repository and get the files from the
storage.

```shell
$ git clone https://example.com/my-machine-learning-project
Cloning into 'my-machine-learning-project'...

$ cd my-machine-learning-project
$ xvc file bring my-data/ --from my-remote

```

This approach ensures convenient access to files from the shared storage when needed.

You don't have to reconfigure the storage after cloning, but you need to have valid credentials as environment variables
to access the storage.
Xvc never stores any credentials.

If you have commands that depend on data or code elements, you can configure a pipeline.

For this example, we'll use [a Python script](https://github.com/iesahin/xvc/blob/main/workflow_tests/templates/README.in/generate_data.py) to generate a data set of random names with random IQ scores.

The script uses the Faker library and this library must be available where you run the pipeline. To make it repeatable, we start the pipeline by adding a step that installs dependencies.

```console
$ xvc pipeline step new --step-name install-deps --command 'python3 -m pip install --quiet --user -r requirements.txt'
```

We'll make this step depend on the `requirements.txt` file, so the step runs again whenever the file changes.

```console
$ xvc pipeline step dependency --step-name install-deps --file requirements.txt
```

Xvc allows you to create dependencies between pipeline steps. Dependent steps wait for their dependencies to finish successfully.

Now we create a step to run the script and make `install-deps` step a dependency of it. 

```console
$ xvc pipeline step new --step-name generate-data --command 'python3 generate_data.py'
$ xvc pipeline step dependency --step-name generate-data --step install-deps
```
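
The ordering constraint between steps can be sketched with Python's standard `graphlib` module (Python 3.9+). This illustrates dependency ordering in general, not Xvc's scheduler; the `depends_on` mapping simply mirrors the two steps defined above.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps it depends on.
depends_on = {
    "install-deps": set(),
    "generate-data": {"install-deps"},
}

# static_order() yields every step after all of its dependencies.
run_order = list(TopologicalSorter(depends_on).static_order())
print(run_order)  # ['install-deps', 'generate-data']
```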

After you define the pipeline, you can run it by:

```console
$ xvc pipeline run
[DONE] install-deps (python3 -m pip install --quiet --user -r requirements.txt)
[OUT] [generate-data] CSV file generated successfully.
 
[DONE] generate-data (python3 generate_data.py)

```

Xvc allows many kinds of dependencies, like [files](https://docs.xvc.dev/ref/xvc-pipeline-step-dependency#file-dependencies), 
[groups of files and directories defined by globs](https://docs.xvc.dev/ref/xvc-pipeline-step-dependency#glob-dependencies), 
[regular expression searches in files](https://docs.xvc.dev/ref/xvc-pipeline-step-dependency#regex-dependencies), 
[line ranges in files](https://docs.xvc.dev/ref/xvc-pipeline-step-dependency#line-dependencies), 
[hyper-parameters defined in YAML, JSON or TOML files](https://docs.xvc.dev/ref/xvc-pipeline-step-dependency#hyper-parameter-dependencies),
[HTTP URLs](https://docs.xvc.dev/ref/xvc-pipeline-step-dependency#url-dependencies),
[shell command outputs](https://docs.xvc.dev/ref/xvc-pipeline-step-dependency#generic-command-dependencies), 
and [other steps](https://docs.xvc.dev/ref/xvc-pipeline-step-dependency#step-dependencies). 

Suppose you're only interested in the IQ scores of those with _Dr._ in front of their names and how they differ from the rest in the dataset we created. Let's create a regex search dependency on the data file that will show all _doctors'_ IQ scores.

```console
$ xvc pipeline step new --step-name dr-iq --command 'echo "${XVC_REGEX_ADDED_ITEMS}" >> dr-iq-scores.csv '
$ xvc pipeline step dependency --step-name dr-iq --regex-items 'random_names_iq_scores.csv:/^Dr\..*'
```

The first line specifies a command that, when run, writes the `${XVC_REGEX_ADDED_ITEMS}` environment variable to the `dr-iq-scores.csv` file.
The second line specifies the dependency, which also populates the `${XVC_REGEX_ADDED_ITEMS}` environment variable for the command.

Some dependency types, like regex items, line items and glob items, inject environment variables into the commands of the steps that depend on them.
For example, if you have two million files specified with a glob but want to run a script only on the files added since the last run, you can use these environment variables.


When you run the pipeline again, a file named `dr-iq-scores.csv` will be created. Note that, as `requirements.txt` didn't change, the `install-deps` step and its dependent `generate-data` step didn't run.

```console
$ xvc pipeline run
[DONE] dr-iq (echo "${XVC_REGEX_ADDED_ITEMS}" >> dr-iq-scores.csv )

$ cat dr-iq-scores.csv
Dr. Brian Shaffer,122
Dr. Brittany Chang,82
Dr. Mallory Payne MD,70
Dr. Sherry Leonard,93
Dr. Susan Swanson,81

```

We are using this feature to get lines starting with `Dr.` from the file and write them to another file. When the file changes, e.g. when another record matching the dependency regex is added to the `random_names_iq_scores.csv` file, that record is also appended to the `dr-iq-scores.csv` file.

```console
$ zsh -cl 'echo "Dr. Albert Einstein,144" >> random_names_iq_scores.csv'

$ xvc pipeline run
[DONE] dr-iq (echo "${XVC_REGEX_ADDED_ITEMS}" >> dr-iq-scores.csv )

$ cat dr-iq-scores.csv
Dr. Brian Shaffer,122
Dr. Brittany Chang,82
Dr. Mallory Payne MD,70
Dr. Sherry Leonard,93
Dr. Susan Swanson,81
Dr. Albert Einstein,144

```
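
The behaviour of the regex dependency in the two runs above can be approximated in a few lines of Python. This is only a sketch of the idea: `added_regex_items` is a hypothetical helper, and how Xvc stores the previously matched lines and fills `XVC_REGEX_ADDED_ITEMS` internally is not shown here.

```python
import re


def added_regex_items(lines, pattern, previously_matched):
    """Return lines matching `pattern` that were not matched in the last run."""
    regex = re.compile(pattern)
    seen = set(previously_matched)
    return [line for line in lines if regex.search(line) and line not in seen]


pattern = r"^Dr\..*"

first_run = [
    "Dr. Brian Shaffer,122",
    "Erika Mustermann,101",  # made-up row that does not match the regex
    "Dr. Susan Swanson,81",
]
added_first = added_regex_items(first_run, pattern, previously_matched=[])

# A new matching record is appended to the data file before the second run.
second_run = first_run + ["Dr. Albert Einstein,144"]
added_second = added_regex_items(second_run, pattern, previously_matched=added_first)

print(added_second)  # ['Dr. Albert Einstein,144']
```

Only the new record reaches the command on the second run, which is why appending with `>>` in the step command keeps `dr-iq-scores.csv` free of duplicates.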

Now we want to add another command that draws a fancy histogram from `dr-iq-scores.csv`. As this new step must wait for the `dr-iq-scores.csv` file to be ready, we'll define `dr-iq-scores.csv` as an _output_ of the `dr-iq` step and set the file as a dependency of the new `visualize` step.

```console
$ xvc pipeline step output --step-name dr-iq --output-file dr-iq-scores.csv
$ xvc pipeline step new --step-name visualize --command 'python3 visualize.py'
$ xvc pipeline step dependency --step-name visualize --file dr-iq-scores.csv
$ xvc pipeline run
[ERROR] Step visualize finished UNSUCCESSFULLY with command python3 visualize.py

```

You can get the pipeline in Graphviz DOT format to convert to an image.

```console
$ zsh -cl 'xvc pipeline dag | dot -Tpng -opipeline.png'

```

You can also export and import the pipeline to JSON to edit in your editor.

```console
$ xvc pipeline export --file my-pipeline.json

$ cat my-pipeline.json
{
  "name": "default",
  "steps": [
    {
      "command": "python3 -m pip install --quiet --user -r requirements.txt",
      "dependencies": [
        {
          "File": {
            "content_digest": {
              "algorithm": "Blake3",
              "digest": [
                43,
                86,
                244,
                111,
                13,
                243,
                28,
                110,
                140,
                213,
                105,
                20,
                239,
                62,
                73,
                75,
                13,
                146,
                82,
                17,
                148,
                152,
                66,
                86,
                154,
                230,
                154,
                246,
                213,
                214,
                40,
                119
              ]
            },
            "path": "requirements.txt",
            "xvc_metadata": {
              "file_type": "File",
              "modified": {
                "nanos_since_epoch": [..],
                "secs_since_epoch": [..]
              },
              "size": 14
            }
          }
        }
      ],
      "invalidate": "ByDependencies",
      "name": "install-deps",
      "outputs": []
    },
    {
      "command": "python3 generate_data.py",
      "dependencies": [
        {
          "Step": {
            "name": "install-deps"
          }
        }
      ],
      "invalidate": "ByDependencies",
      "name": "generate-data",
      "outputs": []
    },
    {
      "command": "echo \"${XVC_REGEX_ADDED_ITEMS}\" >> dr-iq-scores.csv ",
      "dependencies": [
        {
          "RegexItems": {
            "lines": [
              "Dr. Brian Shaffer,122",
              "Dr. Susan Swanson,81",
              "Dr. Brittany Chang,82",
              "Dr. Mallory Payne MD,70",
              "Dr. Sherry Leonard,93",
              "Dr. Albert Einstein,144"
            ],
            "path": "random_names_iq_scores.csv",
            "regex": "^Dr\\..*",
            "xvc_metadata": {
              "file_type": "File",
              "modified": {
                "nanos_since_epoch": [..],
                "secs_since_epoch": [..]
              },
              "size": 19021
            }
          }
        }
      ],
      "invalidate": "ByDependencies",
      "name": "dr-iq",
      "outputs": [
        {
          "File": {
            "path": "dr-iq-scores.csv"
          }
        }
      ]
    },
    {
      "command": "python3 visualize.py",
      "dependencies": [
        {
          "File": {
            "content_digest": null,
            "path": "dr-iq-scores.csv",
            "xvc_metadata": null
          }
        }
      ],
      "invalidate": "ByDependencies",
      "name": "visualize",
      "outputs": []
    }
  ],
  "version": 1,
  "workdir": ""
}
```

You can edit the file to change commands, add new dependencies, etc. and import it back to Xvc.

```console
$ xvc pipeline import --file my-pipeline.json --overwrite
```
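
Editing the exported file can also be scripted. The sketch below assumes only the JSON layout visible in the export above (a top-level `steps` list whose entries have `name` and `command`, and digests stored as lists of byte values). The `set_step_command` helper, the trimmed-down `exported` sample and the `--bins` option are illustrative, not part of Xvc or of the original `visualize.py`.

```python
import json


def set_step_command(pipeline, step_name, command):
    """Replace the command of the step called `step_name` in place."""
    for step in pipeline["steps"]:
        if step["name"] == step_name:
            step["command"] = command
            return
    raise KeyError(step_name)


# Trimmed-down stand-in for the contents of my-pipeline.json.
exported = json.loads("""
{
  "name": "default",
  "steps": [
    {"command": "python3 visualize.py", "dependencies": [],
     "invalidate": "ByDependencies", "name": "visualize", "outputs": []}
  ],
  "version": 1,
  "workdir": ""
}
""")

# `--bins 20` is a hypothetical option used only to show an edit.
set_step_command(exported, "visualize", "python3 visualize.py --bins 20")
edited_text = json.dumps(exported, indent=2)

# Digests are exported as lists of byte values; hex is easier to compare by eye.
digest_prefix = bytes([43, 86, 244, 111]).hex()
print(digest_prefix)  # 2b56f46f
```

Writing `edited_text` back to `my-pipeline.json` and running the import command shown above would then load the modified pipeline.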

Lastly, if you find the commands long to type, there is an `xvc aliases` command that prints a set of aliases for them. You can source the output in your `.zshrc` or `.bashrc` and use the shorter forms instead, e.g., `xvc pipeline run` becomes `pvc run`.


```console
$ xvc aliases

alias xls='xvc file list'
alias pvc='xvc pipeline'
alias fvc='xvc file'
alias xvcf='xvc file'
alias xvcft='xvc file track'
alias xvcfl='xvc file list'
alias xvcfs='xvc file send'
alias xvcfb='xvc file bring'
alias xvcfh='xvc file hash'
alias xvcfco='xvc file checkout'
alias xvcfr='xvc file recheck'
alias xvcp='xvc pipeline'
alias xvcpr='xvc pipeline run'
alias xvcps='xvc pipeline step'
alias xvcpsn='xvc pipeline step new'
alias xvcpsd='xvc pipeline step dependency'
alias xvcpso='xvc pipeline step output'
alias xvcpi='xvc pipeline import'
alias xvcpe='xvc pipeline export'
alias xvcpl='xvc pipeline list'
alias xvcpn='xvc pipeline new'
alias xvcpu='xvc pipeline update'
alias xvcpd='xvc pipeline dag'
alias xvcs='xvc storage'
alias xvcsn='xvc storage new'
alias xvcsl='xvc storage list'
alias xvcsr='xvc storage remove'

```

Please create an issue or discussion for any other kinds of dependencies that you'd like to be included.

I'm planning to add [data label and annotations tracking](https://github.com/iesahin/xvc/discussions/208), [experiments tracking](https://github.com/iesahin/xvc/discussions/207), [model tracking](https://github.com/iesahin/xvc/discussions/211), encrypted cache, a server to control all commands from a web interface, and more as my time permits.

Please check [`docs.xvc.dev`](https://docs.xvc.dev) for documentation.





## 🤟 Big Thanks

xvc stands on the following (giant) crates:

- [trycmd] is used to run all example commands in this file, [reference, and how-to documentation](https://docs.xvc.dev) at
  every PR. It makes sure that the documentation is always up-to-date and shown commands work as described. We start
  development by writing documentation and implementing them thanks to [trycmd].

- [serde] allows all data structures to be stored in text files. Special thanks from [`xvc-ecs`] for serializing components in an ECS with a single line of code.

- Xvc processes files in parallel with pipelines and parallel iterators thanks to [crossbeam] and [rayon].

- Thanks to [strum], Xvc uses enums extensively and converts almost everything to typed values from strings.

- Xvc has a deep CLI that has subcommands of subcommands (e.g. `xvc storage new s3`), and all these work with minimum bugs thanks to [clap].

- Xvc uses [rust-s3] to connect to S3 and compatible storage services. It employs excellent [tokio] for fast async Rust. These cloud storage features can be turned off thanks to Rust conditional compilation.

- Without implementations of [BLAKE3], BLAKE2, SHA-2 and SHA-3 from Rust [crypto] crate, Xvc couldn't detect file changes so fast.

- Many thanks to small and well built crates, [reflink], [relative-path], [path-absolutize], [glob] for file system and glob handling.

- Thanks to [sad_machine] for providing a State Machine implementation that I used in `xvc pipeline run`. A DAG composed of State Machines made it possible to run pipeline steps in parallel with a clean separation of process states.

- Thanks to [thiserror] and [anyhow] for making error handling a breeze. These two crates make me feel I'm doing something good for humanity when handling errors.

- Xvc is split into many crates and owes this organization to [cargo workspaces].

[crossbeam]: https://docs.rs/crossbeam/
[cargo workspaces]: https://crates.io/crates/cargo-workspaces
[rayon]: https://docs.rs/rayon/
[strum]: https://docs.rs/strum/
[clap]: https://docs.rs/clap/
[serde]: https://serde.rs
[blake3]: https://docs.rs/blake3/
[crypto]: https://docs.rs/rust-crypto/
[reflink]: https://docs.rs/reflink/
[relative-path]: https://docs.rs/relative-path/
[path-absolutize]: https://docs.rs/path-absolutize/
[glob]: https://docs.rs/glob/
[wax]: https://docs.rs/wax/
[trycmd]: https://docs.rs/trycmd/
[sad_machine]: https://docs.rs/sad_machine/
[thiserror]: https://docs.rs/thiserror/
[anyhow]: https://docs.rs/anyhow/
[rust-s3]: https://docs.rs/rust-s3/
[`xvc-ecs`]: https://docs.rs/xvc-ecs/
[tokio]: https://tokio.rs

And the biggest thanks go to Rust designers, developers and contributors. Although I don't consider myself expert enough to appreciate it all, it's a fabulous language and environment to work with.


## 🚁 Support

- You can use [Discussions](https://github.com/iesahin/xvc/discussions) to ask questions. I'll answer as much as possible. Thank you.
- I don't follow any other sites regularly. You can also reach me at [emre@xvc.dev](mailto:emre@xvc.dev)



## 👐 Contributing

- Star this repo. I feel very happy for every star and send my best wishes to you. It's a sure win for two seconds of your time. Thanks.
- Use xvc. Tell me how it works for you, read the [documentation](https://docs.xvc.dev), [report bugs](https://github.com/iesahin/xvc/issues), [discuss features](https://github.com/iesahin/xvc/discussions).
- Please note that I don't accept large code PRs. Please open an issue to discuss your idea and write/modify a
  reference page before sending a PR. I'm happy to discuss and help you implement your idea. Also, it may require a copyright transfer to me, as there may be cases in which I provide the code under other licenses.



## 📜 License

Xvc and these Python bindings are licensed under the [GNU GPL 3.0 License](https://github.com/iesahin/xvc/blob/main/LICENSE). If you want to use the code in your project with other licenses, please contact me.


## 🌦️ Future and Maintenance

I'm using Xvc daily and I'm happy with it. Tracking all my files with Git via arbitrary servers and cloud providers is
something I always need. I'm happy to improve and maintain it as long as I use it.

Given that I've been working on this for the last two years for pure technical bliss, you can expect me to work on it more.



## ⚠️ Disclaimer

This software is fresh and ambitious. Although I use it and test it close to real-world conditions, it hasn't yet stood
the test of time. **Xvc can eat your files and spit them into the eternal void!** Please take backups.
