🦉 ML Experiments and Data Management with Git
Updated May 27, 2024 - Python
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
Create, visualize, run & benchmark DVC pipelines in Python & Jupyter notebooks.
A machine learning pipeline that takes you from raw data to a fully trained machine learning model: from data to model (d2m).
Python Data as Code core implementation
Create a robust, simple, efficient, and modern end-to-end ML batch serving pipeline using a set of modern open-source/free platforms and tools.
Playground for learning DVC
Declaratively create, transform, manage and version ML datasets.
sgr (command line client for Splitgraph) and the splitgraph Python library
Python framework for artificial text detection: NLP approaches to compare natural text against text generated by neural networks.
The provided demo project demonstrates the practical implementation and advantages of using DVC. It showcases how DVC simplifies data versioning and model versioning while working in tandem with Git to create a cohesive version control system tailored for data science projects.
Stop programming common DVC stages. Configure them.
A CKAN extension for data versioning.
An abstraction layer for data storage systems
Personal project aimed at developing an ML service that resembles a production system.
Deprecated. See https://github.com/datopian/ckanext-versions. ⏰ CKAN extension providing data versioning (metadata and files) based on Git and GitHub.