DataOps is an automated, process-oriented methodology used by analytics and data teams to improve the quality and reduce the cycle time of data analytics. While DataOps began as a set of best practices, it has matured into an independent approach to data analytics. DataOps applies to the entire data lifecycle, from data preparation to reporting, and recognizes the interconnected nature of the data analytics team and information technology operations.
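To make the methodology concrete, here is a minimal sketch of the kind of automated quality gate a DataOps pipeline runs between preparation and reporting; the dataset, column names, and thresholds are hypothetical, not taken from any tool below.

```python
import pandas as pd

# Hypothetical quality gate: validate a batch before it flows downstream.
def check_orders_batch(df: pd.DataFrame) -> list[str]:
    """Return a list of data-quality violations (empty list = pass)."""
    violations = []
    if df["order_id"].duplicated().any():
        violations.append("duplicate order_id values")
    if df["amount"].lt(0).any():
        violations.append("negative amounts")
    if df["created_at"].isna().mean() > 0.01:  # tolerate <= 1% missing
        violations.append("too many missing created_at timestamps")
    return violations

batch = pd.DataFrame({
    "order_id": [1, 2, 2],
    "amount": [9.99, -5.00, 12.50],
    "created_at": pd.to_datetime(["2024-05-01", None, "2024-05-02"]),
})
problems = check_orders_batch(batch)
if problems:
    # In a real pipeline this would fail the run and alert the team.
    raise ValueError(f"Batch rejected: {problems}")
```

Running such checks on every batch, in version control and CI like application code, is the "reduce cycle time without sacrificing quality" idea in practice.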
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
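This tagline matches cleanlab; assuming that is the project, a typical use is flagging likely label errors from out-of-sample predicted probabilities (the model and synthetic data here are placeholders):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from cleanlab.filter import find_label_issues

X, labels = make_classification(n_samples=500, n_classes=3,
                                n_informative=5, random_state=0)
# Out-of-sample predicted probabilities via cross-validation.
pred_probs = cross_val_predict(LogisticRegression(max_iter=1000), X, labels,
                               cv=5, method="predict_proba")
# Indices of examples whose given label is likely wrong, worst first.
issue_idx = find_label_issues(labels=labels, pred_probs=pred_probs,
                              return_indices_ranked_by="self_confidence")
print(f"{len(issue_idx)} suspected label issues, e.g. {issue_idx[:5]}")
```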
Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.
Efficient data transformation and modeling framework that is backwards compatible with dbt.
Engine for ML/Data tracking, visualization, explainability, drift detection, and dashboards for Polyaxon.
One framework to develop, deploy and operate data workflows with Python and SQL.
Snowflake infrastructure-as-code: provision environments, automate deploys and CI/CD, and manage RBAC, users, roles, and data access through a declarative Python resource API. A change-management tool for the Snowflake data warehouse.
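The "declarative Python resource API" suggests usage along these lines; the class names and fields below are assumptions for illustration, not the project's confirmed API:

```python
# Hypothetical declarative-resource sketch: describe desired Snowflake
# state as Python objects, then let the tool plan and apply the diff,
# much like Terraform for a warehouse.
from dataclasses import dataclass

@dataclass(frozen=True)
class Warehouse:
    name: str
    size: str = "XSMALL"
    auto_suspend: int = 60  # seconds

@dataclass(frozen=True)
class RoleGrant:
    role: str
    privilege: str
    on_warehouse: str

desired_state = [
    Warehouse(name="TRANSFORMING", size="SMALL"),
    RoleGrant(role="ANALYST", privilege="USAGE", on_warehouse="TRANSFORMING"),
]
# A real engine would compare desired_state against the live account
# and emit only the CREATE/ALTER/GRANT statements needed to converge.
```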
A Python library that enables ML teams to share, load, and transform data in a collaborative, flexible, and efficient way 🌰
Unified storage framework for the entire machine learning lifecycle
Open Source Data Quality Monitoring.
😎 A curated list of awesome DataOps tools
Interactive computing for complex data processing, modeling and analysis in Python 3
DataOps framework for Machine Learning projects.
End-to-end DataOps platform deployed by Terraform.
Squirrel dataset hub
Run LLM-related tools in containers.
Build, test, and deploy ETL solutions using AWS Glue and AWS CDK-based CI/CD pipelines
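As a sketch of the Glue-plus-CDK approach, here is a minimal CDK stack in Python that provisions a Glue job; the job name, role policy, and S3 script location are placeholders:

```python
from aws_cdk import App, Stack
from aws_cdk import aws_glue as glue
from aws_cdk import aws_iam as iam
from constructs import Construct

class EtlStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)
        # Execution role the Glue job assumes.
        role = iam.Role(
            self, "GlueJobRole",
            assumed_by=iam.ServicePrincipal("glue.amazonaws.com"),
            managed_policies=[iam.ManagedPolicy.from_aws_managed_policy_name(
                "service-role/AWSGlueServiceRole")])
        # L1 construct mapping 1:1 to AWS::Glue::Job.
        glue.CfnJob(
            self, "EtlJob",
            name="example-etl-job",
            role=role.role_arn,
            command=glue.CfnJob.JobCommandProperty(
                name="glueetl",
                python_version="3",
                script_location="s3://example-bucket/scripts/etl.py"),
            glue_version="4.0")

app = App()
EtlStack(app, "EtlStack")
app.synth()
```

Because the stack is ordinary code, it can be unit-tested and promoted through a CI/CD pipeline like any other artifact.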
A framework for rapid development of robust data pipelines following a simple design pattern
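The entry does not name the pattern, but a common "simple design pattern" for robust pipelines is a chain of small, single-purpose steps composed by a runner; this sketch (all names hypothetical) shows the shape:

```python
from typing import Callable, Iterable

Step = Callable[[dict], dict]  # each step takes and returns a context dict

def run_pipeline(steps: Iterable[Step], context: dict) -> dict:
    """Run steps in order; any exception aborts the run, naming the failing step."""
    for step in steps:
        try:
            context = step(context)
        except Exception as exc:
            raise RuntimeError(f"pipeline failed at {step.__name__}: {exc}") from exc
    return context

def extract(ctx: dict) -> dict:
    ctx["rows"] = [{"id": 1, "value": 10}, {"id": 2, "value": -3}]
    return ctx

def transform(ctx: dict) -> dict:
    ctx["rows"] = [r for r in ctx["rows"] if r["value"] >= 0]  # drop bad rows
    return ctx

def load(ctx: dict) -> dict:
    print(f"loading {len(ctx['rows'])} rows")  # stand-in for a real sink
    return ctx

result = run_pipeline([extract, transform, load], {})
```

Keeping each step pure and small is what makes such pipelines easy to test and to rearrange as requirements change.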