Quilt versions and deploys data

Note: this is the documentation for Quilt 3. For Quilt 2 see here and here.

Overview

Quilt is a collaboration tool for creating, managing, and sharing datasets in S3. Quilt users transform raw, messy data in S3 buckets into immutable datasets: reusable, trusted building blocks that are easy to version, test, share, and catalog. Working with datasets in Quilt speeds up model creation, accelerates experimentation, reduces downtime, and increases the productivity of data science teams.

Collaborate in S3

  • Quilt adds search, content preview, versioning, and a Python API to any S3 bucket
  • Every file in Quilt is versioned and searchable
  • Quilt is for data scientists, data engineers, and data-driven teams

Use cases

  • Collaborate - get everyone on the same page by pointing them all to the same immutable data version
  • Experiment faster - blob storage is schemaless and scalable, so iterations are quick
  • Recover, rollback, and reproduce with immutable packages
  • Understand what's in S3 - plaintext and faceted search over S3

Key features

  • Browse and search any S3 bucket
  • Preview images, Jupyter notebooks, Vega visualizations - without downloading
  • Read/write Python objects to and from S3
  • Immutable versions for objects, immutable packages for collections of objects
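
One way to picture immutable versions and immutable packages is content addressing: each object version is identified by a hash of its bytes, and a package is identified by a hash of the manifest that lists those object hashes. The sketch below is a conceptual illustration using only the Python standard library. It is not Quilt's implementation or API, and the names content_hash, build_manifest, and package_hash are hypothetical.

```python
import hashlib
import json
from typing import Dict


def content_hash(data: bytes) -> str:
    """Return the SHA-256 hex digest that identifies one object version."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(objects: Dict[str, bytes]) -> Dict[str, str]:
    """Map each logical key to the hash of its contents (hypothetical helper)."""
    return {key: content_hash(data) for key, data in sorted(objects.items())}


def package_hash(manifest: Dict[str, str]) -> str:
    """Hash the canonical JSON form of the manifest to identify the package."""
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Illustrative in-memory "objects"; a real package would reference S3 objects.
    v1 = build_manifest({"train.csv": b"a,b\n1,2\n", "README.md": b"demo"})
    v2 = build_manifest({"train.csv": b"a,b\n1,3\n", "README.md": b"demo"})

    # Changing one object changes the package identifier,
    # while the unchanged object keeps the same version hash.
    print(package_hash(v1) != package_hash(v2))
    print(v1["README.md"] == v2["README.md"])
```

Under this assumption, two collaborators who hold the same package identifier are guaranteed to be looking at byte-identical data, which is what makes rollback and reproduction straightforward.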

Components

  • /catalog (JavaScript) - Search, browse, and preview your data in S3
  • /api/python - Read, write, and annotate Python objects in S3