VTS (Vector Transport Service) is an open-source tool for moving vectors and unstructured data. It is developed by Zilliz and built on Apache SeaTunnel.
- Meeting Growing Data Migration Needs: VTS evolved from our Milvus Migration Service, which has helped over 100 organizations migrate data between Milvus clusters. User demands have since grown to include migrations to Milvus from other vector databases, traditional search engines like Elasticsearch and Solr, relational databases, data warehouses, document databases, and even S3 and data lakes.
- Supporting Real-time Data Streaming and Offline Import: As vector database capabilities expand, users require both real-time data streaming and offline batch import options.
- Simplifying Unstructured Data Transformation: Unlike traditional ETL, transforming unstructured data requires AI and model capabilities. Used together with Zilliz Cloud Pipelines, VTS enables vector embedding, tagging, and complex transformations, reducing data cleaning costs and operational complexity.
- Ensuring End-to-End Data Quality: Data integration and synchronization processes are prone to data loss and inconsistencies. VTS addresses these critical data quality concerns with robust monitoring and alerting mechanisms.
Built on top of Apache SeaTunnel, VTS offers:
- Rich, extensible connectors
- Unified stream and batch processing for real-time synchronization and offline batch imports
- Distributed snapshot support for data consistency
- High performance, low latency, and scalability
- Real-time monitoring and visual management
Additionally, VTS introduces vector-specific capabilities such as multiple data source support, schema matching, and basic data validation.
Future developments include:
- Incremental synchronization
- Combined one-time migration and change data capture
- Advanced data transformation capabilities
- Enhanced monitoring and alerting
Prerequisites:
- Docker installed
- Access to source and target databases
- Required credentials and permissions
- Pull the VTS Image
docker pull zilliz/vector-transport-service:latest
docker run -it zilliz/vector-transport-service:latest /bin/bash
- Configure Your Migration
Create a configuration file (e.g., migration.conf):
env {
  parallelism = 1
  job.mode = "BATCH"
}
source {
  # Source configuration (e.g., Milvus, Elasticsearch, etc.)
  Milvus {
    url = "https://your-source-url:19530"
    token = "your-token"
    database = "default"
    collections = ["your-collection"]
    batch_size = 100
  }
}
sink {
  # Target configuration
  Milvus {
    url = "https://your-target-url:19530"
    token = "your-token"
    database = "default"
    batch_size = 10
  }
}
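Because the configuration holds access tokens, it can be convenient to render it from environment variables instead of committing a file with real credentials. The following is a minimal sketch under that assumption: the variable names (SRC_URL, SRC_TOKEN, DST_URL, DST_TOKEN, COLLECTION) are hypothetical and not defined by VTS, and the defaults simply reproduce the placeholder values shown above.

```shell
#!/bin/sh
# Sketch: render migration.conf from environment variables so tokens are not
# hard-coded in a shared file. All variable names here are hypothetical.
set -eu

SRC_URL="${SRC_URL:-https://your-source-url:19530}"
SRC_TOKEN="${SRC_TOKEN:-your-token}"
DST_URL="${DST_URL:-https://your-target-url:19530}"
DST_TOKEN="${DST_TOKEN:-your-token}"
COLLECTION="${COLLECTION:-your-collection}"

# Make a newly created file readable only by its owner, since it holds tokens.
umask 077

# Unquoted heredoc delimiter: the shell expands the variables above.
cat > migration.conf <<EOF
env {
  parallelism = 1
  job.mode = "BATCH"
}
source {
  Milvus {
    url = "${SRC_URL}"
    token = "${SRC_TOKEN}"
    database = "default"
    collections = ["${COLLECTION}"]
    batch_size = 100
  }
}
sink {
  Milvus {
    url = "${DST_URL}"
    token = "${DST_TOKEN}"
    database = "default"
    batch_size = 10
  }
}
EOF
```

Exporting the variables before running the script overrides the placeholders; the generated file has the same structure as the hand-written example.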
- Run the Migration
Cluster Mode (Recommended):
# Start the cluster
mkdir -p ./logs
./bin/seatunnel-cluster.sh -d
# Submit the job
./bin/seatunnel.sh --config ./migration.conf
Local Mode:
./bin/seatunnel.sh --config ./migration.conf -m local
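Before submitting a job in either mode, a quick sanity check on the configuration file can catch an obvious mistake such as a missing sink block. The sketch below is illustrative only: preflight is a hypothetical helper, not part of VTS or SeaTunnel, and it checks only that the file exists and contains the three top-level blocks used in the example above.

```shell
#!/bin/sh
# preflight CONF: return 0 if CONF exists and has env, source and sink blocks.
# Hypothetical helper; it does not validate connector options.
preflight() {
  conf="$1"
  if [ ! -f "$conf" ]; then
    echo "missing config: $conf" >&2
    return 1
  fi
  for block in env source sink; do
    # Match a line such as "source {" with optional surrounding whitespace.
    if ! grep -Eq "^[[:space:]]*${block}[[:space:]]*\{" "$conf"; then
      echo "no '${block}' block in $conf" >&2
      return 1
    fi
  done
  return 0
}

# Example usage (local mode, as shown above):
# preflight ./migration.conf && ./bin/seatunnel.sh --config ./migration.conf -m local
```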
- Adjust parallelism based on your data volume
- Configure appropriate batch_size for optimal performance
- Set up proper authentication and security measures
- Monitor system resources during migration
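One way to experiment with these settings is to generate variants of the job file rather than editing it by hand. The sketch below assumes the simple "key = value" layout used in the example configuration; set_option is a hypothetical helper, not part of VTS. It rewrites every occurrence of the key, so applying it to batch_size would change both the source and the sink value.

```shell
#!/bin/sh
# set_option FILE KEY VALUE: print FILE with every "KEY = ..." line rewritten
# to "KEY = VALUE", preserving indentation. Hypothetical helper, not part of VTS.
# Assumes VALUE is a plain number (no "/" or "&", which sed would interpret).
set_option() {
  sed -E "s/^([[:space:]]*)${2}[[:space:]]*=.*/\1${2} = ${3}/" "$1"
}

# Example: write a copy of the job file that uses four parallel tasks.
# set_option migration.conf parallelism 4 > migration.p4.conf
```

Keeping the original file untouched makes it easy to compare runs at different settings while watching system resources.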
VTS supports various connectors for data migration:
- Milvus (example config)
- Elasticsearch (example config)
- Pinecone (example config)
- Qdrant (example config)
- Postgres Vector (example config)
- Tencent VectorDB (example config)
For more advanced features, refer to our Tutorial.md and the Apache SeaTunnel Documentation:
- Transformers (TablePathMapper, FieldMapper, Embedding)
- Cluster mode deployment
- RESTful API for job management
- Docker deployment
- Advanced configuration options
For development setup and contribution guidelines, see Development.md.
Need help? Contact our support team:
- Email: support@zilliz.com
- Discord: Join our community
SeaTunnel is a next-generation, high-performance, distributed data integration tool. It's:
- Capable of synchronizing vast amounts of data daily
- Trusted by numerous companies for efficiency and stability
- Released under the Apache License 2.0
- A top-level project of the Apache Software Foundation (ASF)
For more information, visit the Apache SeaTunnel website.