AutoMQ: A cloud-native fork of Kafka that separates storage to S3

🍵 AutoMQ vs Other Streaming Platforms

Feature	AutoMQ	Apache Kafka	Confluent	Apache Pulsar	Redpanda	WarpStream
Apache Kafka Compatibility	Native Kafka	Native Kafka	Native Kafka	Non-Kafka	Kafka Protocol	Kafka Protocol
Source Code Availability	Yes	Yes	No	Yes	Yes	No
Stateless Broker	Yes	No	No	Yes	No	Yes
P99 Latency	Single-digit ms latency					> 400ms
Continuous Self-Balancing	Yes	No	Yes	Yes	Yes	Yes
Scale in/out	In seconds	In hours/days	In hours	In hours (scale-in); In seconds (scale-out)	In hours/In seconds (Enterprise Only)	In seconds
Spot Instance Support	Yes	No	No	No	No	Yes
Partition Reassignment	In seconds	In hours/days	In hours	In seconds	In hours/In seconds (Enterprise Only)	In seconds
Components	Broker, Controller	Broker, Controller, ZooKeeper (Non-KRaft)		Broker, Controller, ZooKeeper, BookKeeper, Proxy	Broker, Controller	Agent, MetadataServer
Durability	Guaranteed by S3/EBS[1]	Guaranteed by ISR		Guaranteed by BookKeeper	Guaranteed by Raft	Guaranteed by S3
Inter-AZ Networking Fees	No	Yes				No

Tip: The definition of Apache Kafka compatibility used here comes from this blog.

[1] EBS Durability: On Azure, GCP, and Alibaba Cloud, regional EBS replicas span multiple AZs. On AWS, durability is ensured by writing to both EBS and S3 Express One Zone in different AZs.

🔶Why AutoMQ

Cloud Native: Built on cloud services. Every system design decision takes the features and billing items of cloud services into account to offer a low-latency, scalable, reliable, and cost-effective Kafka service on the cloud.
High Reliability: Leverages the features of cloud services to offer an RPO of 0 and an RTO of seconds.
- AWS: Uses S3 Express One Zone and S3 to offer AZ-level disaster recovery.
- GCP: Uses regional SSD and Cloud Storage to offer AZ-level disaster recovery.
- Azure: Uses zone-redundant storage and Blob Storage to offer AZ-level disaster recovery.
Serverless:
- Auto Scaling: Watches key cluster metrics and scales in/out automatically to match your workload, so you pay only for what you use.
- Scaling in seconds: The computing layer (broker) is stateless and can scale in/out in seconds, which makes AutoMQ truly serverless. Learn more
- Infinitely scalable: Uses the cloud's object storage as the main storage, so you never need to worry about storage capacity.
Manage-less: The built-in auto-balancer component balances partitions and network traffic across brokers automatically, so you never need to worry about partition rebalancing. Learn more
Cost effective: Using object storage as the primary storage, factoring billing into the system design, and making the most of cloud services together make AutoMQ 10x more cost-effective than Apache Kafka. Refer to this report to see how we cut Apache Kafka billing by 90% on the cloud.
High performance:
- Low latency: Uses cloud block storage such as AWS EBS as the WAL (Write-Ahead Log) to accelerate writes.
- High throughput: Uses pre-fetching, batch processing, and parallel processing to achieve high throughput.
Refer to the AutoMQ Performance White Paper to see how we achieve this.
A superior alternative to Apache Kafka: 100% compatible with Apache Kafka versions later than 0.9.x, keeping all of its features while being cheaper and better.
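
To make the "Manage-less" point above more concrete, here is a minimal sketch of how a traffic-based balancer could pick partition moves. It is not AutoMQ's actual auto-balancer: the `Broker` class, the `rebalance` function, the `tolerance` parameter, and the greedy heuristic are all assumptions made for illustration. Because brokers are stateless, a move in this model is only an ownership change and copies no data.

```python
from dataclasses import dataclass, field


@dataclass
class Broker:
    """Hypothetical broker record: partition name -> traffic in MB/s."""
    name: str
    partitions: dict = field(default_factory=dict)

    @property
    def load(self) -> float:
        return sum(self.partitions.values())


def rebalance(brokers, tolerance=1.0):
    """Greedily move partitions from the busiest to the idlest broker.

    Stops when the gap between the two is within `tolerance` MB/s, or when
    no single move would narrow it. Returns the moves as
    (partition, source broker, target broker) tuples.
    """
    moves = []
    while True:
        busiest = max(brokers, key=lambda b: b.load)
        idlest = min(brokers, key=lambda b: b.load)
        gap = busiest.load - idlest.load
        if gap <= tolerance:
            return moves
        # Only partitions with traffic below the gap reduce the imbalance.
        candidates = {p: t for p, t in busiest.partitions.items() if 0 < t < gap}
        if not candidates:
            return moves
        # Prefer the partition whose traffic is closest to half the gap.
        partition = min(candidates, key=lambda p: abs(candidates[p] - gap / 2))
        idlest.partitions[partition] = busiest.partitions.pop(partition)
        moves.append((partition, busiest.name, idlest.name))


# Illustrative cluster: one hot broker, one light broker, one empty broker.
brokers = [
    Broker("b1", {"t-0": 40, "t-1": 30, "t-2": 20}),
    Broker("b2", {"t-3": 10}),
    Broker("b3"),
]
moves = rebalance(brokers)
for partition, source, target in moves:
    print(f"move {partition}: {source} -> {target}")
```

In this example the loads go from 90/10/0 MB/s to 30/30/40 MB/s after two moves. A real balancer would also weigh factors such as partition counts and move cost, which this sketch ignores.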

✨Architecture

AutoMQ treats Apache Kafka's LogSegment as the code aspect into which its features are woven. The architecture includes the following main components:

S3Stream: A streaming library based on object storage offered by AutoMQ. It is the core component of AutoMQ and is responsible for reading and writing data to object storage. Learn more.
Stream: A stream is the abstraction that Apache Kafka's LogSegment is mapped onto. A LogSegment's data, index, and other metadata map to different types of streams. Learn more
WAL: AutoMQ uses a small cloud block storage volume such as AWS EBS as the WAL (Write-Ahead Log) to accelerate writes. Note that this is not tiered storage, and the AutoMQ broker can be completely decoupled from the WAL. Learn more
Stream Set Object: A Stream Set Object aggregates data from multiple streams into individual segments, significantly cutting down on object storage API usage and metadata size. Learn more
Stream Object: A Stream Object contains data from a single stream, typically separated when compacting Stream Set Objects for streams with larger data volumes. Learn more
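
The relationship between Stream Set Objects and Stream Objects described above can be sketched as follows. This is a simplified in-memory model, not the S3Stream implementation: the function names, the dictionary layout, and the `SPLIT_THRESHOLD_BYTES` value are hypothetical and chosen only to keep the example small.

```python
from collections import defaultdict

# Hypothetical threshold: a stream holding at least this many bytes across
# the compacted Stream Set Objects is split out into its own Stream Object.
SPLIT_THRESHOLD_BYTES = 16


def build_stream_set_object(pending):
    """Pack records buffered for many streams into one object.

    `pending` maps stream id -> list of byte records. The result holds the
    concatenated payload plus an index of (stream_id, offset, length), so
    one object-storage write covers every stream.
    """
    payload = bytearray()
    index = []
    for stream_id in sorted(pending):
        data = b"".join(pending[stream_id])
        index.append((stream_id, len(payload), len(data)))
        payload += data
    return {"index": index, "payload": bytes(payload)}


def compact(stream_set_objects, threshold=SPLIT_THRESHOLD_BYTES):
    """Merge Stream Set Objects, splitting large streams into Stream Objects.

    Returns (stream_objects, merged_stream_set_object).
    """
    per_stream = defaultdict(bytes)
    for obj in stream_set_objects:
        for stream_id, offset, length in obj["index"]:
            per_stream[stream_id] += obj["payload"][offset:offset + length]

    stream_objects = {}
    small = {}
    for stream_id, data in per_stream.items():
        if len(data) >= threshold:
            stream_objects[stream_id] = data  # one stream per object
        else:
            small[stream_id] = [data]         # stays aggregated
    return stream_objects, build_stream_set_object(small)


# Two uploads: stream 1 is busy, streams 2 and 3 are small.
first = build_stream_set_object({1: [b"aaaaaaaa"], 2: [b"b"]})
second = build_stream_set_object({1: [b"aaaaaaaa"], 3: [b"cc"]})
stream_objects, remainder = compact([first, second])
print(sorted(stream_objects), remainder["index"])
```

Here stream 1 accumulates 16 bytes and is split into its own Stream Object, while streams 2 and 3 remain packed together in a single, smaller Stream Set Object.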

⛄Get started with AutoMQ

Quick Start With A Single Line

curl https://download.automq.com/install.sh | sh

The easiest way to run AutoMQ. You can experience features like Partition Reassignment in Seconds and Continuous Self-Balancing on your local machine. Learn more

Attention: Local mode mocks object storage locally and is not a production-ready deployment. It is intended for demo and test purposes only.

Run AutoMQ on the cloud manually

Deploy AutoMQ manually on the cloud using the released tgz files. Currently compatible with AWS, Aliyun Cloud, Tencent Cloud, Huawei Cloud, and Baidu Cloud. Learn more

💬Community

You can join the following groups or channels to discuss or ask questions about AutoMQ:

Ask questions or report a bug via GitHub Issues
Discuss AutoMQ or Kafka on Slack or in the WeChat group

👥How to contribute

If you've found a problem with AutoMQ, please open a GitHub issue. To contribute to AutoMQ, please see the Code of Conduct and the Contributing Guide. We maintain a list of good first issues to help you get started, gain experience, and become familiar with our contribution process.

🌈Roadmap

Coming soon...

⭐License

AutoMQ is released under the Business Source License 1.1. When contributing to AutoMQ, you can find the relevant license header in each file.
