Commit 5f74c1a: add notes for system-design/live-comment (#145)

Signed-off-by: Xiaoming Guo <danniel1205@gmail.com>
danniel1205 committed Jan 9, 2024
1 parent bf29180 commit 5f74c1a
Showing 43 changed files with 487 additions and 9 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -63,7 +63,7 @@ This repo contains the notes/tutorials from my personal tech exploration.
- [Design text based search system](system-design/topics/text-based-search/readme.md)
- [Design distributed delayed job queue](system-design/topics/distributed-delayed-job-queueing-system/readme.md)
- [Design distributed caching](system-design/topics/caching/readme.md)
- [Design real time interactions on live video](system-design/topics/real-time-interactions-on-live-video/readme.md)
- [Design real time interactions on live video](system-design/topics/realtime-interactions-on-live-video/readme.md)
- [Design message broker(RabbitMQ) or event streaming system(Kafka)](system-design/topics/message-broker-and-event-streaming/readme.md)
- [Design geolocation based service(UberEats/DoorDash/...)](system-design/topics/geolocation-based-service/readme.md)
- [Design i18n/translation service](system-design/topics/i18n-service/readme.md)
3 changes: 2 additions & 1 deletion md_style.rb
@@ -3,4 +3,5 @@
rule 'MD029', :style => :ordered
exclude_rule 'MD031'
exclude_rule 'MD010'
exclude_rule 'MD024'
exclude_rule 'MD024'
exclude_rule 'MD004'
@@ -63,6 +63,8 @@ A crash could happen in the middle of the disk IO. File system crash recovery soluti

## Solution 2 (SSTables and LSM-Trees)

![](resources/lsm-tree.png)

- Make the segment file sorted by keys (keep the most recent data entry if keys are the same)

Benefits: Data in the log segment file is sorted; compacting and merging are easier (merge sort). A minimal merge sketch is shown below.
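
To make the compaction/merge step concrete, here is a minimal in-memory sketch in Go; real SSTable compaction streams entries from sorted segment files on disk rather than slices. `Entry` and `mergeSegments` are hypothetical names used only for illustration.

```go
package main

import "fmt"

// Entry is one key/value record in a sorted segment file.
type Entry struct {
	Key, Value string
}

// mergeSegments merges two key-sorted segments into one sorted segment.
// When both segments contain the same key, the entry from the newer
// segment wins, which is how compaction keeps only the latest write.
func mergeSegments(older, newer []Entry) []Entry {
	var out []Entry
	i, j := 0, 0
	for i < len(older) && j < len(newer) {
		switch {
		case older[i].Key < newer[j].Key:
			out = append(out, older[i])
			i++
		case older[i].Key > newer[j].Key:
			out = append(out, newer[j])
			j++
		default: // same key: keep the more recent entry
			out = append(out, newer[j])
			i++
			j++
		}
	}
	out = append(out, older[i:]...)
	out = append(out, newer[j:]...)
	return out
}

func main() {
	older := []Entry{{"a", "1"}, {"b", "2"}, {"d", "4"}}
	newer := []Entry{{"b", "20"}, {"c", "3"}}
	fmt.Println(mergeSegments(older, newer)) // [{a 1} {b 20} {c 3} {d 4}]
}
```
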
6 changes: 4 additions & 2 deletions system-design/topics/distributed-counter/readme.md
@@ -123,7 +123,7 @@ Challenges:

![](resources/row-lock-is-slow.png)

* Write scalability is a challenge because traditional relational database often have a single mater node for write,
* Write scalability is a challenge because traditional relational databases often have a single master node for writes,
creating a bottleneck for high write volumes.
* If we want to iterate and add features that require complex queries, those queries can be computationally expensive
and slow.
@@ -271,7 +271,7 @@ It can be implemented as [look-aside with write-around architecture](../caching/

## Push real-time count updates to all users(subscribed users)

Please see [this notes](../real-time-interactions-on-live-video/readme.md) for more details on how Linkedin handles the
Please see [these notes](../realtime-interactions-on-live-video/readme.md) for more details on how LinkedIn handles the
live likes updates. Also, the `Green` boxes in the architecture diagram show the workflow.

## Failure handling
@@ -302,7 +302,9 @@ partitioned network in the CRDT database is the following:
## Miscellaneous

* Cassandra stress test

![](resources/cassandra-stress-test.png)

* <https://netflixtechblog.com/benchmarking-cassandra-scalability-on-aws-over-a-million-writes-per-second-39f45f066c9e>

* Millions of websockets
34 changes: 34 additions & 0 deletions system-design/topics/message-broker-and-event-streaming/readme.md
@@ -64,6 +64,40 @@ func (Consumer c) Consume(topic string, labels) []byte {
- Consumer maintains the offset. Message ID has data length embedded, so it is easy to know how much data to load.
- ZooKeeper also records the last read offset in case of consumer failures.

#### How does Kafka know which consumers subscribe to a specific topic

* Consumer Groups: Consumers don't subscribe individually to topics. Instead, they join consumer groups, which act as
logical units for message consumption. A consumer group can have multiple consumers working collaboratively to consume
messages from the shared topics.

* Group Coordinator: Each consumer group has a designated group coordinator, which is a broker (the leader of the
`__consumer_offsets` partition that the group ID maps to). The coordinator is responsible for managing group membership
and keeping track of which consumers are subscribed to the topic within the group.

* Group Metadata: Every consumer in a group maintains a local copy of the group metadata. This metadata includes
information about all members of the group, including their IDs, leader status, and assigned partitions.

* Heartbeats: Consumers periodically send heartbeats to the group coordinator. These heartbeats confirm their current
status and presence in the group. If a consumer fails to send heartbeats within a specific timeout, the coordinator
assumes it has left the group and removes it from the metadata.

* Rebalancing: When a consumer joins or leaves a group, or a partition leader changes, the group coordinator initiates a
rebalancing process. This process reassigns partitions among the active consumers in the group to ensure even load
distribution and efficient message consumption.

* Offset Commits: Each consumer tracks its progress within a topic by recording its offset, which indicates the last
message it has processed. Consumers periodically commit their offsets to a dedicated internal topic called
`__consumer_offsets`.

* Offset Tracking: The group coordinator and all consumers maintain copies of the committed offsets for each consumer
and partition. This allows the coordinator to track individual progress and reassign partitions during rebalancing
based on current consumption positions.

In summary, Kafka uses a combination of consumer groups, group coordinators, metadata, heartbeats, rebalancing, and
offset tracking to manage subscriptions and dynamically deliver messages to the appropriate consumers within each group.
This system ensures efficient and resilient message flow while adapting to changes in group membership and partition
leadership.
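
To make the flow above concrete, here is a minimal consumer-group sketch in Go using the segmentio/kafka-go client (an assumed client choice, not something these notes prescribe); the broker address, group ID, and topic name are hypothetical. Joining with a `GroupID` lets the group coordinator assign partitions to this reader, heartbeats keep it in the group, and `CommitMessages` records progress in the internal `__consumer_offsets` topic.

```go
package main

import (
	"context"
	"log"
	"time"

	"github.com/segmentio/kafka-go"
)

func main() {
	// Joining a consumer group: the broker-side group coordinator assigns
	// partitions to this reader and rebalances when members join or leave.
	r := kafka.NewReader(kafka.ReaderConfig{
		Brokers:           []string{"localhost:9092"}, // assumed broker address
		GroupID:           "example-group",            // hypothetical group name
		Topic:             "example-topic",            // hypothetical topic name
		HeartbeatInterval: 3 * time.Second,            // heartbeats keep this member alive
		SessionTimeout:    30 * time.Second,           // missed heartbeats -> eviction and rebalance
	})
	defer r.Close()

	ctx := context.Background()
	for {
		// FetchMessage reads from whichever partitions the coordinator assigned us.
		m, err := r.FetchMessage(ctx)
		if err != nil {
			log.Fatal(err)
		}
		log.Printf("partition=%d offset=%d value=%s", m.Partition, m.Offset, m.Value)

		// Explicit offset commit; Kafka stores it in the internal
		// __consumer_offsets topic so a restarted consumer resumes from here.
		if err := r.CommitMessages(ctx, m); err != nil {
			log.Fatal(err)
		}
	}
}
```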

### RabbitMQ Architecture

![rabbitmq-architecture](resources/rabbitmq-architecture.png)
2 changes: 1 addition & 1 deletion system-design/topics/news-feeds/readme.md
@@ -164,7 +164,7 @@ Above workflow is from [this youtube video](https://www.youtube.com/watch?v=bUHF

### Streaming likes on Live video

More details could be found [here](../real-time-interactions-on-live-video/readme.md)
More details could be found [here](../realtime-interactions-on-live-video/readme.md)

### Add comments

@@ -0,0 +1,55 @@
# Long Polling vs SSE vs WebSocket

## HTTP Long Polling

Mechanism: Client sends a request, server holds it open until data is available or a timeout occurs, then responds and
closes the connection. Client immediately sends another request, creating a continuous loop.

![](resources/long-polling.png)

### Pros

* Simple, works over standard HTTP without special libraries.
* Works behind most firewalls and proxies.

### Cons

* Inefficient for high-frequency updates due to frequent polling and connection overhead.
* Server-side resource usage can be high due to many open connections.
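
A minimal long-polling client loop in Go, assuming a hypothetical `/events` endpoint on `localhost:8080` that the server holds open until an update is ready or a roughly 30-second timeout expires:

```go
package main

import (
	"io"
	"log"
	"net/http"
	"time"
)

func main() {
	// Client timeout is slightly longer than the server's hold time.
	client := &http.Client{Timeout: 35 * time.Second}
	for {
		resp, err := client.Get("http://localhost:8080/events") // hypothetical endpoint
		if err != nil {
			time.Sleep(time.Second) // back off briefly, then poll again
			continue
		}
		body, _ := io.ReadAll(resp.Body)
		resp.Body.Close()
		if resp.StatusCode == http.StatusOK && len(body) > 0 {
			log.Printf("update: %s", body)
		}
		// Immediately re-issue the request: this loop is the "long polling".
	}
}
```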

## Server-Sent Events (SSE)

Mechanism: Unidirectional communication from server to client over a single, long-lived HTTP connection. Server pushes
events to the client as they occur.

![](resources/sse-workflow.png)

### Pros

* More efficient than Long Polling for one-way updates.
* Reduced server-side resource usage.
* Simpler JavaScript API for client-side handling.

### Cons

* Still relies on HTTP, not as efficient as WebSockets for bidirectional communication.
* Limited browser support compared to WebSockets.
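
A minimal SSE endpoint sketch in Go using only the standard library; the `/events` path and the 2-second ticker are illustrative stand-ins for a real event source. Each event goes out in the `data: ...` wire format followed by a blank line and is flushed immediately:

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

func sseHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "text/event-stream")
	w.Header().Set("Cache-Control", "no-cache")
	w.Header().Set("Connection", "keep-alive")

	flusher, ok := w.(http.Flusher)
	if !ok {
		http.Error(w, "streaming unsupported", http.StatusInternalServerError)
		return
	}

	ticker := time.NewTicker(2 * time.Second)
	defer ticker.Stop()
	for {
		select {
		case <-r.Context().Done(): // client disconnected
			return
		case t := <-ticker.C:
			// One SSE event: "data: <payload>" terminated by a blank line.
			fmt.Fprintf(w, "data: %s\n\n", t.Format(time.RFC3339))
			flusher.Flush()
		}
	}
}

func main() {
	http.HandleFunc("/events", sseHandler)
	http.ListenAndServe(":8080", nil)
}
```

On the client side, `new EventSource("/events")` in the browser with an `onmessage` handler is enough to consume this stream.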

## WebSockets

Mechanism: Full-duplex, bidirectional communication over a persistent TCP connection. Enables real-time, two-way data
exchange between client and server.

![](resources/websocket.png)

### Pros

* Most efficient for real-time, bidirectional communication.
* Low latency, low overhead.
* Wide browser support.
* Can handle binary data and messages of any size.

### Cons

* Requires WebSocket-specific libraries and protocols.
* Might be blocked by some firewalls or proxies.
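
A minimal WebSocket echo sketch in Go using the gorilla/websocket package (an assumed library choice; the `/ws` path is hypothetical). The HTTP connection is upgraded once, after which the same connection carries messages in both directions:

```go
package main

import (
	"log"
	"net/http"

	"github.com/gorilla/websocket"
)

var upgrader = websocket.Upgrader{
	// Allow all origins in this sketch only; real deployments should validate.
	CheckOrigin: func(r *http.Request) bool { return true },
}

func wsHandler(w http.ResponseWriter, r *http.Request) {
	// Upgrade switches the HTTP connection to the WebSocket protocol.
	conn, err := upgrader.Upgrade(w, r, nil)
	if err != nil {
		log.Println("upgrade:", err)
		return
	}
	defer conn.Close()

	for {
		// Full duplex: the same connection reads from and writes to the client.
		msgType, msg, err := conn.ReadMessage()
		if err != nil {
			return
		}
		if err := conn.WriteMessage(msgType, msg); err != nil {
			return
		}
	}
}

func main() {
	http.HandleFunc("/ws", wsHandler)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```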