
Commit 72a19dd

Author: Joseph Luce
Update web_crawler.md
1 parent 6f54df6 commit 72a19dd

File tree

1 file changed: +2 −2 lines changed


system_design/web_crawler.md

Lines changed: 2 additions & 2 deletions

@@ -54,9 +54,9 @@ This isn't 100% best solution due to collisions, this is a question you need to
 
 # Fault Tolerance
 The URL Manager and the Content Manager should both have a master-slave architecture for better up-time.
-However, if both go down at the same time, we can revive them by using the database.
+However, if both go down at the same time, we can revive them by using each their own database.
 Since the URL Manager will be getting a stream of URLs, hence, having a queue, it would be important to save this queue into the database.
-Therefore, requiring the database to have a visited URL hash codes and unvisited set of URL links.
+Therefore, requiring the database to have visited set of URL hash codes and an unvisited set of URL links.
 
 For the extractors, it will depend if the extractors will have a queue or not.
 If they will hold a queue, it will be important to also have a master-slave architecture.
