remove request step-by-step from README again

commit 3753986b9c091f1d08865aa20b18d6c7bd670ab3 1 parent bf04ff3
@havocp havocp authored
Showing with 0 additions and 34 deletions.
  1. +0 −34 README.md
34 README.md
@@ -54,40 +54,6 @@ The major technologies in the app, in brief:
[Binary JSON aka BSON][bson] documents. On Heroku, one way to
use it is with the [MongoHQ][mongohq] addon.
-## A request step-by-step
-
-If you follow an incoming request to Web Words, here's what the app
-shows you:
-
- - an **embedded Jetty** HTTP server receives requests to spider sites
-     - http://wiki.eclipse.org/Jetty/Tutorial/Embedding_Jetty
- - requests are forwarded to **Akka HTTP**, which uses Jetty Continuations
-   to keep requests from tying up threads
-     - http://akka.io/docs/akka/1.2/scala/http.html
- - the web process checks for previously-spidered info in a
-   **MongoDB capped collection** which acts as a cache.
-   This uses the **Heroku MongoHQ addon**.
-     - http://www.mongodb.org/display/DOCS/Capped+Collections
-     - http://devcenter.heroku.com/articles/mongohq
- - if the spider results are not cached, the web process
-   sends a spider request to an indexer process using
-   the **RabbitMQ AMQP addon**
-     - http://www.rabbitmq.com/getstarted.html
-     - http://blog.heroku.com/archives/2011/8/31/rabbitmq_add_on_now_available_on_heroku/
- - the app talks to RabbitMQ using **Akka AMQP**
-     - http://akka.io/docs/akka-modules/1.2/modules/amqp.html
- - the indexer process receives a request from AMQP and shallow-spiders
-   the site using an Akka actor that encapsulates **AsyncHttpClient**
-     - https://github.com/sonatype/async-http-client
- - the indexer uses Akka, **Scala parallel collections**, and **JSoup** to
-   grind through the downloaded HTML taking advantage of multiple CPU cores
-     - http://www.scala-lang.org/api/current/scala/collection/parallel/package.html
-     - http://jsoup.org
- - the indexer stores its output back in the MongoDB cache and sends
-   an AMQP message back to the web process
- - the web process loads the now-cached data from MongoDB
- - the web process unsuspends the Jetty request and writes out the results
-
## Setup
If you want to understand this app and/or try running it, here's what you'll