A simple toy crawler that crawls the IMDb top 1000 movies.
-
Install Rust and Cargo, preferably using rustup, and select the nightly toolchain.
-
Run `cargo run`.
The crawler first crawls the pages using 8 threads. When crawling is done, it starts a web server at localhost:8000. Send a request to http://localhost:8000/your term to see the JSON-serialized results.
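
The crawl phase described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the project's actual code: `crawl_page`, `crawl_all`, and `NUM_WORKERS` are hypothetical names, and the fetch step is stubbed out instead of issuing real HTTP requests. It shows 8 threads filling a shared hash table and the main thread waiting for all of them before the results would be served.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

// Number of crawler threads, matching the 8 threads mentioned above.
const NUM_WORKERS: usize = 8;

/// Hypothetical stand-in for fetching and parsing one listing page.
/// A real crawler would issue an HTTP request and parse the response here.
fn crawl_page(page: usize) -> Vec<(String, usize)> {
    vec![(format!("movie-{page}"), page)]
}

/// Crawls `num_pages` pages with `NUM_WORKERS` threads and returns the
/// collected entries once every thread has finished.
fn crawl_all(num_pages: usize) -> HashMap<String, usize> {
    let results = Arc::new(Mutex::new(HashMap::new()));

    let handles: Vec<_> = (0..NUM_WORKERS)
        .map(|worker| {
            let results = Arc::clone(&results);
            thread::spawn(move || {
                // Worker `w` handles pages w, w + 8, w + 16, ...
                for page in (worker..num_pages).step_by(NUM_WORKERS) {
                    let entries = crawl_page(page);
                    // Hold the lock only while inserting the parsed entries.
                    results.lock().unwrap().extend(entries);
                }
            })
        })
        .collect();

    // Wait for all workers; only then would the web server be started.
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }

    // Every worker has exited, so this is the last reference to the table.
    Arc::try_unwrap(results)
        .expect("no other references remain")
        .into_inner()
        .unwrap()
}

fn main() {
    let table = crawl_all(20);
    println!("crawled {} entries", table.len());
}
```

Splitting pages by stride keeps the sketch free of a work queue. A single `Mutex` around the table is enough here because each thread spends most of its time fetching, not inserting.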
- Use a "professional" in-memory DB (e.g. Redis) instead of an ad hoc hash table to maintain results.
- Separate the crawler and the server into two executables. Both could talk to the DB backend.