Forchins is everything you could ask for in a 4chan scraper
Make sure you've installed lxml before using Forchins.
So, you want to grab some data off of 4chan, but it's too much work?
Look no further.
It's pretty easy to use.
    from forchins import forchins as f

    scraper = f.Scraper()  # defaults to 'b' as the board on page 0
Boom. Done. Now you've got the imageboard all to yourself.
Fetching the threads on a page.
    from forchins import forchins as f

    scraper = f.Scraper()
    threads = scraper.fetch_threads()
    print(threads)
You should see a dictionary whose top-level keys are the threads' post numbers.
Note that this dictionary does not contain each thread in full. It holds only what was retrieved when the object was created, much like what you see when viewing a page that lists current threads. Also, to maximize object reusability, this method returns the most recent results.
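As a rough sketch of working with a result shaped like this, the snippet below uses a hand-written stand-in dictionary instead of a live request. The inner field names ('subject', 'reply_count') and the helper list_thread_numbers are assumptions made up for illustration. The fields Forchins actually returns may differ, and the keys may be integers rather than strings.

```python
# Stand-in for the dictionary returned by the scraper: keys are post numbers.
# The inner field names ('subject', 'reply_count') are hypothetical.
sample_threads = {
    "1002": {"subject": "second thread", "reply_count": 3},
    "1001": {"subject": "first thread", "reply_count": 12},
}


def list_thread_numbers(threads):
    """Return the top-level keys (post numbers) in ascending numeric order."""
    # key=int copes with post numbers stored as either strings or integers.
    return sorted(threads.keys(), key=int)


for number in list_thread_numbers(sample_threads):
    print(number)
```

Sorting the keys gives a stable order to work through, since a plain dictionary makes no promise about which thread comes first.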
Retrieving an entire thread.
    from forchins import forchins as f

    scraper = f.Scraper()
    threads = scraper.get_thread_list()
    # take the keys (post numbers) and pick the zeroth one for simplicity's sake
    first_post_number = list(threads.keys())[0]
    whole_thread = scraper.load_thread(first_post_number)
This requests and parses an entire thread, given the post number of its original post. The 'whole_thread' object contains a set of informational attributes about the original post along with the replies to the thread, each of which has its own attributes as well.
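To make the shape described above concrete, here is a small sketch that walks a stand-in thread object. Every key name in it ('number', 'comment', 'replies') and the helper reply_numbers are hypothetical, chosen only to mirror the description of a post with attributes plus a collection of replies. Inspect a real 'whole_thread' to see what Forchins actually provides.

```python
# Stand-in for a loaded thread. Every key name here is hypothetical.
sample_thread = {
    "number": "1001",
    "comment": "original post text",
    "replies": [
        {"number": "1003", "comment": "first reply"},
        {"number": "1004", "comment": "second reply"},
    ],
}


def reply_numbers(thread):
    """Collect the post number of each reply, in the order stored."""
    # A thread with no 'replies' entry is treated as having no replies.
    return [reply["number"] for reply in thread.get("replies", [])]


print(reply_numbers(sample_thread))
```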