thehaou edited this page Dec 23, 2022 · 3 revisions

🚒📚🔖 ficscraper ✍💬❤️

The goal of ficscraper is to provide fanfiction readers with a way to generate & interpret stats on their reading habits on websites that provide none. For example:

  • How many words of Harry Potter fanfiction did I read in the year 20XX?
  • Which authors have I read the most, ranked either by word count or by number of fics?
  • For each fandom I read this year, what was the order I started reading them in, and which fic did I read from them first?
  • Based on the tags of all the fics I've read, what would my "ideal fic" look like?

And so on. Fanfiction itself is a labor of love and I genuinely hope that ficscraper can provide you with some interesting ways to investigate your own personal relationship with it.

🚨 Let's Get Started

  1. First follow the instructions on the Installation page.
  2. After that's all set up, follow the instructions on the How to Use page.

✋ Supported fanfiction websites

Currently ficscraper only supports stats on Archive of Our Own (aka AO3). This is due to AO3's rich tagging system, which allows significantly more flexibility in finding patterns in fics read.

Other fanfiction websites such as FanFiction.net (FFN), Wattpad, and Tumblr blogs dedicated to fic writing are considered out-of-scope for this project until I feel ficscraper's AO3 side is sufficiently developed. I more than welcome discussion on implementation of ficscraper for other websites though!

❓ Functionality

ficscraper works in three stages:

  1. Extract user's interactions with fic, such as:
    1. reading history
    2. kudos history
    3. personal bookmarks & tags
  2. Transfer & load the collected information into SQLite, an extremely handy no-installation-needed/in-memory/embedded database management system.
    1. One could actually stop at this stage if they want to begin running stats on their interactions. See here for example queries you can run against SQLite.
  3. Visualize certain types of interactions as something nicely readable for humans (and shareable)!
    1. See here for ficscraper's current visualization templates. (It uses templated HTML+CSS with some Python mixed in via Mako.)
    2. If you're here to learn how to create your own AO3-Year-In-Review, see here.
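As a rough illustration of stage 2, the sketch below loads a few made-up rows into an in-memory SQLite database and answers two of the questions from the top of this page. The table name `fics_read`, its columns, the helper functions, and the sample data are all hypothetical; ficscraper's actual schema may look quite different.

```python
import sqlite3

# Hypothetical schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE fics_read (
        title      TEXT,
        author     TEXT,
        fandom     TEXT,
        word_count INTEGER,
        date_read  TEXT  -- ISO 8601 date, e.g. '2022-03-14'
    )
    """
)
conn.executemany(
    "INSERT INTO fics_read VALUES (?, ?, ?, ?, ?)",
    [
        ("Fic A", "author_one", "Harry Potter", 12000, "2022-01-05"),
        ("Fic B", "author_two", "Harry Potter", 3500, "2022-06-20"),
        ("Fic C", "author_one", "Star Trek", 8000, "2022-07-01"),
        ("Fic D", "author_one", "Harry Potter", 50000, "2021-11-30"),
    ],
)


def words_read(conn, fandom, year):
    """Total words read in one fandom during one calendar year."""
    row = conn.execute(
        """
        SELECT COALESCE(SUM(word_count), 0)
        FROM fics_read
        WHERE fandom = ? AND strftime('%Y', date_read) = ?
        """,
        (fandom, str(year)),
    ).fetchone()
    return row[0]


def author_ranking(conn):
    """Authors ordered by total words read, then by number of fics."""
    return conn.execute(
        """
        SELECT author, SUM(word_count) AS total_words, COUNT(*) AS fic_count
        FROM fics_read
        GROUP BY author
        ORDER BY total_words DESC, fic_count DESC
        """
    ).fetchall()


print(words_read(conn, "Harry Potter", 2022))
print(author_ranking(conn))
```

With the sample rows above, the first query sums the two Harry Potter fics dated 2022, and the second lists `author_one` ahead of `author_two`. Swapping the `ORDER BY` columns would rank by number of fics first instead.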

🙋 Frequently asked questions

Please submit legitimate bugs/errors with ficscraper to Issues (don't forget to add the bug label), and all other suggestions/questions to Discussions.


Q. Why didn't you make a website and have it run ficscraper for me instead? I don't want to have to do all this coding work, and it'd be nice if I could just login and see my stats rather than have to do upkeep myself.

A. A couple key reasons.

  1. I'm setting up a session with AO3 by literally scraping the authentication token and using it for the whole session. Furthermore, I'm requiring a plaintext username & password to even grab said token. This is frankly way out of my comfort zone to even think about putting on a website - I'm not fluent in implementing website security, and I don't want to be on the hook for your account getting hacked.
  2. AO3 rate-limits clients to approximately 20-80 requests per 10 minutes. This is perfectly fine when you're slowly reading through a multichapter fic - it's less fine when there are 200+ pages of bookmarks ficscraper is trying to grab. Multiply that by multiple users and you can quickly see how the throughput of a hosted service would fall through the floor.
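To put rough numbers on the second point, the sketch below estimates how long a scrape would take at the request rates quoted above. It assumes one request per page of bookmarks and that every user of a hosted service would share a single rate limit; both are simplifying assumptions, and the function name `minutes_to_fetch` is made up for this illustration rather than taken from ficscraper.

```python
WINDOW_MINUTES = 10  # length of the rate-limit window quoted above


def minutes_to_fetch(pages, requests_per_window, users=1):
    """Estimate minutes needed to fetch `pages` pages for each of `users` users.

    Assumes one request per page and that all users share one rate limit.
    This is an average-rate estimate, not a simulation of AO3's behavior.
    """
    total_requests = pages * users
    return total_requests / requests_per_window * WINDOW_MINUTES


# One user with 200 pages of bookmarks:
best = minutes_to_fetch(200, 80)   # generous end of the quoted limit
worst = minutes_to_fetch(200, 20)  # strict end of the quoted limit
print(best, worst)

# Ten such users sharing the same limit, at the strict end:
print(minutes_to_fetch(200, 20, users=10))
```

Under these assumptions a single user's 200 pages would take somewhere between about 25 and 100 minutes, and ten such users behind one server would take roughly ten times as long.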

Q. Why Python 3.9?

A. No real reason; it was newish at the time of implementation and bs4 is pretty straightforward to use in Python.