bug 1271509: Add ./manage.py scrape_document #4248
Fix some things that should have been in the last PR (fewer user test fixtures, an improved scraper), and then add the
@@            Coverage Diff             @@
##           master    #4248      +/-   ##
==========================================
+ Coverage   87.09%   87.86%   +0.77%
==========================================
  Files         155      162       +7
  Lines        9420    10022     +602
  Branches     1263     1367     +104
==========================================
+ Hits         8204     8806     +602
  Misses        989      989
  Partials      227      227
Wow, this is a lot of great work, with superb tests as well. I really like the core, iterative, state-driven processing loop and how elegantly it accommodates the complex, recursive dependencies of scraping data from Kuma. It reveals a lot of careful thought invested in breaking down a complex problem. It's a kind of microcosm of Kuma itself. Reading it in detail helped both broaden and confirm my knowledge of Kuma.
I'm almost embarrassed to even mention a couple of nitpicks. This PR deserves fanfare.
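For readers who haven't opened the diff, here is a rough sketch of what I mean by an iterative, state-driven loop that copes with recursive dependencies. This is not the code in this PR: the function `scrape`, the state names, and the `get_requirements` callback are all hypothetical, chosen only to illustrate the general pattern of re-visiting blocked items on each pass until their requirements are done.

```python
# Hypothetical states for one unit of work; these names are illustrative
# and are not taken from the PR.
PENDING, BLOCKED, DONE = "pending", "blocked", "done"


def scrape(targets, get_requirements, max_passes=100):
    """Process targets in repeated passes until all are done or stuck.

    get_requirements(name) returns the names that must be DONE before
    `name` can be processed. Unknown requirements are added as new work.
    Returns (completion order, final state of every item).
    """
    state = {name: PENDING for name in targets}
    order = []
    for _ in range(max_passes):
        progressed = False
        # Iterate over a snapshot, since new items may be added below.
        for name in list(state):
            if state[name] == DONE:
                continue
            missing = [r for r in get_requirements(name)
                       if state.get(r) != DONE]
            if not missing:
                state[name] = DONE
                order.append(name)
                progressed = True
                continue
            # Discover requirements we have not seen yet.
            for r in missing:
                if r not in state:
                    state[r] = PENDING
                    progressed = True
            state[name] = BLOCKED
        if all(s == DONE for s in state.values()):
            break
        if not progressed:
            break  # a cycle or an unsatisfiable requirement
    return order, state


# Made-up example: each document needs its parent to exist first.
parents = {"docs/a/b": ["docs/a"], "docs/a": ["docs"], "docs": []}
order, state = scrape(["docs/a/b"], lambda name: parents[name])
print(order)
```

The appeal of this shape is that each item only has to say what it is waiting on; the loop discovers the rest, and a pass with no progress is a natural place to stop and report what is still blocked.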