While training new models for a release (as mentioned in #52 and #60), I got much worse performance on content extraction than what was reported before #52, and much worse than on content with comments (I consistently get an F1 score of about 0.6).
I'm looking into the cause now, but I don't think a new release should be made until this is resolved (unless anyone here has trained a good model since #52 was merged).