Poor performance for content-only extraction

While training new models for a release (as mentioned in #52 and #60), I got much worse performance on content-only extraction than was reported before #52, and much worse than on content with comments (I consistently get an F1 score of about 0.6).

I'm looking into the cause now, but I don't think a new release should be made until this is resolved (unless anyone here has trained a good model since #52 was merged).


Poor performance for content-only extraction #61
