
Isolated case - summarizes bio instead of the article #1

Open
lukeseo opened this issue Sep 24, 2013 · 1 comment

Comments

lukeseo commented Sep 24, 2013

Like the extension!

Try this page: the extension summarizes only the author's bio, not the article.

http://techcrunch.com/2013/09/18/yahoo-updates-its-flagship-iphone-app-with-cinemagraphs-read-later-feature-more-news-and-tumblr/

xissy (Owner) commented Sep 24, 2013

@lukeseo thanks for your feedback!

I confirmed the issue. To summarize a web page, the program first has to extract the article body. 3-Sentences uses Boilerpipe as its article body extractor. Unfortunately, on TechCrunch pages Boilerpipe gets confused about which section is the article: it returns the author's bio instead of the article body.

I guess a lot of people, like you and me, want to summarize TechCrunch articles, so I'll add special handling for TechCrunch to either the next version of the 3-Sentences Chrome extension or the server side.

Furthermore, because the same kind of special case may come up on other popular websites, I'm going to devise an elegant solution that can identify the sites needing special handling and route them accordingly.
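
One possible shape for such a filter is a small per-site registry: look up the page's hostname, use a site-specific extractor if one is registered, and fall back to the generic extractor otherwise. The sketch below is only an illustration under stated assumptions, not the actual 3-Sentences code. Every name in it (`SITE_EXTRACTORS`, `register`, `pick_extractor`, `extract_article`, `ARTICLE_MARKER`) is hypothetical, `generic_extract` is a stand-in for the Boilerpipe call, and the marker string is invented rather than real TechCrunch markup.

```python
from typing import Callable, Dict
from urllib.parse import urlparse

# An extractor takes page HTML and returns the article text.
Extractor = Callable[[str], str]

# Hypothetical registry: hostname -> site-specific extractor.
SITE_EXTRACTORS: Dict[str, Extractor] = {}

# Invented placeholder, NOT real TechCrunch markup.
ARTICLE_MARKER = "<!-- article-start -->"


def generic_extract(html: str) -> str:
    """Stand-in for the general-purpose extractor (Boilerpipe in the real project)."""
    return html


def register(hostname: str) -> Callable[[Extractor], Extractor]:
    """Decorator that registers a site-specific extractor for one hostname."""
    def decorator(fn: Extractor) -> Extractor:
        SITE_EXTRACTORS[hostname] = fn
        return fn
    return decorator


@register("techcrunch.com")
def techcrunch_extract(html: str) -> str:
    # Keep only what follows the (hypothetical) marker, so anything before it,
    # such as an author bio, is skipped. Fall back if the marker is absent.
    _, found, rest = html.partition(ARTICLE_MARKER)
    return rest if found else generic_extract(html)


def pick_extractor(url: str) -> Extractor:
    """Return the registered extractor for the URL's host, else the generic one."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return SITE_EXTRACTORS.get(host, generic_extract)


def extract_article(url: str, html: str) -> str:
    return pick_extractor(url)(html)
```

With this layout, supporting another site would only mean registering one more function, and unlisted sites keep the generic behavior.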
