Skip to content
This repository has been archived by the owner on Oct 10, 2019. It is now read-only.

deduce article title from url if it is empty #453

Merged
merged 2 commits into from
Feb 12, 2017

Conversation

p1er
Copy link

@p1er p1er commented Dec 23, 2016

Try to deduce title from url only if the parsed title is empty. This happens with some blogs which don't populate the title properly.

This patch uses the already existing function make_title, although it is improved to not include .html or .htm at the end of the title.

@coveralls
Copy link

coveralls commented Dec 23, 2016

Coverage Status

Coverage decreased (-0.03%) to 32.394% when pulling f33f44d on p1er:title-from-url into 310d1ad on akrennmair:master.

Copy link
Collaborator

@Minoru Minoru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Can you please cover this new functionality with tests?

@@ -1069,6 +1069,11 @@ std::string utils::make_title(const std::string& const_url) {
if (title.at(0)>= 'a' && title.at(0)<= 'z') {
title[0] -= 'a' - 'A';
}
//strip .htm or .html from title
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better move this up and place it on line 1065. This will keep the flow of the function straightforward: it gradually trims the URL down to the title.

@@ -1069,6 +1069,11 @@ std::string utils::make_title(const std::string& const_url) {
if (title.at(0)>= 'a' && title.at(0)<= 'z') {
title[0] -= 'a' - 'A';
}
//strip .htm or .html from title
size_t pos = title.find(".html",0);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather limit this to the end of the URL. We probably don't want to strip ".html" from the middle of the title.

I also don't mind seeing this re-written with a regex and extended to cover .php and .aspx as these are quite common; but that's up to you.

@coveralls
Copy link

coveralls commented Dec 24, 2016

Coverage Status

Coverage increased (+0.01%) to 32.434% when pulling e25e681 on p1er:title-from-url into 310d1ad on akrennmair:master.

2 similar comments
@coveralls
Copy link

Coverage Status

Coverage increased (+0.01%) to 32.434% when pulling e25e681 on p1er:title-from-url into 310d1ad on akrennmair:master.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.01%) to 32.434% when pulling e25e681 on p1er:title-from-url into 310d1ad on akrennmair:master.

@Minoru
Copy link
Collaborator

Minoru commented Dec 24, 2016

Looks great, but a test for rss_parser will be nice, too—after all, that's what you wanted to change in the first place; changes to make_title are incidental.

@Minoru
Copy link
Collaborator

Minoru commented Jan 25, 2017

@p1er, will you finish this, or should I take over? (I'll credit you in the Changelog either way.)

@Minoru
Copy link
Collaborator

Minoru commented Feb 12, 2017

Okay, I'm taking over this.

@Minoru Minoru merged commit e25e681 into akrennmair:master Feb 12, 2017
@Minoru
Copy link
Collaborator

Minoru commented Feb 12, 2017

Thank you for your work, @p1er!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants