deduce article title from url if it is empty #453
Thanks! Can you please cover this new functionality with tests?
@@ -1069,6 +1069,11 @@ std::string utils::make_title(const std::string& const_url) {
if (title.at(0)>= 'a' && title.at(0)<= 'z') {
title[0] -= 'a' - 'A';
}
//strip .htm or .html from title
Better move this up and place it on line 1065. This will keep the flow of the function straightforward: it gradually trims the URL down to the title.
@@ -1069,6 +1069,11 @@ std::string utils::make_title(const std::string& const_url) {
if (title.at(0)>= 'a' && title.at(0)<= 'z') {
title[0] -= 'a' - 'A';
}
//strip .htm or .html from title
size_t pos = title.find(".html",0);
I'd rather limit this to the end of the URL. We probably don't want to strip ".html" from the middle of the title.
I also don't mind seeing this re-written with a regex and extended to cover .php and .aspx as these are quite common; but that's up to you.
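For illustration, the end-anchored regex variant suggested above could look roughly like the sketch below. The helper name `strip_page_extension` is hypothetical and not part of the patch; the sketch assumes the extension should be matched case-insensitively and that at most one trailing extension is removed.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the patch): removes one trailing
// ".htm", ".html", ".php" or ".aspx" from a title candidate.
std::string strip_page_extension(const std::string& title)
{
	// "$" anchors the match to the end of the string, so an occurrence of
	// ".html" in the middle of the title is left untouched.
	// Case-insensitive matching is an assumption of this sketch.
	static const std::regex extension(R"(\.(html?|php|aspx)$)",
		std::regex::icase);
	return std::regex_replace(title, extension, "");
}
```

Compared with `title.find(".html", 0)`, which matches the first occurrence anywhere in the string, the anchored pattern can only ever trim a suffix.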
Looks great, but a test for
@p1er, will you finish this, or should I take over? (I'll credit you in the Changelog either way.)
Okay, I'm taking over this.
Thank you for your work, @p1er!
Try to deduce the title from the URL only if the parsed title is empty. This happens with some blogs that don't populate the title properly.
This patch uses the existing function make_title, improved so that it no longer includes .html or .htm at the end of the title.
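The fallback described here can be sketched as follows. This is only an illustration: `title_or_fallback` is a hypothetical name, and the title-deriving function is passed in as a parameter so the sketch stays self-contained instead of guessing at the real call site of `utils::make_title`.

```cpp
#include <functional>
#include <string>

// Hypothetical helper (not from the patch): keeps the parsed title when the
// feed supplied one and derives a title from the URL only when it is empty.
std::string title_or_fallback(
	const std::string& parsed_title,
	const std::string& url,
	const std::function<std::string(const std::string&)>& make_title)
{
	if (!parsed_title.empty()) {
		return parsed_title; // a title supplied by the feed always wins
	}
	return make_title(url); // fall back to a title derived from the URL
}
```

In the real code the third argument would presumably be `utils::make_title`; a lambda can stand in for it when experimenting with the sketch.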