Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix some of the favicon errors #1319

Merged
merged 1 commit into from
May 3, 2021
Merged

Fix some of the favicon errors #1319

merged 1 commit into from
May 3, 2021

Conversation

Grotax
Copy link
Member

@Grotax Grotax commented Apr 28, 2021

Some potential improvements for favcion fetching.

Comment on lines +328 to +331
ini_set('user_agent', 'NextCloud-News/1.0');

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could also just use Guzzle to download the file, but I'm fine trying this first.

@Grotax Grotax added the Skip-Changelog No changelog update is required, minor change label Apr 28, 2021
@codecov-commenter

This comment has been minimized.

@Grotax Grotax force-pushed the fix/favicon_fetching branch 2 times, most recently from c53eb60 to 5c7adb1 Compare April 28, 2021 19:27
@Grotax
Copy link
Member Author

Grotax commented Apr 28, 2021

So this reduced the amount of failing feeds for me but still have some errors left.

Exception":"Error","Message":"copy(https://element.io/blog/favicon.png): Failed to open stream: HTTP request failed! HTTP/1.1 403 Forbidden\r\n at /var/www/nextcloud/apps/news/lib/Fetcher/FeedFetcher.php#337",

if copy fails with http erros it seems like
$is_image = substr(mime_content_type($favicon_path), 0, 5) === "image";
is executed anyway and therefore

mime_content_type(/tmp/favicon.png): Failed to open stream: No such file or directory at /var/www/nextcloud/apps/news/lib/Fetcher/FeedFetcher.php#341

@anoymouserver
Copy link
Contributor

copy should return false in case of an error, so the mime check should probably only happen if copy succeeded.

Check could be done e.g. like this:

$downloaded = copy($favicon, $favicon_path);
$is_image = $downloaded && substr(mime_content_type($favicon_path), 0, 5) === "image";

@Grotax Grotax requested a review from anoymouserver May 2, 2021 09:12
@anoymouserver anoymouserver marked this pull request as ready for review May 2, 2021 09:17
Copy link
Contributor

@anoymouserver anoymouserver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, I haven't tested it though.

But wouldn't it make sense to also add a changelog entry, since it's a reported bug, so that it's seen right away in the release?

// check if feed has a logo
if (is_null($favicon)) {
return $this->faviconFactory->get($url);
}

$favicon_path = join("/", [$this->ITempManager->getTempBaseDir(), basename($favicon)]);
copy(

$downloaded = copy(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still get errors copy(): Filename cannot be empty at /var/www/nextcloud/apps/news/lib/Fetcher/FeedFetcher.php#340; perhaps that should be addressed as well here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the new change fixes that

@Grotax
Copy link
Member Author

Grotax commented May 2, 2021

But wouldn't it make sense to also add a changelog entry, since it's a reported bug, so that it's seen right away in the release?

True, it has to be also updated in the master branch though.

@Grotax Grotax added bug patch and removed Skip-Changelog No changelog update is required, minor change labels May 2, 2021
@Grotax
Copy link
Member Author

Grotax commented May 2, 2021

Turns out that the favicon lib does indeed trim the urls, but the issue is it also tries to use the orignal url but as most feed endpoints won't be able to handle any favicon path, it causes errors in the log. While the favicon will be then fetched from the base url afterwards.
This might also lead to more requests depending on how the cache works with this, so base url is probably better for this.

@Grotax
Copy link
Member Author

Grotax commented May 2, 2021

@SMillerDev @anoymouserver
It works so far, it reduces the am amount of errors but it won't be zero some pages don't have a favicon or refuse to respond with the file.

Should we leave it like that for now, or put a try/catch around the getFavicon()

@jknockaert
Copy link
Contributor

Turns out that the favicon lib does indeed trim the urls, but the issue is it also tries to use the orignal url but as most feed endpoints won't be able to handle any favicon path, it causes errors in the log. While the favicon will be then fetched from the base url afterwards.
This might also lead to more requests depending on how the cache works with this, so base url is probably better for this.

But this causes plenty of errors: Undefined variable: base_url at /var/www/nextcloud/apps/news/lib/Fetcher/FeedFetcher.php#332

@Grotax
Copy link
Member Author

Grotax commented May 2, 2021

But this causes plenty of errors: Undefined variable: base_url at /var/www/nextcloud/apps/news/lib/Fetcher/FeedFetcher.php#332

Is it possible that you did a mistake when applying this patch?

@jknockaert
Copy link
Contributor

Ja right, now I see that I forgot to update the dependency.

change user agent for fetching the feeds log to 'NextCloud-News/1.0'
check if logo in feed is just an empty string
remove path of url before searching for a favicon
check if file was actually downloaded
Ignore http errors when fetching favicons
Add feed to integration tests that doesn't have a logo

Co-authored-by: Alec Kojaev <alec.kojaev@gmail.com>
Signed-off-by: Benjamin Brahmer <info@b-brahmer.de>
@Grotax
Copy link
Member Author

Grotax commented May 3, 2021

Squashed the commits

@Grotax
Copy link
Member Author

Grotax commented May 3, 2021

/backport to master

@Grotax Grotax merged commit 6dfc4ad into stable15 May 3, 2021
@delete-merged-branch delete-merged-branch bot deleted the fix/favicon_fetching branch May 3, 2021 07:28
@backportbot-nextcloud
Copy link

The backport to master failed. Please do this backport manually.

@Grotax
Copy link
Member Author

Grotax commented May 3, 2021

Merged as it improves the situation even if it doesn't fix all issues yet.

@jknockaert
Copy link
Contributor

Yeah there's still to fix something: copy(http://******): failed to open stream: Connection timed out at /var/www/nextcloud/apps/news/lib/Fetcher/FeedFetcher.php#340

@Grotax
Copy link
Member Author

Grotax commented May 3, 2021

Yea we probably need to put a try catch around it but at that level there is no logger available and probably shouldn't
So all I can think of is silencing them completely.

Grotax added a commit that referenced this pull request May 3, 2021
Fixed
- content of atom feeds is missing (#1325)
- Fix some of the favicon fetching errors (#1319)

Signed-off-by: Benjamin Brahmer <info@b-brahmer.de>
Grotax added a commit that referenced this pull request May 3, 2021
Fixed
- content of atom feeds is missing (#1325)
- Fix some of the favicon fetching errors (#1319)

Signed-off-by: Benjamin Brahmer <info@b-brahmer.de>
@jknockaert
Copy link
Contributor

@Grotax after upgrading to 15.4.1 I get a long list of output when the cronjob runs: Method getDescription is deprecated and will be removed in feed-io 5.0. Please use getContent() instead.

@Grotax
Copy link
Member Author

Grotax commented May 3, 2021

Yea we already noticed, https://github.com/nextcloud/news/releases/tag/15.4.2

@jknockaert
Copy link
Contributor

OK thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants