<guid> matching of entries is done case-insensitively, should be case-sensitive #2077

Closed
Dan-Q opened this issue Oct 24, 2018 · 5 comments

@Dan-Q

Dan-Q commented Oct 24, 2018

I have an RSS feed which summarises URLs produced by a URL-shortening service. The <guid> of each entry in the feed is the shortened URL, each of which is unique. However, FreshRSS discards entries whose guid matches an existing guid, even when the two differ in case.

E.g. the following feed:

<?xml version="1.0" encoding="UTF-8" ?>
<rss version='2.0' xmlns:atom='http://www.w3.org/2005/Atom'>
  <channel>
    <item>
      <title>An Oral History of ‘Leisure Suit Larry’ – MEL Magazine</title>
      <description></description>
      <link>https:&#x2F;&#x2F;melmagazine.com&#x2F;an-oral-history-of-leisure-suit-larry-ef41bc374802</link>
      <guid>https://url-shortening-service/Mw</guid>
      <pubDate>Sat, 04 Aug 2018 18:52:06 -0000</pubDate>
    </item>
    <item>
      <title>The Web That Never Was - Dylan Beattie - YouTube</title>
      <description></description>
      <link>https:&#x2F;&#x2F;www.youtube.com&#x2F;watch?v=j51Fmn4JVwU</link>
      <guid>https://url-shortening-service/MW</guid>
      <pubDate>Sat, 04 Aug 2018 13:14:21 -0000</pubDate>
    </item>
  </channel>
</rss>

When imported into FreshRSS, this feed produces only ONE entry in the database: it appears that the guids are being compared in a case-insensitive manner.

This is incorrect behaviour because (with the exception of the protocol and domain name parts) URLs, which are often used as guids, are NOT (necessarily) case-insensitive.
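
To make the difference concrete, here is a minimal Python sketch (not FreshRSS code) that deduplicates the two guids from the sample feed above, once ignoring case and once comparing exactly. The `dedupe` helper and the trimmed-down `FEED` string are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Trimmed-down copy of the sample feed: only the fields needed here.
FEED = """<rss version="2.0">
  <channel>
    <item>
      <title>An Oral History of 'Leisure Suit Larry'</title>
      <guid>https://url-shortening-service/Mw</guid>
    </item>
    <item>
      <title>The Web That Never Was</title>
      <guid>https://url-shortening-service/MW</guid>
    </item>
  </channel>
</rss>"""


def dedupe(guids, case_sensitive):
    """Keep the first occurrence of each guid (hypothetical helper)."""
    seen = set()
    kept = []
    for guid in guids:
        # Case-insensitive matching folds both guids onto the same key.
        key = guid if case_sensitive else guid.lower()
        if key not in seen:
            seen.add(key)
            kept.append(guid)
    return kept


guids = [item.findtext("guid") for item in ET.fromstring(FEED).iter("item")]
print(len(dedupe(guids, case_sensitive=False)))  # second entry is dropped
print(len(dedupe(guids, case_sensitive=True)))   # both entries are kept
```

With case-insensitive matching the second item is treated as a duplicate of the first, which matches the behaviour described in this report.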

@Frenzie
Member

Frenzie commented Oct 24, 2018

Does the behavior differ if you add isPermaLink="true"?

At a glance I only see a whole bunch of characters being stripped out (cf. #1335) but perhaps it's down to SimplePie.

@Alkarex Alkarex added this to the 1.13.0 milestone Oct 24, 2018
@Alkarex
Member

Alkarex commented Oct 24, 2018

@Dan-Q Thanks for the bug report.

In FreshRSS, GUID is currently case-insensitive only when using MySQL (COLLATE utf8mb4_unicode_ci), and should already be case-sensitive when using SQLite (COLLATE BINARY) or PostgreSQL.

The Atom specification (RFC 4287) states: "Processors MUST compare atom:id elements on a character-by-character basis (in a case-sensitive fashion)"

So this is something that indeed requires fixing on FreshRSS side.
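
As a rough illustration of how a column collation decides whether two guids collide, the following Python sketch uses an in-memory SQLite database. It is an analogy rather than the FreshRSS schema: the `entry` table, the `count_stored` helper, and the use of SQLite's `NOCASE` collation as a stand-in for a case-insensitive MySQL collation such as `utf8mb4_unicode_ci` are all assumptions.

```python
import sqlite3

GUIDS = [
    "https://url-shortening-service/Mw",
    "https://url-shortening-service/MW",
]


def count_stored(collation):
    """Insert GUIDS into a UNIQUE column using the given collation and
    return how many rows survive (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    # The UNIQUE constraint compares values with the column's collation.
    conn.execute(f"CREATE TABLE entry (guid TEXT COLLATE {collation} UNIQUE)")
    for guid in GUIDS:
        # OR IGNORE silently skips rows that violate the UNIQUE constraint.
        conn.execute("INSERT OR IGNORE INTO entry (guid) VALUES (?)", (guid,))
    (count,) = conn.execute("SELECT COUNT(*) FROM entry").fetchone()
    conn.close()
    return count


print(count_stored("NOCASE"))  # case-insensitive: the guids collide
print(count_stored("BINARY"))  # byte-wise comparison: both rows are stored
```

A binary collation compares the stored bytes directly, which is the character-by-character comparison the quoted requirement asks for.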

@Alkarex
Member

Alkarex commented Oct 24, 2018

@Dan-Q Could you please provide a URL for such a feed?

Alkarex added a commit to Alkarex/FreshRSS that referenced this issue Oct 24, 2018
@Alkarex Alkarex modified the milestones: 1.13.0, 1.12.0 Oct 24, 2018
@Alkarex
Member

Alkarex commented Oct 24, 2018

Patch available #2078 (not tested yet - feedback much welcome)

@Alkarex
Member

Alkarex commented Oct 25, 2018

Fixed by #2078

@Alkarex Alkarex closed this as completed Oct 25, 2018
Alkarex added a commit that referenced this issue Oct 25, 2018
* MySQL GUID case sensitive

latin1_bin
#2077

* Prepare update for existing bases

* Perform DB update during actualize

* Reduce frequency slightly

* No optimize at the same time

* Take advantage of the SQL modifications in 1.12

* Move higher up

* Move to purge, which all users can manually call
Alkarex added a commit that referenced this issue Oct 25, 2018