Some feeds have duplicate entries #79
One workaround is to have the user delete and re-add the feed; obviously, this loses any feed and entry metadata, along with the entries that are no longer in the feed.
Another example:
Another related (and useful) case is the same entry being re-posted in a related feed:
In this case, the first feed has enclosures and the second doesn't, and we'd like to only keep the first entry. Another one:
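For the cross-feed case above (the first feed has enclosures, the second doesn't), choosing which duplicate to keep could look roughly like the sketch below. The `Entry` class and `pick_entry_to_keep()` are hypothetical names used only for illustration, not the project's actual model or API, and the sketch assumes the duplicates have already been identified by some other means.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    # Hypothetical stand-in for a stored entry; not the project's actual model.
    feed_url: str
    id: str
    title: str
    enclosures: list = field(default_factory=list)


def pick_entry_to_keep(duplicates):
    """Return the first duplicate that has enclosures, else the first duplicate."""
    for entry in duplicates:
        if entry.enclosures:
            return entry
    return duplicates[0]
```

Preferring the entry with enclosures is just one possible tie-breaker; if neither duplicate has enclosures, the sketch falls back to whichever comes first in the list.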
So, a possible implementation for entry deduplication:
Some questions:
Update: I did some digging, and the approach described above could be implemented like this:
Both seem a bit too "heavy" for the problem I'm trying to solve.
We can also give up on a single magic solution, and acknowledge that there are probably at least two different problems:
For both cases, the duplicates:
For now, I will implement 2.i. (different title, identifying pattern) as I am currently affected by it.
The only other sub-case that actually happened is 1.i. (same title, same feed); it happened about 2-4 times in 1 year.
Ran
Covered all known cases, resolving.
Some feeds have duplicate entries, or their entry ids change (resulting in an entry being stored twice).
E.g., a feed that had the entry id format updated:
If possible, only one should be shown (similar to #78).
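To make the "stored twice" case concrete, here is a minimal sketch of how duplicates within a single feed might be detected when only the id has changed. It assumes the title stays the same between the two copies, which will not hold for every feed. `normalize_title()` and `find_duplicate_groups()` are hypothetical helpers, and the entries are plain `(feed_url, entry_id, title)` tuples rather than the project's real objects.

```python
import re
from collections import defaultdict


def normalize_title(title):
    """Lowercase and collapse whitespace so trivially different titles compare equal."""
    return re.sub(r"\s+", " ", title).strip().lower()


def find_duplicate_groups(entries):
    """Group (feed_url, entry_id, title) tuples by feed and normalized title.

    Returns only the groups that contain more than one entry id.
    """
    groups = defaultdict(list)
    for feed_url, entry_id, title in entries:
        if not title:
            # Entries without a title cannot be matched this way; skip them.
            continue
        groups[feed_url, normalize_title(title)].append(entry_id)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}
```

Each returned group lists the ids of entries that are probably the same item, so only one of them would need to be shown.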