Remove Duplicate entries + Google Alert implementation #30

Open
ScottYTAUTO opened this issue Aug 23, 2022 · 2 comments

Comments

@ScottYTAUTO

Awesome product, great idea, and exactly what my team and I are after. Sorry if any formatting is weird; I'm new to GitHub.

I'm currently running it every hour or so to grab the most up-to-date news, but I've noticed that running it this frequently creates a number of duplicate entries. Is there any way to set it up to remove a page if one with the exact same title already exists?
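One way to approach the exact-title check is to filter the incoming feed items against the titles already stored before any pages are created. The sketch below is only an illustration under assumptions: `normalizeTitle` and `filterNewItems` are hypothetical helpers, not functions from this project, and it assumes each feed item is an object with a `title` string and that the titles of existing Notion pages have already been fetched into an array.

```javascript
// Hypothetical helpers: these are not part of the project's code.

// Trims the title and collapses runs of whitespace so that stray spacing
// does not defeat the exact-title comparison. Case is left untouched.
function normalizeTitle(title) {
  return title.trim().replace(/\s+/g, ' ');
}

// Returns only the feed items whose title is not already present.
// `items` is assumed to be an array of objects with a `title` string.
// `existingTitles` is assumed to be the titles of pages already in Notion.
function filterNewItems(items, existingTitles) {
  const seen = new Set(existingTitles.map(normalizeTitle));
  const fresh = [];
  for (const item of items) {
    const key = normalizeTitle(item.title);
    if (seen.has(key)) continue; // duplicate of a stored page or an earlier item
    seen.add(key);
    fresh.push(item);
  }
  return fresh;
}
```

Because each accepted title is added to `seen` straight away, duplicates within a single hourly run are dropped as well as duplicates of pages created by earlier runs.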

Also, I've attempted to set up a few Google Alerts to feed into the Reader, but the HTML tags don't appear to be removed the way they are for regular feeds taken from other websites. Example attached:
image

I've taken a brief look at the code and, if I'm honest, I'm not entirely sure how to address this. Is there any way of using turndownService's remove function to filter out any leftover HTML tags? If you know of anything and can point me in the right direction, I can look to implement it on my fork.
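As a possible starting point for a fork, leftover markup could be stripped from a title before the page is created. The sketch below is dependency-free and does not use Turndown; `decodeEntities` and `stripHtml` are hypothetical helpers, and it assumes the problem titles contain inline tags such as `<b>`, possibly in escaped form (`&lt;b&gt;`). The actual feed contents are not shown in this thread, so treat that as an assumption.

```javascript
// Hypothetical helpers: these are not part of the project's code.

// A small set of named entities; anything not listed is left as-is.
const NAMED_ENTITIES = new Map([
  ['amp', '&'], ['lt', '<'], ['gt', '>'],
  ['quot', '"'], ['apos', "'"], ['nbsp', ' '],
]);

// Decodes numeric entities (&#39; or &#x27;) and the named ones above.
function decodeEntities(text) {
  return text.replace(/&(#x[0-9a-f]+|#\d+|[a-z]+);/gi, (match, body) => {
    if (body[0] === '#') {
      const isHex = body[1].toLowerCase() === 'x';
      const code = isHex ? parseInt(body.slice(2), 16) : parseInt(body.slice(1), 10);
      // Leave out-of-range code points untouched instead of throwing.
      return code <= 0x10ffff ? String.fromCodePoint(code) : match;
    }
    const named = NAMED_ENTITIES.get(body.toLowerCase());
    return named === undefined ? match : named;
  });
}

// Decodes entities first (so escaped tags become real tags), then removes
// anything that looks like an HTML tag and tidies up the whitespace.
function stripHtml(title) {
  const decoded = decodeEntities(title);
  const withoutTags = decoded.replace(/<\/?[a-z][^>]*>/gi, '');
  return withoutTags.replace(/\s+/g, ' ').trim();
}
```

For example, `stripHtml('<b>Tesla</b> recalls &amp; updates')` returns `'Tesla recalls & updates'`. A regex like this is a rough filter rather than a real HTML parser: a title that legitimately contains text in angle brackets, such as `<y>`, would lose it.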

Use case: My team and I use Notion to track video projects based on current breaking news, where the title of the page roughly translates into the video title. Having the feed update frequently is essential to grab breaking news as soon as it happens so we can get it into the production pipeline ASAP.

Again, awesome product and great work. Once I've gotten this working in my team environment, I'll also look to get it set up for personal use.

@ravgeetdhillon
Owner

@ScottYTAUTO Thanks for using the project. For more frequent updates, there is already an open issue (#16). I haven't been able to find time to work on this feature yet, but it is definitely on my mind.

Regarding the HTML strings, can you please share the feed that adds the titles with HTML? Then I can look into it and fix the issue.

@ScottYTAUTO
Author

@ravgeetdhillon Thanks for getting back to me, and sorry for the duplicate; I must have missed that when looking through the open issues.

As for the HTML tags, please see the RSS feed below, which is the one used in the example above and is causing the issue:

It seems to happen whenever a Google Alert is set to output to an RSS feed. RSS feeds taken directly from websites work perfectly fine; for example, I have another RSS feed set up that works without issue:

I also tried running the Google Alert feed through an RSS2HTML converter (https://www.rssdog.com/) and then into Notion, but it appears to suffer from the same issue.

Please let me know if I can provide any further information to help identify the issue!
