Reimplement full-text scraping #563
|
Do you plan to continue working on this? |
|
I use it every day, so I at least plan to keep it working and fix the bugs. |
|
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. |
|
It has been marked as stale. Please let me know if you are no longer interested in this, and I will let it expire. |
|
Oh, I'm interested. I thought you would fix the bots' complaints first :) |
Signed-off-by: Gioele Falcetti <thegio.f@gmail.com>
|
Oops... sorry, I hadn't noticed them. |
|
I took a look at this, and it looks good to me. Why did you go for readability? It seems it's no longer well maintained. Maybe considering https://github.com/j0k3r/graby would be a good idea. |
|
I chose readability.php because it is a port of Mozilla's Readability.js, which works really well in Firefox, and when I started using it a few months ago it was still maintained. Honestly, I prefer readability.php because it's simpler, whereas graby has a huge number of dependencies. Still, I'll give graby a try to check how well it works and let you know. |
|
I tried https://github.com/j0k3r/graby and noticed that readability.php works better on the feeds that I use. But if you prefer to use graby in the official version of news, I have the code ready (https://github.com/DriverXX/news/tree/graby) and can edit my pull request to use it instead of readability. Please let me know if you want to use graby, and I will edit my PR. |
|
No, it's alright. You have checked it and made a reasonable decision, and I'm fine with it. It also doesn't look too complex to me, so I think we will go with it. If it breaks at some point and no one is willing to fix it, we might decide to remove it again. |
|
Awesome, many thanks.
I have reimplemented full-text scraping using this library:
https://github.com/andreskrey/readability.php
What do you think about this?