TED is not pushing 6 big topics anymore #150
Comments
@benoit74 Do you mean that the topics themselves have been deprecated and replaced by others, or that new ones have been added and, out of these, 10 are featured on the front page (but there could be 16 or 20 topics in all)?
Topics are no longer presented on the front page at all (they used to be, if I understood @rgaudin's remark correctly). As for which topics are available, you can look this up yourself: they are displayed on the talks page. "Legacy" topics were Business, Design, Entertainment, Global Issues, Science, Technology. "New main" topics are AI, Business, Communication, Education, Health, Language, Leadership, Mental Health, Motivation, Personal Growth, Psychology, Sleep, Sports, TED-ed. So it looks like the "legacy" topics are not that important anymore, but they are all still available (via the "See all" button). In all there are hundreds of topics (again, see "See all"; I did not count them ^^).
We decided that, ultimately, we would like to capture these new topics. However, since the scraper is out of commission for now, there is no point in creating the corresponding recipes. We can keep the existing files in the meantime, as they're still very watchable, but the TED scraper ultimately needs a fix (ideally one that also captures new topics when/if they are created). A single mega-zim without the
Do you mean that you will capture the 100+ topics in 100+ ZIMs? For the scraper, there is no difference between old and new topics; they are just topics. So once it is fixed, it will be possible to scrape any topic. And I think that just like we decided for
Likely yes, until we find a better way.
Same remark as openzim/zimfarm#878 (comment): how do you plan to create 100+ recipes manually? And maintain them manually in the long term? And stay informed of new topics, which will obviously appear at some point? Besides the burden on the content team of creating and maintaining 100+ recipes, I'm also not convinced because the impact on storage is not negligible. As already mentioned, most videos are present in many topics, so a single video will be stored in multiple ZIMs, increasing storage space. This is a concern for us (though we could say that we don't mind and will pay for it), but also (and probably even more importantly) a concern for ZIM users, who won't have a convenient way to download a collection of TED ZIMs (or even the whole TED collection) without wasting storage space on their devices. I don't have a solution to this concern yet, but I feel that answering "let's create 100+ ZIMs" is not a realistic solution.
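To make the storage concern concrete, here is a minimal sketch of how the duplication could be estimated. It assumes we already have a mapping from topic to the set of talks it contains, plus a size per talk; the function name, the topic membership, the talk identifiers, and the sizes are all made up for illustration and do not come from the scraper or from real TED data.

```python
def storage_overhead(topic_videos, video_sizes):
    """Compare total size of one-ZIM-per-topic against a deduplicated collection.

    topic_videos: dict mapping topic name -> set of video ids (hypothetical data)
    video_sizes:  dict mapping video id -> size, in any consistent unit
    Returns (per_topic_total, deduplicated_total).
    """
    # Each topic ZIM stores its own copy of every video it contains.
    per_topic_total = sum(
        video_sizes[video] for videos in topic_videos.values() for video in videos
    )
    # A deduplicated collection would store each video exactly once.
    unique_videos = set().union(*topic_videos.values())
    deduplicated_total = sum(video_sizes[video] for video in unique_videos)
    return per_topic_total, deduplicated_total


# Entirely hypothetical example: three topics sharing three talks (sizes in MB).
topic_videos = {
    "Business": {"talk_a", "talk_b"},
    "Leadership": {"talk_a", "talk_c"},
    "Psychology": {"talk_b", "talk_c"},
}
video_sizes = {"talk_a": 100, "talk_b": 150, "talk_c": 50}

per_topic, deduplicated = storage_overhead(topic_videos, video_sizes)
print(f"one ZIM per topic: {per_topic} MB, deduplicated: {deduplicated} MB")
```

In this toy case every talk belongs to two topics, so the per-topic ZIMs take twice the space of the deduplicated set. The real ratio depends on how much the actual topics overlap, which has not been measured here.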
Well, ideally, once the recipes are created, maintenance should be fairly light, if needed at all, except once in a blue moon when everything breaks at once and we are forced to have these discussions. Being informed of new topics is indeed a question for which we have no answer, which is why I wrote that there should be an actual project to manage all these issues and more. In the meantime, we are in the process of providing educational content to users (TED is quite popular) and have no way to know whether their interests overlap across topics (i.e. would someone interested in Storage on our side does not seem to be an issue: the largest TED zim we have is
OK, this makes sense.
This is correct.
OK, so it is now quite clear how to move this forward: we first need to fix the scraper.
We previously decided to create one ZIM for each of the six topics pushed by TED on its website (Business, Design, Entertainment, ...).
These six topics are no longer pushed forward (at least not on the front page), and there are now many more topics.
Given that the scraper functionality to create ZIMs by topic is broken (see #149), we wonder whether it makes sense to continue with this strategy or adopt a new one.
Some remarks:
@Popolechien @RavanJAltaie we need your help on this