Add Regex filtering for black and whitelists #183
Comments
The idea sounds fine to me, and it has been requested multiple times before. I think a lot of people will like it, and you are welcome to implement it :) But please tell me more about what kind of content you want to filter and how it is listed in your Moodle. This filter is definitely good, but I'm not sure it will do what you want :D
There is no filter for LTI Modules, like kalvidres, helixmedia, opencast ... these will not be filtered by these patterns because they are "cookie mods" (maybe someday someone implements them as normal mod) and handled here: Moodle-DL/moodle_dl/downloader/task.py Line 723 in 0bfc759
But anyway, this regex filter is a good idea, and more filters for other parts of moodle-dl will come eventually.
Well, what I wanted to filter was exactly
We could also add a regex filter to the end of filter_courses in moodle_service.py
We will also someday add filters for modules, so that the kalvidres module can also be filtered
If you add a filter to filter_courses in moodle_service.py you can filter for
In principle related to #135
Description of the problem
I want to filter the files being downloaded based on their full URLs, to avoid downloading unwanted content such as videos.
My university hosts a lot of content itself, so I can't just filter by domain; I need to filter on the URLs.
Solution
Regex would be a very flexible way to do this, and it would expand the capabilities of the tool while relying on a very common technology.
I want to add a `download_domains_blacklist_regex` to the options; I think that's all in `config.py`.
Then a small edit to the function `is_filtered_external_domain` in `task.py`, checking the full URLs against this regex list, and then filtering by hostname using the old version of the blacklist.
I might also have to add the option properties in the `types.py` dataclass.
This should be all the steps needed to add the option correctly, although some of this is based on notes from a previous version of the library.
I'll likely start working on a PR in the next few days, please tell me if anything here is incorrect or inaccurate
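To make the proposed check order concrete, here is a minimal standalone sketch of what I have in mind. It is not the actual moodle-dl code: the helper name `is_filtered_url`, its parameters, and the example patterns and URLs are all made up for illustration, and the real `is_filtered_external_domain` may look quite different.

```python
import re
from urllib.parse import urlparse


def is_filtered_url(url, domain_blacklist, regex_blacklist):
    """Return True if the URL should be skipped.

    Hypothetical helper for illustration, not part of moodle-dl.
    """
    # Proposed new step: test the full URL against every regex pattern.
    for pattern in regex_blacklist:
        if re.search(pattern, url):
            return True

    # Old-style step: compare only the hostname against the domain blacklist,
    # treating subdomains of a blacklisted domain as blacklisted too.
    hostname = urlparse(url).hostname or ""
    for domain in domain_blacklist:
        if hostname == domain or hostname.endswith("." + domain):
            return True

    return False


# Example values, invented for this sketch.
domain_blacklist = ["youtube.com"]
regex_blacklist = [r"\.mp4(\?|$)", r"/media/videos/"]
```

With these example lists, a self-hosted URL such as `https://uni.example/media/videos/lecture1` would be skipped by the regex step even though its domain is not blacklisted, while `https://uni.example/files/slides.pdf` would still be downloaded.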
Alternatives
Perhaps the new filtering could cover the previous domain method, but that would break backwards compatibility unnecessarily
Other text schemes are possible, but Regex is extremely popular and well implemented in Python