Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Sciebo downloads #117

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

Conversation

nilshee
Copy link
Contributor

@nilshee nilshee commented Jun 12, 2024

solves #110

Copy link
Collaborator

@septatrix septatrix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I currently have no Mooble room with sciebo available so I am not really able to test this. Feedback from anyone else would be appreciated though in generall I am in favor of improving the Sciebo scraping

@@ -24,6 +24,7 @@ install_requires =
yt-dlp>=2021.12.27
pdfkit>=0.6.0
tqdm>=4.0.0
lxml>=5.0.0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is lxml needed for? (I know it can be used by bs4 but I do not know why it is necessary)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well the response of /public.php/webdav is a xml-file which describes the structure of the sharing link we found.

To be able to parse xml we need to install lxml so that bs4 can use it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As that XML is highly structured it should be sufficient to take the XML parser from the stdlib

syncmymoodle/__main__.py Outdated Show resolved Hide resolved
syncmymoodle/__main__.py Outdated Show resolved Hide resolved
Comment on lines +1053 to +1054
sciebo_links = list(
set(re.findall("https://rwth-aachen.sciebo.de/s/[a-zA-Z0-9-]+", text))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just move the set() to the for-loop and drop the list, it has no effect

Co-authored-by: Nils K <24257556+septatrix@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants