Refactor slot and professor spiders due to changes to sigarra HTML #110

Closed
tomaspalma opened this issue Jun 29, 2024 · 0 comments · Fixed by #112

tomaspalma commented Jun 29, 2024

The HTML page now looks like the following:

image

The spider code assumes a hardcoded HTML structure and hardcoded class names. Both have changed, so the scraper now works worse than before.

There is an API that the JavaScript code on the sigarra schedule page uses, and the scraper could use it too.
We do not need to compute the link to this API ourselves, since it is stored in the attributes of one of the elements in the page's HTML.

image

This is a big improvement: instead of keeping complicated logic to retrieve the items, we just find the HTML element that holds the link to the API and make a request to it, which should make the code easier to understand.
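
A minimal sketch of that idea, using only the Python standard library so it can be tried outside the spiders. The attribute name `data-api-url` is hypothetical (the issue does not name the attribute that holds the link), and `find_api_url` and `fetch_schedule` are illustrative helpers, not existing functions in the scraper. It also assumes the API answers with JSON, which has not been verified here.

```python
import json
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urljoin
from urllib.request import urlopen

# Hypothetical attribute name: replace it with the one actually used by the page.
API_ATTR = "data-api-url"


class ApiLinkFinder(HTMLParser):
    """Records the value of the first `attr_name` attribute found in the page."""

    def __init__(self, attr_name: str = API_ATTR) -> None:
        super().__init__()
        self.attr_name = attr_name
        self.api_url: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if self.api_url is not None:
            return  # keep only the first match
        for name, value in attrs:
            if name == self.attr_name and value:
                self.api_url = value
                return


def find_api_url(page_html: str, page_url: str) -> Optional[str]:
    """Return the absolute API link embedded in the page, or None if absent."""
    finder = ApiLinkFinder()
    finder.feed(page_html)
    if finder.api_url is None:
        return None
    # The link may be relative, so resolve it against the page URL.
    return urljoin(page_url, finder.api_url)


def fetch_schedule(api_url: str):
    """Request the API link and decode the body, assuming it is JSON."""
    with urlopen(api_url) as response:
        return json.load(response)
```

With this approach the spider depends only on one attribute lookup rather than on the page's nested structure and class names. In the actual spiders the same lookup could be written with the selectors the scraping framework already provides.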
