Implement parsing of the new GSoC Server Side rendered website #7

Open
thealphadollar opened this issue Oct 1, 2022 · 2 comments
Labels
bug Something isn't working good first issue Good for newcomers hacktoberfest help wanted Extra attention is needed

Comments

@thealphadollar
Owner

The GSoC archives website has switched to server-side rendering, so we can no longer parse the pages the way we did previously.

The class names have changed as well, although they are now simpler on a per-card basis. The main problem to solve is getting the pages fully rendered before passing them to BeautifulSoup.

@thealphadollar thealphadollar added bug Something isn't working help wanted Extra attention is needed good first issue Good for newcomers hacktoberfest labels Oct 1, 2022
@nilesh05apr

Hello @thealphadollar, I was exploring the project and found that this can be done with a headless browser driven by Selenium; otherwise we would need some other library or framework with built-in rendering support. bs4 alone cannot do it, since it only parses the HTML it is given. Shall I try with Selenium?
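
For reference, a minimal sketch of the Selenium idea, not a tested implementation for this project. It assumes Selenium 4 with Chrome and a matching driver installed. The function names, the URL, and the CSS selector are all hypothetical placeholders; the real card selector would have to be read from the new site's markup. The page is loaded in a headless browser, polled until at least one card element exists, and only then is the HTML handed to BeautifulSoup.

```python
import time


def fetch_rendered_html(url, wait_selector, driver=None, timeout=20.0, poll=0.5):
    """Load `url` in a browser and return the HTML once `wait_selector` matches.

    `wait_selector` is a CSS selector for an element that only exists after
    the page has finished rendering (hypothetical; depends on the site).
    If no driver is passed in, a headless Chrome instance is created and
    closed again by this function.
    """
    owns_driver = driver is None
    if owns_driver:
        # Imported lazily so the module can be loaded without Selenium.
        from selenium import webdriver

        options = webdriver.ChromeOptions()
        options.add_argument("--headless")
        driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        deadline = time.monotonic() + timeout
        # "css selector" is the string value of selenium's By.CSS_SELECTOR.
        while not driver.find_elements("css selector", wait_selector):
            if time.monotonic() >= deadline:
                raise TimeoutError(f"{wait_selector!r} did not appear within {timeout}s")
            time.sleep(poll)
        return driver.page_source
    finally:
        if owns_driver:
            driver.quit()


def extract_card_texts(html, card_selector):
    """Parse already-rendered HTML with BeautifulSoup and return each card's text."""
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    return [card.get_text(" ", strip=True) for card in soup.select(card_selector)]
```

Polling for a selector is used here instead of a fixed sleep so that the wait ends as soon as the content is present; Selenium's own `WebDriverWait` could replace the loop. The result of `fetch_rendered_html` can be passed straight to `extract_card_texts` with the same selector.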

@thealphadollar
Owner Author

Sure, try it. Let me know if you get stuck anywhere.
