scraping data from parent and children webpages #120

Closed
duttashi opened this issue Jun 9, 2020 · 0 comments
Assignees: duttashi
Labels:
- learn_stuff: everything related to learning
- python-beautifulsoup: Beautiful Soup is a Python library for pulling data out of HTML and XML files.
- python-requests: Requests is an elegant and simple HTTP library for Python, built for human beings.
- Python-scraper: all Python scripts or notes related to web-data scraping are grouped in this tag
- python-selenium: Selenium is a tool to test your web application.
- TODO
Milestone: data pipeline

duttashi (Owner) commented Jun 9, 2020

Given the website https://www.gsmarena.com/, browse and scrape data for every phone listed on the home page, following the links from the home page to each phone's child page on the same site.
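The parent-to-children step can be sketched as below. This is only a rough sketch under stated assumptions: it uses the standard library's `html.parser` as a stand-in for Beautiful Soup so that it runs without extra installs, and it works on an HTML string rather than a live request. The names `LinkCollector` and `extract_child_links`, and the sample markup, are hypothetical and do not reflect the real structure of the GSMArena home page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect unique same-site links from one page (stand-in for Beautiful Soup)."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        # Only anchor tags carry the links we want to follow.
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the parent page's URL.
        url = urljoin(self.base_url, href)
        same_site = urlparse(url).netloc == urlparse(self.base_url).netloc
        # Keep each child page once, and skip links that leave the site.
        if same_site and url not in self.links:
            self.links.append(url)


def extract_child_links(html: str, base_url: str) -> list[str]:
    """Return the same-site URLs linked from the given parent page HTML."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


if __name__ == "__main__":
    # Hypothetical markup, not copied from the real home page.
    sample_html = """
    <a href="phone_one-1.php">Phone One</a>
    <a href="/phone_two-2.php">Phone Two</a>
    <a href="https://example.com/elsewhere">External</a>
    <a href="phone_one-1.php">Phone One again</a>
    """
    for link in extract_child_links(sample_html, "https://www.gsmarena.com/"):
        print(link)
```

In a real run, the HTML string would presumably come from a Requests response or a Selenium page source, and each returned URL would be fetched and parsed in turn.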

The idea is to practice Selenium, Requests, Beautiful Soup, and MySQL. Check the site's robots.txt file before scraping.
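One possible way to honor robots.txt is to filter candidate URLs through the standard library's `urllib.robotparser` before fetching anything. The sketch below parses the rules from a string so that it runs offline; the sample rules and the `practice-scraper` user agent are made up for illustration and are not GSMArena's actual robots.txt. The helpers `build_robot_rules` and `filter_allowed` are hypothetical names.

```python
from urllib.robotparser import RobotFileParser


def build_robot_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content that has already been downloaded as text."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules


def filter_allowed(urls, rules, user_agent="practice-scraper"):
    """Keep only the URLs that the parsed rules allow this user agent to fetch."""
    return [url for url in urls if rules.can_fetch(user_agent, url)]


if __name__ == "__main__":
    # Made-up rules for illustration only.
    sample_rules = build_robot_rules("User-agent: *\nDisallow: /private/\n")
    candidates = [
        "https://www.gsmarena.com/phone_one-1.php",
        "https://www.gsmarena.com/private/notes.php",
    ]
    for url in filter_allowed(candidates, sample_rules):
        print(url)
```

Against the live site, `RobotFileParser.set_url` followed by `read` would load the real file instead of a string.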

duttashi added the TODO, learn_stuff, Python-scraper, python-beautifulsoup, python-requests, and python-selenium labels Jun 9, 2020
duttashi added this to the data pipeline milestone Jun 9, 2020
duttashi self-assigned this Jun 9, 2020
duttashi closed this as completed Jun 9, 2020
duttashi changed the title from "scraping data from parent and children webpages in a given website" to "scraping data from parent and children webpages" Jun 9, 2020