Use static scraping libraries instead of headless browser to make it more performant #1

Closed
jayantkatia opened this issue Apr 29, 2021 · 0 comments · Fixed by #5
Labels: enhancement (New feature or request), revamp (change current implementation)

Comments

Owner

jayantkatia commented Apr 29, 2021

Current implementation

The scraper currently uses chromedp, which drives a headless browser to scrape information.

Better solution

Use static scraping libraries and make the same network calls that the website makes internally, passing the same query parameters and request headers. This avoids starting a browser for every scrape.

Example:

curl --location --request GET 'https://www.91mobiles.com/template/category_finder/finder_ajax.php?show_next=1&ord=0.17428677812670812&excludeId=&hash=&search=&hidFrmSubFlag=1&page=1&category=mobile&unique_sort=ga_views&gaCategory=Upcoming+Mobiles+Price+List+in+India-filter&requestType=1&showPagination=1&listType=list&listType_v3=list&listType_v1=list&listType_v2=list&listType_v4=list&listType_v5=list&listType_v6=list&page_type=upcoming&finderRuleUrl=&selMobSort=ga_views&hdnCategory=mobile&user_search=&url_feat_rule=upcoming-mobiles-in-india&buygaCat=upcoming-mob&amount=0%3B200000&sCatName=mobile&price_range_apply=1&tr_fl%5B%5D=mob_market_status_filter.marketstatus_filter%3Aupcoming&tr_fl%5B%5D=mob_market_status_filter.marketstatus_filter%3Arumoured' \
--header 'x-requested-with: XMLHttpRequest' \
--header 'user-agent: Mozilla/5.0'
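The same request could be issued from Go with only the standard library. The following is a minimal sketch, not the project's implementation: the function names `buildFinderRequest` and `fetchPage` are hypothetical, and it assumes that a subset of the query parameters from the curl command is enough for the endpoint to respond. That assumption is untested, so any parameters the server turns out to require should be copied over from the curl command.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"strconv"
)

// finderEndpoint is the AJAX endpoint taken from the curl example.
const finderEndpoint = "https://www.91mobiles.com/template/category_finder/finder_ajax.php"

// buildFinderRequest (hypothetical helper) assembles the GET request for one
// result page. Only a subset of the curl command's query parameters is set;
// whether the server accepts this subset is an assumption.
func buildFinderRequest(page int) (*http.Request, error) {
	q := url.Values{}
	q.Set("show_next", "1")
	q.Set("page", strconv.Itoa(page))
	q.Set("category", "mobile")
	q.Set("unique_sort", "ga_views")
	q.Set("requestType", "1")
	q.Set("listType", "list")
	q.Set("page_type", "upcoming")
	q.Set("url_feat_rule", "upcoming-mobiles-in-india")
	// tr_fl[] appears twice in the original URL, so Add is used instead of Set.
	q.Add("tr_fl[]", "mob_market_status_filter.marketstatus_filter:upcoming")
	q.Add("tr_fl[]", "mob_market_status_filter.marketstatus_filter:rumoured")

	req, err := http.NewRequest(http.MethodGet, finderEndpoint+"?"+q.Encode(), nil)
	if err != nil {
		return nil, err
	}
	// Same headers as the curl example.
	req.Header.Set("X-Requested-With", "XMLHttpRequest")
	req.Header.Set("User-Agent", "Mozilla/5.0")
	return req, nil
}

// fetchPage (hypothetical helper) sends the request and returns the raw
// response body, leaving parsing to the caller.
func fetchPage(client *http.Client, page int) ([]byte, error) {
	req, err := buildFinderRequest(page)
	if err != nil {
		return nil, err
	}
	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	return io.ReadAll(resp.Body)
}

func main() {
	// Build the request without sending it, to inspect the final URL.
	req, err := buildFinderRequest(1)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
}
```

Calling `fetchPage(&http.Client{}, 1)` would perform the actual network call. Keeping request construction separate from sending makes the query parameters easy to check without touching the network.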

Massive overhaul

Since this will lead to massive changes, I propose doing the work on a separate branch.
