updated the web-scrape file #4

Merged
merged 6 commits into from Oct 17, 2019

Conversation

Expire0
Contributor

@Expire0 Expire0 commented Oct 16, 2019

I put the execution statements in a function and used pandas for the output formatting. I also updated the requirements file to include pandas.

Additionally, I added the promo price HTML tag (div).

Example of the output:
https://www.lehmansubaru.com/promotions/service/index.htm
    Product                          Promo
0   FREE MULTI-POINT INSPECTION      $49.95
1   FREE MULTI-POINT INSPECTION      $14.95
2   FREE MULTI-POINT INSPECTION      $59.95
3   Subaru Tire Store                $49.95
4   Subaru Tire Store                $14.95
5   Subaru Tire Store                $59.95
6   A/C Cabin Filter                 $49.95
7   A/C Cabin Filter                 $14.95
8   A/C Cabin Filter                 $59.95
9   Tire Rotation                    $49.95
10  Tire Rotation                    $14.95
11  Tire Rotation                    $59.95
12  Synthetic Oil & Filter Change    $49.95
13  Synthetic Oil & Filter Change    $14.95
14  Synthetic Oil & Filter Change    $59.95
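
In the output above, every product is listed against each of the three prices, which suggests the titles and prices were collected as separate lists and then combined. A minimal sketch of keeping each title with its own price is below. It assumes each promotion sits in its own container; the class names (`promo`, `promo-title`, `promo-price`), the sample markup, and the sample values are made up for illustration and are not taken from the real page. It uses the standard library's `html.parser` so that only pandas is required.

```python
from html.parser import HTMLParser

import pandas as pd

# Made-up markup and values for illustration only; the real page differs.
SAMPLE_HTML = """
<div class="promo"><h3 class="promo-title">Service A</h3><div class="promo-price">$10.00</div></div>
<div class="promo"><h3 class="promo-title">Service B</h3></div>
<div class="promo"><h3 class="promo-title">Service C</h3><div class="promo-price">$30.00</div></div>
"""


class PromoParser(HTMLParser):
    """Collect one row per promo block, keeping each title with its own price."""

    # Hypothetical class names mapped to output columns.
    FIELDS = {"promo-title": "Product", "promo-price": "Promo"}

    def __init__(self):
        super().__init__()
        self.rows = []
        self._field = None  # column that the next text chunk belongs to

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if "promo" in classes:
            # A new promo container starts a new row.
            self.rows.append({"Product": None, "Promo": None})
        for name in classes:
            if name in self.FIELDS and self.rows:
                self._field = self.FIELDS[name]

    def handle_data(self, data):
        text = data.strip()
        if self._field and text:
            self.rows[-1][self._field] = text

    def handle_endtag(self, tag):
        # Text after a closing tag no longer belongs to the tracked field.
        self._field = None


def promo_table(html):
    """Return a DataFrame with one row per promo block."""
    parser = PromoParser()
    parser.feed(html)
    parser.close()
    return pd.DataFrame(parser.rows, columns=["Product", "Promo"])


if __name__ == "__main__":
    print(promo_table(SAMPLE_HTML).to_string())
```

A promotion without a price keeps an empty `Promo` cell instead of borrowing a price from another promotion.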

@Expire0
Contributor Author

Expire0 commented Oct 16, 2019

I created a new module for oil changes only. We can use this as a template to scrape other data. The two sites mentioned are not identical, so we needed two functions.

Example output:
python3 oil-scraper.py
         url                             product                                        price
miami    https://www.subaruofmiami.com   SYNTHETIC (OEM) OIL CHANGE AND FILTER CHANGE   $55.00
         url                             product                                        price
lehmans  https://www.lehmansubaru.com/   Synthetic Oil & Filter Change                  $59.95
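
As a rough sketch of the "one function per site" idea, each site can get its own parser that returns the same `(product, price)` shape, so the table-building code stays shared. The markup and regular expressions below are invented for illustration and do not reflect the real pages; the function names and the `SITES` registry are also hypothetical. Unlike the output above, this sketch puts both sites into a single table.

```python
import html
import re

import pandas as pd

# Made-up markup for illustration; the real pages are structured differently.
MIAMI_HTML = (
    '<li class="special"><b>SYNTHETIC (OEM) OIL CHANGE AND FILTER CHANGE</b> '
    "<i>$55.00</i></li>"
)
LEHMANS_HTML = (
    '<div class="promo"><h3>Synthetic Oil &amp; Filter Change</h3>'
    '<div class="price">$59.95</div></div>'
)


def _first_match(pattern, page):
    """Return (product, price) from the first match, or (None, None)."""
    match = re.search(pattern, page, re.S)
    if match is None:
        return None, None
    return html.unescape(match.group(1).strip()), match.group(2).strip()


def parse_miami(page):
    # Assumed layout: title in <b>, price in <i>.
    return _first_match(r"<b>(.*?)</b>\s*<i>(.*?)</i>", page)


def parse_lehmans(page):
    # Assumed layout: title in <h3>, price in a div with class "price".
    return _first_match(r'<h3>(.*?)</h3>\s*<div class="price">(.*?)</div>', page)


# Site name -> (base URL, site-specific parser).
SITES = {
    "miami": ("https://www.subaruofmiami.com", parse_miami),
    "lehmans": ("https://www.lehmansubaru.com/", parse_lehmans),
}


def oil_table(pages):
    """Build one table from a dict of site name -> already downloaded HTML."""
    names, rows = [], []
    for name, (url, parse) in SITES.items():
        product, price = parse(pages[name])
        names.append(name)
        rows.append({"url": url, "product": product, "price": price})
    return pd.DataFrame(rows, index=names, columns=["url", "product", "price"])


if __name__ == "__main__":
    print(oil_table({"miami": MIAMI_HTML, "lehmans": LEHMANS_HTML}).to_string())
```

Adding another site would then mean writing one more parser and one more `SITES` entry.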

Expire0 and others added 3 commits October 16, 2019 23:23
adding the headers to the request
adding the headers and timeout
@Expire0
Contributor Author

Expire0 commented Oct 17, 2019

@will62185 I needed to add the header information to the request. I also added a timeout to the request to keep it from hanging. I verified this is working on the remote server now.
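
For reference, a minimal sketch of a request that sends explicit headers and a timeout. The PR's actual code is not shown in this conversation, so this version uses the standard library's `urllib.request`; the `User-Agent` value, the 10-second default, and the function names are assumptions. With the `requests` library the equivalent call would be `requests.get(url, headers=..., timeout=...)`.

```python
import urllib.request

# Hypothetical header value; some servers reject requests without a
# browser-like User-Agent.
DEFAULT_HEADERS = {"User-Agent": "Mozilla/5.0 (compatible; promo-scraper)"}


def build_request(url, headers=None):
    """Create a Request carrying the default headers plus any overrides."""
    merged = {**DEFAULT_HEADERS, **(headers or {})}
    return urllib.request.Request(url, headers=merged)


def fetch(url, timeout=10, opener=urllib.request.urlopen):
    """Download a page as text, giving up after `timeout` seconds.

    `opener` is injectable so the function can be exercised without
    network access.
    """
    request = build_request(url)
    with opener(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Without a timeout, `urlopen` can block indefinitely if the server never responds, which matches the hanging behavior described above.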

@will62185 will62185 merged commit 7699d36 into will62185:master Oct 17, 2019
@will62185
Owner

@Expire0 good job! The new module works very well.
