Commit

Merge pull request #89 from codedex-io/exrlla/webscrape-project-bug-fix
Fix Bug on Amazon Web Scrape Project
Dusch4593 committed Jul 27, 2023
2 parents 184715a + 717cf31 commit 687aa18
Showing 2 changed files with 5 additions and 5 deletions.
2 changes: 1 addition & 1 deletion projects/web-scrape-amazon-with-beautiful-soup/scraper.py
@@ -20,7 +20,7 @@ def get_product_details(product_url: str) -> dict:
title = soup.find(
"span", attrs={"id": "productTitle"}).get_text().strip()
extracted_price = soup.find(
- "span", attrs={"class": "apexPriceToPay"}).get_text().strip()
+ "span", attrs={"class": "a-price"}).get_text().strip()
price = "$" + extracted_price.split("$")[1]

# Adding it to the product details dictionary
@@ -179,13 +179,13 @@ Similar to the title of the product, if you inspect the prices by right clicking
Thus, we can extract the price in a similar fashion.

```python
- price = soup.find('span', attrs={'class': 'apexPriceToPay'}).get_text().strip()
+ price = soup.find('span', attrs={'class': 'a-price'}).get_text().strip()
```

But, you will see a problem when you print the price of the product. The extracted price will be something like **\$166.00\$166.00** because the parent `span` element contains two `span` elements with the price text in them. But we can clean this extracted price to get the price of the product in the following way:

```python
- extracted_price = soup.find('span', attrs={'class': 'apexPriceToPay'}).get_text().strip()
+ extracted_price = soup.find('span', attrs={'class': 'a-price'}).get_text().strip()
price = '$' + extracted_price.split('$')[1]
```

@@ -212,7 +212,7 @@ def get_product_details(product_url: str) -> dict:
try:
# Scrape the product details
title = soup.find('span', attrs={'id': 'productTitle'}).get_text().strip()
- extracted_price = soup.find('span', attrs={'class': 'apexPriceToPay'}).get_text().strip()
+ extracted_price = soup.find('span', attrs={'class': 'a-price'}).get_text().strip()
price = extracted_price.split('$')[1]

# Adding it to the product details dictionary
@@ -265,7 +265,7 @@ def get_product_details(product_url: str) -> dict:
title = soup.find(
'span', attrs={'id': 'productTitle'}).get_text().strip()
extracted_price = soup.find(
- 'span', attrs={'class': 'apexPriceToPay'}).get_text().strip()
+ 'span', attrs={'class': 'a-price'}).get_text().strip()
price = '$' + extracted_price.split('$')[1]

# Adding it to the product details dictionary

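Note on the price cleanup touched by this commit: the changed lines still rely on `extracted_price.split('$')[1]`, which assumes the extracted text contains at least one `$`. The sketch below shows one possible way to make that step tolerant of missing or differently shaped text. `clean_price` is a hypothetical helper introduced here for illustration and is not part of `scraper.py`. It works on the string only, so it assumes the caller has already obtained the text from the `a-price` span and checked that the span exists.

```python
from typing import Optional


def clean_price(raw_text: Optional[str], currency: str = "$") -> Optional[str]:
    """Return the first price found in raw_text, or None if there is none.

    Hypothetical helper for illustration; not part of scraper.py.
    """
    if not raw_text:
        return None

    # "$166.00$166.00".split("$") gives ["", "166.00", "166.00"].
    # pieces[0] is whatever preceded the first currency symbol, so skip it.
    pieces = raw_text.split(currency)
    for piece in pieces[1:]:
        amount = piece.strip()
        if amount:
            return currency + amount

    # No currency symbol, or nothing usable after it.
    return None


if __name__ == "__main__":
    print(clean_price("$166.00$166.00"))
    print(clean_price("Currently unavailable"))
```

Returning `None` instead of raising leaves it to the caller to decide how to report a missing price, which may or may not suit the surrounding `try` block in the project.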