In [21]:
import csv
from selenium import webdriver
from selenium.webdriver.common.keys import Keys

In [22]:
chromedriver_path = r'C:\Users\xiaqi\Downloads\chromedriver_win32\chromedriver.exe'

The r prefix on the chromedriver_path string makes it a raw string literal in Python. In a raw string, backslashes (\) are kept as literal characters instead of starting escape sequences, which is convenient for Windows paths.
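To make the effect of the r prefix concrete, here is a minimal sketch. The path is a hypothetical example, not the one used in this notebook. Note that writing a path such as 'C:\Users\...' without the prefix is a syntax error in Python 3, because \U starts a Unicode escape.

```python
# Hypothetical example path; not the path used in this notebook.
raw_path = r'C:\Users\example\new_folder\chromedriver.exe'
escaped_path = 'C:\\Users\\example\\new_folder\\chromedriver.exe'

# Both spellings produce the same string.
same = raw_path == escaped_path

# Without the r prefix, '\n' is a single newline character;
# with it, the string holds two characters: a backslash and an 'n'.
plain_len = len('\n')
raw_len = len(r'\n')

print(same, plain_len, raw_len)
```

Doubling every backslash works as well, but the raw string is easier to read and to paste from a file manager.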

In [23]:
from selenium.webdriver.chrome.service import Service

# Selenium 4 takes the driver path through a Service object; the
# executable_path argument from Selenium 3 was deprecated and later removed.
driver = webdriver.Chrome(service=Service(chromedriver_path))

webdriver.Chrome creates an instance of the Chrome WebDriver in Selenium, which lets you automate and control the Chrome browser from Python.

Service(chromedriver_path) tells Selenium where to find the ChromeDriver executable on your system. In Selenium 3 the same path was passed directly as webdriver.Chrome(executable_path=chromedriver_path).

Given a correct path, Selenium starts ChromeDriver and uses it to launch an instance of the Chrome browser.

In [24]:
driver.get('https://www.amazon.com')

In [27]:
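from selenium.webdriver.common.by import By

product_name = 'phone'  # Replace with the actual product name
# The id of the search box changes from time to time; inspect the page if this lookup fails
search_box = driver.find_element(By.ID, 'nav-bb-search')
search_box.send_keys(product_name)
search_box.send_keys(Keys.RETURN)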

In [28]:
from selenium.webdriver.common.by import By
search_results = driver.find_elements(By.CSS_SELECTOR, '[data-component-type="s-search-result"]')

The square brackets [] in the CSS selector '[data-component-type="s-search-result"]' denote a CSS attribute selector.

Here the targeted attribute is data-component-type, and the value it must have is "s-search-result".

With this selector, Selenium finds every element whose data-component-type attribute is set to "s-search-result". On Amazon search pages, each search result item typically carries this attribute value.

Attribute selectors let you pick elements by their attributes rather than by tag, class, or id, which is useful when, as here, a data attribute marks the items you want.
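The matching rule can be illustrated without a browser. The sketch below is not Selenium code: it uses Python's standard html.parser on a made-up HTML snippet (the markup, the data-asin values, and the AttributeMatcher class are all hypothetical) to show which elements an exact-value attribute selector would pick.

```python
from html.parser import HTMLParser

# Made-up markup for illustration; real Amazon pages are far more complex.
SAMPLE_HTML = """
<div data-component-type="s-search-result" data-asin="A1">Phone A</div>
<div data-component-type="s-search-result" data-asin="B2">Phone B</div>
<div data-component-type="s-ad" data-asin="C3">Sponsored banner</div>
<div class="s-search-result" data-asin="D4">Class only, no data attribute</div>
"""


class AttributeMatcher(HTMLParser):
    """Collects start tags whose attribute `name` equals `value` exactly."""

    def __init__(self, name, value):
        super().__init__()
        self.name = name
        self.value = value
        self.matches = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        # Same rule as the CSS selector [name="value"]: exact attribute match.
        if attr_map.get(self.name) == self.value:
            self.matches.append(attr_map)


matcher = AttributeMatcher('data-component-type', 's-search-result')
matcher.feed(SAMPLE_HTML)
matched_asins = [m['data-asin'] for m in matcher.matches]
print(matched_asins)  # ['A1', 'B2']
```

The third and fourth elements are skipped: one has the attribute with a different value, and the other has the text only as a class name, which an attribute selector on data-component-type does not look at.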

In [29]:
data = []

In [30]:
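from selenium.webdriver.common.by import By

for result in search_results:
    # find_elements returns an empty list instead of raising when nothing
    # matches, so a result without a price or reviews does not stop the loop.
    titles = result.find_elements(By.CSS_SELECTOR, '.a-text-normal')
    prices = result.find_elements(By.CSS_SELECTOR, '.a-offscreen')
    # .a-size-base looks like a generic text class; check that the first match is the review count
    reviews = result.find_elements(By.CSS_SELECTOR, '.a-size-base')

    title = titles[0].text if titles else ''
    # Read textContent rather than .text, as in the original cell
    price = prices[0].get_attribute('textContent') if prices else ''
    review_count = reviews[0].get_attribute('textContent') if reviews else ''

    # Append the data to the list
    data.append([title, price, review_count])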


In [31]:
output_file = 'amazon_phone_data.csv'  # Provide a filename for the output file
header = ['Title', 'Price', 'Review Count']

In [32]:
with open(output_file, 'w', newline='', encoding='utf-8') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(header)
    writer.writerows(data)

In [33]:
driver.quit()

Cleanup: Calling driver.quit() closes every browser window opened by the WebDriver and releases the system resources it holds. If you skip it, leftover browser and driver processes can accumulate across runs.

Session termination: driver.quit() ends the WebDriver session, so your Python script no longer communicates with the browser. This is particularly important if you plan to run multiple tests or automation tasks sequentially or in parallel.

Leftover browser windows: If you don't call driver.quit(), the browser window opened by the WebDriver stays open after your script finishes. This is inconvenient when the automation is meant to close the browser on its own.
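One way to make sure driver.quit() runs even when a step fails is a try/finally block. The sketch below assumes nothing about Selenium itself: FakeDriver is a hypothetical stand-in for webdriver.Chrome so the example runs without a browser, and its get method fails on purpose to simulate an error in the middle of a scrape.

```python
class FakeDriver:
    """Hypothetical stand-in for webdriver.Chrome; not part of Selenium."""

    def __init__(self):
        self.quit_called = False

    def get(self, url):
        # Simulate a failure in the middle of the scraping run.
        raise RuntimeError(f'simulated failure while loading {url}')

    def quit(self):
        self.quit_called = True


def scrape(driver):
    try:
        driver.get('https://www.amazon.com')
    finally:
        # Runs whether or not the code above raised an exception.
        driver.quit()


fake = FakeDriver()
error_message = None
try:
    scrape(fake)
except RuntimeError as exc:
    error_message = str(exc)

print(fake.quit_called)  # True
```

With a real driver, the same shape applies: create the driver, put the scraping steps inside try, and call driver.quit() in finally.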