- Tracking product updates using Python 3
- Web-scraped product availability data
- Automated email updates from a scheduled code run
- Resources
- SMTP library (smtplib) to set up email notifications
- Web scraping product availability data
- Windows Task Scheduler Automation
- Conclusion
- Project Management (Agile/Scrum/Kanban)
- Project Evaluation
- Looking Ahead
- Questions & Contact me
Python 3, Windows Task Scheduler, Gmail, Amazon
Anaconda packages: requests, beautifulsoup4, ipykernel
PowerShell command to install the packages used in this project:
pip install requests beautifulsoup4 ipykernel
Using smtplib, I wrote one function per email notification; which one is called depends on the product's availability.
- Created a dummy email account for sending and testing these notifications.
- av_email() is called if the product is available.
- notav_email() is called if the product is unavailable.
import smtplib

# Functions to send product availability emails to myself
def av_email():
    """Send the "product available" email through Gmail's SMTP server."""
    server = smtplib.SMTP('smtp.gmail.com', 587)
    server.ehlo()
    server.starttls()  # upgrade the connection to TLS before logging in
    server.ehlo()
    sender_email = "[SENDER EMAIL]"
    receiver_email = "[RECEIVER EMAIL]"
    # input() waits for the password to be typed at the console
    password = input("Type your password and press enter:")
    server.login(sender_email, password)
    subject = "Product available!"
    body = "Hello, \n\nProduct link: https://www.amazon.co.uk/gp/product/B00353ECJO/ref=ox_sc_saved_image_1?smid=&psc=1 \n\nBest, \n\nSent from Python"
    msg = f"Subject:{subject}\n\n{body}"
    server.sendmail(sender_email, receiver_email, msg)
    print("Email has been sent!")
    server.quit()
# Same as av_email(), but sends the "not available yet" notification
def notav_email():
    """Send the "product not available" email through Gmail's SMTP server."""
    server = smtplib.SMTP('smtp.gmail.com', 587)
    server.ehlo()
    server.starttls()  # upgrade the connection to TLS before logging in
    server.ehlo()
    sender_email = "[SENDER EMAIL]"
    receiver_email = "[RECEIVER EMAIL]"
    # input() waits for the password to be typed at the console
    password = input("Type your password and press enter:")
    server.login(sender_email, password)
    subject = "Product not available yet"
    body = "Hello, \n\nProduct link: https://www.amazon.co.uk/gp/product/B00353ECJO/ref=ox_sc_saved_image_1?smid=&psc=1 \n\nBest, \n\nSent from Python"
    msg = f"Subject:{subject}\n\n{body}"
    server.sendmail(sender_email, receiver_email, msg)
    print("Email has been sent!")
    server.quit()
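The two functions above differ only in the subject line, so they could be folded into one. The following is a minimal sketch of that idea, not the project's actual code: `build_message`, `send_availability_email`, and the environment variable names `NOTIFY_SENDER`, `NOTIFY_RECEIVER`, and `NOTIFY_PASSWORD` are all hypothetical. Reading the password from the environment is an assumption made so that a scheduled run would not have to wait for keyboard input.

```python
import os
import smtplib
from email.message import EmailMessage

PRODUCT_URL = "https://www.amazon.co.uk/gp/product/B00353ECJO/ref=ox_sc_saved_image_1?smid=&psc=1"


def build_message(available, sender, receiver, product_url=PRODUCT_URL):
    """Build the notification; only the subject depends on availability."""
    msg = EmailMessage()
    msg["Subject"] = "Product available!" if available else "Product not available yet"
    msg["From"] = sender
    msg["To"] = receiver
    msg.set_content(f"Hello,\n\nProduct link: {product_url}\n\nBest,\n\nSent from Python")
    return msg


def send_availability_email(available):
    """Send the notification via Gmail's SMTP server (not called in this sketch)."""
    # Hypothetical environment variable names; set them before the scheduled run.
    sender = os.environ["NOTIFY_SENDER"]
    receiver = os.environ["NOTIFY_RECEIVER"]
    password = os.environ["NOTIFY_PASSWORD"]
    msg = build_message(available, sender, receiver)
    # The context manager closes the connection even if login or sending fails.
    with smtplib.SMTP("smtp.gmail.com", 587) as server:
        server.starttls()
        server.login(sender, password)
        server.send_message(msg)
```

With this shape, `send_availability_email(True)` would play the role of av_email() and `send_availability_email(False)` the role of notav_email().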
Using beautifulsoup4, I parsed the page HTML and located the element whose id is "availability", which holds the availability text.
- Researched the availability messages that can appear on the product page.
- Added a condition that calls av_email() or notav_email() based on the scraped text.
import requests as rq
from bs4 import BeautifulSoup

# Product page to request from the Amazon web server.
site = "https://www.amazon.co.uk/gp/product/B00353ECJO/ref=ox_sc_saved_image_1?smid=&psc=1"
# Find Your User-Agent: https://httpbin.org/get
header = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36"}

# Despite its name, get_price() checks availability, not price.
def get_price():
    html = rq.get(site, headers=header).text
    # Soup object which parses the webpage.
    soup = BeautifulSoup(html, 'html.parser')
    results = soup.find(id="availability")
    # Look inside that element for the <span> holding the message and extract
    # its text. Matching on "a-size-medium" alone covers both a-color-price
    # (unavailable) and a-color-success (available). find() is used instead of
    # find_all() because we want a single element, not a list.
    available = results.find("span", class_="a-size-medium").get_text()
    # If 'Currently unavailable.' is in the text, the product is unavailable.
    if 'Currently unavailable.' in available:
        notav_email()
    else:
        # Plain else rather than elif 'In stock.' in available, because the
        # message can also say that only 10 or 5 remain, etc.
        # See "Availability types.docx".
        av_email()
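The decision rule described above can also be kept separate from the scraping, which makes it easy to check against sample messages. This is a minimal sketch under that assumption: `classify_availability` and the "unknown" result are hypothetical additions, and the "Only 5 left in stock." string is an illustrative example, not text captured from the page.

```python
def classify_availability(text):
    """Map the scraped availability text to a simple status string."""
    # No text at all (element missing, empty page): do not guess.
    if text is None or not text.strip():
        return "unknown"
    # The only message treated as unavailable is the explicit one.
    if "currently unavailable" in text.lower():
        return "unavailable"
    # Anything else ("In stock", a future stock date, low-stock notices)
    # counts as available, mirroring the plain else branch above.
    return "available"


print(classify_availability("Currently unavailable."))          # unavailable
print(classify_availability("In stock on November 11, 2021."))  # available
print(classify_availability("Only 5 left in stock."))           # available
print(classify_availability(None))                              # unknown
```

The "unknown" case is a design choice: if the page layout changes and no text is found, it is arguably better to send nothing than to report the product as available.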
To automate running the code, I used Windows Task Scheduler to run it at set intervals.
- Created a batch file containing the command that runs the code.
- Set time-based triggers in Task Scheduler to control when the code runs.
# Function call to check product availability
get_price()
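The task in this project was set up through Task Scheduler itself. As an alternative, the same kind of task could be registered from Python by calling the Windows `schtasks` command. The sketch below only builds the command; the task name `ProductCheck`, the path `C:\scripts\Product checks.bat`, and the 6-hour interval are hypothetical values, and `register_task` would have to be called explicitly on a Windows machine.

```python
import subprocess


def build_schtasks_command(task_name, batch_path, every_hours=1):
    """Build a schtasks command that runs a batch file every N hours."""
    return [
        "schtasks", "/Create",
        "/SC", "HOURLY",           # schedule type
        "/MO", str(every_hours),   # modifier: repeat every N hours
        "/TN", task_name,          # name shown in Task Scheduler
        "/TR", f'"{batch_path}"',  # inner quotes keep a path with spaces intact
        "/F",                      # overwrite an existing task with this name
    ]


def register_task(task_name, batch_path, every_hours=1):
    """Create the scheduled task (Windows only; not called in this sketch)."""
    subprocess.run(build_schtasks_command(task_name, batch_path, every_hours), check=True)


# Hypothetical task name and path, for illustration only.
command = build_schtasks_command("ProductCheck", r"C:\scripts\Product checks.bat", every_hours=6)
print(command)
```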
- The [batch file](Product checks.bat) ran successfully for 3 days, then threw an error because the page HTML changed once the product became available.
- The span class changes slightly when the product goes from unavailable to available, so when the "unavailable" emails stop arriving, the product has probably become available.
- Product page HTML before: `<span class="a-size-medium a-color-price">Currently unavailable.`
- Product page HTML after: `<span class="a-size-medium a-color-success">In stock on November 11, 2021.`
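One way to avoid depending on the span class at all is to read every piece of text inside the element with id "availability". The sketch below illustrates this with Python's built-in `html.parser` rather than the beautifulsoup4 used in the project, so that it runs without extra packages. `AvailabilityParser` and `availability_text` are hypothetical names, and the wrapping `<div>` and closing tags in the sample strings are assumptions about the page structure.

```python
from html.parser import HTMLParser

# Tags that never take a closing tag; they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class AvailabilityParser(HTMLParser):
    """Collect all text inside the element whose id is "availability"."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # greater than 0 while inside the availability element
        self.chunks = []  # text fragments found inside it

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == "availability":
            self.depth = 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def availability_text(html):
    """Return the whitespace-normalised text of the availability element."""
    parser = AvailabilityParser()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.chunks).split())


# Sample markup modelled on the before/after snippets above.
BEFORE = '<div id="availability"><span class="a-size-medium a-color-price">Currently unavailable.</span></div>'
AFTER = '<div id="availability"><span class="a-size-medium a-color-success">In stock on November 11, 2021.</span></div>'
print(availability_text(BEFORE))  # Currently unavailable.
print(availability_text(AFTER))   # In stock on November 11, 2021.
```

Because the class name is never inspected, the same code handles both states of the page.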
- Resources used
  - Jira
  - Confluence
  - Trello
- WWW (What Went Well)
  - The end-to-end process
  - The email notifications worked successfully
- EBI (Even Better If)
- If deployed on a virtual machine, this could run 24/7
- Potential for NLP-driven automated replies to incoming emails.
For questions, feedback, and contribution requests, contact me.