## **Scraping Wisdom Pet Medicine Website (wisdompetmed.com)**

## Understanding the Requests Package

In [None]:
%pip install requests

In [None]:
import requests

In [None]:
requests.get('http://www.wisdompetmed.com')

In [None]:
#store the result of requests.get in response and check the final URL with response.url
response = requests.get('http://www.wisdompetmed.com')
response.url

In [None]:
#retrieve the status code for wisdompetmed.com
response.status_code

Status Code Breakdown
- Always 3 digits long, read in 2 parts
- The first digit indicates the category
  - a code starting with 2 is a success response
- The last 2 digits identify the specific status within that category
  - 00 is the generic code for the category, e.g. 200 means OK
- **100 - 199:** Informational Codes
  - interim responses sent before the final response, so we rarely work with them directly
- **200 - 299:** Successful Codes
- **300 - 399:** Redirection Codes
- **400 - 499:** Client Error Codes
- **500 - 599:** Server Error Codes

- See the HTTP Status Messages reference on w3schools.com for the full list
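The breakdown above can be sketched in code. This is a minimal illustration using only the standard library's `http.HTTPStatus`; the helper name `describe_status` and the `CATEGORIES` table are made up for this example and are not part of `requests`.

```python
from http import HTTPStatus

# Category names keyed by the first digit of the status code
# (hypothetical lookup table for this illustration).
CATEGORIES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def describe_status(code):
    """Return (category, standard phrase) for a 3-digit HTTP status code."""
    category = CATEGORIES.get(code // 100, "Unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # The code falls in a valid range but has no registered meaning.
        phrase = "Unregistered code"
    return category, phrase


for code in (200, 301, 404, 500):
    print(code, describe_status(code))
```

In the notebook, `describe_status(response.status_code)` would give the same kind of summary for the live response.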

In [None]:
#retrieve the response headers of the Wisdom Pet Medicine website
#response.headers holds the HTTP headers (server, content type, etc.)
#as a case-insensitive, dictionary-like object
response.headers

## Retrieving HTML Code

In [None]:
#retrieve the page content from the website
#response.content returns the HTML code as raw bytes
response.content

In [None]:
#retrieve content with response.text to get it in string form
#instead of byte form
response.text
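The difference between the byte form and the string form can be seen without any network access. The sketch below uses a made-up HTML fragment (an assumption, not taken from the site) to stand in for `response.content`; decoding it is roughly what `response.text` does, using the encoding that `requests` detects for the response.

```python
# Hypothetical page fragment as raw bytes, the way response.content returns it.
raw = "<p>Café for pets</p>".encode("utf-8")
print(type(raw), raw)

# Decoding with the correct encoding gives a normal Python string.
text = raw.decode("utf-8")
print(type(text), text)

# Decoding with the wrong encoding does not fail here, but garbles
# the non-ASCII character.
wrong = raw.decode("latin-1")
print(wrong)
```

If the string form of a real page shows garbled characters, checking `response.encoding` is a reasonable first step.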

## Using BeautifulSoup to Beautify HTML

In [None]:
#install the BeautifulSoup package (published on PyPI as beautifulsoup4)
%pip install beautifulsoup4

In [None]:
#import BeautifulSoup package
from bs4 import BeautifulSoup

In [None]:
#parse the HTML code with BeautifulSoup
#naming the parser explicitly avoids a warning and keeps results consistent
soup = BeautifulSoup(response.text, "html.parser")

In [None]:
#Prettify the HTML code with BeautifulSoup
print(soup.prettify())

## find() & find_all() methods

In [None]:
#find where first title tag occurs in code for vet business name
soup.find("title")

<title>Wisdom Pet Medicine</title>

In [None]:
#find all article tags that list services
soup.find_all("article")

In [None]:
#find the business phone number and print it nicely
#by matching the span tag and its class name, then reading .text
print(soup.find("span", class_="phone").text)
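One thing to keep in mind is that `find()` returns `None` when nothing matches, so chaining `.text` directly raises an `AttributeError` if the page changes. A minimal sketch of a guard follows; the HTML snippet, the phone number in it, and the helper `text_or_default` are all hypothetical and only stand in for the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical snippet standing in for the real page.
sample_html = '<div><span class="phone">555-0100</span></div>'
sample_soup = BeautifulSoup(sample_html, "html.parser")


def text_or_default(soup, name, css_class, default=""):
    """Return the text of the first matching tag, or default if none exists."""
    tag = soup.find(name, class_=css_class)
    if tag is None:
        return default
    return tag.get_text(strip=True)


print(text_or_default(sample_soup, "span", "phone"))
print(text_or_default(sample_soup, "span", "fax", default="n/a"))
```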

## looping find_all()

In [None]:
#find all featured testimonials
featured_testimonial = soup.find_all("div",class_="quote")
for testimonial in featured_testimonial:
  print(testimonial.text)

In [None]:
#find all staff members
staff = soup.find_all("div",class_="info col-xs-8 col-xs-offset-2 col-sm-7 col-sm-offset-0 col-md-6 col-lg-8")
for s in staff:
  print(s.text)
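Passing the full class string to `class_` matches only tags whose class attribute is exactly that string, which can break when the site changes its layout classes. As a sketch under that assumption, matching on a single class such as `info` is usually less brittle. The HTML below is invented for illustration and is not the site's real markup.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: two "info" blocks with different layout classes.
sample_html = """
<div class="info col-xs-8 col-md-6">Staff member A</div>
<div class="info col-xs-12">Staff member B</div>
<div class="quote">A testimonial</div>
"""
sample_soup = BeautifulSoup(sample_html, "html.parser")

# Exact class string: matches only the first div.
exact = sample_soup.find_all("div", class_="info col-xs-8 col-md-6")

# Single class: matches every div that has "info" among its classes.
by_single_class = sample_soup.find_all("div", class_="info")

# The equivalent CSS selector.
by_selector = sample_soup.select("div.info")

names = [div.get_text(strip=True) for div in by_single_class]
print(len(exact), names, len(by_selector))
```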

## Retrieve Webpage Links

In [None]:
#find all links on the page
links = soup.find_all("a")
for link in links:
  print(link.text, link.get('href'))
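The `href` values collected this way are often relative paths or in-page anchors rather than full URLs, and `link.get('href')` returns `None` for an `a` tag without an `href`. A small sketch of turning them into absolute URLs with the standard library's `urljoin` is shown below; the `hrefs` list is made up for illustration and is not taken from the site.

```python
from urllib.parse import urljoin

base_url = "http://www.wisdompetmed.com/"

# Hypothetical href values of the kinds a page may contain.
hrefs = ["#services", "about.html", "/contact", "https://example.com/x", None]

# Skip missing hrefs, then resolve the rest against the page URL.
absolute_links = [urljoin(base_url, href) for href in hrefs if href]

for url in absolute_links:
    print(url)
```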

## write HTML code to text file

In [None]:
#write the HTML code we pulled to a text file
#an explicit encoding avoids errors on systems whose default is not UTF-8
with open("wisdom_vet.txt", "w", encoding="utf-8") as f:
  f.write(soup.prettify())
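To confirm that a saved file holds what was written, it can be read back and compared. The sketch below writes to a temporary directory and uses a made-up HTML string in place of `soup.prettify()`; the file name `wisdom_vet_demo.txt` is hypothetical.

```python
import os
import tempfile

# Hypothetical stand-in for the output of soup.prettify().
html_text = "<html>\n <body>\n  <p>Café</p>\n </body>\n</html>"

# Write into a temporary directory so nothing in the working folder is touched.
path = os.path.join(tempfile.mkdtemp(), "wisdom_vet_demo.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write(html_text)

# Read the file back with the same encoding and compare.
with open(path, encoding="utf-8") as f:
    round_trip = f.read()

print(round_trip == html_text)
```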