# Basic Web Scraping Using Beautiful Soup

This is a multi-part web scraping assignment in Python. Please follow the directions in each section.

As always, your code should be reasonably efficient, utilize docstrings where necessary, and use appropriate error-handling.

In [146]:
import requests
import bs4  # used below as bs4.BeautifulSoup; the name `soup` is kept for the parsed page
import pandas as pd

## Web site to scrape
Below is the URL from which you will scrape the name, title and phone number for each faculty member at Lamar School.

In [6]:
url = "http://www.lamarschool.com/about-us/faculty.cfm"

## Initialize headers and get the HTML page

In [8]:
headers = {'User-Agent' : 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36'}

r = requests.get(url, headers=headers)

print(str(r.content)[:1000])

b'<!DOCTYPE html>\r\n<html lang="en">\r\n<head>\r\n<meta charset="utf-8">\r\n<meta name="viewport" content="width=device-width,user-scalable=no,initial-scale=1.0,minimum-scale=1.0,maximum-scale=1.0">\r\n<meta name="keywords" content="Lamar School,independant,East Mississippi,West Alabama,Christian,MPSA,SACS,CASI,Meridian,MS,Mississippi,39305" />\r\n<meta name="robots" content="index,follow" />\r\n<meta http-equiv="cache-control" content="max-age=172800" />\r\n\r\n<title>Faculty - Lamar School</title>\r\n\r\n<link rel="shortcut icon" href="/custom/images/favicon.ico" />\r\n\r\n<link href="http://www.lamarschool.com/pro/responsive/css/global.css?v=09272019150000" rel="stylesheet" type="text/css" />  \r\n\r\n\r\n\r\n<link href="http://www.lamarschool.com/css.css?v=05032020000500" rel="stylesheet" type="text/css" />\r\n\r\n\r\n\t<link href="http://www.lamarschool.com/implementation_colors.css?v=05032020000500" rel="stylesheet" type="text/css"  />\r\n\r\n\r\n\r\n<link href="http://www.lamar
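
The request above assumes the server answers with a successful response. Since the assignment asks for appropriate error handling, here is a minimal sketch of the same fetch wrapped in a helper. The names `fetch_html` and `request_headers` and the 10-second timeout are illustrative assumptions, not part of the assignment.

```python
import requests


def fetch_html(page_url, request_headers=None, timeout=10):
    """Return the page body as bytes, or None if the request fails.

    fetch_html, request_headers, and the timeout value are hypothetical
    choices made for this sketch.
    """
    try:
        response = requests.get(page_url, headers=request_headers, timeout=timeout)
        # Raise an exception for 4xx/5xx responses instead of parsing an error page.
        response.raise_for_status()
    except requests.exceptions.RequestException as err:
        print(f"Request failed: {err}")
        return None
    return response.content


# An unsupported URL scheme is rejected by requests itself,
# so this call exercises the error path and returns None.
result = fetch_html("notascheme://example")
```

In the notebook this could be called as `fetch_html(url, headers)`, and the result checked for `None` before parsing.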

## Parse the HTML

In [14]:
# Use Beautiful Soup to parse the HTML. 
# Pass the HTML page (.content returns bytes, .text returns string) and use the "html.parser"
soup = bs4.BeautifulSoup(r.content, 'html.parser')

# Optional: print the soup object you created above

In [15]:
# Print the title of the web site to verify you have successfully parsed the page
soup.title 

<title>Faculty - Lamar School</title>
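
As the comment in the parsing cell notes, `.content` is bytes and `.text` is a string; Beautiful Soup accepts either. The sketch below illustrates this on a small hard-coded page rather than the live site. The `sample_html` string is a made-up stand-in.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for a downloaded page.
sample_html = "<html><head><title>Faculty - Sample School</title></head><body></body></html>"

# Parse the same markup once as a string and once as bytes.
soup_from_text = BeautifulSoup(sample_html, "html.parser")
soup_from_bytes = BeautifulSoup(sample_html.encode("utf-8"), "html.parser")

# .title is the whole <title> tag; .title.string is only the text inside it.
print(soup_from_text.title)
print(soup_from_bytes.title.string)
```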

## Scrape name, title and phone number
Now that you have successfully retrieved the HTML for the faculty page, the next step is to scrape each faculty member's name, title, and phone number.

Once you have scraped the data, return and print a formatted list of the faculty names, titles, and phone numbers similar to the example below:

```
Adams, Shane
  Title:  Dir of Athletics and  Facilities
  Phone:  601-482-1345 

Alexander, Danny
  Title:  HS Science
  Phone:  601-482-1345 

Autry, Bill
  Title:  Tennis Coach
  Phone:  601-482-1345 
```

Do this in a function named **info_scraping()** that returns the formatted list. Remember to print the formatted list as well.
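
Some directory rows may be missing a title or phone cell, so it helps to decide in advance how an absent cell is handled. The sketch below shows one way to do that on hard-coded markup. `SAMPLE_ROWS`, `cell_text`, and `format_rows` are hypothetical names, and the sample table only imitates the class names (`directory-row`, `dir-name`, `dir-col2`, `dir-col4`) that this notebook's solution targets. The real page may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup that imitates the class names used on the faculty page.
SAMPLE_ROWS = """
<table>
  <tr class="directory-row dir-row dir-border">
    <td class="dir-name">Adams, Shane</td>
    <td class="dir-col2">Dir of Athletics</td>
    <td class="dir-col4">601-482-1345</td>
  </tr>
  <tr class="directory-row dir-row dir-border">
    <td class="dir-name">Castle, Jamie</td>
  </tr>
</table>
"""


def cell_text(row, css_class):
    """Return the stripped text of the first matching <td>, or '' if it is absent."""
    cell = row.find("td", class_=css_class)
    return cell.get_text(strip=True) if cell is not None else ""


def format_rows(html):
    """Build one formatted entry per directory row."""
    page = BeautifulSoup(html, "html.parser")
    entries = []
    # class_ matches a tag if any one of its CSS classes equals the given value.
    for row in page.find_all("tr", class_="directory-row"):
        entries.append(
            "{}\n  Title:  {}\n  Phone:  {}\n".format(
                cell_text(row, "dir-name"),
                cell_text(row, "dir-col2"),
                cell_text(row, "dir-col4"),
            )
        )
    return entries


entries = format_rows(SAMPLE_ROWS)
for entry in entries:
    print(entry)
```

Returning an empty string for a missing cell keeps the row in the listing instead of skipping it. Whether that is the right choice depends on how the list is used afterwards.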

In [212]:
# Scrape faculty name, title, and phone number

def info_scraping():
    """Scrape and print the name, title, and phone number of each faculty member.

    Reads the faculty directory at the module-level `url`
    (http://www.lamarschool.com/about-us/faculty.cfm).

    Returns:
        list of str: one formatted entry per faculty member.
    """
    r = requests.get(url, headers=headers, timeout=10)
    r.raise_for_status()  # stop early on a 4xx/5xx response
    soup = bs4.BeautifulSoup(r.content, 'html.parser')
    faculty_data = soup.find_all('tr', {'class': 'directory-row dir-row dir-border'})

    faculty_list = []
    for f in faculty_data:
        try:
            name = f.find('td', attrs={'class': 'dir-name'}).text.strip()
            title = f.find('td', attrs={'class': 'dir-col2'}).text.strip()
            phone = f.find('td', attrs={'class': 'dir-col4'}).text.strip()
        except AttributeError as e:
            # find() returned None: the row is missing one of the expected cells
            print(e)
            continue
        faculty_final = "{}\n   Title:   {}\n   Contact: {} \n".format(name, title, phone)
        print(faculty_final)
        faculty_list.append(faculty_final)
    return faculty_list


In [213]:
# Print formatted listing of faculty name, title, phone and keep the returned list
faculty_list = info_scraping()

Adams, Shane
   Title:   Dir of Athletics and  Facilities
   Contact: 601-482-1345 

Bakane, Heather
   Title:   MS English
   Contact: 601-482-1345 

Ballou, Leigh Ann
   Title:   Head of School
   Contact: 601-482-1345 

Barnes, Mac
   Title:   Bible, Head Football Coach
   Contact: 601-482-1345 

Brown, Ashley
   Title:   HS Spanish Teacher
   Contact: 6014821345 

Brown, Stephanie
   Title:   Dual Credit Biology
   Contact: 601-484-8664 

Browne, Marilyn
   Title:   MS/HS  Math
   Contact: 601-482-1345 

Cade, Burt
   Title:   HS Math
   Contact: 601-482-1345 

Carruth, Leslie
   Title:   MS/HS Art
   Contact: 601-482-1345 

Carver, Andrea
   Title:   Discovery
   Contact: 601-482-1345 

Castle, Jamie
   Title:   
   Contact:  

Chesney, Miriam
   Title:   Elem. Music
   Contact: 601-482-1345 

Clodfelter, April
   Title:   Pre-Kindergarten
   Contact: 601-482-1345 

Cole, Beth
   Title:   Elem. Admissions & Academics
   Contact: 601-482-1345 

Cook, Angie
   Title:   High School S