### Web Scraping

**Installing Libraries** - Run the commands below in the Windows Command Prompt or the Mac Terminal
```
python3 -m pip install beautifulsoup4  
python3 -m pip install requests
```

Learning Links - 
* https://www.dataquest.io/blog/web-scraping-python-using-beautiful-soup/
* https://realpython.com/beautiful-soup-web-scraper-python/
* https://pythonprogramming.net/introduction-scraping-parsing-beautiful-soup-tutorial/

In [9]:
import requests
from bs4 import BeautifulSoup

In [10]:
url = "https://toronto.craigslist.org/d/jobs/search/jjj"

In [11]:
response = requests.get(url)

In [12]:
print(response)

<Response [200]>
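
The `<Response [200]>` above means the request succeeded. A minimal sketch of how the fetch step could be made more defensive is shown below. The helper name `fetch_html` and the 10-second default timeout are assumptions for illustration, not part of the original notebook.

```python
import requests


def fetch_html(url, timeout=10):
    """Return the page body as text (hypothetical helper, not from the notebook)."""
    # A timeout keeps the call from hanging indefinitely on an unresponsive server
    response = requests.get(url, timeout=timeout)
    # raise_for_status() raises requests.HTTPError for 4xx and 5xx status codes
    response.raise_for_status()
    return response.text
```

Calling `fetch_html(url)` with the `url` defined earlier would return the same text as `response.text`, but it fails loudly if the server answers with an error status instead of 200.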


In [13]:
# Parse the response HTML into a BeautifulSoup object and display it

data = response.text
soup = BeautifulSoup(data, 'html.parser')
soup

<!DOCTYPE html>

<html class="no-js">
<head>
<meta charset="utf-8"/>
<meta content="IE=Edge" http-equiv="X-UA-Compatible"/>
<meta content="width=device-width,initial-scale=1" name="viewport"/>
<meta content="craigslist" property="og:site_name"/>
<meta content="preview" name="twitter:card"/>
<meta content="toronto jobs - craigslist" property="og:title"/>
<meta content="toronto jobs - craigslist" name="description"/>
<meta content="toronto jobs - craigslist" property="og:description"/>
<meta content="https://toronto.craigslist.org/d/jobs/search/jjj" property="og:url"/>
<title>toronto jobs - craigslist</title>
<link href="https://toronto.craigslist.org/d/jobs/search/jjj" rel="canonical"/>
<link href="https://toronto.craigslist.org/d/jobs/search/jjj?s=120" rel="next"/>
<script id="ld_breadcrumb_data" type="application/ld+json">
    {"@context":"https://schema.org","itemListElement":[{"item":{"name":"toronto.craigslist.org","@id":"https://toronto.craigslist.org"},"position":1,"@type":"ListI

In [15]:
# Find every <a> (hyperlink) tag in the page

tags = soup.find_all('a')
tags

[<a class="appstorebtn" href="https://play.google.com/store/apps/details?id=org.craigslist.CraigslistMobile">
         Android
     </a>,
 <a class="appstorebtn" href="https://apps.apple.com/us/app/craigslist/id1336642410">
         iOS
     </a>,
 <a class="header-logo" href="https://toronto.craigslist.org/" name="logoLink">CL</a>,
 <a href="/">toronto</a>,
 <a href="https://post.craigslist.org/c/tor">post</a>,
 <a href="https://accounts.craigslist.org/login/home">account</a>,
 <a class="favlink" href="#"><span aria-hidden="true" class="icon icon-star fav"></span><span class="fav-number">0</span><span class="fav-label"> favorites</span></a>,
 <a class="to-banish-page-link" href="#">
 <span aria-hidden="true" class="icon icon-trash red"></span>
 <span class="banished_count">0</span>
 <span class="discards-label"> hidden</span>
 </a>,
 <a class="header-logo" href="https://toronto.craigslist.org/">CL</a>,
 <a class="saveme" data-action="save" href="https://accounts.craigslist.org/savesea

In [16]:
# Extract the href attribute (the link target) from each <a> tag
for tag in tags:
    print(tag.get('href'))

https://play.google.com/store/apps/details?id=org.craigslist.CraigslistMobile
https://apps.apple.com/us/app/craigslist/id1336642410
https://toronto.craigslist.org/
/
https://post.craigslist.org/c/tor
https://accounts.craigslist.org/login/home
#
#
https://toronto.craigslist.org/
https://accounts.craigslist.org/savesearch/save?URL=https%3A%2F%2Ftoronto%2Ecraigslist%2Eorg%2Fd%2Fjobs%2Fsearch%2Fjjj
/d/jobs/search/jjj
/d/accounting-finance/search/acc
/d/admin-office/search/ofc
/d/architect-engineer-cad/search/egr
/d/art-media-design/search/med
/d/business-mgmt/search/bus
/d/customer-service/search/csr
/d/education-teaching/search/edu
/d/et-cetera/search/etc
/d/food-beverage-hospitality/search/fbh
/d/general-labor/search/lab
/d/government/search/gov
/d/healthcare/search/hea
/d/human-resource/search/hum
/d/legal-paralegal/search/lgl
/d/manufacturing/search/mnu
/d/marketing-advertising-pr/search/mar
/d/nonprofit/search/npo
/d/real-estate/search/rej
/d/retail-wholesale/search/ret
/d/sales/searc
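
Many of the `href` values printed above are relative paths such as `/d/accounting-finance/search/acc`. One way to turn them into full URLs is `urljoin` from the standard library, sketched below. The short `hrefs` list is a hand-picked sample of the values shown above, and the variable names are illustrative.

```python
from urllib.parse import urljoin

# The page the links were scraped from acts as the base for relative paths
base_url = "https://toronto.craigslist.org/d/jobs/search/jjj"

# A few href values taken from the output above
hrefs = [
    "/d/accounting-finance/search/acc",
    "https://post.craigslist.org/c/tor",
    "/",
]

# urljoin resolves relative paths against base_url and leaves absolute URLs unchanged
absolute_links = [urljoin(base_url, href) for href in hrefs]

for link in absolute_links:
    print(link)
```

In the notebook, the same call could be applied inside the loop as `urljoin(url, tag.get('href'))`, after skipping tags where `tag.get('href')` returns `None`.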

In [17]:
# Find all <a> tags whose class is result-title
titles = soup.find_all('a', {'class': 'result-title'})

In [18]:
# Print the titles
for title in titles:
    print(title.text)

Experienced Movers Wanted
Studying At York University? Easy Job!!!
Confident Sales Rep
High class luxury spa looking for Gorgeous Top Talent!
👠High End Spa Hiring Attractive Ladies!👠Busy!👠
Assistance Superintendent Couple - LIVE IN
Dispensary hiring delivery driver!
Full Time evening Dishwasher/Kitchen Helper wanted
Experienced Host @ La Palma
Server Support
☞☞Fully Licensed High End Spa Hiring!
Harrys Charbroiled is looking for a Line Cook!
VIRTUAL JOB FAIR - RESIDENT MANAGERS - FRIDAY, JULY 30
Barista  - Senior Barista
FT/PT Graphic Designer Wanted for Digital Ad Creation
Design + Build Landscape Architects - Crew Member /Lead Hand /Foreman
Hiring for Customer Service
Dish Washer
Pet grooming assistance
Bookkeeper 5 Years QuickBooks
Waitresses & Bartenders Wanted
High arched feet for Foot Sessions - women only $300 per session
General Labour - Spot Welder
Sales Associate at Caversham Booksellers
Detailer (FT) Brampton East Toyota
Multiple FOH Service and Management Positions Availabl
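
The class filter used for the titles can also be written with the `class_` keyword or as a CSS selector. The sketch below compares the two on a tiny hand-written HTML snippet. The snippet, its `href` values, and the extra `hdrlnk` class are made up for illustration and are not real page data.

```python
from bs4 import BeautifulSoup

# Hand-written markup that mimics the result-title links (not scraped data)
sample_html = """
<ul>
  <li><a class="result-title" href="/job/1">Dish Washer</a></li>
  <li><a class="other" href="/about">About</a></li>
  <li><a class="result-title hdrlnk" href="/job/2">Server Support</a></li>
</ul>
"""
sample_soup = BeautifulSoup(sample_html, "html.parser")

# class_ matches any tag whose class list contains result-title
by_find_all = [a.text for a in sample_soup.find_all("a", class_="result-title")]

# select() takes a CSS selector; a.result-title means <a> tags with that class
by_select = [a.text for a in sample_soup.select("a.result-title")]

print(by_find_all)
print(by_select)
```

Both lists contain the same two titles, so the choice between `find_all` and `select` is mostly a matter of taste.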

In [19]:
# Get each job's title, location, and link

jobs = soup.find_all('div', {'class': 'result-info'})

for job in jobs:
    # Look up the title anchor once; it holds both the text and the link
    title_tag = job.find('a', {'class': 'result-title'})
    title = title_tag.text
    link = title_tag.get('href')
    # Not every posting has a location, so fall back to "n/a"
    location_tag = job.find('span', {'class': 'result-hood'})
    location = location_tag.text if location_tag is not None else "n/a"

    print(f"{title} | {location} | {link}")

Experienced Movers Wanted |  (Toronto city of toronto ) | https://toronto.craigslist.org/tor/lab/d/toronto-experienced-movers-wanted/7357272901.html
Studying At York University? Easy Job!!! |  (Toronto city of toronto ) | https://toronto.craigslist.org/tor/etc/d/toronto-studying-at-york-university/7357211879.html
Confident Sales Rep |  (Toronto city of toronto ) | https://toronto.craigslist.org/tor/sls/d/east-york-confident-sales-rep/7357211790.html
High class luxury spa looking for Gorgeous Top Talent! |  ( york region ) | https://toronto.craigslist.org/yrk/spa/d/maple-high-class-luxury-spa-looking-for/7357172898.html
👠High End Spa Hiring Attractive Ladies!👠Busy!👠 |  (Toronto city of toronto ) | https://toronto.craigslist.org/tor/spa/d/concord-high-end-spa-hiring-attractive/7357172352.html
Assistance Superintendent Couple - LIVE IN |  (Toronto city of toronto ) | https://toronto.craigslist.org/tor/lab/d/toronto-assistance-superintendent/7357167314.html
Dispensary hiring delivery drive
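
The location strings in the output above carry extra whitespace and parentheses, for example ` (Toronto city of toronto )`. A small sketch of one way to tidy them is shown below. The helper name `clean_location` is an assumption for illustration.

```python
def clean_location(raw):
    """Remove outer whitespace and parentheses from a location string (hypothetical helper)."""
    # First strip() drops outer spaces, strip("()") drops the parentheses,
    # and the final strip() drops the spaces that were inside them
    return raw.strip().strip("()").strip()


print(clean_location(" (Toronto city of toronto )"))
print(clean_location(" ( york region )"))
print(clean_location("n/a"))
```

In the loop, this could be applied as `location = clean_location(location_tag.text)` when a location tag is present.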