Web scraper using Python to scrape a certain page on the Ontario Tech University website

ZbonaL/WebScraper

Ontario Tech University Web Scrapers

Repository Contents:

Requirements:

  1. Python 3.

  2. Python 3 Libraries:

    • pandas: For creating data frames.
    • bs4: Provides the BeautifulSoup class, which parses web pages.
    • urllib.request: Provides the urlopen function, which opens the pages that need to be parsed.
  3. Special Import Case:

    • MySQLdb: Used by the Important Dates scraper to escape strings.
    • re: Used for regex matching.
    • copy: Used to copy and manipulate data.
    • datetime: Used to convert values to datetime objects.
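
As a rough illustration of how the re, copy, and datetime modules listed above might work together when turning a scraped date line into a MySQL statement, here is a minimal sketch. It is not the repository's code: the record layout, the `important_dates` table, its column names, and the helper functions are all hypothetical, and `sql_quote` is only a simplified stand-in for the string escaping that MySQLdb provides.

```python
import copy
import re
from datetime import datetime

# Hypothetical record layout; the real scraper's fields may differ.
TEMPLATE = {"title": "", "date": None, "tags": []}

# Matches dates written like "January 6, 2020".
DATE_PATTERN = re.compile(r"([A-Z][a-z]+) (\d{1,2}), (\d{4})")


def parse_entry(text):
    """Split a line such as 'January 6, 2020: Lectures begin' into a record."""
    match = DATE_PATTERN.search(text)
    if match is None:
        return None
    # Deep copy so each record gets its own 'tags' list.
    record = copy.deepcopy(TEMPLATE)
    # %B expects an English month name under the default locale.
    record["date"] = datetime.strptime(match.group(0), "%B %d, %Y").date()
    record["title"] = text[match.end():].lstrip(" :-").strip()
    return record


def sql_quote(value):
    """Simplified stand-in for library escaping; not safe for untrusted input."""
    return "'" + value.replace("\\", "\\\\").replace("'", "''") + "'"


def to_insert(record):
    """Build an INSERT statement for a hypothetical important_dates table."""
    return "INSERT INTO important_dates (event_date, title) VALUES ({}, {});".format(
        sql_quote(record["date"].isoformat()), sql_quote(record["title"])
    )


entry = parse_entry("January 6, 2020: Winter semester's lectures begin")
print(to_insert(entry))
```

Lines without a recognisable date return `None`, so a caller can skip them before writing statements to the .sql file.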

Web Scrapers:

  1. Important Dates Scraper:

    • Used for parsing and creating MySQL queries from the Important Dates page.
    • Link to Important Dates page: https://bit.ly/37RmY4m
    • Produces a .sql file to be uploaded for mobile app calendar data.
  2. Accordion Parser:

    • This scraper parses any FAQs that use accordions.
    • Example of Accordion page: https://bit.ly/33xnEsD
    • Produces a CSV file of intents to be uploaded to Watson Assistant.
  3. Strong Tags Parser:

    • This scraper parses pages that have information in <strong></strong> tags.
    • Example of page with strong tags: https://bit.ly/2OALU8Z
    • Produces a CSV file of intents to be uploaded to Watson Assistant.
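
To make the strong tags idea more concrete, the sketch below collects the text inside each <strong> element and writes one CSV row per element. It is an illustration under stated assumptions, not the repository's parser: it deliberately uses the standard library's html.parser and csv modules instead of bs4 and pandas so that it runs without installing anything, the sample HTML is invented, and the two-column layout (example text, then a generated intent name) is an assumption about what the intents file should look like.

```python
import csv
import io
import re
from html.parser import HTMLParser


class StrongTextCollector(HTMLParser):
    """Collect the text found inside each <strong>...</strong> element."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # greater than 0 while inside a <strong> element
        self.parts = []     # text fragments of the element being read
        self.headings = []  # finished <strong> texts, in page order

    def handle_starttag(self, tag, attrs):
        if tag == "strong":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "strong" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                # Collapse runs of whitespace into single spaces.
                text = " ".join("".join(self.parts).split())
                if text:
                    self.headings.append(text)
                self.parts = []

    def handle_data(self, data):
        if self.depth > 0:
            self.parts.append(data)


def intent_name(text):
    """Turn a question into a hypothetical intent name such as 'how_do_i_apply'."""
    return re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")


def strong_tags_to_csv(html):
    """Return CSV text with one 'example,intent' row per <strong> element."""
    collector = StrongTextCollector()
    collector.feed(html)
    collector.close()
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for heading in collector.headings:
        writer.writerow([heading, intent_name(heading)])
    return buffer.getvalue()


# Invented sample markup standing in for a downloaded page.
SAMPLE_HTML = """
<p><strong>How do I apply?</strong> Submit the online form.</p>
<p><strong>When is the deadline?</strong> See the Important Dates page.</p>
"""

print(strong_tags_to_csv(SAMPLE_HTML), end="")
```

With bs4 installed, the collector class could likely be replaced by a `find_all("strong")` call on a BeautifulSoup object; the CSV step would stay the same.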
