# Job Portal - WORKBANK

## Imports used

* `os` - provides functions for interacting with the operating system.
* `pandas` - a data analysis library; used here to assemble the scraped data into a DataFrame.
* `numpy` - a library for working with arrays and matrices.
* `json` - reads and writes JSON data.
* `re` - regular expressions; used to find patterns in strings and extract the matching text.
* `gensim` - a library for topic modelling on large, unstructured text collections.
* `nltk` - the Natural Language Toolkit, a library for processing human-language text.
* `requests` - sends HTTP requests.
* `seaborn` - a visualization library for drawing statistical graphs.
* `math` - standard mathematical functions.
* `gensim.utils` `simple_preprocess` - lower-cases text and splits it into a list of tokens.
* `gensim.parsing.preprocessing` `STOPWORDS` - a set of common words (stop words) that carry little meaning and are often removed during pre-processing.
* `gensim` `corpora` - used to build dictionaries and corpora from tokenized text.
* `gensim` `models` - used for topic modelling and model training.
* `nltk.stem` `WordNetLemmatizer` - reduces words to their dictionary form (lemma) so that inflected forms are grouped together.
* `bs4` `BeautifulSoup` - parses HTML so that data can be scraped from web pages.
* `datetime` `datetime` - represents a date and time; used to convert time strings into month, day, and year.
* `datetime` `timedelta` - a duration; used to compute the posting date when the scraped date reads as minutes, hours, days, or weeks ago.
* `dateutil.relativedelta` `relativedelta` - used to compute the posting date when the scraped date reads as months or years ago.
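
The last three imports work together to turn a relative posting time into a calendar date. The sketch below shows the idea for the units that `timedelta` covers. It assumes the scraped text looks like `3 days ago`, which is a guess at the site's format, and the helper name `posted_date` is hypothetical. Months and years are left out because `timedelta` has no such units; that is where `relativedelta` would come in.

```python
import re
from datetime import datetime, timedelta

# Units that timedelta supports directly (months/years need relativedelta).
_UNITS = {"minute": "minutes", "hour": "hours", "day": "days", "week": "weeks"}

def posted_date(text, scraped_at):
    """Return the estimated posting datetime for text such as '3 days ago'.

    Returns None when the text does not match the assumed format.
    """
    match = re.search(r"(\d+)\s+(minute|hour|day|week)s?\s+ago", text.lower())
    if match is None:
        return None
    amount, unit = int(match.group(1)), match.group(2)
    return scraped_at - timedelta(**{_UNITS[unit]: amount})

scraped_at = datetime(2021, 6, 15, 12, 0)
print(posted_date("3 days ago", scraped_at).strftime("%B %d, %Y"))
print(posted_date("2 weeks ago", scraped_at).strftime("%B %d, %Y"))
```

Returning `None` for unrecognized text keeps the caller in control of how to treat posts whose date is written in another form.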

In [1]:
import os
import pandas as pd
import numpy as np
import json
import re
import gensim
import nltk
import requests
import datetime
import seaborn as sns
import math

from gensim.utils import simple_preprocess
from gensim.parsing.preprocessing import STOPWORDS
from gensim import corpora, models
from nltk.stem import WordNetLemmatizer
from bs4 import BeautifulSoup
from datetime import datetime
from datetime import timedelta
from dateutil.relativedelta import relativedelta

today = datetime.today()



We will use Beautiful Soup to scrape job posts from WorkBank.com.
The following data will be taken from the website:
- WORKBANK_JOB_TITLE - The title of the Job Post
- WORKBANK_JOB_CATEGORY - The category of the Job Post within the STEM field
- WORKBANK_JOB_COMPANY - The Company that is accepting applications for the Job Post
- WORKBANK_JOB_DATE - Date and time the Job Post was posted
- WORKBANK_JOB_LOCATION - Location of the job
- WORKBANK_JOB_STATUS - The type of job, for example whether it is full time or not
- WORKBANK_JOB_SALARY - Monthly salary of the job listing in Philippine Pesos (PHP)
- WORKBANK_JOB_EDUCATION - Educational attainment required of the applicant
- WORKBANK_JOB_DESCRIPTION - A detailed job description
- WORKBANK_JOB_YEAR_WE - Years of Work Experience required for the job
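
Each field above is collected into its own list, so the entries at the same position in every list describe the same job post. As a minimal sketch of that layout, one post can be pictured as a single record. The `JobPost` class and the sample values are hypothetical and not part of the notebook.

```python
from dataclasses import dataclass, asdict

@dataclass
class JobPost:
    """One scraped job post; field names mirror the WORKBANK_JOB_* lists."""
    title: str
    category: str
    company: str
    date: str
    location: str
    status: str
    salary: str
    education: str
    description: str
    year_we: str

# Made-up values, for illustration only.
post = JobPost(
    title="Data Analyst", category="Sciences", company="Example Corp",
    date="3 days ago", location="Makati", status="Full Time",
    salary="PHP 30,000", education="Bachelor's Degree",
    description="Analyze data.", year_we="1",
)

# Splitting the record into parallel lists, one value per list.
columns = {name: [value] for name, value in asdict(post).items()}
print(columns["title"], columns["year_we"])
```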

In [None]:
WORKBANK_JOB_TITLE = []
WORKBANK_JOB_CATEGORY = []
WORKBANK_JOB_COMPANY = []
WORKBANK_JOB_DATE = []
WORKBANK_JOB_LOCATION = []
WORKBANK_JOB_STATUS = []
WORKBANK_JOB_SALARY = []
WORKBANK_JOB_EDUCATION = []
WORKBANK_JOB_DESCRIPTION = []
WORKBANK_JOB_YEAR_WE = []

### CATEGORY - Information Communications Technology

Since we are only looking for job posts in the STEM field, we go through WorkBank.com and choose only the relevant categories, starting with Information Communications Technology. The following code scrapes every result page in that category and then visits each job post listed on those pages.
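
The paging logic can be sketched on its own, without any network access. The helper below is hypothetical (`build_page_urls` is not part of the notebook). It assumes, as the code in this section does, that result pages are addressed with a `?page=N` query parameter and that a category without a pagination control still has a first page.

```python
BASE = "https://www.workbank.com/job/"

def build_page_urls(slug, num_pages):
    """Return one listing URL per result page; at least page 1 is returned."""
    last_page = max(num_pages, 1)  # no pagination control means a single page
    return [f"{BASE}{slug}?page={page}" for page in range(1, last_page + 1)]

urls = build_page_urls("information-communications-technology-job-openings", 3)
for url in urls:
    print(url)
```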

In [None]:
#Information Communications Technology
IT_WORKBANK_URL = 'https://www.workbank.com/job/information-communications-technology-job-openings?wb_q='
IT_WORKBANK = requests.get(IT_WORKBANK_URL)
IT_WORKBANK_soup = BeautifulSoup(IT_WORKBANK.content, 'html.parser')
IT_WORKBANK_PP = IT_WORKBANK_soup.find_all('select',{'class':'wb-pagination-select'})
if (len(IT_WORKBANK_PP)!=0):
    IT_WORKBANK_NUMPAGES=len(IT_WORKBANK_PP[0].select("option"))
else:
    IT_WORKBANK_NUMPAGES=0
IT_WORKBANK_PAGES=[]
if (IT_WORKBANK_NUMPAGES!=0):
    for i in range(1,IT_WORKBANK_NUMPAGES+1):
        IT_WORKBANK_PAGES.append('https://www.workbank.com/job/information-communications-technology-job-openings?page='+ str(i))
else:
    IT_WORKBANK_PAGES.append('https://www.workbank.com/job/information-communications-technology-job-openings?page=1')
for i in range(len(IT_WORKBANK_PAGES)):
    IT_WORKBANK_URLs = IT_WORKBANK_PAGES[i]
    IT_WORKBANK_PAGE = requests.get(IT_WORKBANK_URLs)
    IT_WORKBANK_PAGE_soup = BeautifulSoup(IT_WORKBANK_PAGE.content, 'html.parser')
    IT_WORKBANK_JOBS = IT_WORKBANK_PAGE_soup.find_all('a',{'class':'clearfix'})
    IT_WORKBANK_JOB_URLs = re.findall(r'(?s)(?<=href=").*?(?="><h5)',str(IT_WORKBANK_JOBS))
    IT_WORKBANK_JOB_DATEs = IT_WORKBANK_PAGE_soup.find_all('p',{'class':'publish-date-card mt-1 text-left mb-0'})
    for j in range(len(IT_WORKBANK_JOB_URLs)):
        IT_WORKBANK_JOB_PAGE = requests.get(IT_WORKBANK_JOB_URLs[j])
        IT_WORKBANK_JOB_PAGE_soup = BeautifulSoup(IT_WORKBANK_JOB_PAGE.content, 'html.parser')
        IT_WORKBANK_JOB_PAGE_INFO1 = IT_WORKBANK_JOB_PAGE_soup.find('article',{'class':'job-ad-text-center pl-3'})
        IT_WORKBANK_JOB_TITLE = IT_WORKBANK_JOB_PAGE_INFO1.contents[0].text.strip()
        IT_WORKBANK_JOB_COMPANY = IT_WORKBANK_JOB_PAGE_INFO1.contents[1].text.strip()
        IT_WORKBANK_JOB_SALARY = IT_WORKBANK_JOB_PAGE_INFO1.contents[4].text.strip()
        IT_WORKBANK_JOB_DATEPOSTED = IT_WORKBANK_JOB_DATEs[j].text.strip()
        IT_WORKBANK_JOB_LOCATION = IT_WORKBANK_JOB_PAGE_soup.find('a',{'class':'cls-links'}).text.strip()
        IT_WORKBANK_JOB_INFO2 = IT_WORKBANK_JOB_PAGE_soup.find('ul',{'class':'job-ad-des-ul mb-0'})
        IT_WORKBANK_JOB_STATUS = re.findall(r'(?s)(?<=Job Type</h5><p>).*?(?=</p>)',str(IT_WORKBANK_JOB_INFO2))[0]
        IT_WORKBANK_JOB_EDUCATION = re.findall(r'(?s)(?<=Educational Attainment</h5><p>).*?(?=</p>)',str(IT_WORKBANK_JOB_INFO2))[0]
        IT_WORKBANK_JOB_YEARS_WE = re.findall(r'(?s)(?<=Years of Work Experience</h5><p>).*?(?=</p>)',str(IT_WORKBANK_JOB_INFO2))[0]
        IT_WORKBANK_JOB_CATEGORY = "Information and Communications Technology"
        IT_WORKBANK_JOB_INFO3 = IT_WORKBANK_JOB_PAGE_soup.find('article',{'class':'pl-4 pr-4 pb-0 pt-4'})
        IT_WORKBANK_JOB_DESCRIPTION = IT_WORKBANK_JOB_INFO3.contents[1].getText(separator=u' ')
        WORKBANK_JOB_TITLE.append(IT_WORKBANK_JOB_TITLE)
        WORKBANK_JOB_CATEGORY.append(IT_WORKBANK_JOB_CATEGORY)
        WORKBANK_JOB_COMPANY.append(IT_WORKBANK_JOB_COMPANY)
        WORKBANK_JOB_DATE.append(IT_WORKBANK_JOB_DATEPOSTED)
        WORKBANK_JOB_LOCATION.append(IT_WORKBANK_JOB_LOCATION)
        WORKBANK_JOB_STATUS.append(IT_WORKBANK_JOB_STATUS)
        WORKBANK_JOB_SALARY.append(IT_WORKBANK_JOB_SALARY)
        WORKBANK_JOB_EDUCATION.append(IT_WORKBANK_JOB_EDUCATION)
        WORKBANK_JOB_YEAR_WE.append(IT_WORKBANK_JOB_YEARS_WE)
        WORKBANK_JOB_DESCRIPTION.append(IT_WORKBANK_JOB_DESCRIPTION)
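
The detail fields are pulled out with look-behind and look-ahead patterns, and indexing the result with `[0]` raises an `IndexError` whenever a pattern finds nothing. A hedged sketch of a more forgiving variant follows. The HTML fragment is invented to match what the patterns above expect, and `extract_field` is a hypothetical helper.

```python
import re

# Invented fragment shaped like the markup the notebook's patterns expect.
SAMPLE = (
    '<ul class="job-ad-des-ul mb-0">'
    "<li><h5>Job Type</h5><p>Full Time</p></li>"
    "<li><h5>Educational Attainment</h5><p>Bachelor's Degree</p></li>"
    "</ul>"
)

def extract_field(label, html, default=""):
    """Return the <p> text that follows '<h5>label</h5>', or default."""
    pattern = r"(?s)(?<=" + re.escape(label) + r"</h5><p>).*?(?=</p>)"
    found = re.findall(pattern, html)
    return found[0].strip() if found else default

print(extract_field("Job Type", SAMPLE))
print(extract_field("Years of Work Experience", SAMPLE, default="Not specified"))
```

With a default in place, a post that omits a field yields a placeholder value instead of stopping the whole scrape.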

### Souptest

Fetching the HTML of the Information Communications Technology job openings URL shows that it contains the list of jobs we are interested in.

In [None]:
souptest = BeautifulSoup(IT_WORKBANK.content, 'html.parser')
souptest
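
Once the listing markup is visible, the job links can be read from it. The notebook extracts them with a regular expression over the stringified tags. As an alternative sketch, Beautiful Soup can return each tag's `href` attribute directly. The HTML below is invented to resemble a listing page and the URLs are placeholders.

```python
from bs4 import BeautifulSoup

# Invented listing fragment; the real page structure may differ.
html = """
<div>
  <a class="clearfix" href="https://example.com/job/1"><h5>Job One</h5></a>
  <a class="clearfix" href="https://example.com/job/2"><h5>Job Two</h5></a>
  <a class="other" href="https://example.com/about">About</a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
# class_ matches tags whose class list contains "clearfix".
job_links = [
    a["href"]
    for a in soup.find_all("a", class_="clearfix")
    if a.has_attr("href")
]
print(job_links)
```

Reading the attribute avoids depending on the exact order of attributes and child tags in the serialized HTML.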

### CATEGORY - Construction
The same process is repeated for the other categories to expand the dataset.
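
Because only the URL slug and the category label change from one cell to the next, the categories can be summarized as a small table. The sketch below only builds the first listing URL per category and makes no requests. The slugs and labels are copied from the scraping cells in this notebook, while the `CATEGORIES` name and the loop are illustrative.

```python
BASE = "https://www.workbank.com/job/"

# slug -> label, copied from the scraping cells in this notebook.
CATEGORIES = {
    "information-communications-technology-job-openings": "Information and Communications Technology",
    "construction-job-openings": "Construction",
    "design-architecture-job-openings": "Design and Architecture",
    "agriculture-wildlife-conservation-job-openings": "Agriculture and Wildlife Conservation",
    "environmental-health-safety-job-openings": "Environmental and Health Safety",
    "medical-healthcare-job-openings": "Medical and Healthcare",
    "sciences-job-openings": "Sciences",
    "hiring-actuarial": "Actuarial",
}

first_pages = {label: f"{BASE}{slug}?page=1" for slug, label in CATEGORIES.items()}
for label, url in first_pages.items():
    print(f"{label}: {url}")
```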

In [None]:
#Construction
CONSTRUCTION_WORKBANK_URL = 'https://www.workbank.com/job/construction-job-openings?wb_q='
CONSTRUCTION_WORKBANK = requests.get(CONSTRUCTION_WORKBANK_URL)
CONSTRUCTION_WORKBANK_soup = BeautifulSoup(CONSTRUCTION_WORKBANK.content, 'html.parser')
CONSTRUCTION_WORKBANK_PP = CONSTRUCTION_WORKBANK_soup.find_all('div',{'class':'wb-pagination'})
if (len(CONSTRUCTION_WORKBANK_PP)!=0):
    CONSTRUCTION_WORKBANK_NUMPAGES=len(CONSTRUCTION_WORKBANK_PP[0].select("option"))
else:
    CONSTRUCTION_WORKBANK_NUMPAGES=0
CONSTRUCTION_WORKBANK_PAGES=[]
if (CONSTRUCTION_WORKBANK_NUMPAGES!=0):
    for i in range(1,CONSTRUCTION_WORKBANK_NUMPAGES+1):
        CONSTRUCTION_WORKBANK_PAGES.append('https://www.workbank.com/job/construction-job-openings?page='+ str(i))
else:
    CONSTRUCTION_WORKBANK_PAGES.append('https://www.workbank.com/job/construction-job-openings?page=1')
for i in range(len(CONSTRUCTION_WORKBANK_PAGES)):
    CONSTRUCTION_WORKBANK_URLs = CONSTRUCTION_WORKBANK_PAGES[i]
    CONSTRUCTION_WORKBANK_PAGE = requests.get(CONSTRUCTION_WORKBANK_URLs)
    CONSTRUCTION_WORKBANK_PAGE_soup = BeautifulSoup(CONSTRUCTION_WORKBANK_PAGE.content, 'html.parser')
    CONSTRUCTION_WORKBANK_JOBS = CONSTRUCTION_WORKBANK_PAGE_soup.find_all('a',{'class':'clearfix'})
    CONSTRUCTION_WORKBANK_JOB_URLs = re.findall(r'(?s)(?<=href=").*?(?="><h5)',str(CONSTRUCTION_WORKBANK_JOBS))
    CONSTRUCTION_WORKBANK_JOB_DATEs = CONSTRUCTION_WORKBANK_PAGE_soup.find_all('p',{'class':'publish-date-card mt-1 text-left mb-0'})
    for j in range(len(CONSTRUCTION_WORKBANK_JOB_URLs)):
        CONSTRUCTION_WORKBANK_JOB_PAGE = requests.get(CONSTRUCTION_WORKBANK_JOB_URLs[j])
        CONSTRUCTION_WORKBANK_JOB_PAGE_soup = BeautifulSoup(CONSTRUCTION_WORKBANK_JOB_PAGE.content, 'html.parser')
        CONSTRUCTION_WORKBANK_JOB_PAGE_INFO1 = CONSTRUCTION_WORKBANK_JOB_PAGE_soup.find('article',{'class':'job-ad-text-center pl-3'})
        CONSTRUCTION_WORKBANK_JOB_TITLE = CONSTRUCTION_WORKBANK_JOB_PAGE_INFO1.contents[0].text.strip()
        CONSTRUCTION_WORKBANK_JOB_COMPANY = CONSTRUCTION_WORKBANK_JOB_PAGE_INFO1.contents[1].text.strip()
        CONSTRUCTION_WORKBANK_JOB_SALARY = CONSTRUCTION_WORKBANK_JOB_PAGE_INFO1.contents[4].text.strip()
        CONSTRUCTION_WORKBANK_JOB_DATEPOSTED = CONSTRUCTION_WORKBANK_JOB_DATEs[j].text.strip()
        CONSTRUCTION_WORKBANK_JOB_LOCATION = CONSTRUCTION_WORKBANK_JOB_PAGE_soup.find('a',{'class':'cls-links'}).text.strip()
        CONSTRUCTION_WORKBANK_JOB_INFO2 = CONSTRUCTION_WORKBANK_JOB_PAGE_soup.find('ul',{'class':'job-ad-des-ul mb-0'})
        CONSTRUCTION_WORKBANK_JOB_STATUS = re.findall(r'(?s)(?<=Job Type</h5><p>).*?(?=</p>)',str(CONSTRUCTION_WORKBANK_JOB_INFO2))[0]
        CONSTRUCTION_WORKBANK_JOB_EDUCATION = re.findall(r'(?s)(?<=Educational Attainment</h5><p>).*?(?=</p>)',str(CONSTRUCTION_WORKBANK_JOB_INFO2))[0]
        CONSTRUCTION_WORKBANK_JOB_YEARS_WE = re.findall(r'(?s)(?<=Years of Work Experience</h5><p>).*?(?=</p>)',str(CONSTRUCTION_WORKBANK_JOB_INFO2))[0]
        CONSTRUCTION_WORKBANK_JOB_CATEGORY = "Construction"
        CONSTRUCTION_WORKBANK_JOB_INFO3 = CONSTRUCTION_WORKBANK_JOB_PAGE_soup.find('article',{'class':'pl-4 pr-4 pb-0 pt-4'})
        CONSTRUCTION_WORKBANK_JOB_DESCRIPTION = CONSTRUCTION_WORKBANK_JOB_INFO3.contents[1].getText(separator=u' ')
        WORKBANK_JOB_TITLE.append(CONSTRUCTION_WORKBANK_JOB_TITLE)
        WORKBANK_JOB_CATEGORY.append(CONSTRUCTION_WORKBANK_JOB_CATEGORY)
        WORKBANK_JOB_COMPANY.append(CONSTRUCTION_WORKBANK_JOB_COMPANY)
        WORKBANK_JOB_DATE.append(CONSTRUCTION_WORKBANK_JOB_DATEPOSTED)
        WORKBANK_JOB_LOCATION.append(CONSTRUCTION_WORKBANK_JOB_LOCATION)
        WORKBANK_JOB_STATUS.append(CONSTRUCTION_WORKBANK_JOB_STATUS)
        WORKBANK_JOB_SALARY.append(CONSTRUCTION_WORKBANK_JOB_SALARY)
        WORKBANK_JOB_EDUCATION.append(CONSTRUCTION_WORKBANK_JOB_EDUCATION)
        WORKBANK_JOB_YEAR_WE.append(CONSTRUCTION_WORKBANK_JOB_YEARS_WE)
        WORKBANK_JOB_DESCRIPTION.append(CONSTRUCTION_WORKBANK_JOB_DESCRIPTION)

### CATEGORY - Design and Architecture

In [None]:
#Design and Architecture
ARCHITECTURE_WORKBANK_URL = 'https://www.workbank.com/job/design-architecture-job-openings?wb_q='
ARCHITECTURE_WORKBANK = requests.get(ARCHITECTURE_WORKBANK_URL)
ARCHITECTURE_WORKBANK_soup = BeautifulSoup(ARCHITECTURE_WORKBANK.content, 'html.parser')
ARCHITECTURE_WORKBANK_PP = ARCHITECTURE_WORKBANK_soup.find_all('div',{'class':'wb-pagination'})
if (len(ARCHITECTURE_WORKBANK_PP)!=0):
    ARCHITECTURE_WORKBANK_NUMPAGES=len(ARCHITECTURE_WORKBANK_PP[0].select("option"))
else:
    ARCHITECTURE_WORKBANK_NUMPAGES=0
ARCHITECTURE_WORKBANK_PAGES=[]
if (ARCHITECTURE_WORKBANK_NUMPAGES!=0):
    for i in range(1,ARCHITECTURE_WORKBANK_NUMPAGES+1):
        ARCHITECTURE_WORKBANK_PAGES.append('https://www.workbank.com/job/design-architecture-job-openings?page='+ str(i))
else:
    ARCHITECTURE_WORKBANK_PAGES.append('https://www.workbank.com/job/design-architecture-job-openings?page=1')
for i in range(len(ARCHITECTURE_WORKBANK_PAGES)):
    ARCHITECTURE_WORKBANK_URLs = ARCHITECTURE_WORKBANK_PAGES[i]
    ARCHITECTURE_WORKBANK_PAGE = requests.get(ARCHITECTURE_WORKBANK_URLs)
    ARCHITECTURE_WORKBANK_PAGE_soup = BeautifulSoup(ARCHITECTURE_WORKBANK_PAGE.content, 'html.parser')
    ARCHITECTURE_WORKBANK_JOBS = ARCHITECTURE_WORKBANK_PAGE_soup.find_all('a',{'class':'clearfix'})
    ARCHITECTURE_WORKBANK_JOB_URLs = re.findall(r'(?s)(?<=href=").*?(?="><h5)',str(ARCHITECTURE_WORKBANK_JOBS))
    ARCHITECTURE_WORKBANK_JOB_DATEs = ARCHITECTURE_WORKBANK_PAGE_soup.find_all('p',{'class':'publish-date-card mt-1 text-left mb-0'})
    for j in range(len(ARCHITECTURE_WORKBANK_JOB_URLs)):
        ARCHITECTURE_WORKBANK_JOB_PAGE = requests.get(ARCHITECTURE_WORKBANK_JOB_URLs[j])
        ARCHITECTURE_WORKBANK_JOB_PAGE_soup = BeautifulSoup(ARCHITECTURE_WORKBANK_JOB_PAGE.content, 'html.parser')
        ARCHITECTURE_WORKBANK_JOB_PAGE_INFO1 = ARCHITECTURE_WORKBANK_JOB_PAGE_soup.find('article',{'class':'job-ad-text-center pl-3'})
        ARCHITECTURE_WORKBANK_JOB_TITLE = ARCHITECTURE_WORKBANK_JOB_PAGE_INFO1.contents[0].text.strip()
        ARCHITECTURE_WORKBANK_JOB_COMPANY = ARCHITECTURE_WORKBANK_JOB_PAGE_INFO1.contents[1].text.strip()
        ARCHITECTURE_WORKBANK_JOB_SALARY = ARCHITECTURE_WORKBANK_JOB_PAGE_INFO1.contents[4].text.strip()
        ARCHITECTURE_WORKBANK_JOB_DATEPOSTED = ARCHITECTURE_WORKBANK_JOB_DATEs[j].text.strip()
        ARCHITECTURE_WORKBANK_JOB_LOCATION = ARCHITECTURE_WORKBANK_JOB_PAGE_soup.find('a',{'class':'cls-links'}).text.strip()
        ARCHITECTURE_WORKBANK_JOB_INFO2 = ARCHITECTURE_WORKBANK_JOB_PAGE_soup.find('ul',{'class':'job-ad-des-ul mb-0'})
        ARCHITECTURE_WORKBANK_JOB_STATUS = re.findall(r'(?s)(?<=Job Type</h5><p>).*?(?=</p>)',str(ARCHITECTURE_WORKBANK_JOB_INFO2))[0]
        ARCHITECTURE_WORKBANK_JOB_EDUCATION = re.findall(r'(?s)(?<=Educational Attainment</h5><p>).*?(?=</p>)',str(ARCHITECTURE_WORKBANK_JOB_INFO2))[0]
        ARCHITECTURE_WORKBANK_JOB_YEARS_WE = re.findall(r'(?s)(?<=Years of Work Experience</h5><p>).*?(?=</p>)',str(ARCHITECTURE_WORKBANK_JOB_INFO2))[0]
        ARCHITECTURE_WORKBANK_JOB_CATEGORY = "Design and Architecture"
        ARCHITECTURE_WORKBANK_JOB_INFO3 = ARCHITECTURE_WORKBANK_JOB_PAGE_soup.find('article',{'class':'pl-4 pr-4 pb-0 pt-4'})
        ARCHITECTURE_WORKBANK_JOB_DESCRIPTION = ARCHITECTURE_WORKBANK_JOB_INFO3.contents[1].getText(separator=u' ')
        WORKBANK_JOB_TITLE.append(ARCHITECTURE_WORKBANK_JOB_TITLE)
        WORKBANK_JOB_CATEGORY.append(ARCHITECTURE_WORKBANK_JOB_CATEGORY)
        WORKBANK_JOB_COMPANY.append(ARCHITECTURE_WORKBANK_JOB_COMPANY)
        WORKBANK_JOB_DATE.append(ARCHITECTURE_WORKBANK_JOB_DATEPOSTED)
        WORKBANK_JOB_LOCATION.append(ARCHITECTURE_WORKBANK_JOB_LOCATION)
        WORKBANK_JOB_STATUS.append(ARCHITECTURE_WORKBANK_JOB_STATUS)
        WORKBANK_JOB_SALARY.append(ARCHITECTURE_WORKBANK_JOB_SALARY)
        WORKBANK_JOB_EDUCATION.append(ARCHITECTURE_WORKBANK_JOB_EDUCATION)
        WORKBANK_JOB_YEAR_WE.append(ARCHITECTURE_WORKBANK_JOB_YEARS_WE)
        WORKBANK_JOB_DESCRIPTION.append(ARCHITECTURE_WORKBANK_JOB_DESCRIPTION)

### CATEGORY - Agriculture and Wildlife Conservation

In [None]:
#Agriculture and Wildlife Conservation
AGRICULTURE_WORKBANK_URL = 'https://www.workbank.com/job/agriculture-wildlife-conservation-job-openings?wb_q='
AGRICULTURE_WORKBANK = requests.get(AGRICULTURE_WORKBANK_URL)
AGRICULTURE_WORKBANK_soup = BeautifulSoup(AGRICULTURE_WORKBANK.content, 'html.parser')
AGRICULTURE_WORKBANK_PP = AGRICULTURE_WORKBANK_soup.find_all('div',{'class':'wb-pagination'})
if (len(AGRICULTURE_WORKBANK_PP)!=0):
    AGRICULTURE_WORKBANK_NUMPAGES=len(AGRICULTURE_WORKBANK_PP[0].select("option"))
else:
    AGRICULTURE_WORKBANK_NUMPAGES=0
AGRICULTURE_WORKBANK_PAGES=[]
if (AGRICULTURE_WORKBANK_NUMPAGES!=0):
    for i in range(1,AGRICULTURE_WORKBANK_NUMPAGES+1):
        AGRICULTURE_WORKBANK_PAGES.append('https://www.workbank.com/job/agriculture-wildlife-conservation-job-openings?page='+ str(i))
else:
    AGRICULTURE_WORKBANK_PAGES.append('https://www.workbank.com/job/agriculture-wildlife-conservation-job-openings?page=1')
for i in range(len(AGRICULTURE_WORKBANK_PAGES)):
    AGRICULTURE_WORKBANK_URLs = AGRICULTURE_WORKBANK_PAGES[i]
    AGRICULTURE_WORKBANK_PAGE = requests.get(AGRICULTURE_WORKBANK_URLs)
    AGRICULTURE_WORKBANK_PAGE_soup = BeautifulSoup(AGRICULTURE_WORKBANK_PAGE.content, 'html.parser')
    AGRICULTURE_WORKBANK_JOBS = AGRICULTURE_WORKBANK_PAGE_soup.find_all('a',{'class':'clearfix'})
    AGRICULTURE_WORKBANK_JOB_URLs = re.findall(r'(?s)(?<=href=").*?(?="><h5)',str(AGRICULTURE_WORKBANK_JOBS))
    AGRICULTURE_WORKBANK_JOB_DATEs = AGRICULTURE_WORKBANK_PAGE_soup.find_all('p',{'class':'publish-date-card mt-1 text-left mb-0'})
    for j in range(len(AGRICULTURE_WORKBANK_JOB_URLs)):
        AGRICULTURE_WORKBANK_JOB_PAGE = requests.get(AGRICULTURE_WORKBANK_JOB_URLs[j])
        AGRICULTURE_WORKBANK_JOB_PAGE_soup = BeautifulSoup(AGRICULTURE_WORKBANK_JOB_PAGE.content, 'html.parser')
        AGRICULTURE_WORKBANK_JOB_PAGE_INFO1 = AGRICULTURE_WORKBANK_JOB_PAGE_soup.find('article',{'class':'job-ad-text-center pl-3'})
        AGRICULTURE_WORKBANK_JOB_TITLE = AGRICULTURE_WORKBANK_JOB_PAGE_INFO1.contents[0].text.strip()
        AGRICULTURE_WORKBANK_JOB_COMPANY = AGRICULTURE_WORKBANK_JOB_PAGE_INFO1.contents[1].text.strip()
        AGRICULTURE_WORKBANK_JOB_SALARY = AGRICULTURE_WORKBANK_JOB_PAGE_INFO1.contents[4].text.strip()
        AGRICULTURE_WORKBANK_JOB_DATEPOSTED = AGRICULTURE_WORKBANK_JOB_DATEs[j].text.strip()
        AGRICULTURE_WORKBANK_JOB_LOCATION = AGRICULTURE_WORKBANK_JOB_PAGE_soup.find('a',{'class':'cls-links'}).text.strip()
        AGRICULTURE_WORKBANK_JOB_INFO2 = AGRICULTURE_WORKBANK_JOB_PAGE_soup.find('ul',{'class':'job-ad-des-ul mb-0'})
        AGRICULTURE_WORKBANK_JOB_STATUS = re.findall(r'(?s)(?<=Job Type</h5><p>).*?(?=</p>)',str(AGRICULTURE_WORKBANK_JOB_INFO2))[0]
        AGRICULTURE_WORKBANK_JOB_EDUCATION = re.findall(r'(?s)(?<=Educational Attainment</h5><p>).*?(?=</p>)',str(AGRICULTURE_WORKBANK_JOB_INFO2))[0]
        AGRICULTURE_WORKBANK_JOB_YEARS_WE = re.findall(r'(?s)(?<=Years of Work Experience</h5><p>).*?(?=</p>)',str(AGRICULTURE_WORKBANK_JOB_INFO2))[0]
        AGRICULTURE_WORKBANK_JOB_CATEGORY = "Agriculture and Wildlife Conservation"
        AGRICULTURE_WORKBANK_JOB_INFO3 = AGRICULTURE_WORKBANK_JOB_PAGE_soup.find('article',{'class':'pl-4 pr-4 pb-0 pt-4'})
        AGRICULTURE_WORKBANK_JOB_DESCRIPTION = AGRICULTURE_WORKBANK_JOB_INFO3.contents[1].getText(separator=u' ')
        WORKBANK_JOB_TITLE.append(AGRICULTURE_WORKBANK_JOB_TITLE)
        WORKBANK_JOB_CATEGORY.append(AGRICULTURE_WORKBANK_JOB_CATEGORY)
        WORKBANK_JOB_COMPANY.append(AGRICULTURE_WORKBANK_JOB_COMPANY)
        WORKBANK_JOB_DATE.append(AGRICULTURE_WORKBANK_JOB_DATEPOSTED)
        WORKBANK_JOB_LOCATION.append(AGRICULTURE_WORKBANK_JOB_LOCATION)
        WORKBANK_JOB_STATUS.append(AGRICULTURE_WORKBANK_JOB_STATUS)
        WORKBANK_JOB_SALARY.append(AGRICULTURE_WORKBANK_JOB_SALARY)
        WORKBANK_JOB_YEAR_WE.append(AGRICULTURE_WORKBANK_JOB_YEARS_WE)
        WORKBANK_JOB_EDUCATION.append(AGRICULTURE_WORKBANK_JOB_EDUCATION)
        WORKBANK_JOB_DESCRIPTION.append(AGRICULTURE_WORKBANK_JOB_DESCRIPTION)

### CATEGORY - Environmental and Health Safety

In [None]:
#Environmental and Health Safety
SAFETY_WORKBANK_URL = 'https://www.workbank.com/job/environmental-health-safety-job-openings?wb_q='
SAFETY_WORKBANK = requests.get(SAFETY_WORKBANK_URL)
SAFETY_WORKBANK_soup = BeautifulSoup(SAFETY_WORKBANK.content, 'html.parser')
SAFETY_WORKBANK_PP = SAFETY_WORKBANK_soup.find_all('div',{'class':'wb-pagination'})
if (len(SAFETY_WORKBANK_PP)!=0):
    SAFETY_WORKBANK_NUMPAGES=len(SAFETY_WORKBANK_PP[0].select("option"))
else:
    SAFETY_WORKBANK_NUMPAGES=0
SAFETY_WORKBANK_PAGES=[]
if (SAFETY_WORKBANK_NUMPAGES!=0):
    for i in range(1,SAFETY_WORKBANK_NUMPAGES+1):
        SAFETY_WORKBANK_PAGES.append('https://www.workbank.com/job/environmental-health-safety-job-openings?page='+ str(i))
else:
    SAFETY_WORKBANK_PAGES.append('https://www.workbank.com/job/environmental-health-safety-job-openings?page=1')
for i in range(len(SAFETY_WORKBANK_PAGES)):
    SAFETY_WORKBANK_URLs = SAFETY_WORKBANK_PAGES[i]
    SAFETY_WORKBANK_PAGE = requests.get(SAFETY_WORKBANK_URLs)
    SAFETY_WORKBANK_PAGE_soup = BeautifulSoup(SAFETY_WORKBANK_PAGE.content, 'html.parser')
    SAFETY_WORKBANK_JOBS = SAFETY_WORKBANK_PAGE_soup.find_all('a',{'class':'clearfix'})
    SAFETY_WORKBANK_JOB_URLs = re.findall(r'(?s)(?<=href=").*?(?="><h5)',str(SAFETY_WORKBANK_JOBS))
    SAFETY_WORKBANK_JOB_DATEs = SAFETY_WORKBANK_PAGE_soup.find_all('p',{'class':'publish-date-card mt-1 text-left mb-0'})
    for j in range(len(SAFETY_WORKBANK_JOB_URLs)):
        SAFETY_WORKBANK_JOB_PAGE = requests.get(SAFETY_WORKBANK_JOB_URLs[j])
        SAFETY_WORKBANK_JOB_PAGE_soup = BeautifulSoup(SAFETY_WORKBANK_JOB_PAGE.content, 'html.parser')
        SAFETY_WORKBANK_JOB_PAGE_INFO1 = SAFETY_WORKBANK_JOB_PAGE_soup.find('article',{'class':'job-ad-text-center pl-3'})
        SAFETY_WORKBANK_JOB_TITLE = SAFETY_WORKBANK_JOB_PAGE_INFO1.contents[0].text.strip()
        SAFETY_WORKBANK_JOB_COMPANY = SAFETY_WORKBANK_JOB_PAGE_INFO1.contents[1].text.strip()
        SAFETY_WORKBANK_JOB_SALARY = SAFETY_WORKBANK_JOB_PAGE_INFO1.contents[4].text.strip()
        SAFETY_WORKBANK_JOB_DATEPOSTED = SAFETY_WORKBANK_JOB_DATEs[j].text.strip()
        SAFETY_WORKBANK_JOB_LOCATION = SAFETY_WORKBANK_JOB_PAGE_soup.find('a',{'class':'cls-links'}).text.strip()
        SAFETY_WORKBANK_JOB_INFO2 = SAFETY_WORKBANK_JOB_PAGE_soup.find('ul',{'class':'job-ad-des-ul mb-0'})
        SAFETY_WORKBANK_JOB_STATUS = re.findall(r'(?s)(?<=Job Type</h5><p>).*?(?=</p>)',str(SAFETY_WORKBANK_JOB_INFO2))[0]
        SAFETY_WORKBANK_JOB_EDUCATION = re.findall(r'(?s)(?<=Educational Attainment</h5><p>).*?(?=</p>)',str(SAFETY_WORKBANK_JOB_INFO2))[0]
        SAFETY_WORKBANK_JOB_YEARS_WE = re.findall(r'(?s)(?<=Years of Work Experience</h5><p>).*?(?=</p>)',str(SAFETY_WORKBANK_JOB_INFO2))[0]
        SAFETY_WORKBANK_JOB_CATEGORY = "Environmental and Health Safety"
        SAFETY_WORKBANK_JOB_INFO3 = SAFETY_WORKBANK_JOB_PAGE_soup.find('article',{'class':'pl-4 pr-4 pb-0 pt-4'})
        SAFETY_WORKBANK_JOB_DESCRIPTION = SAFETY_WORKBANK_JOB_INFO3.contents[1].getText(separator=u' ')
        WORKBANK_JOB_TITLE.append(SAFETY_WORKBANK_JOB_TITLE)
        WORKBANK_JOB_CATEGORY.append(SAFETY_WORKBANK_JOB_CATEGORY)
        WORKBANK_JOB_COMPANY.append(SAFETY_WORKBANK_JOB_COMPANY)
        WORKBANK_JOB_DATE.append(SAFETY_WORKBANK_JOB_DATEPOSTED)
        WORKBANK_JOB_LOCATION.append(SAFETY_WORKBANK_JOB_LOCATION)
        WORKBANK_JOB_STATUS.append(SAFETY_WORKBANK_JOB_STATUS)
        WORKBANK_JOB_SALARY.append(SAFETY_WORKBANK_JOB_SALARY)
        WORKBANK_JOB_YEAR_WE.append(SAFETY_WORKBANK_JOB_YEARS_WE)
        WORKBANK_JOB_EDUCATION.append(SAFETY_WORKBANK_JOB_EDUCATION)
        WORKBANK_JOB_DESCRIPTION.append(SAFETY_WORKBANK_JOB_DESCRIPTION)

### CATEGORY - Medical and Healthcare

In [None]:
#Medical and Healthcare
HEALTH_WORKBANK_URL = 'https://www.workbank.com/job/medical-healthcare-job-openings?wb_q='
HEALTH_WORKBANK = requests.get(HEALTH_WORKBANK_URL)
HEALTH_WORKBANK_soup = BeautifulSoup(HEALTH_WORKBANK.content, 'html.parser')
HEALTH_WORKBANK_PP = HEALTH_WORKBANK_soup.find_all('div',{'class':'wb-pagination'})
if (len(HEALTH_WORKBANK_PP)!=0):
    HEALTH_WORKBANK_NUMPAGES=len(HEALTH_WORKBANK_PP[0].select("option"))
else:
    HEALTH_WORKBANK_NUMPAGES=0
HEALTH_WORKBANK_PAGES=[]
if (HEALTH_WORKBANK_NUMPAGES!=0):
    for i in range(1,HEALTH_WORKBANK_NUMPAGES+1):
        HEALTH_WORKBANK_PAGES.append('https://www.workbank.com/job/medical-healthcare-job-openings?page='+ str(i))
else:
    HEALTH_WORKBANK_PAGES.append('https://www.workbank.com/job/medical-healthcare-job-openings?page=1')
for i in range(len(HEALTH_WORKBANK_PAGES)):
    HEALTH_WORKBANK_URLs = HEALTH_WORKBANK_PAGES[i]
    HEALTH_WORKBANK_PAGE = requests.get(HEALTH_WORKBANK_URLs)
    HEALTH_WORKBANK_PAGE_soup = BeautifulSoup(HEALTH_WORKBANK_PAGE.content, 'html.parser')
    HEALTH_WORKBANK_JOBS = HEALTH_WORKBANK_PAGE_soup.find_all('a',{'class':'clearfix'})
    HEALTH_WORKBANK_JOB_URLs = re.findall(r'(?s)(?<=href=").*?(?="><h5)',str(HEALTH_WORKBANK_JOBS))
    HEALTH_WORKBANK_JOB_DATEs = HEALTH_WORKBANK_PAGE_soup.find_all('p',{'class':'publish-date-card mt-1 text-left mb-0'})
    for j in range(len(HEALTH_WORKBANK_JOB_URLs)):
        HEALTH_WORKBANK_JOB_PAGE = requests.get(HEALTH_WORKBANK_JOB_URLs[j])
        HEALTH_WORKBANK_JOB_PAGE_soup = BeautifulSoup(HEALTH_WORKBANK_JOB_PAGE.content, 'html.parser')
        HEALTH_WORKBANK_JOB_PAGE_INFO1 = HEALTH_WORKBANK_JOB_PAGE_soup.find('article',{'class':'job-ad-text-center pl-3'})
        HEALTH_WORKBANK_JOB_TITLE = HEALTH_WORKBANK_JOB_PAGE_INFO1.contents[0].text.strip()
        HEALTH_WORKBANK_JOB_COMPANY = HEALTH_WORKBANK_JOB_PAGE_INFO1.contents[1].text.strip()
        HEALTH_WORKBANK_JOB_SALARY = HEALTH_WORKBANK_JOB_PAGE_INFO1.contents[4].text.strip()
        HEALTH_WORKBANK_JOB_DATEPOSTED = HEALTH_WORKBANK_JOB_DATEs[j].text.strip()
        HEALTH_WORKBANK_JOB_LOCATION = HEALTH_WORKBANK_JOB_PAGE_soup.find('a',{'class':'cls-links'}).text.strip()
        HEALTH_WORKBANK_JOB_INFO2 = HEALTH_WORKBANK_JOB_PAGE_soup.find('ul',{'class':'job-ad-des-ul mb-0'})
        HEALTH_WORKBANK_JOB_STATUS = re.findall(r'(?s)(?<=Job Type</h5><p>).*?(?=</p>)',str(HEALTH_WORKBANK_JOB_INFO2))[0]
        HEALTH_WORKBANK_JOB_EDUCATION = re.findall(r'(?s)(?<=Educational Attainment</h5><p>).*?(?=</p>)',str(HEALTH_WORKBANK_JOB_INFO2))[0]
        HEALTH_WORKBANK_JOB_YEARS_WE = re.findall(r'(?s)(?<=Years of Work Experience</h5><p>).*?(?=</p>)',str(HEALTH_WORKBANK_JOB_INFO2))[0]
        HEALTH_WORKBANK_JOB_CATEGORY = "Medical and Healthcare"
        HEALTH_WORKBANK_JOB_INFO3 = HEALTH_WORKBANK_JOB_PAGE_soup.find('article',{'class':'pl-4 pr-4 pb-0 pt-4'})
        HEALTH_WORKBANK_JOB_DESCRIPTION = HEALTH_WORKBANK_JOB_INFO3.contents[1].getText(separator=u' ')
        WORKBANK_JOB_TITLE.append(HEALTH_WORKBANK_JOB_TITLE)
        WORKBANK_JOB_CATEGORY.append(HEALTH_WORKBANK_JOB_CATEGORY)
        WORKBANK_JOB_COMPANY.append(HEALTH_WORKBANK_JOB_COMPANY)
        WORKBANK_JOB_DATE.append(HEALTH_WORKBANK_JOB_DATEPOSTED)
        WORKBANK_JOB_LOCATION.append(HEALTH_WORKBANK_JOB_LOCATION)
        WORKBANK_JOB_STATUS.append(HEALTH_WORKBANK_JOB_STATUS)
        WORKBANK_JOB_SALARY.append(HEALTH_WORKBANK_JOB_SALARY)
        WORKBANK_JOB_YEAR_WE.append(HEALTH_WORKBANK_JOB_YEARS_WE)
        WORKBANK_JOB_EDUCATION.append(HEALTH_WORKBANK_JOB_EDUCATION)
        WORKBANK_JOB_DESCRIPTION.append(HEALTH_WORKBANK_JOB_DESCRIPTION)

### CATEGORY - Sciences

In [None]:
#Sciences
SCIENCES_WORKBANK_URL = 'https://www.workbank.com/job/sciences-job-openings?wb_q='
SCIENCES_WORKBANK = requests.get(SCIENCES_WORKBANK_URL)
SCIENCES_WORKBANK_soup = BeautifulSoup(SCIENCES_WORKBANK.content, 'html.parser')
SCIENCES_WORKBANK_PP = SCIENCES_WORKBANK_soup.find_all('div',{'class':'wb-pagination'})
if (len(SCIENCES_WORKBANK_PP)!=0):
    SCIENCES_WORKBANK_NUMPAGES=len(SCIENCES_WORKBANK_PP[0].select("option"))
else:
    SCIENCES_WORKBANK_NUMPAGES=0
SCIENCES_WORKBANK_PAGES=[]
if (SCIENCES_WORKBANK_NUMPAGES!=0):
    for i in range(1,SCIENCES_WORKBANK_NUMPAGES+1):
        SCIENCES_WORKBANK_PAGES.append('https://www.workbank.com/job/sciences-job-openings?page='+ str(i))
else:
    SCIENCES_WORKBANK_PAGES.append('https://www.workbank.com/job/sciences-job-openings?page=1')
for i in range(len(SCIENCES_WORKBANK_PAGES)):
    SCIENCES_WORKBANK_URLs = SCIENCES_WORKBANK_PAGES[i]
    SCIENCES_WORKBANK_PAGE = requests.get(SCIENCES_WORKBANK_URLs)
    SCIENCES_WORKBANK_PAGE_soup = BeautifulSoup(SCIENCES_WORKBANK_PAGE.content, 'html.parser')
    SCIENCES_WORKBANK_JOBS = SCIENCES_WORKBANK_PAGE_soup.find_all('a',{'class':'clearfix'})
    SCIENCES_WORKBANK_JOB_URLs = re.findall(r'(?s)(?<=href=").*?(?="><h5)',str(SCIENCES_WORKBANK_JOBS))
    SCIENCES_WORKBANK_JOB_DATEs = SCIENCES_WORKBANK_PAGE_soup.find_all('p',{'class':'publish-date-card mt-1 text-left mb-0'})
    for j in range(len(SCIENCES_WORKBANK_JOB_URLs)):
        SCIENCES_WORKBANK_JOB_PAGE = requests.get(SCIENCES_WORKBANK_JOB_URLs[j])
        SCIENCES_WORKBANK_JOB_PAGE_soup = BeautifulSoup(SCIENCES_WORKBANK_JOB_PAGE.content, 'html.parser')
        SCIENCES_WORKBANK_JOB_PAGE_INFO1 = SCIENCES_WORKBANK_JOB_PAGE_soup.find('article',{'class':'job-ad-text-center pl-3'})
        SCIENCES_WORKBANK_JOB_TITLE = SCIENCES_WORKBANK_JOB_PAGE_INFO1.contents[0].text.strip()
        SCIENCES_WORKBANK_JOB_COMPANY = SCIENCES_WORKBANK_JOB_PAGE_INFO1.contents[1].text.strip()
        SCIENCES_WORKBANK_JOB_SALARY = SCIENCES_WORKBANK_JOB_PAGE_INFO1.contents[4].text.strip()
        SCIENCES_WORKBANK_JOB_DATEPOSTED = SCIENCES_WORKBANK_JOB_DATEs[j].text.strip()
        SCIENCES_WORKBANK_JOB_LOCATION = SCIENCES_WORKBANK_JOB_PAGE_soup.find('a',{'class':'cls-links'}).text.strip()
        SCIENCES_WORKBANK_JOB_INFO2 = SCIENCES_WORKBANK_JOB_PAGE_soup.find('ul',{'class':'job-ad-des-ul mb-0'})
        SCIENCES_WORKBANK_JOB_STATUS = re.findall(r'(?s)(?<=Job Type</h5><p>).*?(?=</p>)',str(SCIENCES_WORKBANK_JOB_INFO2))[0]
        SCIENCES_WORKBANK_JOB_EDUCATION = re.findall(r'(?s)(?<=Educational Attainment</h5><p>).*?(?=</p>)',str(SCIENCES_WORKBANK_JOB_INFO2))[0]
        SCIENCES_WORKBANK_JOB_YEARS_WE = re.findall(r'(?s)(?<=Years of Work Experience</h5><p>).*?(?=</p>)',str(SCIENCES_WORKBANK_JOB_INFO2))[0]
        SCIENCES_WORKBANK_JOB_CATEGORY = "Sciences"
        SCIENCES_WORKBANK_JOB_INFO3 = SCIENCES_WORKBANK_JOB_PAGE_soup.find('article',{'class':'pl-4 pr-4 pb-0 pt-4'})
        SCIENCES_WORKBANK_JOB_DESCRIPTION = SCIENCES_WORKBANK_JOB_INFO3.contents[1].getText(separator=u' ')
        WORKBANK_JOB_TITLE.append(SCIENCES_WORKBANK_JOB_TITLE)
        WORKBANK_JOB_CATEGORY.append(SCIENCES_WORKBANK_JOB_CATEGORY)
        WORKBANK_JOB_COMPANY.append(SCIENCES_WORKBANK_JOB_COMPANY)
        WORKBANK_JOB_DATE.append(SCIENCES_WORKBANK_JOB_DATEPOSTED)
        WORKBANK_JOB_LOCATION.append(SCIENCES_WORKBANK_JOB_LOCATION)
        WORKBANK_JOB_STATUS.append(SCIENCES_WORKBANK_JOB_STATUS)
        WORKBANK_JOB_SALARY.append(SCIENCES_WORKBANK_JOB_SALARY)
        WORKBANK_JOB_YEAR_WE.append(SCIENCES_WORKBANK_JOB_YEARS_WE)
        WORKBANK_JOB_EDUCATION.append(SCIENCES_WORKBANK_JOB_EDUCATION)
        WORKBANK_JOB_DESCRIPTION.append(SCIENCES_WORKBANK_JOB_DESCRIPTION)

### CATEGORY - Actuarial

In [None]:
#Actuarial
ACTUARIAL_WORKBANK_URL = 'https://www.workbank.com/job/hiring-actuarial'
ACTUARIAL_WORKBANK = requests.get(ACTUARIAL_WORKBANK_URL)
ACTUARIAL_WORKBANK_soup = BeautifulSoup(ACTUARIAL_WORKBANK.content, 'html.parser')
ACTUARIAL_WORKBANK_PP = ACTUARIAL_WORKBANK_soup.find_all('div',{'class':'wb-pagination'})
if (len(ACTUARIAL_WORKBANK_PP)!=0):
    ACTUARIAL_WORKBANK_NUMPAGES=len(ACTUARIAL_WORKBANK_PP[0].select("option"))
else:
    ACTUARIAL_WORKBANK_NUMPAGES=0
ACTUARIAL_WORKBANK_PAGES=[]
if (ACTUARIAL_WORKBANK_NUMPAGES!=0):
    for i in range(1,ACTUARIAL_WORKBANK_NUMPAGES+1):
        ACTUARIAL_WORKBANK_PAGES.append('https://www.workbank.com/job/hiring-actuarial?page='+ str(i))
else:
    ACTUARIAL_WORKBANK_PAGES.append('https://www.workbank.com/job/hiring-actuarial?page=1')
for i in range(len(ACTUARIAL_WORKBANK_PAGES)):
    ACTUARIAL_WORKBANK_URLs = ACTUARIAL_WORKBANK_PAGES[i]
    ACTUARIAL_WORKBANK_PAGE = requests.get(ACTUARIAL_WORKBANK_URLs)
    ACTUARIAL_WORKBANK_PAGE_soup = BeautifulSoup(ACTUARIAL_WORKBANK_PAGE.content, 'html.parser')
    ACTUARIAL_WORKBANK_JOBS = ACTUARIAL_WORKBANK_PAGE_soup.find_all('a',{'class':'clearfix'})
    ACTUARIAL_WORKBANK_JOB_URLs = re.findall(r'(?s)(?<=href=").*?(?="><h5)',str(ACTUARIAL_WORKBANK_JOBS))
    ACTUARIAL_WORKBANK_JOB_DATEs = ACTUARIAL_WORKBANK_PAGE_soup.find_all('p',{'class':'publish-date-card mt-1 text-left mb-0'})
    for j in range(len(ACTUARIAL_WORKBANK_JOB_URLs)):
        ACTUARIAL_WORKBANK_JOB_PAGE = requests.get(ACTUARIAL_WORKBANK_JOB_URLs[j])
        ACTUARIAL_WORKBANK_JOB_PAGE_soup = BeautifulSoup(ACTUARIAL_WORKBANK_JOB_PAGE.content, 'html.parser')
        ACTUARIAL_WORKBANK_JOB_PAGE_INFO1 = ACTUARIAL_WORKBANK_JOB_PAGE_soup.find('article',{'class':'job-ad-text-center pl-3'})
        ACTUARIAL_WORKBANK_JOB_TITLE = ACTUARIAL_WORKBANK_JOB_PAGE_INFO1.contents[0].text.strip()
        ACTUARIAL_WORKBANK_JOB_COMPANY = ACTUARIAL_WORKBANK_JOB_PAGE_INFO1.contents[1].text.strip()
        ACTUARIAL_WORKBANK_JOB_SALARY = ACTUARIAL_WORKBANK_JOB_PAGE_INFO1.contents[4].text.strip()
        ACTUARIAL_WORKBANK_JOB_DATEPOSTED = ACTUARIAL_WORKBANK_JOB_DATEs[j].text.strip()
        ACTUARIAL_WORKBANK_JOB_LOCATION = ACTUARIAL_WORKBANK_JOB_PAGE_soup.find('a',{'class':'cls-links'}).text.strip()
        ACTUARIAL_WORKBANK_JOB_INFO2 = ACTUARIAL_WORKBANK_JOB_PAGE_soup.find('ul',{'class':'job-ad-des-ul mb-0'})
        ACTUARIAL_WORKBANK_JOB_STATUS = re.findall(r'(?s)(?<=Job Type</h5><p>).*?(?=</p>)',str(ACTUARIAL_WORKBANK_JOB_INFO2))[0]
        ACTUARIAL_WORKBANK_JOB_EDUCATION = re.findall(r'(?s)(?<=Educational Attainment</h5><p>).*?(?=</p>)',str(ACTUARIAL_WORKBANK_JOB_INFO2))[0]
        ACTUARIAL_WORKBANK_JOB_YEARS_WE = re.findall(r'(?s)(?<=Years of Work Experience</h5><p>).*?(?=</p>)',str(ACTUARIAL_WORKBANK_JOB_INFO2))[0]
        ACTUARIAL_WORKBANK_JOB_CATEGORY = "Actuarial"
        ACTUARIAL_WORKBANK_JOB_INFO3 = ACTUARIAL_WORKBANK_JOB_PAGE_soup.find('article',{'class':'pl-4 pr-4 pb-0 pt-4'})
        ACTUARIAL_WORKBANK_JOB_DESCRIPTION = ACTUARIAL_WORKBANK_JOB_INFO3.contents[1].getText(separator=u' ')
        WORKBANK_JOB_TITLE.append(ACTUARIAL_WORKBANK_JOB_TITLE)
        WORKBANK_JOB_CATEGORY.append(ACTUARIAL_WORKBANK_JOB_CATEGORY)
        WORKBANK_JOB_COMPANY.append(ACTUARIAL_WORKBANK_JOB_COMPANY)
        WORKBANK_JOB_DATE.append(ACTUARIAL_WORKBANK_JOB_DATEPOSTED)
        WORKBANK_JOB_LOCATION.append(ACTUARIAL_WORKBANK_JOB_LOCATION)
        WORKBANK_JOB_STATUS.append(ACTUARIAL_WORKBANK_JOB_STATUS)
        WORKBANK_JOB_SALARY.append(ACTUARIAL_WORKBANK_JOB_SALARY)
        WORKBANK_JOB_YEAR_WE.append(ACTUARIAL_WORKBANK_JOB_YEARS_WE)
        WORKBANK_JOB_EDUCATION.append(ACTUARIAL_WORKBANK_JOB_EDUCATION)
        WORKBANK_JOB_DESCRIPTION.append(ACTUARIAL_WORKBANK_JOB_DESCRIPTION)

### Data check
Build a DataFrame from the collected lists and display it to check the gathered data.

In [None]:
workbank={'Website': "Workbank" ,
          'Job Title': WORKBANK_JOB_TITLE, 
          'Category': WORKBANK_JOB_CATEGORY, 
          'Company': WORKBANK_JOB_COMPANY, 
          'Date Posted': WORKBANK_JOB_DATE, 
          'Location': WORKBANK_JOB_LOCATION, 
          'Status': WORKBANK_JOB_STATUS, 
          'Salary': WORKBANK_JOB_SALARY, 
          'Education': WORKBANK_JOB_EDUCATION, 
          'Years of Work Experience': WORKBANK_JOB_YEAR_WE,
          'Job Description': WORKBANK_JOB_DESCRIPTION}
workbank_df = pd.DataFrame(data=workbank)
workbank_df
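
`pd.DataFrame` raises a `ValueError` when the lists passed to it differ in length, which can happen if a cell is interrupted part-way through its appends. A small check before building the frame makes such a mismatch easier to locate. This is a sketch with made-up lists, and `check_lengths` is a hypothetical helper.

```python
def check_lengths(columns):
    """Return {name: length} and raise ValueError if the lengths differ."""
    lengths = {name: len(values) for name, values in columns.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"Column lengths differ: {lengths}")
    return lengths

# Made-up columns for illustration.
sample = {
    "Job Title": ["Data Analyst", "Site Engineer"],
    "Category": ["Sciences", "Construction"],
    "Company": ["Example Corp", "Example Builders"],
}
print(check_lengths(sample))
```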

### Parsing the Data to JSON

Store the gathered data in a JSON file.

In [None]:
data = workbank_df.to_json(orient='records')
parsed = json.loads(data)
# Write the records to disk, indented for readability.
with open('workbank.json', 'w') as json_file:
    json.dump(parsed, json_file, indent=4)
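
To confirm that a file was written as expected, it can be read back and compared with the records that were saved. The sketch below uses made-up records and a temporary file so that it does not touch `workbank.json`.

```python
import json
import os
import tempfile

# Made-up records in the same shape as to_json(orient='records') output.
records = [
    {"Website": "Workbank", "Job Title": "Data Analyst", "Category": "Sciences"},
    {"Website": "Workbank", "Job Title": "Site Engineer", "Category": "Construction"},
]

path = os.path.join(tempfile.mkdtemp(), "sample.json")
with open(path, "w") as json_file:
    json.dump(records, json_file, indent=4)

with open(path) as json_file:
    loaded = json.load(json_file)

print(len(loaded), loaded == records)
```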