# WEB SCRAPING NI ASSEMBLY WEBSITE
Here we build a web scraper that extracts data not only from the current year but from previous years as well.

Defining the URL:

In [1]:
url = 'http://aims.niassembly.gov.uk/officialreport/reports.aspx'

## DEPENDENCIES REQUIRED

In [2]:
pip install selenium

Collecting selenium
  Downloading https://files.pythonhosted.org/packages/80/d6/4294f0b4bce4de0abf13e17190289f9d0613b0a44e5dd6a7f5ca98459853/selenium-3.141.0-py2.py3-none-any.whl (904kB)
     |████████████████████████████████| 911kB 2.7MB/s 
Installing collected packages: selenium
Successfully installed selenium-3.141.0


In [3]:
import pandas as pd
import numpy as np

import requests
from bs4 import BeautifulSoup

from selenium import webdriver
from selenium.webdriver.common.keys import Keys    
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

from tqdm import tqdm
import time

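Running Selenium in a hosted notebook requires some extra setup: a Chromium driver has to be installed, and the browser has to run headless.

This isn't required if we run the scraper on a local or virtual machine that already has a browser and a matching driver installed.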

In [4]:
!apt install chromium-chromedriver
!cp /usr/lib/chromium-browser/chromedriver /usr/bin
!pip install selenium

options = webdriver.ChromeOptions()
# Chrome switches are conventionally written with a double dash
options.add_argument('--headless')
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')

Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following package was automatically installed and is no longer required:
  libnvidia-common-440
Use 'apt autoremove' to remove it.
The following additional packages will be installed:
  chromium-browser chromium-browser-l10n chromium-codecs-ffmpeg-extra
Suggested packages:
  webaccounts-chromium-extension unity-chromium-extension adobe-flashplugin
The following NEW packages will be installed:
  chromium-browser chromium-browser-l10n chromium-chromedriver
  chromium-codecs-ffmpeg-extra
0 upgraded, 4 newly installed, 0 to remove and 35 not upgraded.
Need to get 75.5 MB of archives.
After this operation, 256 MB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu bionic-updates/universe amd64 chromium-codecs-ffmpeg-extra amd64 83.0.4103.61-0ubuntu0.18.04.1 [1,119 kB]
Get:2 http://archive.ubuntu.com/ubuntu bionic-updates/universe amd64 chromium-browser amd64 83.0.410

Inspecting the NI Assembly Hansard website shows that each Assembly session can be viewed on the website itself and can also be downloaded in PDF format. Let's view the page source of this website.

![alt text](https://i.imgur.com/1s7qO7R.png)

Here we initialize a driver (a virtual browser) so that we can navigate within the NI Assembly website.

In [5]:
driver = webdriver.Chrome('chromedriver',options=options)
driver.get(url)
# print(driver.page_source) # uncomment to see the whole page source of the website

We will create an empty DataFrame to save all the data extracted from our web scraping model.

In [6]:
ni_dataframe = pd.DataFrame(columns=['date', 'heading_1', 'heading_2', 'heading_3', 'contribution_minister_name', 'contribution', 'procedure', 'motion'], index = [0])
ni_dataframe

| | date | heading_1 | heading_2 | heading_3 | contribution_minister_name | contribution | procedure | motion |
|---|---|---|---|---|---|---|---|---|
| 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |


Next, we extract the link addresses of all the NI Assembly sessions available on a single page. They are stored in a list, which we can later iterate over with a for loop to extract the data (date, minister name, and their statements) from the individual sessions.
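The filtering rule described above (keep only the links whose text is 'View') can be tried without a browser. The following is a minimal sketch using only the standard library; the HTML fragment and the `ViewLinkCollector` class are made up for illustration, and the real page's markup may differ. Unlike the Selenium cell, it only looks at `<a>` elements and strips surrounding whitespace before comparing.

```python
from html.parser import HTMLParser


class ViewLinkCollector(HTMLParser):
    """Collect the href of every <a> element whose text is exactly 'View'."""

    def __init__(self):
        super().__init__()
        self.view_hrefs = []
        self._current_href = None  # href of the <a> currently open, if any
        self._current_text = []    # text pieces seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._current_href = dict(attrs).get("href")
            self._current_text = []

    def handle_data(self, data):
        if self._current_href is not None:
            self._current_text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current_href is not None:
            if "".join(self._current_text).strip() == "View":
                self.view_hrefs.append(self._current_href)
            self._current_href = None


# Hypothetical fragment: one row offers 'View' and 'PDF', the other only 'View'.
SAMPLE_HTML = """
<table>
  <tr><td>Report A</td>
      <td><a href="report.aspx?docID=1">View</a></td>
      <td><a href="report.pdf?docID=1">PDF</a></td></tr>
  <tr><td>Report B</td>
      <td><a href="report.aspx?docID=2">View</a></td></tr>
</table>
"""

collector = ViewLinkCollector()
collector.feed(SAMPLE_HTML)
print(collector.view_hrefs)
```

Only the two 'View' links are kept; the 'PDF' link is ignored.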

In [7]:
# all_views = driver.find_elements_by_tag_name("a")
all_views = driver.find_elements_by_xpath('//*[@href]')

# print(all_views[197].get_attribute('innerHTML'))
# print(all_views[199].get_attribute("textContent"))
# # print(all_views[199].get_attribute('value'))
# # print(all_views[199].text)
print(all_views[200].get_attribute("href"))
print(all_views[200].get_attribute("id"))
print(len(all_views))

# Collect, in a single pass, the href, the id and the element itself
# for every link whose text is 'View'. Any one of the three lists is
# enough to iterate over the session reports.
final_view_href = []
final_view_id = []
final_view_full = []

for view in all_views:
  if view.get_attribute('textContent') == 'View':
    final_view_href.append(view.get_attribute("href"))
    final_view_id.append(view.get_attribute("id"))
    final_view_full.append(view)

print("Number of NI Assembly session links available on the page: ",len(final_view_href))


print("Number of NI Assembly session links available on the page: ",len(final_view_id))

http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/05/12&docID=300902
ctl00_MainContentPlaceHolder_OfficialReportsGridView_ctl13_HTMLViewButton1
292
Number of NI Assembly session links available on the page:  30
Number of NI Assembly session links available on the page:  30


We extracted 30 links from a single page, and a manual cross-check of the links confirmed this count. We will therefore reuse this code snippet for all the sub-pages and when iterating through different years.
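The href printed above carries an `eveDate` and a `docID` in its query string, which could be one way to fill the `date` column. The sketch below parses them with the standard library; the helper name `parse_report_link` is hypothetical, and it assumes `eveDate` is written as year/month/day.

```python
from datetime import datetime
from urllib.parse import parse_qs, urlparse


def parse_report_link(href):
    """Return (sitting_date, doc_id) taken from a report link's query string."""
    query = parse_qs(urlparse(href).query)
    # Assumption: eveDate is formatted as year/month/day.
    sitting_date = datetime.strptime(query["eveDate"][0], "%Y/%m/%d").date()
    doc_id = query["docID"][0]
    return sitting_date, doc_id


# The link printed by the cell above.
href = "http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/05/12&docID=300902"
sitting_date, doc_id = parse_report_link(href)
print(sitting_date.isoformat(), doc_id)  # 2020-05-12 300902
```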

We can iterate over the session reports in two ways:
1. By href. We keep the href of every link whose text is 'View', loop over these URLs inside a try/except block, open each one, and extract the essential data.
2. By id. We keep the id of every 'View' link and pass each id to a clicker routine: the virtual browser clicks into the session report page, and we then parse the HTML to extract the required information.

We will implement whichever method is faster, using tqdm to time both.

In [8]:
# METHOD 1 (using href): open every report URL in a fresh browser instance
try:
    for i in tqdm(final_view_href, total=len(final_view_href)) :
      driver = webdriver.Chrome('chromedriver',options=options)
      driver.get(i)

      ni_text = driver.find_element_by_tag_name("main")
      # print(ni_text.text)
      driver.quit()  # close this instance before opening the next one

except Exception as e:
    print("error!", e)
    driver.quit()

100%|██████████| 30/30 [06:09<00:00, 12.32s/it]


In [9]:
# METHOD 2 (using id and clicker)
driver = webdriver.Chrome('chromedriver',options=options)
driver.get(url)
try:
  for h in tqdm(final_view_id, total=len(final_view_id)):

    element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.ID, h))
        )
    element.click()
    ni_text = driver.find_element_by_tag_name("main")      # tag_name
    # ni_header = driver.find_elements_by_tag_name("p")
    # print(ni_header.text)
    driver.back()

except Exception as e:
    print("error!", e)
    driver.quit()

100%|██████████| 30/30 [00:46<00:00,  1.55s/it]


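We can see that Method 2 takes significantly less time, most likely because Method 1 starts a new browser instance for every link, whereas Method 2 reuses a single one.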
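The gap can be quantified from the elapsed times shown in the two progress bars above. This is a small worked calculation, not part of the scraper; the helper name `tqdm_elapsed_to_seconds` is made up for illustration.

```python
def tqdm_elapsed_to_seconds(elapsed):
    """Convert an elapsed string such as '06:09' (MM:SS) or '1:02:03' to seconds."""
    seconds = 0
    for part in elapsed.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


N_REPORTS = 30
method_1 = tqdm_elapsed_to_seconds("06:09")  # elapsed time of the Method 1 bar
method_2 = tqdm_elapsed_to_seconds("00:46")  # elapsed time of the Method 2 bar

print(f"Method 1: {method_1} s total, {method_1 / N_REPORTS:.1f} s per report")
print(f"Method 2: {method_2} s total, {method_2 / N_REPORTS:.1f} s per report")
print(f"Speed-up: {method_1 / method_2:.1f}x")
```

On these two runs, Method 2 is roughly eight times faster than Method 1.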

Next, we will try to extract specific parts of the corpus. For example, let's pull out all the headings from the 30 Hansard reports listed on a single page for the year 2019-2020.
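The idea of grouping headings by their CSS class (`Header1`, `Header2`, `Header3`) can be sketched offline first. The fragment below is invented for illustration: the tag names are an assumption, and only the class names and heading texts come from the reports. `HeaderCollector` is a hypothetical helper built on the standard library, and it does not handle a header element nested inside another element of the same tag.

```python
from collections import defaultdict
from html.parser import HTMLParser

HEADER_CLASSES = ("Header1", "Header2", "Header3")


class HeaderCollector(HTMLParser):
    """Group the text of elements by class for the three header classes."""

    def __init__(self):
        super().__init__()
        self.headers = defaultdict(list)
        self._open_class = None  # header class of the element being read
        self._open_tag = None    # tag name of that element
        self._text = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for name in HEADER_CLASSES:
            if name in classes and self._open_class is None:
                self._open_class, self._open_tag, self._text = name, tag, []

    def handle_data(self, data):
        if self._open_class is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._open_class is not None and tag == self._open_tag:
            self.headers[self._open_class].append("".join(self._text).strip())
            self._open_class = None


# Hypothetical markup; the <p> tags are an assumption.
SAMPLE_HTML = """
<main>
  <p class="Header1">Assembly Business</p>
  <p class="Header3">Standing Order 20(1): Suspension</p>
  <p>Body text that is not a heading.</p>
  <p class="Header1">Oral Answers to Questions</p>
  <p class="Header2">Health</p>
</main>
"""

collector = HeaderCollector()
collector.feed(SAMPLE_HTML)
for name in HEADER_CLASSES:
    print(name, collector.headers[name])
```

As in the Selenium cell, grouping by class keeps the order within each level but loses the nesting between levels.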

In [10]:
# METHOD 2 (using id and clicker): collecting the Header1, Header2 and Header3 headings
driver = webdriver.Chrome('chromedriver',options=options)
driver.get(url)
try:
  for h in tqdm(final_view_id, total=len(final_view_id)):

    element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.ID, h))
        )
    element.click()
    main = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.TAG_NAME, "main"))
        )
    ni_header_1 = main.find_elements_by_class_name("Header1")
    ni_header_2 = main.find_elements_by_class_name("Header2")
    ni_header_3 = main.find_elements_by_class_name("Header3")
    print("X"*50, "HEADER 1", "X"*50)    
    for h1 in ni_header_1:   # distinct names so the outer loop variable `h` is not overwritten
        print(h1.text)
    print("X"*50, "HEADER 2", "X"*50)
    for h2 in ni_header_2:
        print(h2.text)
    print("X"*50, "HEADER 3", "X"*50)
    for h3 in ni_header_3:
        print(h3.text)
         
    driver.back()

except Exception as e:
    print("error!", e)
    driver.quit()

  0%|          | 0/30 [00:00<?, ?it/s]

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Assembly Business
Executive Committee Business
Committee Business
Question for Urgent Oral Answer
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 2 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Agriculture, Environment and Rural Affairs
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 3 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Standing Order 20(1): Suspension
Northern Ireland Assembly Commissioner for Standards: Appointment
Health Protection (Coronavirus, Restrictions) (Amendment No. 9) Regulations (Northern Ireland) 2020
Health Protection (Coronavirus, Restrictions) (Amendment No. 10) Regulations (Northern Ireland) 2020
Executive Committee (Functions) Bill: Consideration Stage
Climate Change and the Introduction of a Climate Change Act
Waste Storage at Edenderry Industrial Estate


  3%|▎         | 1/30 [00:01<00:42,  1.47s/it]

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Assembly Business
Executive Committee Business
Committee Business
Oral Answers to Questions
Committee Business
Private Members' Business
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 2 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
The Executive Office
Health
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 3 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Standing Order 20A: Suspension
Pension Schemes Bill: Second Stage
Sea Fish Industry (Coronavirus) (Fixed Costs) Scheme (Northern Ireland) 2020
Sea Fish Industry (Coronavirus) (Fixed Costs) (Amendment) Scheme (Northern Ireland) 2020
The Rates (Exemption for Automatic Telling Machines in Rural Areas) Order (Northern Ireland) 2020
Business and Planning Bill: Legislative Consent Motion
COVID-19 Guidance and Financial Support to Industry Sectors
Victims' Payment Scheme
Hart Inquiry
Urban Villages: Nort

  7%|▋         | 2/30 [00:05<01:05,  2.32s/it]

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Assembly Business
Executive Committee Business
Question for Urgent Oral Answer
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 2 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
The Executive Office
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 3 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Standing Order 20(1): Suspension
Standing Order 15(1): Suspension
Committee Membership
Northern Ireland Public Services Ombudsman: Nomination
The Executive Committee (Functions) Bill: First Stage
The Health Protection (Coronavirus, Restrictions) (Amendment No. 7) Regulations (Northern Ireland) 2020
Health Protection (Coronavirus, Restrictions) (Amendment No. 8) Regulations (Northern Ireland) 2020
Executive Committee (Functions) Bill: Accelerated Passage
Standing Order 42(1): Suspension
Executive Committee (Functions) Bill: Second Stage
Funeral of Bobby Storey


 10%|█         | 3/30 [00:07<01:01,  2.29s/it]

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Matter of the Day
Assembly Business
Executive Committee Business
Oral Answers to Questions
Ministerial Statement
Executive Committee Business
Private Members' Business
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 2 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Education
Finance
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 3 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Condolences to the Family of Noah Donohoe
Conferral of Functions on the Northern Ireland Assembly Commission
The Health Protection (Coronavirus, Restrictions) (Amendment No. 5) Regulations (Northern Ireland) 2020
Schools: Reopening Arrangements
Schools: Reopening Arrangements
Schools: Social Distancing Guidelines
Youth Organisations: Funding
Autism-specific Learning Centres: Newry and Mourne
SEN: Post-COVID-19 Support
Schools: Reopening Arrangements
School Closures: Long-term 

 13%|█▎        | 4/30 [00:10<01:04,  2.47s/it]

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Executive Committee Business
Assembly Business
Ministerial Statement
Oral Answers to Questions
Question for Urgent Oral Answer
Ministerial Statement
Executive Committee Business
Private Members' Business
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 2 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Communities
Economy
Health
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 3 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Budget (No. 2) Bill: Royal Assent
Committee on Procedures
Temporary Speakers
NSMC Institutional Meeting
Green Growth Strategy and Delivery Framework
Major Capital Works Programme
Advice Sector: Support
Model Engineers' Society NI: Miniature Railway
Social Supermarkets Pilot Programme
Local Government: COVID-19 Support
Social Security Entitlements
Food Poverty
Tenants' Rights: Private Rented Sector
Food Parcels
Sports Sector: COVID-1

 17%|█▋        | 5/30 [00:13<00:59,  2.39s/it]

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Speaker's Business
Assembly Business
Executive Committee Business
Oral Answers to Questions
Executive Committee Business
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 2 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
The late Mr Billy Bell
The Executive Office
Agriculture, Environment and Rural Affairs
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 3 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Committee Membership
Housing (Amendment) Bill: Consideration Stage
Social Security Benefits Up-rating Order (Northern Ireland) 2020
Social Security Benefits Up-rating Regulations (Northern Ireland) 2020
Mesothelioma Lump Sum Payments (Conditions and Amounts) (Amendment) Regulations (Northern Ireland) 2020
COVID-19: Public Communications
Victims' Payment Scheme
Ad Hoc Committee on a Bill of Rights
COVID-19: Executive Response
New Decade, New Approach: Cost

 20%|██        | 6/30 [00:15<00:54,  2.26s/it]

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Assembly Business
Ministerial Statement
Executive Committee Business
Private Members' Business
Questions for Urgent Oral Answer
Private Members' Business
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 2 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Justice
Economy
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 3 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Standing Order 20A: Suspension
Rebuilding HSC Services
Northern Ireland Criminal Injuries Compensation (Amendment 2020) Scheme (2009)
Birmingham Commonwealth Games Bill: Legislative Consent Motion
COVID-19 Pandemic: Support for Sheep and Beef Farmers
Police Enforcement of Belfast Mass Gathering
Job Losses at Thompson Aero Seating
COVID-19 Pandemic: Support for Sheep and Beef Farmers


 23%|██▎       | 7/30 [00:18<00:58,  2.56s/it]

Northern Ireland Prison Service: Staff Stress
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Executive Committee Business
Committee Business
Private Members' Business
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 2 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 3 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
The Health Protection (Coronavirus, Restrictions) (Amendment No. 2) Regulations (Northern Ireland) 2020.
The Health Protection (Coronavirus, Restrictions) (Amendment No. 3) Regulations (Northern Ireland) 2020
Sentencing (Pre-consolidation Amendments) Bill: Legislative Consent Motion
Corporate Insolvency and Governance Bill: Legislative Consent Motion
Budget (No. 2) Bill: Further Consideration Stage
Budget (No. 2) Bill: Final Stage
Domestic Abuse and Family Proceedings Bill: Extension of Committee Stage
EU Withdrawal Transition Period: Extens

 27%|██▋       | 8/30 [00:20<00:53,  2.42s/it]

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Matter of the Day
Executive Committee Business
Private Members' Business
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 2 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 3 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Death of George Floyd
Budget (No. 2) Bill: Consideration Stage
Housing (Amendment) Bill: Accelerated Passage
Housing (Amendment) Bill: Second Stage
Child Support (Miscellaneous Amendments No. 3) Regulations (Northern Ireland) 2019
Child Support (Miscellaneous Amendments No. 4) Regulations (Northern Ireland) 2019
Pension Schemes Bill: Legislative Consent Motion
Planning a Just Economic Recovery after the COVID-19 Crisis


 30%|███       | 9/30 [00:21<00:43,  2.08s/it]

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Assembly Business
Matter of the Day
Committee Business
Ministerial Statement
Question for Urgent Oral Answer
Executive Committee Business
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 2 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 3 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
New Assembly Member: Ms Cara Hunter
Troubles Victims: Pension
Committee Membership
COVID-19 Response
Supply Resolution for the Northern Ireland Estimates Further Vote on Account 2020-2021
Budget (No. 2) Bill: First Stage
Housing (Amendment) Bill: First Stage
Standing Order 42(5): Suspension
The Executive Office
Interim Advocate's Office: Data Breach


 33%|███▎      | 10/30 [00:23<00:37,  1.89s/it]

Budget (No. 2) Bill: Second Stage
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Assembly Business
Ministerial Statements
Executive Committee Business
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 2 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 3 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Committee Membership
COVID-19: Department for Communities Response
COVID-19: Update on the Financial Position
The Health Protection (Coronavirus, Restrictions) (Amendment) Regulations (Northern Ireland) 2020
The Direct Payments to Farmers (Crop Diversification Derogation) Regulations (Northern Ireland) 2020
Private International Law (Implementation of Agreements) Bill: Legislative Consent Motion


 37%|███▋      | 11/30 [00:24<00:32,  1.71s/it]

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Assembly Business
Executive Committee Business
Ministerial Statement
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 2 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 3 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Mr John Dallat MLA
Standing Order 18A(5): Suspension
The Electrically Assisted Pedal Cycles (Construction and Use) Regulations (Northern Ireland) 2020
Coronavirus: Executive Approach to Decision-making


 40%|████      | 12/30 [00:25<00:28,  1.59s/it]

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Assembly Business
Executive Committee Business
Assembly Business
Ministerial Statements
Executive Committee Business
Assembly Business
Executive Committee Business
Committee Business
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 2 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 3 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Committee for the Executive Office: Deputy Chairperson
Private Tenancies (Coronavirus Modifications) Bill: Royal Assent
Standing Order 10(2)(a) and Standing Orders 20 and 20A: Suspension
COVID-19
School Enhancement Programme
Census Order (Northern Ireland) 2020
Mr John Dallat MLA
Census Order (Northern Ireland) 2020
Rates (Regional Rates) Order (Northern Ireland) 2020
Budget 2020-21
Functioning of Government (Miscellaneous Provisions) Bill: Extension of Committee Stage


 43%|████▎     | 13/30 [00:27<00:26,  1.55s/it]

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Assembly Business
Executive Committee Business
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 2 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 3 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Private Tenancies (Coronavirus Modifications) Bill: Consideration Stage
Domestic Abuse and Family Proceedings Bill: Second Stage
Private Tenancies (Coronavirus Modifications) Bill: Final Stage


 47%|████▋     | 14/30 [00:28<00:23,  1.47s/it]

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Assembly Business
Committee Business
Executive Committee Business
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 2 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 3 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Private Tenancies (Coronavirus Modifications) Bill: First Stage
Health Protection (Coronavirus, Restrictions) Regulations (Northern Ireland) 2020
Discretionary Support (Amendment No. 2) (COVID-19) Regulations (Northern Ireland) 2020
The Private Tenancies (Coronavirus Modifications) Bill: Accelerated Passage
Private Tenancies (Coronavirus Modifications) Bill: Second Stage
Standing Orders 31(d), 37, 39(1) and 42(5): Suspension


 50%|█████     | 15/30 [00:29<00:21,  1.44s/it]

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Assembly Business
Matter of the Day
Assembly Business
Ministerial Statement
Executive Committee Business
Committee Business
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 2 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 3 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Budget Bill: Royal Assent
Abortion Regulations
Committee Membership
Assembly Members' Pension Scheme: Trustees
Ad Hoc Committee on COVID-19
Budget
Domestic Abuse and Family Proceedings Bill: First Stage
Agriculture Bill: Legislative Consent Motion
Discretionary Support (Amendment No. 2) (COVID-19) Regulations (Northern Ireland) 2020
Standing Orders


 53%|█████▎    | 16/30 [00:31<00:19,  1.42s/it]

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Executive Committee Business
Ministerial Statement
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 2 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 3 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Coronavirus Bill: Legislative Consent Motion
Discretionary Support (Amendment) (COVID-19) Regulations (Northern Ireland) 2020
The Economy in Light of COVID-19


 57%|█████▋    | 17/30 [00:32<00:17,  1.37s/it]

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Assembly Business
Assembly Business
Executive Committee Business
Ministerial Statements
Committee Business
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 2 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Mr Ivan Davis
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 3 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Committee Membership
Standing Order 10(2)(a) and Standing Orders 20 and 20A: Suspension
Standing Order 15(1): Suspension
Pneumoconiosis, etc., (Workers' Compensation) (Payment of Claims) (Amendment) Regulations (Northern Ireland) 2020
COVID-19 Preparations
Response to COVID-19
Standing Orders 49(2)(a) and 52(2)(a)


 60%|██████    | 18/30 [00:33<00:16,  1.38s/it]

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Assembly Business
Executive Committee Business
Oral Answers to Questions
Questions for Urgent Oral Answer
Executive Committee Business
Private Members' Business
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 2 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Health
Infrastructure
Education
Economy
Finance
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 3 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
COVID-19
Audit Committee: Deputy Chairperson
Question Time
Committee Membership
Standing Orders 10(2) to 10(4): Suspension
Assembly Commission Budget 2020-21
Renewable Heat Incentive Inquiry Report
COVID-19: Cross-border Coordination
COVID-19: Testing
Multi-disciplinary Teams
Infant Mortality
COVID-19: GP Surgeries
COVID-19: Routine GP Services
Health Centre: Carrick and Larne
COVID-19: Communication and Information
COVID-19: Executive Response
COVID-19:

 63%|██████▎   | 19/30 [00:36<00:17,  1.60s/it]

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Matter of the Day
Executive Committee Business
Private Members' Business
Oral Answers to Questions
Private Members' Business
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 2 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Education
Finance
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 3 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Barney Eastwood
Pension Schemes Act 2015 (Transitional Provisions and Appropriate Independent Advice) (Amendment No. 2) Regulations (Northern Ireland) 2019
Independent Review of Education
Childcare Strategy
SEN Statements: Waiting List
Shared Education: Funding
Coronavirus: Schools
Carrickfergus Academy: Investment
Irish-medium Workforce Strategy
Strule Shared Education Campus
Area Planning: SEN Schools
Schools: Medical Interventions
School Trips: DE Ban
Battlefields Project
Exams: Coronavirus Contingency Plans
Corpora

 67%|██████▋   | 20/30 [00:37<00:16,  1.67s/it]

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Assembly Business
Executive Committee Business
Oral Answers to Questions
Question for Urgent Oral Answer
Ministerial Statement
Matter of the Day
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 2 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Communities
Economy
Education
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 3 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Committee Membership
Budget Bill: Final Stage
Bereavement Support Payment (No. 2) Regulations (Northern Ireland) 2019
The Social Security Benefits Up-rating (No. 2) Order (Northern Ireland) 2019
Social Security Benefits Up-rating (No. 2) Regulations (Northern Ireland) 2019
Mesothelioma Lump Sum Payments (Conditions and Amounts) (Amendment No. 2) Regulations (Northern Ireland) 2019
Social Housing: East Antrim
Liquor Licensing: Reform
Regional Stadium Fund: Casement Park
Urban Regeneration

 70%|███████   | 21/30 [00:39<00:15,  1.69s/it]

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Executive Committee Business
Private Members' Business
Oral Answers to Questions
Private Members' Business
Adjournment
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 2 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Justice
Agriculture, Environment and Rural Affairs
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 3 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Budget Bill: Further Consideration Stage
Crime and Older People
Areas of Natural Constraint
Domestic Homicide Reviews
Antisocial Behaviour
Gillen Review: Recommendations
Helen's Law
Driving Offences: Sentencing
Community Policing: DOJ Investment
Terrorist Offenders (Restriction of Early Release) Bill
Stalking: Legislation
Domestic Abuse: Safe Accommodation
Abortion Services: Anti-harassment Measures
Magilligan Prison: Update
Domestic Violence
Mobuoy Dump: Mills Review
Coastal Erosion
Slurry S

 73%|███████▎  | 22/30 [00:41<00:13,  1.69s/it]

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Ministerial Statements
Executive Committee Business
Oral Answers to Questions
Ministerial Statement
Executive Committee Business
Private Members' Business
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 2 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
The Executive Office
Infrastructure
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 3 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Inter-ministerial Group for Environment, Food and Rural Affairs
Afforestation Programme
Budget Bill: Consideration Stage
Legislative Programme
Bill of Rights: Panel of Experts
Northern Ireland Centenary 2021
Protocol on Ireland/Northern Ireland: Executive Commitment
Refugee Integration Strategy: Update
Victims' Payment Scheme: Funding
T:BUC: New Strategy
Maze/Long Kesh Development
NDNA: UK Government Commitments
Coronavirus
RHI Report
Veterans
Irish Language Legislation: 

 77%|███████▋  | 23/30 [00:43<00:12,  1.75s/it]

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Executive Committee Business
Oral Answers to Questions
Executive Committee Business
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 2 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Finance
Health
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 3 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Budget Bill: Second Stage
Fiscal Balance
Lone Pensioner Allowance
Ministerial Code
Shared Prosperity Fund
Victims' Payment Scheme
Dormant Account Fund
Reval2020: Public Houses and Hotels
PSNI: Funding
Brexit: Funding Losses
Victims’ Pension Costs
Social Enterprise
Translation Hub: Update
Skills Deficit: Senior Civil Servants
NDNA: Financial Commitments
Contaminated Blood Scandal
Cancer Strategy: Update
Mental Health Street Triage Project
Muckamore Abbey Hospital: Patients
Health Trusts: Car Parking
Rural GP Practices: Capital Spending
Nurses
Crisis Intervention 

 80%|████████  | 24/30 [00:45<00:10,  1.78s/it]

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Assembly Business
Executive Committee Business
Oral Answers to Questions
Executive Committee Business
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 2 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Economy
Education
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 3 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Bill of Rights: Ad Hoc Committee
Standing Orders 10(2) to 10(4): Suspension
Supply Resolution for the Spring Supplementary Estimates 2019-2020 and Supply Resolution for the Northern Ireland Estimates and Vote on Account 2020-21
Mineral Prospecting
Brexit: Departmental Business Plan
EU Work-life Balance Directive
Northern Ireland Centenary Celebrations
Tuition Fees
Derry City and Strabane District Council: Project Approval
Viewable Media UK Ltd/Grenke
Immigration: Impact on Local Businesses
Climate Emergency
Minor Works Programme: Funding
Sch

 83%|████████▎ | 25/30 [00:46<00:08,  1.72s/it]

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Assembly Business
Matter of the Day
Assembly Business
Committee Business
Oral Answers to Questions
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 2 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
The Executive Office
Justice
Communities
Agriculture, Environment and Rural Affairs
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 3 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Speaker's Rulings
Tributes to Former Members
Harry Gregg OBE
Committee Membership
Sea Fish Licensing Order (Northern Ireland) 2019: Prayer of Annulment
Sea Fishing (Licences and Notices) (Amendment) Regulations (Northern Ireland) 2019: Prayer of Annulment
Executive Subcommittee on Brexit: Update
Paramilitarism: Action Plan
Civil Service Reform
Attorney General: Recruitment Plans
Civic Advisory Panel
NSMC/BIC: Future Meetings
Well-being and Resilience Working Group
Anti-poverty St

 87%|████████▋ | 26/30 [00:48<00:07,  1.90s/it]

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Assembly Business
Executive Committee Business
Private Members' Business
Oral Answers to Questions
Private Members' Business
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 2 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Education
Finance
Health
Infrastructure
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 3 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Mr Francie Brolly
New Assembly Member: Ms Martina Anderson
Speaker's Rulings
Business Committee Membership
Committee Membership
The Pension Schemes Act 2015 (Judicial Pensions) (Consequential Provision No. 2) Regulations (Northern Ireland) 2019
Abuse of Service Animals
NICE Guidance on Fertility
Carrickfergus Academy: New Build
Clifton School, Bangor
Hardy Memorial Primary School
SEN Assessments: Electronic Record-keeping
Academic Selection: CREU Report
Autism Training: Trade Union Discussions
Pea

 90%|█████████ | 27/30 [00:50<00:05,  1.93s/it]

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Speaker's Business
Matter of the Day
Assembly Business
Private Members' Business
Oral Answers to Questions
Private Members' Business
Assembly Business
Private Members' Business
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 2 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Private Members' Bills
The Executive Office
Agriculture, Environment and Rural Affairs
Communities
Economy
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 3 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
European Union Withdrawal
Northern Ireland Public Services Ombudsman
Functioning of Government (Miscellaneous Provisions) Bill: First Stage
Autism Training in Schools
Brexit: Movement of Goods
HIA Victims: Compensation
New Decade, New Approach: Coordination
Brexit: Executive Subcommittee
Regional Trauma Network
Mental Well-being and Resilience: Working Group
Brexit: Ethnic Minorit

 93%|█████████▎| 28/30 [00:53<00:04,  2.04s/it]

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Assembly Business
Ministerial Statement
Executive Committee Business
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 2 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 3 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Standing Order 20(1): Suspension
Tributes to former deputy First Minister Séamus Mallon
Committee for Health: Deputy Chairperson
Assembly Commission: Appointments
Extension of Sitting
Public Expenditure: 2019-2020 January Monitoring Round
Direct Payments to Farmers (Legislative Continuity) Bill: Legislative Consent Motion


 97%|█████████▋| 29/30 [00:54<00:01,  1.85s/it]

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Assembly Business
Executive Committee Business
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 2 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX HEADER 3 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Assembly Commission: Vacancies
Standing Order 20(1): Suspension
Business Committee Membership
Standing Order 49(2)(a) and Standing Order 52(2)(a): Suspension
Statutory Committee Membership
Standing Committee Membership
European Union (Withdrawal Agreement) Bill
European Union (Withdrawal Agreement) Bill: Motion to Delay
European Union (Withdrawal Agreement) Bill


100%|██████████| 30/30 [00:55<00:00,  1.86s/it]


In this step we build the extraction pipeline that stores the scraped data in a fixed DataFrame format, which will help us create our KG and support future NLP analysis.

First we work out how the headless browser can click through the different pages of a single year to extract data.

In [11]:
driver = webdriver.Chrome('chromedriver',options=options)
driver.get(url)


class_pages = driver.find_element_by_class_name('paging')
all_pages = class_pages.find_elements_by_tag_name('a')
final_pages_id = []
for i in range(len(all_pages)):
  final_pages_id.append(all_pages[i].get_attribute("href"))


all_views = driver.find_elements_by_xpath('//*[@href]')     # EXTRACTING SESSION LINK OF A PAGE
final_view_id = []
for i in range(len(all_views)):
  if all_views[i].get_attribute('textContent') == 'View':
    final_view_id.append(all_views[i].get_attribute("id"))
print("Number of Session links on this page: ", len(final_view_id))


page_2_link = driver.find_element_by_xpath("//td/a[text()='2']")   # KEEP A HANDLE TO THE LINK BEFORE CLICKING
page_2_link.click()

WebDriverWait(driver, 10).until(EC.staleness_of(page_2_link))     # THE OLD LINK GOES STALE ONCE THE NEW PAGE HAS LOADED

all_views = driver.find_elements_by_xpath('//*[@href]')     # EXTRACTING SESSION LINK OF A PAGE

final_view_id = []
for j in range(len(all_views)):
  if all_views[j].get_attribute('textContent') == 'View':
    final_view_id.append(all_views[j].get_attribute("id"))
print("Number of Session links on this page: ", len(final_view_id))

driver.quit()

Number of Session links on this page:  30
Number of Session links on this page:  3


In [12]:
# METHOD 2 (using id and clicker)- main
driver = webdriver.Chrome('chromedriver',options=options)
driver.get(url)
ni_dataframe = pd.DataFrame(columns=['date', 'heading_1', 'heading_2', 'heading_3', 'contribution_minister_name', 'contribution', 'procedure', 'motion'])

all_views = driver.find_elements_by_xpath('//*[@href]')     # EXTRACTING SESSION LINK OF A PAGE
final_view_id = []
for i in range(len(all_views)):
  if all_views[i].get_attribute('textContent') == 'View':
    final_view_id.append(all_views[i].get_attribute("id"))
print("Number of Session links on this page: ", len(final_view_id))

try:
  for h in tqdm(final_view_id, total=len(final_view_id)):

    element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.ID, h))
        )
    element.click()
    main = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.TAG_NAME, "main"))
        )

    ni_date = main.find_element_by_tag_name("h1")
    new_row = {'date':'', 'heading_1':'', 'heading_2':'', 'heading_3':'', 'contribution_minister_name':'', 'contribution':'', 'procedure':'', 'motion':''}
    new_row['date'] = ni_date.text

    ni_div = main.find_elements_by_tag_name("div")
    ni_name = ''     # LAST SPEAKER SEEN; REUSED WHEN A CONTRIBUTION HAS NO "NAME:" PREFIX
    
    for i in range(len(ni_div)):
      if ni_div[i].get_attribute('class') == 'Header1':
        new_row['heading_1'] = ni_div[i].text
      if ni_div[i].get_attribute('class') == 'Header2':
        new_row['heading_2'] = ni_div[i].text 
      if ni_div[i].get_attribute('class') == 'Header3':
        new_row['heading_3'] = ni_div[i].text
      if ni_div[i].get_attribute('class') == 'Procedure':
        new_row['procedure'] = new_row['procedure'] + ni_div[i].text +" "
      if ni_div[i].get_attribute('class') == 'Motion':
        new_row['motion'] = new_row['motion'] + ni_div[i].text +" "
      
      if ni_div[i].get_attribute('class') == 'Contribution':
        ni_name_and_contribution = ni_div[i].text
        if (ni_name_and_contribution.find(':') != -1):
          ni_name, ni_contribution = ni_name_and_contribution.split(":", 1)
          new_row['contribution_minister_name'] = ni_name
          new_row['contribution'] = ni_contribution

        else:
          new_row['contribution_minister_name'] = ni_name
          new_row['contribution'] = ni_name_and_contribution

        ni_dataframe = ni_dataframe.append(new_row, ignore_index=True)

        new_row['procedure'] = ''
        new_row['motion'] = ''
          
    driver.back()

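except Exception as e:
    print("error!", e)     # SHOW THE CAUSE INSTEAD OF HIDING IT
finally:
    driver.quit()          # ALWAYS RELEASE THE BROWSER, EVEN ON SUCCESS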

  0%|          | 0/30 [00:00<?, ?it/s]

Number of Session links on this page:  30


100%|██████████| 30/30 [12:25<00:00, 24.85s/it]
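
The row-building logic of the cell above can be separated from Selenium and tried on plain data. The sketch below is a minimal illustration under stated assumptions, not the notebook's exact implementation: `build_rows` is a hypothetical helper, its `(css_class, text)` pairs stand in for the `div` elements of a report page, the sample blocks are made up, and rows are collected in a list instead of being appended to the DataFrame one by one.

```python
def build_rows(date, blocks):
    """Turn (css_class, text) pairs into one row per 'Contribution' block.

    `blocks` is a hypothetical stand-in for the <div> elements of a report
    page: each item is (class attribute, visible text).
    """
    headings = {'Header1': 'heading_1', 'Header2': 'heading_2', 'Header3': 'heading_3'}
    pending = {'Procedure': 'procedure', 'Motion': 'motion'}
    state = {'date': date, 'heading_1': '', 'heading_2': '', 'heading_3': '',
             'contribution_minister_name': '', 'contribution': '',
             'procedure': '', 'motion': ''}
    last_speaker = ''   # reused when a contribution has no "Name:" prefix
    rows = []
    for css_class, text in blocks:
        if css_class in headings:
            state[headings[css_class]] = text          # headings persist across rows
        elif css_class in pending:
            state[pending[css_class]] += text + ' '    # accumulated until the next row
        elif css_class == 'Contribution':
            name, sep, speech = text.partition(':')
            if sep:
                last_speaker = name
            else:
                speech = text                          # continuation by the same speaker
            state['contribution_minister_name'] = last_speaker
            state['contribution'] = speech
            rows.append(dict(state))                   # copy, so later updates do not leak in
            state['procedure'] = ''
            state['motion'] = ''
    return rows


# Made-up sample blocks, for illustration only.
example = [
    ('Header1', 'Assembly Business'),
    ('Contribution', 'Mr Butler: I beg to move'),
    ('Motion', 'That Standing Order 20(1) be suspended.'),
    ('Contribution', 'A second paragraph without a speaker prefix'),
]
rows = build_rows('Official Report: Tuesday 21 July 2020', example)
print(len(rows), rows[1]['contribution_minister_name'])
```

With rows gathered this way, `pd.DataFrame(rows)` would build the frame in a single step once scraping has finished.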


In [13]:
ni_dataframe.head(10)

Unnamed: 0,date,heading_1,heading_2,heading_3,contribution_minister_name,contribution,procedure,motion
0,Official Report: Tuesday 21 July 2020,Assembly Business,,,Mr Deputy Speaker (Mr Beggs),"Before the first item of business, I remind M...",,
1,Official Report: Tuesday 21 July 2020,Assembly Business,,Standing Order 20(1): Suspension,Mr Butler,I beg to move,,
2,Official Report: Tuesday 21 July 2020,Assembly Business,,Standing Order 20(1): Suspension,Mr Deputy Speaker (Mr Beggs),"Before we proceed to the Question, I remind M...",,That Standing Order 20(1) be suspended for 21 ...
3,Official Report: Tuesday 21 July 2020,Executive Committee Business,,"Health Protection (Coronavirus, Restrictions) ...",Mr Deputy Speaker (Mr Beggs),The next two motions are to approve statutory...,Question put and agreed to.\n\nResolved (with ...,That Standing Order 20(1) be suspended for 21 ...
4,Official Report: Tuesday 21 July 2020,Executive Committee Business,,"Health Protection (Coronavirus, Restrictions) ...","Mr Kearney (Junior Minister, The Executive Off...",Éirím leis an rún a chur chun cinn. I beg to ...,,
5,Official Report: Tuesday 21 July 2020,Executive Committee Business,,"Health Protection (Coronavirus, Restrictions) ...",Mr Deputy Speaker (Mr Beggs),The Business Committee has agreed that there ...,,"That the Health Protection (Coronavirus, Restr..."
6,Official Report: Tuesday 21 July 2020,Executive Committee Business,,"Health Protection (Coronavirus, Restrictions) ...",Mr Kearney,There are two motions before the Assembly tod...,,
7,Official Report: Tuesday 21 July 2020,Executive Committee Business,,"Health Protection (Coronavirus, Restrictions) ...",Mr Kearney,"Regrettably, other places have not experienced...",,
8,Official Report: Tuesday 21 July 2020,Executive Committee Business,,"Health Protection (Coronavirus, Restrictions) ...",Mr McGrath (The Chairperson of the Committee f...,I am speaking on behalf of the Committee for ...,,
9,Official Report: Tuesday 21 July 2020,Executive Committee Business,,"Health Protection (Coronavirus, Restrictions) ...",Mr McGrath (The Chairperson of the Committee f...,I will now make a few remarks as an SDLP repre...,,


In [14]:
ni_dataframe.shape

(8996, 8)
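
The `date` column currently holds the page heading as text, for example "Official Report: Tuesday 21 July 2020". For later analysis it may be convenient to convert it to a real date. A minimal sketch, assuming every heading follows the pattern seen in the preview above and that English day and month names are in effect (the default locale); `parse_report_date` is a hypothetical helper name.

```python
from datetime import datetime

PREFIX = 'Official Report: '


def parse_report_date(heading):
    """Convert a heading such as 'Official Report: Tuesday 21 July 2020' to a date."""
    if not heading.startswith(PREFIX):
        raise ValueError(f'unexpected heading format: {heading!r}')
    # %A = weekday name, %d = day of month, %B = month name, %Y = four-digit year
    return datetime.strptime(heading[len(PREFIX):].strip(), '%A %d %B %Y').date()


print(parse_report_date('Official Report: Tuesday 21 July 2020'))
```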

## XXXXXXX WORK IN PROGRESS XXXXXXX

Because the NI Assembly sessions of a given year do not all fit on a single HTML page, the NI Assembly website spreads them across several pages that we can navigate through. Our web scraper should visit each of these pages and apply the same data extraction technique that we developed previously.
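
One way to move between result pages without writing the page number by hand is to build the link locator from the page number. A minimal sketch, assuming the pager links are `<a>` elements inside `<td>` cells whose text is the page number (as the XPath `//td/a[text()='2']` used in this notebook suggests) and that the current page is not itself shown as a link; `page_link_xpath` and `pages_to_visit` are hypothetical helper names.

```python
def page_link_xpath(page_number):
    """XPath for the pager link whose visible text is `page_number`."""
    if page_number < 1:
        raise ValueError('page numbers start at 1')
    return f"//td/a[text()='{page_number}']"


def pages_to_visit(pager_link_count):
    """Page numbers to scrape, given how many pager links the first page shows.

    Assumes the current page is not rendered as a link, so the total number
    of pages is the number of pager links plus one.
    """
    return list(range(1, pager_link_count + 2))


for page in pages_to_visit(1):
    print(page, page_link_xpath(page))
```

Inside a scraping loop, a call such as `driver.find_element_by_xpath(page_link_xpath(page))` could then take the place of a literal page number.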

In [16]:
# METHOD 2 (using id and clicker)- main
driver = webdriver.Chrome('chromedriver',options=options)
driver.get(url)
ni_dataframe = pd.DataFrame(columns=['date', 'heading_1', 'heading_2', 'heading_3', 'contribution_minister_name', 'contribution', 'quote', 'procedure', 'motion'])     # 'quote' ADDED TO MATCH new_row BELOW

#### WORK SPACE FOR SCROLLABLE YEARS


#### WORK SPACE FOR SCROLLABLE YEARS

# We get the number of pages in a particular year (if any)
class_pages = driver.find_element_by_class_name('paging')  
all_pages = class_pages.find_elements_by_tag_name('a')
final_pages_id = []
for h in range(len(all_pages)):
  final_pages_id.append(all_pages[h].get_attribute("href"))
print(len(all_pages))

# MAIN
try:
  for i in range(len(final_pages_id)+1):     # LOOPING THROUGH PAGES

    all_views = driver.find_elements_by_xpath('//*[@href]')     # EXTRACTING ASSEMBLY SESSION LINKS OF A PAGE
    final_view_id = []
    for j in range(len(all_views)):
      if all_views[j].get_attribute('textContent') == 'View':
        final_view_id.append(all_views[j].get_attribute("id"))
        print(all_views[j].get_attribute("href"))
    print("Number of Session links on this page: ", len(final_view_id))
    
    for k in tqdm(final_view_id, total=len(final_view_id)):     # LOOPING THROUGH SESSION LINKS
      element = WebDriverWait(driver, 10).until(
          EC.presence_of_element_located((By.ID, k))
          )
      print(element.get_attribute("href"))
      driver.execute_script("arguments[0].click();", element)
      main = WebDriverWait(driver, 10).until(
          EC.presence_of_element_located((By.TAG_NAME, "main"))
          )

      ni_date = main.find_element_by_tag_name("h1")
      new_row = {'date':'', 'heading_1':'', 'heading_2':'', 'heading_3':'', 'contribution_minister_name':'', 'contribution':'','quote':'', 'procedure':'', 'motion':''}
      new_row['date'] = ni_date.text

      ni_div = main.find_elements_by_tag_name("div")
      ni_name = ''     # LAST SPEAKER SEEN; REUSED WHEN A CONTRIBUTION HAS NO "NAME:" PREFIX
      print(len(ni_div))
      
      for m in range(len(ni_div)):
        if ni_div[m].get_attribute('class') == 'Header1':
          new_row['heading_1'] = ni_div[m].text
        if ni_div[m].get_attribute('class') == 'Header2':
          new_row['heading_2'] = ni_div[m].text 
        if ni_div[m].get_attribute('class') == 'Header3':
          new_row['heading_3'] = ni_div[m].text
        if ni_div[m].get_attribute('class') == 'Procedure':
          new_row['procedure'] = new_row['procedure'] + ni_div[m].text +" "
        if ni_div[m].get_attribute('class') == 'Motion':
          new_row['motion'] = new_row['motion'] + ni_div[m].text +" "
        if ni_div[m].get_attribute('class') == 'Quote':
          new_row['quote'] = ni_div[m].text 
        
        if ni_div[m].get_attribute('class') == 'Contribution':
          ni_name_and_contribution = ni_div[m].text
          if (ni_name_and_contribution.find(':') != -1):
            ni_name, ni_contribution = ni_name_and_contribution.split(":", 1)
            new_row['contribution_minister_name'] = ni_name
            new_row['contribution'] = ni_contribution.rstrip()

          else:
            new_row['contribution_minister_name'] = ni_name
            new_row['contribution'] = ni_name_and_contribution.rstrip()

          ni_dataframe = ni_dataframe.append(new_row, ignore_index=True)

          new_row['procedure'] = ''
          new_row['motion'] = ''
          new_row['quote'] = ''
            
      driver.back()
      if i != 0:
        page_link = driver.find_element_by_xpath("//td/a[text()='2']")     # AFTER GOING BACK, RETURN TO PAGE 2
        page_link.click()
        WebDriverWait(driver, 10).until(EC.staleness_of(page_link))
    if i != len(final_pages_id):
      page_link = driver.find_element_by_xpath("//td/a[text()='2']")       # MOVE ON TO THE NEXT PAGE
      page_link.click()
      WebDriverWait(driver, 10).until(EC.staleness_of(page_link))
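except Exception as e:
    print("error!", e)     # SHOW THE CAUSE INSTEAD OF HIDING IT
finally:
    driver.quit()          # ALWAYS RELEASE THE BROWSER, EVEN ON SUCCESS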

1
http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/07/21&docID=304884
http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/07/07&docID=304152
http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/07/06&docID=304151
http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/06/30&docID=303726
http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/06/23&docID=302713
http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/06/16&docID=302204
http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/06/09&docID=301801
http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/06/02&docID=301413
http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/06/01&docID=301412
http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/05/26&docID=301336
http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/05/19&docID=301084
http://a

  0%|          | 0/30 [00:00<?, ?it/s]

http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/07/21&docID=304884
248


  3%|▎         | 1/30 [00:14<06:51, 14.18s/it]

http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/07/07&docID=304152
714


  7%|▋         | 2/30 [00:58<10:47, 23.11s/it]

http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/07/06&docID=304151
352


 10%|█         | 3/30 [01:17<09:49, 21.85s/it]

http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/06/30&docID=303726
770


 13%|█▎        | 4/30 [02:04<12:45, 29.43s/it]

http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/06/23&docID=302713
610


 17%|█▋        | 5/30 [02:40<13:07, 31.50s/it]

http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/06/16&docID=302204
588


 20%|██        | 6/30 [03:18<13:23, 33.49s/it]

http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/06/09&docID=301801
442


 23%|██▎       | 7/30 [03:44<11:58, 31.24s/it]

http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/06/02&docID=301413
540


 27%|██▋       | 8/30 [04:17<11:41, 31.88s/it]

http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/06/01&docID=301412
275


 30%|███       | 9/30 [04:34<09:32, 27.26s/it]

http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/05/26&docID=301336
413


 33%|███▎      | 10/30 [04:59<08:53, 26.65s/it]

http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/05/19&docID=301084
351


 37%|███▋      | 11/30 [05:21<08:00, 25.28s/it]

http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/05/12&docID=300902
211


 40%|████      | 12/30 [05:35<06:32, 21.83s/it]

http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/05/05&docID=300622
408


 43%|████▎     | 13/30 [06:00<06:24, 22.62s/it]

http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/04/28&docID=300528
314


 47%|████▋     | 14/30 [06:17<05:38, 21.13s/it]

http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/04/21&docID=300445
215


 50%|█████     | 15/30 [06:30<04:41, 18.75s/it]

http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/03/31&docID=299867
213


 53%|█████▎    | 16/30 [06:44<04:00, 17.16s/it]

http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/03/24&docID=299356
253


 57%|█████▋    | 17/30 [06:59<03:36, 16.68s/it]

http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/03/23&docID=299355
179


 60%|██████    | 18/30 [07:10<02:59, 14.99s/it]

http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/03/16&docID=298480
786


 63%|██████▎   | 19/30 [08:09<05:08, 28.06s/it]

http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/03/10&docID=297457
521


 67%|██████▋   | 20/30 [08:43<04:59, 29.97s/it]

http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/03/09&docID=297456
431


 70%|███████   | 21/30 [09:10<04:20, 28.97s/it]

http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/03/03&docID=296420
506


 73%|███████▎  | 22/30 [09:43<04:02, 30.30s/it]

http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/03/02&docID=296419
728


 77%|███████▋  | 23/30 [10:38<04:22, 37.51s/it]

http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/02/25&docID=295484
632


 80%|████████  | 24/30 [11:20<03:53, 38.99s/it]

http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/02/24&docID=295483
373


 83%|████████▎ | 25/30 [11:44<02:52, 34.42s/it]

http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/02/17&docID=294390
543


 87%|████████▋ | 26/30 [12:19<02:18, 34.68s/it]

http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/02/10&docID=293242
539


 90%|█████████ | 27/30 [12:53<01:43, 34.37s/it]

http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/02/03&docID=292480
713


 93%|█████████▎| 28/30 [13:45<01:19, 39.78s/it]

http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/01/27&docID=291486
227


 97%|█████████▋| 29/30 [14:00<00:32, 32.32s/it]

http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/01/20&docID=291366
182


100%|██████████| 30/30 [14:11<00:00, 28.38s/it]
  0%|          | 0/3 [00:00<?, ?it/s]

http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/01/14&docID=290860
http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/01/11&docID=290745
http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2019/10/21&docID=290622
Number of Session links on this page:  3
http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/01/14&docID=290860
390


 33%|███▎      | 1/3 [00:25<00:51, 25.74s/it]

http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/01/11&docID=290745
219


 67%|██████▋   | 2/3 [00:40<00:22, 22.48s/it]

http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2019/10/21&docID=290622
101


100%|██████████| 3/3 [00:48<00:00, 16.18s/it]
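
The session links printed above carry the sitting date and a document id in their query string. These values could be parsed and kept alongside each scraped row, for example to trace a row back to its source page. A small sketch using only the standard library; `parse_report_url` and the output key names are hypothetical.

```python
from urllib.parse import urlparse, parse_qs


def parse_report_url(href):
    """Extract the sitting date and document id from a report link."""
    query = parse_qs(urlparse(href).query)   # the empty pair after '?' is skipped
    return {'eve_date': query['eveDate'][0], 'doc_id': int(query['docID'][0])}


link = 'http://aims.niassembly.gov.uk/officialreport/report.aspx?&eveDate=2020/07/21&docID=304884'
print(parse_report_url(link))
```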


In [17]:
ni_dataframe.to_csv(r'/content/drive/My Drive/Colab Notebooks/QARIK_placement_project/web_scraping/output/ni_assembly_hansard.csv', index = False)

In [18]:
ni_dataframe.shape

(9522, 9)

In [19]:
""" THINGS TO DO & CORRECT:
1) A few contribution blocks are only partially extracted. This appears to be caused by newline characters ("\n") in the text: our model ignores the rest of the text once it detects one.
We need to find a way to combine paragraphs separated by a line break. (Tried rstrip() to remove the "\n"; it didn't change the outcome.)
2) Procedure, motion and quote text appears one cell after where it should be located. (This issue doesn't affect our goal, but we will correct it.)
3) The page number in the JavaScript clicker should be taken from the for-loop variable (i) instead of being hard-coded.
4) Scrolling through years."""
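
For item 1, one possible way to merge paragraphs separated by line breaks is to collapse every run of whitespace, including embedded "\n" characters, into a single space. `rstrip()` only trims the end of a string, which may be why it had no visible effect. A minimal sketch on a made-up string; `join_paragraphs` is a hypothetical helper, and whether it resolves the partial extraction depends on where the text is actually being cut.

```python
def join_paragraphs(text):
    """Collapse newlines and repeated whitespace into single spaces."""
    return ' '.join(text.split())


sample = ' First paragraph.\n\nSecond paragraph.\n'   # made-up example text
print(repr(sample.rstrip()))        # the inner line breaks are still present
print(repr(join_paragraphs(sample)))
```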