# QLD Data Pipeline

This pipeline extracts and prepares Queensland (QLD) donation, lobbyist, and lobbyist client data. \
Its outputs are **qldlobbyist.csv, qldlbClient.csv, qld_donations.csv, g_node.csv and g_edge.csv**, \
which can be found in the 'data' folder of this repository.

## Install and load Packages

### Prerequisites

- A **local clone of the public government_transparency GitHub repository**, from which this notebook is executed.
- A chromedriver version that matches your browser, placed in the working directory. \
Download it from https://chromedriver.chromium.org/downloads
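
A quick way to confirm the driver is in place before running the notebook is sketched below. It assumes the executable is named `chromedriver` (or `chromedriver.exe` on Windows) and sits either in the working directory or on `PATH`. The helper name `find_chromedriver` is hypothetical and not part of the pipeline.

```python
import shutil
from pathlib import Path
from typing import Optional


def find_chromedriver(workdir: str = ".") -> Optional[Path]:
    """Return the path to a chromedriver executable, or None if none is found.

    Looks in `workdir` first, then falls back to the system PATH.
    """
    for name in ("chromedriver", "chromedriver.exe"):
        candidate = Path(workdir) / name
        if candidate.is_file():
            return candidate.resolve()
    on_path = shutil.which("chromedriver")
    return Path(on_path) if on_path else None


driver = find_chromedriver()
print(driver if driver else "chromedriver not found; download a matching version first")
```

This only checks that a driver file exists. It does not verify that the driver version matches the installed browser.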

### Load Common Libraries

In [1]:
import pandas as pd
import numpy as np
from IPython.display import display, HTML
try:
    import scrapy  # scrape webpages
except ImportError:
    # install Scrapy on first run, then retry the import
    !pip install scrapy
    import scrapy
from scrapy.crawler import CrawlerProcess
# text cleaning
import re
# Settings for notebook
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"
# Show Python version
import platform
print("Installed Python version", platform.python_version())

Installed Python version 3.8.8


## Extract & Transform
Scrape each source into a CSV file, then load the CSVs.

### Extract Lobbyist Data

The Register of Lobbyists is a list of professional lobbyists who wish to lobby Government representatives. \
The QLD lobbyist data is extracted from https://lobbyists.integrity.qld.gov.au/register-details/list-companies.aspx

#### middlewares

In [2]:
from scrapy import signals

# useful for handling different item types with a single interface
from itemadapter import is_item, ItemAdapter


class QldlobbyistSpiderMiddleware:
    # Not all methods need to be defined. If a method is not defined,
    # scrapy acts as if the spider middleware does not modify the
    # passed objects.

    @classmethod
    def from_crawler(cls, crawler):
        # This method is used by Scrapy to create your spiders.
        s = cls()
        crawler.signals.connect(s.spider_opened, signal=signals.spider_opened)
        return s

    def process_spider_input(self, response, spider):
        # Called for each response that goes through the spider
        # middleware and into the spider.

        # Should return None or raise an exception.
        return None

    def process_spider_output(self, response, result, spider):
        # Called with the results returned from the Spider, after
        # it has processed the response.

        # Must return an iterable of Request, or item objects.
        for i in result:
            yield i

    def process_spider_exception(self, response, exception, spider):
        # Called when a spider or process_spider_input() method
        # (from other spider middleware) raises an exception.

        # Should return either None or an iterable of Request or item objects.
        pass

    def process_start_requests(self, start_requests, spider):
        # Called with the start requests of the spider, and works
        # similarly to the process_spider_output() method, except
        # that it doesn’t have a response associated.

        # Must return only requests (not items).
        for r in start_requests:
            yield r

    def spider_opened(self, spider):
        spider.logger.info('Spider opened: %s' % spider.name)


class QldlobbyistDownloaderMiddleware:
    # Not all methods need to be defined. If a method is not defined,
    # scrapy acts as if the downloader middleware does not modify the
    # passed objects.

    @classmethod
    def from_crawler(cls, crawler):
        # This method is used by Scrapy to create your spiders.
        s = cls()
        crawler.signals.connect(s.spider_opened, signal=signals.spider_opened)
        return s

    def process_request(self, request, spider):
        # Called for each request that goes through the downloader
        # middleware.

        # Must either:
        # - return None: continue processing this request
        # - or return a Response object
        # - or return a Request object
        # - or raise IgnoreRequest: process_exception() methods of
        #   installed downloader middleware will be called
        return None

    def process_response(self, request, response, spider):
        # Called with the response returned from the downloader.

        # Must either:
        # - return a Response object
        # - return a Request object
        # - or raise IgnoreRequest
        return response

    def process_exception(self, request, exception, spider):
        # Called when a download handler or a process_request()
        # (from other downloader middleware) raises an exception.

        # Must either:
        # - return None: continue processing this exception
        # - return a Response object: stops process_exception() chain
        # - return a Request object: stops process_exception() chain
        pass

    def spider_opened(self, spider):
        spider.logger.info('Spider opened: %s' % spider.name)

#### setup a pipeline

In [3]:
from itemadapter import ItemAdapter

class QldlobbyistPipeline:
    def process_item(self, item, spider):
        return item

#### model for scraped items

In [4]:
class QldlobbyistItem(scrapy.Item):
    # define the fields for your item here like:
    # name = scrapy.Field()
    pass


#### Define the spider

In [5]:
from twisted.internet import reactor
import scrapy
from scrapy.crawler import CrawlerRunner
from scrapy.utils.log import configure_logging

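# crochet runs the Twisted reactor in a background thread,
# so a crawl can be started (and re-run) from a notebook cell
from crochet import setup, wait_for
setup()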

class QldlobbySpider(scrapy.Spider):
    name = 'qldlobbyist'
    allowed_domains = ['lobbyists.integrity.qld.gov.au']
    start_urls = ['https://lobbyists.integrity.qld.gov.au/register-details/list-lobbyists.aspx']
    custom_settings = {
        'DOWNLOAD_DELAY': '0',
        'BOT_NAME': 'qldlobbyist',
        'SPIDER_MODULES': 'qldlobbyist.spiders',
        'NEWSPIDER_MODULE': 'qldlobbyist.spiders',
        'ROBOTSTXT_OBEY': 'False',
        'FEEDS': {
            'data/qldlobbyist.csv': { # csv output
                'format': 'csv',
                'overwrite': True
            }
        }
    }
    def parse(self, response):
        # each list entry links to one lobbyist's detail page (relative href)
        endings = response.xpath('//*[@id="ListView"]/li/a/@href')
        for ending in endings:
            url = 'https://lobbyists.integrity.qld.gov.au/register-details/' + ending.get()
            yield scrapy.Request(url=url, callback=self.parse_client_data)
    
    def parse_client_data(self, response):
        yield {
            'Lobbyist Name' : response.xpath('//*[@id="ctl00_ContentPlaceholder1_lblName"]/text()').get(),
            'Lobbying Firm' : response.xpath('//*[@id="article"]/div/table/tr[2]/td[2]//text()').get(),
            'ABN' : response.xpath('//*[@id="article"]/div/table/tr[3]/td[2]//text()').get(),
            'Position' : response.xpath('//*[@id="article"]/div/table/tr[4]/td[2]//text()').get(),
            }
        
def run_spider():
    """Schedule the qldlobbyist crawl and return its Deferred."""
    process = CrawlerProcess()
    return process.crawl(QldlobbySpider)

#### Start the crawler

In [6]:
run_spider()

2022-10-14 23:25:31 [scrapy.utils.log] INFO: Scrapy 2.6.3 started (bot: scrapybot)
2022-10-14 23:25:31 [scrapy.utils.log] INFO: Versions: lxml 4.6.3.0, libxml2 2.9.10, cssselect 1.1.0, parsel 1.6.0, w3lib 2.0.1, Twisted 22.8.0, Python 3.8.8 (default, Apr 13 2021, 15:08:03) [MSC v.1916 64 bit (AMD64)], pyOpenSSL 22.1.0 (OpenSSL 3.0.5 5 Jul 2022), cryptography 38.0.1, Platform Windows-10-10.0.19041-SP0
2022-10-14 23:25:31 [scrapy.crawler] INFO: Overridden settings:
{'BOT_NAME': 'qldlobbyist',
 'DOWNLOAD_DELAY': '0',
 'NEWSPIDER_MODULE': 'qldlobbyist.spiders',
 'ROBOTSTXT_OBEY': 'False',
 'SPIDER_MODULES': 'qldlobbyist.spiders'}
2022-10-14 23:25:31 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.selectreactor.SelectReactor
2022-10-14 23:25:31 [scrapy.extensions.telnet] INFO: Telnet Password: b9a7944603cc0234
2022-10-14 23:25:31 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.fe

<Deferred at 0x1cc0cae4220>

2022-10-14 23:25:32 [filelock] DEBUG: Attempting to acquire lock 1975902692400 on C:\Users\Owner\anaconda3\lib\site-packages\tldextract\.suffix_cache/publicsuffix.org-tlds\de84b5ca2167d4c83e38fb162f2e8738.tldextract.json.lock
2022-10-14 23:25:32 [filelock] INFO: Lock 1975902692400 acquired on C:\Users\Owner\anaconda3\lib\site-packages\tldextract\.suffix_cache/publicsuffix.org-tlds\de84b5ca2167d4c83e38fb162f2e8738.tldextract.json.lock
2022-10-14 23:25:32 [filelock] DEBUG: Attempting to release lock 1975902692400 on C:\Users\Owner\anaconda3\lib\site-packages\tldextract\.suffix_cache/publicsuffix.org-tlds\de84b5ca2167d4c83e38fb162f2e8738.tldextract.json.lock
2022-10-14 23:25:32 [filelock] INFO: Lock 1975902692400 released on C:\Users\Owner\anaconda3\lib\site-packages\tldextract\.suffix_cache/publicsuffix.org-tlds\de84b5ca2167d4c83e38fb162f2e8738.tldextract.json.lock
2022-10-14 23:25:32 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/l

2022-10-14 23:25:33 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=3190%20%20%20> (referer: None)
2022-10-14 23:25:33 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=2682%20%20%20>
{'Lobbyist Name': 'Leo Zussino', 'Lobbying Firm': 'Suncoast Business Consultants Pty Ltd', 'ABN': '90304761881', 'Position': 'Managing Director'}
2022-10-14 23:25:33 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=2533%20%20%20> (referer: None)
2022-10-14 23:25:33 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=3190%20%20%20>
{'Lobbyist Name': 'Eliza Woods', 'Lobbying Firm': 'BBS Communications Group Pty Ltd', 'ABN': '34 010 899 779', 'Position': 'Consultant'}
2022-10-14 23:25:34 [scrapy.core.

2022-10-14 23:25:35 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=2984%20%20%20> (referer: None)
2022-10-14 23:25:35 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=3043%20%20%20>
{'Lobbyist Name': 'Elizabeth Walker', 'Lobbying Firm': 'GRACosway Pty Ltd', 'ABN': '50 082 123 822', 'Position': 'Associate'}
2022-10-14 23:25:35 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=3140%20%20%20>
{'Lobbyist Name': 'Ethan Wales', 'Lobbying Firm': 'Fifty Acres - The Communications Agency', 'ABN': '29 145 634 224', 'Position': 'Executive Assistant'}
2022-10-14 23:25:35 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=2734%20%20%20> (referer: None)
2022-10-14 23:25:35 [scrapy.core.scra

2022-10-14 23:25:36 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=3220%20%20%20> (referer: None)
2022-10-14 23:25:36 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=3078%20%20%20>
{'Lobbyist Name': 'Edward Strong', 'Lobbying Firm': 'Nexus APAC Pty Ltd', 'ABN': '76 615 655 699', 'Position': 'Consultant'}
2022-10-14 23:25:36 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=2973%20%20%20> (referer: None)
2022-10-14 23:25:36 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=3220%20%20%20>
{'Lobbyist Name': 'Andrew Stewart', 'Lobbying Firm': 'Michelson Alexander Pty Ltd', 'ABN': '50 660 359 866', 'Position': 'Senior Associate'}
2022-10-14 23:25:36 [scrapy.core.engine] DEBUG: Cr

2022-10-14 23:25:38 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=3223%20%20%20>
{'Lobbyist Name': 'Adam Seaton', 'Lobbying Firm': 'Adams + Sparkes Town Planning and Development', 'ABN': '39 290 334 500', 'Position': 'Senior Town Planner'}
2022-10-14 23:25:38 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=2581%20%20%20>
{'Lobbyist Name': 'Russell Scoular', 'Lobbying Firm': 'Chatto Creek Advisory', 'ABN': '34613546142', 'Position': 'Chairman'}
2022-10-14 23:25:38 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=2732%20%20%20> (referer: None)
2022-10-14 23:25:38 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=2890%20%20%20> (referer: None)
2022-10-14 23:25:38 [scrapy.core

2022-10-14 23:25:39 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=2660%20%20%20> (referer: None)
2022-10-14 23:25:39 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=564%20%20%20> (referer: None)
2022-10-14 23:25:40 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=2660%20%20%20>
{'Lobbyist Name': 'Nick Trainor', 'Lobbying Firm': 'Australian Public Affairs', 'ABN': '20 098 705 403', 'Position': 'Director'}
2022-10-14 23:25:40 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=564%20%20%20>
{'Lobbyist Name': 'Peter  Sparkes', 'Lobbying Firm': 'Adams + Sparkes Town Planning and Development', 'ABN': '39 290 334 500', 'Position': 'Director'}
2022-10-14 23:25:40 [scrapy.core.engin

2022-10-14 23:25:41 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=334%20%20%20>
{'Lobbyist Name': 'Damian Power', 'Lobbying Firm': 'Govstrat Pty Ltd', 'ABN': '64 964 952 044', 'Position': 'Government Relations Consultant'}
2022-10-14 23:25:41 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=3202%20%20%20> (referer: None)
2022-10-14 23:25:41 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=3228%20%20%20>
{'Lobbyist Name': 'Hayley Schubert', 'Lobbying Firm': 'Sling and Stone', 'ABN': '87 145 965 466', 'Position': 'Senior Account Director'}
2022-10-14 23:25:41 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=2967%20%20%20> (referer: None)
2022-10-14 23:25:41 [scrapy.core.scra

2022-10-14 23:25:42 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=2673%20%20%20> (referer: None)
2022-10-14 23:25:42 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=3113%20%20%20>
{'Lobbyist Name': 'Avanti Oberoi', 'Lobbying Firm': 'CT Research Strategies Results', 'ABN': '58 101 934 454', 'Position': 'Consultant'}
2022-10-14 23:25:43 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=3191%20%20%20> (referer: None)
2022-10-14 23:25:43 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=2673%20%20%20>
{'Lobbyist Name': 'Fiona Scott', 'Lobbying Firm': 'PremierNational', 'ABN': '71 619 450 841', 'Position': 'Executive Director'}
2022-10-14 23:25:43 [scrapy.core.engine] DEBUG: Cra

2022-10-14 23:25:44 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=2681%20%20%20> (referer: None)
2022-10-14 23:25:44 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=3239%20%20%20> (referer: None)
2022-10-14 23:25:44 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=2681%20%20%20>
{'Lobbyist Name': 'Stephen Milgate', 'Lobbying Firm': 'S A Milgate & Associates Pty Ltd', 'ABN': '54 078 858 401', 'Position': 'Principal and Director'}
2022-10-14 23:25:44 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=1095%20%20%20> (referer: None)
2022-10-14 23:25:44 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id

2022-10-14 23:25:45 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=2937%20%20%20>
{'Lobbyist Name': 'Daniel McDougall', 'Lobbying Firm': 'Daniel McDougall', 'ABN': '22 535 615 464', 'Position': 'Communications Consultant'}
2022-10-14 23:25:45 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=2809%20%20%20> (referer: None)
2022-10-14 23:25:46 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=2641%20%20%20>
{'Lobbyist Name': 'Patrick McClelland', 'Lobbying Firm': 'Porter Novelli Australia Pty Ltd', 'ABN': '40079616050', 'Position': 'Communications Consultant'}
2022-10-14 23:25:46 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=2576%20%20%20> (referer: None)
2022-10-14 23:25:46

2022-10-14 23:25:47 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=3183%20%20%20> (referer: None)
2022-10-14 23:25:47 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=3172%20%20%20>
{'Lobbyist Name': 'Reginald Mickel', 'Lobbying Firm': 'Rowland Pty Ltd', 'ABN': '59 011 033 364', 'Position': 'Strategic Director'}
2022-10-14 23:25:47 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=2718%20%20%20> (referer: None)
2022-10-14 23:25:47 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=3183%20%20%20>
{'Lobbyist Name': 'Nino Lalic', 'Lobbying Firm': 'advico', 'ABN': '52658982471', 'Position': 'Director'}
2022-10-14 23:25:47 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobb

2022-10-14 23:25:48 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=3210%20%20%20> (referer: None)
2022-10-14 23:25:48 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=2597%20%20%20>
{'Lobbyist Name': 'Cameron Jones', 'Lobbying Firm': 'CMAX Advisory', 'ABN': '73 130 740 546', 'Position': 'Government Relations Adviser'}
2022-10-14 23:25:48 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=1097%20%20%20> (referer: None)
2022-10-14 23:25:48 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=3210%20%20%20>
{'Lobbyist Name': 'Alasdair Jeffrey', 'Lobbying Firm': 'Rowland Pty Ltd', 'ABN': '59 011 033 364', 'Position': 'Managing Director'}
2022-10-14 23:25:48 [scrapy.core.engine] DEBUG

2022-10-14 23:25:49 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=2690%20%20%20>
{'Lobbyist Name': 'George Hazim', 'Lobbying Firm': 'Media and Public Affairs', 'ABN': '49622979177', 'Position': 'Director'}
2022-10-14 23:25:49 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=1275%20%20%20> (referer: None)
2022-10-14 23:25:50 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=2525%20%20%20>
{'Lobbyist Name': 'Dolan Hayes', 'Lobbying Firm': 'Empower Pty Ltd', 'ABN': '33633821366', 'Position': 'Director'}
2022-10-14 23:25:50 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=3237%20%20%20> (referer: None)
2022-10-14 23:25:50 [scrapy.core.scraper] DEBUG: Scraped from <200 https://l

2022-10-14 23:25:51 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=2651%20%20%20> (referer: None)
2022-10-14 23:25:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=3026%20%20%20>
{'Lobbyist Name': 'Andrew Hargrave', 'Lobbying Firm': 'CMAX Advisory', 'ABN': '73 130 740 546', 'Position': 'Government Relations Adviser'}
2022-10-14 23:25:51 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=2956%20%20%20> (referer: None)
2022-10-14 23:25:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=2651%20%20%20>
{'Lobbyist Name': 'David Gracey', 'Lobbying Firm': 'Strategic Political Counsel', 'ABN': '24613884763', 'Position': 'Senior Counsel - Health'}
2022-10-14 23:25:51 [scrapy.core.

2022-10-14 23:25:52 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=3114%20%20%20>
{'Lobbyist Name': 'Paula Gelo', 'Lobbying Firm': 'Australian Public Affairs', 'ABN': '20 098 705 403', 'Position': 'Senior Consultant'}
2022-10-14 23:25:52 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=2938%20%20%20> (referer: None)
2022-10-14 23:25:52 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=3049%20%20%20>
{'Lobbyist Name': 'Ross Dennis', 'Lobbying Firm': 'FPL Advisory', 'ABN': '34123819385', 'Position': 'Policy Analyst'}
2022-10-14 23:25:52 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=3132%20%20%20> (referer: None)
2022-10-14 23:25:52 [scrapy.core.scraper] DEBUG: Scraped from 

2022-10-14 23:25:53 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=3244%20%20%20> (referer: None)
2022-10-14 23:25:53 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=230%20%20%20>
{'Lobbyist Name': 'Peter Costantini', 'Lobbying Firm': 'SAS Group', 'ABN': '33 136 520 548', 'Position': 'Managing Director'}
2022-10-14 23:25:53 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=2675%20%20%20> (referer: None)
2022-10-14 23:25:53 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=3244%20%20%20>
{'Lobbyist Name': 'Joshua Copland', 'Lobbying Firm': 'CMAX Advisory', 'ABN': '73 130 740 546', 'Position': 'Government Relations Adviser'}
2022-10-14 23:25:53 [scrapy.core.engine] DEBUG: Craw

2022-10-14 23:25:55 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=3236%20%20%20>
{'Lobbyist Name': 'Alistair Coleman', 'Lobbying Firm': 'SEC Newgate', 'ABN': '38162366056', 'Position': 'Consultant'}
2022-10-14 23:25:55 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=3162%20%20%20> (referer: None)
2022-10-14 23:25:55 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=3259%20%20%20>
{'Lobbyist Name': 'Timothy Cook', 'Lobbying Firm': 'The Civic Partnership', 'ABN': '71652574171', 'Position': 'Senior Account Executive'}
2022-10-14 23:25:55 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=2985%20%20%20> (referer: None)
2022-10-14 23:25:55 [scrapy.core.scraper] DEBUG: Scraped fro

2022-10-14 23:25:56 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=2859%20%20%20> (referer: None)
2022-10-14 23:25:56 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=3252%20%20%20>
{'Lobbyist Name': 'Philippa Bosquet', 'Lobbying Firm': 'London Agency', 'ABN': '73145352147', 'Position': 'Senior Account Manager'}
2022-10-14 23:25:56 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=2798%20%20%20> (referer: None)
2022-10-14 23:25:56 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=2859%20%20%20>
{'Lobbyist Name': 'PAOLO BINI', 'Lobbying Firm': 'CRISIS&COMMS CO', 'ABN': '56 637 786 899', 'Position': 'PARTNER'}
2022-10-14 23:25:56 [scrapy.core.engine] DEBUG: Crawled (200) <GET h

2022-10-14 23:25:57 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=3243%20%20%20>
{'Lobbyist Name': 'Joshua Copland', 'Lobbying Firm': 'CMAX Advisory', 'ABN': '73 130 740 546', 'Position': 'Government Relations Adviser'}
2022-10-14 23:25:57 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=2867%20%20%20> (referer: None)
2022-10-14 23:25:57 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=3256%20%20%20>
{'Lobbyist Name': 'Leon Beswick', 'Lobbying Firm': 'The Civic Partnership', 'ABN': '71652574171', 'Position': 'Managing Partner'}
2022-10-14 23:25:57 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/lobbyist-details.aspx?id=2867%20%20%20>
{'Lobbyist Name': 'Sara Benallack', 'Lobbying Firm': 'BBS Commu
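
The crawl log above shows detail URLs ending in `%20%20%20`, which suggests that the `href` values on the list page carry trailing whitespace. The following is a minimal sketch, assuming that is the cause, of building the detail URL with the standard library after stripping the whitespace. The helper name `build_detail_url` and the constant `BASE_URL` are hypothetical and not part of the notebook.

```python
from urllib.parse import urljoin

BASE_URL = "https://lobbyists.integrity.qld.gov.au/register-details/"


def build_detail_url(href: str, base: str = BASE_URL) -> str:
    """Join a relative href onto the register base URL, dropping stray whitespace."""
    return urljoin(base, href.strip())


print(build_detail_url("lobbyist-details.aspx?id=3190   "))
```

Inside a spider, `response.urljoin(href.strip())` should give a similar result without hard-coding the base URL.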

#### Read the output

In [7]:
# Run this cell only after run_spider() has finished and the CSV has been written.
lobbyist = pd.read_csv('data/qldlobbyist.csv')
lobbyist.head()

| | Lobbyist Name | Lobbying Firm | ABN | Position |
|---|---|---|---|---|
| 0 | Kirby Anderson | PolicyWonks | 50 463 070 316 | Director |
| 1 | Richard Amos | Royce Comm | 91 167 042 408 | Managing Director |
| 2 | Jason Aldworth | The Civic Partnership | 71652574171 | Managing Partner |
| 3 | Cameron Adams | Adams + Sparkes Town Planning and Development | 39 290 334 500 | Managing Director |
| 4 | Karly Abbott | The Inner Circle Strategic Advisory | 75634265311 | Partner |
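
The scraped ABNs are formatted inconsistently: some contain spaces (`50 463 070 316`) and some do not (`71652574171`). If the ABN is later used as a key to match lobbyists with clients, it may help to normalise it first. The sketch below is one possible approach rather than the pipeline's own cleaning step, and the function name `normalise_abn` is hypothetical.

```python
import re
from typing import Optional


def normalise_abn(raw: object) -> Optional[str]:
    """Return the ABN as an 11-digit string, or None if it cannot be parsed."""
    if not isinstance(raw, str):
        # missing values (e.g. NaN read from the CSV) are not strings
        return None
    digits = re.sub(r"\D", "", raw)  # drop spaces and any other non-digits
    return digits if len(digits) == 11 else None


print(normalise_abn("50 463 070 316"))
print(normalise_abn("71652574171"))
```

With the DataFrame loaded above, this could be applied as `lobbyist['ABN'].map(normalise_abn)`. Reading the column as text (for example `pd.read_csv(..., dtype={'ABN': str})`) avoids digit-only values being parsed as numbers.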


### Extract Lobbyist Client Data

This data lists the third-party clients of each lobbyist that currently retain the business to provide paid or unpaid lobbying services. \
The QLD lobbyist client data is extracted from https://lobbyists.integrity.qld.gov.au/register-details/list-clients.aspx

Scraping the lobbyist client data reuses the settings and classes defined for the lobbyist scrape.

#### Define the spider

In [8]:
from scrapy.crawler import CrawlerProcess

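# crochet runs the Twisted reactor in a background thread,
# so a crawl can be started (and re-run) from a notebook cell
from crochet import setup, wait_for
setup()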

class QldlobClientSpider(scrapy.Spider):
    name = 'qldlobbyclient'
    allowed_domains = ['lobbyists.integrity.qld.gov.au']
    start_urls = ['https://lobbyists.integrity.qld.gov.au/register-details/list-clients.aspx']
    custom_settings = {
        'DOWNLOAD_DELAY': '0',
        'BOT_NAME': 'qldlobbyist',
        'SPIDER_MODULES': 'qldlobbyist.spiders',
        'NEWSPIDER_MODULE': 'qldlobbyist.spiders',
        'ROBOTSTXT_OBEY': 'False',
        'FEEDS': {
            'data/qldlbClient.csv': { # csv output
                'format': 'csv',
                'overwrite': True
            }
        }
    }
    def parse(self, response):
        # each list entry links to one client's detail page (relative href)
        endings = response.xpath('//*[@id="ListView"]/li/a/@href')
        for ending in endings:
            url = 'https://lobbyists.integrity.qld.gov.au/register-details/' + ending.get()
            yield scrapy.Request(url=url, callback=self.parse_client_data)
    
    def parse_client_data(self, response):
        yield {
            'Client Name' : response.xpath('//*[@id="ctl00_ContentPlaceholder1_lblCName"]/text()').get(),
            'Lobbyist' : response.xpath('//*[@id="article"]/div/table/tr[2]/td[2]//text()').get(),
            'ABN' : response.xpath('//*[@id="article"]/div/table/tr[3]/td[2]//text()').get(),
            }

def run_spider():
    """Schedule the qldlobbyclient crawl and return its Deferred."""
    process = CrawlerProcess()
    return process.crawl(QldlobClientSpider)
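
Both spiders read their fields from the second column of numbered rows in the same details table, so the XPath strings differ only in the row index. As an optional refactoring sketch, the pattern could be generated by a small helper. The names `detail_row_xpath`, `LOBBYIST_ROWS` and `CLIENT_ROWS` are hypothetical and do not appear in the notebook.

```python
def detail_row_xpath(row: int) -> str:
    """XPath for the value cell (second column) of a given row in the details table."""
    return f'//*[@id="article"]/div/table/tr[{row}]/td[2]//text()'


# Field name -> table row, mirroring the rows used by the two spiders.
LOBBYIST_ROWS = {"Lobbying Firm": 2, "ABN": 3, "Position": 4}
CLIENT_ROWS = {"Lobbyist": 2, "ABN": 3}

for field, row in CLIENT_ROWS.items():
    print(field, "->", detail_row_xpath(row))
```

In a spider callback, the table fields could then be built with `{field: response.xpath(detail_row_xpath(row)).get() for field, row in CLIENT_ROWS.items()}`, leaving only the name field, which is selected by element id, to be handled separately.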

#### Start the crawler

In [9]:
run_spider()

2022-10-14 23:26:38 [scrapy.utils.log] INFO: Scrapy 2.6.3 started (bot: scrapybot)
2022-10-14 23:26:38 [scrapy.utils.log] INFO: Versions: lxml 4.6.3.0, libxml2 2.9.10, cssselect 1.1.0, parsel 1.6.0, w3lib 2.0.1, Twisted 22.8.0, Python 3.8.8 (default, Apr 13 2021, 15:08:03) [MSC v.1916 64 bit (AMD64)], pyOpenSSL 22.1.0 (OpenSSL 3.0.5 5 Jul 2022), cryptography 38.0.1, Platform Windows-10-10.0.19041-SP0
2022-10-14 23:26:38 [scrapy.crawler] INFO: Overridden settings:
{'BOT_NAME': 'qldlobbyist',
 'DOWNLOAD_DELAY': '0',
 'NEWSPIDER_MODULE': 'qldlobbyist.spiders',
 'ROBOTSTXT_OBEY': 'False',
 'SPIDER_MODULES': 'qldlobbyist.spiders'}
2022-10-14 23:26:38 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.selectreactor.SelectReactor
2022-10-14 23:26:38 [scrapy.extensions.telnet] INFO: Telnet Password: 60fccf142a698ecf
2022-10-14 23:26:38 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.fe

<Deferred at 0x1cc0d2cb550>

2022-10-14 23:26:39 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/list-clients.aspx> (referer: None)
2022-10-14 23:26:40 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6925%20%20%20> (referer: None)
2022-10-14 23:26:40 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=4194%20%20%20> (referer: None)
2022-10-14 23:26:40 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6925%20%20%20>
{'Client Name': '1st Group', 'Lobbyist': 'TG Public Affairs', 'ABN': '23630677673'}
2022-10-14 23:26:40 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6586%20%20%20> (referer: None)
2022-10-14 23:26:40 [scrapy.core.engine] DEBUG: Crawled (200) <GET

2022-10-14 23:26:41 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7112%20%20%20> (referer: None)
2022-10-14 23:26:41 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6727%20%20%20>
{'Client Name': 'Zero Mass Water (Australia) Pty Limited', 'Lobbyist': 'GRACosway Pty Ltd', 'ABN': '50 082 123 822'}
2022-10-14 23:26:41 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6895%20%20%20> (referer: None)
2022-10-14 23:26:41 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7112%20%20%20>
{'Client Name': 'Ampol', 'Lobbyist': 'Northstar Public Affairs Pty Ltd', 'ABN': '25 638 744 046'}
2022-10-14 23:26:41 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/reg

2022-10-14 23:26:42 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7044%20%20%20> (referer: None)
2022-10-14 23:26:42 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7514%20%20%20> (referer: None)
2022-10-14 23:26:42 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7044%20%20%20>
{'Client Name': 'WorkHaven Pty Ltd', 'Lobbyist': 'Anacta Strategies Pty Ltd', 'ABN': '64 633 978 677'}
2022-10-14 23:26:42 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6452%20%20%20> (referer: None)
2022-10-14 23:26:42 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7514%20%20%20>
{'Client Name': 'Within Energy Pty Ltd', 'Lo

2022-10-14 23:26:43 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=870%20%20%20> (referer: None)
2022-10-14 23:26:43 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6645%20%20%20> (referer: None)
2022-10-14 23:26:43 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=1458%20%20%20>
{'Client Name': 'Wesfarmers Limited', 'Lobbyist': 'DPG Advisory Solutions Pty Ltd', 'ABN': '14 634 403 115'}
2022-10-14 23:26:43 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=2615%20%20%20>
{'Client Name': 'W & M Carnall Pty Ltd ', 'Lobbyist': 'Adams + Sparkes Town Planning and Development', 'ABN': '39 290 334 500'}
2022-10-14 23:26:43 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.int

2022-10-14 23:26:45 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6459%20%20%20> (referer: None)
2022-10-14 23:26:45 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7229%20%20%20>
{'Client Name': 'Vast Solar Pty Ltd', 'Lobbyist': 'Anacta Strategies Pty Ltd', 'ABN': '64 633 978 677'}
2022-10-14 23:26:45 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=5649%20%20%20> (referer: None)
2022-10-14 23:26:45 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6459%20%20%20>
{'Client Name': 'Varley Rafael Australia', 'Lobbyist': 'Outcomes & Strategies Group', 'ABN': '20 018 352 673'}
2022-10-14 23:26:45 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/reg

2022-10-14 23:26:46 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=867%20%20%20> (referer: None)
2022-10-14 23:26:46 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6643%20%20%20>
{'Client Name': 'TWITTER AUSTRALIA HOLDINGS PTY LIMITED', 'Lobbyist': 'Sling and Stone', 'ABN': '87 145 965 466'}
2022-10-14 23:26:46 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7344%20%20%20> (referer: None)
2022-10-14 23:26:46 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=867%20%20%20>
{'Client Name': 'Tuckeria Pty Ltd', 'Lobbyist': 'Commercial Licensing Specialists', 'ABN': '33 134 318 595'}
2022-10-14 23:26:46 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.

2022-10-14 23:26:47 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6752%20%20%20> (referer: None)
2022-10-14 23:26:47 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6464%20%20%20>
{'Client Name': 'WE Platt', 'Lobbyist': 'Kurrajong Strategic Counsel', 'ABN': '14 625 954 912'}
2022-10-14 23:26:47 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7278%20%20%20> (referer: None)
2022-10-14 23:26:47 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6752%20%20%20>
{'Client Name': 'TheirCare Pty Ltd', 'Lobbyist': 'GRACosway Pty Ltd', 'ABN': '50 082 123 822'}
2022-10-14 23:26:47 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-det

2022-10-14 23:26:48 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7239%20%20%20>
{'Client Name': 'The McKinnon Institute for Political Leadership', 'Lobbyist': 'Barton Deakin', 'ABN': '65 140 067 287'}
2022-10-14 23:26:49 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6005%20%20%20>
{'Client Name': 'The Linden Group Pty Ltd', 'Lobbyist': 'GR Solutions', 'ABN': '28 125 233 543'}
2022-10-14 23:26:49 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6463%20%20%20> (referer: None)
2022-10-14 23:26:49 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6977%20%20%20> (referer: None)
2022-10-14 23:26:49 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/

2022-10-14 23:26:50 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7304%20%20%20> (referer: None)
2022-10-14 23:26:50 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6899%20%20%20> (referer: None)
2022-10-14 23:26:50 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7469%20%20%20>
{'Client Name': 'Test Client 2 Only', 'Lobbyist': 'Test Only', 'ABN': '12 345 678 910'}
2022-10-14 23:26:50 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=610%20%20%20> (referer: None)
2022-10-14 23:26:50 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6753%20%20%20> (referer: None)
2022-10-14 23:26:50 [scrapy.core.scraper] DE

2022-10-14 23:26:51 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7340%20%20%20> (referer: None)
2022-10-14 23:26:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7367%20%20%20>
{'Client Name': 'Swyftx', 'Lobbyist': 'DPG Advisory Solutions Pty Ltd', 'ABN': '14 634 403 115'}
2022-10-14 23:26:51 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6259%20%20%20> (referer: None)
2022-10-14 23:26:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7340%20%20%20>
{'Client Name': 'Sunshine Coast Orthopaedic Group', 'Lobbyist': 'Anacta Strategies Pty Ltd', 'ABN': '64 633 978 677'}
2022-10-14 23:26:51 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/reg

2022-10-14 23:26:52 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6897%20%20%20>
{'Client Name': 'Sunshine Hospice', 'Lobbyist': 'Project Urban Pty Ltd', 'ABN': '97 608 895 923'}
2022-10-14 23:26:52 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=3728%20%20%20> (referer: None)
2022-10-14 23:26:52 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7254%20%20%20>
{'Client Name': 'Story Bridge Adventure Climb', 'Lobbyist': 'Forrester Consulting', 'ABN': '48 697 260 710'}
2022-10-14 23:26:52 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7223%20%20%20> (referer: None)
2022-10-14 23:26:52 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-de

2022-10-14 23:26:53 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6840%20%20%20> (referer: None)
2022-10-14 23:26:53 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=587%20%20%20>
{'Client Name': 'Springfield Land Corporation', 'Lobbyist': 'Govstrat Pty Ltd', 'ABN': '64 964 952 044'}
2022-10-14 23:26:53 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6999%20%20%20> (referer: None)
2022-10-14 23:26:53 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6840%20%20%20>
{'Client Name': 'Spotlight Property Group', 'Lobbyist': 'Hawker Britton', 'ABN': '79 109 681 405'}
2022-10-14 23:26:53 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-detail

2022-10-14 23:26:54 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=3097%20%20%20>
{'Client Name': 'South Sky Assets Pty Ltd ', 'Lobbyist': 'Commercial Licensing Specialists', 'ABN': '33 134 318 595'}
2022-10-14 23:26:54 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6662%20%20%20>
{'Client Name': 'Soldier On', 'Lobbyist': 'CMAX Advisory', 'ABN': '73 130 740 546'}
2022-10-14 23:26:54 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7348%20%20%20> (referer: None)
2022-10-14 23:26:54 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7480%20%20%20>
{'Client Name': 'Smiling Mind', 'Lobbyist': 'Primary Communication', 'ABN': '36617864347'}
2022-10-14 23:26:55 [scrapy.core.scraper] DEBUG

2022-10-14 23:26:56 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7059%20%20%20>
{'Client Name': 'Sign Site Group Pty Ltd', 'Lobbyist': 'Santoro Consulting', 'ABN': '21 131 482 230'}
2022-10-14 23:26:56 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=2709%20%20%20> (referer: None)
2022-10-14 23:26:56 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=3113%20%20%20>
{'Client Name': 'Showbar140 Pty Ltd ', 'Lobbyist': 'Commercial Licensing Specialists', 'ABN': '33 134 318 595'}
2022-10-14 23:26:56 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7027%20%20%20> (referer: None)
2022-10-14 23:26:56 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/regi

2022-10-14 23:26:57 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=564%20%20%20>
{'Client Name': 'Seacliff Developments Pty Ltd', 'Lobbyist': 'GR Solutions', 'ABN': '28 125 233 543'}
2022-10-14 23:26:57 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=4002%20%20%20> (referer: None)
2022-10-14 23:26:57 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7303%20%20%20>
{'Client Name': 'SEA Electric', 'Lobbyist': 'PolicyWonks', 'ABN': '50 463 070 316'}
2022-10-14 23:26:57 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=4639%20%20%20> (referer: None)
2022-10-14 23:26:57 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.a

2022-10-14 23:26:58 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6279%20%20%20>
{'Client Name': 'St Baker Energy Innovation Fund', 'Lobbyist': 'John Short', 'ABN': '62880904917'}
2022-10-14 23:26:58 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6716%20%20%20> (referer: None)
2022-10-14 23:26:58 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=4312%20%20%20>
{'Client Name': 'Samsung Electronics Australia', 'Lobbyist': 'Edelman Australia Pty Ltd', 'ABN': '40 004 846 100'}
2022-10-14 23:26:58 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6775%20%20%20> (referer: None)
2022-10-14 23:26:58 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/regi

2022-10-14 23:27:00 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7506%20%20%20> (referer: None)
2022-10-14 23:27:00 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=3771%20%20%20>
{'Client Name': 'Riverside Marine', 'Lobbyist': 'PolicyWonks', 'ABN': '50 463 070 316'}
2022-10-14 23:27:00 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7427%20%20%20> (referer: None)
2022-10-14 23:27:00 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7506%20%20%20>
{'Client Name': 'Riverside Industrial Sands', 'Lobbyist': 'PolicyWonks', 'ABN': '50 463 070 316'}
2022-10-14 23:27:00 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.

2022-10-14 23:27:01 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7381%20%20%20>
{'Client Name': 'RSM Australia Pty Ltd', 'Lobbyist': 'BBS Communications Group Pty Ltd', 'ABN': '34 010 899 779'}
2022-10-14 23:27:01 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7094%20%20%20>
{'Client Name': 'RG Property', 'Lobbyist': 'Hawker Britton', 'ABN': '79 109 681 405'}
2022-10-14 23:27:01 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7488%20%20%20> (referer: None)
2022-10-14 23:27:01 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7410%20%20%20>
{'Client Name': 'Responsible Wagering Australia Holdings Ltd', 'Lobbyist': 'GRACosway Pty Ltd', 'ABN': '50 082 123 822'}
2022-10-14 23:27:01

2022-10-14 23:27:03 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6263%20%20%20> (referer: None)
2022-10-14 23:27:03 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7399%20%20%20>
{'Client Name': 'Responsible Energy', 'Lobbyist': 'Jutsum Advisory Pty Ltd', 'ABN': '68 973 583 422'}
2022-10-14 23:27:03 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6521%20%20%20>
{'Client Name': 'RateSetter Australia RE Limited', 'Lobbyist': 'GRACosway Pty Ltd', 'ABN': '50 082 123 822'}
2022-10-14 23:27:03 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7190%20%20%20> (referer: None)
2022-10-14 23:27:03 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/registe

2022-10-14 23:27:04 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=4057%20%20%20>
{'Client Name': 'Rainbow Bay Surf Life Savers Supporters Association Inc', 'Lobbyist': 'DWS Hospitality Specialists', 'ABN': '50 113 985 247'}
2022-10-14 23:27:04 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7370%20%20%20> (referer: None)
2022-10-14 23:27:04 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7339%20%20%20>
{'Client Name': 'QER Pty Ltd', 'Lobbyist': 'Anacta Strategies Pty Ltd', 'ABN': '64 633 978 677'}
2022-10-14 23:27:04 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6979%20%20%20> (referer: None)
2022-10-14 23:27:04 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists

2022-10-14 23:27:05 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6937%20%20%20> (referer: None)
2022-10-14 23:27:05 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=4111%20%20%20>
{'Client Name': 'Pratt Holdings Pty Ltd', 'Lobbyist': 'The Fifth Estate', 'ABN': '51 069 838 222'}
2022-10-14 23:27:05 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6605%20%20%20> (referer: None)
2022-10-14 23:27:05 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=2167%20%20%20>
{'Client Name': 'QBE Insurance Australia Ltd', 'Lobbyist': 'Hawker Britton', 'ABN': '79 109 681 405'}
2022-10-14 23:27:05 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/

2022-10-14 23:27:06 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7363%20%20%20> (referer: None)
2022-10-14 23:27:06 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7342%20%20%20>
{'Client Name': 'Pinssar Pty Ltd', 'Lobbyist': 'Anacta Strategies Pty Ltd', 'ABN': '64 633 978 677'}
2022-10-14 23:27:06 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6554%20%20%20> (referer: None)
2022-10-14 23:27:06 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7363%20%20%20>
{'Client Name': 'Plenti Pty Limited', 'Lobbyist': 'GRACosway Pty Ltd', 'ABN': '50 082 123 822'}
2022-10-14 23:27:06 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/clie

2022-10-14 23:27:08 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7495%20%20%20> (referer: None)
2022-10-14 23:27:08 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7049%20%20%20>
{'Client Name': 'PLATEAU PLANTS', 'Lobbyist': 'Walsh Stevens', 'ABN': '18084661441'}
2022-10-14 23:27:08 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7496%20%20%20> (referer: None)
2022-10-14 23:27:08 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7495%20%20%20>
{'Client Name': 'Pathology Technology Australia', 'Lobbyist': 'London Agency', 'ABN': '73145352147'}
2022-10-14 23:27:08 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.

2022-10-14 23:27:09 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7326%20%20%20>
{'Client Name': 'Orion Health', 'Lobbyist': 'WELLS HASLEM MAYHEW STRATEGIC PUBLIC AFFAIRS', 'ABN': '52 159 456 685'}
2022-10-14 23:27:09 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=4469%20%20%20>
{'Client Name': 'Orica Australia Pty Ltd', 'Lobbyist': 'Hawker Britton', 'ABN': '79 109 681 405'}
2022-10-14 23:27:09 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6932%20%20%20> (referer: None)
2022-10-14 23:27:09 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7474%20%20%20> (referer: None)
2022-10-14 23:27:09 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/reg

2022-10-14 23:27:10 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6696%20%20%20> (referer: None)
2022-10-14 23:27:10 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6770%20%20%20> (referer: None)
2022-10-14 23:27:10 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6696%20%20%20>
{'Client Name': 'Nufarm Australia', 'Lobbyist': 'CMAX Advisory', 'ABN': '73 130 740 546'}
2022-10-14 23:27:10 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7277%20%20%20> (referer: None)
2022-10-14 23:27:10 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=6770%20%20%20>
{'Client Name': 'PepsiCo Australia & New Zealand', 'Lobby

2022-10-14 23:27:11 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=3480%20%20%20> (referer: None)
2022-10-14 23:27:11 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7122%20%20%20>
{'Client Name': 'NATIONAL ROAD TRANSPORT ASSOCIATION LIMITED', 'Lobbyist': 'Primary Communication', 'ABN': '36617864347'}
2022-10-14 23:27:11 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7271%20%20%20> (referer: None)
2022-10-14 23:27:11 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7518%20%20%20>
{'Client Name': 'One Park Lane Developments Pty Ltd', 'Lobbyist': 'Jutsum Advisory Pty Ltd', 'ABN': '68 973 583 422'}
2022-10-14 23:27:11 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists

2022-10-14 23:27:13 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=5828%20%20%20> (referer: None)
2022-10-14 23:27:13 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=443%20%20%20>
{'Client Name': 'Moondaze Pty Ltd ', 'Lobbyist': 'Liquor Licensing Consultants', 'ABN': '58 679 042 968'}
2022-10-14 23:27:13 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7483%20%20%20> (referer: None)
2022-10-14 23:27:13 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=5828%20%20%20>
{'Client Name': 'Northrop Grumman International', 'Lobbyist': 'CMAX Advisory', 'ABN': '73 130 740 546'}
2022-10-14 23:27:13 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-

2022-10-14 23:27:14 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7019%20%20%20> (referer: None)
2022-10-14 23:27:14 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=4433%20%20%20>
{'Client Name': 'Mental Health Commission of NSW', 'Lobbyist': 'Primary Communication', 'ABN': '36617864347'}
2022-10-14 23:27:14 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=3435%20%20%20> (referer: None)
2022-10-14 23:27:14 [scrapy.core.scraper] DEBUG: Scraped from <200 https://lobbyists.integrity.qld.gov.au/register-details/client-details.aspx?id=7019%20%20%20>
{'Client Name': 'MediaCom', 'Lobbyist': 'Hawker Britton', 'ABN': '79 109 681 405'}
2022-10-14 23:27:14 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://lobbyists.integrity.qld.gov.au/register-details/client-d

#### Read the output

In [10]:
# Run this cell only after run_spider() has completed.
client = pd.read_csv('data/qldlbClient.csv')
client.head()



| Unnamed: 0 | Client Name | Lobbyist | ABN |
|---|---|---|---|
| 0 | 1st Group | TG Public Affairs | 23630677673 |
| 1 | American Express | SEC Newgate | 38162366056 |
| 2 | Amicus Therapeutics | Ogilvy PR Agency Pty Limited | 89 096 965 794 |
| 3 | ANDHealth | CPR Communications and Public Relations Pty Lt... | ABN 94064357544, ACN 064357544 |
| 4 | AMAZON COMMERCIAL SERVICES PTY LTD | Sling and Stone | 87 145 965 466 |
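
The ABN column in the preview above mixes several formats: bare digits, space-grouped digits, and a combined "ABN ..., ACN ..." string. If these values are later used as keys, one possible clean-up is sketched below. `normalize_abn` is a hypothetical helper and not part of this pipeline; it only extracts an 11-digit number and does not validate the ABN checksum.

```python
import re


def normalize_abn(raw):
    """Return the first 11-digit ABN found in `raw` as a plain digit string.

    Hypothetical helper: returns None for missing values or when no
    11-digit number is present.
    """
    if not isinstance(raw, str):
        return None
    # Join digit groups separated by whitespace, e.g. '50 082 123 822'
    compact = re.sub(r"(?<=\d)\s+(?=\d)", "", raw)
    # Take the first run of exactly 11 digits
    match = re.search(r"(?<!\d)\d{11}(?!\d)", compact)
    return match.group(0) if match else None


for value in ["23630677673", "50 082 123 822", "ABN 94064357544, ACN 064357544"]:
    print(normalize_abn(value))
```

Applied to a DataFrame column, this could be used as `client["ABN"].map(normalize_abn)`, keeping the original column alongside the cleaned one so that unexpected values remain visible.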



### Extract Donations Data

A disclosure return reports all donations, loans, and expenditure incurred for an election campaign.
These must be reported to the Electoral Commission of Queensland (ECQ) under the Electoral Act 1992 and the Local Government Electoral Act 2011.
All disclosures are made through the Electronic Disclosure System (EDS) and are available to the public.

This data is scraped from the EDS (https://disclosures.ecq.qld.gov.au/Map).

The process below scrapes the source for new data, then appends it to the existing data at '**data/qld_donations.csv**'.
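
Two steps in the cell below carry most of the logic: finding the most recent date already stored, so that only newer disclosures are requested, and deciding when the last results page has been reached. The sketch below isolates both using only the standard library. The function names are hypothetical, and the pager text format ("1 - 20 of 57 items") is an assumption about the site, not something confirmed by the notebook.

```python
import re
from datetime import datetime


def latest_filter_date(dates, in_fmt="%d/%m/%Y", out_fmt="%d-%m-%Y"):
    """Return the most recent date in `dates`, formatted for the date filter.

    Missing values (None, NaN, empty strings) are skipped.
    Returns None when no usable date is found.
    """
    parsed = [
        datetime.strptime(d.strip(), in_fmt)
        for d in dates
        if isinstance(d, str) and d.strip()
    ]
    return max(parsed).strftime(out_fmt) if parsed else None


def is_last_page(pager_text):
    """Return True when the pager text indicates the final page.

    Assumes text such as '1 - 20 of 57 items', where the second number is
    the last row shown and the third is the total row count.
    """
    numbers = [int(n) for n in re.findall(r"\d+", pager_text)]
    if len(numbers) < 3:
        # Pager text is not in the assumed shape: stop rather than loop forever.
        return True
    return numbers[1] >= numbers[2]


print(latest_filter_date(["01/09/2022", None, "14/10/2022", "30/12/2021"]))
print(is_last_page("1 - 20 of 57 items"))
print(is_last_page("41 - 57 of 57 items"))
```

Comparing the pager numbers as integers and guarding against an unexpected text shape avoids an `IndexError` or an endless loop if the page layout changes.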

In [11]:
# Import Packages
import time
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
import re
import pandas as pd
import datetime

# Define the browser to use and then open up the donations homepage
driver = webdriver.Chrome()
driver.get('https://disclosures.ecq.qld.gov.au/Map')

# Empty List to collect new urls
recipient_url = []

# Read the existing donations CSV and find the most recent donation date
df_date = pd.DataFrame()
df_donations = pd.read_csv('data/qld_donations.csv')
df_date['date'] = df_donations.date.dropna()
df_date['date'] = pd.to_datetime(df_date['date'], format='%d/%m/%Y')
last_date = max(df_date.date)
# Format the date as dd-mm-YYYY for the site's date filter
newest_date = last_date.date().strftime('%d-%m-%Y')

# Input the last date into the website, then apply filter
date_from = driver.find_element(By.XPATH, "//*[@id='ViewFilter_DateFrom']")
date_from.send_keys(newest_date)
# date_from.send_keys('01-09-2022') # for manual filter
apply_button = driver.find_element(By.XPATH, "//*[@id='maps-form']/div/div/div[2]/div[1]/button")
apply_button.click()
time.sleep(2)

# Collect the recipient URLs from every page of the results table
while True:

    # Select all recipient links in the main table and store their URLs
    recip_urls = driver.find_elements(By.XPATH, "//tr/td[3]/a")
    for url in recip_urls:
        #print(url.get_attribute("href"))
        recipient_url.append(url.get_attribute("href"))

    # Read the pager text at the bottom of the current page and extract its numbers
    page_items_element = driver.find_element(By.XPATH, "//*[@id='map-page-header']/div[2]/div[3]/div/div/div/div/div/div[2]/small")
    page_items = page_items_element.text
    page_item_nos = re.findall(r'\d+', page_items)

    # Go to the next page while the second and third numbers in the pager text differ
    if page_item_nos[1] != page_item_nos[2]:
        # Click the "next" chevron (compound class name, so use a CSS selector)
        button = driver.find_element(By.CSS_SELECTOR, ".fa.fa-chevron-right")
        button.click()
        time.sleep(2)

    else:
        # The numbers match, so this is the last page
        break

# Instantiate all lists        
all_urls = recipient_url
date = []
donor_name = []
donor_type = []
recipient_name = []
recipient_type = []
agent_names = []
gift_type = []
gift_value = []
page_url= []


# Helper: return the value of the first XPath that matches, or None if none match.
# Assigning the element (instead of leaving a bare find_element call) also stops
# the notebook from echoing a WebElement repr for every lookup.
def first_match(xpaths, attribute="value"):
    for xpath in xpaths:
        try:
            element = driver.find_element(By.XPATH, xpath)
        except NoSuchElementException:
            continue
        # attribute=None means read the element's visible text instead
        return element.text if attribute is None else element.get_attribute(attribute)
    return None


# Scrape the pages from each of the just scraped urls
for url in all_urls:
    driver.get(url)

    # Donor
    donor_name.append(first_match(["//*[@class='form-group']/div/input"]))

    # Donor type
    donor_type.append(first_match(["//*[@id='disclosureEntries']/div/div/div[1]/div/span[4]"], attribute=None))

    # Recipient: try the elector, political party, and generic title fields in turn
    recipient_name.append(first_match([
        "//*[@id='Head_ElectorFullName']",
        "//*[@id='Head_PoliticalPartyTitle']",
        "//*[@id='Head_Title']",
    ]))

    # Recipient type (taken from the page heading)
    recipient_type.append(first_match(["//*[@id='content']/div/div[1]/div/div/div/div/div[1]/h1"], attribute=None))

    # Date
    date.append(first_match(["//*[@class='form-control datepicker gift-date-received special-reporting-event-check-trigger']"]))

    # Gift value
    gift_value.append(first_match(["//*[@class='form-control currencyFormat text-right gift-amount special-reporting-event-check-trigger']"]))

    # Gift type
    gift_type.append(first_match(["//*[@id='disclosureEntries']/div/div/div[1]/div/span[2]"], attribute=None))

    # Agent names: try the representative field first, then the agent field
    agent_names.append(first_match([
        "//*[@id='Head_RepresentativeFullName']",
        "//*[@id='Head_AgentFullName']",
    ]))

    # Original url
    page_url.append(driver.current_url)

    # Short pause between page loads
    time.sleep(0.5)
    
    
# Create dataframe for the newly scraped pages
df = pd.DataFrame(list(zip(date, donor_name, donor_type, recipient_name, recipient_type, agent_names, gift_type, gift_value, page_url)), 
                  columns = ["date", "donor_name", "donor_type", "recipient_name", "recipient_type", "agent_names", "gift_type", "gift_value", "page_url"])

# Clean any whitespaces
df['recipient_name'] = df['recipient_name'].str.strip()
df['donor_name'] = df['donor_name'].str.strip()
df['agent_names'] = df['agent_names'].str.strip()

# Concatenate the new data (first) with the existing dataset, so that the newly
# scraped row is the one kept when a page_url appears in both
df_concat = pd.concat([df, df_donations], ignore_index=True)
df_dupes_dropped = df_concat.drop_duplicates(subset=['page_url'])
# Drop rows where no gift value could be scraped
df_dupes_dropped = df_dupes_dropped[df_dupes_dropped['gift_value'].notna()]

# Save new file
df_dupes_dropped.to_csv('data/qld_donations.csv', header=True, index=False)

# close the browser
driver.quit()
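
The pagination loop in the cell above decides when to stop from the numbers in the page counter text. A minimal sketch of that check in isolation follows. It assumes the counter reads something like `1 - 10 of 35 items`; the exact wording on the EDS site is an assumption, and `is_last_page` is a hypothetical helper, not part of the notebook.

```python
import re


def is_last_page(counter_text):
    """Return True when the last item shown reaches the total, or the counter cannot be read."""
    numbers = [int(n) for n in re.findall(r"\d+", counter_text)]
    if len(numbers) < 3:
        # No usable counter (for example, an empty result set): stop paginating
        return True
    last_shown, total = numbers[1], numbers[2]
    return last_shown >= total


print(is_last_page("1 - 10 of 35 items"))   # more pages remain
print(is_last_page("31 - 35 of 35 items"))  # final page
```

Comparing integers rather than the raw strings avoids surprises if the site ever pads the numbers. A counter that uses thousands separators would still need extra handling.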

In [12]:
donations = df_dupes_dropped
donations.head()

Unnamed: 0,date,donor_name,donor_type,recipient_name,recipient_type,agent_names,gift_type,gift_value,page_url
0,7/10/2022,Ellen MacQueen,an individual,Queensland Greens,Agent for political party,SUSAN ETHERIDGE,a Gift,1000.0,https://disclosures.ecq.qld.gov.au/forms/recip...
2,6/10/2022,JENNIFER ANNE HORSBURGH,an individual,Animal Justice Party (Queensland),Agent for political party,Lindon Cox,a Gift,300.0,https://disclosures.ecq.qld.gov.au/forms/recip...
3,5/10/2022,JONATHAN SRIRANGANATHAN,an individual,Queensland Greens,Agent for political party,SUSAN ETHERIDGE,a Gift,5177.0,https://disclosures.ecq.qld.gov.au/forms/recip...
4,30/09/2022,AUSTRALIAN GREENS THE GREENS INCORPORATED,a corporation,Queensland Greens,Agent for political party,SUSAN ETHERIDGE,a Gift,185440.0,https://disclosures.ecq.qld.gov.au/forms/recip...
5,30/09/2022,Roger Welch,an individual,Liberal National Party of Queensland,Agent for political party,MICHAEL NEGEREVICH,a Gift,1000.0,https://disclosures.ecq.qld.gov.au/forms/recip...



## Pre-processing
Aggregate each dataset into an edge list with 'source', 'target', and weight columns.

In [13]:
# lobbyist: one edge per (firm, lobbyist) pair, weighted by the number of rows
ag_lobbyist = lobbyist.groupby(['Lobbying Firm','Lobbyist Name']).size().reset_index(name='edge weight')
ag_lobbyist = ag_lobbyist.rename(columns={'Lobbying Firm': 'source', 'Lobbyist Name': 'target'})
ag_lobbyist['node weight'] = 1 # default weight
ag_lobbyist['color'] = '#BEBEBE'
ag_lobbyist['source type'] = 'Lobbying Firm'
ag_lobbyist['target type'] = 'Lobbyist'
ag_lobbyist.head()

Unnamed: 0,source,target,edge weight,node weight,color,source type,target type
0,Adams + Sparkes Town Planning and Development,Adam Seaton,1,1,#BEBEBE,Lobbying Firm,Lobbyist
1,Adams + Sparkes Town Planning and Development,Cameron Adams,1,1,#BEBEBE,Lobbying Firm,Lobbyist
2,Adams + Sparkes Town Planning and Development,Michael Lyell,1,1,#BEBEBE,Lobbying Firm,Lobbyist
3,Adams + Sparkes Town Planning and Development,Peter Sparkes,1,1,#BEBEBE,Lobbying Firm,Lobbyist
4,Alistair Nicholas Consulting Pty Ltd,Alistair Nicholas,1,1,#BEBEBE,Lobbying Firm,Lobbyist
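
The aggregation pattern in the cell above can be tried on a small made-up table. This is only a minimal sketch: the firm and lobbyist names are hypothetical, and `build_edge_list` is an illustrative helper that does not exist in this notebook.

```python
import pandas as pd

def build_edge_list(df, source_col, target_col, source_type, target_type):
    """Count rows per (source, target) pair and label both ends of the edge."""
    edges = (
        df.groupby([source_col, target_col])
          .size()
          .reset_index(name="edge weight")
          .rename(columns={source_col: "source", target_col: "target"})
    )
    edges["node weight"] = 1      # default weight, as in the cell above
    edges["color"] = "#BEBEBE"    # placeholder grey
    edges["source type"] = source_type
    edges["target type"] = target_type
    return edges

# Hypothetical toy data (not taken from the register)
toy = pd.DataFrame({
    "Lobbying Firm": ["Firm A", "Firm A", "Firm A", "Firm B"],
    "Lobbyist Name": ["Ann", "Ann", "Bob", "Cy"],
})
toy_edges = build_edge_list(toy, "Lobbying Firm", "Lobbyist Name",
                            "Lobbying Firm", "Lobbyist")
print(toy_edges)
```

A pair that appears twice in the input ends up as one edge with an edge weight of 2, which is how repeated register entries would show up in the real data.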



In [14]:
# client: one edge per (lobbyist, client) pair, weighted by the number of rows
ag_client = client.groupby(['Lobbyist','Client Name']).size().reset_index(name='edge weight')
ag_client = ag_client.rename(columns={'Lobbyist': 'source', 'Client Name': 'target'})
ag_client['node weight'] = 1 # default weight
ag_client['color'] = '#BEBEBE'
ag_client['source type'] = 'Lobbyist'
ag_client['target type'] = 'Client'
ag_client.head()

Unnamed: 0,source,target,edge weight,node weight,color,source type,target type
0,Adams + Sparkes Town Planning and Development,W & M Carnall Pty Ltd,1,1,#BEBEBE,Lobbyist,Client
1,Anacta Strategies Pty Ltd,Plenary Group,1,1,#BEBEBE,Lobbyist,Client
2,Anacta Strategies Pty Ltd,Poseidon Sea Pilots Pty Ltd,1,1,#BEBEBE,Lobbyist,Client
3,Anacta Strategies Pty Ltd,QER Pty Ltd,1,1,#BEBEBE,Lobbyist,Client
4,Anacta Strategies Pty Ltd,Queensland Motorways Pty Ltd,1,1,#BEBEBE,Lobbyist,Client



In [15]:
# donations
donations["gift_value"] = pd.to_numeric(donations["gift_value"])
# total gift value ('sum') and number of gifts ('count') per donor-recipient pair
ag_donations = donations.groupby(['donor_name','recipient_name','donor_type','recipient_type'])['gift_value'].agg(['sum', 'count']).reset_index()
ag_donations = ag_donations.rename(columns={'donor_name': 'source', 'recipient_name': 'target',
                                            'donor_type': 'source type', 'recipient_type': 'target type',
                                            'sum': 'node weight', 'count': 'edge weight'})
ag_donations['color'] = '#BEBEBE'

ag_donations = ag_donations[['source', 'target', 'edge weight', 'node weight', 'color',
                             'source type', 'target type']]

# transform donor type
# (assign the result back instead of calling replace(..., inplace=True) on a column selection)
ag_donations["source type"] = ag_donations["source type"].replace({
    "a corporation": "Corporation", "an individual": "Individual",
    "another type of entity": "Other", "other": "Other",
    "a trust fund or foundation": "Trust/Foundation",
    "an unincorporated association": "Unincorporated Association"})

ag_donations.head()


Unnamed: 0,source,target,edge weight,node weight,color,source type,target type
0,141 Abbott Street Pty Ltd,Australian Labor Party (State of Queensland),1,2000.0,#BEBEBE,Corporation,Agent for political party
1,188 Edward Pty Ltd,Liberal National Party of Queensland,1,2000.0,#BEBEBE,Corporation,Agent for political party
2,1st Class Food Pty Ltd,Liberal National Party of Queensland,2,12000.0,#BEBEBE,Corporation,Agent for political party
3,235L Projects Pty Ltd,Liberal National Party of Queensland,1,5500.0,#BEBEBE,Corporation,Agent for political party
4,24 Kurilpa Street West End Pty Ltd,Liberal National Party of Queensland,2,5000.0,#BEBEBE,Corporation,Agent for political party
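
To see how the donation aggregation turns individual gifts into weights, here is a minimal sketch on made-up rows. The donor and party names are hypothetical, and only two of the donor-type labels are mapped; it mirrors the `sum`/`count` aggregation above rather than reproducing the full cell.

```python
import pandas as pd

# Hypothetical toy donations; gift values arrive as strings, as they might from a CSV
toy_donations = pd.DataFrame({
    "donor_name": ["Donor X", "Donor X", "Donor Y"],
    "recipient_name": ["Party P", "Party P", "Party Q"],
    "donor_type": ["a corporation", "a corporation", "an individual"],
    "gift_value": ["1000", "250.5", "300"],
})
toy_donations["gift_value"] = pd.to_numeric(toy_donations["gift_value"])

# Sum of gifts becomes the node weight; number of gifts becomes the edge weight
toy_agg = (
    toy_donations
    .groupby(["donor_name", "recipient_name", "donor_type"])["gift_value"]
    .agg(["sum", "count"])
    .reset_index()
    .rename(columns={"sum": "node weight", "count": "edge weight"})
)

# Tidy the donor-type labels by assigning the replaced column back
toy_agg["donor_type"] = toy_agg["donor_type"].replace(
    {"a corporation": "Corporation", "an individual": "Individual"}
)
print(toy_agg)
```

Here "Donor X" gave twice to "Party P", so that pair gets an edge weight of 2 and a node weight equal to the combined gift value.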



## Stack all data and output
Stack the lobbyist, client, and donation edge lists into a single table (qld_networkg.csv).

In [27]:
map_dat = pd.concat([ag_lobbyist,ag_client,ag_donations],ignore_index=True)
map_dat.head()

Unnamed: 0,source,target,edge weight,node weight,color,source type,target type
0,Adams + Sparkes Town Planning and Development,Adam Seaton,1,1,#BEBEBE,Lobbying Firm,Lobbyist
1,Adams + Sparkes Town Planning and Development,Cameron Adams,1,1,#BEBEBE,Lobbying Firm,Lobbyist
2,Adams + Sparkes Town Planning and Development,Michael Lyell,1,1,#BEBEBE,Lobbying Firm,Lobbyist
3,Adams + Sparkes Town Planning and Development,Peter Sparkes,1,1,#BEBEBE,Lobbying Firm,Lobbyist
4,Alistair Nicholas Consulting Pty Ltd,Alistair Nicholas,1,1,#BEBEBE,Lobbying Firm,Lobbyist


#### Customise colour

In [28]:
# custom colours
map_dat.loc[map_dat['source type'] == 'Lobbying Firm', 'color'] = "#9d94ff" # purple
map_dat.loc[map_dat['source type'] == 'Lobbyist', 'color'] = "#c780e8" # purple alt
map_dat.loc[map_dat['source type'] == 'Corporation', 'color'] = "#ff6961" # red
map_dat.loc[map_dat['source type'] == 'Individual', 'color'] = "#42d6a4" # green
map_dat.loc[map_dat['source type'] == 'Other', 'color'] = "#BEBEBE" # grey 
map_dat.loc[map_dat['source type'] == 'Trust/Foundation', 'color'] = "#59adf6" # blue
map_dat.loc[map_dat['source type'] == 'Unincorporated Association', 'color'] = "#ffb480" # orange
map_dat = map_dat.rename(columns={'color': 'source color'})

map_dat.loc[map_dat['target type'] == 'Lobbyist', 'target color'] = "#c780e8" # purple alt
map_dat.loc[map_dat['target type'] == 'Client', 'target color'] = "#ff6961" # red
map_dat.loc[map_dat['target type'] == 'Agent for political party', 'target color'] = "#08cad1" # aqua
map_dat.loc[map_dat['target type'] == 'State candidate', 'target color'] = "#0565f7" # deep blue 
map_dat.loc[map_dat['target type'] == 'Agent for state candidate', 'target color'] = "#08cad1" # aqua
map_dat.loc[map_dat['target type'] == 'Organisation', 'target color'] = "#ff6961" # red

map_dat.head()

Unnamed: 0,source,target,edge weight,node weight,source color,source type,target type,target color
0,Adams + Sparkes Town Planning and Development,Adam Seaton,1,1,#9d94ff,Lobbying Firm,Lobbyist,#c780e8
1,Adams + Sparkes Town Planning and Development,Cameron Adams,1,1,#9d94ff,Lobbying Firm,Lobbyist,#c780e8
2,Adams + Sparkes Town Planning and Development,Michael Lyell,1,1,#9d94ff,Lobbying Firm,Lobbyist,#c780e8
3,Adams + Sparkes Town Planning and Development,Peter Sparkes,1,1,#9d94ff,Lobbying Firm,Lobbyist,#c780e8
4,Alistair Nicholas Consulting Pty Ltd,Alistair Nicholas,1,1,#9d94ff,Lobbying Firm,Lobbyist,#c780e8
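
The per-type `.loc` assignments above could also be written as a lookup table. The following is a sketch of that alternative, not the notebook's method: it uses only a subset of the palette, a made-up three-row frame, and the hypothetical names `SOURCE_COLOURS` and `DEFAULT_COLOUR`.

```python
import pandas as pd

# Subset of the palette used above, keyed by source type
SOURCE_COLOURS = {
    "Lobbying Firm": "#9d94ff",  # purple
    "Lobbyist": "#c780e8",       # purple alt
    "Corporation": "#ff6961",    # red
    "Individual": "#42d6a4",     # green
}
DEFAULT_COLOUR = "#BEBEBE"       # grey for any type missing from the dict

# Hypothetical toy frame
toy = pd.DataFrame({"source type": ["Lobbying Firm", "Individual", "Other"]})

# map() looks each type up in the dict; unmatched types become NaN, then grey
toy["source color"] = toy["source type"].map(SOURCE_COLOURS).fillna(DEFAULT_COLOUR)
print(toy)
```

One difference worth noting: with this approach every unmatched type falls back to grey, whereas the `.loc` version leaves unmatched target types without a colour.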


In [31]:
map_dat = map_dat[['source', 'target', 'edge weight', 'node weight',
                   'source type', 'target type', 'source color', 'target color']]

map_dat["edge weight"] = map_dat["edge weight"].astype(int)
map_dat["node weight"] = map_dat["node weight"].astype(int)
map_dat.head()

Unnamed: 0,source,target,edge weight,node weight,source type,target type,source color,target color
0,Adams + Sparkes Town Planning and Development,Adam Seaton,1,1,Lobbying Firm,Lobbyist,#9d94ff,#c780e8
1,Adams + Sparkes Town Planning and Development,Cameron Adams,1,1,Lobbying Firm,Lobbyist,#9d94ff,#c780e8
2,Adams + Sparkes Town Planning and Development,Michael Lyell,1,1,Lobbying Firm,Lobbyist,#9d94ff,#c780e8
3,Adams + Sparkes Town Planning and Development,Peter Sparkes,1,1,Lobbying Firm,Lobbyist,#9d94ff,#c780e8
4,Alistair Nicholas Consulting Pty Ltd,Alistair Nicholas,1,1,Lobbying Firm,Lobbyist,#9d94ff,#c780e8


#### Output network map data
Output g_node.csv and g_edge.csv for the Gephi network map.

#### Nodes

In [40]:
# extract nodes
# source nodes (.copy() avoids pandas' SettingWithCopyWarning when adding 'Label')
source_node = map_dat[['source', 'source type', 'node weight', 'source color']].copy()
source_node['Label'] = '[' + source_node['node weight'].astype(str) + '] ' + source_node['source']
# target nodes
target_node = map_dat[['target', 'target type', 'node weight', 'target color']].copy()
target_node['Label'] = '[' + target_node['node weight'].astype(str) + '] ' + target_node['target']
target_node = target_node.rename(columns={'target': 'source', 'target type': 'source type',
                                          'target color': 'source color'})
# merge source and target nodes into one table
nodes = pd.concat([source_node, target_node], ignore_index=True)
nodes = nodes.rename(columns={'source': 'node', 'source type': 'node type',
                              'source color': 'node color'})

# output (forward slash works on Windows, macOS, and Linux)
nodes.to_csv('data/g_node.csv', encoding='utf-8', index=False)
nodes.head()


Unnamed: 0,node,node type,node weight,node color,Label
0,Adams + Sparkes Town Planning and Development,Lobbying Firm,1,#9d94ff,[1] Adams + Sparkes Town Planning and Development
1,Adams + Sparkes Town Planning and Development,Lobbying Firm,1,#9d94ff,[1] Adams + Sparkes Town Planning and Development
2,Adams + Sparkes Town Planning and Development,Lobbying Firm,1,#9d94ff,[1] Adams + Sparkes Town Planning and Development
3,Adams + Sparkes Town Planning and Development,Lobbying Firm,1,#9d94ff,[1] Adams + Sparkes Town Planning and Development
4,Alistair Nicholas Consulting Pty Ltd,Lobbying Firm,1,#9d94ff,[1] Alistair Nicholas Consulting Pty Ltd
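
The node table above repeats a node once for every edge it appears in. If one row per node is preferred before importing into Gephi, duplicates could be dropped first. This is a hedged sketch on made-up data; whether the downstream Gephi workflow needs it is an assumption, and keeping the first row per node is just one possible rule.

```python
import pandas as pd

# Hypothetical toy node table with a repeated node
toy_nodes = pd.DataFrame({
    "node": ["Firm A", "Firm A", "Ann", "Bob"],
    "node type": ["Lobbying Firm", "Lobbying Firm", "Lobbyist", "Lobbyist"],
    "node weight": [1, 1, 1, 1],
})

# Keep the first occurrence of each node name and renumber the index
unique_nodes = toy_nodes.drop_duplicates(subset="node").reset_index(drop=True)
print(unique_nodes)
```

For donors, whose node weight differs from row to row, summing or taking the maximum per node may be a better choice than keeping the first row.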


#### Edges

In [41]:
edges = map_dat[['source', 'target', 'edge weight', 'source type', 'target type', 'source color', 'target color']]
edges = edges.rename(columns={'edge weight': 'Weight'})
# output (forward slash works on Windows, macOS, and Linux)
edges.to_csv('data/g_edge.csv', encoding='utf-8', index=False)
edges.head()

Unnamed: 0,source,target,Weight,source type,target type,source color,target color
0,Adams + Sparkes Town Planning and Development,Adam Seaton,1,Lobbying Firm,Lobbyist,#9d94ff,#c780e8
1,Adams + Sparkes Town Planning and Development,Cameron Adams,1,Lobbying Firm,Lobbyist,#9d94ff,#c780e8
2,Adams + Sparkes Town Planning and Development,Michael Lyell,1,Lobbying Firm,Lobbyist,#9d94ff,#c780e8
3,Adams + Sparkes Town Planning and Development,Peter Sparkes,1,Lobbying Firm,Lobbyist,#9d94ff,#c780e8
4,Alistair Nicholas Consulting Pty Ltd,Alistair Nicholas,1,Lobbying Firm,Lobbyist,#9d94ff,#c780e8


### Only run the cells below if you wish to open the data folder.

In [42]:
# open data folder on Windows
import os
path = "data"

if platform.system() == "Windows":
    path = os.path.realpath(path)
    os.startfile(path)  # os.startfile is only available on Windows

In [44]:
platform.system()

'Windows'

In [43]:
# for Mac
import subprocess
import os
path = "data"
if os.path.exists(path):
    subprocess.call(["open", path])
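
The Windows and Mac cells above could be folded into one helper that picks the right call from `platform.system()`. This is only a sketch: `folder_open_command` and `open_folder` are hypothetical names, and the use of `xdg-open` on Linux is an assumption about the desktop environment rather than something this notebook covers.

```python
import os
import platform
import subprocess

def folder_open_command(path, system=None):
    """Return the command that opens `path`, or None where no command is used.

    On Windows the helper below calls os.startfile instead of a command.
    """
    system = system or platform.system()
    if system == "Darwin":   # macOS
        return ["open", path]
    if system == "Linux":    # assumption: xdg-open is installed
        return ["xdg-open", path]
    return None

def open_folder(path="data"):
    """Open `path` in the system file browser if it exists."""
    path = os.path.realpath(path)
    if not os.path.exists(path):
        print(f"Folder not found: {path}")
        return
    if platform.system() == "Windows":
        os.startfile(path)   # Windows-only API
    else:
        cmd = folder_open_command(path)
        if cmd is not None:
            subprocess.call(cmd)
```

Calling `open_folder("data")` from the notebook's working directory would then behave like whichever of the two cells matches the current operating system.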