# Capital IQ Webscraping | Key Developments

A demonstration of scraping Key Developments from the Capital IQ website.

In [1]:
from selenium.webdriver.common.keys import Keys
from selenium import webdriver
from bs4 import BeautifulSoup
import pandas as pd
import lxml
import getpass

### Website and url parameters

In [2]:
company = 18527 # ABB, Ltd
date_range = 'y1' # options include: w1, d30, m3, m6, y1, y2, y3, y5, all
url = 'https://www.capitaliq.com/CIQDotNet/KeyDevs/KeyDevelopments.aspx?companyId={}&selDateRangeOption={}'
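
As a minimal sketch, the query string can also be assembled with `urllib.parse.urlencode` rather than positional placeholders, which keeps each parameter name next to its value. The helper name `build_key_dev_url` is hypothetical; the base URL and parameter names are taken from the template above.

```python
from urllib.parse import urlencode

BASE_URL = 'https://www.capitaliq.com/CIQDotNet/KeyDevs/KeyDevelopments.aspx'

def build_key_dev_url(company_id, date_range='y1'):
    """Return the Key Developments URL for a company id and date range."""
    query = urlencode({'companyId': company_id, 'selDateRangeOption': date_range})
    return f'{BASE_URL}?{query}'

print(build_key_dev_url(18527, 'y1'))
```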

### Credentials for website authentication

In [3]:
username = input()

 israel.dryer@us.gt.com


In [4]:
password = getpass.getpass()

 ·········
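
For unattended runs, one option is to read the credentials from environment variables and prompt only when they are missing. This is a sketch under assumptions: the helper `get_credentials` and the variable names `CIQ_USERNAME` and `CIQ_PASSWORD` are hypothetical, not something Capital IQ defines.

```python
import getpass
import os

def get_credentials(env=None, ask_user=input, ask_password=getpass.getpass):
    """Read credentials from the environment, prompting only for missing values.

    CIQ_USERNAME and CIQ_PASSWORD are hypothetical variable names.
    """
    env = os.environ if env is None else env
    username = env.get('CIQ_USERNAME') or ask_user('Username: ')
    password = env.get('CIQ_PASSWORD') or ask_password('Password: ')
    return username, password
```

Calling `username, password = get_credentials()` would then stand in for the two prompt cells above.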


### Create the browser bot

In [5]:
bot = webdriver.Chrome()

### Navigate to the website and login

In [6]:
bot.get(url.format(company, date_range))

In [7]:
# find_element_by_id was removed in Selenium 4.3; 'id' is the value of By.ID
bot.find_element('id', 'username').send_keys(username)

In [8]:
pwd = bot.find_element('id', 'password')  # 'id' is the value of By.ID
pwd.send_keys(password)
pwd.send_keys(Keys.RETURN)
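
Run cell by cell, the page has time to load between steps. Run as a script, the next lookup can fire before the post-login page is ready. Selenium ships `WebDriverWait` for this purpose; the sketch below is a dependency-free polling loop that illustrates the same idea. The helper name `wait_for_element` and its default timings are assumptions, and it relies only on `find_elements`, which returns an empty list when nothing matches.

```python
import time

def wait_for_element(bot, element_id, timeout=15.0, poll=0.5):
    """Poll until an element with the given id exists, then return it.

    Raises TimeoutError if nothing matches within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        matches = bot.find_elements('id', element_id)  # 'id' is the value of By.ID
        if matches:
            return matches[0]
        if time.monotonic() >= deadline:
            raise TimeoutError(f'element {element_id!r} not found after {timeout}s')
        time.sleep(poll)
```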

### Extract data from the webpage

Show all records and expand all rows if available

In [9]:
# show all records ('id' is the value of By.ID; find_element_by_id was removed in Selenium 4.3)
view_all = bot.find_element('id', "Displaysection3_myKeyDevDataGrid_myDataGrid_viewall")
view_all.click()

In [10]:
# expand all rows ('id' is the value of By.ID; find_element_by_id was removed in Selenium 4.3)
exp_rows = bot.find_element('id', "Displaysection3_myKeyDevDataGrid_myDataGrid_Icon")
exp_rows.click()
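
The "if available" condition is not handled by the two cells above: a single-element lookup raises an exception when the control is missing. A hedged sketch of a guarded click follows. The helper name `click_if_present` is hypothetical, and it assumes only that `bot` exposes Selenium's `find_elements`, which returns an empty list instead of raising.

```python
def click_if_present(bot, element_id):
    """Click the element with the given id if it exists; report whether it did."""
    matches = bot.find_elements('id', element_id)  # empty list when absent
    if not matches:
        return False
    matches[0].click()
    return True

# Intended use with the ids from the cells above:
# click_if_present(bot, "Displaysection3_myKeyDevDataGrid_myDataGrid_viewall")
# click_if_present(bot, "Displaysection3_myKeyDevDataGrid_myDataGrid_Icon")
```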

In [11]:
soup = BeautifulSoup(bot.page_source, 'lxml')

### Find table details within html and parse

In [12]:
table = soup.find('table',{'class':'cTblListBody'}).tbody.find_all('td')

In [13]:
print(table[15])

<td align="left" style="width:200px;" valign="top">
<span>Business Expansion</span>
<span style="float:right;"><a data-ensho="18527,636689187" enableviewstate="false" id="636689187" name="KeyDev" onclick="KenshoService.openKenshoPopup(event)" style="float:right;cursor: pointer;" value="Business Expansion"><img alt="" src="https://w1.ciqimg.com/CIQDotNet/images/enzo.png?urwvid=805769356" style="display:none;float:right;" title="Kensho Analytics"/></a></span>
</td>


### Extract and strip the text from each of the `<td>` elements

In [14]:
table_rows = [x.text.strip() for x in table]

In [15]:
for i in range(13):
    print(i, ' ', table_rows[i])

0   
1   Date
2   Type
3   Headline
4   Other Parties
5   
6   Sep-18-2019
7   Client Announcement
8   ABB Signs Major Framework Contract with Austrian Power Grid for Largest Ever Grid Expansion in Austria
9   -
10   
11   Situation: ABB has signed a five-year framework contract with Austrian Power Grid (APG), potentially worth more than $100 million to supply gas-insulated switchgear (GIS), in the largest electricity grid expansion to date in Austria. ABB will be supplying GIS for the construction of a transmission grid that will help to strengthen the infrastructure in order to gradually integrate electricity generated by more renewable sources. Renewable energy is difficult to predict and causes load fluctuations. Its integration into the power grid requires a strong and resilient transmission infrastructure. The network connects wind power generators in eastern Austria to pumped storage power plants in the western part of the country. It will also transport surplus solar and wind p

### The records do not begin until index 6, so I can start there

In [16]:
table_rows = [x.text.strip() for x in table][6:]

### The last row contains extra irrelevant data, so I'll pop this from the list

In [17]:
print(table_rows.pop())

Viewing 1-150 of 150 Key Developments [View 1-25  | 26-50  | 51-75  | 76-100  | 101-125  | 126-150] [View All]
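
An unconditional `pop()` would discard a real field if the pager row were ever missing. A small sketch of a guarded version is below; the helper name `drop_pager_footer` is hypothetical, and the assumption that the footer always starts with "Viewing " rests only on the single sample shown above.

```python
def drop_pager_footer(cells):
    """Return the cells without a trailing 'Viewing ...' pager entry, if one exists."""
    if cells and cells[-1].startswith('Viewing '):
        return cells[:-1]
    return list(cells)
```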


### Each record is a chunk of 8 list elements; the last 2 can be ignored

In [18]:
for i in range(8):
    print(i, table_rows[i])

0 Sep-18-2019
1 Client Announcement
2 ABB Signs Major Framework Contract with Austrian Power Grid for Largest Ever Grid Expansion in Austria
3 -
4 
5 Situation: ABB has signed a five-year framework contract with Austrian Power Grid (APG), potentially worth more than $100 million to supply gas-insulated switchgear (GIS), in the largest electricity grid expansion to date in Austria. ABB will be supplying GIS for the construction of a transmission grid that will help to strengthen the infrastructure in order to gradually integrate electricity generated by more renewable sources. Renewable energy is difficult to predict and causes load fluctuations. Its integration into the power grid requires a strong and resilient transmission infrastructure. The network connects wind power generators in eastern Austria to pumped storage power plants in the western part of the country. It will also transport surplus solar and wind power to pumped storage power plants in the Alps that act as "green batter

### Append record chunks to a new list

In [19]:
row_count = len(table_rows)

In [20]:
records = []

for i in range(0, row_count, 8):
    chunk = table_rows[i:i+8]
    # keep only complete 8-cell chunks; the first 6 cells hold the record
    if len(chunk) == 8:
        records.append(chunk[:6])

In [21]:
for i, row in enumerate(records[0]):
    print(i, row)

0 Sep-18-2019
1 Client Announcement
2 ABB Signs Major Framework Contract with Austrian Power Grid for Largest Ever Grid Expansion in Austria
3 -
4 
5 Situation: ABB has signed a five-year framework contract with Austrian Power Grid (APG), potentially worth more than $100 million to supply gas-insulated switchgear (GIS), in the largest electricity grid expansion to date in Austria. ABB will be supplying GIS for the construction of a transmission grid that will help to strengthen the infrastructure in order to gradually integrate electricity generated by more renewable sources. Renewable energy is difficult to predict and causes load fluctuations. Its integration into the power grid requires a strong and resilient transmission infrastructure. The network connects wind power generators in eastern Austria to pumped storage power plants in the western part of the country. It will also transport surplus solar and wind power to pumped storage power plants in the Alps that act as "green batter
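
The chunking step can also be wrapped in a small function so the chunk width and the number of kept fields are explicit. This is a sketch on made-up placeholder cells, not scraped data; the name `chunk_records` is hypothetical.

```python
def chunk_records(cells, chunk_size=8, keep=6):
    """Split a flat list of cells into records of `keep` fields each.

    Incomplete trailing chunks are skipped.
    """
    records = []
    for start in range(0, len(cells), chunk_size):
        chunk = cells[start:start + chunk_size]
        if len(chunk) == chunk_size:
            records.append(chunk[:keep])
    return records

# Placeholder cells: two complete records followed by one stray cell
cells = ['d1', 't1', 'h1', 'o1', '', 's1', 'x', 'y',
         'd2', 't2', 'h2', 'o2', '', 's2', 'x', 'y',
         'stray']
print(chunk_records(cells))
```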

### Remove situation label

In [22]:
for row in records:
    row[5] = row[5].replace('Situation: ','')

### Remove the empty field between headline and situation

In [23]:
for row in records:
    row.pop(4)
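
Both loops above modify `records` in place, so running the `pop(4)` cell a second time would remove another field. A non-mutating alternative is sketched below; the name `clean_record` is hypothetical. Unlike `str.replace`, it strips the label only when it appears at the start of the text.

```python
LABEL = 'Situation: '

def clean_record(row):
    """Return (date, type, headline, other_parties, situation) from a 6-field row."""
    date, event_type, headline, other_parties, _blank, situation = row
    if situation.startswith(LABEL):
        situation = situation[len(LABEL):]
    return (date, event_type, headline, other_parties, situation)
```

Applied as `[clean_record(row) for row in records]`, it yields tuples that can be deduplicated directly.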

### Remove duplicate events if reported more than once

In [33]:
# note: a set does not preserve the order in which the records were scraped
data = list(set(tuple(row) for row in records))

In [36]:
for i, row in enumerate(data[0]):
    print(i, row)

0 Sep-18-2019
1 Client Announcement
2 ABB Signs Major Framework Contract with Austrian Power Grid for Largest Ever Grid Expansion in Austria
3 -
4 ABB has signed a five-year framework contract with Austrian Power Grid (APG), potentially worth more than $100 million to supply gas-insulated switchgear (GIS), in the largest electricity grid expansion to date in Austria. ABB will be supplying GIS for the construction of a transmission grid that will help to strengthen the infrastructure in order to gradually integrate electricity generated by more renewable sources. Renewable energy is difficult to predict and causes load fluctuations. Its integration into the power grid requires a strong and resilient transmission infrastructure. The network connects wind power generators in eastern Austria to pumped storage power plants in the western part of the country. It will also transport surplus solar and wind power to pumped storage power plants in the Alps that act as "green batteries" for stori
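
Converting to a `set` removes duplicates but leaves the rows in arbitrary order. If the scraped order matters, one alternative is `dict.fromkeys`, which keeps the first occurrence of each key in insertion order. A minimal sketch, with the hypothetical name `dedupe_keep_order`:

```python
def dedupe_keep_order(rows):
    """Drop exact duplicate rows while keeping first-seen order."""
    return list(dict.fromkeys(tuple(row) for row in rows))
```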

### Create and preview dataframe

In [37]:
df = pd.DataFrame(data, columns=['Date','EventType','Headline','OtherParties','Situation'])

In [38]:
df.head(10)

| | Date | EventType | Headline | OtherParties | Situation |
|---|---|---|---|---|---|
| 0 | Sep-18-2019 | Client Announcement | ABB Signs Major Framework Contract with Austri... | - | ABB has signed a five-year framework contract ... |
| 1 | Jul-08-2019 | Product-Related Announcement | ABB Pilots Automation Solution for the Next Ge... | - | ABB pilots automation solution for the next ge... |
| 2 | Oct-22-2018 | Conference | Constellation Research Inc., Constellation Res... | Credit Suisse Group AG (SWX:CSGN), Blue Prism ... | Constellation Research Inc., Constellation Res... |
| 3 | Jul-25-2019 | Announcement of Earnings | Abb Ltd Reports Earnings Results for the Secon... | - | ABB Ltd. announced earnings results for the se... |
| 4 | Jan-21-2019 | Client Announcement | Unibap AB (publ.) Enters into an Agreement wit... | Unibap AB (publ) (OM:UNIBAP) | Unibap AB (publ.) has been granted access to A... |
| 5 | Jun-09-2019 | Conference | Edison Electric Institute Inc., Edison Electri... | Landis+Gyr Group AG (SWX:LAND), Accenture plc ... | Edison Electric Institute Inc., Edison Electri... |
| 6 | Apr-23-2019 | Company Conference Presentation | ABB Ltd Presents at Field Service USA, Apr-23-... | - | ABB Ltd Presents at Field Service USA, Apr-23-... |
| 7 | Dec-04-2018 | Client Announcement | ABB Wins an Order to Supply its ABB Ability™ E... | - | ABB has won an order to supply its ABB Ability... |
| 8 | Jul-17-2019 | Client Announcement | Dover Fueling Solutions Announces Cooperation ... | Dover Corporation (NYSE:DOV) | Dover Fueling Solutions (DFS), a part of Dover... |
| 9 | May-10-2019 | Client Announcement | ABB Wins Major Order to Transmit Wind Power fr... | - | ABB’s Power Grids business has been awarded a ... |
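
The rows in the preview are not in date order, and the `Date` column holds plain strings such as `Sep-18-2019`. As a sketch, the record tuples could be sorted newest first before building the dataframe. The helper names are hypothetical, and `%b` assumes English month abbreviations (the default C locale).

```python
from datetime import datetime

def parse_ciq_date(text):
    """Parse a date such as 'Sep-18-2019' into a datetime.date."""
    return datetime.strptime(text, '%b-%d-%Y').date()

def sort_newest_first(rows):
    """Sort record tuples by their first field (the date), newest first."""
    return sorted(rows, key=lambda row: parse_ciq_date(row[0]), reverse=True)
```

Within pandas itself, `pd.to_datetime(df['Date'], format='%b-%d-%Y')` should perform the same conversion on the column.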
