In [1]:
%pip install selenium webdriver-manager beautifulsoup4

Note: you may need to restart the kernel to use updated packages.


In [4]:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
from bs4 import BeautifulSoup
import time

# Set up ChromeDriver with Selenium
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))

# Go to the TSX 60 constituents page
url = "https://www.theglobeandmail.com/investing/markets/indices/TXSX/components/"
driver.get(url)  # navigate Chrome to the page

# Wait for dynamic content to load
time.sleep(5)  # fixed 5-second pause; WebDriverWait would be more reliable

# Get the page source after JavaScript has run, so dynamically loaded content is included
html = driver.page_source
soup = BeautifulSoup(html, 'html.parser')  # parse the HTML so it can be queried with .find() or .select()

# Close the browser and end the WebDriver session; always clean up when you're done scraping
driver.quit()
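
The fixed `time.sleep(5)` can be swapped for an explicit wait, as the comment in the cell suggests. The following is a minimal sketch, assuming the constituent rows are the `tr` elements that carry a `data-barchart-symbol` attribute (the same rows queried with BeautifulSoup in this notebook). The helper name `wait_for_rows` and the 15-second timeout are illustrative choices, not part of the original code.

```python
def wait_for_rows(driver, timeout=15):
    """Block until at least one constituent row is in the DOM, then return the page HTML."""
    # Imported inside the function so that defining the helper does not itself require Selenium.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # Assumption: rows are <tr data-barchart-symbol="..."> elements.
    locator = (By.CSS_SELECTOR, "tr[data-barchart-symbol]")

    # Polls the page until the locator matches; raises TimeoutException otherwise.
    WebDriverWait(driver, timeout).until(EC.presence_of_element_located(locator))
    return driver.page_source
```

In the cell above, `html = wait_for_rows(driver)` would stand in for the `time.sleep(5)` and `driver.page_source` lines. The wait only confirms that the first row exists, so it may still be worth checking the row count afterwards.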

In [6]:
print(type(html))      # should be <class 'str'>

<class 'str'>


```from selenium import webdriver```

This imports the **Selenium WebDriver module**, which allows you to control a web browser (e.g., Chrome) via Python.


Selenium mimics a real user: it opens a browser window, clicks buttons, waits for pages to load, etc.

```from selenium.webdriver.chrome.service import Service```

Imports the Service class from Selenium, which starts and stops the ChromeDriver process and holds its configuration (such as the path to the driver binary).


ChromeDriver is the component that relays commands between your Python script and the Chrome browser.


```from webdriver_manager.chrome import ChromeDriverManager```

This imports a utility to automatically download and install the correct version of ChromeDriver that matches your local Chrome browser.


It prevents compatibility errors and saves you from manually managing chromedriver.exe.

```from bs4 import BeautifulSoup```

Imports BeautifulSoup, a library used to parse HTML/XML content and extract data from web pages.


After Selenium loads the page, we pass the HTML to BeautifulSoup to extract the info we need.


```driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))```

Sets up and starts the Chrome browser in three steps.


ChromeDriverManager().install() downloads a ChromeDriver build that matches the installed Chrome (or reuses a cached copy) and returns the path to the binary.


Service(...) takes that path and manages the ChromeDriver process.


webdriver.Chrome(...) starts an actual Chrome browser that you can control.
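
The calls nested in that one line can also be written separately, which makes it easier to pass browser options and to guarantee cleanup. This is a sketch under assumptions: `fetch_page_source` is a hypothetical helper name, and the optional `--headless=new` flag assumes a recent Chrome release that supports the new headless mode.

```python
def fetch_page_source(url, headless=True):
    """Open `url` in Chrome and return the HTML available once driver.get() returns."""
    # Imported inside the function so that defining the helper does not itself require these packages.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from webdriver_manager.chrome import ChromeDriverManager

    options = webdriver.ChromeOptions()
    if headless:
        # Assumption: the installed Chrome understands the "new" headless mode.
        options.add_argument("--headless=new")

    driver_path = ChromeDriverManager().install()  # path to the ChromeDriver binary
    service = Service(driver_path)                 # manages the ChromeDriver process
    driver = webdriver.Chrome(service=service, options=options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        # Runs even if driver.get() raises, so no browser is left running.
        driver.quit()
```

The `try`/`finally` is the main design choice here: in the notebook cell, an exception before `driver.quit()` would leave a Chrome window open. For pages that render content with JavaScript, a wait would still be needed before reading `page_source`.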

In [9]:
# Use BeautifulSoup to collect every table row (<tr>) that has a data-barchart-symbol attribute
a = soup.find_all('tr', attrs={'data-barchart-symbol': True})
a

[<tr data-barchart-symbol="ABX-T">
 <td class="fix-col text-left" data-barchart-field="symbol" data-barchart-field-type="symbol">
 <barchart-link-field raw-symbol="ABX.TO" symbol="ABX-T" symbolname="Barrick Mining Corp" type="6" value="ABX-T">
 <a href="/investing/markets/stocks/ABX-T/">ABX-T</a>
 </barchart-link-field>
 </td>
 <td class="text-left" data-barchart-field="symbolName" data-barchart-field-type="symbolName">
 <barchart-field binding="true" name="symbolName" symbol="ABX.TO" type="string" value="Barrick Mining Corp">Barrick Mining Corp</barchart-field>
 </td>
 <td class="text-right" data-barchart-field="lastPrice" data-barchart-field-type="lastPrice">
 <barchart-field binding="true" name="lastPrice" symbol="ABX.TO" type="price" value="27.15">27.15</barchart-field>
 </td>
 <td class="text-right" data-barchart-field="priceChange" data-barchart-field-type="priceChange">
 <barchart-field binding="true" class="quoteDown" name="priceChange" symbol="ABX.TO" type="priceChange" value=
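
Each row carries more than the symbol. As a sketch, the company name and last price can be read from the `barchart-field` elements visible in the output above. The HTML fragment below is a trimmed copy of that first row, `row_to_record` and `parse_rows` are hypothetical helpers, and the remaining fields are assumed to follow the same `name`/`value` attribute pattern.

```python
def row_to_record(tr):
    """Turn one <tr data-barchart-symbol=...> Tag into a dict of the fields it exposes."""
    record = {"symbol": tr["data-barchart-symbol"]}
    for field in tr.find_all("barchart-field"):
        # field["name"] is the HTML attribute (e.g. "lastPrice"), not the tag name.
        record[field["name"]] = field.get("value")
    return record


def parse_rows(html):
    """Parse an HTML string and return one record per constituent row."""
    # Imported inside the function so that defining the helper does not itself require bs4.
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    rows = soup.find_all("tr", attrs={"data-barchart-symbol": True})
    return [row_to_record(tr) for tr in rows]


# Trimmed copy of the first row shown in the notebook output.
sample_html = """
<table><tr data-barchart-symbol="ABX-T">
 <td><barchart-field name="symbolName" symbol="ABX.TO" type="string" value="Barrick Mining Corp">Barrick Mining Corp</barchart-field></td>
 <td><barchart-field name="lastPrice" symbol="ABX.TO" type="price" value="27.15">27.15</barchart-field></td>
</tr></table>
"""
```

Inside the notebook, `[row_to_record(tr) for tr in a]` would apply the same helper to the rows already collected. The values come back as strings, so prices would need a `float(...)` conversion before any arithmetic.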

In [11]:
symbols = [tr['data-barchart-symbol'] for tr in a]
symbols


['ABX-T',
 'AEM-T',
 'AQN-T',
 'ATD-T',
 'BAM-T',
 'BCE-T',
 'BIP-UN-T',
 'BMO-T',
 'BN-T',
 'BNS-T',
 'CAE-T',
 'CAR-UN-T',
 'CCL-B-T',
 'CCO-T',
 'CM-T',
 'CNQ-T',
 'CNR-T',
 'CP-T',
 'CSU-T',
 'CTC-A-T',
 'CVE-T',
 'DOL-T',
 'EMA-T',
 'ENB-T',
 'FM-T',
 'FNV-T',
 'FSV-T',
 'FTS-T',
 'GIB-A-T',
 'GIL-T',
 'H-T',
 'IFC-T',
 'IMO-T',
 'K-T',
 'L-T',
 'MFC-T',
 'MG-T',
 'MRU-T',
 'NA-T',
 'NTR-T',
 'OTEX-T',
 'POW-T',
 'PPL-T',
 'QSR-T',
 'RCI-B-T',
 'RY-T',
 'SAP-T',
 'SHOP-T',
 'SLF-T',
 'SU-T',
 'T-T',
 'TD-T',
 'TECK-B-T',
 'TOU-T',
 'TRI-T',
 'TRP-T',
 'WCN-T',
 'WN-T',
 'WPM-T',
 'WSP-T']

In [12]:
len(symbols)

60
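
A count of 60 matches the size of the index, but an explicit check guards against a silent change in the page layout on a later run. A minimal sketch follows; `check_constituents` and its `expected` default are illustrative names, not part of the original notebook.

```python
from collections import Counter


def check_constituents(symbols, expected=60):
    """Raise ValueError if the scraped list has repeated tickers or the wrong size."""
    duplicates = sorted(s for s, n in Counter(symbols).items() if n > 1)
    if duplicates:
        raise ValueError(f"duplicate tickers: {duplicates}")
    if len(symbols) != expected:
        raise ValueError(f"expected {expected} tickers, got {len(symbols)}")
    return symbols


# Small illustrative input; in the notebook this would be check_constituents(symbols).
check_constituents(["ABX-T", "AEM-T"], expected=2)
```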

We now have the tickers of all 60 stocks in the TSX 60.

However, these tickers still need to be converted to the Bloomberg ticker format.

In [13]:
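# remove the trailing -T suffix from each element (str.removesuffix needs Python 3.9+;
# unlike str.replace, it cannot touch a '-T' elsewhere in the ticker)
clean_symbols = [s.removesuffix('-T') for s in symbols]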
clean_symbols

['ABX',
 'AEM',
 'AQN',
 'ATD',
 'BAM',
 'BCE',
 'BIP-UN',
 'BMO',
 'BN',
 'BNS',
 'CAE',
 'CAR-UN',
 'CCL-B',
 'CCO',
 'CM',
 'CNQ',
 'CNR',
 'CP',
 'CSU',
 'CTC-A',
 'CVE',
 'DOL',
 'EMA',
 'ENB',
 'FM',
 'FNV',
 'FSV',
 'FTS',
 'GIB-A',
 'GIL',
 'H',
 'IFC',
 'IMO',
 'K',
 'L',
 'MFC',
 'MG',
 'MRU',
 'NA',
 'NTR',
 'OTEX',
 'POW',
 'PPL',
 'QSR',
 'RCI-B',
 'RY',
 'SAP',
 'SHOP',
 'SLF',
 'SU',
 'T',
 'TD',
 'TECK-B',
 'TOU',
 'TRI',
 'TRP',
 'WCN',
 'WN',
 'WPM',
 'WSP']
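
With the `-T` suffix removed, the remaining step is the Bloomberg formatting itself. The sketch below rests on assumptions that should be checked against a Bloomberg terminal before use: that Canadian listings take the ` CN Equity` suffix, that share classes are written with a slash (`CCL-B` becomes `CCL/B`), and that trust units are written `-U` instead of `-UN`. The function name `to_bloomberg` is hypothetical.

```python
import re


def to_bloomberg(symbol, exchange="CN", sector="Equity"):
    """Map a cleaned ticker such as 'CCL-B' to a Bloomberg-style string (assumed rules)."""
    if symbol.endswith("-UN"):
        # Assumption: trust units are written '-U' rather than '-UN'.
        base = symbol[: -len("-UN")] + "-U"
    else:
        # Assumption: a single-letter share class is separated by '/' rather than '-'.
        base = re.sub(r"-([A-Z])$", r"/\1", symbol)
    return f"{base} {exchange} {sector}"


# A few tickers taken from clean_symbols above.
sample_symbols = ["ABX", "BIP-UN", "CCL-B", "T"]
bloomberg_symbols = [to_bloomberg(s) for s in sample_symbols]
```

In the notebook, `[to_bloomberg(s) for s in clean_symbols]` would convert the full list. Keeping the rules in one small function makes it easy to adjust them if the assumed conventions turn out to differ.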