Web scraping is the process of extracting HTML content from websites and transforming it into useful data.
While Python is the most common language for this task, Visual Basic for Applications (VBA) provides a capable, built-in alternative within Microsoft Office.
VBA is the internal scripting language for Excel, Word, and other Office products.
By combining VBA with Excel’s data manipulation features, you can scrape and analyze information directly in Excel without relying on external libraries or tools.
This repository demonstrates two practical approaches to web scraping using VBA:
- Automated browser path (Internet Explorer COM) – suitable for complex or dynamic pages that need to be rendered.
- Direct HTTP path (
ServerXMLHTTP) – faster and simpler for static web pages that return plain HTML.
Internet Explorer automation is deprecated on modern Windows systems. Use
ServerXMLHTTPwhere possible, or integrate SeleniumBasic for modern browsers.
VBA is disabled by default in Microsoft Excel. To enable it:
-
Open Excel and click File
-
Click Options at the bottom of the sidebar.
-
Choose Customize Ribbon.
-
On the right-hand side, enable the checkbox for Developer and click OK.
-
Confirm that the Developer tab is now visible at the top of Excel.
You can now access the VBA Editor through Developer → Visual Basic.
Inside the VBA editor, go to Tools → References and enable the following libraries:
- Microsoft HTML Object Library
- Microsoft Internet Controls
These libraries are essential for interacting with HTML and automating Internet Explorer.
The two enabled libraries provide the foundation for VBA web scraping:
- Microsoft HTML Object Library – allows manipulation of HTML elements such as tags, IDs, and classes.
- Microsoft Internet Controls – provides access to the Internet Explorer engine to open and navigate web pages.
Together, they allow Excel to send requests, load pages, and extract information from HTML content.
Alternatively, you can use ServerXMLHTTP to fetch data directly from websites without launching a browser.
- Create a new worksheet in Excel and name it Test.
- Open the VBA editor (Developer → Visual Basic).
- Right-click your workbook, select Insert → Module, and paste the following code.
Sub ScrapeWebsiteUsingIE()
' Create Internet Explorer instance
Dim ie As Object
Set ie = CreateObject("InternetExplorer.Application")
' Hide browser window
ie.Visible = False
ie.navigate "http://iproyal.com"
' Wait for page to load completely
Do While ie.Busy Or ie.readyState <> 4
DoEvents
Loop
' Access the HTML document
Dim html As Object
Set html = ie.document
' Retrieve specific HTML elements
Dim element As Object
Set element = html.getElementsByClassName("mt-16 sm:mt-24 text-center lg:text-left tp-headline-m lg:tp-headline-xl")
' Reference the output worksheet
Dim ws As Worksheet
Set ws = ThisWorkbook.Sheets("Test")
Dim startRow As Integer
startRow = 1
' Loop through elements and extract text
If element.Length > 0 Then
Dim i As Integer
For i = 0 To element.Length - 1
On Error Resume Next
Dim elementText As String
elementText = element.Item(i).innerText
If Err.Number <> 0 Then
Err.Clear
elementText = element.Item(i).outerText
End If
On Error GoTo 0
If elementText <> "" Then
ws.Cells(startRow, 1).Value = elementText
Else
ws.Cells(startRow, 1).Value = "Element does not support innerText or outerText."
End If
startRow = startRow + 1
Next i
Else
ws.Cells(startRow, 1).Value = "No elements found with the specified class name."
End If
' Close Internet Explorer
ie.Quit
Set ie = Nothing
End SubSub ScrapeWebsiteUsingServerXMLHTTP()
' Create HTTP object
Dim xml As Object
Set xml = CreateObject("MSXML2.ServerXMLHTTP.6.0")
' Send GET request
xml.Open "GET", "http://iproyal.com", False
xml.send
' Wait for the response
Do While xml.readyState <> 4
DoEvents
Loop
' Verify the request
If xml.Status = 200 Then
' Parse HTML
Dim html As Object
Set html = CreateObject("HTMLFile")
html.body.innerHTML = xml.responseText
' Get elements
Dim elements As Object
Set elements = html.getElementsByClassName("mt-16 sm:mt-24 text-center lg:text-left tp-headline-m lg:tp-headline-xl")
Dim startRow As Integer
startRow = 1
' Write results
If elements.Length > 0 Then
Dim i As Integer
For i = 0 To elements.Length - 1
Dim elementText As String
On Error Resume Next
elementText = elements.Item(i).innerText
If Err.Number <> 0 Then
Err.Clear
elementText = elements.Item(i).outerText
End If
On Error GoTo 0
Cells(startRow, 1).Value = elementText
startRow = startRow + 1
Next i
Else
Cells(startRow, 1).Value = "No elements found with the specified class name."
End If
Else
MsgBox "Failed to retrieve the web page. Status: " & xml.Status
End If
' Clean up
Set xml = Nothing
End Sub| Method | Advantages | Disadvantages |
|---|---|---|
| Internet Explorer | Handles dynamic content and JavaScript | Slower, deprecated |
| ServerXMLHTTP | Fast and lightweight | Can be blocked by websites or fail on dynamic pages |




