Hi. My SeleniumBase scraper has suddenly started redirecting URL requests to a strange URL. It only seems to do this for one particular site.
I've got it down to this MRE:
import seleniumbase as sb


def main():
    driver = sb.Driver(uc=True, headless=True)
    url = "https://www.atptour.com/scores/results-archive?year=1896&tournamentType=atpgs"
    driver.get(url)
    print(driver.current_url)
    driver.quit()


if __name__ == "__main__":
    main()
The print statement shows that the driver's current URL becomes:
chrome-extension://nkeimhogjdpnpccoofpliimaahmaaome/background.html
Worth mentioning that it sometimes takes a couple of runs for the issue to manifest.
I've tested this with "https://www.seleniumbase.io/" and can't replicate the issue.
Interestingly, I tested out adding a couple more requests to the chain and removing headless mode as follows:
import seleniumbase as sb


def main():
    driver = sb.Driver(uc=True)
    url = "https://www.atptour.com/scores/results-archive?year=1896&tournamentType=atpgs"
    driver.get(url)
    print(driver.current_url)
    url = "https://www.atptour.com/scores/results-archive?year=1897&tournamentType=atpgs"
    driver.get(url)
    print(driver.current_url)
    url = "https://www.atptour.com/scores/results-archive?year=1898&tournamentType=atpgs"
    driver.get(url)
    print(driver.current_url)
    driver.quit()


if __name__ == "__main__":
    main()
When I step through this in the debugger, each page loads as expected in the browser. However, the print statements record:
chrome-extension://nkeimhogjdpnpccoofpliimaahmaaome/background.html
https://www.atptour.com/scores/results-archive?year=1896&tournamentType=atpgs
https://www.atptour.com/scores/results-archive?year=1897&tournamentType=atpgs
So it looks like, after the first anomalous response, the driver's reported URL is running one request behind.
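To make that observation concrete, here is a minimal sketch that checks the recorded output for the two symptoms: a reported URL that is an extension page, and reported URLs that lag the requested ones by one. The helper names `is_extension_page` and `runs_one_behind` are hypothetical ones I made up for this check; they are not part of SeleniumBase, and the sketch only inspects the strings already printed above rather than touching the driver.

```python
from typing import Sequence
from urllib.parse import urlparse


def is_extension_page(url: str) -> bool:
    """Return True if the URL uses the chrome-extension scheme."""
    return urlparse(url).scheme == "chrome-extension"


def runs_one_behind(requested: Sequence[str], reported: Sequence[str]) -> bool:
    """Return True if each reported URL (after the first) equals the
    URL that was requested one step earlier."""
    if len(requested) != len(reported) or len(requested) < 2:
        return False
    return all(reported[i] == requested[i - 1] for i in range(1, len(requested)))


# URLs requested in the second script, in order.
base = "https://www.atptour.com/scores/results-archive?year={}&tournamentType=atpgs"
requested = [base.format(year) for year in (1896, 1897, 1898)]

# URLs that driver.current_url reported after each request.
reported = [
    "chrome-extension://nkeimhogjdpnpccoofpliimaahmaaome/background.html",
    requested[0],
    requested[1],
]

print(is_extension_page(reported[0]))
print(runs_one_behind(requested, reported))
```

Both checks come back true for the output recorded above, which matches the "one URL behind" reading.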
Any ideas on what might be happening?