fix: skip broken cross-origin iframe hierarchy reads#3289
fix: skip broken cross-origin iframe hierarchy reads#3289schmayterling wants to merge 7 commits into
Conversation
I'm concerned that this would lead to flaky tests. If run an Should we be handling failures in iframes with retry logic and eventual errors rather than silently eliding parts of the DOM? |
yeah, I think that can still happen for iframe-targeted assertions if the iframe is temporarily unavailable during that hierarchy snapshot. but yea the goal of this patch is to avoid crashing the entire hierarchy build when a third-party iframe is stale/broken/loading, and keep the rest of the page usable instead of failing with
I think retry/wait behavior for iframe content would need to happen at the assertion/command layer rather than inside hierarchy parsing itself. would be happy to see your thoughts with this perspective |
proksh
left a comment
There was a problem hiding this comment.
Thanks @schmayterling
- When iframe content is bad we keep the parent node but drop its children. An assertVisible targeting iframe content could fail on one snapshot and pass on rerun → flaky. We have stopped accepting pull requests that results in flaky test, so that will stop us to approve this PR. Do you have a solution that can be consistent?
- Rebase to main and revert the part either of #3314 or #3315 covers.
- Suggest rebasing on top of #3314 + #3315 once they merge so the diff shrinks to just the parsing/enrichment fix.
| // ChromeDriver can execute scripts inside cross-origin iframes via switchTo().frame() | ||
| driver.switchTo().frame(iframeElement) | ||
| return try { | ||
| driver.switchTo().frame(iframeElement) |
There was a problem hiding this comment.
This reordering - moving switchTo().frame() inside the try so a StaleElementReferenceException degrades to null - is exactly #3314. Let's revert it here and rebase on main to avoid two copies of the same fix.
| val params = WebHierarchy.parseIframeViewportParams(paramsJson, iframeSrc) ?: return null | ||
|
|
||
| // ChromeDriver can execute scripts inside cross-origin iframes via switchTo().frame() | ||
| driver.switchTo().frame(iframeElement) |
There was a problem hiding this comment.
Same as above. This reordering - moving switchTo().frame() inside the try so a StaleElementReferenceException degrades to null - is exactly #3314. Let's revert it here and rebase on main to avoid two copies of the same fix.
| return TreeNode(attributes = attributes, children = children) | ||
| } | ||
|
|
||
| fun isTransientBrowserContextError(e: Throwable): Boolean { |
There was a problem hiding this comment.
This classifies Selenium exceptions as transient - which is what #3315's SeleniumExceptionTranslator is for, at the Driver boundary. Once #3315 lands, callers won't see raw org.openqa.selenium.* here. Suggest removing this and the logWebHierarchyFailure / logIframeFetchFailure helpers that depend on it, and letting #3315 own the classification.
| } | ||
| } | ||
|
|
||
| private fun logWebHierarchyFailure(message: String, e: Exception) { |
There was a problem hiding this comment.
Tied to the classification in WebHierarchy - drop with isTransientBrowserContextError. After #3315, transient/infra failures don't reach this logging path.
| val rawMap = contentDesc as Map<String, Any> | ||
| val enrichedMap = injectCrossOriginIframes(rawMap) | ||
| val root = parseDomAsTreeNodes(enrichedMap) | ||
| val rawMap = WebHierarchy.normalizeDomNode(contentDesc, "web content description") |
|
|
||
| val options = driver.capabilities.getCapability("goog:chromeOptions") as Map<String, Any> | ||
| val debuggerAddress = options["debuggerAddress"] as String | ||
| val options = driver.capabilities.getCapability("goog:chromeOptions") as? Map<*, *> |
There was a problem hiding this comment.
Nice defensive cast - keep. Independent of the iframe work.
| webScreenRecorder?.onWindowChange() | ||
| } | ||
| } | ||
| if (windowHandles.isNotEmpty()) { |
There was a problem hiding this comment.
This window-handle fallback in detectWindowChange is a meaningful behavior change beyond the iframe-crash fix. Is it required for #3271, or is it scope that could be its own PR? Easier to review/revert if split out. (Also overlaps conceptually with #3315's NoSuchSessionException / unreachable handling — worth checking they don't fight.)
| return when (query) { | ||
| is OnDeviceElementQuery.Css -> queryCss(query) | ||
| else -> super.queryOnDeviceElements(query) | ||
| if (query is OnDeviceElementQuery.Css) { |
There was a problem hiding this comment.
Why were these changes needed?
| } | ||
| } | ||
|
|
||
| return mapOf( |
There was a problem hiding this comment.
This is the silent-elision behavior @Fishbowler flagged: when iframe content is bad we keep the parent node but drop its children. An assertVisible targeting iframe content could fail on one snapshot and pass on rerun → flaky.
You already noted retry/wait belongs at the command/assertion layer - agree, but we will prefer a solution that is consistent. We have stopped merging things that results in flaky results.
Proposed changes
fixes a Maestro web hierarchy crash when cross-origin iframe enrichment hits transient browser state or malformed JS/CDP results.
see:
from my understanding this usually happens after navigation, popup return, or iframe reload. the parent page is still usable, but a third-party iframe can be stale/closed/still loading/detached, or return null/malformed hierarchy data.
previously, that path used unsafe casts into
Map<String, Any>and numeric fields, so one broken iframe could crash the whole hierarchy build.this patch tries keeps the hierarchy path best-effort, see flow below
Testing
unit tests should cover:
ran the following:
idt e2e is needed because there already is one for iframes
Issues fixed