@nvborisenko nvborisenko commented Nov 17, 2025

User description

Generate atoms as compilation unit instead of embedded assembly resources.

🔗 Related Issues

Fixes #16600

💥 What does this PR do?

  • Created a private Bazel rule to generate atoms into the internal ResourceUtilities.cs
  • Use generated strings

💡 Additional Considerations

🔄 Types of changes

  • Cleanup (formatting, renaming)

PR Type

Enhancement, Tests


Description

  • Generate JavaScript atoms as static C# strings instead of embedded resources

  • Create Bazel rule and Python tool to compile atoms into ResourceUtilities partial class

  • Replace runtime resource loading with direct string references throughout codebase

  • Remove embedded resource dependencies from build configuration
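
As a rough illustration of the generation step described above, the sketch below renders a partial class from identifier/content pairs. It follows the shape of the generator code quoted in the review comments on this page (five-quote raw string literal, OpenQA.Selenium.Internal namespace), but the function name render_resource_class and the sample identifier ExampleAtom are hypothetical, and this is not the merged tool.

```python
from typing import Dict


def render_resource_class(resources: Dict[str, str]) -> str:
    """Return C# source declaring one const string per resource.

    Hypothetical helper: assumes no resource contains five consecutive
    double quotes, the same assumption the PR's tool documents.
    """
    lines = [
        "// <auto-generated />",
        "namespace OpenQA.Selenium.Internal;",
        "",
        "internal static partial class ResourceUtilities",
        "{",
    ]
    for ident, content in resources.items():
        # Raw string literal: content starts on its own line and the
        # closing quotes sit on their own line.
        literal = '"""""\n' + content + '\n"""""'
        lines.append(f"    internal const string {ident} = {literal};")
    lines.append("}")
    lines.append("")
    return "\n".join(lines)


# Sample input made up for illustration.
source = render_resource_class({"ExampleAtom": "return 1;"})
```

C# call sites can then reference ResourceUtilities.ExampleAtom directly instead of opening a resource stream at runtime.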


Diagram Walkthrough

flowchart LR
  JS["JavaScript Atoms<br/>find-elements.js<br/>is-displayed.js<br/>get-attribute.js<br/>mutation-listener.js<br/>webdriver_prefs.json"]
  Tool["Python Generator Tool<br/>generate_resources_tool.py"]
  Bazel["Bazel Rule<br/>generated_resource_utilities"]
  PartialClass["ResourceUtilities.g.cs<br/>Partial Class with<br/>const string properties"]
  CSharp["C# Source Files<br/>FirefoxProfile.cs<br/>JavaScriptEngine.cs<br/>WebElement.cs<br/>RelativeBy.cs"]
  
  JS -->|Input| Tool
  Tool -->|Generates| PartialClass
  Bazel -->|Orchestrates| Tool
  PartialClass -->|Compile-time<br/>inclusion| CSharp

File Walkthrough

Relevant files
Build (6 files)
  • generate_resources.bzl: New Bazel rule for resource generation (+48/-0)
  • generate_resources_tool.py: Python tool to generate C# resource class (+85/-0)
  • BUILD.bazel: Define py_binary for resource generator (+6/-0)
  • defs.bzl: Export generated_resource_utilities rule (+2/-0)
  • BUILD.bazel: Replace embedded resources with generated class (+22/-29)
  • Selenium.WebDriver.csproj: Update build targets to generate resources (+4/-23)
Enhancement (6 files)
  • ResourceUtilities.cs: Convert to partial class for code generation (+1/-1)
  • FirefoxProfile.cs: Use generated webdriver_prefs string constant (+4/-8)
  • FirefoxExtension.cs: Remove resource loading, use generated string (+2/-22)
  • JavaScriptEngine.cs: Replace mutation listener resource with constant (+1/-17)
  • RelativeBy.cs: Use generated find_elements string constant (+1/-10)
  • WebElement.cs: Replace atom resource loading with constants (+2/-19)

@selenium-ci selenium-ci added C-dotnet .NET Bindings B-build Includes scripting, bazel and CI integrations labels Nov 17, 2025

qodo-merge-pro bot commented Nov 17, 2025

PR Compliance Guide 🔍

(Compliance updated until commit e2cac36)

Below is a summary of compliance checks for this PR:

Security Compliance
Code generation robustness

Description: The generator assumes input content never contains five consecutive double quotes and
embeds it in a raw string literal with fixed five-quote delimiter, which can break code
generation if such a sequence exists; inputs should be validated or the delimiter should
be dynamically increased to ensure safe embedding.
generate_resources_tool.py [47-59]

Referred Code
with open(path, "r", encoding="utf-8") as f:
    content = f.read()
# Use a C# raw string literal with five quotes. For a valid raw
# literal, the content must start on a new line and the closing
# quotes must be on their own line as well. We assume the content
# does not contain a sequence of five consecutive double quotes.
#
# Resulting C# will look like:
#   """""
#   <content>
#   """""
literal = '"""""\n' + content + '\n"""""'
props.append(f"    internal const string {prop_name} = {literal};")
Logic error handling

Description: Constructing a ZipArchive directly from UTF-8 bytes of WebDriverPrefsJson treats JSON as a
zip file, which will throw if content is not a valid ZIP and may indicate a logic error;
if user-controlled, it could lead to exceptions—needs validation that the data is a ZIP
archive or adjust logic to parse JSON instead.
FirefoxExtension.cs [72-75]

Referred Code
using Stream zipFileStream = new MemoryStream(Encoding.UTF8.GetBytes(ResourceUtilities.WebDriverPrefsJson));
using (ZipArchive extensionZipArchive = new ZipArchive(zipFileStream, ZipArchiveMode.Read))
{
    extensionZipArchive.ExtractToDirectory(tempFileName);
Ticket Compliance
🟡
🎫 #16600
🟢 Stop loading atoms and other resources from embedded assembly resources at runtime.
Generate a C# source file at build time that contains static string constants for required
atoms/resources.
Use the generated static strings in place of ResourceUtilities.GetResourceStream calls
across the codebase.
Reduce dependency on Bazel resource naming by generating compile-time constants.
Improve performance by avoiding repeated assembly resource lookups.
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status: Passed

🔴
Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Weak error handling: The generator reads inputs and writes output without try/except around file operations and
lacks actionable error messages for I/O failures or empty input cases.

Referred Code
def generate(output: str, inputs: List[Tuple[str, str]]) -> None:
    props: List[str] = []
    for prop_name, path in inputs:
        with open(path, "r", encoding="utf-8") as f:
            content = f.read()
        # Use a C# raw string literal with five quotes. For a valid raw
        # literal, the content must start on a new line and the closing
        # quotes must be on their own line as well. We assume the content
        # does not contain a sequence of five consecutive double quotes.
        #
        # Resulting C# will look like:
        #   """""
        #   <content>
        #   """""
        literal = '"""""\n' + content + '\n"""""'
        props.append(f"    internal const string {prop_name} = {literal};")

    lines: List[str] = []
    lines.append("// <auto-generated />")
    lines.append("namespace OpenQA.Selenium.Internal;")
    lines.append("")


 ... (clipped 21 lines)

Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
Lack of logging: Newly added generation and build steps perform critical actions (file reads/writes and
code generation) without emitting any audit logs, making it hard to trace failures or
actions.

Referred Code
def generate(output: str, inputs: List[Tuple[str, str]]) -> None:
    props: List[str] = []
    for prop_name, path in inputs:
        with open(path, "r", encoding="utf-8") as f:
            content = f.read()
        # Use a C# raw string literal with five quotes. For a valid raw
        # literal, the content must start on a new line and the closing
        # quotes must be on their own line as well. We assume the content
        # does not contain a sequence of five consecutive double quotes.
        #
        # Resulting C# will look like:
        #   """""
        #   <content>
        #   """""
        literal = '"""""\n' + content + '\n"""""'
        props.append(f"    internal const string {prop_name} = {literal};")

    lines: List[str] = []
    lines.append("// <auto-generated />")
    lines.append("namespace OpenQA.Selenium.Internal;")
    lines.append("")


 ... (clipped 11 lines)
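
One way the missing trace could be added is with the standard logging module. The sketch below is an assumption about how that might look, not part of the PR: the logger name, the read_input helper and the sample file are all hypothetical.

```python
import logging
import os
import tempfile

logging.basicConfig(level=logging.INFO)
# Hypothetical logger name; the PR's tool does not define one.
logger = logging.getLogger("generate_resources")


def read_input(ident: str, path: str) -> str:
    """Read one input file and record which identifier it feeds."""
    with open(path, "r", encoding="utf-8") as f:
        content = f.read()
    logger.info("read %d characters for %s from %s", len(content), ident, path)
    return content


# Demonstration with a throwaway file; the names are made up.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.js")
    with open(sample, "w", encoding="utf-8") as f:
        f.write("return 1;")
    text = read_input("SampleAtom", sample)
```

Under Bazel the output of a successful action is normally not shown, so such messages would mostly help when a build step fails.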

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status:
Input validation gaps: While IDENT=path parsing is validated, there is no explicit check for file
existence/emptiness or content constraints (e.g., five-quote sequences), which may result
in malformed generated code.

Referred Code
def parse_input_spec(spec: str) -> Tuple[str, str]:
    if "=" not in spec:
        raise ValueError(f"Invalid --input value, expected IDENT=path, got: {spec}")
    ident, path = spec.split("=", 1)
    ident = ident.strip()
    path = path.strip()
    if not ident:
        raise ValueError(f"Empty identifier in --input value: {spec}")
    if not path:
        raise ValueError(f"Empty path in --input value: {spec}")
    return ident, path


def generate(output: str, inputs: List[Tuple[str, str]]) -> None:
    props: List[str] = []
    for prop_name, path in inputs:
        with open(path, "r", encoding="utf-8") as f:
            content = f.read()
        # Use a C# raw string literal with five quotes. For a valid raw
        # literal, the content must start on a new line and the closing


 ... (clipped 24 lines)

Compliance status legend 🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

Previous compliance checks

Compliance check up to commit 486f308
Security Compliance
Code generation robustness

Description: The generator assumes the JS content never contains five consecutive double quotes and
embeds it in a fixed 5-quote C# raw string literal, which could break code generation if
such a sequence appears, potentially leading to build-time code injection or malformed
output.
generate_resources_tool.py [42-56]

Referred Code
with open(path, "r", encoding="utf-8") as f:
    content = f.read()
# Use a C# raw string literal with five quotes. For a valid raw
# literal, the content must start on a new line and the closing
# quotes must be on their own line as well. We assume the content
# does not contain a sequence of five consecutive double quotes.
#
# Resulting C# will look like:
#   """""
#   <content>
#   """""
literal = '"""""\n' + content + '\n"""""'
props.append(
    f"    internal const string {ident} = {literal};"
)
Ticket Compliance
🟡
🎫 #16600
🟢 Generate C# code at build time to expose atoms/resources as native constant strings
instead of loading embedded assembly resources.
Update .NET WebDriver code to use the generated constants for atoms (e.g., is-displayed,
find-elements, get-attribute, mutation-listener, webdriver_prefs) instead of reading from
resources via reflection.
Integrate generation into the Bazel and .NET build so the generated C# file is produced
and compiled as part of the build.
Reduce dependency on Bazel resource naming conventions by avoiding embedded resource
lookup.
Maintain existing functionality and behavior while improving performance by avoiding
repeated resource loading.
Codebase Duplication Compliance
Codebase context is not defined

Custom Compliance
🟢
Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status: Passed

🔴
Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Weak error handling: The generator reads files and writes output without handling IO errors or validating that
raw string delimiter conflicts cannot occur, risking unhandled exceptions or malformed
generated code.

Referred Code
def generate(output: str, inputs: List[Tuple[str, str]]) -> None:
    props: List[str] = []
    for ident, path in inputs:
        with open(path, "r", encoding="utf-8") as f:
            content = f.read()
        # Use a C# raw string literal with five quotes. For a valid raw
        # literal, the content must start on a new line and the closing
        # quotes must be on their own line as well. We assume the content
        # does not contain a sequence of five consecutive double quotes.
        #
        # Resulting C# will look like:
        #   """""
        #   <content>
        #   """""
        literal = '"""""\n' + content + '\n"""""'
        props.append(
            f"    internal const string {ident} = {literal};"
        )

    lines: List[str] = []
    lines.append("// <auto-generated />")


 ... (clipped 12 lines)

Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
No auditing: The new generator tool and related Bazel rules perform build-time code generation without
any added logging or audit trail, which may be acceptable for build tooling but adds no
audit coverage for runtime critical actions.

Referred Code
#!/usr/bin/env python3
"""Generate C# ResourceUtilities partial class with embedded JS resources.

Usage:
  generate_resources_tool.py --output path/to/ResourceUtilities.g.cs \
      --input Ident1=path/to/file1.js \
      --input Ident2=path/to/file2.js ...

Each identifier becomes a const string in ResourceUtilities class.
The content is emitted as a C# raw string literal using 5-quotes.
"""

import argparse
import os
import sys
from typing import List, Tuple


def parse_args(argv: List[str]) -> argparse.Namespace:
    parser = argparse.ArgumentParser()
    parser.add_argument("--output", required=True)


 ... (clipped 64 lines)

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status:
Direct exceptions: The tool raises raw exceptions with internal details (e.g., invalid arguments) which is
typical for build tools but could expose paths or internals if surfaced to end users;
context of exposure is unclear.

Referred Code
def parse_input_spec(spec: str) -> Tuple[str, str]:
    if "=" not in spec:
        raise ValueError(f"Invalid --input value, expected IDENT=path, got: {spec}")
    ident, path = spec.split("=", 1)
    ident = ident.strip()
    path = path.strip()
    if not ident:
        raise ValueError(f"Empty identifier in --input value: {spec}")
    if not path:
        raise ValueError(f"Empty path in --input value: {spec}")
    return ident, path

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status:
Limited validation: The generator minimally validates identifiers and paths from --input and emits them into
C# without escaping beyond assuming no five-quote sequence, which could break builds if
inputs contain disallowed characters; broader sanitization or escaping may be needed.

Referred Code
    parser = argparse.ArgumentParser()
    parser.add_argument("--output", required=True)
    parser.add_argument("--input", action="append", default=[], help="IDENT=path")
    return parser.parse_args(argv)


def parse_input_spec(spec: str) -> Tuple[str, str]:
    if "=" not in spec:
        raise ValueError(f"Invalid --input value, expected IDENT=path, got: {spec}")
    ident, path = spec.split("=", 1)
    ident = ident.strip()
    path = path.strip()
    if not ident:
        raise ValueError(f"Empty identifier in --input value: {spec}")
    if not path:
        raise ValueError(f"Empty path in --input value: {spec}")
    return ident, path


def generate(output: str, inputs: List[Tuple[str, str]]) -> None:
    props: List[str] = []


 ... (clipped 16 lines)
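
The identifier half of this concern could be handled with a small check before the name is written into the generated C#. The following is a sketch under stated assumptions: check_identifier and is_valid are hypothetical helpers, and the pattern is an ASCII-only approximation of C# identifier rules rather than the full language grammar.

```python
import re

# ASCII-only approximation of a C# identifier. C# itself also allows
# Unicode letters and the @ prefix, and this check does not reject keywords.
_IDENT_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def check_identifier(ident: str) -> str:
    """Return ident unchanged, or raise ValueError if it cannot be a const name."""
    if not _IDENT_PATTERN.fullmatch(ident):
        raise ValueError(f"Not a usable C# identifier: {ident!r}")
    return ident


def is_valid(ident: str) -> bool:
    """Convenience wrapper that reports validity instead of raising."""
    try:
        check_identifier(ident)
        return True
    except ValueError:
        return False
```

With this in place, a value such as find-elements would be rejected at generation time instead of surfacing later as a C# compiler error.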

qodo-merge-pro bot commented Nov 17, 2025

PR Code Suggestions ✨

Explore these optional code suggestions:

Category | Suggestion | Impact
Possible issue
Fix incorrect ZIP archive handling

The code incorrectly treats the webdriver_prefs JSON string as a ZIP archive,
which will cause a runtime failure. This logic is a result of a faulty
refactoring and should be removed, as webdriver_prefs is not a browser
extension.

dotnet/src/webdriver/Firefox/FirefoxExtension.cs [72-76]

-using Stream zipFileStream = new MemoryStream(Encoding.UTF8.GetBytes(ResourceUtilities.webdriver_prefs));
-using (ZipArchive extensionZipArchive = new ZipArchive(zipFileStream, ZipArchiveMode.Read))
+// The webdriver_prefs.json file is not an extension, so it should not be
+// processed here. This logic appears to be a remnant of a refactoring.
+// The preferences from this resource are handled in FirefoxProfile.cs.
+// We can remove this block. If there are other extensions to handle,
+// they should be processed here, but not webdriver_prefs.
+// For now, we can assume this was a mistake and the original file
+// being processed was the webdriver extension, which is now handled
+// differently or is no longer needed.
+// If an actual extension file needs to be processed, it should be
+// loaded from ResourceUtilities and then decompressed.
+// e.g. using Stream zipFileStream = new MemoryStream(Encoding.UTF8.GetBytes(ResourceUtilities.some_extension_xpi));
+// For now, we'll assume this is dead code.
+// If this method is still called for a file that is a real extension,
+// the logic to get its content from ResourceUtilities would be needed.
+// However, `webdriver_prefs` is definitely not a zip file.
+// The call to this method should be investigated.
+// For a minimal fix, we can throw an exception to indicate this misuse.
+if (this.extensionFileName == "webdriver.xpi")
+{
+    // This is a placeholder for what should be done.
+    // The webdriver.xpi is no longer shipped as a resource in this manner.
+    // The preferences are set directly.
+    return;
+}
+
+// Fallback to original file-based logic if it's not the known resource.
+if (!File.Exists(this.extensionFileName))
+{
+    throw new FileNotFoundException("Extension file not found", this.extensionFileName);
+}
+
+using (ZipArchive extensionZipArchive = ZipFile.OpenRead(this.extensionFileName))
 {
     extensionZipArchive.ExtractToDirectory(tempFileName);
 }
Suggestion importance[1-10]: 10

Why: This suggestion correctly identifies a critical bug introduced in the PR. The code attempts to decompress a JSON string as a ZIP archive, which will cause a runtime crash. This is a significant correctness issue that breaks functionality.
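
The underlying point is language-independent and can be checked outside .NET. The sketch below uses Python's zipfile module purely for illustration: the UTF-8 bytes of a JSON document do not form a ZIP archive, so opening them as one fails. The JSON content here is made up and is not the real webdriver_prefs.json.

```python
import io
import json
import zipfile

# Made-up stand-in for the preferences JSON.
prefs_json = json.dumps({"example.pref": True})
json_bytes = prefs_json.encode("utf-8")

# JSON text has no ZIP end-of-central-directory record.
looks_like_zip = zipfile.is_zipfile(io.BytesIO(json_bytes))

# For contrast, build a genuine in-memory archive.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("manifest.txt", "example")
buffer.seek(0)
real_zip = zipfile.is_zipfile(buffer)

# Opening the JSON bytes as an archive raises instead of extracting anything.
try:
    zipfile.ZipFile(io.BytesIO(json_bytes))
    opened = True
except zipfile.BadZipFile:
    opened = False
```
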

High
Align resource name with code

The Bazel target //third_party/js/selenium:webdriver_json will generate a C#
identifier webdriver, but the code expects webdriver_prefs. Rename the target to
align with the code usage and prevent a compilation error.

dotnet/src/webdriver/BUILD.bazel [29-39]

 generated_resource_utilities(
     name = "resource-utilities",
     srcs = [
         "//javascript/atoms/fragments:find-elements.js",
         "//javascript/atoms/fragments:is-displayed.js",
         "//javascript/cdp-support:mutation-listener.js",
         "//javascript/webdriver/atoms:get-attribute.js",
-        "//third_party/js/selenium:webdriver_json",
+        "//third_party/js/selenium:webdriver_prefs_json",
     ],
     out = "ResourceUtilities.g.cs",
 )
Suggestion importance[1-10]: 9

Why: The suggestion correctly identifies a naming mismatch between the generated C# identifier for a resource and its usage in the C# code, which will cause a build failure. The proposed fix to align the Bazel target name is correct and necessary.

High
Dynamically determine quotes for robustness

To prevent potential compilation errors, dynamically determine the number of
quotes for C# raw string literals instead of using a fixed number. This makes
the resource generator more robust by avoiding conflicts with file content.

dotnet/private/generate_resources_tool.py [44-53]

-# Use a C# raw string literal with five quotes. For a valid raw
+# Use a C# raw string literal. For a valid raw
 # literal, the content must start on a new line and the closing
-# quotes must be on their own line as well. We assume the content
-# does not contain a sequence of five consecutive double quotes.
+# quotes must be on their own line as well. We dynamically determine
+# the number of quotes to use to avoid conflicts with the content.
 #
 # Resulting C# will look like:
-#   """""
+#   """..."""
 #   <content>
-#   """""
-literal = '"""""\n' + content + '\n"""""'
+#   """..."""
+quote_count = 5
+while '"' * quote_count in content:
+    quote_count += 1
+quotes = '"' * quote_count
+literal = f'{quotes}\n{content}\n{quotes}'
Suggestion importance[1-10]: 5

Why: The suggestion correctly identifies a potential edge case where the code generation would fail if an input file contains five consecutive double quotes. The proposed fix makes the tool more robust, which is a good improvement.
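
A self-contained variant of the same idea is sketched below. Instead of growing the delimiter in a loop, it measures the longest run of double quotes in the content and uses one more than that, with five as the floor to match the PR's current output. The function name raw_string_literal and the constant MIN_QUOTES are hypothetical.

```python
import re

MIN_QUOTES = 5  # mirrors the PR's fixed five-quote delimiter


def raw_string_literal(content: str) -> str:
    """Wrap content in a C# raw string literal.

    The delimiter is made longer than any run of double quotes in the
    content, so the content cannot terminate the literal early.
    """
    runs = re.findall(r'"+', content)
    longest = max((len(run) for run in runs), default=0)
    quotes = '"' * max(MIN_QUOTES, longest + 1)
    return f"{quotes}\n{content}\n{quotes}"


# Ordinary content keeps the five-quote delimiter.
plain = raw_string_literal('var s = "text";')
# Content with a run of seven quotes gets an eight-quote delimiter.
tricky = raw_string_literal("x" + '"' * 7)
```

Both approaches give the same delimiter length; the regex version simply computes it in one pass.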

Low
Learned best practice
Validate and guard file I/O

Validate that each input file exists and is readable, and surface clear errors.
Also guard output directory resolution and file writes with try/except to
provide actionable messages.

dotnet/private/generate_resources_tool.py [39-71]

 def generate(output: str, inputs: List[Tuple[str, str]]) -> None:
     props: List[str] = []
     for ident, path in inputs:
-        with open(path, "r", encoding="utf-8") as f:
-            content = f.read()
-        ...
-    os.makedirs(os.path.dirname(output), exist_ok=True)
-    with open(output, "w", encoding="utf-8", newline="\n") as f:
-        f.write("\n".join(lines))
+        if not os.path.isfile(path):
+            raise FileNotFoundError(f"Input file not found: {path} for identifier '{ident}'")
+        try:
+            with open(path, "r", encoding="utf-8") as f:
+                content = f.read()
+        except OSError as e:
+            raise RuntimeError(f"Failed to read '{path}': {e}") from e
+        literal = '"""""\n' + content + '\n"""""'
+        props.append(f"    internal const string {ident} = {literal};")
 
+    lines: List[str] = []
+    lines.append("// <auto-generated />")
+    lines.append("namespace OpenQA.Selenium.Internal;")
+    lines.append("")
+    lines.append("internal static partial class ResourceUtilities")
+    lines.append("{")
+    for p in props:
+        lines.append(p)
+    lines.append("}")
+    lines.append("")
+
+    out_dir = os.path.dirname(output)
+    if out_dir:
+        os.makedirs(out_dir, exist_ok=True)
+    try:
+        with open(output, "w", encoding="utf-8", newline="\n") as f:
+            f.write("\n".join(lines))
+    except OSError as e:
+        raise RuntimeError(f"Failed to write output '{output}': {e}") from e
+

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 6

Why: Relevant best practice - Ensure resource cleanup and robust file I/O handling using context managers and targeted validation/fallbacks for external inputs.

Low

@nvborisenko nvborisenko requested a review from shs96c November 17, 2025 20:59

@shs96c shs96c left a comment

Bazel bits look like they should work

@nvborisenko

I am merging this. I see only one future risk: since we embed the content in C# literals delimited by """"", additional escaping may or may not turn out to be required. Plan B: if it does, apply smart escaping or revert this entire PR.

@nvborisenko nvborisenko merged commit 4c0eb7f into SeleniumHQ:trunk Nov 18, 2025
10 checks passed
@nvborisenko nvborisenko deleted the dotnet-atoms-res-gen branch November 18, 2025 18:56

Labels

B-build Includes scripting, bazel and CI integrations C-dotnet .NET Bindings Review effort 3/5

Development

Successfully merging this pull request may close these issues.

[🚀 Feature]: [dotnet] Statically generate atoms as native string instead of resources

3 participants