[dotnet] Generate atoms statically #16608

nvborisenko · 2025-11-17T20:55:02Z

User description

Generate atoms as a compilation unit instead of embedded assembly resources.

🔗 Related Issues

Fixes #16600

💥 What does this PR do?

Created a private Bazel rule to generate atoms into the internal ResourceUtilities.cs
Use generated strings

💡 Additional Considerations

🔄 Types of changes

Cleanup (formatting, renaming)

PR Type

Enhancement, Tests

Description

Generate JavaScript atoms as static C# strings instead of embedded resources
Create Bazel rule and Python tool to compile atoms into ResourceUtilities partial class
Replace runtime resource loading with direct string references throughout codebase
Remove embedded resource dependencies from build configuration
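
The shape of the generated file can be sketched roughly as follows. This is a minimal illustration pieced together from the generator excerpts quoted later in this thread, not the PR's actual tool: `render_resource_class` and the `SampleAtom` input are hypothetical names, and the sketch keeps the generator's stated assumption that no input contains five consecutive double quotes.

```python
from typing import Dict


def render_resource_class(resources: Dict[str, str]) -> str:
    """Render a C# partial class exposing each resource as a const string.

    Hypothetical helper: assumes no resource contains five consecutive
    double quotes, mirroring the assumption stated in the PR's generator.
    """
    lines = [
        "// <auto-generated />",
        "namespace OpenQA.Selenium.Internal;",
        "",
        "internal static partial class ResourceUtilities",
        "{",
    ]
    for ident, content in resources.items():
        # Raw string literal: the content starts on its own line and the
        # closing delimiter sits on its own line.
        literal = '"""""\n' + content + '\n"""""'
        lines.append(f"    internal const string {ident} = {literal};")
    lines.append("}")
    lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    # "SampleAtom" is an illustrative identifier, not one from the PR.
    print(render_resource_class({"SampleAtom": "function(){return 1;}"}))
```

Because the output is an ordinary C# source file, callers can reference a constant such as `ResourceUtilities.SampleAtom` directly, with no resource lookup at runtime.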

Diagram Walkthrough

flowchart LR
  JS["JavaScript Atoms<br/>find-elements.js<br/>is-displayed.js<br/>get-attribute.js<br/>mutation-listener.js<br/>webdriver_prefs.json"]
  Tool["Python Generator Tool<br/>generate_resources_tool.py"]
  Bazel["Bazel Rule<br/>generated_resource_utilities"]
  PartialClass["ResourceUtilities.g.cs<br/>Partial Class with<br/>const string properties"]
  CSharp["C# Source Files<br/>FirefoxProfile.cs<br/>JavaScriptEngine.cs<br/>WebElement.cs<br/>RelativeBy.cs"]
  
  JS -->|Input| Tool
  Tool -->|Generates| PartialClass
  Bazel -->|Orchestrates| Tool
  PartialClass -->|Compile-time<br/>inclusion| CSharp

File Walkthrough

Relevant files

Build

6 files

generate_resources.bzl `New Bazel rule for resource generation`	+48/-0
generate_resources_tool.py `Python tool to generate C# resource class`	+85/-0
BUILD.bazel `Define py_binary for resource generator`	+6/-0
defs.bzl `Export generated_resource_utilities rule`	+2/-0
BUILD.bazel `Replace embedded resources with generated class`	+22/-29
Selenium.WebDriver.csproj `Update build targets to generate resources`	+4/-23

Enhancement

6 files

ResourceUtilities.cs `Convert to partial class for code generation`	+1/-1
FirefoxProfile.cs `Use generated webdriver_prefs string constant`	+4/-8
FirefoxExtension.cs `Remove resource loading, use generated string`	+2/-22
JavaScriptEngine.cs `Replace mutation listener resource with constant`	+1/-17
RelativeBy.cs `Use generated find_elements string constant`	+1/-10
WebElement.cs `Replace atom resource loading with constants`	+2/-19

qodo-merge-pro · 2025-11-17T20:55:35Z

PR Compliance Guide 🔍

(Compliance updated until commit `e2cac36`)

Below is a summary of compliance checks for this PR:

Security Compliance

⚪

Code generation robustness

Description: The generator assumes input content never contains five consecutive double quotes and
embeds it in a raw string literal with fixed five-quote delimiter, which can break code
generation if such a sequence exists; inputs should be validated or the delimiter should
be dynamically increased to ensure safe embedding.
generate_resources_tool.py [47-59]

Referred Code

with open(path, "r", encoding="utf-8") as f:
    content = f.read()
# Use a C# raw string literal with five quotes. For a valid raw
# literal, the content must start on a new line and the closing
# quotes must be on their own line as well. We assume the content
# does not contain a sequence of five consecutive double quotes.
#
# Resulting C# will look like:
#   """""
#   <content>
#   """""
literal = '"""""\n' + content + '\n"""""'
props.append(f"    internal const string {prop_name} = {literal};")
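
One way to drop the fixed-delimiter assumption is to size the delimiter from the content itself. The sketch below is an assumption about how that could look, not code from the PR; `to_raw_string_literal` is a hypothetical helper. It relies on the C# rule that a raw string literal's delimiter must be longer than any run of double quotes inside the content.

```python
import re


def to_raw_string_literal(content: str) -> str:
    """Wrap content in a C# raw string literal with a safe delimiter length."""
    # Longest run of consecutive double quotes anywhere in the content.
    longest_run = max(
        (len(match.group()) for match in re.finditer(r'"+', content)),
        default=0,
    )
    # Keep five quotes as the floor (matching the PR) and always exceed
    # the longest run found in the content.
    quote_count = max(5, longest_run + 1)
    quotes = '"' * quote_count
    return f"{quotes}\n{content}\n{quotes}"
```

For typical atoms the result is identical to the current five-quote output; the delimiter only grows when an input actually contains a run of five or more quotes.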

Logic error handling

Description: Constructing a ZipArchive directly from UTF-8 bytes of WebDriverPrefsJson treats JSON as a
zip file, which will throw if content is not a valid ZIP and may indicate a logic error;
if user-controlled, it could lead to exceptions—needs validation that the data is a ZIP
archive or adjust logic to parse JSON instead.
FirefoxExtension.cs [72-75]

Referred Code

using Stream zipFileStream = new MemoryStream(Encoding.UTF8.GetBytes(ResourceUtilities.WebDriverPrefsJson));
using (ZipArchive extensionZipArchive = new ZipArchive(zipFileStream, ZipArchiveMode.Read))
{
    extensionZipArchive.ExtractToDirectory(tempFileName);
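
The concern can be illustrated outside of C#. The Python sketch below uses a made-up JSON payload as a stand-in for the preferences text and shows that UTF-8 bytes of a JSON document are not recognized as a ZIP archive, while a genuine in-memory archive is. It does not establish whether the PR's code path is actually reached at runtime.

```python
import io
import json
import zipfile

# Hypothetical stand-in for the JSON preferences text.
prefs_json = json.dumps({"frozen": {}, "mutable": {}})
json_stream = io.BytesIO(prefs_json.encode("utf-8"))

# A real archive built in memory for comparison.
zip_stream = io.BytesIO()
with zipfile.ZipFile(zip_stream, "w") as archive:
    archive.writestr("manifest.json", "{}")
zip_stream.seek(0)

# is_zipfile looks for the ZIP end-of-central-directory record.
json_is_zip = zipfile.is_zipfile(json_stream)
archive_is_zip = zipfile.is_zipfile(zip_stream)
print(json_is_zip, archive_is_zip)
```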

Ticket Compliance

🟡

🎫 #16600

🟢	Stop loading atoms and other resources from embedded assembly resources at runtime.
	Generate a C# source file at build time that contains static string constants for required atoms/resources.
	Use the generated static strings in place of ResourceUtilities.GetResourceStream calls across the codebase.
	Reduce dependency on Bazel resource naming by generating compile-time constants.
⚪	Improve performance by avoiding repeated assembly resource lookups.

Codebase Duplication Compliance

⚪

Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance

🟢

Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status: Passed

🔴

Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Weak error handling: The generator reads inputs and writes output without try/except around file operations and
lacks actionable error messages for I/O failures or empty input cases.

Referred Code

def generate(output: str, inputs: List[Tuple[str, str]]) -> None:
    props: List[str] = []
    for prop_name, path in inputs:
        with open(path, "r", encoding="utf-8") as f:
            content = f.read()
        # Use a C# raw string literal with five quotes. For a valid raw
        # literal, the content must start on a new line and the closing
        # quotes must be on their own line as well. We assume the content
        # does not contain a sequence of five consecutive double quotes.
        #
        # Resulting C# will look like:
        #   """""
        #   <content>
        #   """""
        literal = '"""""\n' + content + '\n"""""'
        props.append(f"    internal const string {prop_name} = {literal};")

    lines: List[str] = []
    lines.append("// <auto-generated />")
    lines.append("namespace OpenQA.Selenium.Internal;")
    lines.append("")


 ... (clipped 21 lines)
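
A hedged sketch of what more actionable I/O handling might look like for the read side. `read_inputs` is a hypothetical helper, not part of the PR; it assumes the same `(identifier, path)` pairs that `generate` receives and wraps low-level errors with the identifier and path that triggered them.

```python
from typing import List, Tuple


def read_inputs(inputs: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Read every input file, returning (identifier, content) pairs."""
    if not inputs:
        # An empty input list would otherwise produce an empty class silently.
        raise ValueError("No --input values were provided; nothing to generate")
    loaded: List[Tuple[str, str]] = []
    for ident, path in inputs:
        try:
            with open(path, "r", encoding="utf-8") as f:
                loaded.append((ident, f.read()))
        except OSError as e:
            # Re-raise with the identifier and path so a build log points
            # at the offending input.
            raise RuntimeError(
                f"Failed to read '{path}' for identifier '{ident}': {e}"
            ) from e
    return loaded
```

Since this runs as a build action, failing fast with a descriptive message is probably more useful than any attempt at graceful degradation.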

⚪

Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
Lack of logging: Newly added generation and build steps perform critical actions (file reads/writes and
code generation) without emitting any audit logs, making it hard to trace failures or
actions.

Referred Code

def generate(output: str, inputs: List[Tuple[str, str]]) -> None:
    props: List[str] = []
    for prop_name, path in inputs:
        with open(path, "r", encoding="utf-8") as f:
            content = f.read()
        # Use a C# raw string literal with five quotes. For a valid raw
        # literal, the content must start on a new line and the closing
        # quotes must be on their own line as well. We assume the content
        # does not contain a sequence of five consecutive double quotes.
        #
        # Resulting C# will look like:
        #   """""
        #   <content>
        #   """""
        literal = '"""""\n' + content + '\n"""""'
        props.append(f"    internal const string {prop_name} = {literal};")

    lines: List[str] = []
    lines.append("// <auto-generated />")
    lines.append("namespace OpenQA.Selenium.Internal;")
    lines.append("")


 ... (clipped 11 lines)

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status:
Input validation gaps: While IDENT=path parsing is validated, there is no explicit check for file
existence/emptiness or content constraints (e.g., five-quote sequences), which may result
in malformed generated code.

Referred Code

def parse_input_spec(spec: str) -> Tuple[str, str]:
    if "=" not in spec:
        raise ValueError(f"Invalid --input value, expected IDENT=path, got: {spec}")
    ident, path = spec.split("=", 1)
    ident = ident.strip()
    path = path.strip()
    if not ident:
        raise ValueError(f"Empty identifier in --input value: {spec}")
    if not path:
        raise ValueError(f"Empty path in --input value: {spec}")
    return ident, path


def generate(output: str, inputs: List[Tuple[str, str]]) -> None:
    props: List[str] = []
    for prop_name, path in inputs:
        with open(path, "r", encoding="utf-8") as f:
            content = f.read()
        # Use a C# raw string literal with five quotes. For a valid raw
        # literal, the content must start on a new line and the closing


 ... (clipped 24 lines)
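
Since the identifier is emitted verbatim into C# source, one extra check could reject names that would not compile. The sketch below is an assumption, not the PR's code: `validate_identifier` is a hypothetical helper, the pattern is deliberately ASCII-only (C# itself accepts a wider set of Unicode identifiers), and the keyword set is a small non-exhaustive sample.

```python
import re

# Conservative ASCII-only pattern; stricter than the C# language itself.
_IDENT_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

# Small, non-exhaustive sample of C# reserved words.
_RESERVED = {"class", "const", "internal", "namespace", "static", "string"}


def validate_identifier(ident: str) -> str:
    """Return ident unchanged if it is safe to emit as a C# const name."""
    if not _IDENT_RE.match(ident):
        raise ValueError(f"Not a valid C# identifier: {ident!r}")
    if ident in _RESERVED:
        raise ValueError(f"Identifier is a C# reserved word: {ident!r}")
    return ident
```

A call such as `validate_identifier("find-elements")` raises, which would surface a naming problem in the generator rather than later as a C# compile error.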

Compliance status legend

🟢 - Fully Compliant
🟡 - Partially Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

Previous compliance checks

Compliance check up to commit 486f308

Security Compliance

⚪

Code generation robustness

Description: The generator assumes the JS content never contains five consecutive double quotes and
embeds it in a fixed 5-quote C# raw string literal, which could break code generation if
such a sequence appears, potentially leading to build-time code injection or malformed
output.
generate_resources_tool.py [42-56]

Referred Code

with open(path, "r", encoding="utf-8") as f:
    content = f.read()
# Use a C# raw string literal with five quotes. For a valid raw
# literal, the content must start on a new line and the closing
# quotes must be on their own line as well. We assume the content
# does not contain a sequence of five consecutive double quotes.
#
# Resulting C# will look like:
#   """""
#   <content>
#   """""
literal = '"""""\n' + content + '\n"""""'
props.append(
    f"    internal const string {ident} = {literal};"
)

Ticket Compliance

🟡

🎫 #16600

🟢	Generate C# code at build time to expose atoms/resources as native constant strings instead of loading embedded assembly resources.
	Update .NET WebDriver code to use the generated constants for atoms (e.g., is-displayed, find-elements, get-attribute, mutation-listener, webdriver_prefs) instead of reading from resources via reflection.
	Integrate generation into the Bazel and .NET build so the generated C# file is produced and compiled as part of the build.
	Reduce dependency on Bazel resource naming conventions by avoiding embedded resource lookup.
⚪	Maintain existing functionality and behavior while improving performance by avoiding repeated resource loading.

Codebase Duplication Compliance

⚪

Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance

🟢

Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status: Passed

🔴

Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Weak error handling: The generator reads files and writes output without handling IO errors or validating that
raw string delimiter conflicts cannot occur, risking unhandled exceptions or malformed
generated code.

Referred Code

def generate(output: str, inputs: List[Tuple[str, str]]) -> None:
    props: List[str] = []
    for ident, path in inputs:
        with open(path, "r", encoding="utf-8") as f:
            content = f.read()
        # Use a C# raw string literal with five quotes. For a valid raw
        # literal, the content must start on a new line and the closing
        # quotes must be on their own line as well. We assume the content
        # does not contain a sequence of five consecutive double quotes.
        #
        # Resulting C# will look like:
        #   """""
        #   <content>
        #   """""
        literal = '"""""\n' + content + '\n"""""'
        props.append(
            f"    internal const string {ident} = {literal};"
        )

    lines: List[str] = []
    lines.append("// <auto-generated />")


 ... (clipped 12 lines)

⚪

Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
No auditing: The new generator tool and related Bazel rules perform build-time code generation without
any added logging or audit trail, which may be acceptable for build tooling but adds no
audit coverage for runtime critical actions.

Referred Code

#!/usr/bin/env python3
"""Generate C# ResourceUtilities partial class with embedded JS resources.

Usage:
  generate_resources_tool.py --output path/to/ResourceUtilities.g.cs \
      --input Ident1=path/to/file1.js \
      --input Ident2=path/to/file2.js ...

Each identifier becomes a const string in ResourceUtilities class.
The content is emitted as a C# raw string literal using 5-quotes.
"""

import argparse
import os
import sys
from typing import List, Tuple


def parse_args(argv: List[str]) -> argparse.Namespace:
    parser = argparse.ArgumentParser()
    parser.add_argument("--output", required=True)


 ... (clipped 64 lines)

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status:
Direct exceptions: The tool raises raw exceptions with internal details (e.g., invalid arguments) which is
typical for build tools but could expose paths or internals if surfaced to end users;
context of exposure is unclear.

Referred Code

def parse_input_spec(spec: str) -> Tuple[str, str]:
    if "=" not in spec:
        raise ValueError(f"Invalid --input value, expected IDENT=path, got: {spec}")
    ident, path = spec.split("=", 1)
    ident = ident.strip()
    path = path.strip()
    if not ident:
        raise ValueError(f"Empty identifier in --input value: {spec}")
    if not path:
        raise ValueError(f"Empty path in --input value: {spec}")
    return ident, path

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status:
Limited validation: The generator minimally validates identifiers and paths from --input and emits them into
C# without escaping beyond assuming no five-quote sequence, which could break builds if
inputs contain disallowed characters; broader sanitization or escaping may be needed.

Referred Code

    parser = argparse.ArgumentParser()
    parser.add_argument("--output", required=True)
    parser.add_argument("--input", action="append", default=[], help="IDENT=path")
    return parser.parse_args(argv)


def parse_input_spec(spec: str) -> Tuple[str, str]:
    if "=" not in spec:
        raise ValueError(f"Invalid --input value, expected IDENT=path, got: {spec}")
    ident, path = spec.split("=", 1)
    ident = ident.strip()
    path = path.strip()
    if not ident:
        raise ValueError(f"Empty identifier in --input value: {spec}")
    if not path:
        raise ValueError(f"Empty path in --input value: {spec}")
    return ident, path


def generate(output: str, inputs: List[Tuple[str, str]]) -> None:
    props: List[str] = []


 ... (clipped 16 lines)

dotnet/src/webdriver/RelativeBy.cs

qodo-merge-pro · 2025-11-17T20:57:08Z

PR Code Suggestions ✨

Explore these optional code suggestions:

Possible issue

Fix incorrect ZIP archive handling (Impact: High)

The code incorrectly treats the `webdriver_prefs` JSON string as a ZIP archive, which will cause a runtime failure. This logic is a result of a faulty refactoring and should be removed, as `webdriver_prefs` is not a browser extension.
dotnet/src/webdriver/Firefox/FirefoxExtension.cs [72-76]

-using Stream zipFileStream = new MemoryStream(Encoding.UTF8.GetBytes(ResourceUtilities.webdriver_prefs));
-using (ZipArchive extensionZipArchive = new ZipArchive(zipFileStream, ZipArchiveMode.Read))
+// The webdriver_prefs.json file is not an extension, so it should not be
+// processed here. This logic appears to be a remnant of a refactoring.
+// The preferences from this resource are handled in FirefoxProfile.cs.
+// We can remove this block. If there are other extensions to handle,
+// they should be processed here, but not webdriver_prefs.
+// For now, we can assume this was a mistake and the original file
+// being processed was the webdriver extension, which is now handled
+// differently or is no longer needed.
+// If an actual extension file needs to be processed, it should be
+// loaded from ResourceUtilities and then decompressed.
+// e.g. using Stream zipFileStream = new MemoryStream(Encoding.UTF8.GetBytes(ResourceUtilities.some_extension_xpi));
+// For now, we'll assume this is dead code.
+// If this method is still called for a file that is a real extension,
+// the logic to get its content from ResourceUtilities would be needed.
+// However, `webdriver_prefs` is definitely not a zip file.
+// The call to this method should be investigated.
+// For a minimal fix, we can throw an exception to indicate this misuse.
+if (this.extensionFileName == "webdriver.xpi")
+{
+    // This is a placeholder for what should be done.
+    // The webdriver.xpi is no longer shipped as a resource in this manner.
+    // The preferences are set directly.
+    return;
+}
+
+// Fallback to original file-based logic if it's not the known resource.
+if (!File.Exists(this.extensionFileName))
+{
+    throw new FileNotFoundException("Extension file not found", this.extensionFileName);
+}
+
+using (ZipArchive extensionZipArchive = ZipFile.OpenRead(this.extensionFileName))
 {
     extensionZipArchive.ExtractToDirectory(tempFileName);
 }

Suggestion importance [1-10]: 10
Why: This suggestion correctly identifies a critical bug introduced in the PR. The code attempts to decompress a JSON string as a ZIP archive, which will cause a runtime crash. This is a significant correctness issue that breaks functionality.

Align resource name with code (Impact: High)

The Bazel target `//third_party/js/selenium:webdriver_json` will generate a C# identifier `webdriver`, but the code expects `webdriver_prefs`. Rename the target to align with the code usage and prevent a compilation error.
dotnet/src/webdriver/BUILD.bazel [29-39]

 generated_resource_utilities(
     name = "resource-utilities",
     srcs = [
         "//javascript/atoms/fragments:find-elements.js",
         "//javascript/atoms/fragments:is-displayed.js",
         "//javascript/cdp-support:mutation-listener.js",
         "//javascript/webdriver/atoms:get-attribute.js",
-        "//third_party/js/selenium:webdriver_json",
+        "//third_party/js/selenium:webdriver_prefs_json",
     ],
     out = "ResourceUtilities.g.cs",
 )

Suggestion importance [1-10]: 9
Why: The suggestion correctly identifies a naming mismatch between the generated C# identifier for a resource and its usage in the C# code, which will cause a build failure. The proposed fix to align the Bazel target name is correct and necessary.

Dynamically determine quotes for robustness (Impact: Low)

To prevent potential compilation errors, dynamically determine the number of quotes for C# raw string literals instead of using a fixed number. This makes the resource generator more robust by avoiding conflicts with file content.
dotnet/private/generate_resources_tool.py [44-53]

-# Use a C# raw string literal with five quotes. For a valid raw
+# Use a C# raw string literal. For a valid raw
 # literal, the content must start on a new line and the closing
-# quotes must be on their own line as well. We assume the content
-# does not contain a sequence of five consecutive double quotes.
+# quotes must be on their own line as well. We dynamically determine
+# the number of quotes to use to avoid conflicts with the content.
 #
 # Resulting C# will look like:
-#   """""
+#   """..."""
 #   <content>
-#   """""
-literal = '"""""\n' + content + '\n"""""'
+#   """..."""
+quote_count = 5
+while '"' * quote_count in content:
+    quote_count += 1
+quotes = '"' * quote_count
+literal = f'{quotes}\n{content}\n{quotes}'

Suggestion importance [1-10]: 5
Why: The suggestion correctly identifies a potential edge case where the code generation would fail if an input file contains five consecutive double quotes. The proposed fix makes the tool more robust, which is a good improvement.

Learned best practice

Validate and guard file I/O (Impact: Low)

Validate that each input file exists and is readable, and surface clear errors. Also guard output directory resolution and file writes with try/except to provide actionable messages.
dotnet/private/generate_resources_tool.py [39-71]

 def generate(output: str, inputs: List[Tuple[str, str]]) -> None:
     props: List[str] = []
     for ident, path in inputs:
-        with open(path, "r", encoding="utf-8") as f:
-            content = f.read()
-        ...
-    os.makedirs(os.path.dirname(output), exist_ok=True)
-    with open(output, "w", encoding="utf-8", newline="\n") as f:
-        f.write("\n".join(lines))
+        if not os.path.isfile(path):
+            raise FileNotFoundError(f"Input file not found: {path} for identifier '{ident}'")
+        try:
+            with open(path, "r", encoding="utf-8") as f:
+                content = f.read()
+        except OSError as e:
+            raise RuntimeError(f"Failed to read '{path}': {e}") from e
+        literal = '"""""\n' + content + '\n"""""'
+        props.append(f"    internal const string {ident} = {literal};")

+    lines: List[str] = []
+    lines.append("// <auto-generated />")
+    lines.append("namespace OpenQA.Selenium.Internal;")
+    lines.append("")
+    lines.append("internal static partial class ResourceUtilities")
+    lines.append("{")
+    for p in props:
+        lines.append(p)
+    lines.append("}")
+    lines.append("")
+
+    out_dir = os.path.dirname(output)
+    if out_dir:
+        os.makedirs(out_dir, exist_ok=True)
+    try:
+        with open(output, "w", encoding="utf-8", newline="\n") as f:
+            f.write("\n".join(lines))
+    except OSError as e:
+        raise RuntimeError(f"Failed to write output '{output}': {e}") from e
+

`[To ensure code accuracy, apply this suggestion manually]`

Suggestion importance [1-10]: 6
Why: Relevant best practice - Ensure resource cleanup and robust file I/O handling using context managers and targeted validation/fallbacks for external inputs.

dotnet/src/webdriver/WebElement.cs

dotnet/private/generate_resources.bzl

shs96c

Bazel bits look like they should work

dotnet/private/generate_resources.bzl

dotnet/private/BUILD.bazel

nvborisenko · 2025-11-18T18:35:20Z

I am merging this. The only risk I see for the future: because we wrap the content in C# raw string literals delimited by """"", additional escaping might eventually be required, or it might not. Plan B: if that turns out to be the case, apply smarter escaping or revert this entire PR.

nvborisenko added 3 commits November 17, 2025 23:05

generate partial class

8aa9c86

clean bazel

fbe666d

Use gen strings in runtime

486f308

selenium-ci added the C-dotnet (.NET Bindings) and B-build (Includes scripting, bazel and CI integrations) labels Nov 17, 2025

qodo-merge-pro bot added the Review effort 3/5 label Nov 17, 2025