Conversation

@AlexSkrypnyk (Member) commented on Sep 4, 2025

Summary by CodeRabbit

  • New Features
    • Non-production environments now publish a robots.txt that disallows all crawling (Disallow: /), helping prevent unintended indexing.
  • Bug Fixes
    • Standardized robots.txt formatting and ensured it is served with a text/plain content type.
    • Removed any inadvertent “Allow: /core” directive from responses.
  • Tests
    • Updated test coverage to validate robots.txt content, directives, and response headers across environments.

@coderabbitai bot commented on Sep 4, 2025

Walkthrough

Tests and configuration were updated to change the robots.txt config key from robots_txt.settings to robotstxt.settings and to set robots.txt content to disallow crawling in non-production. Behat tests add checks for content-type and absence of "Allow: /core".

Changes

  • Behat robots.txt checks (tests/behat/features/robotstxt.feature):
    Added assertions: content-type contains "text/plain" and response does not contain "Allow: /core". Existing status and content checks unchanged.
  • PHPUnit environment tests (tests/phpunit/Drupal/EnvironmentSettingsTest.php):
    Updated config key from robots_txt.settings.content to robotstxt.settings.content across 12 tests; changed robots.txt content to "User-agent: *\nDisallow: /".
  • Non-prod robots.txt settings (web/sites/default/includes/modules/settings.robotstxt.php):
    Switched config key to robotstxt.settings; updated content to "User-agent: *\nDisallow: /" with LF newlines for non-production environments (sketched below).
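
To make the settings change concrete, here is a minimal sketch of the non-production override described above. The guard and config assignment mirror the diff reviewed further down; the file docblock is illustrative, and ENVIRONMENT_PROD is assumed to come from the project's environment-detection include.

```php
<?php

/**
 * @file
 * Robots.txt module settings (non-production override).
 */

// Outside production, replace the robots.txt body so crawlers are told to
// skip the site entirely; this prevents preview/dev environments from
// being indexed.
if ($settings['environment'] !== ENVIRONMENT_PROD) {
  // The contrib Robots.txt module reads its body from this config key.
  $config['robotstxt.settings']['content'] = "User-agent: *\nDisallow: /";
}
```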

Sequence Diagram(s)

```mermaid
sequenceDiagram
  autonumber
  actor UA as User Agent
  participant Web as Web Server
  participant App as App (Drupal)
  participant Cfg as Config (settings.robotstxt.php)

  UA->>Web: GET /robots.txt
  Web->>App: Route request
  App->>Cfg: Read robotstxt.settings.content (env-aware)
  Note over Cfg,App: Non-production sets "User-agent: *\nDisallow: /"
  Cfg-->>App: robots.txt content
  App-->>Web: 200 OK, text/plain
  Web-->>UA: robots.txt body
```

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Suggested labels

PR: Needs review

Poem

I twitch my whiskers at robots’ lore,
From prod to preview, “Disallow: /”.
With headers neat and tests that sing,
I hop through configs, tap-tap spring.
Be kind, dear crawlers, pause your tour—
Non-prod gardens are for bunnies’ chore. 🐇

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 phpcs (3.7.2)
tests/phpunit/Drupal/EnvironmentSettingsTest.php

```
ERROR: Referenced sniff "Drupal" does not exist

Run "phpcs --help" for usage information
```

web/sites/default/includes/modules/settings.robotstxt.php

```
ERROR: Referenced sniff "Drupal" does not exist

Run "phpcs --help" for usage information
```



@codecov bot commented on Sep 4, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 77.69%. Comparing base (b2eddff) to head (d008f6d).

Additional details and impacted files
```
@@           Coverage Diff            @@
##           develop    #1981   +/-   ##
========================================
  Coverage    77.69%   77.69%           
========================================
  Files           90       90           
  Lines         5550     5550           
  Branches        44       44           
========================================
  Hits          4312     4312           
  Misses        1238     1238           
```

☔ View full report in Codecov by Sentry.

@AlexSkrypnyk enabled auto-merge (rebase) September 4, 2025 11:17
@AlexSkrypnyk added the AUTOMERGE label Sep 4, 2025
@coderabbitai bot left a comment

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
tests/behat/features/robotstxt.feature (1)

9-9: Prefix scenario title with [TEST] to meet Behat guidelines.

Required by repository guidelines for Behat scenarios.

Apply this diff:

```diff
-  Scenario: Verify robots.txt exists and contains appropriate content in non-production
+  Scenario: [TEST] Verify robots.txt exists and contains appropriate content in non-production
```
tests/phpunit/Drupal/EnvironmentSettingsTest.php (1)

730-766: Explicitly assert no robots override in production tests.

Ensure prod envs don’t set robotstxt override.

For example, add an assertion within each prod test (adjust to your test base API):

```php
// After $this->requireSettingsFile();
$this->assertArrayNotHasKey('robotstxt.settings', $this->config, 'robots.txt override must not be set in PROD');
```

If $this->config isn’t exposed, consider enhancing assertConfig() to verify absence, or add a helper like assertConfigNotSet('robotstxt.settings').

Also applies to: 928-975
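
For illustration, such a helper might look like the sketch below; the $this->config property and the assertion wording are assumptions about the test base class API, not confirmed details of SettingsTestCase:

```php
/**
 * Asserts that a config override was not set by the settings include.
 *
 * Hypothetical helper; adjust to the actual SettingsTestCase API.
 */
protected function assertConfigNotSet(string $name): void {
  $this->assertArrayNotHasKey(
    $name,
    $this->config,
    sprintf('Config override "%s" must not be set in this environment.', $name)
  );
}
```

A production-environment test would then call $this->assertConfigNotSet('robotstxt.settings') after requiring the settings file.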

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between b2eddff and d008f6d.

⛔ Files ignored due to path filters (7)
  • .vortex/installer/tests/Fixtures/install/_baseline/tests/behat/features/robotstxt.feature is excluded by !.vortex/installer/tests/Fixtures/**
  • .vortex/installer/tests/Fixtures/install/_baseline/tests/phpunit/Drupal/EnvironmentSettingsTest.php is excluded by !.vortex/installer/tests/Fixtures/**
  • .vortex/installer/tests/Fixtures/install/_baseline/web/sites/default/includes/modules/settings.robotstxt.php is excluded by !.vortex/installer/tests/Fixtures/**
  • .vortex/installer/tests/Fixtures/install/hosting_acquia/docroot/sites/default/includes/modules/settings.robotstxt.php is excluded by !.vortex/installer/tests/Fixtures/**
  • .vortex/installer/tests/Fixtures/install/hosting_acquia/tests/phpunit/Drupal/EnvironmentSettingsTest.php is excluded by !.vortex/installer/tests/Fixtures/**
  • .vortex/installer/tests/Fixtures/install/hosting_lagoon/tests/phpunit/Drupal/EnvironmentSettingsTest.php is excluded by !.vortex/installer/tests/Fixtures/**
  • .vortex/installer/tests/Fixtures/install/provision_database_lagoon/tests/phpunit/Drupal/EnvironmentSettingsTest.php is excluded by !.vortex/installer/tests/Fixtures/**
📒 Files selected for processing (3)
  • tests/behat/features/robotstxt.feature (1 hunks)
  • tests/phpunit/Drupal/EnvironmentSettingsTest.php (12 hunks)
  • web/sites/default/includes/modules/settings.robotstxt.php (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
tests/behat/features/**/*.feature

📄 CodeRabbit inference engine (CLAUDE.md)

tests/behat/features/**/*.feature: Behat feature files must start with the required user story format: 'As a …', 'I want to …', 'So that …'
Prefix all test content/titles with '[TEST]' in Behat scenarios
Use numbered patterns for repeated test data (e.g., '[TEST] Topic 1', '[TEST] Topic 2') in Behat features
Avoid real names in test data within Behat features (e.g., do not use 'Workshop' or 'Training')
Use descriptive test data titles in Behat features (e.g., '[TEST] Event with All Fields')

Files:

  • tests/behat/features/robotstxt.feature
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
  • GitHub Check: build (1)
  • GitHub Check: build (0)
  • GitHub Check: vortex-test-workflow (4)
  • GitHub Check: vortex-test-installer (8.4)
  • GitHub Check: vortex-test-workflow (2)
  • GitHub Check: vortex-test-workflow (0)
  • GitHub Check: vortex-test-workflow (3)
  • GitHub Check: vortex-test-workflow (1)
  • GitHub Check: vortex-test-deployment (0)
  • GitHub Check: vortex-test-common
  • GitHub Check: vortex-test-installer (8.3)
  • GitHub Check: vortex-test-deployment (1)
  • GitHub Check: vortex-test-installer (8.2)
🔇 Additional comments (3)
tests/behat/features/robotstxt.feature (1)

13-16: Good additions to validate MIME type and no stray core allow-list.

Both assertions align with the new config and harden the check.

web/sites/default/includes/modules/settings.robotstxt.php (1)

11-11: Config key rename and disallow-all content look correct.

Matches Drupal robotstxt module config name and desired non-prod behavior.

tests/phpunit/Drupal/EnvironmentSettingsTest.php (1)

308-308: Behat scenario prefix check failed; correct invocation and re-run

  • rg output confirms only robotstxt.settings usages and no robots_txt.settings remnants.
  • The fd command didn’t execute due to the path separator. Replace it with, for example:

```sh
fd . tests/behat/features -e feature \
  | xargs -I{} rg -nP '^\s*Scenario:\s*(?!\[TEST\])' {} || true
```

    then re-run to ensure no feature files define a Scenario: without the [TEST] prefix.

```gherkin
    Given I am an anonymous user
    When I go to "/robots.txt"
    Then the response status code should be 200
    And the response header "content-type" should contain the value "text/plain"
```

🧹 Nitpick (assertive)

Header name casing nit.

If step defs are case-sensitive, prefer canonical "Content-Type".

```diff
-    And the response header "content-type" should contain the value "text/plain"
+    And the response header "Content-Type" should contain the value "text/plain"
```
🤖 Prompt for AI Agents
In tests/behat/features/robotstxt.feature around line 13, the step uses a
lowercased header name "content-type" which may be case-sensitive; change it to
the canonical "Content-Type" in the step to ensure the step definition matches
and the assertion reliably checks the response header.
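
If the step definition is owned by this project, another option is to make the header assertion case-insensitive so either spelling passes. A sketch assuming a Behat/Mink context class follows; the class name and step wording here are hypothetical, not the project's actual step definitions:

```php
<?php

use Behat\MinkExtension\Context\RawMinkContext;

class ResponseHeaderContext extends RawMinkContext {

  /**
   * @Then the response header :name should contain the value :value
   */
  public function assertResponseHeaderContains(string $name, string $value): void {
    // HTTP header names are case-insensitive (RFC 9110), so normalise
    // keys before looking the header up.
    $headers = array_change_key_case($this->getSession()->getResponseHeaders(), CASE_LOWER);
    $key = strtolower($name);
    if (!isset($headers[$key])) {
      throw new \RuntimeException(sprintf('Header "%s" is not present in the response.', $name));
    }
    // Mink returns each header as an array of values.
    $actual = implode(', ', (array) $headers[$key]);
    if (strpos($actual, $value) === FALSE) {
      throw new \RuntimeException(sprintf('Header "%s" is "%s"; expected it to contain "%s".', $name, $actual, $value));
    }
  }

}
```

Throwing exceptions instead of using PHPUnit asserts avoids a hard dependency; Behat treats any exception thrown in a step as a failure.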

```diff
     $config['environment_indicator.settings']['favicon'] = TRUE;
     $config['environment_indicator.settings']['toolbar_integration'] = [TRUE];
-    $config['robots_txt.settings']['content'] = "User-agent: *\r\nDisallow:";
+    $config['robotstxt.settings']['content'] = "User-agent: *\nDisallow: /";
```

🧹 Nitpick (assertive)

DRY the repeated robots.txt content string across tests.

Define once and reuse to reduce duplication and future churn.

```diff
 class EnvironmentSettingsTest extends SettingsTestCase {
+  private const ROBOTS_TXT_DISALLOW_ALL = "User-agent: *\nDisallow: /";
@@
-    $config['robotstxt.settings']['content'] = "User-agent: *\nDisallow: /";
+    $config['robotstxt.settings']['content'] = self::ROBOTS_TXT_DISALLOW_ALL;
@@
-    $config['robotstxt.settings']['content'] = "User-agent: *\nDisallow: /";
+    $config['robotstxt.settings']['content'] = self::ROBOTS_TXT_DISALLOW_ALL;
@@
-    $config['robotstxt.settings']['content'] = "User-agent: *\nDisallow: /";
+    $config['robotstxt.settings']['content'] = self::ROBOTS_TXT_DISALLOW_ALL;
@@
-    $config['robotstxt.settings']['content'] = "User-agent: *\nDisallow: /";
+    $config['robotstxt.settings']['content'] = self::ROBOTS_TXT_DISALLOW_ALL;
@@
-    $config['robotstxt.settings']['content'] = "User-agent: *\nDisallow: /";
+    $config['robotstxt.settings']['content'] = self::ROBOTS_TXT_DISALLOW_ALL;
@@
-    $config['robotstxt.settings']['content'] = "User-agent: *\nDisallow: /";
+    $config['robotstxt.settings']['content'] = self::ROBOTS_TXT_DISALLOW_ALL;
@@
-    $config['robotstxt.settings']['content'] = "User-agent: *\nDisallow: /";
+    $config['robotstxt.settings']['content'] = self::ROBOTS_TXT_DISALLOW_ALL;
@@
-    $config['robotstxt.settings']['content'] = "User-agent: *\nDisallow: /";
+    $config['robotstxt.settings']['content'] = self::ROBOTS_TXT_DISALLOW_ALL;
@@
-    $config['robotstxt.settings']['content'] = "User-agent: *\nDisallow: /";
+    $config['robotstxt.settings']['content'] = self::ROBOTS_TXT_DISALLOW_ALL;
@@
-    $config['robotstxt.settings']['content'] = "User-agent: *\nDisallow: /";
+    $config['robotstxt.settings']['content'] = self::ROBOTS_TXT_DISALLOW_ALL;
@@
-    $config['robotstxt.settings']['content'] = "User-agent: *\nDisallow: /";
+    $config['robotstxt.settings']['content'] = self::ROBOTS_TXT_DISALLOW_ALL;
```
Also applies to: 379-379, 423-423, 470-470, 520-520, 569-569, 618-618, 660-660, 702-702, 790-790, 842-842, 894-894

🤖 Prompt for AI Agents
In tests/phpunit/Drupal/EnvironmentSettingsTest.php around lines 308 (and also
at 379, 423, 470, 520, 569, 618, 660, 702, 790, 842, 894), the literal
robots.txt content string "User-agent: *\nDisallow: /" is duplicated across
multiple assertions; define a single reusable value (e.g., a private class
constant or a private static property at the top of the test class) and replace
each duplicated literal with that constant/property reference so the string is
declared once and reused across all test lines.

Comment on lines 10 to 12
```diff
 if ($settings['environment'] !== ENVIRONMENT_PROD) {
-  $config['robots_txt.settings']['content'] = "User-agent: *\r\nDisallow:";
+  $config['robotstxt.settings']['content'] = "User-agent: *\nDisallow: /";
 }
```

🛠️ Refactor suggestion

Avoid undefined index notice on $settings['environment'].

Use null-coalescing to prevent PHP warnings if 'environment' isn’t set.

```diff
-if ($settings['environment'] !== ENVIRONMENT_PROD) {
+if (($settings['environment'] ?? null) !== ENVIRONMENT_PROD) {
   $config['robotstxt.settings']['content'] = "User-agent: *\nDisallow: /";
 }
```
🤖 Prompt for AI Agents
In web/sites/default/includes/modules/settings.robotstxt.php around lines 10 to
12, the code accesses $settings['environment'] directly which can trigger an
"undefined index" notice; change the check to use null-coalescing (e.g.
($settings['environment'] ?? null) !== ENVIRONMENT_PROD) or assign $env =
$settings['environment'] ?? null and compare $env to ENVIRONMENT_PROD so the
warning is avoided while preserving the same logic.

@AlexSkrypnyk merged commit bb55324 into develop Sep 4, 2025
31 checks passed
@AlexSkrypnyk deleted the feature/fix-robots-txt-settings branch September 4, 2025 11:31
@github-project-automation bot moved this from BACKLOG to Release queue in Vortex Sep 4, 2025
@AlexSkrypnyk moved this from Release queue to Released in 25.8.0 in Vortex Sep 5, 2025
@AlexSkrypnyk added this to the 25.8.0 milestone Sep 5, 2025

Labels

AUTOMERGE: Pull request has been approved and set to automerge
