Fixed robotstxt settings and test. #1981
Conversation
Walkthrough

Tests and configuration were updated to change the robots.txt config key from `robots_txt.settings` to `robotstxt.settings` and to set the robots.txt content to disallow crawling in non-production. Behat tests add checks for the content type and the absence of "Allow: /core".
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    actor UA as User Agent
    participant Web as Web Server
    participant App as App (Drupal)
    participant Cfg as Config (settings.robotstxt.php)
    UA->>Web: GET /robots.txt
    Web->>App: Route request
    App->>Cfg: Read robotstxt.settings.content (env-aware)
    Note over Cfg,App: Non-production sets "User-agent: *\nDisallow: /"
    Cfg-->>App: robots.txt content
    App-->>Web: 200 OK, text/plain
    Web-->>UA: robots.txt body
```
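For reference, a minimal sketch of the environment-aware override the diagram describes, assuming `ENVIRONMENT_PROD` and `$settings['environment']` are defined by the project's environment-detection settings include (the `?? null` guard follows a reviewer suggestion further down; this is a sketch, not the verbatim committed file):

```php
<?php

/**
 * @file
 * Robots.txt module settings (sketch).
 */

declare(strict_types=1);

// Outside production, serve a robots.txt that blocks all crawlers.
// ENVIRONMENT_PROD and $settings['environment'] are assumed to be
// provided by the project's environment-detection settings include.
if (($settings['environment'] ?? null) !== ENVIRONMENT_PROD) {
  $config['robotstxt.settings']['content'] = "User-agent: *\nDisallow: /";
}
```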
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it is a critical failure.

🔧 phpcs (3.7.2)

- tests/phpunit/Drupal/EnvironmentSettingsTest.php — ERROR: Referenced sniff "Drupal" does not exist. Run "phpcs --help" for usage information.
- web/sites/default/includes/modules/settings.robotstxt.php — ERROR: Referenced sniff "Drupal" does not exist. Run "phpcs --help" for usage information.
Codecov Report

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files:

```
@@           Coverage Diff            @@
##           develop    #1981   +/-   ##
========================================
  Coverage    77.69%   77.69%
========================================
  Files           90       90
  Lines         5550     5550
  Branches        44       44
========================================
  Hits          4312     4312
  Misses        1238     1238
```

☔ View full report in Codecov by Sentry.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)

tests/behat/features/robotstxt.feature (1)

9-9: Prefix scenario title with [TEST] to meet Behat guidelines.

Required by repository guidelines for Behat scenarios. Apply this diff:

```diff
- Scenario: Verify robots.txt exists and contains appropriate content in non-production
+ Scenario: [TEST] Verify robots.txt exists and contains appropriate content in non-production
```

tests/phpunit/Drupal/EnvironmentSettingsTest.php (1)

730-766: Explicitly assert no robots override in production tests.

Ensure prod environments don't set the robotstxt override. For example, add an assertion within each prod test (adjust to your test base API):

```php
// After $this->requireSettingsFile();
$this->assertArrayNotHasKey('robotstxt.settings', $this->config, 'robots.txt override must not be set in PROD');
```

Also applies to: 928-975

If `$this->config` isn't exposed, consider enhancing `assertConfig()` to verify absence, or add a helper like `assertConfigNotSet('robotstxt.settings')`.
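A minimal sketch of such a helper, assuming the base class exposes the collected overrides as a `$this->config` array (a hypothetical property name; adjust to the real SettingsTestCase API):

```php
/**
 * Asserts that a config override is absent.
 *
 * @param string $name
 *   The config object name, e.g. 'robotstxt.settings'.
 */
protected function assertConfigNotSet(string $name): void {
  // $this->config is assumed to hold the overrides collected after
  // requireSettingsFile(); rename to match the actual test base class.
  $this->assertArrayNotHasKey($name, $this->config, sprintf('Config override "%s" must not be set.', $name));
}
```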
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (7)

- `.vortex/installer/tests/Fixtures/install/_baseline/tests/behat/features/robotstxt.feature` is excluded by `!.vortex/installer/tests/Fixtures/**`
- `.vortex/installer/tests/Fixtures/install/_baseline/tests/phpunit/Drupal/EnvironmentSettingsTest.php` is excluded by `!.vortex/installer/tests/Fixtures/**`
- `.vortex/installer/tests/Fixtures/install/_baseline/web/sites/default/includes/modules/settings.robotstxt.php` is excluded by `!.vortex/installer/tests/Fixtures/**`
- `.vortex/installer/tests/Fixtures/install/hosting_acquia/docroot/sites/default/includes/modules/settings.robotstxt.php` is excluded by `!.vortex/installer/tests/Fixtures/**`
- `.vortex/installer/tests/Fixtures/install/hosting_acquia/tests/phpunit/Drupal/EnvironmentSettingsTest.php` is excluded by `!.vortex/installer/tests/Fixtures/**`
- `.vortex/installer/tests/Fixtures/install/hosting_lagoon/tests/phpunit/Drupal/EnvironmentSettingsTest.php` is excluded by `!.vortex/installer/tests/Fixtures/**`
- `.vortex/installer/tests/Fixtures/install/provision_database_lagoon/tests/phpunit/Drupal/EnvironmentSettingsTest.php` is excluded by `!.vortex/installer/tests/Fixtures/**`
📒 Files selected for processing (3)

- `tests/behat/features/robotstxt.feature` (1 hunks)
- `tests/phpunit/Drupal/EnvironmentSettingsTest.php` (12 hunks)
- `web/sites/default/includes/modules/settings.robotstxt.php` (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
tests/behat/features/**/*.feature
📄 CodeRabbit inference engine (CLAUDE.md)
tests/behat/features/**/*.feature:
- Behat feature files must start with the required user story format: 'As a …', 'I want to …', 'So that …'
- Prefix all test content/titles with '[TEST]' in Behat scenarios
- Use numbered patterns for repeated test data (e.g., '[TEST] Topic 1', '[TEST] Topic 2') in Behat features
- Avoid real names in test data within Behat features (e.g., do not use 'Workshop' or 'Training')
- Use descriptive test data titles in Behat features (e.g., '[TEST] Event with All Fields')
Files:
tests/behat/features/robotstxt.feature
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
- GitHub Check: build (1)
- GitHub Check: build (0)
- GitHub Check: vortex-test-workflow (4)
- GitHub Check: vortex-test-installer (8.4)
- GitHub Check: vortex-test-workflow (2)
- GitHub Check: vortex-test-workflow (0)
- GitHub Check: vortex-test-workflow (3)
- GitHub Check: vortex-test-workflow (1)
- GitHub Check: vortex-test-deployment (0)
- GitHub Check: vortex-test-common
- GitHub Check: vortex-test-installer (8.3)
- GitHub Check: vortex-test-deployment (1)
- GitHub Check: vortex-test-installer (8.2)
🔇 Additional comments (3)
tests/behat/features/robotstxt.feature (1)
13-16: Good additions to validate MIME type and no stray core allow-list. Both assertions align with the new config and harden the check.
web/sites/default/includes/modules/settings.robotstxt.php (1)
11-11: Config key rename and disallow-all content look correct. Matches the Drupal robotstxt module config name and the desired non-prod behavior.
tests/phpunit/Drupal/EnvironmentSettingsTest.php (1)
308-308: Behat scenario prefix check failed; correct the invocation and re-run.

- RG output confirms only `robotstxt.settings` usages and no `robots_txt.settings` remnants.
- The `fd` command didn't execute due to the path separator. Replace it with, for example:

```sh
fd . tests/behat/features -e feature \
  | xargs -I{} rg -nP '^\s*Scenario:\s*(?!\[TEST\])' {} || true
```

then re-run to ensure no feature files define a `Scenario:` without the `[TEST]` prefix.
The relevant hunk in tests/behat/features/robotstxt.feature:

```gherkin
Given I am an anonymous user
When I go to "/robots.txt"
Then the response status code should be 200
And the response header "content-type" should contain the value "text/plain"
```
🧹 Nitpick (assertive)

Header name casing nit. If step defs are case-sensitive, prefer the canonical "Content-Type".

```diff
- And the response header "content-type" should contain the value "text/plain"
+ And the response header "Content-Type" should contain the value "text/plain"
```

📝 Committable suggestion

‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```gherkin
And the response header "Content-Type" should contain the value "text/plain"
```
🤖 Prompt for AI Agents
In tests/behat/features/robotstxt.feature around line 13, the step uses a
lowercased header name "content-type" which may be case-sensitive; change it to
the canonical "Content-Type" in the step to ensure the step definition matches
and the assertion reliably checks the response header.
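Alternatively, if renaming the step text is undesirable, the step definition itself can be made case-insensitive. A hypothetical sketch for a Mink-based FeatureContext (the step wording and method are assumptions, not this project's actual code):

```php
use Behat\Mink\Exception\ExpectationException;

/**
 * @Then the response header :name should contain the value :value
 */
public function assertResponseHeaderContainsValue(string $name, string $value): void {
  // Most Mink drivers resolve header names case-insensitively, so both
  // "content-type" and "Content-Type" reach the same header here.
  $header = (string) $this->getSession()->getResponseHeader($name);
  // Compare the value case-insensitively as well.
  if (stripos($header, $value) === FALSE) {
    throw new ExpectationException(
      sprintf('Header "%s" ("%s") does not contain "%s".', $name, $header, $value),
      $this->getSession()->getDriver()
    );
  }
}
```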
The relevant hunk in tests/phpunit/Drupal/EnvironmentSettingsTest.php:

```diff
   $config['environment_indicator.settings']['favicon'] = TRUE;
   $config['environment_indicator.settings']['toolbar_integration'] = [TRUE];
-  $config['robots_txt.settings']['content'] = "User-agent: *\r\nDisallow:";
+  $config['robotstxt.settings']['content'] = "User-agent: *\nDisallow: /";
```
🧹 Nitpick (assertive)

DRY the repeated robots.txt content string across tests. Define it once and reuse it to reduce duplication and future churn.

```diff
 class EnvironmentSettingsTest extends SettingsTestCase {
+
+  private const ROBOTS_TXT_DISALLOW_ALL = "User-agent: *\nDisallow: /";
@@
-    $config['robotstxt.settings']['content'] = "User-agent: *\nDisallow: /";
+    $config['robotstxt.settings']['content'] = self::ROBOTS_TXT_DISALLOW_ALL;
```

The same replacement applies at each of the remaining occurrences.

Also applies to: 379-379, 423-423, 470-470, 520-520, 569-569, 618-618, 660-660, 702-702, 790-790, 842-842, 894-894
🤖 Prompt for AI Agents
In tests/phpunit/Drupal/EnvironmentSettingsTest.php around lines 308 (and also
at 379, 423, 470, 520, 569, 618, 660, 702, 790, 842, 894), the literal
robots.txt content string "User-agent: *\nDisallow: /" is duplicated across
multiple assertions; define a single reusable value (e.g., a private class
constant or a private static property at the top of the test class) and replace
each duplicated literal with that constant/property reference so the string is
declared once and reused across all test lines.
The relevant hunk in web/sites/default/includes/modules/settings.robotstxt.php:

```diff
 if ($settings['environment'] !== ENVIRONMENT_PROD) {
-  $config['robots_txt.settings']['content'] = "User-agent: *\r\nDisallow:";
+  $config['robotstxt.settings']['content'] = "User-agent: *\nDisallow: /";
 }
```
🛠️ Refactor suggestion

Avoid an undefined index notice on $settings['environment']. Use null-coalescing to prevent PHP warnings if 'environment' isn't set.

```diff
-if ($settings['environment'] !== ENVIRONMENT_PROD) {
+if (($settings['environment'] ?? null) !== ENVIRONMENT_PROD) {
   $config['robotstxt.settings']['content'] = "User-agent: *\nDisallow: /";
 }
```

📝 Committable suggestion

‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```php
if (($settings['environment'] ?? null) !== ENVIRONMENT_PROD) {
  $config['robotstxt.settings']['content'] = "User-agent: *\nDisallow: /";
}
```
🤖 Prompt for AI Agents
In web/sites/default/includes/modules/settings.robotstxt.php around lines 10 to
12, the code accesses $settings['environment'] directly which can trigger an
"undefined index" notice; change the check to use null-coalescing (e.g.
($settings['environment'] ?? null) !== ENVIRONMENT_PROD) or assign $env =
$settings['environment'] ?? null and compare $env to ENVIRONMENT_PROD so the
warning is avoided while preserving the same logic.