Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BB-749 Update robots.txt.j2 to accept Allow rule and multiple values #4942

Merged

Conversation

john2x
Copy link
Contributor

@john2x john2x commented Jan 14, 2019

Configuration Pull Request

This PR updates the robots.txt.j2 template to satisfy the following:

  • accept Allow rules
  • accept multiple values for Agent, Disallow, and Allow as a list of strings
  • maintain backward compatibility of old var values (i.e. single string values)

Testing

To test the template, checkout this branch and add a test YAML file containing the NGINX_ROBOT_RULES values, and run ansible to render the template:

$ cd path/to/edx/configuration/configuration/playbooks/roles/nginx/templates/edx/app/nginx
$ vim test.yaml
...
$ ansible all -i localhost, -c local -m template -a "src=robots.txt.j2 dest=./test.txt" --extra-vars=@test.yaml
$ cat test.txt

This will render the template and save it to test.txt.

Here are some sample values for NGINX_ROBOT_RULES and their expected output:

# mix of single values and list of values
NGINX_ROBOT_RULES:
  - agent: '*'
    disallow: '/'
    allow:
      - '/$'
      - '/courses$'
      - '/courses/*/about$'
  - agent: 'W3C-checklink'  # backward compatible agent
    disallow: ''  # backward compatible disallow rule

# --- output ---
User-agent: *
Allow: /$
Allow: /courses$
Allow: /courses/*/about$
Disallow: /

User-agent: W3C-checklink
Disallow:
# singlue values only (for backward compatibility)
NGINX_ROBOT_RULES:
  - agent: '*'
    disallow: '/'
  - agent: 'W3C-checklink'
    disallow: ''

# --- output ---
User-agent: *
Disallow: /

User-agent: W3C-checklink
Disallow:
# all multiple values (lists of strings)
NGINX_ROBOT_RULES:
  - agent:
      - 'foo'
      - 'bar'
    disallow:
      - '/'
    allow:
      - '/$'
      - '/courses$'
      - '/courses/*/about$'
  - agent:
      - 'fizz'
    disallow:
      - '/'

# --- output ---
User-agent: foo
User-agent: bar
Allow: /$
Allow: /courses$
Allow: /courses/*/about$
Disallow: /

User-agent: fizz
Disallow: /

Discussions: https://gitlab.com/opencraft/client/lumerical/configuration/merge_requests/5

Reviewers


Make sure that the following steps are done before merging:

  • A DevOps team member has approved the PR if it is code shared across multiple services and you don't own all of the services.
  • Are you adding any new default values that need to be overridden when this change goes live? If so:
    • Update the appropriate internal repo (be sure to update for all our environments)
    • If you are updating a secure value rather than an internal one, file a DEVOPS ticket with details.
    • Add an entry to the CHANGELOG.
  • If you are making a complicated change, have you performed the proper testing specified on the Ops Ansible Testing Checklist? Adding a new variable does not require the full list (although testing on a sandbox is a great idea to ensure it links with your downstream code changes).

@openedx-webhooks
Copy link

Thanks for the pull request, @john2x! I've created OSPR-2977 to keep track of it in JIRA. JIRA is a place for product owners to prioritize feature reviews by the engineering development teams.

Feel free to add as much of the following information to the ticket:

  • supporting documentation
  • edx-code email threads
  • timeline information ("this must be merged by XX date", and why that is)
  • partner information ("this is a course on edx.org")
  • any other information that can help Product understand the context for the PR

All technical communication about the code itself will still be done via the GitHub pull request interface. As a reminder, our process documentation is here.

@openedx-webhooks openedx-webhooks added needs triage open-source-contribution PR author is not from Axim or 2U labels Jan 14, 2019
@natabene
Copy link
Contributor

@john2x Thank you for your contribution, please let me know once it is ready to be looked at.

@openedx-webhooks openedx-webhooks added waiting on author PR author needs to resolve review requests, answer questions, fix tests, etc. and removed needs triage labels Jan 14, 2019
@john2x
Copy link
Contributor Author

john2x commented Jan 14, 2019

@natabene This is ready to be looked at, thanks.

Copy link
Contributor

@lgp171188 lgp171188 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

  • I tested this by running the playbook and verifying the generated robots.txt file for various values.
  • I read through the code
  • I checked for accessibility issues NA
  • Includes documentation NA

@lgp171188
Copy link
Contributor

@john2x, can you update this PR to resolve the merge conflict?

@natabene
Copy link
Contributor

@john2x Can you let me know once you resolve the merge conflict?

@john2x john2x force-pushed the john2x/BB-749-update-robotstxt-template branch from 3dcd1c1 to a7eb846 Compare January 14, 2019 21:47
@john2x
Copy link
Contributor Author

john2x commented Jan 14, 2019

@natabene @lgp171188 I've rebased to the latest change from master and resolved the conflict.

@natabene
Copy link
Contributor

@john2x Thanks! @edx/devops This is ready for your review, when you have time.

@natabene natabene requested a review from a team January 15, 2019 16:28
@openedx-webhooks openedx-webhooks added awaiting prioritization and removed waiting on author PR author needs to resolve review requests, answer questions, fix tests, etc. labels Jan 15, 2019
@john2x
Copy link
Contributor Author

john2x commented Feb 7, 2019

@natabene do you have updates for the review/prioritization of this PR?

@natabene
Copy link
Contributor

natabene commented Feb 7, 2019

@john2x Sorry, we have not had a chance to review it yet, I will ping team again. Do you know why checks are failing now?

@john2x
Copy link
Contributor Author

john2x commented Feb 8, 2019

@natabene Thanks! The checks was because of conflicts on the CHANGELOG.md file due to recent changes on master. I have resolved the latest one.

@natabene
Copy link
Contributor

natabene commented Feb 8, 2019

@john2x Thank you! @edx/devops Can you give this a look?

Copy link
Contributor

@jdmulloy jdmulloy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

@natabene
Copy link
Contributor

@jdmulloy Can you merge, too? Open edX community contributors don't have the necessary permissions to merge.

@feanil feanil merged commit df0d1b1 into openedx:master Feb 12, 2019
@openedx-webhooks
Copy link

@john2x 🎉 Your pull request was merged!

Please take a moment to answer a two question survey so we can improve your experience in the future.

@lgp171188
Copy link
Contributor

@natabene, can this be cherry-picked to Ironwood?

@nedbat
Copy link
Contributor

nedbat commented Feb 14, 2019

I've cherry-picked this onto Ironwood.

@xitij2000 xitij2000 deleted the john2x/BB-749-update-robotstxt-template branch February 21, 2019 11:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged open-source-contribution PR author is not from Axim or 2U
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants