Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Change default admin set name to NOT include a slash #1291

Open
billdueber opened this issue Jun 23, 2017 · 13 comments
Open

Proposal: Change default admin set name to NOT include a slash #1291

billdueber opened this issue Jun 23, 2017 · 13 comments

Comments

@billdueber
Copy link
Contributor

billdueber commented Jun 23, 2017

Descriptive summary

Apache's mod_proxy, when reverse proxying, canonicalizes %2F to / in passed URLs (even though it's properly escaped) due to security considerations. URLs with the default admin set id of admin_set/default then break. I'm suggesting we change the default admin set id, and consider restricting legal characters for an admin set ID.

Rationale

As defined in HEAD, the variable DEFAULT_ID for the default admin set is 'admin_set/default'.

https://github.com/samvera/hyrax/blame/master/app/models/concerns/hyrax/admin_set_behavior.rb#L29

This is a problem (annoyance?) in that Apache's mod_proxy cannonicalizes passed URLs prior to sending them off to the proxied server, due to longstanding security issues where someone would protect a path and then some yahoo would pass in a URL with an embedded %2F to get around it.

mod_proxy has an override (nocanon) but it is not on by default and its use may open up a server to other potential security issues.

The mod_proxy docs state:

Normally, mod_proxy will canonicalise ProxyPassed URLs. But this may be incompatible with some backends, particularly those that make use of PATH_INFO. The optional nocanon keyword suppresses this and passes the URL path "raw" to the backend. Note that this keyword may affect the security of your backend, as it removes the normal limited protection against URL-based attacks provided by the proxy.

So, there's a workaround, but if there's not a deliberate reason to include a slash, maybe we could change it.

A potential further step would be to restrict the characters that are allowed in an admin set ID in general, so it doesn't crop up again.

Expected behavior

Default settings don't result in borked URLs on a common platform.

Related work

Link to related tickets or prior related work here.

@jcoyne
Copy link
Member

jcoyne commented Jun 23, 2017

In RIIIF we also have this problem. We advise people to configure Apache to pass the encoded slashes correctly:

https://github.com/curationexperts/riiif#special-note-for-passenger-and-apache-users

@mjgiarlo mjgiarlo added this to the Backlog milestone Jun 23, 2017
@mjgiarlo
Copy link
Member

Would like @samvera/hyrax-code-reviewers to chime in on this. We may also want to discuss this in a Samvera Tech call. (Though not next week, because of the conflict with OR2017.)

Thx for raising this, @billdueber

@skorner
Copy link

skorner commented Jun 23, 2017

The scenario @billdueber described is not Apache+mod_passenger, but Apache+mod_proxy. There is of course a similar workaround available for mod_proxy (AllowEncodedSlashes NoDecode and ProxyPass <...> <...> nocanon).

The question here seems more about whether it is necessary to introduce such a configuration dependency in hyrax to only support the chosen default admin set name.

@jcoyne
Copy link
Member

jcoyne commented Jun 23, 2017

@skorner I would argue that ModProxy and Apache are not required parts of this stack.

@skorner
Copy link

skorner commented Jun 23, 2017

I agree. I'm merely highlighting the fact that the default admin set name choice has consequences in several deployment scenarios.

@jcoyne
Copy link
Member

jcoyne commented Jun 23, 2017

@skorner This seems like bad defaults in Apache to me. Nginx doesn't have this problem. I'm not strongly tied to the default admin set name, but it seems odd to change to support one deployment scenario. Furthermore there are other places you may encounter this bug (riiif for one) and it would make sense to fix the problem rather than eliminate one potential trigger.

@elrayle
Copy link
Contributor

elrayle commented Jun 26, 2017

Two questions:

  • What is the impact of making a change to the default admin set name?
  • @jcoyne Are you worried that by changing the name, we are fixing the problem in one place, but the same problem will occur in any number of other places, making this a game of whack-a-mole?

And an observation:
If the development cost is low, it is worth considering that Apache is not obscure with very few adopters. I would imagine that a large number of our community make use of Apache. It may not be an official part of the stack, but for a large number of institutions, it is a practical part of the stack.

@escowles
Copy link
Contributor

I'd like to echo @elrayle's concern: is this going to be an issue for existing installations? If not, then I would support removing the / from the default adminset name, since it's a special character in both URLs and POSIX filesystems. Even if this particular problem can be resolved, there are other parts of the stack where a / could cause problems.

I also agree that Apache is far from obscure, and is probably used by a significant fraction of Hydra deployments (possibly a majority). So dismissing it as "one deployment scenario" seems counterproductive here.

@jcoyne
Copy link
Member

jcoyne commented Jun 26, 2017

@elrayle yes, I'm worried that this will just fix it in one place. And I really don't care what we call it.

I think it's telling that NGINX and Apache have different default configurations. Has anyone asked the Apache team why they have it configured the way it is? I'm thinking that a reverse proxy should not be mutating data (so far as that is possible). Just because Apache is a big player doesn't mean they don't have bugs.

@svogtunh
Copy link

I would like to chime in on this, as well. I am struggling with this now. I would love for this to be fixed, even if it is considered a work-around for apache users.

@no-reply
Copy link
Contributor

i'd love to see this done, and would be sure to merge a PR that cleanly renames the admin set and provides tooling to properly update existing apps.

@svogtunh
Copy link

Even though I found a work-around, which was mentioned in an earlier comment, I still think this should be fixed.

@no-reply no-reply modified the milestones: Backlog, 2.x series, 3.x series Mar 18, 2021
@no-reply
Copy link
Contributor

bumping this to the 3.x series in hopes that we can pick it up post-3.0.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants