Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Breaks if path is behind as symlink and the symlink is changed #47

Closed
AudriusButkevicius opened this issue Jul 5, 2021 · 13 comments
Closed
Labels
not-a-bug It indicates that the issue is not related to SWS v1 Deprecated v1, migrate to v2 instead

Comments

@AudriusButkevicius
Copy link

AudriusButkevicius commented Jul 5, 2021

I am trying to use this in conjunction with https://github.com/kubernetes/git-sync

Namely, I've got a k8s pod that has 2 containers, git-sync and your image.
I've got a shared emptyDir volume shared between the two.

The way git-sync works is:
It checks out the revision under rev-<hash>, and creates a symlink name-of-repo -> rev-<hash>.
And then periodically checks for new changes, checks out new rev-<hash> and repoints the symlink.

I am pointing your image to serve name-of-repo/config, and initially it works, but I observe that once git-sync checks out a new revision, your image starts returning 404s.

I am no rust expert, but I believe as you wire your server up, you are probably resolving the symlink to a real path along the way, hence prevent any serving of files once the symlink is changed.

@AudriusButkevicius AudriusButkevicius changed the title Does not support paths behind symlinks Breaks if path is behind as symlink and the symlink is changed Jul 5, 2021
@joseluisq
Copy link
Collaborator

Can you please provide the version that you are using and the os in order to try to reproduce it?

@joseluisq
Copy link
Collaborator

I am pointing your image to serve name-of-repo/config, and initially it works, but I observe that once git-sync checks out a new revision, your image starts returning 404s

I assume that you are changing the root dir which is a symlink right ?

@AudriusButkevicius
Copy link
Author

AudriusButkevicius commented Jul 5, 2021

Yes.
Given:

root@you:/tmp/tmp.pjklPfxOlR# mkdir one
root@you:/tmp/tmp.pjklPfxOlR# ln -s one root
root@you:/tmp/tmp.pjklPfxOlR# date >one/a

Which looks like:

root@you:/tmp/tmp.pjklPfxOlR# ls -lah
total 20K
drwx------  3 root root 4.0K Jul  5 21:35 .
drwxrwxrwt 28 root root  12K Jul  5 21:35 ..
drwxr-xr-x  2 root root 4.0K Jul  5 21:35 one
lrwxrwxrwx  1 root root    3 Jul  5 21:35 root -> one

if I run the image as:

docker run -it --rm -v /tmp/tmp.pjklPfxOlR:/mnt -p 3000:3000 joseluisq/static-web-server:1.17.1-alpine --port 3000 --root /mnt/root/ --log-level trace

In the logs, when doing curl localhost:3000/a, I see:

2021-07-05T20:37:08 [TRACE] - Opening path /mnt/one/a

Even if --root is pointing at /mnt/root/.

If I remove one and repoint /mnt/root/ to something else, that has a inside, I just 404's.
Because it's still trying to open /mnt/one/a.

This is with: 1.17.1-alpine

@joseluisq
Copy link
Collaborator

joseluisq commented Jul 5, 2021

Ok, the behavior of "pointing always to the root dir " is expected for this server. Even if your symlinked root is changed later on, you will always get the path what was assigned first.
The reason is that the root path can only be changed during server startup but not after dynamically, meaning that the only way to reflect the root path change is restarting the server.

However a workaround could be to try to symlink your content only (inside the root) but not the root itself.
Allow to changing the root path on the fly could cause unexpected behavior or even represent security concerns mapping files on system.

Anyway If you can propose some alternative of solution for your use case please let me know I open to discuss.

I'm going to close this issue since it's not a bug.
However feel free to let your comments or purpose a solution if you want so.

@joseluisq joseluisq added the not-a-bug It indicates that the issue is not related to SWS label Jul 5, 2021
@AudriusButkevicius
Copy link
Author

AudriusButkevicius commented Jul 5, 2021

The reason is that the root path can only be changed during server startup but not after dynamically, meaning that the only way to reflect the root path change is restarting the server.

But the path I gave it is /mnt/root, not /mnt/one, the server went on and decided to do it's own thing and changed it itself, by resolving the symlink.
Why is it doing that in the first place?

Allow to changing the root path on the fly could cause unexpected behavior or even represent security concerns mapping files on system.

Sure, I buy a good security argument, but I don't see this one as being a good one.
Why security is only important for the root, but not for any other path down the line?

If anything, the default behaviour on nginx is the completely the opposite.

It permits the root to be a symlink, but does not follow symlinks within the root, unless there is a special directive in the server block permitting that.

That is because attackers can manipulate files within the root, but not outside of the root, hence by definition cannot change what the root symlink points to.

I'm going to close this issue since it's not a bug.

If this is intentional, can you point me where this intention happens?

Don't want to sound offensive, but it feels like you are not sure why/where this happens, and it's easier to just shrug it off as "not-a-bug".

For the record, I've already moved on to using nginx which solves my issue, but I am still going to fight my ground here, as I believe the arguments are quite weak and conclusion is wrong, based on what other web servers do.
If not for me, for future users.

@joseluisq
Copy link
Collaborator

I understand your point more clear now regarding the symlink for the root, but as I told you, it's not because I want to avoid your use case is because the server always resolves root and assets dirs that includes symlinks. That's why I labeled this issue as "is not a bug".
If you want the server behaves differently is another history and we can discuss it, but we can not thread this issue as a bug at first place.

Regarding the label, closing the issue doesn't mean nothing since I can easily reopen it if necessary.

But the path I gave it is /mnt/root, not /mnt/one, the server went on and decided to do it's own thing and changed it itself, by resolving the symlink. Why is it doing that in the first place?

So to clarify again, the server always resolves the root and assets dir paths to its absolutely paths and that includes symlinks that's why your surprise.
So if you don't want that behavior please file another issue explain your use case and why you want it to behaves differently. Then let's discuss it.

@AudriusButkevicius
Copy link
Author

I understand what the server does, it's clear from the logs. I think I even know which part of the code triggers this behaviour.

I don't understand why it does that and what it tries to achieve be doing that.

@joseluisq
Copy link
Collaborator

joseluisq commented Jul 6, 2021

I don't understand why it does that and what it tries to achieve be doing that.

The reason is "simple", convenience across platforms.
I chosen that approach because as you know a symlink can not point only to a dir but also to a file so it was a convenient solution that I preferred ad hoc in order to resolve every root or asset dir path independently of it was a symlink, relative or absolute path.

@AudriusButkevicius
Copy link
Author

AudriusButkevicius commented Jul 6, 2021

I still don't follow how resolving at startup is beneficial/better than resolving at request time?
It delivers exactly the same end result, so I must be missing something here.

Please, do not say performance, because it's exactly one extra syscall querying a data structure that is already in memory.
That is if you choose to manually resolve it, it's zero extra syscalls if you just choose to just open the file.

I don't see how preventing usage of features of the filesystem is a convenience, and why "across platforms" is relevant here, given usage (not creation) of symlinks is identical across all of them.

Can you give a scenario where this (pre-resolving symlinks at startup, but for the root and nothing else) makes it convenient, or a scenario where doing that would be in inconvenient?

@joseluisq
Copy link
Collaborator

I still don't follow how resolving at startup is beneficial/better than resolving at request time?

First, for the most common cases that I have seen using a web server to serving static files I have never seen a server with dynamic root path at request time. I see that use case personally rare, even so, your use case for example can be addressed conditioning your content under the root.

Second, I never has said that this server approach is better, but you are claiming that.
What I'm saying is that my opted solution is convenient for this server in order to resolve directory paths: symlinks or relative ones to their absolutes ones (paths by their own with no dependencies). That's all.
What is the problem with it? Is a bug? No. It's intentional. That's why I have labeled your claiming as Not A Bug.

It delivers exactly the same end result, so I must be missing something here.

Please provide an example or even better, code if you can.

Please, do not say performance, because it's exactly one extra syscall querying a data structure that is already in memory.
That is if you choose to manually resolve it, it's zero extra syscalls if you just choose to just open the file.

Honestly I don't care about it, since the "overhead" resolving the root or assets paths is insignificant and all this happens at startup time.

I don't see how preventing usage of features of the filesystem is a convenience, and why "across platforms" is relevant here, given usage (not creation) of symlinks is identical across all of them.

This is clearly a fallacy ad hominem since I have repeated myself comments above that if you provide me enough details in a separate issue we can discuss it and why not to consider to conditioning the current approach.
The point is not the inclusion of a feature like this, that it could happen, but the refactoring and the conditioning of the existing code.

Can you give a scenario where this (pre-resolving symlinks at startup, but for the root and nothing else) makes it convenient, or a scenario where doing that would be in inconvenient?

First, I take it again from the comment above:
I preferred ad hoc in order to resolve every root or asset dir path independently of it was a symlink, relative or absolute path.

The escenario is to have a path (either relative or symlink) resolved (with all intermediate components normalized and symbolic links resolved). It means a path with no dependencies. That is a path that exist by its own: absolute path.
I call the approach as a convenient for this server, you can call it differently if you want.
So here there is no "prevention of system features" or something like that.

For "symlinks being an inconvenience", you are claiming about it.
I don't see why not to adopt a symlink approach. You have misunderstood, there is nothing against it. But make it happen if you are interested in or file a feature request if you can not do it.

@AudriusButkevicius
Copy link
Author

AudriusButkevicius commented Jul 6, 2021

I have never seen a server with dynamic root path at request time.

I am not sure what you mean by dynamic, but nginx, apache, lighttpd all support being pointed to a symlink as the content root, and trust the operating system to re-resolve the symlink on every request.

The only reason I dislike using them is that configuring them requires an actual config file, oppose to just command line parameters.

I've opened this issue specifically for the issue/use case being discussed here, hence I don't trust that raising another one, repeating the same content explaining what use case doesn't work, will suddenly resolve this.

Given we are getting nowhere, and in order to save both of our time, I think we can consider this "resolved".

Thanks for your time, and good luck.

@joseluisq
Copy link
Collaborator

Thanks also for the discussion.
I guess I did try to explain what was involved in the decision of chosing path resolving at first place.
Anyway as I said, this project is open for improvements. And this issue is welcome as an improvement, that means as a feature rather than an issue or bug to fix.

@joseluisq joseluisq added the v1 Deprecated v1, migrate to v2 instead label Jul 17, 2021
@joseluisq
Copy link
Collaborator

Just FYI or future people interesting in this.
In the new major v2.0.0 the root directory path is not resolved to an absolute at start up time anymore, meaning that your use case with root symlinks updates is covered.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
not-a-bug It indicates that the issue is not related to SWS v1 Deprecated v1, migrate to v2 instead
Projects
None yet
Development

No branches or pull requests

2 participants