REGRESSION: Vhosts no longer work #13

Closed
fosslinux opened this issue Mar 26, 2019 · 18 comments

@fosslinux
Collaborator

commented Mar 26, 2019

I've received multiple reports of vhosts no longer working; i.e. when you go to one server, you get the other instead. When you use the browsable links at the bottom of the page, it works fine.

I know for a fact that this used to work; at some point since 101 this broke.

cc/ @hb9kns @jamestomasino @benharri

@benharri

Contributor

commented Mar 26, 2019

see gopher://tildeverse.org for an example.
tilde.team is set as the main host with the -h switch

@hb9kns

Collaborator

commented Mar 28, 2019

I had a quick look through everything we changed since v.101, but I don't see anything obvious, as we did not directly touch the vhost code. Could it be linked to a combination of -nx and -nu blocking evaluation of the vhost gophermap? I have to look more into the vhost code to understand it better, and I don't have a vhost setup available for testing.

@fosslinux

Collaborator Author

commented Mar 31, 2019

same, i didn't see anything obvious.

@fosslinux

Collaborator Author

commented Mar 31, 2019

could it have anything to do with that @hb9kns

@fosslinux

Collaborator Author

commented Apr 2, 2019

@fosslinux

Collaborator Author

commented Apr 2, 2019

i was talking to @benharri and he can confirm there was the same issue in 101.

@jamestomasino

commented Apr 2, 2019

There's more to the story than is obvious. Try these scenarios:

lynx gopher://tilde.team # this works as it's the default vhost
lynx gopher://tildeverse.org # this loads tilde.team instead

Now try this:

lynx gopher://tilde.team # still works
# now navigate from within lynx to the bottom of the gophermap and follow the link to gopher://tildeverse.org

Links from within a gophermap work!

Inspecting the gophermap link I see:

gopher://tildeverse.org/1/;tildeverse.org

So gophernicus is building some smarter links than the default here for some reason. These smarter links work, but direct ones do not.

And to confirm:

lynx gopher://tildeverse.org/1/;tildeverse.org # this DOES work

So, the secondary detection of a vhost is working, but not the primary check.

@jamestomasino

commented Apr 2, 2019

Further inspection reveals that the actual vhost behavior is something like this:

  1. check if the requested selector is available on the default vhost. If so, serve it and stop.

  2. if the selector isn't found on the default vhost, check all the other vhosts until found, or...

  3. fail.

    lynx gopher://tildeverse.org/0/henlo.txt # finds the file properly since it exists ONLY on tildeverse.org

With this logic, I don't see any way that a root gophermap will ever be served properly via vhost unless linked with the special /1/;hostname suffix.
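To make the lookup order above concrete, here is a minimal Python sketch of the behavior as described in this comment. It is not gophernicus's actual code: the `resolve` function, the in-memory `VHOSTS` table (hostnames mapped to the selectors that exist under them), and the `/about.txt` entry are all made up for illustration, and the explicit `;hostname` form is left out.

```python
# Hypothetical model of the lookup order described above; not gophernicus code.
# Each vhost maps to the set of selectors that exist under it ("" = root gophermap).
VHOSTS = {
    "tilde.team": {"", "/about.txt"},
    "tildeverse.org": {"", "/henlo.txt"},
}
DEFAULT_VHOST = "tilde.team"  # the host given with -h


def resolve(selector, vhosts=VHOSTS, default=DEFAULT_VHOST):
    """Return the vhost that ends up serving `selector`, or None on failure."""
    # 1. The default vhost is checked first and wins whenever it has the selector.
    if selector in vhosts[default]:
        return default
    # 2. Otherwise every other vhost is tried until one has it.
    for host, selectors in vhosts.items():
        if host != default and selector in selectors:
            return host
    # 3. Nothing matched.
    return None


print(resolve("/henlo.txt"))  # tildeverse.org (the file exists only there)
print(resolve(""))            # tilde.team (the default's root always wins)
print(resolve("/nope.txt"))   # None
```

Note that `resolve` takes no "requested host" argument at all, which is why the root gophermap of a secondary vhost can never win in this model.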

@benharri

Contributor

commented Apr 2, 2019

In addition to the file henlo.txt being found in the tildeverse.org vhost, it appears that file conflicts will be won by the primary vhost specified by -h.
A request with the ;host selector followed by another without the ; will result in the same vhost being served.

It seems that the original implementation was hackier than expected!

@jamestomasino

commented Apr 2, 2019

I think if there's a way to intercept the original request and have it include the hostname, then you could just hijack st.server_host immediately and assign it that name. Unfortunately my gut tells me that gopher doesn't pass that info along. If it did, you could remove pretty much all of the vhost code and replace it with a simple assignment.

@fosslinux

Collaborator Author

commented Apr 2, 2019

Right, thanks heaps for the help!

Next step: testing to see if gopher passes the hostname

if it does:

do a simple assignment as @jamestomasino was saying

else:

uhhh... we'll cross that bridge if we come to it.

@hb9kns

Collaborator

commented Apr 3, 2019

> Next step: testing to see if gopher passes the hostname
> if it does:

I don't think so, at least we can't assume from RFC1436:

Client: {Opens connection to rawBits.micro.umn.edu at port 70}
Server: {Accepts connection but says nothing}
Client: {Sends an empty line: Meaning "list what you have"}
Server: {Sends a series of lines, each ending with CR LF}

Even if something passes the hostname, I don't see how we could rely on that.
Nothing forbids me from accessing a gopherhost directly by IP, which should work well according to the RFC, but then there is clearly no way of transmitting the hostname. IMO the gopher protocol cannot handle vhosts the same way we're used to with HTTP.
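A small Python sketch of what a client puts on the wire may help here. It assumes the usual gopher URL layout (the first path character is the item type, the rest is the selector); `split_gopher_url` and `request_bytes` are names made up for this example, not part of any client or of gophernicus.

```python
from urllib.parse import urlsplit, unquote


def split_gopher_url(url):
    """Split a gopher URL into (host, port, item_type, selector)."""
    parts = urlsplit(url)
    path = unquote(parts.path)
    # The path looks like "/<type><selector>"; an empty path means the root menu.
    item_type = path[1] if len(path) > 1 else "1"
    selector = path[2:]
    return parts.hostname, parts.port or 70, item_type, selector


def request_bytes(selector):
    """The whole request is the selector plus CR LF; no hostname is sent."""
    return selector.encode("utf-8") + b"\r\n"


for url in ("gopher://tilde.team",
            "gopher://tildeverse.org",
            "gopher://tildeverse.org/1/;tildeverse.org"):
    host, port, item_type, selector = split_gopher_url(url)
    # The host is only used to open the TCP connection.
    print(host, port, repr(request_bytes(selector)))
```

The first two URLs produce byte-identical requests (an empty line), so a server listening on one address cannot tell them apart. Only the third carries a hostname, and only because it is part of the selector.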

> else:
> uhhh... we'll cross that bridge if we come to it.

I'm highly tempted to close this issue, as I don't think it's anything we can handle correctly in gophernicus or any other current host sticking to the RFC. (But we could explain the problem in the README, of course.)
We would need an extension of the protocol, where clients tell a server who they think it is.

@fosslinux

Collaborator Author

commented Apr 4, 2019

@hb9kns I'm more than happy to add something that is an extension of the protocol, but in no case should we rely on the client being compatible. gophernicus already adds many gophertypes (eg =) that are not in RFC1436. Will attempt to develop a solution. In the meantime, I suggest removing the vhost code from gophernicus and putting a note in the README. Once we remove it this issue can be closed and we will open a new issue. Why do I feel like any solution will be as janky as the first :)

@hb9kns

Collaborator

commented Apr 4, 2019

@fosslinux : I agree we should do what we can with gophernicus, of course, and I don't intend to play the dictator card and say "not our problem, go elsewhere" ^-^ I also did not want to suggest that we should put the burden on clients.

OTOH, I don't like to remove/hide code which is already working in certain circumstances, if I understand correctly in this case (as said before, I don't have a vhost setup for testing, so I may not get all the details well).

But how about writing an assistance script for vhost operators to set absolute vhost paths (or "smarter links" according to jamestomasino) in all gophermaps of a given subdirectory? It would have to be run manually or as cronjob, and make sure the ;host syntax is present for local relative selectors.
It would be an ugly temporary fix, because it's effectively outside of gophernicus, but I'm not that keen on adding it into the code with yet another configuration file, before having it tested and discussed in detail. In addition, I assume this logic would have a noticeable impact on performance, as it would need to run each time a vhosted gophermap is served.

@fosslinux

Collaborator Author

commented Apr 4, 2019

@hb9kns

> OTOH, I don't like to remove/hide code which is already working in certain circumstances, if I understand correctly in this case (as said before, I don't have a vhost setup for testing, so I may not get all the details well).

So, the current code works horribly. The root gophermap for a vhost will never be displayed and only files that aren't on the main vhost will work.

> But how about writing an assistance script for vhost operators to set absolute vhost paths (or "smarter links" according to jamestomasino) in all gophermaps of a given subdirectory? It would have to be run manually or as cronjob, and make sure the ;host syntax is present for local relative selectors.
> It would be an ugly temporary fix, because it's effectively outside of gophernicus, but I'm not that keen on adding it into the code with yet another configuration file, before having it tested and discussed in detail. In addition, I assume this logic would have a noticeable impact on performance, as it would need to run each time a vhosted gophermap is served.

Here I'm calling them "smarter links". These "smarter links" must be specified by the client; something we can't rely on. We could add a kind of redirect, but this would require some hacks as redirects aren't supported by the gopher RFC.

Looking through the code I see a st.server_host variable. Looking further into this.

@hb9kns

Collaborator

commented Apr 5, 2019

> So, the current code works horribly. The root gophermap for a vhost will never be displayed and only files that aren't on the main vhost will work.

That might be true, but can we assume nobody is using that piece of horribly working code? If we remove it, we might break something else.

> Here I'm calling them "smarter links". These "smarter links" must be specified by the client; something we can't rely on. We could add a kind of redirect, but this would require some hacks as redirects aren't supported by the gopher RFC.

No, of course we should not rely on the client, and stick to the RFC. I was thinking about a server side solution, with assistance from the site admin.

@fosslinux

Collaborator Author

commented Apr 5, 2019

> That might be true, but can we assume nobody is using that piece of horribly working code? If we remove it, we might break something else.

You have a point, I'm going to email you about an idea I've had.

> No, of course we should not rely on the client, and stick to the RFC. I was thinking about a server side solution, with assistance from the site admin.

A redirect would be a server side solution.

The best way forward for server admins for now would be to prefer IPv6 wherever possible and bind each vhost to a different IP address. I'm not sure whether gophernicus can do that properly; tbh the gophernicus codebase is a mess at the moment. Otherwise, use IPv4 with the caveats we've discussed here.
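The one-address-per-vhost idea could look roughly like the Python sketch below. It is only an illustration, not something gophernicus is known to do: the addresses are placeholders from the documentation ranges, and `ADDRESS_TO_VHOST` and `vhost_for` are hypothetical names. The point is that the local address a connection arrived on is known to the server, even though the requested hostname is not.

```python
import ipaddress

# Placeholder addresses from the IPv6 documentation range (2001:db8::/32).
ADDRESS_TO_VHOST = {
    ipaddress.ip_address("2001:db8::1"): "tilde.team",
    ipaddress.ip_address("2001:db8::2"): "tildeverse.org",
}
DEFAULT_VHOST = "tilde.team"


def vhost_for(local_addr):
    """Pick a vhost from the local address the client connected to.

    `local_addr` would come from getsockname()[0] on the accepted socket.
    """
    # Normalise so that different spellings of the same address compare equal.
    return ADDRESS_TO_VHOST.get(ipaddress.ip_address(local_addr), DEFAULT_VHOST)


print(vhost_for("2001:db8::2"))            # tildeverse.org
print(vhost_for("2001:0db8:0:0:0:0:0:2"))  # tildeverse.org (same address, long form)
print(vhost_for("192.0.2.10"))             # tilde.team (unknown address, default)
```

This needs no help from the client and no protocol extension, at the cost of one address per vhost.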

@fosslinux

Collaborator Author

commented Apr 19, 2019

I'm going to close this for now. We're currently leaving it as is, and I'll add a note to the README about this.

@fosslinux fosslinux closed this Apr 19, 2019
