This is a stupidly tiny, nonconformant (but functional) Gopher server.
- Requires gophermap files.
- Refuses to serve documents outside of the root
- Refuses to serve anything that isn't world-readable
- Written in Python 3 (Python 3.7 tested)
- No outside dependencies
- No needless features
There's no support for directories that don't contain a gophermap file.
What you've put in the gophermap file is all there is. There's no support
for inserting the contents of a directory into a gophermap file.
There's no PHP support. No CGI support. No search support.
The code can be easily audited. The code can be easily understood.
Right now, there are also no unit tests, and the code could be cleaner.
Still, the baseline is simple and small and works. It's simple and small enough that it could provide a test-case to iterate upon a variety of potential designs.
The Gopher protocol prevents . and .. from being a part of valid
selectors. But, I'm not writing my own path parsing code.
If it is running on a port above 1024 as a non-root user, chroot
isn't available and I still want to be very sure there is no possible
way to accidentally serve documents from outside the document root.
Right now, I'm resolving the expected path (following all symlinks,
processing '.' and '..', etc.) and then verifying that it is still
within my expected document root.
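The check described above can be sketched roughly as follows. This is a minimal illustration and not gofor's actual code: the function name resolve_selector and its arguments are hypothetical, and it assumes the selector is always interpreted relative to the document root.

```python
import os


def resolve_selector(root, selector):
    """Return the fully resolved path for a selector, or None if it escapes root.

    Hypothetical helper: 'root' is the document root, 'selector' is the raw
    selector string sent by the client.
    """
    # Resolve the root itself, in case it is reached through a symlink.
    real_root = os.path.realpath(root)
    # Treat the selector as relative to the root, even if it starts with "/".
    candidate = os.path.join(real_root, selector.lstrip("/"))
    # realpath follows every symlink and collapses "." and "..".
    real_path = os.path.realpath(candidate)
    # The resolved path must still live underneath the resolved root.
    if os.path.commonpath([real_root, real_path]) != real_root:
        return None
    return real_path
```

Because the comparison happens after resolution, a symlink that points outside the root is rejected even when the selector itself contains no periods.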
It would be very possible for me to think, "I'm avoiding '..' and I refuse
to serve symlinked files or directories" and then someone creates a symlink
to / and references /somedir/link-to-root/etc/passwd. somedir is in
my document root, and etc is a real directory and not a symlink. No periods
are found in the selector at all!
I can, however, reliably check for and fail all references to .. if I'm
currently running chrooted and I can trust that symlinks that point out
of my jail will fail. We can't chroot as a normal user, but then we can't
listen to port 70 as a normal user. So, chroot is a valid additional
safety measure, but it isn't enabled by default.
I would like to drop privileges. The current Python 3 asyncio logic
doesn't make it easy to drop privileges after a privileged port has
been grabbed.
Gophermap files have a gopher-type character prefixed
to a TAB-separated list of columns.
The gopher-type indicates what sort of resource it is:
- 0: plain-text file (US-ASCII encoding)
- 1: directory (which should have a gophermap in it)
- 3: Error code (mostly generated by system)
- 9: binary file
- h: HTML file or HTML-style URL
- g: GIF image file (Can also use I, :, or 9.)
- I: Other image file (Can also use : or 9.)
- s: Sound or other audio file (Can also use < or 9.)
- i: info-line (normally implied -- details later)
Less common, but still potentially useful:
- 8: Telnet (good for BBSes, MUDs, etc.)
- T: Interactive 3270 emulation sessions (valid, but rare)
- +: Redundant/Mirror server
- 7: Gopher full-text search (unsupported by gofor)
Other types include:
- 4: BinHex encoded file (Text-encoding of binary data? Do not use.)
- 5: DOS binary archive (Use 9 instead.)
- 6: Uuencoded file (Text-encoding of binary data? Do not use.)
- c: Some sort of Calendar (Use 9 instead.)
- e: Some sort of Event (Use 9 instead.)
- M: MIME multipart/mixed (Text-encoding of binary data? Do not use.)
- :: Gopher+ any image (Use I instead.)
- <: Gopher+ any sound or audio (Use s instead.)
- ;: Gopher+ any movie (Use 9 instead.)
- d: Binary document (Use 9 instead.)
Possibly unexpected results:
- -: do not list entry (Not supported by gofor and will be sent to client.)
- #: internal comment (Not supported by gofor and will be sent to client.)
- !: page title (Not supported by gofor and will be sent to client.)
The gophermap files consist of four TAB-separated columns.
If no TAB is present, it is treated as an informational line. The 'i' character is prepended automatically, and stub values are filled in for the selector, host and port.
If there are two columns, it is treated as a Gopher link to the current Gopherhole. This adds your current server and current port to the missing columns.
If there are three columns, it is expected that you're referencing a Gopherhole other than your own. Regardless of what port your own Gopherhole is listening on, this will always only ever add the standard Gopher port, 70.
If there are four columns, it is passed as-is.
If there are more than four columns, the remaining columns are dropped.
This is required to be compatible with Gopher+. If gofor adds some
Gopher+ features later, additional columns may be supported.
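The column rules above can be sketched as a small function. This is an illustration only, not gofor's source: the name expand_gophermap_line is hypothetical, and the stub values used for informational lines ("fake", "(NULL)", "0") are an assumption based on common Gopher practice, since the text does not say which stubs gofor fills in.

```python
# Assumed stub values for info-lines; gofor's real stubs may differ.
STUB_SELECTOR = "fake"
STUB_HOST = "(NULL)"
STUB_PORT = "0"


def expand_gophermap_line(line, host, port):
    """Expand one gophermap line into a four-column Gopher menu line.

    'host' and 'port' describe the current server (hypothetical parameters).
    """
    columns = line.rstrip("\r\n").split("\t")
    if len(columns) == 1:
        # No TAB: informational line, so prepend the 'i' gopher-type.
        return "\t".join(["i" + columns[0], STUB_SELECTOR, STUB_HOST, STUB_PORT])
    if len(columns) == 2:
        # Link within this Gopherhole: add the current server and port.
        columns += [host, str(port)]
    elif len(columns) == 3:
        # Link to another Gopherhole: always the standard Gopher port.
        columns.append("70")
    # Four columns pass through as-is; anything beyond four is dropped.
    return "\t".join(columns[:4])
```

For example, with a server at example.org on port 7070, the two-column line "0About(TAB)/about.txt" would gain "example.org" and "7070" as its third and fourth columns.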
You probably see all manner of other gopher types that vary by
which Gopher server you're looking at. Here's the thing, though:
the client is responsible for handling the gopher-type, not
the server.
Is a server listing movies with the Gopher+ ; gopher-type?
This means that line will entirely disappear for some clients,
when you could have just offered it via the standard 9 binary
type. (The whole point of the extended types is so that lines
can be dropped from display to users when gopher-types are
unsupported.)
Is a server returning all archives with the 5 "DOS binary"
gopher-type? Any gopher-client that expects those to be
DOS-specific will drop them from display.
gofor is stupidly simple. Those gopher-type characters
only matter to servers when they create the gophermaps that
they send to clients. Since gofor doesn't support that,
you can use whatever new or unusual gopher-type characters
you want.
It's the clients that care about the gopher-type characters.
They're the ones that need to handle the new and nonstandard
gopher-types that you may be using. Personally, I don't
have the hardware available to test all of the possible
clients, so I keep my own gopherhole strictly standard.
usage: gofor [-h] [--fqdn FQDN] [--port PORT] [--root ROOT] [--ipv4]
             [--verbose] [--version] [--chroot]

gofor: simple gopher server

optional arguments:
  -h, --help            show this help message and exit
  --fqdn FQDN, -f FQDN  Fully qualified domain name clients should use.
  --port PORT, -p PORT  The port to listen to.
  --root ROOT, -r ROOT  The document root to serve from.
  --ipv4, -4            Bind to 0.0.0.0 instead of ::
  --verbose, -v         Be more verbose.
  --version             show program's version number and exit
  --chroot              chroot in to the document root
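For readers who want to see how these options fit together, here is a rough argparse sketch reconstructed purely from the help text above. It is not gofor's actual source: the function name build_parser, the integer type on --port, the absence of defaults, and the version string are all assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the help output (hypothetical reconstruction)."""
    parser = argparse.ArgumentParser(
        prog="gofor", description="gofor: simple gopher server")
    parser.add_argument("--fqdn", "-f",
                        help="Fully qualified domain name clients should use.")
    # Assumption: the port is parsed as an integer.
    parser.add_argument("--port", "-p", type=int,
                        help="The port to listen to.")
    parser.add_argument("--root", "-r",
                        help="The document root to serve from.")
    parser.add_argument("--ipv4", "-4", action="store_true",
                        help="Bind to 0.0.0.0 instead of ::")
    parser.add_argument("--verbose", "-v", action="store_true",
                        help="Be more verbose.")
    # Placeholder version string; the real version is not stated here.
    parser.add_argument("--version", action="version",
                        version="gofor (version placeholder)")
    parser.add_argument("--chroot", action="store_true",
                        help="chroot in to the document root")
    return parser
```

With this sketch, build_parser().parse_args(["-p", "7070", "-r", "/srv/gopher"]) yields a namespace whose port is 7070 and whose chroot flag is False, matching the note above that chroot is not enabled by default. The path /srv/gopher is only an example.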
This gopher server violates the spec.
Gopher is supposed to serve files terminated by a '.' on a line by itself. It's supposed to avoid this for binary files (of course) but everything else is subject to this.
However, clients don't tell the server whether they expect a binary or
a text file. All incoming requests look the same. It's just a selector.
The gopher-type is only known by the client at selection time.
What qualifies as a text-file? What qualifies as a binary file?
According to RFC1436, only 5 (DOS binary archives) and 9
(binary files) are immune from the period requirement. While there
weren't that many types of media available, there was g for GIF
files and I for arbitrary types of images. According to the spec,
those images should be terminated with periods on lines of their own.
How can I serve documents properly when I don't know what the client is expecting? Some Gopher servers track state and remember the expected types of files. But, with the clients I've tested, the only time the terminal period was needed was when dealing with directories.
If clients only need the terminal period in that one case, then I only send the terminal period in that one case.
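That rule is small enough to sketch in a few lines. This is an illustration under assumptions, not gofor's code: the function name frame_response and the is_directory flag are hypothetical, and the terminator is written as a period followed by CRLF.

```python
def frame_response(payload: bytes, is_directory: bool) -> bytes:
    """Append the terminal period only to directory (gophermap) responses."""
    if not is_directory:
        # Files of every type are sent untouched; closing the connection
        # marks the end of the data.
        return payload
    # Make sure the last menu line is terminated before adding the period.
    if payload and not payload.endswith(b"\r\n"):
        payload += b"\r\n"
    return payload + b".\r\n"
```

Since the server never has to guess whether a selector refers to text or binary data, binary files can never be corrupted by a stray trailing period.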
Gopher+ got rid of the requirement for the terminal period. Instead
a Gopher+ selector can return the amount of data, '-1' for the old
period-terminator, or '-2' to terminate on connection-close. It
also clarifies that it is <CRLF>.<CRLF>.
The only way binary images (as opposed to text-based images like PNM) could have worked reliably with Gopher-1 would be if servers violated the spec and ignored the terminal period.
Also, consider that Gopher+ added new gopher types for images, audio, and movies. Gopher+ is supposed to be compatible with Gopher 1 clients. It should be possible to request those resources with an older client. The only way those files can be served is without a terminal period. The only way older clients could access those resources without choking is if it was expected for clients to handle connection termination as the preferred way to terminate potentially unknown types of files.
In practice, I don't think any functioning Gopher-1 client could have been relying on the terminal period for anything outside of the gophermaps. It's possible that something accessed as plain-text needed it, but the protocol didn't really support reusable connections. If the server closes the connection, it's pretty easy to treat it as the end of the file.