Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WSGI support for irclogsearch #9

Merged
merged 18 commits into from Sep 17, 2015

Conversation

@alga
Copy link
Contributor

commented Sep 16, 2015

Since CGI is unavailable on nginx, added an option to hook up log search via wsgi.

alga added 13 commits Sep 16, 2015
Instead of monkeypatching sys.stdout.
This way print_search_form and print_search_results can be reused
between cgi and wsgi without extra keyword args.
With wsgi it's time to start considering thread safety.  It's very
realistic that on a single server there would be serveral instances of
search configured, running in a multithreaded environment.
setup.py Outdated
@@ -50,6 +54,7 @@ def read(filename):
irclog2html = irclog2html.irclog2html:main
logs2html = irclog2html.logs2html:main
irclogsearch = irclog2html.irclogsearch:main
serve = irclog2html.irclogsearch:serve

This comment has been minimized.

Copy link
@mgedmin

mgedmin Sep 17, 2015

Owner

This is a very generic name. If I pip install irclog2html, I'd want something more specific on my $PATH than just "serve".

How about "irclogserver"?

This comment has been minimized.

Copy link
@mgedmin

mgedmin Sep 17, 2015

Owner

Actually, I may have misunderstood. If the purpose of this script is just for manual testing while developing, maybe remove the entry point from setup.py and define it explicitly in buildout.cfg?

where = logfile_path
where = os.path.dirname(__file__)
if logfile_pattern is None:
logfile_pattern = "*.log"

This comment has been minimized.

Copy link
@mgedmin

mgedmin Sep 17, 2015

Owner

Indentation seems wonky here

errors='xmlcharrefreplace',
line_buffering=True)

status = str("200 Ok")

This comment has been minimized.

Copy link
@mgedmin

mgedmin Sep 17, 2015

Owner

A comment could be useful, explaining the str() stuff. Something like

# wsgiref needs byte strings on Python 2, and unicode strings on Python 3.
# Our 'from __future__ import unicode_literals' requires an explicit type conversion here.

def get_path(environ):
path = environ.get('PATH_INFO', '/')
path = path.split('/')[-1] # is this enough to protect from FS traversal?

This comment has been minimized.

Copy link
@mgedmin

mgedmin Sep 17, 2015

Owner

Windows and ''?

This comment has been minimized.

Copy link
@mgedmin

mgedmin Sep 17, 2015

Owner

I'd rather return 404 for URLs with extra slashes. Now it feels like Acquisition: /irclogs/anything/goes/here/search?q=...

line_buffering=True)

status = str("200 Ok")
content_type = str("text/html; charset=UTF-8")

This comment has been minimized.

Copy link
@mgedmin

mgedmin Sep 17, 2015

Owner

Could be text/plain, if you try to access the .log file directly. Charset is tricky: log files might be in xchat's hybrid UTF-8/Latin-1 mix.

This comment has been minimized.

Copy link
@mgedmin

mgedmin Sep 17, 2015

Owner

Let's leave that for a future PR: serving .log files (and transcoding them to UTF-8 on the fly, line-by-line).

start_response(status, headers)
return result

def serve():

This comment has been minimized.

Copy link
@mgedmin

mgedmin Sep 17, 2015

Owner

Two blank lines between function defs.

from wsgiref.simple_server import make_server
srv = make_server('localhost', 8080, wsgi)
print("Started at http://localhost:8080/")
srv.serve_forever()

This comment has been minimized.

Copy link
@mgedmin

mgedmin Sep 17, 2015

Owner

This is nice for testing, but for actual use the port number should be configurable, e.g. via a command line option.

Also the log location and other options?

This comment has been minimized.

Copy link
@mgedmin

mgedmin Sep 17, 2015

Owner

(That's if we decide to have an installed script for serving IRC logs, rather than moving this to buildout.cfg)

'search'
>>> get_path(dict(PATH_INFO='/#channel-2015-05-05.log.html'))
'#channel-2015-05-05.log.html'

This comment has been minimized.

Copy link
@mgedmin

mgedmin Sep 17, 2015

Owner

Oh, is the path info already URL-decoded?

I assume you've tested this manually, right?

This comment has been minimized.

Copy link
@alga

alga Sep 17, 2015

Author Contributor

Well, log/html files with hashes in names work fine with bin/serve.

"""

def doctest_wsgi():

This comment has been minimized.

Copy link
@mgedmin

mgedmin Sep 17, 2015

Owner

Two blank lines.

file=stream)
print(FOOTER, file=stream)

return formatter # destroying it closes the stream, breaking the result

This comment has been minimized.

Copy link
@mgedmin

mgedmin Sep 17, 2015

Owner

This part is something I need to understand, because I currently don't, despite your explanations.

(I think that's because my code is dumb, not because your explanations are bad.)

This comment has been minimized.

Copy link
@alga

alga Sep 17, 2015

Author Contributor

Just change it to return None and debug the test failure.

@mgedmin

This comment has been minimized.

If you intend to test the IOError exception case, I think you need to remove the ./, no?

mgedmin added a commit that referenced this pull request Sep 17, 2015
WSGI support for irclogsearch
@mgedmin mgedmin merged commit 555dded into mgedmin:master Sep 17, 2015
1 check passed
1 check passed
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@mgedmin

This comment has been minimized.

Copy link
Owner

commented Sep 17, 2015

Thank you!

@mgedmin

This comment has been minimized.

Copy link
Owner

commented Sep 17, 2015

Code coverage regressed from 98% to 97% after I merged this :/

Oh well.

@alga

This comment has been minimized.

Copy link
Contributor Author

commented Sep 17, 2015

The serve() function is not covered...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants
You can’t perform that action at this time.