marvinthepa / urlscan
Martin Sander (author)
Thu Oct 08 13:40:59 -0700 2009
urlscan
Daniel Burrows <dburrows@debian.org>
0) Purpose and Requirements
urlscan is a small program that is designed to integrate with the
"mutt" mailreader to allow you to easily launch a Web browser for URLs
contained in email messages. It is a replacement for the "urlview"
program.
urlscan requires Python and the python-urwid library, as well as
sensible-browser from the debianutils package.
1) Features
urlscan parses an email message passed on standard input and scans
it for URLs. It then displays the URLs and their context within the
message, and allows you to choose one or more URLs to send to your Web
browser.
Relative to urlview, urlscan has the following additional features:
(1) Support for emails in quoted-printable and base64 encodings. No
more stripping out =40D from URLs by hand!
(2) The context of each URL is provided along with the URL. For
HTML mails, a crude parser is used to render the HTML into text.
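As a rough illustration of feature (1), the sketch below uses Python's standard email package to undo quoted-printable and base64 encodings before searching for URLs. It is not urlscan's actual code: the function name extract_urls and the URL pattern are assumptions made for this example, and the pattern is deliberately simplistic.

```python
import email
import re
from email import policy

# Deliberately simple pattern; an assumption for illustration only.
URL_RE = re.compile(r"https?://[^\s<>\"']+")


def extract_urls(raw_message):
    """Return the URLs found in the text parts of a raw email (bytes)."""
    msg = email.message_from_bytes(raw_message, policy=policy.default)
    urls = []
    for part in msg.walk():
        # Skip multipart containers and non-text attachments.
        if part.get_content_maintype() != "text":
            continue
        # get_content() reverses the transfer encoding (quoted-printable,
        # base64) and decodes the charset, returning a str.
        text = part.get_content()
        urls.extend(URL_RE.findall(text))
    return urls
```

When a message is piped in from mutt, the raw bytes could be read with sys.stdin.buffer.read() and passed to extract_urls.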
2) Setting up urlscan
To set up urlscan, install the Debian "urlscan" package (or use
setup.py to install the program). Once urlscan is installed, add the
following lines to your .muttrc:
macro index,pager \cb "<pipe-message> urlscan<Enter>" "call urlscan to extract URLs out of a message"
macro attach,compose \cb "<pipe-entry> urlscan<Enter>" "call urlscan to extract URLs out of a message"
Once this is done, pressing Control-b while reading mail in mutt will
invoke urlscan on the current message.
urlscan uses sensible-browser to invoke the default Web browser of
the current environment. To choose a particular browser, set the
environment variable BROWSER; for example:
export BROWSER=/usr/bin/epiphany
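The same override can be applied per invocation from Python rather than in the shell. The following is a minimal sketch, not urlscan's own launcher: the function name open_in_browser and its run parameter are hypothetical, and it assumes sensible-browser is on the PATH.

```python
import os
import subprocess


def open_in_browser(url, browser=None, run=subprocess.run):
    """Launch sensible-browser on url, optionally overriding BROWSER.

    The run parameter exists only so the call can be swapped out for
    testing; it defaults to subprocess.run.
    """
    env = dict(os.environ)  # copy, so the caller's environment is untouched
    if browser is not None:
        env["BROWSER"] = browser  # sensible-browser consults this variable
    return run(["sensible-browser", url], env=env)
```

For example, open_in_browser("http://example.com/", browser="/usr/bin/epiphany") would have the same effect as exporting BROWSER before starting urlscan, but only for that one launch.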
3) Known bugs and limitations
(1) Because the Python curses module does not support wide
characters (see Debian bug #336861), non-ASCII characters can
cause unpredictable results in urlscan. This problem will go
away if Python and urwid are patched to support wide characters.
(2) Extraction of context from HTML messages leaves something to be
desired. Probably the ideal solution would be to extract
context on a word basis rather than on a paragraph basis.
(3) The HTML message handling is a bit kludgy in general.
(4) multipart/alternative sections are handled by descending into
all the sub-parts, rather than just picking one, which may lead
to URLs and context appearing twice.
(5) Configurability is more than a little bit lacking.
(6) Running urlscan sometimes "messes up" the terminal background.
This seems to be an urwid bug, but I haven't tracked down just
what's going on.
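One possible way around limitation (4) is to pick a single sub-part from each multipart/alternative section instead of descending into all of them. The sketch below shows the idea with the standard email package. It is not how urlscan currently behaves, the function name preferred_text_parts and the default preference order are assumptions, and it expects a message parsed with email.policy.default.

```python
from email.message import EmailMessage


def preferred_text_parts(msg, prefer=("plain", "html")):
    """Yield text parts, taking one sub-part per multipart/alternative."""
    if msg.get_content_type() == "multipart/alternative":
        candidates = list(msg.iter_parts())
        # Try each preferred subtype in order and stop at the first match.
        for subtype in prefer:
            for part in candidates:
                if part.get_content_type() == "text/" + subtype:
                    yield part
                    return
        # No plain text/* alternative: fall back to the last alternative,
        # which by convention is the richest one.
        if candidates:
            yield from preferred_text_parts(candidates[-1], prefer)
    elif msg.is_multipart():
        # Other multipart types (mixed, related, ...): visit every child.
        for part in msg.iter_parts():
            yield from preferred_text_parts(part, prefer)
    elif msg.get_content_maintype() == "text":
        yield msg
```

With this approach, a message carrying both a plain-text and an HTML rendering of the same body would contribute each URL once rather than twice.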

