Introduction
News administrators seeking to keep a handle on the size of their newsfeeds may opt, as I have, not to carry binaries. While it is fairly simple to drop newsgroups with binaries, pictures, and sound in their names, you still need to deal with binaries posted in newsgroups intended solely for discussions.
To handle this for news servers running innd, I wrote the Perl script purge-binaries, which scans news postings and purges those it determines to be binary files. It is designed to work as a channel with inn, although it should be possible to feed it output from ls -l too.
To use it, you will need Perl 4.0 or better as well as the ctime.pl routine that emulates ctime(3C).
As an alternative, you may want to investigate cleanfeed. Provided you're running innd with perl filtering enabled, cleanfeed not only filters out binaries but also traps a whole slew of other garbage that would otherwise fill up your newsspool.
Installation
- Retrieve the script and save it locally.
- Verify ownership and permissions on the script.
- Edit the script and adjust the path to perl in the first line, as well as $logfile, $encodedpct, $minsize, and/or $maxlines, to suit your tastes.
Use
By design, purge-binaries operates as a channel with inn, checking each article as it arrives and deleting binaries. To use it, add an entry like the following to your newsfeeds file:
purge-binaries:*:Tc,WbO:/usr/local/news/bin/purge-binaries
which will process articles in every newsgroup. If you'd prefer, you can modify this entry to restrict attention to certain hierarchies and/or newsgroups. Refer to newsfeeds(5) for more specific information.
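As an illustration, an entry restricted to two hierarchies might look like the following. The comp.* and sci.* patterns are only examples chosen for this sketch; substitute whatever hierarchies you want checked, and consult newsfeeds(5) for the full pattern syntax.

```
purge-binaries:comp.*,sci.*:Tc,WbO:/usr/local/news/bin/purge-binaries
```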
Note that this channel is fed overview data so crossposted articles will be handled properly.
You may want to adjust some of the parameters in the script itself, for example to improve performance. One word of warning, though: lowering the percentage of encoding required to flag a post will increase the risk of falsely identifying a post as a binary.
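The trade-off behind that warning can be illustrated with a small sketch. This is not the code of purge-binaries, which is written in Perl and may work quite differently; it is a hypothetical Python illustration that assumes a post is judged by the share of its non-blank lines that have the shape of uuencoded data. The names is_uuencoded_line, encoded_percentage, looks_binary, and the encoded_pct argument are all invented for this example.

```python
def is_uuencoded_line(line):
    """Return True if the line has the shape of a uuencoded data line.

    Assumption for this sketch: every character lies in the uuencode
    alphabet (ASCII 32..96), the first character encodes a byte count of
    1..45, and the line length matches that count.
    """
    if not line or any(not 32 <= ord(c) <= 96 for c in line):
        return False
    n = ord(line[0]) - 32              # decoded byte count for this line
    if not 1 <= n <= 45:
        return False
    # Every 3 bytes become 4 characters, plus the leading length character.
    return len(line) == 1 + 4 * ((n + 2) // 3)


def encoded_percentage(body):
    """Percentage of non-blank lines in body that look uuencoded."""
    lines = [l for l in body.splitlines() if l.strip()]
    if not lines:
        return 0.0
    hits = sum(1 for l in lines if is_uuencoded_line(l))
    return 100.0 * hits / len(lines)


def looks_binary(body, encoded_pct=50.0):
    """Flag the post when the encoded share reaches the threshold.

    encoded_pct is a hypothetical stand-in for a tunable threshold; the
    default of 50 is arbitrary and not taken from the real script.
    """
    return encoded_percentage(body) >= encoded_pct


if __name__ == "__main__":
    data_line = "M" + "_" * 60         # shape of a full 45-byte uuencoded line
    mostly_text = "\n".join(["plain discussion text"] * 8 + [data_line] * 2)
    for threshold in (50.0, 10.0):
        print(threshold, looks_binary(mostly_text, encoded_pct=threshold))
```

In this sketch, a post that is mostly discussion but quotes a couple of encoded lines stays below a high threshold and is flagged only once the threshold is lowered, which is the kind of false positive the warning is about.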
Also of interest may be Brian Edmonds' binsummary script, which summarizes the output generated by purge-binaries. It shows hidden binary groups and how prevalent binary postings are in various groups.
License
You may freely use and redistribute this script. I cannot offer any support but am interested in your comments, suggestions, and problem reports. Note, though, that I no longer administer a news server, so I'm not likely to be able to troubleshoot problems you might have with this script.