Post Hacker News Stories & Comments to an NNTP Server

Or how to read HN offline.

Features

  • Uses the official HN API
  • No configuration files
  • CLI
  • Compatible w/ cron jobs
  • MIME multipart/alternative mails w/ html & txt portions
  • Mostly stateless
  • Read-only
  • No up-voting support or score updates
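
For readers unfamiliar with the format, here is a rough sketch of what a multipart/alternative body looks like: a text/plain part followed by a text/html part, separated by a boundary line. The function name make_alternative and the boundary string are made up for illustration. The real articles come from the tool's templates, and a real news article also needs headers such as From, Newsgroups, Subject & Message-ID.

```shell
# Print a minimal multipart/alternative entity: a text/plain part followed
# by a text/html part. The boundary string is an arbitrary placeholder
# (hypothetical, not the one hackernews2nntp-convert emits).
make_alternative() {
    text_part=$1
    html_part=$2
    boundary="hn-example-boundary"
    printf 'MIME-Version: 1.0\n'
    printf 'Content-Type: multipart/alternative; boundary="%s"\n\n' "$boundary"
    printf '%s\n' "--$boundary"
    printf 'Content-Type: text/plain; charset=utf-8\n\n%s\n' "$text_part"
    printf '%s\n' "--$boundary"
    printf 'Content-Type: text/html; charset=utf-8\n\n%s\n' "$html_part"
    # The closing delimiter carries two trailing dashes.
    printf '%s\n' "--$boundary--"
}

make_alternative 'As you can see, we do.' '<p>As you can see, we do.</p>'
```

The parts are ordered from plainest to richest, so a reader that can't render html falls back to the txt portion.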

A screenshot of running mutt

Requirements

  • nodejs 0.10.3x (doesn't work w/ node 0.12 or iojs due to request dependency)
  • rnews CL util from INN package
  • w3m browser

Installation & Setup

(in Fedora 21)

# yum install w3m inn

Add this to sudoers (replacing alex w/ your user name):

alex ALL = (news) NOPASSWD: /bin/rnews

Then in the cloned repo:

$ make

or just

# npm install -g hackernews2nntp

Check your local INN by creating the group:

# /usr/libexec/news/ctlinnd newgroup news.ycombinator

The command must not raise an error.

Then

$ hackernews2nntp-get exact 8874 -v | hackernews2nntp-convert -v -f mbox > 1.mbox

will download an HN comment & convert it to mbox format. If you have mutt installed, you can view it via mutt -f 1.mbox.
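
If mutt isn't at hand, a rough sanity check is to count the "From " separator lines, one per message in mbox format. This is only a sketch: it assumes body lines that start with "From " are escaped as ">From " (a common mbox convention; whether hackernews2nntp-convert does so isn't verified here). mbox_count is a hypothetical helper name.

```shell
# Count messages in an mbox file by counting "From " separator lines.
# Assumption: body lines beginning with "From " are escaped as ">From ".
mbox_count() {
    grep -c '^From ' "$1"
}

# Demo on a throwaway two-message file.
demo=$(mktemp)
printf 'From a@example.com Thu Jan  1 00:00:00 2015\nSubject: one\n\nhello\n\n' > "$demo"
printf 'From b@example.com Thu Jan  1 00:00:01 2015\nSubject: two\n\n>From the body\n\n' >> "$demo"
mbox_count "$demo"   # prints 2
rm -f "$demo"
```

For the file created above, that would be mbox_count 1.mbox.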

$ hackernews2nntp-get exact 8874 -v | hackernews2nntp-convert -v | sudo rnews -N

will post the same comment to the news.ycombinator group. If the message doesn't appear (because it's too old (Apr 2007) for the default INN settings), check the logs:

$ journalctl /bin/rnews
$ journalctl -u innd

Examples

  1. Get top 100 stories & all comments for them, then exit:

     $ hackernews2nntp-get top100 -v | hackernews2nntp-convert -v | sudo rnews -N
    

    If you get an EPIPE error, don't pipe to rnews but try to invoke hackernews2nntp-convert w/ the --fork option:

     $ hackernews2nntp-get top100 -v | hackernews2nntp-convert -v --fork
    

    (It will call sudo rnews -N internally for each article.)

  2. Get last 200 stories/comments, then exit:

     $ hackernews2nntp-get last 200 -v --nokids | hackernews2nntp-convert -v | sudo rnews -N
    
  3. Don't post anything to an NNTP server but create 1 big .mbox file:

     $ rm -f 1.mbox
     $ hackernews2nntp-get top100 -v | hackernews2nntp-convert -v -f mbox >> 1.mbox
    
  4. Get stories/comments in range from 8,000,000 to 8,000,100:

     $ hackernews2nntp-get -v --nokids range 8000000 8000100 | hackernews2nntp-convert -v | sudo rnews -N
    
  5. Get stories/comments from 8859730 up to the most current one & save the last (highest numerical value) item id in /tmp/last-item.txt:

     $ hackernews2nntp-get -v --maxitem-save /tmp/last-item.txt --nokids range 8859730 | hackernews2nntp-convert -v | sudo rnews -N
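
The range example above lends itself to incremental runs from cron. A minimal sketch, assuming the file written by --maxitem-save holds just the numeric item id & that restarting range from that same id is acceptable. next_range_start & hn_sync are hypothetical helper names, not part of the package.

```shell
#!/bin/sh
# Print the item id to resume from: the content of the state file if it
# holds a plain number, otherwise the supplied default.
next_range_start() {
    state_file=$1
    default_id=$2
    id=
    [ -r "$state_file" ] && id=$(tr -d '[:space:]' < "$state_file")
    case $id in
        ''|*[!0-9]*) printf '%s\n' "$default_id" ;;
        *)           printf '%s\n' "$id" ;;
    esac
}

# One incremental run, built from the range example above.
# Assumption: the state file content is exactly what --maxitem-save wrote.
hn_sync() {
    state_file=$1
    start=$(next_range_start "$state_file" "$2")
    hackernews2nntp-get -v --maxitem-save "$state_file" --nokids range "$start" |
        hackernews2nntp-convert -v | sudo rnews -N
}

# Typical call, e.g. as the last line of a script run by cron:
# hn_sync /tmp/last-item.txt 8859730
```

The first run starts from the default id. Later runs pick up whatever the previous run saved.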
    

FAQ

  1. I have a problem w/ rnews.

    Please don't ask me any questions about INN. I have only a vague idea of how it works. I've chosen rnews because it (a) can read articles from stdin in batch mode, (b) doesn't modify the incoming article, (c) is fast, (d) comes w/ INN.

    Unfortunately, it's not possible to know whether an article was posted w/o reading the INN logs.

  2. Can hackernews2nntp run as a daemon?

    No.

  3. What does the "hackernews2nntp-convert warning: json validation failed" message mean?

    Usually it means that an HN post was deleted & there was no useful data in the json payload. For example,

     $ hackernews2nntp-get exact 126809 | json -g -c 'this.deleted'
     [
       {
     	"deleted": true,
     	"id": 127217
       }
     ]
    

    vs.

     $ hackernews2nntp-get exact 126809 | json -g \
     	-c '!this.kids && this.by == "pg" && this.type == "comment"' | json 0
     {
       "by": "pg",
       "id": 126816,
       "parent": 126809,
       "text": "As you can see, we do.  You can read more [...]",
       "time": 1204404016,
       "type": "comment"
     }
    

Bugs

  • Barely tested on Fedora 21 only.
  • Supports only UTF-8 locales.
  • Doesn't follow the 'parent' property, e.g. if it gets a comment, it tries to download all its 'kids', but ignores the 'parent'.
  • hackernews2nntp-get can pause the node 0.10.x process if you're not using the --nokids option.
  • src/crawler2.coffee is too long.

See Also

rnews(1), w3m(1), mbox(5), sudoers(5)

News

0.2.1

  • hackernews2nntp-get
    • fix a crash in json validation

0.2.0

  • hackernews2nntp-get
    • totally rewrite Crawler
    • throttle http requests to a max of 20/s (see --conn-per-sec)

0.1.0

  • hackernews2nntp-get

    • range mode
    • --maxitem-save CLO
    • -s CLO
    • always print statistics on exit w/ -v or -s CLOs
  • hackernews2nntp-convert

    • --template-dir CLO
    • fix a bug in mbox header w/ missing leading zeros

Credits

Many thanks to John Magolske for suggestions for hackernews2nntp-get range mode & --maxitem-save CLO & also for reporting bugs.

License

MIT.