Filters RSS/Atom feeds, returning only the articles that match a specified pattern. The output is another valid XML feed.

What's included

  • a cli util;
  • a standalone http server that shares the same engine w/ the cli util;
  • a web client that uses the included server as an intermediary and acts as a gui version of the cli util.


  • requires node >= 20


$ npm i -g grepfeed
$ grepfeed-server

Open in a browser.

How it works

lib/feed.js contains all the code that parses & transforms xml feeds. Its core is the Grep class, a Transform stream:

readable_stream.pipe(<our filter>).pipe(writable_stream)
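To make the pipe idea concrete, here is a minimal sketch of a filtering Transform stream. It is not the actual Grep class: TitleFilter, collect and the sample articles are hypothetical, and the sketch works on plain objects instead of parsing xml.

```javascript
const { Readable, Transform } = require('node:stream');

// Hypothetical stand-in for the real Grep class: an object-mode Transform
// that passes through only the articles whose title matches a pattern.
class TitleFilter extends Transform {
  constructor(pattern) {
    super({ objectMode: true });
    this.pattern = new RegExp(pattern);
  }

  _transform(article, _encoding, callback) {
    // Push matching articles downstream; silently drop the rest.
    if (this.pattern.test(article.title)) this.push(article);
    callback();
  }
}

// Collect everything a readable stream emits into an array.
async function collect(stream) {
  const items = [];
  for await (const item of stream) items.push(item);
  return items;
}

// Made-up sample data, standing in for parsed feed entries.
const articles = [
  { title: 'apple pie recipes' },
  { title: 'banana bread' },
  { title: 'apple cider' },
];

const filtered = Readable.from(articles).pipe(new TitleFilter('apple'));
collect(filtered).then((items) => console.log(items.map((a) => a.title)));
```

Because the filter is an ordinary stream, it can sit between any readable source and any writable sink, which is presumably why the cli and the server can share one engine.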


cli/grepfeed.js extends Grep and overrides several methods so that the output can be written in whatever format is wanted. Three interfaces are included: text-only (the default), json, and xml. The last one produces a valid rss 2.0 feed. E.g.

$ curl | cli/grepfeed.js apple -d=2016 -x

parses the input feed and selects only the articles written in 2016 or later that match the regexp pattern /apple/. -x means xml output.

Usage: grepfeed.js [opt] [PATTERN] < xml

  -e      print only articles w/ enclosures
  -n NUM  number of articles to print
  -x      xml output
  -j      json output
  -m      print only meta
  -V      program version

Filter by:

  -d      [-]date[,date]
  -c      categories

Or/and search for a regexp PATTERN in each rss article & print the
matching ones. The internal order of the search: title, summary,
description, author.

  -v      invert match
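The search order and the -v flag described in the usage text could be sketched roughly as follows. Only the field order comes from the usage text; the function names, the article shape and the handling of missing fields are assumptions, not the code in lib/feed.js.

```javascript
// Field order taken from the usage text above; everything else is assumed.
const SEARCH_ORDER = ['title', 'summary', 'description', 'author'];

// Return the name of the first field that matches, or null if none does.
function firstMatchingField(article, pattern) {
  const re = new RegExp(pattern);
  for (const field of SEARCH_ORDER) {
    const value = article[field];
    if (typeof value === 'string' && re.test(value)) return field;
  }
  return null;
}

// Decide whether an article is printed; `invert` mimics the -v flag.
function isSelected(article, pattern, invert = false) {
  const matched = firstMatchingField(article, pattern) !== null;
  return invert ? !matched : matched;
}

// Made-up sample article: the pattern is found in the description only.
const article = { title: 'Fruit news', description: 'An apple a day' };
console.log(firstMatchingField(article, 'apple'));
console.log(isSelected(article, 'apple'), isSelected(article, 'apple', true));
```

Stopping at the first matching field keeps the check cheap: once the title matches, the longer description never needs to be scanned.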


The server acts as a proxy: it downloads a requested feed & returns the filtered xml. Query params match the cli/grepfeed.js command line interface. To start a server, run

$ make
$ server/index.js

(For a different host/port combination, use HOST & PORT env vars.)

The following example yields the same xml as in the cli/grepfeed.js case, only through http:

$ curl ''

Note that d corresponds to -d in the cli/grepfeed.js example, -x makes no sense here, and _ stands for the 1st command line arg (apple in this case). The server doesn't invoke the cli/grepfeed.js program; both use minimist to parse options, hence the similar behaviour.
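The mapping between query params and command line options can be illustrated with a small sketch. It does not call minimist and is not the server's code: queryToOptions and the query string 'd=2016&_=apple' are hypothetical, chosen only to mirror the d and _ names mentioned above.

```javascript
// Turn a query string into an options object shaped like a parsed command
// line: `_` collects positional arguments, every other key is a named option.
// Values stay strings here; a real option parser may coerce them.
function queryToOptions(query) {
  const opts = { _: [] };
  for (const [key, value] of new URLSearchParams(query)) {
    if (key === '_') opts._.push(value);
    else opts[key] = value;
  }
  return opts;
}

// Hypothetical query string, roughly equivalent to `apple -d=2016` on the cli.
console.log(queryToOptions('d=2016&_=apple'));
```

With both front ends reduced to the same kind of options object, the filtering engine doesn't need to know whether a request came from a terminal or over http.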


A URL you'd like to filter must be reachable from the machine server/index.js is running on. This could pose a security risk, or be inconvenient if you want to filter XML from your LAN. In the latter case, run grepfeed-server on your local machine.


  • All html tags in article titles are removed, even if a title is in plain text.
  • This should've been written in Rust or something similar, as Node is slow and memory hungry for this kind of task.

See also

itunesrss, rss2mail



