gromnitsky/grepfeed

Grepfeed

Filters rss/atom feeds: returns only the articles matching a pattern. The output is another valid xml feed.

What's included

  • a cli util grepfeed.js;
  • a standalone http server that shares the same engine w/ the cli util;
  • a web client that uses the included server as an intermediary and acts as a gui version of the cli util.

Requirements

  • node >= 14.17.6
  • GNU make

Setup

  • cli/server

    $ npm i grepfeed
    

    or manually after cloning the repo:

    $ NODE_ENV=production npm i
    
  • web client, which isn't included in the npm pkg

    $ npm i
    $ make
    

How it works

lib/feed.js contains all the code that parses & transforms xml feeds. Its core is the Grep class, a Transform stream:

readable_stream.pipe(<our filter>).pipe(writable_stream)
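The pipe pattern above can be illustrated w/ a tiny stand-in filter. This is only a sketch of a Node Transform stream, not the actual Grep class: LineFilter is a hypothetical name, and it filters plain text lines instead of feed articles.

```javascript
const { Readable, Transform } = require('stream')

// Hypothetical stand-in for Grep: lets through only the lines that match.
class LineFilter extends Transform {
  constructor(pattern) {
    super()
    this.pattern = pattern  // a RegExp w/o the g flag
    this.tail = ''          // incomplete last line of the previous chunk
  }

  _transform(chunk, _encoding, done) {
    const lines = (this.tail + chunk.toString()).split('\n')
    this.tail = lines.pop() // may be a partial line; keep it for later
    for (const line of lines) {
      if (this.pattern.test(line)) this.push(line + '\n')
    }
    done()
  }

  _flush(done) {
    // the input has ended: the leftover is a complete line now
    if (this.tail && this.pattern.test(this.tail)) this.push(this.tail + '\n')
    done()
  }
}

// readable_stream.pipe(<our filter>).pipe(writable_stream)
Readable.from(['apple pie\nban', 'ana bread\napple cider'])
  .pipe(new LineFilter(/apple/))
  .pipe(process.stdout)
```

Because the filter is an ordinary stream, the same object can sit between stdin & stdout in a cli or between an http download & an http response in a server.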

cli

cli/grepfeed.js extends Grep, overriding several methods so that the output can be written in any format one wants. 3 interfaces are included: text-only (the default), json, and xml. The latter produces a valid rss 2.0 feed. E.g.,

$ curl http://example.com/rss | cli/grepfeed.js apple -d=2016 -x

parses the input feed & selects only the articles written in 2016 or later that match the regexp pattern /apple/; -x selects xml output.

Usage: grepfeed.js [opt] [PATTERN] < xml

  -e      print only articles w/ enclosures
  -n NUM  number of articles to print
  -x      xml output
  -j      json output
  -m      print only meta
  -V      program version

Filter by:

  -d      [-]date[,date]
  -c      categories

Or/and search for a regexp PATTERN in each rss article & print the
matching ones. The internal order of the search: title, summary,
description, author.

  -v      invert match
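The search order from the usage text can be sketched as follows. The article shape (plain objects w/ string fields) & the matches() helper are assumptions made for illustration; the real matching code lives in lib/feed.js & may differ, in particular in how -v is applied.

```javascript
// Field order as listed in the usage text.
const SEARCH_ORDER = ['title', 'summary', 'description', 'author']

// Hypothetical helper: true if the pattern is found in any of the fields,
// checked in SEARCH_ORDER; `invert` mimics the -v option.
function matches(article, pattern, invert = false) {
  const hit = SEARCH_ORDER.some(field => pattern.test(article[field] || ''))
  return invert ? !hit : hit
}

const articles = [
  { title: 'apple pie', author: 'bob' },
  { title: 'banana bread', description: 'a loaf', author: 'alice' },
]

console.log(articles.filter(a => matches(a, /apple/)).map(a => a.title))
console.log(articles.filter(a => matches(a, /apple/, true)).map(a => a.title))
```

The first call prints the titles of the matching articles, the second one the titles of the rest.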

server

Acts as a proxy: downloads a requested feed & returns the filtered xml. Query params match the cli/grepfeed.js command line interface. To start a server, run

$ server/index.js .

(To select a different port, set the PORT env var.)

The following example yields the same xml as in the cli/grepfeed.js case, only it does so through http:

$ curl '127.0.0.1:3000/api/?_=apple&d=2016&url=http%3A%2F%2Fexample.com%2Frss'

Notice that d corresponds to -d in the cli/grepfeed.js example & that _ holds the 1st command line arg, apple in this case; -x doesn't make sense here, for the server always returns xml. The server doesn't invoke the cli/grepfeed.js program; they both use minimist to parse options, hence the similarity in behaviour.
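If you build such a request from a script, the feed url must be percent-encoded. A minimal sketch w/ the URLSearchParams class that is built into node; the host, port & param values are the ones from the curl example above, while the variable names are arbitrary:

```javascript
// _ is the pattern, d is the date filter, url is the feed to download
const params = new URLSearchParams({
  _: 'apple',
  d: '2016',
  url: 'http://example.com/rss',
})

const endpoint = `http://127.0.0.1:3000/api/?${params}`
console.log(endpoint)
// prints http://127.0.0.1:3000/api/?_=apple&d=2016&url=http%3A%2F%2Fexample.com%2Frss
```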

web client

The web client is a simple React app (chrome/ff only) that internally talks to the above server. If you have built the web client, pass the dir w/ the compiled client files to the server:

$ server/index.js _out

& open http://127.0.0.1:3000 in a browser.

License

MIT.

Have fun!
