reFeed

A service that offers RSS feeds for websites that do not natively provide them.

API Documentation

reFeed is composed of two main components, the core and the client.

Core

The core is the engine that fetches, caches and updates the RSS feeds. Each task is handled by a standalone module:

  • a PageLoader requests the webpage at a given URL and returns the response body as a String.
  • a PageParser is instantiated with the HTML data and other config data. It parses the webpage and tries to find a set of items that can be recognized as articles. The rules a PageParser uses to recognize articles are defined on a per-feed basis as a collection of CSS selectors. This may be extended in the future to handle complex page layouts that need more sophisticated parsing. For an example of a feed configuration file, take a look at json/hindawi.json
  • a FeedGenerator takes a feedId, a config object and an optional xmlFile argument, the path to a cached version of the feed identified by feedId. FeedGenerator uses the cached version so that the engine does not have to fetch the whole website every time a few articles are added. It may instantiate one or more PageLoaders and PageParsers.
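
One way a cached feed can save work is sketched below in Python. This is an illustration only, not reFeed's actual code: the function names, the item dictionaries and the assumption that pages list articles newest first are all hypothetical. The idea is to read the links already present in the cached XML, then walk the parsed pages and stop at the first article that is already known, so older pages are never requested.

```python
import xml.etree.ElementTree as ET


def cached_links(xml_text):
    """Return the set of article links found in a cached RSS document."""
    if not xml_text:
        return set()
    root = ET.fromstring(xml_text)
    return {item.findtext("link") for item in root.iter("item")}


def collect_new_items(pages, known_links):
    """Walk parsed pages newest-first and stop at the first cached article.

    `pages` is any iterable of item lists (hypothetical shape: dicts with a
    "link" key). Passing a generator means later pages are never loaded.
    """
    fresh = []
    for page_items in pages:
        for item in page_items:
            if item["link"] in known_links:
                return fresh  # assumption: everything older is already cached
            fresh.append(item)
    return fresh
```

Because `pages` can be a generator that loads and parses one page per iteration, the early return is what avoids fetching the rest of the website.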

Life cycle

  • The core looks for feedIds that need regenerating (for example, by querying a MongoDB database that stores the required data).
  • It instantiates a FeedGenerator for each feedId returned by the query.
  • It listens for the end event on each FeedGenerator instance and writes the corresponding XML file to a specific directory.
  • It updates the last check time of each feed in the database.
  • A static file server serves the XML files.
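
The cycle above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the project's implementation: a plain dictionary stands in for the database, a synchronous `generate` callback stands in for a FeedGenerator and its end event, and the names `feeds_due`, `run_cycle` and the `interval` parameter are hypothetical.

```python
import os
import tempfile
import time


def feeds_due(feeds, now, interval):
    """Return the ids of feeds whose last check is at least `interval` seconds old."""
    return [fid for fid, meta in feeds.items() if now - meta["last_check"] >= interval]


def run_cycle(feeds, generate, out_dir, now=None, interval=3600):
    """Regenerate every due feed, write its XML file and record the check time."""
    now = time.time() if now is None else now
    written = []
    for feed_id in feeds_due(feeds, now, interval):
        # Hypothetical callback: returns the feed's XML as a string.
        xml = generate(feed_id, feeds[feed_id]["config"])
        path = os.path.join(out_dir, feed_id + ".xml")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(xml)
        feeds[feed_id]["last_check"] = now  # stands in for the database update
        written.append(path)
    return written


if __name__ == "__main__":
    store = {"hindawi": {"config": {}, "last_check": 0}}
    with tempfile.TemporaryDirectory() as directory:
        print(run_cycle(store, lambda fid, cfg: "<rss/>", directory))
```

The output directory is the one a static file server would expose; serving it is outside this sketch.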

Client (not implemented)

The client part of reFeed provides an interface for searching, creating and fetching generated feeds.

Usage scenario

  • The home page of the client is a simple webpage with a Google-like search box. The user enters a query to find an already generated feed by its website URL, title, description or category.

    1. If one match is found, the user is redirected to that feed.

    2. If more than one match is found, a list of the matches is displayed with the title, description and website URL of each match.

    3. If nothing is found and the input matches a valid URL pattern, the client tries to generate or find at least one feed at that URL. The results are then displayed as described in 1 or 2.
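
Since the client is not implemented, the following Python sketch only illustrates how the three outcomes could be dispatched. Every name here is hypothetical (`looks_like_url`, `search_feeds`, `handle_query`, the feed dictionaries and the `generate` callback), the substring match is a stand-in for whatever search the client would use, and the `not_found` outcome is an assumption for input that matches nothing and is not a URL.

```python
from urllib.parse import urlparse


def looks_like_url(text):
    """Loose check: an http(s) scheme plus a host name."""
    parts = urlparse(text.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def search_feeds(feeds, query):
    """Case-insensitive substring match over the searchable fields."""
    needle = query.strip().lower()
    if not needle:
        return []
    fields = ("url", "title", "description", "category")
    return [f for f in feeds if any(needle in f.get(k, "").lower() for k in fields)]


def handle_query(feeds, query, generate):
    """Map a search box query to one of the outcomes in the scenario."""
    matches = search_feeds(feeds, query)
    if not matches and looks_like_url(query):
        # Hypothetical callback: returns the feeds generated or found at that URL.
        matches = generate(query.strip())
    if len(matches) == 1:
        return ("redirect", matches[0])
    if matches:
        return ("list", matches)
    return ("not_found", None)  # assumption: not specified in the scenario
```

Generated results pass through the same one-or-many branch as stored matches, which mirrors the statement that they are displayed as in cases 1 and 2.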

[...]
