Update documentation

commit 0a0818b1b46312d2b0e685053393085eaa0a821b 1 parent 8868c0e
@danmactough authored
Showing with 108 additions and 66 deletions.
  1. +108 −66 README.md
174 README.md
@@ -11,23 +11,39 @@ Isaac Schlueter's [sax](https://github.com/isaacs/sax-js) parser.
## Installation
+```bash
npm install feedparser
+```
-## Changes since v0.9.13
-
-Instantiating the parser or calling one of the parser methods now may be
-called with an optional [options object](#options).
+## Changes since v0.9.x
-## Usage
+The module now exports `parseString`, `parseFile`, `parseUrl`, and `parseStream`
+as static functions. You no longer need to create a `FeedParser` instance or use
+the prototype methods. Due to confusion about how to implement those methods in
+application code, using the prototype methods is now **DEPRECATED**.
-### Create a new instance
+As a major enhancement, Feedparser is now able to properly handle XML
+namespaces, including those in sadistic feeds that define a non-default
+namespace for the main feed elements.
+### Old API (Deprecated)
```javascript
+var FeedParser = require('feedparser')
+ , parser = new FeedParser()
+ ;
+parser.on('article', console.log);
+parser.parseString(string);
+```
- var FeedParser = require('feedparser')
- , parser = new FeedParser() // optionally called with an options object
- ;
+### New API
+```javascript
+var feedparser = require('feedparser');
+feedparser.parseString(string)
+ .on('article', console.log);
```
+
+## Usage
+
### parser.parseString(string, [options], [callback])
- `string` - the contents of the feed
@@ -38,6 +54,11 @@ called with an optional [options object](#options).
### parser.parseUrl(url, [options], [callback])
+The first argument can be either a url or a `request` options object. The only
+required option is `uri`; all others are optional. See
+[request](https://github.com/mikeal/request#requestoptions-callback) for details
+about what that `request` options object might look like.
+
- `url` - fully qualified uri or a parsed url object from url.parse()
### parser.parseStream(readableStream, [options], [callback])
@@ -68,77 +89,98 @@ called with an optional [options object](#options).
## Examples
```javascript
-
- var FeedParser = require('feedparser')
- , parser = new FeedParser()
- // The following modules are used in the examples below
- , fs = require('fs')
- , request = require('request')
- ;
+var feedparser = require('feedparser')
+ , fs = require('fs') // used in the examples below
+ ;
```
### Use as an EventEmitter
-```javascript
+(For brevity in this pseudo-code, I'm not handling errors. But you need to
+handle errors in your code.)
- parser.on('article', function (article){
- console.log('Got article: %s', JSON.stringify(article));
- });
-
- // You can give a local file path to parseFile()
- parser.parseFile('./feed');
-
- // For libxml compatibility, you can also give a URL to parseFile()
- parser.parseFile('http://cyber.law.harvard.edu/rss/examples/rss2sample.xml');
-
- // Or, you can give that URL to parseUrl()
- parser.parseUrl('http://cyber.law.harvard.edu/rss/examples/rss2sample.xml');
-
- // But you should probably be using conditional GETs and passing the results to
- // parseString() or piping it right into the stream, if possible
-
- var reqObj = {'uri': 'http://cyber.law.harvard.edu/rss/examples/rss2sample.xml',
- 'headers': {'If-Modified-Since' : <your cached 'lastModified' value>,
- 'If-None-Match' : <your cached 'etag' value>}};
-
- // parseString()
- request(reqObj, function (err, response, body){
- parser.parseString(body);
- });
+```javascript
- // Stream piping -- very sexy
- request(reqObj).pipe(parser.stream);
-
- // Using the stream interface with a file (or string)
- // A good alternative to parseFile() or parseString() when you have a large local file
- parser.parseStream(fs.createReadStream('./feed'));
- // Or
- fs.createReadStream('./feed').pipe(parser.stream);
+function callback (article) {
+ console.log('Got article: %s', JSON.stringify(article));
+}
+
+// You can give a local file path to parseFile()
+feedparser.parseFile('./feed')
+ .on('article', callback);
+
+// For libxml compatibility, you can also give a URL to parseFile()
+feedparser.parseFile('http://cyber.law.harvard.edu/rss/examples/rss2sample.xml')
+ .on('article', callback);
+
+// Or, you can give that URL to parseUrl()
+feedparser.parseUrl('http://cyber.law.harvard.edu/rss/examples/rss2sample.xml')
+ .on('article', callback);
+
+// But you should probably be using conditional GETs and passing the results to
+// parseString() or piping it right into the stream, if possible
+
+var request = require('request');
+var reqObj = {'uri': 'http://cyber.law.harvard.edu/rss/examples/rss2sample.xml',
+ 'headers': {'If-Modified-Since' : <your cached 'lastModified' value>,
+ 'If-None-Match' : <your cached 'etag' value>}};
+
+// parseString()
+request(reqObj, function (err, response, body){
+ feedparser.parseString(body)
+ .on('article', callback);
+});
+
+// Stream piping
+request(reqObj).pipe(feedparser.stream);
+
+// Or you could try letting feedparser handle working with request (experimental)
+feedparser.parseUrl(reqObj)
+ .on('response', function (response){
+ // do something like save the HTTP headers for a future request
+ })
+ .on('article', callback);
+
+// Using the stream interface with a file (or string)
+// A good alternative to parseFile() or parseString() when you have a large local file
+feedparser.parseStream(fs.createReadStream('./feed'))
+ .on('article', callback);
+// Or
+fs.createReadStream('./feed').pipe(feedparser.stream)
+ .on('article', callback);
```
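The conditional GET request object in the example above uses placeholders for the cached values. A minimal sketch of how such an object might be built follows. The helper name `buildConditionalRequest`, the shape of the `cached` argument, and the sample header values are all assumptions made for illustration; none of them is part of feedparser or `request`.

```javascript
// Hypothetical helper (not part of feedparser or request): builds a
// request-style options object and adds conditional GET headers only
// for the cached values that are actually present.
function buildConditionalRequest(uri, cached) {
  var headers = {};
  if (cached && cached.lastModified) {
    headers['If-Modified-Since'] = cached.lastModified;
  }
  if (cached && cached.etag) {
    headers['If-None-Match'] = cached.etag;
  }
  return { uri: uri, headers: headers };
}

// The cached values below are made up for the example.
var reqObj = buildConditionalRequest(
  'http://cyber.law.harvard.edu/rss/examples/rss2sample.xml',
  { lastModified: 'Tue, 03 Jun 2003 09:39:21 GMT', etag: '"abc123"' }
);

// With nothing cached yet, no conditional headers are added.
var firstFetch = buildConditionalRequest(
  'http://cyber.law.harvard.edu/rss/examples/rss2sample.xml',
  null
);
```

An object like `reqObj` could then be handed to `request(reqObj)` or, as described under `parseUrl`, passed directly as the first argument.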
+#### Events
+* `complete` - called with `meta` and `articles` when parsing is complete
+* `end` - called with no parameters when parsing is complete or aborted (e.g., due to error)
+* `error` - called with `error` whenever there is an error of any kind (SAXError, Feedparser error, request error, etc.)
+* `meta` - called with `meta` when it has been parsed
+* `article` - called with a single `article` when each article has been parsed
+* `response` - called with the HTTP `response` only when a url has been fetched via parseUrl or parseFile
+* `304` - called with no parameters when a url has been fetched with a conditional GET via parseUrl or parseFile and the remote server responds with '304 Not Modified'
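
To illustrate how handlers for the events listed above might be wired up, here is a sketch that uses a plain Node `EventEmitter` as a stand-in for the emitter returned by the parse functions. The helper `attachHandlers`, the `seen` object, and the hand-emitted sample data are assumptions for illustration only; a real feed would emit these events on its own.

```javascript
var EventEmitter = require('events').EventEmitter;

// Hypothetical helper: attaches one handler per documented event and
// records what each handler receives.
function attachHandlers(emitter) {
  var seen = { meta: null, articles: [], errors: [], notModified: false, ended: false };
  emitter.on('meta', function (meta) { seen.meta = meta; });
  emitter.on('article', function (article) { seen.articles.push(article); });
  emitter.on('error', function (error) { seen.errors.push(error); });
  emitter.on('304', function () { seen.notModified = true; });
  emitter.on('end', function () { seen.ended = true; });
  return seen;
}

// Stand-in emitter: the events are replayed by hand here, in place of
// an emitter returned by one of the parse functions.
var fake = new EventEmitter();
var seen = attachHandlers(fake);

fake.emit('meta', { title: 'Example feed' });
fake.emit('article', { title: 'First post' });
fake.emit('article', { title: 'Second post' });
fake.emit('end');
```

Registering an `error` handler matters in practice: a Node `EventEmitter` throws if an `error` event is emitted with no listener attached.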
+
### Use with a callback
When the feed is finished being parsed, if you provide a callback, it gets
called with three parameters: error, meta, and articles.
```javascript
+function callback (error, meta, articles){
+ if (error) console.error(error);
+ else {
+ console.log('Feed info');
+ console.log('%s - %s - %s', meta.title, meta.link, meta.xmlurl);
+ console.log('Articles');
+ articles.forEach(function (article){
+ console.log('%s - %s (%s)', article.date, article.title, article.link);
+ });
+ }
+}
+
+feedparser.parseFile('./feed', callback);
- function myCallback (error, meta, articles){
- if (error) console.error(error);
- else {
- console.log('Feed info');
- console.log('%s - %s - %s', meta.title, meta.link, meta.xmlUrl);
- console.log('Articles');
- articles.forEach(function (article){
- console.log('%s - %s (%s)', article.date, article.title, article.link);
- });
- }
- }
-
- parser.parseFile('./feed', myCallback);
-
- // To use the stream interface with a callback, you *MUST* use parseStream(), not piping
- parser.parseStream(fs.createReadStream('./feed'), myCallback);
+// To use the stream interface with a callback, you *MUST* use parseStream(), not piping
+feedparser.parseStream(fs.createReadStream('./feed'), callback);
```
## What is the parsed output produced by feedparser?
@@ -220,7 +262,7 @@ the original inspiration and a starting point.
(The MIT License)
-Copyright (c) 2011 Dan MacTough &lt;danmactough@gmail.com&gt;
+Copyright (c) 2011-2012 Dan MacTough &lt;danmactough@gmail.com&gt;
Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the 'Software'), to deal in