
Inist-CNRS/get-doctype


get-doctype

JavaScript module for parsing the doctype of an XML document. Useful for getting the name, pubid and sysid without parsing the whole XML document. Makes use of an existing module:

Usage

From command line

Reading a file from its path:

./parse-doctype test/dataset/public.xml 

The output looks like this:

{ type: 'PUBLIC',
  name: 'TEI.2',
  pubid: '-//TEI P4//DTD Main DTD Driver File//EN',
  sysid: 'http://www.tei-c.org/Lite/DTD/teixlite.dtd' }
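The object above carries enough information to rebuild the original declaration. As a minimal sketch of consuming it, the snippet below copies the sample output into a literal and formats it back into a `<!DOCTYPE ...>` line. `describeDoctype` is a hypothetical helper written for this illustration; it is not part of get-doctype.

```javascript
// Sample object copied from the command-line output shown above.
var doctype = {
  type: 'PUBLIC',
  name: 'TEI.2',
  pubid: '-//TEI P4//DTD Main DTD Driver File//EN',
  sysid: 'http://www.tei-c.org/Lite/DTD/teixlite.dtd'
};

// Hypothetical helper (NOT part of get-doctype): rebuilds a doctype
// declaration from the fields of the parsed object.
function describeDoctype(d) {
  var parts = [d.name, d.type];
  // pubid and sysid are quoted only when they are present.
  if (d.pubid) { parts.push('"' + d.pubid + '"'); }
  if (d.sysid) { parts.push('"' + d.sysid + '"'); }
  return '<!DOCTYPE ' + parts.join(' ') + '>';
}

console.log(describeDoctype(doctype));
```

The helper assumes that `type` is always set; an object without it would need extra handling.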

Reading from stdin:

cat test/dataset/public.xml | ./parse-doctype

If no doctype is found, or if an error occurs while parsing the file, an error is thrown:

./parse-doctype test/dataset/no-doctype.xml
[Error: No doctype found]

./parse-doctype test/dataset/parsing-problem.xml 
[Error: No doctype found, Sax-Error: Unexpected end
Line: 0
Column: 180
Char: ]

From a JavaScript / Node.js program

Reading a file from its path:

var getDoctype = require("get-doctype");
var xmlFile = "test/dataset/public.xml";
getDoctype.parseFile(xmlFile, function(doctype) {
  // Do what you want with the doctype object
});

Reading from stdin:

getDoctype.parseStdin(function (doctype) {
  // Do what you want with the doctype object
});

To parse a string containing the XML:

var xmlString =
    '<?xml version="1.0"?>'
  + '<!DOCTYPE greeting SYSTEM "hello.dtd">'
  + '<greeting>Hello, world!</greeting>';
getDoctype.parseString(xmlString, function(doctype) {
  // Do what you want with the doctype object
});
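
To show why the doctype can be read without parsing the whole document, here is a stand-alone sketch that pulls the declaration out of a string with a regular expression. This is not how get-doctype works internally (its error messages suggest a SAX parser), and `sketchParseDoctype` is a hypothetical name. The field names mirror the sample output shown earlier; the module's actual output for a SYSTEM doctype is assumed to look similar but may differ. The sketch handles only simple declarations without an internal subset.

```javascript
// Hypothetical stand-alone helper, NOT part of get-doctype.
// Matches <!DOCTYPE name PUBLIC "pubid" "sysid"> or <!DOCTYPE name SYSTEM "sysid">.
function sketchParseDoctype(xml) {
  var re = /<!DOCTYPE\s+([^\s>\[]+)(?:\s+(PUBLIC)\s+(["'])(.*?)\3\s+(["'])(.*?)\5|\s+(SYSTEM)\s+(["'])(.*?)\8)?\s*>/;
  var m = re.exec(xml);
  if (!m) {
    // Same message as the command-line tool prints for a missing doctype.
    throw new Error("No doctype found");
  }
  if (m[2]) {
    return { type: "PUBLIC", name: m[1], pubid: m[4], sysid: m[6] };
  }
  if (m[7]) {
    return { type: "SYSTEM", name: m[1], sysid: m[9] };
  }
  // Declaration with a name only (no external identifier).
  return { name: m[1] };
}

var xmlString =
    '<?xml version="1.0"?>'
  + '<!DOCTYPE greeting SYSTEM "hello.dtd">'
  + '<greeting>Hello, world!</greeting>';

console.log(sketchParseDoctype(xmlString));
```

For the string above, the sketch returns an object with type SYSTEM, name greeting and sysid hello.dtd. A regular expression like this is brittle compared with a real parser, so it is only meant to illustrate the shape of the result.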