JSON output #716
Comments
@MaxLanar thanks for the Read through the links, and see you caught the attention of @balthisar... Interesting use of the Hmmm, Take for example
What would that look like in Or perhaps more importantly what would be the
@geoffmcl, I encouraged them to file this feature request after a question to the W3C mailing list. If you take a look at the XML output routines for documentation, this would look a lot like that, except dump JSON instead of XML, using the filter callback. In English/French/whatever, it would return all of the data from the I'm actually surprised no one has asked for this facility before, because it almost eliminates the need for interfacing to the C library from non-C languages, as almost all of these scripting languages support some type of It would probably look something like:
This could be extended to include the actual document output, although I recommend still using STDOUT or a file for that (escaping a huge HTML document properly for JSON isn't pretty), to include an array for "configuration", etc. Actually, separately, I might enter a feature request that allows tidycfg files to be written in JSON in the future, too. This potentially lessens the burden on many, many tools that work with the console application instead of using LibTidy directly.
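To illustrate the escaping concern mentioned above: embedding the document output in JSON means every quote, backslash, and line break in the HTML has to be escaped. A JSON library does this automatically; the following is a minimal Python sketch of that round trip, not anything tidy itself does. The key name `output` is hypothetical and used only for illustration.

```python
import json

# A small HTML fragment containing characters that must be escaped
# inside a JSON string: double quotes, a backslash, CR and LF.
html = '<p class="note">hello & bye</p>\r\n<!-- C:\\temp -->\n'

# "output" is a hypothetical key name, used only for illustration.
encoded = json.dumps({"output": html})

# Decoding gives back the original document text unchanged.
decoded = json.loads(encoded)["output"]
```

For a large document the encoded string grows and becomes hard to read, which is one reason to keep the document itself on STDOUT or in a file, as suggested above.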
@balthisar thanks for the feedback, especially the Quite some time ago I experimented with a tidy-json app, rendering the Tidy DOM-like HTML tree as
{
"in_file" : "F:\\Projects\\tidy-test\\test\\input5\\in_704.html",
"out_file" : "temp.json",
"name" : "#Root",
"content" : [
{
"name" : "#DOCTYPE",
"attributes" : [
{
"name" : "PUBLIC"
}
],
"name" : "html",
"content" : [
{
"name" : "head",
"content" : [
{
"name" : "meta",
"attributes" : [
{
"name" : "name",
"value" : "generator"
},
{
"name" : "content",
"value" : "HTML Tidy for HTML5 for Windows version 5.7.3"
}
],
"name" : "title"
}
],
"name" : "body",
"content" : [
{
"name" : "#Text",
"value" : "hello & bye\r\n"
}
]
}
]
}
]
}
It was not too difficult to do the same for the messages, using the
{
"filename": "F:\\Projects\\tidy-test\\test\\input5\\in_704.html",
"messages": [
"message": {
"messageLine": 1,
"messageColumn": 7,
"messageLevel": 351,
"messageIsMuted": true,
"messageDefault": "missing <!DOCTYPE> declaration",
"message": "dclaration <!DOCTYPE> manquante"
},
"message": {
"messageLine": 1,
"messageColumn": 7,
"messageLevel": 351,
"messageIsMuted": true,
"messageDefault": "texte brut isn't allowed in <head> elements",
"message": "texte brut n'est pas permis dans les lments <head>"
},
"message": {
"messageLine": 1,
"messageColumn": 7,
"messageLevel": 350,
"messageIsMuted": true,
"messageDefault": "<head> previously mentioned",
"message": "<head> prcdemment mentionns"
},
"message": {
"messageLine": 1,
"messageColumn": 7,
"messageLevel": 351,
"messageIsMuted": true,
"messageDefault": "inserting implicit <body>",
"message": "insertion implicite de <body>"
},
"message": {
"messageLine": 1,
"messageColumn": 7,
"messageLevel": 351,
"messageIsMuted": true,
"messageDefault": "inserting missing 'title' element",
"message": "ajout d'un lment 'title' manquant"
},
"message": {
"messageLine": 0,
"messageColumn": 0,
"messageLevel": 350,
"messageIsMuted": true,
"messageDefault": "Document content looks like HTML5",
"message": "Le contenu du document ressemble HTML5"
},
"message": {
"messageLine": 0,
"messageColumn": 0,
"messageLevel": 357,
"messageIsMuted": true,
"messageDefault": "Tidy found 4 avertissements and 0 erreur!\n",
"message": "Tidy a trouv 4 avertissements et 0 erreur!\n"
}
]
}
First, it does not pass some JSON-checking software I have... it bombs at about line 3... I cannot yet see the problem, and would appreciate someone pointing out the missing
Have not had a chance to look at 3. and understand why... but maybe this is the same as a previous bug where an output buffer was used as part of the input... especially given that some do seem a mixture of languages... so it has maybe been fixed by a PR not yet merged... not sure... Given that they can all be addressed, this brings up the possibility of why is this not done in such a separate, Anyway, out of time today, but look forward to some interesting feedback... thanks...
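One likely reason a checker rejects the message output above so early: a JSON array may only contain values, but the sample places `"message": { ... }` name/value pairs directly inside `"messages": [`. The Python sketch below uses a trimmed copy of that shape to show the parse failure, alongside one possible valid shape (an array of plain objects). The corrected layout is an assumption for illustration, not necessarily what the tidy-json app ended up emitting.

```python
import json

# Trimmed copy of the shape shown above: name/value pairs placed
# directly inside an array, which JSON does not allow.
broken = '''
{
  "filename": "in_704.html",
  "messages": [
    "message": { "messageLine": 1, "messageColumn": 7, "messageLevel": 351 },
    "message": { "messageLine": 0, "messageColumn": 0, "messageLevel": 350 }
  ]
}
'''

# One possible valid shape (an assumption, not the app's confirmed format):
# the array holds anonymous objects.
fixed = '''
{
  "filename": "in_704.html",
  "messages": [
    { "messageLine": 1, "messageColumn": 7, "messageLevel": 351 },
    { "messageLine": 0, "messageColumn": 0, "messageLevel": 350 }
  ]
}
'''

def first_error(text):
    """Return (line, column) of the first JSON syntax error, or None if valid."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return (err.lineno, err.colno)
    return None

broken_error = first_error(broken)   # a position on the first "message": line
fixed_error = first_error(fixed)     # None, because the text parses
messages = json.loads(fixed)["messages"]
```

With the array form, a consumer can simply iterate over `messages`; the repeated `"message"` key is no longer needed to tell entries apart.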
It looks correct on the surface, and everything that should be escaped looks escaped. I agree, there's a pointer screwed up somewhere; it works okay in LibTidy; you're not trying to hold anything after the callback returns, are you? Every is deallocated after returning, so maybe that's it. Why does the French look like it's missing a lot of letters? The sample is also exposing the Hmmm, I'll have to look into the I suppose the advantage of putting it into tidy.c is that everything is in tidy.c, including all of the documentation generation stuff, and everyone has tidy.c by default, without having to fuss with installer options, cmake options, etc. It's just another alternative output format, and there's not really a disadvantage. I'll dig into it more, too, when I have some time. As you know, I've been swamped.
@balthisar Of course I am extracting all the items into a And now see the I converted the Will try to attach the file output, since I am always unsure what There are several points about making this a separate distributed app -
There seems no need to have an ever-growing single I am still having a problem with the output passing my JSON-checking software. It fails within the first 3 or 4 lines... Any help on that appreciated... And now I have added the But whatever is eventually decided, I am still seeing mixed-language messages... Need to get to the bottom of this... ideas welcome... thanks...
Seem to have solved the valid And have opened issue #719 to address the bug in the This is the current tidy-json app... it would be appreciated if others could clone, build, and test this... Look forward to further feedback... thanks...
OK, my tidy-json app is getting stable and mature... Have now added everything available from the One of the last things is a sort of And as noted in issue #719, have solved one of the problems, the One interesting thing about the output of the Look forward to others building and testing this, and further feedback... thanks...
Hello,
Flycheck, an on-the-fly syntax checking tool for GNU Emacs ( http://www.flycheck.org ), has an issue with tidy's output in localized environments. It cannot run properly except in an English setup, because localized error messages also localize the diagnostic type (e.g. "Warning" becomes "Avertissement" in French), which makes them unparseable by Flycheck.
It was worked out in Flycheck's issue 1376 that the best way to resolve this, both for Flycheck and tidy, would be for the tidy command-line client to be able to produce JSON output.
So here is the feature request: please add JSON output to the tidy command-line client.
Thanks!
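As a rough sketch of what this request would enable, a tool such as Flycheck could classify diagnostics by a numeric level instead of parsing a localized label. The Python example below is only an illustration under stated assumptions: the field names are borrowed from the tidy-json sample in this thread, `messages` is assumed to be an array of objects, and the level mapping (351 for warnings, 350 for informational notes) is inferred from that sample rather than taken from documentation.

```python
import json

# Hypothetical tidy JSON output: field names follow the sample in this
# thread, with "messages" as an array of objects (an assumed, valid shape).
report = json.loads('''
{
  "messages": [
    {"messageLine": 1, "messageColumn": 7, "messageLevel": 351,
     "message": "insertion implicite de <body>"},
    {"messageLine": 1, "messageColumn": 7, "messageLevel": 350,
     "message": "<head> previously mentioned"},
    {"messageLine": 0, "messageColumn": 0, "messageLevel": 357,
     "message": "summary text"}
  ]
}
''')

# Assumed mapping, inferred from the sample: 351 appears on warnings and
# 350 on informational notes. Any other level is reported as "other".
LEVEL_NAMES = {350: "info", 351: "warning"}

def diagnostics(report):
    """Yield (line, column, severity, text) without parsing localized labels."""
    for msg in report["messages"]:
        severity = LEVEL_NAMES.get(msg["messageLevel"], "other")
        yield (msg["messageLine"], msg["messageColumn"], severity, msg["message"])

results = list(diagnostics(report))
warnings = [d for d in results if d[2] == "warning"]
```

Because the severity comes from the number, the same code works whether the message text is in English, French, or any other language.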