Verify that the patch files used by git match the grammar #4

Open
jonm opened this Issue Jul 7, 2012 · 13 comments

Owner

jonm commented Jul 7, 2012

No description provided.

I plan on working on this now. Gonna whip up a little parser with the grammar so we can do this automatically, I think.

Owner

jonm commented Jul 7, 2012

Great! Note that the IETF provides a utility that I think compiles ABNF grammars into a C program somehow. See http://tools.ietf.org/ and specifically http://vinegen.com/metabbs/metabbs.php/board/Download in case that's useful.

Owner

jonm commented Jul 7, 2012

And yes, I was thinking having some automated tests would be neat. We might be able to get the ABNF out into a separate file and include it (have to check the XML format to see how includes are done) which might facilitate grabbing it to run tests against.

Neat! I was thinking about using kpeg because I've been looking for an excuse, but maybe getting used to the IETF's tools would be good, too.

Owner

jonm commented Jul 7, 2012

Whatever scratches your itch. :)

On Sat, Jul 7, 2012 at 5:25 PM, Steve Klabnik wrote:

Neat! I was thinking about using kpeg
because I've been looking for an excuse, but maybe getting used to the
IETF's tools would be good, too.


Jon Moore

I got partway there with kpeg, but since it uses a different form than ABNF, I'm exploring APG instead in the hopes that we can do something like the XML include idea.

I had to do this to get APG to compile on my MacBook Air.

So.

I managed to get APG to correctly generate a grammar.c file. But now I need to figure out how to actually use the APG C API to use that generated file.

I'm going to give that a go tomorrow. I can push it up to my branch if anyone else wants to work on this or if I fail to make any headway.

prathe commented Jul 8, 2012

My motivation is that last week I managed to parse RFC 3986 and RFC 5988 with lpeg (Lua PEG), which was written by the creator of Lua.

You can use the ABNF definition with a few adjustments to fit the lpeg syntax. As an example, an lpeg grammar can look pretty close to the RFC 5986 ABNF.

But I tend to prefer an alternative lpeg syntax that gives you much more power, such as reusing basic grammar patterns. It is more verbose, but I needed it when I parsed RFC 5988, which uses RFC 3986 for URIs. I didn't want to repeat the whole URI grammar inside the Web Linking grammar.
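
To illustrate the reuse idea, here is a minimal Python sketch (not lpeg, and not the code described above). It defines the RFC 3986 `scheme` rule once, built from the ABNF core rules, and then reuses it inside a much-simplified stand-in for a Web Linking target. The names `SCHEME`, `LINK_TARGET`, and `link_scheme` are made up for this example, and the `<...>` rule is deliberately far looser than the real RFC 5988 grammar.

```python
import re

# ABNF core rules (RFC 5234), written as regex fragments.
ALPHA = r"[A-Za-z]"
DIGIT = r"[0-9]"

# RFC 3986: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
SCHEME = rf"{ALPHA}(?:{ALPHA}|{DIGIT}|[+.-])*"

# Hypothetical, heavily simplified link target: "<" scheme ":" rest ">".
# The point is only that SCHEME is reused here, not redefined.
LINK_TARGET = re.compile(rf"<(?P<scheme>{SCHEME}):(?P<rest>[^>]*)>")


def link_scheme(text):
    """Return the URI scheme of a <...> link target, or None if it does not match."""
    match = LINK_TARGET.fullmatch(text)
    return match.group("scheme") if match else None


print(link_scheme("<http://example.org/a.diff>"))
print(link_scheme("<1http://example.org>"))
```

The second call does not match because a scheme has to start with a letter. A PEG library would compose named rules rather than strings, but the structure is the same: the URI rules live in one place and the outer grammar refers to them.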

It sounds interesting to use the tool proposed above, which won't require any changes to the ABNF. But what about building a toolkit in Lua that provides a base for current and future ABNF parsers? That is what I wanted to do. I was keeping it private because its first purpose was to drive my own project, and I wanted to improve it before making it a Lua module.

Owner

jonm commented Jul 11, 2012

@philippe: Very cool! I think at this point we just have proposals and no artifacts (not sure how far Steve K got), and it will be important to have some executable tests. I'm fine with using Lua for this if you like. I'm probably going to spend more of my time on the writing and bouncing around the mailing lists, so I think you can feel free to contribute how you prefer.

Yeah, I didn't get any farther with my C implementation. Don't worry about 'duplicating' my work or anything, if you want to take a crack at it, please do!

prathe commented Jul 12, 2012

The advantage of using the ABNF syntax as-is in the code, or a close hybrid, is that it is convenient for iterating while implementing the grammar. A few tests can be written to ensure that the grammar as a whole seems to work.

But this approach does not allow inspecting what was matched and where. I think it is important to test the relevant captures as a second step, rather than only having an "it matches" or "it doesn't match" kind of test, which is more useful in the first step.
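
As a rough sketch of the two kinds of test, the Python below uses a regular expression as a stand-in for the real parser. The pattern covers only a unified-diff hunk header such as `@@ -1,3 +1,4 @@`. It is an assumption for illustration, not the grammar under discussion, and `HUNK_HEADER` and `parse_hunk_header` are hypothetical names.

```python
import re

# Stand-in "grammar" for a unified-diff hunk header, with named captures.
HUNK_HEADER = re.compile(
    r"@@ -(?P<old_start>\d+)(?:,(?P<old_count>\d+))? "
    r"\+(?P<new_start>\d+)(?:,(?P<new_count>\d+))? @@"
)


def parse_hunk_header(line):
    """Return the captured line numbers as a dict, or None if the line does not match."""
    match = HUNK_HEADER.match(line)
    if match is None:
        return None
    # In unified diff output an omitted count means a count of 1.
    return {
        name: int(value) if value is not None else 1
        for name, value in match.groupdict().items()
    }


# First step: does it match at all?
assert parse_hunk_header("@@ -1,3 +1,4 @@") is not None
assert parse_hunk_header("not a hunk header") is None

# Second step: what was captured, and is it the right value?
assert parse_hunk_header("@@ -10,2 +12 @@") == {
    "old_start": 10,
    "old_count": 2,
    "new_start": 12,
    "new_count": 1,
}
```

The first pair of assertions would pass for any pattern that accepts the right strings. Only the last one catches a grammar that matches but assigns the numbers to the wrong fields.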

I am more willing to help with the second part, the capture test suites, once the grammar is established.

I am also ready to participate, and I am interested in the scope of the specification and the possibility of making it a hypermedia type. While diff output has historically always referred to file paths, I wonder if the spec could allow file paths, relative URI references, or URIs, and define how an implementation should process the diff document in each case. That could make it a more useful media type, a hypermedia one, without preventing the well-known historical use of diffs with this media type. It would be amazing if a resource could be represented as a diff of another resource. This sounds natural to me and brings new power to text/diff and new possibilities in application design.

It is only a supposition that this could work; we have to dig in that direction to find out. But when I think about the file paths in diff output, they are only relevant with out-of-band information about where they should be applied (in which folder). Yes, media type specifications are not required to be useful for REST architectures, and this one in particular is an attempt at a formal definition, but I think it is worth trying to take it one step further and make it a hypermedia type.

What do you think?

By the way if you prefer to make a new issue instead of having this conversation here, send me the new link and I'll repost my comment there.

prathe commented Jul 12, 2012

Just thinking aloud...

To differentiate the non-hypermedia version from the hypermedia one, a media type parameter could specify it. The parameter would indicate whether the paths are URI based or file based, and its default value should indicate the non-hypermedia (file based) version.
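
A minimal sketch of how a consumer might read such a parameter, using Python's standard `email.message` module to parse the header. The parameter name `paths` and its values `file` and `uri` are purely hypothetical, since nothing in this thread defines them, and `path_mode` is a made-up helper.

```python
from email.message import Message


def path_mode(content_type):
    """Return 'file' or 'uri' for a text/diff Content-Type header value.

    The 'paths' parameter is hypothetical. When it is absent, the
    historical file-based interpretation is assumed.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    if msg.get_content_type() != "text/diff":
        raise ValueError("not a text/diff media type: " + content_type)
    return msg.get_param("paths", failobj="file")


print(path_mode("text/diff"))
print(path_mode("text/diff; paths=uri"))
```

Defaulting to `file` keeps existing producers and consumers working unchanged, which is the point of making the file-based form the default.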

Owner

jonm commented Jul 13, 2012

There are really two main forms of diff. The first, output by the 'diff' utility, compares two different representations and lists them with respect to a certain context. This is commonly seen as relative or absolute filenames.

The second form is seen in version control systems, such as output from 'svn diff' or 'git diff', etc. This lists a single file (again, relative to some context) on an "Index" line and compares different versions of the same file.

In either case, I think this generalizes pretty simply from a filename to a URI, especially since filenames can be taken as relative URIs from the original context. So I think this is a simple extension that's worth pursuing.
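
A quick sketch of that generalization, using `urljoin` from Python's standard library. The base URI and paths are made-up examples, and `resolve_diff_path` is a hypothetical helper name.

```python
from urllib.parse import urljoin


def resolve_diff_path(context, path):
    """Resolve a path from a diff header as a URI reference against a context URI."""
    return urljoin(context, path)


# A plain relative filename resolves against the context.
print(resolve_diff_path("http://example.org/repo/", "src/main.c"))

# Dot segments behave like they do on a filesystem.
print(resolve_diff_path("http://example.org/repo/docs/", "../README"))

# A full URI in the diff is left as it is.
print(resolve_diff_path("http://example.org/repo/", "http://other.example/x"))
```

The context needs its trailing slash. Without it, the last path segment of the context is replaced rather than extended, which matches how relative references resolve in RFC 3986.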

I'll try to extend the ABNF grammar to address this, but I will open another issue to track it. I think let's keep this issue for discussing 'git diff' specifically. I'll also open an issue for adding some tests.
