Example parsers #14

Open
Geal opened this Issue Feb 26, 2015 · 75 comments

@Geal
Owner

Geal commented Feb 26, 2015

We currently have a few example parsers. In order to test the project and make it useful, other formats can be implemented. Here is a list, if anyone wants to try it:

@thehydroimpulse

thehydroimpulse commented Apr 3, 2015

I'm writing a Thrift library for Rust that'll use Nom for both their IDL and the network protocol, so that can be another example (although in a different repo).

@Geal

Owner

Geal commented Apr 3, 2015

Nice idea, that will be useful! Please notify me when it is done, I will add a link in this list.

@filipegoncalves

Contributor

filipegoncalves commented Apr 27, 2015

This looks interesting. Is anyone actively working on any of these parsers? I'd like to work on a few of these.

@Geal

Owner

Geal commented Apr 27, 2015

I have some code for a GIF one at https://github.com/Geal/gif.rs but it is hard to test, since the graphical tools in Piston change a lot.

You can pick any of them. Network packets may be the easiest, since they don't require a decompression phase.

I am using the gif example to see what kind of API can be built over nom. Most of the parsing examples run as a single pass over the data, but there is often some logic on the side that is not easy to encode correctly.
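
To illustrate why network packets are a gentle starting point: a fixed-size header can be read in one pass, with no decompression step. The following is a minimal sketch in plain Rust, not nom's API. It assumes the standard 8-byte UDP header with four big-endian 16-bit fields; `UdpHeader`, `be_u16`, and `parse_udp_header` are hypothetical names chosen for illustration.

```rust
/// The four 16-bit fields of a UDP header, in wire order.
#[derive(Debug, PartialEq)]
pub struct UdpHeader {
    pub source_port: u16,
    pub dest_port: u16,
    pub length: u16,
    pub checksum: u16,
}

/// Reads one big-endian u16 and returns it with the remaining input.
/// (Hypothetical helper, not a nom function.)
fn be_u16(input: &[u8]) -> Option<(u16, &[u8])> {
    if input.len() < 2 {
        return None;
    }
    Some((u16::from_be_bytes([input[0], input[1]]), &input[2..]))
}

/// Parses the header and returns it together with the unparsed payload.
/// Returns None when fewer than 8 bytes are available.
pub fn parse_udp_header(input: &[u8]) -> Option<(UdpHeader, &[u8])> {
    let (source_port, rest) = be_u16(input)?;
    let (dest_port, rest) = be_u16(rest)?;
    let (length, rest) = be_u16(rest)?;
    let (checksum, rest) = be_u16(rest)?;
    Some((UdpHeader { source_port, dest_port, length, checksum }, rest))
}
```

Each step hands the remaining input to the next one, which is the same shape a combinator library automates.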

@elij

elij commented May 1, 2015

@Geal

Owner

Geal commented May 5, 2015

@elij this is a great idea! Was it easy to do?

@elij

elij commented May 5, 2015

yup it's a great framework -- though I struggled a bit with eof so I borrowed some code from rust-config (https://github.com/elij/fastq.rs/blob/master/src/parser.rs#L69) -- is there a better solution?

@Geal

Owner

Geal commented May 5, 2015

yes, eof should be a parser provided by nom, I am just waiting for @filipegoncalves to send a PR 😉
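
The idea behind such an eof parser is small: it succeeds, consuming nothing, only when no input remains. Here is a minimal sketch in plain Rust. It is not nom's implementation, and `ParseResult`, `eof`, `tag`, and `exactly` are hypothetical names used only for this illustration.

```rust
/// Result of a parser: the remaining input plus the produced value,
/// or an error message. (Hypothetical type, not nom's IResult.)
pub type ParseResult<'a, T> = Result<(&'a [u8], T), String>;

/// Succeeds, consuming nothing, only when the input is empty.
pub fn eof(input: &[u8]) -> ParseResult<'_, ()> {
    if input.is_empty() {
        Ok((input, ()))
    } else {
        Err(format!("expected end of input, found {} trailing byte(s)", input.len()))
    }
}

/// Matches a fixed byte sequence at the start of the input.
pub fn tag<'a>(expected: &[u8], input: &'a [u8]) -> ParseResult<'a, &'a [u8]> {
    if input.starts_with(expected) {
        Ok((&input[expected.len()..], &input[..expected.len()]))
    } else {
        Err(format!("expected {:?}", expected))
    }
}

/// Accepts the input only if it is exactly `expected`, with nothing after it.
pub fn exactly<'a>(expected: &[u8], input: &'a [u8]) -> ParseResult<'a, &'a [u8]> {
    let (rest, matched) = tag(expected, input)?;
    let (rest, ()) = eof(rest)?;
    Ok((rest, matched))
}
```

Sequencing `eof` after the last real parser is what turns "parsed a prefix" into "parsed the whole input".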

@filipegoncalves

Contributor

filipegoncalves commented May 5, 2015

Hah, sorry for my silence. I've been busy lately. I just sent a PR (#31).

I will be working on one of these example parsers as soon as I get some spare time. There are some great ideas in here!

@Keruspe

Contributor

Keruspe commented May 29, 2015

I might give tar a try

@nelsonjchen

Contributor

nelsonjchen commented Jun 19, 2015

Does this check off PCAP?

https://github.com/richo/pcapng-rs

@Geal

Owner

Geal commented Jun 19, 2015

pcap-ng and pcap are two different formats, right? It seems the consensus now is to move everything to pcap-ng, though.
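
For reference, the two formats can be told apart from the first four bytes of a file. The sketch below is plain Rust with hypothetical names (`CaptureFormat`, `detect_capture_format`). It assumes the classic pcap magic number 0xa1b2c3d4 (0xa1b23c4d for nanosecond timestamps), written in the byte order of the machine that produced the file, and the pcap-ng Section Header Block type 0x0a0d0d0a, which reads the same in both byte orders.

```rust
#[derive(Debug, PartialEq)]
pub enum CaptureFormat {
    /// Classic pcap, with the byte order the file was written in.
    Pcap { big_endian: bool, nanosecond: bool },
    /// pcap-ng, recognised by its Section Header Block type.
    PcapNg,
    /// Too short, or none of the known magic values.
    Unknown,
}

/// Inspects the first four bytes of a capture file.
pub fn detect_capture_format(input: &[u8]) -> CaptureFormat {
    if input.len() < 4 {
        return CaptureFormat::Unknown;
    }
    // Read the bytes as big-endian; a byte-swapped magic means little-endian.
    let magic = u32::from_be_bytes([input[0], input[1], input[2], input[3]]);
    match magic {
        0xa1b2_c3d4 => CaptureFormat::Pcap { big_endian: true, nanosecond: false },
        0xd4c3_b2a1 => CaptureFormat::Pcap { big_endian: false, nanosecond: false },
        0xa1b2_3c4d => CaptureFormat::Pcap { big_endian: true, nanosecond: true },
        0x4d3c_b2a1 => CaptureFormat::Pcap { big_endian: false, nanosecond: true },
        0x0a0d_0d0a => CaptureFormat::PcapNg,
        _ => CaptureFormat::Unknown,
    }
}
```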

@TechnoMancer

TechnoMancer commented Jul 17, 2015

I will try a FLAC parser, need to add quite a few things for it though.

@badboy

Contributor

badboy commented Jul 17, 2015

ISO8601 is done in https://github.com/badboy/iso8601 (I hope it's mostly correct.)

@Geal

Owner

Geal commented Jul 17, 2015

ok, it should be up to date. More to come 😄

@sbeckeriv

sbeckeriv commented Aug 23, 2015

WARC file format released. https://crates.io/crates/warc_parser

@Geal

Owner

Geal commented Aug 24, 2015

@sbeckeriv great, thanks!

@porglezomp

porglezomp commented Sep 14, 2015

It might be informative to try parsing the Rust grammar with nom, if nobody has yet. In any case, I'd like to see a few programming languages on that list, since that's my use case.

@Geal

Owner

Geal commented Sep 15, 2015

@porglezomp programming languages examples would definitely be useful, but the Rust grammar might be a bit too much for the first attempt. Which other languages would you like to handle?

@porglezomp

porglezomp commented Sep 15, 2015

Yeah, I'm aware of the scale problem of Rust. I don't want to write that one, but I think it's a good holy grail for any parser library written in Rust. I'd like to try parsing the Lua grammar first, I think.

I recommend adding to the list:

  • Programming Languages
    • Rust
    • Lua (I'll do this)
    • Python (or some other whitespace significant language)
    • C
@Geal

Owner

Geal commented Sep 15, 2015

ok, I added them to the list :)

@chriskrycho

chriskrycho commented Nov 16, 2015

You have INI marked as done; do you have a link to it? (I'd love to use this for some tooling I'm hoping to build in 2016; need a good non-trivial example for it, though.)

@badboy

This comment has been minimized.

@chriskrycho

chriskrycho commented Nov 16, 2015

Thanks very much, @badboy!

@fbernier

fbernier commented Nov 16, 2015

I'll try to make the TOML parser very soon.

@Geal

Owner

Geal commented Nov 16, 2015

Actually, I think I should rewrite that INI parser, now that more convenient combinators are available.
Also, I should really work on that combinator for space separated stuff

@Geal

Owner

Geal commented Nov 16, 2015

@fbernier great! Please keep me posted!

@l0calh05t

l0calh05t commented Nov 16, 2015

Maybe add a simple example for trailing commas in lists? Python has those, but is quite complex. Can't think of a simple example though.
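
One candidate for a simple example is a bracketed list of integers that tolerates a single trailing comma, such as `[1, 2, 3,]`. Below is a minimal hand-written sketch in plain Rust. It does not use nom combinators, and `parse_int_list` is a hypothetical name.

```rust
/// Parses a list like `[1, 2, 3]` or `[1, 2, 3,]` into a vector of integers.
/// One trailing comma is allowed; `[,]` and `[1,,2]` are rejected.
pub fn parse_int_list(input: &str) -> Result<Vec<i64>, String> {
    let inner = input
        .trim()
        .strip_prefix('[')
        .and_then(|s| s.strip_suffix(']'))
        .ok_or_else(|| "expected a list enclosed in [ ]".to_string())?;

    let mut items = Vec::new();
    let mut rest = inner.trim();
    while !rest.is_empty() {
        // Each element runs up to the next comma, or to the end of the list.
        let (element, after) = match rest.find(',') {
            Some(pos) => (&rest[..pos], rest[pos + 1..].trim_start()),
            None => (rest, ""),
        };
        let value = element
            .trim()
            .parse::<i64>()
            .map_err(|e| format!("invalid element {:?}: {}", element.trim(), e))?;
        items.push(value);
        // If nothing but whitespace followed the comma, it was a trailing
        // comma and the loop ends here.
        rest = after;
    }
    Ok(items)
}
```

The trailing comma needs no special case: an element is always required before a comma, but nothing is required after one.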

@johshoff

johshoff commented Nov 17, 2015

That IRC example is no longer using nom. The parser was moved into its own repository: https://github.com/Detegr/RBot-parser

@dtolnay

dtolnay commented Oct 31, 2016

> Yeah, I'm aware of the scale problem of Rust. I don't want to write that one, but I think it's a good holy grail for any parser library written in Rust.

As of version 0.10.0, syn is now able to parse practically all of Rust syntax. One of my test cases is to parse the entire github.com/rust-lang/rust repo into an AST and print it back out, asserting that the output is identical to the original.

I am technically not using nom but instead a fork which removes the IResult::Incomplete variant. I found that the extra macro code generated to handle Incomplete was more than doubling the compile time for something that I didn't even want. Nevertheless, the code is enough like nom that I think we can check off the box.

Example snippet to parse one arm of a match expression:

named!(match_arm -> Arm, do_parse!(
    attrs: many0!(outer_attr) >>
    pats: separated_nonempty_list!(punct!("|"), pat) >>
    guard: option!(preceded!(keyword!("if"), expr)) >>
    punct!("=>") >>
    body: alt!(
        map!(block, |blk| ExprKind::Block(BlockCheckMode::Default, blk).into())
        |
        expr
    ) >>
    (Arm {
        attrs: attrs,
        pats: pats,
        guard: guard.map(Box::new),
        body: Box::new(body),
    })
));
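
For context on the `Incomplete` variant mentioned above, here is a rough sketch in plain Rust of the difference between a three-way streaming result and a two-way result for complete input. It is only loosely modelled on the shape of nom's result type at the time. `StreamResult`, `CompleteResult`, `Needed`, and both `be_u16_*` functions are hypothetical names, and here `Needed::Size` is assumed to mean the number of additional bytes required.

```rust
/// How much more input a streaming parser needs (additional bytes, by assumption).
#[derive(Debug, PartialEq)]
pub enum Needed {
    Unknown,
    Size(usize),
}

/// Three-way result for streaming input.
#[derive(Debug, PartialEq)]
pub enum StreamResult<'a, T> {
    Done(&'a [u8], T),
    Error(String),
    Incomplete(Needed),
}

/// Two-way result for complete input: running out of data is just an error.
pub type CompleteResult<'a, T> = Result<(&'a [u8], T), String>;

/// Streaming version: too little input asks the caller for more bytes.
pub fn be_u16_streaming(input: &[u8]) -> StreamResult<'_, u16> {
    if input.len() < 2 {
        return StreamResult::Incomplete(Needed::Size(2 - input.len()));
    }
    StreamResult::Done(&input[2..], u16::from_be_bytes([input[0], input[1]]))
}

/// Complete version: the same situation is reported as a plain error.
pub fn be_u16_complete(input: &[u8]) -> CompleteResult<'_, u16> {
    if input.len() < 2 {
        return Err("unexpected end of input".to_string());
    }
    Ok((&input[2..], u16::from_be_bytes([input[0], input[1]])))
}
```

With the three-way type, every combinator that wraps another parser has to forward the extra case, which is where the additional generated code comes from.
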
@Geal

Owner

Geal commented Nov 1, 2016

@dtolnay syn is an amazing example, thanks for your hard work :)

@Geal

Owner

Geal commented Nov 1, 2016

@dtolnay could I get your input on #356? It might fix your issues with compile times, so I'd like to get your thoughts on this.

@J-F-Liu

J-F-Liu commented Dec 23, 2016

I am writing a PDF library using nom to parse PDF syntax. Released v0.1.0 just now.
https://github.com/J-F-Liu/lopdf

@valarauca

Contributor

valarauca commented Dec 23, 2016

So I've implemented an EDI parser for the ANS standard EDI with this, for work. Awesome library, really useful. Sadly, that code is owned by my employer.

I've started implementing an x64 assembler with nom, and I'm really struggling with writing the parser. The main reason is that register names overlap a lot and are very short, for example r8, r8w, r11, and r12d. Ideally I want to map these to an enum. map!() makes this easy, but how can I match those terms in nom?
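
One way to handle overlapping names like these is to consume the whole identifier first and only then map it to an enum, so that `r8` can never match the front of `r8w`. With an ordered-choice combinator, the equivalent trick is to list longer names before their prefixes. Here is a minimal sketch in plain Rust, not nom's API. The `Register` enum and `parse_register` function are hypothetical and cover only the four names mentioned.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Register {
    R8,
    R8w,
    R11,
    R12d,
}

/// Parses a register name at the start of `input`.
/// Returns the register and the remaining input, or None if no known name matches.
pub fn parse_register(input: &str) -> Option<(Register, &str)> {
    // Take the longest run of ASCII letters and digits as the candidate name.
    let end = input
        .find(|c: char| !c.is_ascii_alphanumeric())
        .unwrap_or(input.len());
    let (name, rest) = input.split_at(end);
    // Match the complete name, so a short name never shadows a longer one.
    let register = match name {
        "r8" => Register::R8,
        "r8w" => Register::R8w,
        "r11" => Register::R11,
        "r12d" => Register::R12d,
        _ => return None,
    };
    Some((register, rest))
}
```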

@Keruspe

Contributor

Keruspe commented Dec 24, 2016

I converted several "keys" to enum values in my brainfuck parser, might or might not be relevant to your needs. See the first parsers defined with "named!" https://github.com/Keruspe/brainfuck.rs/blob/master/src/parser.rs

@przygienda

przygienda commented Mar 9, 2017

Is there a way to generate EBNF from this? It would be great if that were possible. Great package, BTW.

@ithinuel

Contributor

ithinuel commented Apr 14, 2017

Hi,
I just pushed a pcap parser: https://github.com/ithinuel/pcap-rs.
It still needs PR #492 to be merged before it can use the official nom crate.

Any feedback is welcome.

@bbqsrc

bbqsrc commented Apr 19, 2017

A parser for the Mediawiki format would be quite useful.

@dwerner

dwerner commented Jun 3, 2017

@Geal thanks for an awesome library! I used it to write a Wavefront OBJ/MTL 3D mesh parser, nom-obj, which I published to crates.io.

@olivren

olivren commented Aug 10, 2017

I wrote a parser for the simple key/value text format .properties, which is a standard for Java configuration files. It uses nom 3.1. Can it be added to the list?

This is the first parser I wrote using a Parser Combinator library. If anyone can review my code I would be delighted. Also, I tried to add error reporting to my code, but I gave up after I tried to insert add_return_error and return_error calls all over the place to no avail (in the branch "error-reporting"). Is there an example of a text parser that reports parsing errors?

Edit: I rewrote my library using Pest instead of Nom, as I find it more suited to parsing a text format. I will definitely use nom if I need to parse a binary format, though.
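
On the error-reporting question, one approach that does not depend on any parser library is to carry the line number in the error value. The sketch below is plain Rust for a deliberately simplified key/value format. It assumes `=` is the only separator, `#` starts a comment, and there are no escapes or line continuations, all of which the real .properties format handles differently. `ParseError` and `parse_properties` are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
pub struct ParseError {
    /// 1-based line number where the problem was found.
    pub line: usize,
    pub message: String,
}

/// Parses simplified `key=value` lines into (key, value) pairs.
pub fn parse_properties(input: &str) -> Result<Vec<(String, String)>, ParseError> {
    let mut entries = Vec::new();
    for (index, raw) in input.lines().enumerate() {
        let line = raw.trim();
        // Skip blank lines and comments.
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        // A line without a separator is reported with its position.
        let (key, value) = line.split_once('=').ok_or_else(|| ParseError {
            line: index + 1,
            message: format!("expected `key=value`, found {:?}", line),
        })?;
        if key.trim().is_empty() {
            return Err(ParseError {
                line: index + 1,
                message: "empty key".to_string(),
            });
        }
        entries.push((key.trim().to_string(), value.trim().to_string()));
    }
    Ok(entries)
}
```

The caller gets either all entries or the first failure with enough context to print a message such as "line 3: expected `key=value`".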

@santifa

santifa commented Sep 5, 2017

@Geal thanks for this library.
I've implemented a parser for URIs, which is part of a larger side project for RDF (n3, ttl, ...) parsers. The full ABNF of RFC 3986 is implemented, but the pct-encoding is still a bit messy.

@dbrgn

Contributor

dbrgn commented Sep 22, 2017

Here's a parser for ICE candidates SDP (RFC 5245), used for example in WebRTC: https://github.com/dbrgn/candidateparser

@kamarkiewicz

Contributor

kamarkiewicz commented Sep 22, 2017

I wrote a low-level Session Initiation Protocol (RFC 3261) push parser with an API inspired by seanmonstar/httparse (hyper's HTTP parser):
https://github.com/kamarkiewicz/parsip

@thejpster

thejpster commented Jan 11, 2018

I'd be interested in something that could parse SNMP MIB and YANG.

https://en.wikipedia.org/wiki/YANG

@ctrlcctrlv

ctrlcctrlv commented Jun 2, 2018

@Riduidel

Riduidel commented Jun 6, 2018

As a beginner in the Rust world, I'm quite sure I will say something horribly wrong, but is there any planned support for some XML dialects (typically RSS/Atom)?

@dwerner

dwerner commented Jun 6, 2018

Nothing at all wrong with asking, and I'm sure someone might want to implement one at some point, but this is a list of example parsers written using nom, rather than a list of formats "supported" by nom. An xml parser would be an excellent idea for learning nom, imo.

@porglezomp

porglezomp commented Jun 7, 2018

@Riduidel if you're specifically interested in just having parsers for those formats, look at https://github.com/rust-syndication. I don't think there's any nom involved there though.

@vandenoever

vandenoever commented Jun 13, 2018

@ProgVal

Contributor

ProgVal commented Jun 18, 2018

I wrote a Python parser: https://docs.rs/python-parser/

@idursun

idursun commented Jan 4, 2019

I think the Redis database file format parser is not using nom at all. I couldn't find any reference to nom anywhere.

@nelsonjchen

Contributor

nelsonjchen commented Jan 4, 2019

@idursun Maybe it refers to this old branch from a year before the last update to master. https://github.com/badboy/rdb-rs/tree/nom-parser
