Relying on Parsers.xparse behavior when a parse fails #78

Closed
yurivish opened this issue Feb 16, 2021 · 3 comments

yurivish commented Feb 16, 2021

I am working with strings of digits interspersed with letters. I'd like to parse each number up to either the next letter or the end of the string. I can do this with xparse:

julia> Parsers.xparse(Int, "1234Q", 1, 5)
(1234, -32607, 1, 5, 5)

If a parse fails, the first tuple element x holds the result I'm looking for. However, it is not documented whether this behavior (x being correct up to the point where the parse fails) is intentional.

I'm curious – is this a sanctioned use of Parsers.xparse? The docs state that "x is a value of type T, even if parsing does not succeed" but do not say that the answer is guaranteed to be correct.

If it is guaranteed, I'd be happy to submit a docs PR. Thanks for your work on this package!

Member

quinnj commented Feb 22, 2021

Sorry for the slow response. Yeah, that behavior can be relied upon, but note that if you call Parsers.xparse(Int, "1234Q"; delim=nothing), parsing will continue until an invalid character (or the end of the string) and report that parsing succeeded. If you also need to pass a starting byte pos + len, then I think you'll just need to build your own Parsers.Options struct with delim=nothing and pass that to Parsers.xparse. Let me know if you run into trouble; I know a lot of this is pretty undocumented.
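
To illustrate the semantics described here (consume digits until the first invalid character or the end of the range, and still report success), here is a minimal sketch in Go. It is not how Parsers.jl is implemented: `parseIntPrefix`, its parameters, and its return values are hypothetical names chosen for illustration, and the sketch ignores signs and overflow.

```go
package main

import "fmt"

// parseIntPrefix is a hypothetical helper, not part of Parsers.jl or any library.
// It reads decimal digits from buf starting at index pos and stops at the first
// non-digit byte or at index end, whichever comes first. It returns the parsed
// value, the index of the first unconsumed byte, and whether at least one digit
// was read. Signs and overflow are deliberately not handled.
func parseIntPrefix(buf []byte, pos, end int) (value int64, next int, ok bool) {
	i := pos
	for i < end && i < len(buf) {
		// Byte subtraction wraps, so any non-digit byte yields a value above 9.
		d := buf[i] - '0'
		if d > 9 {
			break
		}
		value = value*10 + int64(d)
		i++
	}
	return value, i, i > pos
}

func main() {
	v, next, ok := parseIntPrefix([]byte("1234Q"), 0, 5)
	fmt.Println(v, next, ok) // prints: 1234 4 true
}
```

Returning the index of the first unconsumed byte lets a caller resume parsing at the letter that ended the number.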

@yurivish
Author

Sure thing, thanks for the response. I've actually switched my parsing code to Go, where I've found ways to make everything much faster (some of which could have been done in Julia, but the faster startup time is a big boon as well, since it lets me build a fast command line tool that composes well with other UNIX utilities).

> Parsers.xparse(Int, "1234Q"; delim=nothing), then parsing will continue until an invalid character (or end of string) and return that parsing succeeded.

That sounds great, and like exactly what I was looking for! In Go I ended up with this inside a loop, which parses strings like A100B200X500:

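// Fragment from inside a loop over buf ([]byte); i indexes the current letter and l is len(buf).
// Parse the hour, encoded as a letter: 'A' maps to 0, 'B' to 1, and so on.
hour := buf[i] - 'A'
i++ // advance past the letter to the first digit
// Parse the first digit of the number, which always follows a letter.
var count int64 = int64(buf[i] - '0')
i++
// Parse and accumulate any remaining digits onto count
for i < l {
	digit := buf[i] - '0'
	// Byte subtraction wraps: '0'..'9' yield 0..9, while 'A'..'Z' yield 17 or more.
	if digit < 10 {
		count = count*10 + int64(digit)
	} else {
		break
	}
	i++
}
counts[hour] += count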
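
For anyone who wants to run this, the fragment above can be wrapped into a self-contained program. This is only a sketch under stated assumptions: `countsByHour` is a hypothetical name, the input is assumed to be repeated groups of one uppercase letter followed by one or more digits, and the error checks and the 26-slot array are additions for illustration rather than part of the original loop. Note that the index advances past the letter before any digits are read.

```go
package main

import "fmt"

// countsByHour is a hypothetical wrapper around the loop body shown above.
// It expects input such as "A100B200X500": an uppercase letter 'A'..'Z'
// followed by one or more decimal digits, repeated. Overflow is not handled.
func countsByHour(buf []byte) ([26]int64, error) {
	var counts [26]int64
	i, l := 0, len(buf)
	for i < l {
		// Byte subtraction wraps, so anything outside 'A'..'Z' yields 26 or more.
		hour := buf[i] - 'A'
		if hour >= 26 {
			return counts, fmt.Errorf("expected a letter A-Z at byte %d, got %q", i, buf[i])
		}
		i++ // advance past the letter
		if i >= l || buf[i]-'0' >= 10 {
			return counts, fmt.Errorf("expected a digit at byte %d", i)
		}
		// Accumulate digits until the next letter or the end of the input.
		var count int64
		for i < l {
			digit := buf[i] - '0'
			if digit >= 10 {
				break
			}
			count = count*10 + int64(digit)
			i++
		}
		counts[hour] += count
	}
	return counts, nil
}

func main() {
	counts, err := countsByHour([]byte("A100B200X500A5"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// 'A' is index 0, 'B' is index 1, 'X' is index 23.
	fmt.Println(counts[0], counts[1], counts[23]) // prints: 105 200 500
}
```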

Member

quinnj commented Feb 23, 2021

Cool; sounds great.
