Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tryparse: parse string to Nullable #9487

Closed
wants to merge 3 commits into from

Conversation

tanmaykm
Copy link
Member

Introduces methods that parse a string as the indicated type and return a Nullable with the result instead of throwing exception:

  • maybeint{T<:Integer}(::Type{T<:Integer},s::AbstractString)
  • maybefloat32(s::AbstractString) and maybefloat64(s::AbstractString)

Ref: discussions at #9316, #3631, #5704

@dhoegh
Copy link
Contributor

dhoegh commented Dec 29, 2014

Just a thought why not maybe{T}(::Type{T},s::AbstractString) for all types, hence you collect the functions similar to zero.

@tanmaykm
Copy link
Member Author

Yes, maybe will be nicer. Thanks for the suggestion.

@nalimilan
Copy link
Member

Introducing the maybe terminology while the type is called Nullable is a bit weird. Also, for consistency with the current parseint, parsefloat, you could merge all of these functions into something like parse(::Type{Int}, ...), parse(::Type{Float}, ...), parse(::Type{Nullable{Int}}, ...), parse(::Type{Nullable{Float}}, ...), etc.

EDIT: fix types by adding ::Type{...}

@ivarne
Copy link
Member

ivarne commented Dec 29, 2014

@nalimilan That seems like a massive improvement in API. Seems like Nullable will soon need some shorter name.

@dhoegh
Copy link
Contributor

dhoegh commented Dec 29, 2014

+1 for the API @nalimilan.

@tanmaykm
Copy link
Member Author

@nalimilan parse is even better. Shall update the PR and manual, unless there are any other views.

Not sure why the AppVeyor build is failing though. Seems spurious, but it failed even after re-running the build.

@nalimilan
Copy link
Member

Or to avoid the conflict in meanings with the current parse function, the new one could be named something like parseval. (It would also be a funny pun with a French mathematician's name...)

@nalimilan
Copy link
Member

@tanmaykm The AppVeyor failure doesn't sound so spurious, as the failure happens in maybefloat64. But why would it happen only on Windows, I don't know.

@tanmaykm
Copy link
Member Author

The AppVeyor logs report:

exception on 3: ERROR: test failed: (NaN == 64.0)
614 in expression: get(maybefloat64("\x034")) == 64.0

whereas strings.jl line 1315 is:

@test get(maybefloat64("64")) == 64.0

Seems like some sort of corruption.

@vtjnash
Copy link
Member

vtjnash commented Jan 2, 2015

sadly, that probably is due to #7906

@vtjnash
Copy link
Member

vtjnash commented Mar 10, 2015

bump. maybe we can merge this now?

@toivoh
Copy link
Contributor

toivoh commented Mar 10, 2015

About api:
I'm not so sure about using eg

parse(Int, ...)

and

parse(Nullable{Int}, ...)

to make the distinction of whether a non-Int input should be allowed and returned add a null result. The latter call looks to like that it would look for a text representation of a (nullable) Int or null, and throw an error if neither was found.

@tanmaykm
Copy link
Member Author

Will rebase and test again.

@JeffBezanson
Copy link
Member

@toivoh good point. We could have parse and tryparse. Together they could replace parseint, parsefloat, float64_isvalid, and float32_isvalid, so still a net reduction of 2 functions. This PR should definitely remove the _isvalid functions.

base = convert(T,base)
m::T = div(typemax(T)-base+1,base)
n::T = 0
while n <= m
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should avoid duplicating all of this code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, will be removing the duplication. Will pass the error string back through an optional ArgumentError parameter so that only methods that throw would incur the overhead of creating it.

@tanmaykm tanmaykm force-pushed the maybefloat branch 9 times, most recently from 77a30d0 to 30665d6 Compare March 12, 2015 04:40
@tanmaykm
Copy link
Member Author

Help! Any clues as to why this build is failing? https://travis-ci.org/JuliaLang/julia/builds/54050279

@nalimilan
Copy link
Member

Usually this kind of error means that a compile-time function call relies on a type that's defined in a file that's included later in the bootstrap process. Here, it looks like parsing the literal array at

const _fact_table128 =
implies a call to parseint which involves BigInt in one way or another. But I have a hard time understanding why this would only happen on 32-bit, and how your changes may have triggered this.

@tanmaykm tanmaykm force-pushed the maybefloat branch 2 times, most recently from 2931312 to 9441d5d Compare March 13, 2015 04:54
@tanmaykm
Copy link
Member Author

Thanks @nalimilan, I was able to reorder a portion of combinatorics.jl to fix the build issue. Not sure if the Appveyor issue is related.

@tanmaykm
Copy link
Member Author

This now exports the tryparse method:

  • tryparse{T<:Integer}(::Type{T<:Integer},s::AbstractString)
  • tryparse(::Type{Float..},s::AbstractString)
  • a few variants of the above

And:

  • tryparse(Float.., ...) call the corresponding C functions jl_try_strtof, jl_try_substrtof, jl_try_strtod and jl_try_substrtod.
  • The parseint, parsefloat, float64_isvalid and float32_isvalid methods wrap the corresponding tryparse methods.
  • The jl_strtod, jl_strtof, ... functions are wrappers over the jl_try_str... functions.

This should fix #10498 as well.

I shall rebase and add documentation if this looks fine. Should the ..._isvalid methods be deprecated?

@tanmaykm tanmaykm changed the title maybefloat & maybeint: parse string to Nullable tryparse: parse string to Nullable Mar 13, 2015
@ViralBShah
Copy link
Member

Unless the ..._isvalid methods were exported, I don't think deprecation is necessary.

@tkelman
Copy link
Contributor

tkelman commented Mar 13, 2015

float64_isvalid is used by JSON.jl, and presumably other packages as well. A deprecation would help lessen some breakage.

@@ -180,6 +180,27 @@ big(n::Integer) = convert(BigInt,n)
big(x::FloatingPoint) = convert(BigFloat,x)
big(q::Rational) = big(num(q))//big(den(q))

const _fact_table128 =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you should put these lines here. This file contains almost exclusively includes. Have you tried simply moving include("combinatorics.jl") below the BigInt section?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are cross dependencies that break if the entire file is shifted. These were the only UInt128 dependent code that was there. I thought of putting these in an int128funcs.jl file, but this seemed too small to go into a separate file.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to come up with something else. Having a factorial table in sysimg.jl is bananas. What depends on combinatorics.jl? I thought almost nothing would depend on this; it's all extra library functions that we've discussed removing from Base.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parsing UInt128 literals seems to require BigInt in 32 bit environment

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like gmp.jl depends on factorial and binomial. The simple generic definitions of those can be moved to intfuncs.jl. That should solve it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it will be fine if I move the binomial and factorial from gmp.jl and mpfr.jl into combinatorics.jl.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. So should they be moved to intfuncs.jl or combinatorics.jl since it already has the other variants?

@nalimilan
Copy link
Member

I think I liked better parse(Nullable{Int}, ...) than tryparse(Int, ...). Especially since as @ivarne said above, with a new syntax for Nullable, we could imagine writing it as parse(Int?, ...). But well...

Are there any other cases where a function should come in two versions, one returning a Nullable? Just to check that the try* pattern generalizes well.

Introduces the tryparse method:
- tryparse{T<:Integer}(::Type{T<:Integer},s::AbstractString)
- tryparse(::Type{Float..},s::AbstractString)
- a few variants of the above

And:
- tryparse(Float.., ...) call the corresponding C functions jl_try_strtof, jl_try_substrtof, jl_try_strtod and jl_try_substrtod.
- The parseint, parsefloat, float64_isvalid and float32_isvalid methods wrap the corresponding tryparse methods.
- The jl_strtod, jl_strtof, ... functions are wrappers over the jl_try_str... functions.

This should fix JuliaLang#10498 as well.

Ref: discussions at JuliaLang#9316, JuliaLang#3631, JuliaLang#5704
@JeffBezanson
Copy link
Member

Could you move this branch to JuliaLang so I can work on it too?
I just moved a copy there, but I'll delete it and we can use yours instead.

@JeffBezanson
Copy link
Member

@nalimilan Parsing numbers seems to be unusual in that (1) it is very performance-critical, (2) it has a failure mode that is exactly halfway to needing an exception. Half the time, you want an exception if the string is not a number, and half the time it just means you will handle the string differently. That's why I think parse/tryparse is justified.

@tanmaykm
Copy link
Member Author

Ok. Moved to the tryparse branch on JuliaLang

@JeffBezanson
Copy link
Member

replaced by #10543

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants