Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate tryparse(T, s) in favor of parse(Union{T, Void}, s) #25131

Closed
wants to merge 1 commit into from

Conversation

nalimilan
Copy link
Member

See previous discussions at #16981.

@alyst
Copy link
Contributor

alyst commented Dec 16, 2017

"in favor of parse(Union{T, Void}, s)"

That's a good candidate for parse(Union{T, Invalid}, s). Is it decided that the Invalid type would not be introduced?
Anyway, Void !== Missing, so I guess for datafame columns with the missing data it should be possible to do parse.(Union{T, Void}, col) or parse.(Union{T, Void, Missing}, col) that would return AbstractVector{Union{T, Void, Missing}}.

@nalimilan
Copy link
Member Author

The idea is that for missing values you'd better check explicitly for "", "null" or "NA", as there's no standard representation anyway (cf. what CSV does).

As for Invalid, I don't think that's considered at this point, at least not in Base. I'd be interested in having several kinds of missing values, but that would have to be much more flexible than just one additional flavour.

@alyst
Copy link
Contributor

alyst commented Dec 16, 2017

The idea is that for missing values you'd better check explicitly for "", "null" or "NA", as there's no standard representation anyway (cf. what CSV does).

That's true, but that concerns data import.
If the data is already read and of type Union{String, Missing}, would parse.(Union{Int, Void}, col) work (i.e. would parse(Union{Int, ...}, missing) be missing, or nothing, or an error)?

@nalimilan
Copy link
Member Author

Ah, OK, that's a totally different question. I think it should behave just like any other parse invocation when passed missing, but I'm not sure whether we should propagate or throw an error. Since it's currently an error we can change this at any point when we have the reply. At any rate I don't think it should return nothing silently.

@alyst
Copy link
Contributor

alyst commented Dec 16, 2017

I'm not sure whether we should propagate or throw an error.

Perhaps the easiest would be if parse(Int, missing) or parse(Union{Int, Void}, missing) throw an error, although it doesn't totally align with the missing value propagation concept and parse(Union{Int, Void}, s) nonthrowing behaviour.
If it returns missing, parse.(..., strcol) result would contain a mixture of nothing and missing (which is probably less confusing that a mix of nothing and null).

@ararslan ararslan added kind:deprecation This change introduces or involves a deprecation domain:strings "Strings!" labels Dec 16, 2017
@alyst
Copy link
Contributor

alyst commented Dec 18, 2017

The more I think about invalid::Invalid, the more it makes sense

  • parse(Union{Int, Invalid}, str) is more self-explanatory than parse(Union{Int, Void}, str)
  • if the result of parse() is passed around, it's easier to figure out that invalid originated from failed parsing
  • nothing is used in various places as a default value, so for f(...; ntries = nothing) this will work f(..., ntries = parse(Union{Int, Void}, "blabla")), but I'm not sure it's the expected behaviour
  • unlike Missing, (virtually) no new methods are required to support Invalid

@nalimilan
Copy link
Member Author

Adding a new type just for one function doesn't sound worth it to me.

@StefanKarpinski
Copy link
Sponsor Member

OTOH: returning an error object capturing something about the parse error might be helpful.

@nalimilan
Copy link
Member Author

Yeah. The advantage of merging tryparse into parse is that we can support parse(Union{T, ParseError}, s) in the future as an extension, and people could still use parse(Union{T, Void}, s) if they don't care about the reason.

@StefanKarpinski
Copy link
Sponsor Member

That sounds like a great plan to me!

@alyst
Copy link
Contributor

alyst commented Dec 18, 2017

How scalable parse(Union{T, ParseError}, s) would be? Would it allow efficient storage in Vector{Union{T, ParseError}}?

@StefanKarpinski
Copy link
Sponsor Member

Possibly, but that's a matter of future work. Practically, I'm not sure why you want a vector of values or errors, it seems to me that you'd probably want to handle errors as you encounter them, but it's nice to not have to use try/catch to get parse errors.

@alyst
Copy link
Contributor

alyst commented Dec 18, 2017

I'm not sure why you want a vector of values or errors

For example when importing dataframes that contain errors (misformatted values).
Sometimes these errors are tolerable, you just need to know that the value is misformatted.
With parse(Union{T, Void}, s) one can already do

Union{Int, Void, Missing}[s !== missing ? parse(Union{Int,Void}, s) : missing for s in str_col]

But it would be nice to parse.(Union{Int, Void/Invalid/ParseError}, str_col).

@nalimilan nalimilan changed the title Deprecate tryparse(T, s) in favor of tryparse(Union{T, Void}, s) Deprecate tryparse(T, s) in favor of parse(Union{T, Void}, s) Jan 7, 2018
@nalimilan nalimilan added the status:triage This should be discussed on a triage call label Jan 7, 2018
@nalimilan
Copy link
Member Author

Should we do this or not?

@JeffBezanson
Copy link
Sponsor Member

This doesn't seem quite right to me --- if you ask parse for a value of type Union{T,Nothing} and it returns nothing, then it gave you a valid value of the requested type, which then seems like the parse succeeded.

@JeffBezanson JeffBezanson removed the status:triage This should be discussed on a triage call label Jan 11, 2018
@JeffBezanson
Copy link
Sponsor Member

Triage agrees using the type argument to select whether an error is thrown is not ideal.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain:strings "Strings!" kind:deprecation This change introduces or involves a deprecation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants