Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The optional modifier produces empty strings instead of nil #87

Open
nathany-copia opened this issue Dec 13, 2021 · 6 comments
Open

The optional modifier produces empty strings instead of nil #87

nathany-copia opened this issue Dec 13, 2021 · 6 comments

Comments

@nathany-copia
Copy link

Strings don't behave as described by the README:

"If you use the optional modifier o together with a soft cast modifier (uppercase), then the value is set to nil"

~x"./NAME/text()"So produces an empty string instead of nil.

Someone attempted to fix this with #53, but it would've been better to open an issue to discuss possible solutions and backwards compatibility.

@nathany-copia
Copy link
Author

nathany-copia commented Dec 13, 2021

Perhaps the most obvious (initial) fix would be to document the exception in the README (and possibly a workaround).

@Shakadak
Copy link
Member

The main problem (from memory), is that :xmerl_xpath.string/3 [1] either returns a list, or a scalar (without information on whether the xml path is unreachable) that is wrapped in a list by sweet_xml. The way it is handled afterward through Enum.join/1 is in my opinion problematic, but the damage is already done.

Overall, I think a formal description of how the modifiers interacts, and what they expect as input and should return as output is necessary before analyzing how to update the library.

I don't think I have enough time to do either in the coming weeks so any participation from anyone is welcomed.

Regarding the README update, I'm totally in favor of it.

[1] https://www.erlang.org/doc/man/xmerl_xpath.html#string-3

@nathany-copia
Copy link
Author

nathany-copia commented Dec 13, 2021

Thanks for the explanation. So if xmerl doesn't provide information on whether the path is reachable for strings, then SweetXML cannot distinguish between <NAME></NAME> in the XML (as an empty string) or not <NAME> node at all (as nil), as might be done with the other types?

If that is the case, it sounds like So may not be a valid combination?

To be honest, I'm not sure how valuable that distinction of reachable or not is.

I think I'm more likely to want all empty strings to be converted to nil for my current situation, and others may prefer all strings (as is).

Is there anything less cumbersome than using transform_by on each-and-every field? e.g.

name: ~x"./NAME/text()"s |> transform_by(&Transform.to_string_or_nil/1),

@Shakadak
Copy link
Member

Shakadak commented Dec 14, 2021

So if xmerl doesn't provide information on whether the path is reachable for strings

This part needs testing, as this is simply my inference from what the type of :xmerl_xpath.string/3 and the behavior of SweetXml around it implies, but this is not sufficient, and could be incorrect, maybe a node present with nothing inside would be represented as a list of one element, with an empty string. (the private function _value/1 [1] can imply that, but again, testing will clear things out)

If that is the case, it sounds like So may not be a valid combination?
To be honest, I'm not sure how valuable that distinction of reachable or not is.

It's not that it is invalid, it is that the behavior is incompletely defined, if you take a look at the to_cast/3 implementation for soft_string [2], anything that can be converted using Kernel.to_string/1 will never be converted to nil if the optional modifier is also present. To me converting from charlist to binary is a form of casting, it stays semantically the same, the general usage is the same, but converting from an number to a binary is completely different, the operations allowed on them differs significantly.

So then what does the modifiers string and soft_string mean ? is it simply use to_string/1 and be done with it ? then nil will pretty much never happen. Is it convert from charlist to binary, and vice versa for the list modifier (sl what is even a stringlist ? a char list, a list of string ?) ? then the implementation is too permissive, badly documented, and incorrect.

That's why I wish to formalize the behavior of the modifiers.

Is there anything less cumbersome than using transform_by on each-and-every field?

A bit, you can define a function that navigate the supbspec given to xpath/3, for example:

  def inspect_spec(xs) do
    Enum.map(xs, fn
      {k, [x | xs]} -> {k, [x | Enum.map(xs, fn {k2, x} -> {k2, x |> SweetXml.transform_by(&IO.inspect(&1, label: "#{k}.#{k2}"))} end)]}
      {k, x} -> {k, x |> SweetXml.transform_by(&IO.inspect(&1, label: k))}
    end)
  end

You could imagine a function subspec_map/2 that takes a subspec like inspect_spec/1 above, a function that takes a key and a xpath struct as argument and returns a possibly new xpath struct, traversing the subspec recursively (unlike inspect_spec/1 that I used for quickly debugging something with only one level of nesting).

You would then only need to apply it like so:

subspec_map(subspec, fn _, x ->
  if x.cast_to == :soft_string and x.is_optional do
    SweetXml.transform_by(x, fn "" -> nil ; y -> y end)
  else
    x
  end
end)

Or other stuff of the sort.

[1] https://github.com/kbrw/sweet_xml/blob/master/lib/sweet_xml.ex#L763-L778
[2] https://github.com/kbrw/sweet_xml/blob/master/lib/sweet_xml.ex#L876-L882

@Shakadak
Copy link
Member

  @spec subspec_map(subspec, (%SweetXpath{} -> %SweetXpath{})) :: subspec
  when subspec: (%SweetXpath{}| keyword(%SweetXpath{} | subspec))
  def subspec_map(%SweetXpath{} = x, transform) do
    transform.(x)
  end
  def subspec_map({k, x}, transform) do
    {k, subspec_map(x, transform)}
  end
  def subspec_map(subspec, transform) when is_list(subspec) do
    Enum.map(subspec, fn x -> subspec_map(x, transform) end)
  end

@Shakadak
Copy link
Member

Hi, so it's confirmed that we won't do breaking changes, and we will go the path of documenting thoroughly the modifiers in order to leave no surprises to the users

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants