Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Infinite loop on recursive specs/schemas #36

Open
asianfilm opened this issue Jan 22, 2020 · 4 comments
Open

Infinite loop on recursive specs/schemas #36

asianfilm opened this issue Jan 22, 2020 · 4 comments

Comments

@asianfilm
Copy link

asianfilm commented Jan 22, 2020

I'm using the Floki web parsing library and have defined the following specs for use in my own code:

def floki_html_tree(),
  do: coll_of(one_of([floki_html_comment(), floki_html_doctype(), floki_html_tag()]))

def floki_html_comment(),
  do: {:comment, spec(is_binary())}

def floki_html_doctype(),
  do: {:doctype, spec(is_binary()), spec(is_binary()), spec(is_binary())}

def floki_html_tag(),
  do: {floki_html_tag_name(), coll_of(floki_html_attribute()), floki_html_children_nodes()}

def floki_html_tag_name(),
  do: spec(is_binary() and (&(&1 in ["a", "body", "div", "h3", "head", "html", "li", "link"])))

def floki_html_attribute(),
  do: {spec(is_binary()), spec(is_binary())}

def floki_html_children_nodes,
  do: coll_of(one_of([floki_html_tag(), spec(is_binary()), floki_html_comment()]))

Note that floki_html_children_nodes() makes a recursive reference to the floki_html_tag() spec. I don't get an error, but my code enters an infinite loop while my laptop gets very loud and very hot. (To be clear, the data structure is a tree, but its definition is a cyclical graph, which is presumably what's "trapping" the Norm parsing code.)

My workaround is to "dumb down" the definition of floki_html_children_nodes(), which makes my code run but isn't giving me the code safety that I want:

def floki_html_children_nodes(), do: spec(is_list())

Then again, one is mostly dealing with small fragments of html, having targeted the specific fragments while parsing, and one mostly only cares about the top level or two of any particular html sub-tree. There may also be a trade off to make between speed and granular accuracy.

On a related note, and perhaps worthy of another issue, is how to deal with third-party libraries. I don't necessarily want to go down Typescript's index.d.ts rabbit-hole. And perhaps it's easy enough to define one's own specs (at the granularity one requires for one's use of them).

@keathley
Copy link
Member

Unfortunately, I'm not sure I have a good solution to recursive definitions. Its something I thought a lot about when initially building Norm. I've thought of 2 ways to make it work but both have issues.

Use macros for everything

Currently spec is the only macro provided by Norm. If we converted the rest of Norm's api to macros then we could delay execution of these function calls until they were actually needed. This delay in execution would allow us to make recursive definitions. But the tradeoff is that it makes the implementation much more complex. I wasn't sure that complexity was worth it and tbh I wasn't sure I had the skill to build that system. It's still an option that I'm considering but I'm not sure how I feel about it.

Support symbolic calls or spec names

Right now there's no way to "export" or reference a spec. So we have to use functions. But since functions calls are eager we end up infinitely recursing. I've considered supporting something like this:

defspec :tree_spec, do: one_of([{:terminal, spec(is_binary())}, {:node, :tree_spec}])

In this case :tree_spec is defined recursively, but because the recursion is symbolic we don't end up in the loop. In order to make this work Norm would have to check all atoms against some sort of registry and if its there call into that spec. Otherwise, just treat it like a bare atom. A subtle implication in this approach is that defspec could expose specs defined in a module to other modules. I haven't thought about how to do this, but that might also make it easier for libraries or modules to expose what specs they define.

These are the best ways I've come up with to solve this problem holistically. At the moment I'm leaning towards symbolic calls and defspec. But if you have other thoughts on this I'd love to hear them :).

@keathley
Copy link
Member

keathley commented Feb 2, 2020

I started playing around with the defspec idea. Unfortunately I think we still end up in the "everything has to become a macro" situation. The reason is that the only way I can think to "register" specs is to add them to a module attribute in the module the spec is defined in. When we go to conform values we no longer know what module the spec is defined in. You'd need to pass along more information like {__MODULE__, :tree_spec} which seems pretty janky. It also makes it really hard to tell when you want to call a recursive spec vs. when you want to conform values against a tuple with atoms inside it. The only way I can think to make things ergonomic is to use macros in many other places. That seems...less awesome. If other folks have ideas on how to solve this I'd be interested to hear them.

@keathley
Copy link
Member

keathley commented Feb 2, 2020

Another option I just thought about (but haven't fully considered yet) is the idea of supporting spec(:tree). This would potentially allow us to build recursive definitions and support specs exported from other modules. But its a bit weird and non-ergonomic. But I wanted to jot down the idea while I was thinking about it.

@derekkraan
Copy link
Contributor

I am running into this right now. I'm not sure if this is a good idea but I'm processing a swagger file and turning it into Norm specs. The swagger spec includes some recursive definitions.

It's impossible to define a recursive schema in norm, but it is possible to call valid?/2 in a predicate function, so the result is that everything is validated correctly (as far as I can tell), but errors aren't bubbled up, since valid?/2 only returns a boolean.

I wonder if it might be possible to accept not only boolean output from predicates, but also allow them to call conform/2 and process the results of that transparently. Or perhaps it would be easier (for library users) to allow returning a norm spec in a predicate function, which would then be run by Norm, and the output captured that way.

I might experiment a little with this to see if it's something that could be done without too much difficulty.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants