Opaque failure - What to do when 'Learning fails'? #14
Comments
Since a program could not be learned from the examples given, more examples will usually not help. Normally, all programs expressible in the DSL that satisfy the examples are learned, so learning no programs means that no program in the DSL satisfies all of the examples, and adding more examples would only further constrain the learning problem. (I say "normally" because, using the escape hatches of the learning procedure, you could write your own non-monotonic learning sub-procedure... but that's generally a bad idea because of the confusion you bring up.) As you say, this means that the grammar would likely have to be extended to express the desired operation. We know the error reporting is poor, and it's an issue we intend to address. If you are comfortable sharing your data, it would be helpful to see your inputs, both to determine whether the task is in fact not expressible and, if so, to help us know how we might want to extend the language to cover your scenario. You can e-mail me at |
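To make the monotonicity argument concrete, here is a minimal sketch in Python. It uses a toy DSL made up for illustration (a program is just a `(start, end)` slice), not the PROSE API; `consistent_programs` and the sample strings are hypothetical. The point is only that each added example can shrink the set of consistent programs but never grow it.

```python
from itertools import product

def consistent_programs(examples, max_index=12):
    """Enumerate every (start, end) slice that maps each input to its output.

    Toy stand-in for "learn all programs in the DSL that satisfy the examples".
    Negative indices are included so a program can count from the end.
    """
    indices = range(-max_index, max_index + 1)
    return {
        (start, end)
        for start, end in product(indices, repeat=2)
        if all(inp[start:end] == out for inp, out in examples)
    }

# Hypothetical examples: extract the year.
one = consistent_programs([("2016-03-14 ok", "2016")])
two = consistent_programs([("2016-03-14 ok", "2016"),
                           ("1999-12-31 failed", "1999")])
# A third input in a different format rules out every remaining slice.
three = consistent_programs([("2016-03-14 ok", "2016"),
                             ("1999-12-31 failed", "1999"),
                             ("03/14/2016", "2016")])

print(len(one), len(two), len(three))
```

Once the set is empty, no further example can bring a program back; only a richer DSL can.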
@danpere Thanks for the feedback. I am not sure, but I think one of the problems I might be running into is that there are (at least) two date formats mingled in the documents, yyyymmdd and mm/dd/yyyy, either of which could be the accepted 'output', and the grammar may be failing to generalize across them. |
For clarification, you are using the The differing formats might be the issue. |
@danpere has covered all the main points. As you rightly observed, we can solve this by extending the grammar to support the task. @danpere mentioned learning conditionals, which basically partitions your inputs into clusters (each of which shares the same format) and learns a program for each of them. This is ongoing work. Which API did you use? Did you extract a substring out of a string, or a sequence of substrings out of a string? |
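As a rough illustration of the clustering idea, here is a sketch in Python under stated assumptions: the format key (`signature`, every digit replaced by `d`) and the slice-only toy DSL are invented for this example and say nothing about how PROSE would actually implement conditionals. It shows a single program failing on mixed date formats while one program per cluster succeeds.

```python
import re
from collections import defaultdict

def signature(text):
    """Hypothetical format key: every digit becomes 'd', other characters stay."""
    return re.sub(r"\d", "d", text)

def learn_slice(examples):
    """Return one (start, end) slice consistent with all examples, or None."""
    first_input = examples[0][0]
    for start in range(len(first_input) + 1):
        for end in range(start, len(first_input) + 1):
            if all(inp[start:end] == out for inp, out in examples):
                return (start, end)
    return None

def learn_conditional(examples):
    """Cluster examples by signature, then learn one slice per cluster."""
    clusters = defaultdict(list)
    for inp, out in examples:
        clusters[signature(inp)].append((inp, out))
    return {sig: learn_slice(group) for sig, group in clusters.items()}

def run(program, text):
    """Pick the branch matching the input's format; None if no branch applies."""
    branch = program.get(signature(text))
    if branch is None:
        return None
    start, end = branch
    return text[start:end]

# Hypothetical mixed-format examples: extract the year.
examples = [("20160314", "2016"), ("19991231", "1999"),
            ("03/14/2016", "2016"), ("12/31/1999", "1999")]

flat = learn_slice(examples)            # one program for everything
program = learn_conditional(examples)   # one program per format
```

Here `flat` is `None` because no single slice fits both formats, whereas `run(program, "07/04/2021")` and `run(program, "20210704")` each dispatch to the branch learned for their format.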
I see this code snippet in the samples. When I create my own dataset from some not very nice real-world examples and try to learn from it, my program tends to output "Error: Learning fails!"
What does it really mean that learning failed? Does it mean that the grammar is too incomplete to build a suitable generalization, so the grammar needs to be extended? Could it also mean that generalization was too hard for the learning system and it gave up, in which case maybe it would work with more examples? How can I determine the correct path forward?
The Learn() API really needs to throw a specific exception instead of just returning null!
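In the meantime, a caller-side wrapper can turn the null result into a descriptive error. The sketch below is in Python and purely illustrative: `LearningFailedError` and `learn_or_raise` are hypothetical names, and `learn` stands for any function that returns `None` on failure, not the actual Learn() signature.

```python
class LearningFailedError(Exception):
    """Raised when no program in the DSL satisfies all of the examples."""

    def __init__(self, examples):
        super().__init__(
            f"no program in the DSL is consistent with all "
            f"{len(examples)} example(s); the grammar may need extending"
        )
        self.examples = examples

def learn_or_raise(learn, examples):
    """Call `learn` and raise instead of handing a None program to the caller."""
    program = learn(examples)
    if program is None:
        raise LearningFailedError(examples)
    return program

# Usage with a stand-in learner that always fails.
try:
    learn_or_raise(lambda examples: None, [("03/14/2016", "2016")])
except LearningFailedError as err:
    print(f"Learning failed: {err}")
```

Keeping the examples on the exception makes it easier to report which inputs were involved when learning fails.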