version 2.2.0 gives pgml.predict is not unique #549

Closed
dusanmarjanovic opened this issue Feb 15, 2023 · 4 comments · Fixed by #566

Labels
bug Something isn't working

Comments

@dusanmarjanovic

The new version, installed from the apt repo, gives errors such as:
error returned from database: function pgml.predict(unknown, smallint[]) is not unique

Caused by:
function pgml.predict(unknown, smallint[]) is not unique

This happens even for most of the examples in the notebooks section.

@montanalow
Contributor

This is due to ambiguity with the single-tuple version of predict, which was added for preprocessing, when there is a single raw feature argument. This used to work somewhat by accident, as the smallint[] and the broader feature float4[] were both flattened into a single array. I'll look at what we can do to disambiguate. It may be as easy as adding explicit predict functions for each type of numeric feature array to provide exact matches, but I'd rather see if we can do something smarter and disambiguate based on the preprocessing directives in addition to the types.

@montanalow montanalow added the bug Something isn't working label Feb 21, 2023
@dusanmarjanovic
Author

Any progress on this? :) It makes the preprocessing feature rather unusable.

@montanalow
Contributor

I'm worried about this ambiguity in the APIs, since you mention it makes preprocessing unusable. To clarify, predict is overloaded in ways that are not equivalent with respect to preprocessing:

  1. Passing an ARRAY (now of any of the common numeric types, not just FLOAT4) will bypass preprocessing and operate directly on the raw features in the array.
  2. Passing a Postgres row type, which is distinguished by () instead of ARRAY[], will run the row through preprocessing before prediction.

The syntax distinction is subtle enough that people may overlook it, and call the wrong version. I'm curious if this was biting you, or just the areas in the notebooks that were not explicitly cast to FLOAT4?
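
For reference, the two call forms described above might look roughly like the sketch below. The project name, table, and column names are hypothetical placeholders, and the exact behavior depends on the installed version. The `::FLOAT4[]` cast is the explicit cast mentioned above, which should give Postgres an exact match for the array version.

```sql
-- Hypothetical names throughout: 'my_project', my_table, age, sex, bmi.

-- Array form: raw features, bypasses preprocessing.
-- The explicit cast avoids relying on implicit array type resolution.
SELECT pgml.predict('my_project', ARRAY[1, 2, 3]::FLOAT4[]);

-- Row form: parentheses build a Postgres row value rather than an array,
-- so this call selects the version of predict intended for preprocessing.
SELECT pgml.predict('my_project', (age, sex, bmi))
FROM my_table;
```

The only visible difference between the two calls is `ARRAY[...]` versus `(...)`, which is why it is easy to call the wrong version.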

@dusanmarjanovic
Author

Thanks for the update. It's not just the stuff from the notebooks; that's what I tried in the end to make sure I was not doing something wrong in my own project. I'm trying to run some predictions on my own data, but it needs to be preprocessed because of the NULLs and categorical variables. I will try now with the adjusted api.rs and hope it works.
