Switch frontend to new filtering API #1001
Comments
@dmos62 I really really like the hints representation and I imagine it must have been quite fun implementing it. It suits all use cases from the backend and API perspectives, if a user is to manually look at things and form the query. However, we would require additional information and possibly some change in the JSON structure to be able to use this on a client. Following are some of my initial thoughts on how to make it easier for clients. I've added them as multiple top-level comments.
|
1. Provide a role for each function.
Problem: From any frontend client perspective, it is absolutely essential to know what each function does, i.e. what kind of function it is. Both
Suggestion: Providing a role for each function will let clients know how to use them, especially for frontend clients when having to construct layouts. These roles would be a small set of values that the client will have to know, which is okay. E.g.,
|
2. Hints seem to represent both properties and arguments for functions. Split hints into
|
3. Provide additional properties for specific functions. (This builds on top of 2)
Problem: Certain functions like
Suggestion: Add additional meta information like
[
{
id: "list",
name: "List",
properties: {
returns: "data_list",
parameter_count: -1,
allowed_parameter_types: ["comparable", "string_like"]
}
}
] |
4. If possible, add names for parameters, so that requests can use the name instead of relying on index. (This builds on top of 2)
Problem: Relying strictly on index would make it hard to use the API in the long run, especially when complex functions are involved and index strictly matters. This is not a priority but does make things easier.
Suggestion:
[
{
id: "lesser",
name: "Lesser than",
role: "conditional_function",
properties: {
returns: "boolean",
parameter_count: 2
},
parameters: [
{
id: 'comparable',
name: 'lhs'
},
{
id: 'comparable',
name: 'rhs'
}
]
},
{
id: "in",
name: "In",
role: "conditional_function",
properties: {
returns: "boolean",
parameter_count: 2
},
parameters: [
{
id: 'any', // To represent any type
name: 'find'
},
{
id: 'list',
name: 'bucket'
}
]
},
{
id: "column_reference",
name: "Column Reference",
properties: {
returns: "type_of_argument", // This is a hint the client will know and calculate
parameter_count: 1,
},
parameters: [
{
id: 'column',
name: 'column'
}
]
}
]
The request sent from the client will then look something like:
{
lesser: {
lhs: 3,
rhs: 5
}
}
{
in: {
find: { column_reference: { column: 'column1' } },
bucket: [
{ column_reference: { column: 'column2' } },
{ literal: { value: 'foo' } }
]
}
}
{
and: [
{ ... },
{ ... }
]
} |
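To make the named-parameter suggestion concrete, here is a minimal Python sketch of how a server or client could translate a named request into the positional form. The `DECLARATIONS` table, the `to_positional` helper, and the choice to wrap plain arrays in a `list` function are all assumptions made for illustration; none of them is part of the proposed API.

```python
# Hypothetical parameter-name declarations, ordered by position.
# The names mirror the suggestion above; the table itself is illustrative.
DECLARATIONS = {
    "lesser": ["lhs", "rhs"],
    "in": ["find", "bucket"],
    "column_reference": ["column"],
    "literal": ["value"],
}


def to_positional(expr):
    """Convert a named-parameter expression into the positional form."""
    if isinstance(expr, list):
        # Assumption: a plain array used as an argument becomes a `list` call.
        return {"list": [to_positional(item) for item in expr]}
    if not isinstance(expr, dict):
        return expr  # a bare literal value such as 3 or "foo"
    # Every function expression is a dict with exactly one key.
    ((function_id, arguments),) = expr.items()
    if isinstance(arguments, list):
        # Already positional, e.g. variadic functions such as `and`.
        return {function_id: [to_positional(a) for a in arguments]}
    names = DECLARATIONS[function_id]
    missing = [name for name in names if name not in arguments]
    if missing:
        raise ValueError(f"{function_id} is missing parameters: {missing}")
    return {function_id: [to_positional(arguments[name]) for name in names]}
```

With this approach the positional syntax could stay the canonical form on the backend, while clients that prefer names get a thin translation layer.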
5. For the db_types endpoint, rename
|
I'm also assigning the backend team to this issue since the suggestions include some significant changes in the JSON structure. We could also convert this issue to a discussion to make it easier to discuss each suggestion individually. |
Thanks for the detailed and very well structured feedback, @pavish! Reading your suggestions I think I know where you're coming from. I wrote the previous revision of this refactor in a spirit similar to what you're talking about. I'd sum it up with the word strict, maybe? After discussing that previous revision amongst ourselves, we decided to go for another iteration, aiming for something relaxed and highly simple, especially from the perspective of user developers. That is, users who might jump into the code to add new functions themselves.
The roles you outlined, if I understand them correctly, can be inferred from the function signatures (types of their inputs and output). If there's some information that is missing in the current hint sets, that can be added via a new hint.
In my eyes argument types, counts, etc. are function properties too. I'd say that given the sense of flexibility we want to achieve with this API, a homogeneous Note that the spirit of this API is that we're not actually saying what the properties of a function are. We're giving hints about what they might be. Hence the term We're also not saying to the future user developers that hints are mandatory. The ultimate vision of this, as I understand it, is that the user shouldn't have to study the hint system to use his newly written function. He can leave the hints out, and then he won't profit from the suggestions that the UI can give him, but he'll still be able to use the function in the UI. I'm not implying that all this is planned for Alpha, though.
I think that current
That's a good idea. I've been playing with it and I'm not yet sure how to handle all kinds of signatures, but I'm pretty sure this is the direction the API is headed in. Thanks!
I'll refer to my note a few paragraphs higher about why we're using the term hints, as opposed to properties or attributes. As for the structure, the I hope I don't come off dismissive. You bring up good points. I think we're currently synchronizing our visions of what this aspect of the product should be. This proposal is coming from a pretty long chain of discussion-iteration cycles amongst the backenders and I'll be interested to read what they have to say about your concerns. I sympathize with the fact that this proposal requires a lot from the frontend. And, yes, this was (and still is) fun to implement, haha. |
Okay, I understand the hints structure more clearly now. But there are a few points I feel strongly about and a few more lingering questions.
It's relaxed and simple on the backend and a ton of work on the frontend and for clients. :) Some questions first
Suggestion: A different format
I'm suggesting a completely different structure here. I understand the amount of discussion that has gone into this and I'm sorry for not participating earlier, but the API is intended to be used by clients and I'm proposing a structure that would ease it for clients and also satisfy backend requirements. I'm not suggesting we completely change
Representing a function signature
From the client's point of view, the entire
A function signature absolutely needs the following:
I don't see a case where a user developer or anyone could create functions without having to define both Also, if the parameter or return type can be It would also be better to separate
I find the following structure feels more readable and usable for clients. I've represented them in TS'ish to get the idea across to everyone.
interface FunctionDefinition {
id: string,
name: string,
returns: ArgumentDataType,
parameter_count: number
parameters: [
{
accept: ArgumentDataType | ArgumentDataType[],
name: string
},
...,
],
additional_info: { ... } // Could be anything, maybe same as nested hints structure
}
type ArgumentDataType = string | GenericArgumentDataType;
type GenericArgumentDataType = {
id: string,
accept: ArgumentDataType | ArgumentDataType[]
}
ArgumentDataTypes
A client will only know the values of
ArgumentDataTypes like
GenericArgumentDataTypes
Generics are represented as:
{
id: `list`,
accept: `any`
}
or in a nested way:
{
id: `list`,
accept: {
id: `list`,
accept: `string_like`
}
}
The representation of a
Example representation
Function endpoint for IN would look like:
[
{
id: 'in',
name: 'In',
returns: 'boolean',
parameter_count: 2,
parameters: [
{
accept: 'any',
name: 'Find'
},
{
accept: {
id: 'list',
accept: 'any'
},
name: 'Bucket',
}
]
}
]
We can go one step further and ensure type safety by defining generics in this function definition:
[
{
id: 'in',
name: 'In',
returns: 'boolean',
parameter_count: 2,
generics: {
A: 'any'
},
parameters: [
{
accept: 'A',
name: 'Find'
},
{
accept: {
id: 'list',
accept: 'A'
},
name: 'Bucket',
}
]
}
]
This structure helps clients understand that IN accepts any data type but an instance of list would accept only one type.
Functions with generic return types
I would like us to avoid such functions since Postgres does not support them either way (as far as I know). The only special cases are
The return type of these functions needs to be determined at client run-time and the client will have to know how to work with these. Instead of representing them as functions, we need to think of them as reserved keywords, kind of like
Similar to how the client would have to know the set of all ArgumentDataTypes, the client will also have to know these two keywords. This is in comparison to the hints system where this problem is not solved, as I've mentioned in the question section above.
Passing functions to functions would work like the following in clients:
The request structure will remain more or less the same as defined in this issue description:
{"in": [
  {"column_reference": ["column1"]},
  {"list": [
    {"column_reference": ["column2"]},
    {"to_lowercase": [
      {"column_reference": ["column3"]}
    ]},
    {"literal": ["foo"]},
    {"literal": ["bar"]}
  ]}
]}
This will work, and also the alternate format:
{
in: {
find: { column_reference: { column: 'column1' } },
bucket: [
{ column_reference: { column: 'column2' } },
{ literal: { value: 'foo' } }
]
}
} The
|
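As a rough illustration of the generics idea in the comment above, the following Python sketch checks argument types against a declaration shaped like the proposed `in` definition. The `unify` and `accepts` helpers are hypothetical, and the sketch assumes `accept` is a single type (it does not handle the array-of-alternatives form).

```python
# Declaration shaped like the `in` example with generics from the comment above.
IN_DECLARATION = {
    "id": "in",
    "returns": "boolean",
    "generics": {"A": "any"},
    "parameters": [
        {"accept": "A", "name": "Find"},
        {"accept": {"id": "list", "accept": "A"}, "name": "Bucket"},
    ],
}


def unify(declared, actual, generics, bindings):
    """Return True if `actual` fits `declared`, recording generic bindings."""
    if isinstance(declared, str):
        if declared in generics:
            if declared in bindings:
                # The generic was bound earlier; later uses must match it.
                return bindings[declared] == actual
            bound = generics[declared]
            if bound != "any" and bound != actual:
                return False
            bindings[declared] = actual
            return True
        return declared == "any" or declared == actual
    # Generic container such as {"id": "list", "accept": ...}.
    return (
        isinstance(actual, dict)
        and actual["id"] == declared["id"]
        and unify(declared["accept"], actual["accept"], generics, bindings)
    )


def accepts(declaration, argument_types):
    """Check a list of argument types against a function declaration."""
    parameters = declaration["parameters"]
    if len(argument_types) != len(parameters):
        return False
    generics = declaration.get("generics", {})
    bindings = {}
    return all(
        unify(parameter["accept"], actual, generics, bindings)
        for parameter, actual in zip(parameters, argument_types)
    )
```

Here `accepts(IN_DECLARATION, ["string_like", {"id": "list", "accept": "string_like"}])` passes, while a `comparable` list paired with a `string_like` value is rejected because `A` is already bound.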
@pavish @dmos62 I haven't read through all the earlier comments but I'd like to reiterate my idea from the other thread of keeping the I think it would be great to have the |
@kgodey Makes sense. I would suggest taking a look at the format mentioned in this comment for the frontend specific API. |
I'll leave it to @dmos62 to come up with a proposal for this if he also thinks it's a good idea.
This is what I'd like to avoid. I think we can have a relaxed and simple implementation on the backend and use that as a base to have an API that also makes it simple to use for the frontend. |
It's because of how The point of having I quite like this setup. All of these functions are special on the frontend:
Again, these functions are first-class to avoid syntax sugar, which complicates the interface and its backend implementation, and I think its frontend application too. First-class might be a misnomer, since as you point out, they're "special". But, we can address that with generics if we want to. I'll talk about generics later.
A
Yes.
A hint can contain any information, including nested hints.
Hints are optional. You can write them, and you can follow them, but you're not obligated to do either. On the choice of array for representing hint sets. I'd normally use a set for this (and I did initially), but JSON only supports maps and lists. I didn't use a map as a makeshift set, because that's awkward (in my experience). And, because, for example, where you have multiple I'll try to sum up your suggestions:
I like the ideas of generics and named parameters. I've a few reservations about the burden/usefulness tradeoff of generics in our use case. Named parameters is a question of when and how. I'm sceptical of the proposed declaration format. It seems specialized for a specific UI. I'd say you're challenging:
For 1, I think that all function attributes being in an array is fine for the frontend, since it can reshuffle the data however it likes. I feel motivated to keep the format of the hints in the same structure as the user developer declared them (an array of hints currently). We could populate a data structure derived from the hints in the format suggested (for example) and publish it as a complement to publishing the hints verbatim. That's what @kgodey suggested. But, I think that should be a last resort. Let's see how this discussion pans out first.
For 2, the idea of supporting partial declaration is debatable of course. Initially, I was going for a full-declaration-only API. We shifted to a more relaxed model, which I like. I think it's empowering to the power user, and it forces us to be resilient to bad declarations and bad expressions, which profits the casual user too. If we cannot find a way to work with partial declarations on the frontend, we can walk this backend decision back.
It is a fair amount of work on the frontend, but is it due to the problems you've mentioned? Generics and named parameters aside? Can you be more specific in what complexity you find should be shifted to the backend? By the way, I'm not sure what the distinction between a client and a frontend is in this case. For the sake of minimizing the number of round-trip messages, I'll presume that you mean client, as in the entity served by a server. Also, let us distinguish the complexity inherent in letting a user compose practically any database function expression and the complexity created by an inefficient API.
Generics
As I mentioned earlier, I'm a bit sceptical of the tradeoff between the complexity added by supporting generics and the benefit of a type-safer interface. Where generics would be useful:
Doing generics in the |
I would be interested in hearing @mathemancer's opinion. |
I realised that on the frontend you do know the type of the referenced column. Duh. Generics is starting to look more useful. |
@dmos I am quite fine with the backend code keeping things simple with the hints structure; what I do not want is for the API to behave in the same manner. I do not understand why the API needs to be loose. When you say users may want it, who does "users" refer to? As far as I can see, the whole "functions" endpoint is only useful for those building "general purpose" clients, and never for specific clients. A general purpose client is our frontend, and anyone wanting to build a frontend for Mathesar. Everyone else utilizing the Mathesar API for a specific purpose or application would never need to use the "functions" endpoint. The term users is very ambiguous in this entire discussion and I would like us to establish that clearly. |
As suggested on the weekly discussion, I think it would be useful to set up a call to discuss both the goals and format of the API. |
@dmos62 I can see that you agree with (or at least consider) the usefulness of Let me focus on the point we disagree upon. Also, let us refrain from talking about the internal representation on the backend. I'm only concerned with the API representation. I am in favour of a more obvious API where:
Your argument relies on the reasoning that it's more useful for user developers if the API does not follow the obvious structure. Here are some key aspects that bother me:
It isn't specialized for a specific UI; it makes it easier for any parser reading the format. I just want it to be very clear for clients to understand and make use of the function declarations and data types.
I'd rather have them declare functions correctly, or not allow them to expose them on the API with erroneous declarations. User developers here seem to refer to a user who writes backend code directly and does not necessarily have any need for the "functions" API. In contrast, user developers like plugin authors would focus more on the clients functioning properly than on the ease of adding a new function. With each function, I think there should be a flag to determine whether or not to expose it on the functions endpoint. If the user developers want to expose them, they should be valid. In the API at least, our focus should be more on clients and not on user developers. |
That's a good summary of your position, I think.
That's a good point. I'll see if I can minimize the "specialness" down to inputs of
Yes, that's one of the goals of this revision. Ultimately we want an interface (or at least we were imagining an interface) where the user is unconstrained in what expressions he can pass to Postgres, but at the same time the safe choices are gently pointed out to him. The novice can take the safe road and the expert can enjoy interacting with the database in a pretty direct way.
The idea is to support an interface that would not totally discard those incompletely defined functions. Hints give you niceties, but you can still use the function without them. You then have to rely on the user's judgment and ability to troubleshoot. Of course, we might not want an advanced expression UI for Alpha, but it's good that the API supports it.
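A small Python sketch of what "a missing hint means unconstrained" could look like on a client. The hint shape (`{"id": ..., "count": ...}`), the `get_hint` and `parameter_count_allowed` helpers, and the `minimum_parameter_count` hint (only floated as a possibility in the issue description) are assumptions for illustration.

```python
def get_hint(function, hint_id):
    """Return the first hint with the given id, or None if it is absent."""
    for hint in function.get("hints", []):
        if hint.get("id") == hint_id:
            return hint
    return None


def parameter_count_allowed(function, count):
    """Check a parameter count; aspects without a hint are unconstrained."""
    exact = get_hint(function, "parameter_count")
    if exact is not None and count != exact["count"]:
        return False
    # Hypothetical hint that constrains only the lower bound.
    minimum = get_hint(function, "minimum_parameter_count")
    if minimum is not None and count < minimum["count"]:
        return False
    return True


# Illustrative declarations: one hinted, one with no hints at all.
STARTS_WITH = {"id": "starts_with", "hints": [{"id": "parameter_count", "count": 2}]}
AND = {"id": "and"}
```

A function without hints, like `AND` here, is never rejected by the client; whether the resulting expression is valid is then left to the user and the database.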
I think these possible failures are aligned with the goals of this refactor. We want to hand hold to the best of our ability, but only when the user wants it, and we want to allow the user to work in "look no hands" mode (failing as late as possible). At the same time, we want to empower the user developer that might want to extend the database function set to suit his use case (and then use those functions through the UI). We want to tell the user that a list as an argument to
I'd say my goal was an API where UI's ability to guide the user is as good as:
Want to whip up a function and try it out in the UI? Go for it, forget the hints. Want some basic assistance: then add some basic hints. Want even more assistance: then add hints that maybe require more care (generics, etc.). I'll not address every point you made, since some of them will be well addressed by others in the upcoming sync call. |
The ideal UI for the hints structure would be a query editor based UI with suggestions, where the user can directly write queries and the editor should show hints. Any other UI (eg., our current UX) cannot be implemented if the functions are invalid or incomplete. If I do not know the return type of a function, I cannot compose it in other functions. Similarly if I do not know acceptable parameters for a function, I cannot pass values to it.
That's the problem: if we are showing the functions and allowing incorrect usage without any sort of conditions, showing suggestions would not matter. Novices are our primary users. Why are we so keen on making it easy for user developers, rather than the users themselves? If someone is tech savvy enough to write functions and add them to Mathesar, they can include 2 additional fields.
I'm saying let's allow the use of those functions while forming the filter query. I'm not opposed to that. But we can't let them do that through clients. I just don't want invalid and incomplete functions to be exposed on the "functions" endpoint.
This comes with the cost of providing good UX. A slightly strict structure would support all kinds of interfaces. Are those incompletely defined functions that important? As I mentioned before, why not shift a small amount of responsibility towards user developers rather than inconvenience users.
This is exactly what I want to avoid, because it's not possible unless the UI is an editor. No other UI can be reliably implemented with flaky information. Providing a well defined structure will support all kinds of interfaces (including editor based UX), whereas the hints structure only seems to focus on an editor based interface. |
I'll just make two quick points. User developers are users that happen to be developers too. They might not be the primary user group, but we're looking for a way to accommodate them as well. A totally hint-constrained UI can be constructed too: just disregard database functions that aren't described fully enough. That would suit the current UX design. I presumed that that's what we would do in the near term. Maybe I wasn't clear enough on that. |
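The "just disregard functions that aren't described fully enough" approach could be sketched in Python as below. The hint ids checked here (`returns`, `parameter_count`, `all_parameters`) and the definition of "fully enough" are assumptions; the thread does not fix either.

```python
def find_hint(hints, hint_id):
    """Return the first hint in a hint set with the given id, or None."""
    for hint in hints:
        if hint.get("id") == hint_id:
            return hint
    return None


def is_fully_described(function):
    """Assumed criterion: return type, parameter count and parameter types."""
    hints = function.get("hints", [])
    return all(
        find_hint(hints, hint_id) is not None
        for hint_id in ("returns", "parameter_count", "all_parameters")
    )


def usable_in_constrained_ui(functions):
    """Keep only the ids of functions a strict UI can reliably lay out."""
    return [f["id"] for f in functions if is_fully_described(f)]


# Illustrative data: one fully hinted function, one with no hints.
FUNCTIONS = [
    {
        "id": "starts_with",
        "hints": [
            {"id": "returns", "hints": [{"id": "boolean"}]},
            {"id": "parameter_count", "count": 2},
            {"id": "all_parameters", "hints": [{"id": "string_like"}]},
        ],
    },
    {"id": "my_custom_function"},  # hypothetical user-added function
]
```

Under this scheme the endpoint can stay permissive while each client decides how strict it wants to be.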
@pavish There's another tidbit to consider. Even novice users will probably want to do things like |
@dmos62 Our current UX cannot support multi-level composition and we will have to design another for advanced users (note: advanced does not mean technical. Advanced here refers to someone with a bit more know-how than our novice users.) We can figure out when we might want such a UX, I'd say probably right after the alpha. @kgodey We'd like your thoughts on this. I've not been focusing on the UX we have at hand for this discussion, but rather on the ability to support complex UX being implemented reliably using the API. |
There's lots to cover here, so apologies in advance if I miss something, or I'm not understanding something. I think that input and description should not be so dependent. Not every function needs to be understood by our client (or any client) to be useful. To be more specific, I think it's crucial that our API is set up to accept requests involving non-recommended compositions of functions. I think that we're eventually going to want to be able to be pretty flexible in what we allow users to do. This would imply (to me) either accepting raw SQL input, or coming up with a UI that's essentially as flexible as raw SQL. I think this means a pretty general API like the one @dmos62 has proposed. As far as moving complexity to the front end goes, I think that's somewhat appropriate in that context. Writing queries is, in the end, the user's job. We're here to help and guide them since they may not have the prerequisite knowledge to write a query that gets the data they want. I disagree with the conflation of incomplete and incorrect descriptions. Plenty of clients could use the API without knowing anything about output types. (how about @dmos62 I don't think the idea of having a specific API to give higher-level functionality is a cop out. Rather, I think it's a good internal-to-the-backend test use case for the query building paradigm. Think of it as something of an internal client to the lower-level API that then exposes higher-level concepts to outside callers. |
We had a call about this, notes and next steps are here. I think it probably makes sense to close this issue and open a new one for implementing the filtering API @dmos62. |
Over on the backend, we seem to have converged on a filtering API that we're happy with (relevant PR). The purpose of this thread is to discuss the proposal with the frontend team.
To understand the new filtering API, it's probably best to forget about the current/previous API, since they have hardly anything in common. The new API is more powerful, flexible and extensible, but it also requires more from the frontend implementors.
Tutorial
Here's the expected high-level workflow from frontend's perspective; the goal is to describe the filter in terms of a database function that returns a boolean:
`/api/v0/functions` endpoint to get the list of supported database functions (also known as `db_function`s or `DbFunction` subclasses in the backend) and their hints (hints are important).
Here's an example of what the endpoint returns. Notice that hints can nest. The `starts_with` database function is hinted to return a boolean, and to take two parameters, all of which are expected to be string-like.
`/api/v0/db_types` endpoint to get the hints associated with each database type:
Below sample shows the DB type `character varying` to have the hint `string_like`, the DB type `uri` to have the hint `uri`, and the DB type `numeric` to have the hint `comparable`. Note that these hint sets are incomplete; one should expect most DB types to have multiple hints. Also, note that hints are pretty expressive and contain different types of information, like information about parameters, return types, type attributes (e.g. `string_like`, `comparable`). An advanced user might extend the hint and the function sets to suit his use case.
Note I'm not trying to suggest a UI design. It should be as simple or as advanced as the spec says, but point is that, whether implicitly or explicitly, the user will be assembling a function expression.
Note that to be expressive with database functions you'll have to nest them. For example, to check if a URI's authority starts with "foo", you'll do something like `starts_with(extract_uri_authority(column_reference("that_uri_column")), literal("foo"))` (pseudo code; actual JSON syntax lower down).
Note that parameter position will matter. We might need a convention or some sort of documentation to tell whether, for example, it's `starts_with(string, prefix)` or `starts_with(prefix, string)`. For functions with two parameters, the usual `string starts_with prefix == starts_with(string, prefix)` logic might be enough. But, we could have more complicated functions. Adding doc strings or parameter titles to the DB function declarations would not be complicated. This is not a solved problem as of yet, but I don't consider it a blocker either, since we can get a lot done already without having great support for functions with complicated signatures.
Note that some functions will not hint at their return value type (currently all defined functions could have return hints, but that will probably not hold for all conceivable functions), and/or their parameter count, and/or their parameter type. If a function doesn't have a hint for this or for that, you can consider that aspect of it to not be constrained, or that it's up to the user to judge when to use it. For example, `empty` doesn't hint at a parameter type (it doesn't matter); `and` doesn't hint at the parameter count, since it can take any number of inputs. Conceivably, we could introduce a hint like `minimum_parameter_count` to constrain the lower bound of parameter count, but not the upper bound.
That's the equivalent of the following SQL expression:
Note that every function is a dict with a single key-value pair. The key is always the function id, and the value is always a list of parameters. A parameter can be a literal, in the case of the `literal` or the `column_reference` functions, or a nested function expression (in the form of a dict).
Note that everything is a function in this syntax. A column reference is the column name/identifier wrapped in the `column_reference` function; a list (as in the right hand argument to `IN`) is the `list` function and its parameters are list items; a literal, like the string `"foo"`, is wrapped in the `literal` function. All of these functions are exposed in the `/functions` endpoint.
Please submit your feedback and/or questions.
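The expression syntax described in the notes above (every function is a single-key dict whose value is a list of parameters) can be sketched in Python as follows. The helper names `call`, `column_reference`, `literal` and `to_pseudo_code` are hypothetical conveniences, not part of the API; the example rebuilds the URI-authority filter from the pseudo code earlier in this description.

```python
import json


def call(function_id, *parameters):
    """Build a function expression: a dict with a single key-value pair."""
    return {function_id: list(parameters)}


def column_reference(name):
    """Wrap a column name/identifier in the `column_reference` function."""
    return call("column_reference", name)


def literal(value):
    """Wrap a raw value in the `literal` function."""
    return call("literal", value)


def to_pseudo_code(expression):
    """Render an expression in the pseudo code notation used above."""
    if not isinstance(expression, dict):
        return json.dumps(expression)  # raw value inside literal/column_reference
    ((function_id, parameters),) = expression.items()
    rendered = ", ".join(to_pseudo_code(p) for p in parameters)
    return f"{function_id}({rendered})"


# "Does the URI's authority start with foo?"
filter_expression = call(
    "starts_with",
    call("extract_uri_authority", column_reference("that_uri_column")),
    literal("foo"),
)
```

Serializing `filter_expression` with `json.dumps` yields the nested dict-and-list form that a client would send; `to_pseudo_code` is only a debugging aid for reading such expressions.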