Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document table function support #12338

Merged
merged 1 commit into from
May 16, 2022
Merged

Document table function support #12338

merged 1 commit into from
May 16, 2022

Conversation

kasiafi
Copy link
Member

@kasiafi kasiafi commented May 11, 2022

This is documentation for Table Functions #11336.
The feature is already in.

@cla-bot cla-bot bot added the cla-signed label May 11, 2022
@kasiafi
Copy link
Member Author

kasiafi commented May 11, 2022

@martint @mosabua could we try to merge the docs before the upcoming release so that the new feature goes live already documented?

Comment on lines 182 to 186
public Optional<TableFunctionApplicationResult<ConnectorTableHandle>> applyTableFunction(ConnectorSession session, ConnectorTableFunctionHandle handle)
{
// get the ConnectorTableHandle and a list of ColumnHandles representing the table function result

return Optional.of(new TableFunctionApplicationResult<>(tableHandle, columnHandles));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not copy code into docs to avoid going out of sync. Pointer to a method (interface method name to implement) should suffice and then that method javadoc should describe the technical details

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, I can indeed remove the pasted code, as it doesn't add value to the docs.
However, I'd like to leave the other code samples (the constructor and analyze()), even at the risk of going out of sync, because they provide a convenient way of explaining things. Anyway, I hope I won't miss the docs update if the SPI changes in the future.

Copy link
Member

@mosabua mosabua left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass..

docs/src/main/sphinx/sql/table-functions.rst Outdated Show resolved Hide resolved
docs/src/main/sphinx/sql/table-functions.rst Outdated Show resolved Hide resolved
docs/src/main/sphinx/sql.rst Outdated Show resolved Hide resolved
docs/src/main/sphinx/sql/table-functions.rst Outdated Show resolved Hide resolved
docs/src/main/sphinx/sql/table-functions.rst Outdated Show resolved Hide resolved
docs/src/main/sphinx/sql/table-functions.rst Outdated Show resolved Hide resolved
docs/src/main/sphinx/sql/table-functions.rst Outdated Show resolved Hide resolved
docs/src/main/sphinx/sql/table-functions.rst Outdated Show resolved Hide resolved
docs/src/main/sphinx/sql/table-functions.rst Outdated Show resolved Hide resolved
docs/src/main/sphinx/sql/table-functions.rst Outdated Show resolved Hide resolved
@kasiafi
Copy link
Member Author

kasiafi commented May 12, 2022

@mosabua thanks for your feedback.

I split the document in two, one from the user's perspective, and one from the developer's perspective.

  • The "user guide" is linked from select.rst, and it is in the functions folder. We will add there a list of built-in functions when we have them.
  • The "developer guide" went to develop.

The changes are in a separate commit.
Please take a look.

===============
Table functions
===============

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need an intro sentence about table functions .. what is it? And why would I use it..

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could just link to the user docs actually

Copy link
Member

@mosabua mosabua left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a bunch of clarifications to allow users to actually use this

===============
Table functions
===============

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could just link to the user docs actually

Table function declaration
--------------------------

Trino supports adding custom table functions. They are declared by connectors
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So ... thats kinda weird .. so users have to maintain a fork of the connector? It cant be a separate plugin that is somehow associated with a specific connector?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it doesn't require a fork or a separate plugin. Any existing connector can expose table functions. Soon we'll have the query pass-through feature, which is a good examples of (existing) connectors registering polymorphic table functions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, I believe that we will be adding a lot of custom table functions to Trino. Of course, we will have to judge if the function is generally useful. If someone need a function that we find "controversial", they might have to fork.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay ... maybe we should say something like.. want another table function added to trino .. file a ticket or PR and contact us ... although thats kinda odd for the docs.. not sure at this stage how we should proceed .. if there are not currently usable table functions maybe we should merge..

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we don't write things like "want another connector...file a ticket..." in the docs.

At the moment, we have a bunch of table functions in review. They should be in soon. When we merge them, we'll document them, and they will serve as a good example of what people have to do if they want a new table function in Trino.

Also, we will likely have some publications around table functions.

Trino supports adding custom table functions. They are declared by connectors
through implementing dedicated interfaces.

To declare a table function, you need to subclass ``ConnectorTableFunction``.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we probably need to add a whole bunch more info about the mechanics .. like separate module or fork of specific connector, pom.xml, dependencies?, what about package structure and so on..

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added another sentence about passing the custom implementation through a Provider. It is all that the implementor must do. Also, this is a pretty common pattern of exposing things from connectors in Trino.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the user has to do the development .. and that is not the case it seems.. so it just goes into existing connectors .. we should mention that somehow

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When thinking of it, I can identify at least four categories of table functions, and they will be added in different ways:

  1. functions that we will add to the existing connectors. The query pass-through capability is a good example, because it is specific to the connector: you pass different syntax to Elasticsearch than to Teradata.
  2. functions that users add to their own connectors (forked). Generally, these are functions too controversial or too specific to be included in Trino
  3. "global" functions, which will be built-in in Trino, but they don't belong to any specific connector. Example: the pivot capability. We will support them only when we have the dedicated Operator
  4. I think that having table function support might result in creating a whole new kind of connectors. Currently, we have connectors over different data sources. Now that we support adding table functions, it makes sense to connect to an outer system for its processing capabilities and not necessarily for the data. Such connector could expose table functions that give access to the outer system's abilities or resources, so that they can be used for processing data in Trino.

@mosabua could you help me with the wording? The problem is, the table functions can go both to the existing and new connectors. I can't predict what kind of use case will be dominant.

- returned row type

In the example, the returned row type is ``GENERIC_TABLE``, which means that
the row type is not known statically, and it will be determined dynamically
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and it is determined


In order to provide all the necessary information to the Trino engine, the
class must implement the ``analyze()`` method. This method is called by the
engine during the Analysis phase of query processing. The ``analyze()`` method
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the analysis phase


The access control for table functions can be provided both on system and
connector level. It is based on the fully qualified table function name,
which consists of the catalog name, the schema name, and the function name.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the syntax of ?

catalog.schema.function

I assume.. but spell that out..

the catalog. You can qualify the function name with a schema name, or with
catalog and schema names::

SELECT * FROM schema_name.my_function(1, 100)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

really.. I assume that only works if you specified the current catalog with USE
does that also work for catalog.schema ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we now support path resolution (#11476).


Otherwise, the standard Trino name resolution is applied. The connection
between the function and the catalog must be identified, because the function
is executed by the corresponding connector. If the function is not registered
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is not registered ... that begs the question .. is there anything I have to do as a user?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm... that probably means the user misspelled something or they don't have the right catalog on the path. How could I word it?

within the SQL query. They can be used for working with external systems as
well as for enhancing Trino with capabilities going beyond the SQL standard.

Trino supports adding custom table functions. They are declared by connectors
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to add info about what table functions are there for each connector .. even if that is none

and then be explicit about linking to the dev guide to figure out how to add any


There are two conventions of passing arguments to a table function:

- arguments passed by name::
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should start with uppercase .. and maybe bold it .. whatever you think looks good in the output

@kasiafi
Copy link
Member Author

kasiafi commented May 13, 2022

@mosabua I applied your comments as another fixup commit. Please take a look.

Copy link
Member

@losipiuk losipiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

None yet

4 participants