
[FLINK-8381] [table] Document more flexible schema definition #5257

Closed
wants to merge 2 commits

Conversation

twalthr
Contributor

@twalthr twalthr commented Jan 8, 2018

What is the purpose of the change

Documentation for schema definition modes.

Contributor

@fhueske fhueske left a comment


Hi @twalthr, the description of the mapping modes is good, but I think it should be better integrated with the discussion of the different types below. The descriptions are outdated and should be updated as well.

Thanks, Fabian

@@ -802,7 +802,87 @@ val dsTuple: DataSet[(String, Int)] = tableEnv.toDataSet[(String, Int)](table)

### Mapping of Data Types to Table Schema

- Flink's DataStream and DataSet APIs support very diverse types, such as Tuples (built-in Scala and Flink Java tuples), POJOs, case classes, and atomic types. In the following we describe how the Table API converts these types into an internal row representation and show examples of converting a `DataStream` into a `Table`.
+ Flink's DataStream and DataSet APIs support very diverse types. Composite types such as Tuples (built-in Scala and Flink Java tuples), POJOs, Scala case classes, and Flink's Row type allow for nested data structures with multiple fields that can be accessed in table expressions. Other types are treated as atomic types. In the following, we describe how the Table API converts these types into an internal row representation and show examples of converting a `DataStream` into a `Table`.
Contributor


The description of the two modes is good, but the following sections for the different types were not updated.
I think we could describe Position-based Mapping and Name-based Mapping first and move the concrete code examples to the individual type sections. For example, for Tuples we would show position-based and name-based mappings in the same code example. This would also highlight the difference.
We should also double-check the text descriptions for the different types.


When defining a position-based mapping, the specified names must not exist in the input data type; otherwise, the API assumes that the mapping should happen based on the field names. If no field names are specified, the default field names and field order of the composite type are used, or `f0` for atomic types.
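The distinction above can be sketched with a short example. This is a hedged sketch, not part of the PR's diff: it assumes Flink's Scala Table API of that era, with an `env` (`StreamExecutionEnvironment`) and `tableEnv` (`StreamTableEnvironment`) already set up.

```scala
import org.apache.flink.api.scala._
import org.apache.flink.table.api.scala._

// A Scala tuple has the built-in field names _1 and _2.
val stream: DataStream[(Long, String)] = env.fromElements((12L, "hello"))

// Position-based mapping: "count" and "word" do not exist in the
// tuple type, so the fields are mapped by position.
val t1 = tableEnv.fromDataStream(stream, 'count, 'word)

// Name-based mapping: '_1 and '_2 match the tuple's field names, so
// the fields are mapped (and here reordered and renamed) by name.
val t2 = tableEnv.fromDataStream(stream, '_2 as 'word, '_1 as 'count)
```

Note how the same input type supports both modes; which one applies is decided by whether the given names exist in the input type.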

<div class="codetabs" markdown="1">
Contributor


Move and split code examples to the discussion of the individual types.


If no field names are specified, the default field names and field order of the composite type are used or `f0` for atomic types.
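The default-name rule can be illustrated as follows. Again a hedged sketch, not taken from the PR's diff, assuming Flink's Scala Table API with `env` and `tableEnv` in scope; the default names (`_1`, `_2`, ... for Scala tuples, `f0`, `f1`, ... for Flink Java tuples, `f0` for atomic types) follow the behavior this paragraph describes.

```scala
import org.apache.flink.api.scala._
import org.apache.flink.table.api.scala._

// Composite type with no field expressions:
// the tuple's default names and order are kept.
val pairs: DataStream[(Long, String)] = env.fromElements((1L, "a"))
val t1 = tableEnv.fromDataStream(pairs) // columns: _1, _2

// Atomic type with no field expressions:
// the single column gets the default name f0.
val ints: DataStream[Int] = env.fromElements(1, 2, 3)
val t2 = tableEnv.fromDataStream(ints) // column: f0
```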

<div class="codetabs" markdown="1">
Contributor


Move and split code examples to the discussion of the individual types.

@twalthr
Contributor Author

twalthr commented Jan 9, 2018

Thanks for your feedback @fhueske. I hope I could address all your comments. I will merge this now...
