[FLINK-22940][SQL-CLIENT] Make sql client column max width configurable #16245
Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
Automated Checks
Last check on commit c333291 (Thu Sep 23 17:51:49 UTC 2021)
Warnings:
Mention the bot in a comment to re-run the automated checks.
Review Progress
Please see the Pull Request Review Guide for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.
Bot commands
The @flinkbot bot supports the following commands:
|
…L_client_column_max_width_configurable
I think this PR contains too many changes... |
@@ -85,16 +94,6 @@ public CliChangelogResultView(CliClient client, ResultDescriptor resultDescripto | |||
return Arrays.copyOfRange(resultRow, 1, resultRow.length); | |||
} | |||
|
|||
@Override | |||
protected int computeColumnWidth(int idx) { |
To optimize the changelog table result view, I think it would be better to implement computeColumnWidth with the following code:
protected int computeColumnWidth(int idx) {
PrintUtils.columnWidthByType(column, ....)
}
Another benefit is that the constructor of CliResultView does not need to change, and subclasses of CliResultView can implement computeColumnWidth differently.
Hi @pensz.
The PR is indeed 200 lines. About half of it is documentation though, and the previous column width behavior was implemented differently in each of the 3 modes, so one factor that made the PR grow is my change to make them all depend on the same logic.
The computeColumnWidth(int idx) you refer to would be called for every row, which I think would be marginally slower. In this PR the widths are now initialized only once, in the constructor, which seems a natural place for performing initialization.
IMHO, adding methods like protected int computeColumnWidth(int idx) adds complexity, since it makes CliResultView diverge further from the tableau mode, and it adds a degree of freedom to the code which might not add much value, since I think using the same column width policy everywhere improves the user experience.
I second @sv3ndk's point here. computeColumnWidth was not a good design to begin with, because tableau mode doesn't use it. ResultDescriptor is a good place to put the initialized column widths, so that all modes share the same logic for getting the column widths.
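To illustrate the point about sharing one set of widths, here is a minimal sketch of a descriptor that holds the column widths once and two renderers that both read from it. All class and method names below are hypothetical stand-ins rather than the actual Flink classes, and the truncation is deliberately simplified (plain cut-off, no ellipsis).

```java
/** Hypothetical sketch: one descriptor owns the widths, every display mode reads them. */
final class SharedWidthsSketch {

    /** Stand-in for ResultDescriptor; widths are fixed at construction time. */
    static final class Descriptor {
        private final int[] columnWidths;

        Descriptor(int... columnWidths) {
            // Defensive copy so that views cannot change the shared widths.
            this.columnWidths = columnWidths.clone();
        }

        int columnWidth(int idx) {
            return columnWidths[idx];
        }

        int columnCount() {
            return columnWidths.length;
        }
    }

    /** Pads or truncates a cell to exactly the given width. */
    static String fit(String cell, int width) {
        if (cell.length() > width) {
            return cell.substring(0, width);
        }
        StringBuilder sb = new StringBuilder(cell);
        while (sb.length() < width) {
            sb.append(' ');
        }
        return sb.toString();
    }

    /** Tableau-like rendering: cells separated by vertical bars. */
    static String tableauRow(Descriptor d, String... cells) {
        StringBuilder sb = new StringBuilder("|");
        for (int i = 0; i < d.columnCount(); i++) {
            sb.append(' ').append(fit(cells[i], d.columnWidth(i))).append(" |");
        }
        return sb.toString();
    }

    /** Table-like rendering: cells separated by a single space. */
    static String tableRow(Descriptor d, String... cells) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < d.columnCount(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(fit(cells[i], d.columnWidth(i)));
        }
        return sb.toString();
    }
}
```

Both renderers call d.columnWidth(i), so a change to the width policy only has to happen where the descriptor is built, and no per-row width computation is needed.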
Thanks for the great contribution and nice videos @sv3ndk!
The changes look good in general. I think we need to add a test to verify it. You can add a test in flink-table/flink-sql-client/src/test/resources/sql/select.q that changes the max width using the SET command and executes the SELECT again.
private final boolean isTableauMode; | ||
|
||
private final boolean isStreamingMode; | ||
public final ReadableConfig config; |
We can add a public int getMaxColumnWidth() method, just like isStreamingMode, so that we don't need to expose the whole config. Users then don't need to remember the config name when using it, e.g. resultDescriptor.config.get(DISPLAY_MAX_COLUMN_WIDTH).
Thanks for the very kind and useful feedback @wuchong :)
I'll create getMaxColumnWidth() as you described and hide the config as a private field.
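A rough sketch of what such an accessor could look like is shown below. To keep it self-contained, a plain Map stands in for Flink's ReadableConfig, and the class name is hypothetical; the key uses the hyphenated form and the default of 30 mentioned elsewhere in this PR.

```java
import java.util.Map;

/** Hypothetical stand-in for ResultDescriptor: the config stays private behind a typed getter. */
final class ResultDescriptorSketch {

    // Assumption: hyphen-separated key and default of 30, as discussed in this PR.
    static final String MAX_COLUMN_WIDTH_KEY = "sql-client.display.max-column-width";
    static final int DEFAULT_MAX_COLUMN_WIDTH = 30;

    // A plain Map replaces ReadableConfig here; it is private, so callers never see it.
    private final Map<String, String> config;

    ResultDescriptorSketch(Map<String, String> config) {
        this.config = Map.copyOf(config);
    }

    /** Callers ask for the value they need instead of remembering the option name. */
    int getMaxColumnWidth() {
        String raw = config.get(MAX_COLUMN_WIDTH_KEY);
        return raw == null ? DEFAULT_MAX_COLUMN_WIDTH : Integer.parseInt(raw);
    }
}
```

With this shape, a view would call descriptor.getMaxColumnWidth() rather than descriptor.config.get(...), which keeps the option name in a single place.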
|
||
@Documentation.TableOption(execMode = Documentation.ExecMode.BATCH_STREAMING) | ||
public static final ConfigOption<Integer> DISPLAY_MAX_COLUMN_WIDTH = | ||
ConfigOptions.key("sql-client.display.max_column_width") |
It would be better to use sql-client.display.max-column-width; Flink configuration doesn't use _ as a separator.
Ah indeed, I'm not respecting the convention here, I'll fix it.
This query performs a bounded word count example. | ||
|
||
In *changelog mode*, the visualized changelog should be similar to: | ||
The documentation of the SQL client commands can be accessed as follows: |
Actually, the HELP list is not easy to maintain (and it's not complete either). It would be better to just add a link to the SQL page dev/table/sql/overview/, which shows all supported statements.
Ok, I'll remove the output of the HELP command and add the link you mentioned.
Maybe we could still keep the mention of the HELP command though? Even if it is incomplete, I find it useful to know that it exists.
Would it be useful if I opened another Jira ticket for updating the HELP output to make it accurate? For example, I recently learned of the existence of ADD JAR and SHOW JAR, which seem useful to know but are not documented yet AFAIK.
There is already a pending PR for this: #16060. It would be great if you could help review it.
Ok, great! I'll have a look at that one and review it.
name cnt | ||
Alice 1 | ||
Greg 1 | ||
Bob 2 |
Could you show this display example using tableau mode? I think the community tends toward making it the default mode in the future. The tableau mode is also more readable than table mode.
Yes, good idea.
I'll also set the execution mode to batch, to avoid showing the +I, -U and +U markers and keep the first example as simple as possible.
@wuchong , I added the unit test you suggested in |
@sv3ndk , this is as expected. This config option should only affect variable-length types in streaming mode to control the displayed characters. Fixed-length types and all types in batch mode should be able to be displayed in a deterministic column width. |
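The scoping described above can be sketched roughly as follows: fixed-length types get a deterministic width, and only variable-length types fall back to the configured maximum. The enum, the method and the widths other than 10 for DATE are illustrative assumptions, not the actual PrintUtils.columnWidthsByType() implementation.

```java
/** Hypothetical sketch of type-dependent column widths with a configurable cap. */
final class ColumnWidthSketch {

    /** Simplified column types; the real client works with Flink's logical types. */
    enum ColumnType { BOOLEAN, INT, DATE, STRING }

    /**
     * Returns the display width for a column.
     * Fixed-length types ignore maxColumnWidth; variable-length types use it.
     */
    static int widthFor(ColumnType type, String columnName, int maxColumnWidth) {
        int contentWidth;
        switch (type) {
            case BOOLEAN:
                contentWidth = 5; // assumed: length of "false"
                break;
            case INT:
                contentWidth = 11; // assumed: length of "-2147483648"
                break;
            case DATE:
                contentWidth = 10; // from the PR description, e.g. "2021-09-23"
                break;
            default:
                contentWidth = maxColumnWidth; // variable length: use the configured maximum
        }
        // Assumption: the column is never narrower than its header.
        return Math.max(contentWidth, columnName.length());
    }
}
```

Under these assumptions, raising the maximum from 30 to 50 widens a STRING column but leaves a DATE column at 10, which matches the behavior reported in the comment above.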
I think this PR is in good shape and can be merged now. It is up to you whether to also mention the scope of this new config option, e.g. variable-length types in streaming mode.
Besides, a side note: please do not use "git merge" to update branches, otherwise it's hard to track the commit changes. Please use "git rebase" instead. IntelliJ IDEA provides an easy tool to do git rebase, you can find the tool via |
I added one last commit clarifying the documentation of the new option as you suggested and decorated it with I'm taking note rebase is preferred over merge, sure, I'll use that in the future then. As far as I know I think as well the PR is ready to be merged, thanks again for the review and all the feed-back @wuchong :) |
LGTM.
What is the purpose of the change
As a resolution for FLINK-22940, this PR aims at replacing the hard-coded maximum display width currently used by the SQL client with a configurable one.
In the linked ticket we discussed allowing the user to set the max width as unlimited, but on second thought I find it hard to implement, since the resulting data may not be materialized, so we do not know the maximum width required by the data. I suggest we do not implement this as part of this ticket.
Brief change log
Aligned the column width strategy across all modes
In tableau mode, the column width is computed by PrintUtils.columnWidthsByType() and depends on the column type (e.g. 10 for a DATE).
In Table and Changelog modes however, all columns defaulted to MAX_COLUMN_WIDTH, as provided by computeColumnWidth(int idx), which led to wasted screen space, an inconsistent end-user experience and code that is harder to maintain.
In order to use the same logic for all modes, I removed the computeColumnWidth(int idx) methods and replaced them with an initialization of the column widths in the constructor that relies on the same PrintUtils.columnWidthsByType() method as the Tableau mode.
Added a new option sql-client.display.max_column_width to SqlClientOptions
I defined the default value as 30, which is the current hard-coded MAX_COLUMN_WIDTH.
Added an instance of ReadableConfig as a member of ResultDescriptor
ResultDescriptor already has 2 methods providing access to configuration: isTableauMode() and isStreamingMode(), that were previously initialized explicitly by the caller.
In order to simplify the initialization of ResultDescriptor and allow access to all configuration keys, I placed an instance of ReadableConfig directly inside it.
Since this class is read-only, I made it public for simplicity.
Updated the view result in all modes to use the new parameter
In any call to PrintUtils.columnWidthsByType() from within the SQL client, the logic now depends on the new config parameter, obtained via the ResultDescriptor instance.
Updated documentation
I updated the display example for each mode in the documentation and re-organized the structure a bit in order to move all configuration-related aspects into the "Configuration" section.
Fixed a few deprecation warnings
Some unit tests were using org.junit.Assert.assertThat, which is now deprecated in favour of org.hamcrest.MatcherAssert.assertThat, so I replaced it.
Verifying this change
See the 2 videos, showing the old behavior (including inconsistencies of column width across modes) and the new behavior (including usage of the new parameter).
old_behavior.mp4
new_behavior.mp4
I also validated the existing tests and style with:
And validated the documentation updates with
hugo -b "" serve
Does this pull request potentially affect one of the following parts:
@Public(Evolving): no
Documentation