@JingsongLi
Contributor

What is the purpose of the change

Add documentation for the DataGen, Print, and Blackhole connectors.

Verifying this change

[Four images attached; not reproduced here]

@flinkbot
Collaborator

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Automated Checks

Last check on commit cb675d7 (Thu Jun 11 13:45:42 UTC 2020)

✅no warnings

Mention the bot in a comment to re-run the automated checks.

Review Progress

  • ❓ 1. The [description] looks good.
  • ❓ 2. There is [consensus] that the contribution should go into Flink.
  • ❓ 3. Needs [attention] from.
  • ❓ 4. The change fits into the overall [architecture].
  • ❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.

Details
The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.

Bot commands
The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

@flinkbot
Collaborator

flinkbot commented Jun 11, 2020

CI report:

Bot commands
The @flinkbot bot supports the following commands:
  • @flinkbot run travis re-run the last Travis build
  • @flinkbot run azure re-run the last Azure build

Contributor

@sjwiesman left a comment

Mostly looks good, just a few nits.

How to create an Blackhole table
----------------

Although it doesn't make sense to define the fields of print table, you need to write them all in DDL.

I understand what you are trying to say, but the statement sounds like a put-down of Flink. Let's just drop it.

Suggested change
Although it doesn't make sense to define the fields of print table, you need to write them all in DDL.


Just like /dev/null device on Unix-like operating systems.

The Print connector is built-in.

Suggested change
The Print connector is built-in.
The Blackhole connector is built-in.

How to create an Print table
----------------

Although it doesn't make sense to define the fields of print table, you need to write them all in DDL.

Same as above, let's drop this.

Suggested change
Although it doesn't make sense to define the fields of print table, you need to write them all in DDL.

</div>
</div>

Another way is using [LIKE Clause]({{ site.baseurl }}/dev/table/sql/create.html#create-table).

Suggested change
Another way is using [LIKE Clause]({{ site.baseurl }}/dev/table/sql/create.html#create-table).
Alternatively, it may be based on an existing schema using the [LIKE Clause]({{ site.baseurl }}/dev/table/sql/create.html#create-table).

</div>
</div>

Another way is using [LIKE Clause]({{ site.baseurl }}/dev/table/sql/create.html#create-table).

Suggested change
Another way is using [LIKE Clause]({{ site.baseurl }}/dev/table/sql/create.html#create-table).
Alternatively, it may be based on an existing schema using the [LIKE Clause]({{ site.baseurl }}/dev/table/sql/create.html#create-table).

Contributor

@godfreyhe left a comment

Thanks for the contribution. I left some minor comments.

For each field, there are two ways to generate data:

- Random generator: default, you can specify random max and min values. For char/varchar/string, the length can be specified.
- Sequence generator: you can specify sequence start and end values.

Could you explain the behavior after the sequence reaches the end value?


default -> By default.
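
For reference, the two generators described in the quoted passage could be declared roughly as in the sketch below. The table and column names are hypothetical, and the option keys (`rows-per-second`, `fields.<name>.kind`, `fields.<name>.start`, `fields.<name>.end`, `fields.<name>.min`, `fields.<name>.max`, `fields.<name>.length`) are assumed from the DataGen connector documentation rather than quoted from this conversation.

```sql
-- Sketch only: names are hypothetical and the option keys are an assumption,
-- not text taken from this pull request.
CREATE TABLE datagen_source (
  id    BIGINT,  -- filled by the sequence generator
  score INT,     -- filled by the random generator (the default), with bounds
  name  STRING   -- filled by the random generator, with a fixed length
) WITH (
  'connector' = 'datagen',
  'rows-per-second' = '5',
  -- sequence generator: start and end values
  'fields.id.kind' = 'sequence',
  'fields.id.start' = '1',
  'fields.id.end' = '1000',
  -- random generator: min and max values
  'fields.score.min' = '1',
  'fields.score.max' = '100',
  -- random generator for string types: length
  'fields.name.length' = '10'
);
```

What happens once `id` reaches its end value is the open question raised above, so the sketch makes no claim about it.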


Four possible format options:

- PRINT_IDENTIFIER:taskId> output <- PRINT_IDENTIFIER provided, parallelism > 1

This list does not render well on the web page.

- taskId> output <- no PRINT_IDENTIFIER provided, parallelism > 1
- output <- no PRINT_IDENTIFIER provided, parallelism == 1

The output string format is "$RowKind(f0,f1,f2...)", example is: "+I(1,1)".

Could you explain RowKind here, or link to where it is already explained?
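
To make the quoted format options concrete, here is a minimal sketch of a print table with an identifier set. The table name, columns, and identifier value are hypothetical, and the option key `print-identifier` is assumed to be how PRINT_IDENTIFIER is configured; it is not spelled out in this conversation.

```sql
-- Sketch only: names are hypothetical; 'print-identifier' is an assumed option key.
CREATE TABLE print_table (
  f0 INT,
  f1 INT
) WITH (
  'connector' = 'print',
  'print-identifier' = 'orders'
);

-- Following the "PRINT_IDENTIFIER:taskId> output" pattern quoted above, an
-- inserted row (1, 1) would be expected to show up in the task log as
--   orders:<taskId>> +I(1,1)
-- when parallelism > 1, where +I is the short string for an insert RowKind.
```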

The Print connector is built-in.

How to create an Blackhole table
----------------

an Blackhole -> a Blackhole

{% highlight sql %}
CREATE TABLE blackhole_table () WITH ('connector' = 'blackhole')
LIKE source_table (EXCLUDING ALL)
{% endhighlight %}

The () can be dropped.
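
As a sketch of what this suggestion amounts to, the snippet below shows the LIKE form without the empty column list, next to the explicit-column form it replaces. `source_table` comes from the quoted snippet; the second table name and the column definitions are hypothetical.

```sql
-- LIKE form with the empty "()" dropped, as suggested above.
CREATE TABLE blackhole_table WITH ('connector' = 'blackhole')
LIKE source_table (EXCLUDING ALL);

-- Explicit form: every field is written out in the DDL (columns are hypothetical).
CREATE TABLE blackhole_table_explicit (
  f0 INT,
  f1 STRING
) WITH (
  'connector' = 'blackhole'
);
```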

The Datagen connector is built-in.

<span class="label label-danger">Attention</span> Not support complex types: Array, Map, Row. Please construct these types by computed column.


Not support complex types -> Complex types are not supported
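
The "construct these types by computed column" advice in the quoted line might look like the sketch below. Table and column names are hypothetical, and the exact expressions a given Flink version accepts in computed columns are an assumption here, not something confirmed in this conversation.

```sql
-- Sketch only: primitive columns are generated by the connector, and the
-- complex-typed columns are derived from them as computed columns.
CREATE TABLE datagen_complex (
  f0  INT,
  f1  STRING,
  arr AS ARRAY[f0, f0 + 1],  -- array built from a generated INT
  m   AS MAP[f1, f0]         -- map built from generated STRING and INT
) WITH (
  'connector' = 'datagen'
);
```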

<span class="label label-danger">Attention</span> Not support complex types: Array, Map, Row. Please construct these types by computed column.

How to create an Datagen table
----------------

an -> a

For each field, there are two ways to generate data:

- Random generator: default, you can specify random max and min values. For char/varchar/string, the length can be specified.
- Sequence generator: you can specify sequence start and end values.

default -> By default.

@JingsongLi
Contributor Author

Thanks @sjwiesman @godfreyhe @danny0405 for your review, updated.

</tbody>
</table>

The output string format is "$row_kind(f0,f1,f2...)", row_kind is the short string of [RowKind](https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/types/RowKind.html), example is: "+I(1,1)".

The URL should change with the Flink version. For example, for 1.11, flink-docs-master should be changed to flink-docs-release-1.11.


Use { site.baseurl }

Contributor Author

Changed to {{ site.baseurl }}

Contributor

@godfreyhe left a comment

LGTM


The Print connector is built-in.

<span class="label label-danger">Attention</span> Print sink print records in tasks, you need to observe the task log.

nit: Print sink print ... -> Print sink prints ...

@JingsongLi merged commit 97fc6b3 into apache:master on Jun 15, 2020
JingsongLi added a commit that referenced this pull request Jun 15, 2020
@JingsongLi
Contributor Author

Thanks all for your review, merged~

bigdata-ny pushed a commit to bigdata-ny/flink that referenced this pull request Jun 19, 2020
zhangjun0x01 pushed a commit to zhangjun0x01/flink that referenced this pull request Jul 8, 2020
@JingsongLi deleted the datagen_doc branch on November 5, 2020 09:45