From 7053cf5541922e438edfc7709e3f17e7fca78db6 Mon Sep 17 00:00:00 2001
From: Dan Halperin
Date: Tue, 13 Dec 2016 07:49:19 -0800
Subject: [PATCH 1/2] checkstyle: improve Javadoc checking

This is a spiritual backport of
https://github.com/apache/incubator-beam/pull/1060

Same checkstyle.xml changes, very similar fixes but recreated because of
code divergence
---
 checkstyle.xml                                |  3 ++
 .../examples/cookbook/TriggerExample.java     | 30 +++++++++----------
 .../examples/complete/game/GameStats.java     |  8 ++---
 .../complete/game/HourlyTeamScore.java        |  6 ++--
 .../examples/complete/game/LeaderBoard.java   | 10 +++----
 .../examples/complete/game/UserScore.java     |  8 ++---
 .../complete/game/injector/Injector.java      | 10 +++----
 .../src/main/java/DebuggingWordCount.java     |  2 +-
 .../src/main/java/WindowedWordCount.java      |  2 +-
 .../src/main/java/WordCount.java              |  4 +--
 .../java/common/DataflowExampleUtils.java     |  2 +-
 .../common/ExampleBigQueryTableOptions.java   |  2 +-
 .../common/ExamplePubsubTopicOptions.java     |  2 +-
 .../cloud/dataflow/sdk/io/AvroSource.java     |  1 +
 .../cloud/dataflow/sdk/io/BigQueryIO.java     |  5 ++++
 .../cloud/dataflow/sdk/io/BoundedSource.java  | 14 ++++++---
 .../cloud/dataflow/sdk/io/DatastoreIO.java    |  2 +-
 .../cloud/dataflow/sdk/io/FileBasedSink.java  |  1 +
 .../dataflow/sdk/io/OffsetBasedSource.java    |  2 ++
 .../cloud/dataflow/sdk/io/PubsubIO.java       |  1 +
 .../google/cloud/dataflow/sdk/io/Sink.java    |  3 ++
 .../google/cloud/dataflow/sdk/io/TextIO.java  | 11 +++----
 .../google/cloud/dataflow/sdk/io/Write.java   |  2 +-
 .../sdk/io/datastore/DatastoreIO.java         |  2 +-
 .../sdk/io/datastore/DatastoreV1.java         |  2 +-
 .../dataflow/sdk/io/range/RangeTracker.java   |  2 ++
 .../options/DataflowPipelineDebugOptions.java |  3 +-
 .../dataflow/sdk/options/PipelineOptions.java |  6 ++--
 .../sdk/options/PipelineOptionsReflector.java |  2 +-
 .../sdk/options/ProxyInvocationHandler.java   |  4 +--
 .../sdk/runners/dataflow/AssignWindows.java   |  6 ++--
 .../sdk/runners/inprocess/EvaluatorKey.java   |  4 +--
 .../ExecutorServiceParallelExecutor.java      |  2 +-
 .../inprocess/InMemoryWatermarkManager.java   |  4 +--
 .../runners/inprocess/InProcessCreate.java    |  4 +--
 .../inprocess/InProcessEvaluationContext.java |  2 +-
 .../inprocess/InProcessExecutionContext.java  |  4 +--
 .../inprocess/InProcessPipelineRunner.java    | 13 ++++----
 .../inprocess/InProcessTransformResult.java   |  4 +--
 .../runners/inprocess/TransformEvaluator.java |  2 +-
 .../inprocess/ViewEvaluatorFactory.java       |  4 +--
 .../sdk/runners/worker/IsmFormat.java         |  6 ++--
 .../sdk/testing/SerializableMatchers.java     |  5 ++--
 .../dataflow/sdk/testing/SourceTestUtils.java |  3 +-
 .../sdk/transforms/ApproximateQuantiles.java  |  2 +-
 .../dataflow/sdk/transforms/Combine.java      |  6 ++--
 .../dataflow/sdk/transforms/CombineFns.java   | 12 +++-----
 .../sdk/transforms/CombineWithContext.java    |  4 +--
 .../cloud/dataflow/sdk/transforms/DoFn.java   |  2 +-
 .../dataflow/sdk/transforms/PTransform.java   |  8 ++---
 .../cloud/dataflow/sdk/transforms/ParDo.java  |  2 +-
 .../sdk/transforms/RemoveDuplicates.java      |  7 +++--
 .../dataflow/sdk/transforms/WithKeys.java     |  2 +-
 .../sdk/transforms/display/DisplayData.java   |  8 ++---
 .../transforms/windowing/AfterWatermark.java  |  4 +--
 .../sdk/transforms/windowing/Never.java       |  5 ++--
 .../sdk/transforms/windowing/PaneInfo.java    |  4 +--
 .../sdk/transforms/windowing/Window.java      |  2 +-
 .../cloud/dataflow/sdk/util/AvroUtils.java    |  2 +-
 .../sdk/util/BaseExecutionContext.java        |  9 +++---
 .../dataflow/sdk/util/BigQueryServices.java   |  2 +-
 .../cloud/dataflow/sdk/util/DoFnRunner.java   |  2 +-
 .../dataflow/sdk/util/DoFnRunnerBase.java     |  2 +-
 .../util/ExposedByteArrayOutputStream.java    |  1 +
 .../sdk/util/PerKeyCombineFnRunners.java      |  4 +--
 .../cloud/dataflow/sdk/util/PubsubClient.java |  3 ++
 .../dataflow/sdk/util/PubsubTestClient.java   |  2 +-
 .../dataflow/sdk/util/RandomAccessData.java   |  8 ++---
 .../dataflow/sdk/util/TimerInternals.java     |  2 +-
 .../dataflow/sdk/util/common/Counter.java     |  4 +--
 .../cloud/dataflow/sdk/values/PInput.java     |  2 +-
 .../sdk/testing/SystemNanoTimeSleeper.java    |  2 +-
 72 files changed, 176 insertions(+), 152 deletions(-)

diff --git a/checkstyle.xml b/checkstyle.xml
index e6f4f2af87..3d99556795 100644
--- a/checkstyle.xml
+++ b/checkstyle.xml
@@ -180,6 +180,9 @@
 page at http://checkstyle.sourceforge.net/config.html -->
+    <module name="JavadocParagraph">
+      <property name="allowNewlineParagraph" value="false"/>
+    </module>

diff --git a/examples/src/main/java/com/google/cloud/dataflow/examples/cookbook/TriggerExample.java b/examples/src/main/java/com/google/cloud/dataflow/examples/cookbook/TriggerExample.java
index ce5e08e7d2..b40d2b6f6e 100644
--- a/examples/src/main/java/com/google/cloud/dataflow/examples/cookbook/TriggerExample.java
+++ b/examples/src/main/java/com/google/cloud/dataflow/examples/cookbook/TriggerExample.java
@@ -66,11 +66,11 @@
 * data into {@link Window windows} to be processed, and demonstrates using various kinds of {@link
 * Trigger triggers} to control when the results for each window are emitted.
 *
- * <p> This example uses a portion of real traffic data from San Diego freeways. It contains
+ * <p>This example uses a portion of real traffic data from San Diego freeways. It contains
 * readings from sensor stations set up along each freeway. Each sensor reading includes a
 * calculation of the 'total flow' across all lanes in that freeway direction.
 *
- * <p> Concepts:
+ * <p>Concepts:
 * <pre>
 *   1. The default triggering behavior
 *   2. Late data with the default trigger
@@ -78,14 +78,14 @@
  *   4. Combining late data and speculative estimates
 * </pre>
 *
- * <p> Before running this example, it will be useful to familiarize yourself with Dataflow triggers
+ * <p>Before running this example, it will be useful to familiarize yourself with Dataflow triggers
 * and understand the concept of 'late data',
 * See: https://cloud.google.com/dataflow/model/triggers and
 * https://cloud.google.com/dataflow/model/windowing#Advanced
 *
- * <p> The example pipeline reads data from a Pub/Sub topic. By default, running the example will
+ * <p>The example pipeline reads data from a Pub/Sub topic. By default, running the example will
 * also run an auxiliary pipeline to inject data from the default {@code --input} file to the
 * {@code --pubsubTopic}. The auxiliary pipeline puts a timestamp on the injected data so that the
 * example pipeline can operate on event time (rather than arrival time). The auxiliary
@@ -94,24 +94,24 @@
 * choosing or set {@code --input=""} which will disable the automatic Pub/Sub injection, and allow
 * you to use a separate tool to publish to the given topic.
 *
- * <p> The example is configured to use the default Pub/Sub topic and the default BigQuery table
+ * <p>The example is configured to use the default Pub/Sub topic and the default BigQuery table
 * from the example common package (there are no defaults for a general Dataflow pipeline).
 * You can override them by using the {@code --pubsubTopic}, {@code --bigQueryDataset}, and
 * {@code --bigQueryTable} options. If the Pub/Sub topic or the BigQuery table do not exist,
 * the example will try to create them.
 *
- * <p> The pipeline outputs its results to a BigQuery table.
+ * <p>The pipeline outputs its results to a BigQuery table.
 * Here are some queries you can use to see interesting results:
 * Replace {@code <enter_table_name>} in the query below with the name of the BigQuery table.
 * Replace {@code <window_interval>} in the query below with the window interval.
 *

- * <p> To see the results of the default trigger,
+ * <p>To see the results of the default trigger,
 * Note: When you start up your pipeline, you'll initially see results from 'late' data. Wait after
 * the window duration, until the first pane of non-late data has been emitted, to see more
 * interesting results.
 * {@code SELECT * FROM enter_table_name WHERE trigger_type = "default" ORDER BY window DESC}
 *
- * <p> To see the late data, i.e. data dropped by the default trigger,
+ * <p>To see the late data, i.e. data dropped by the default trigger,
 * {@code SELECT * FROM <enter_table_name> WHERE trigger_type = "withAllowedLateness" and
 * (timing = "LATE" or timing = "ON_TIME") and freeway = "5" ORDER BY window DESC, processing_time}
 *
@@ -120,23 +120,23 @@
 * (trigger_type = "withAllowedLateness" or trigger_type = "sequential") and freeway = "5" ORDER BY
 * window DESC, processing_time}
 *
- * <p> To see speculative results every minute,
+ * <p>To see speculative results every minute,
 * {@code SELECT * FROM <enter_table_name> WHERE trigger_type = "speculative" and freeway = "5"
 * ORDER BY window DESC, processing_time}
 *
- * <p> To see speculative results every five minutes after the end of the window
+ * <p>To see speculative results every five minutes after the end of the window
 * {@code SELECT * FROM <enter_table_name> WHERE trigger_type = "sequential" and timing != "EARLY"
 * and freeway = "5" ORDER BY window DESC, processing_time}
 *
- * <p> To see the first and the last pane for a freeway in a window for all the trigger types,
+ * <p>To see the first and the last pane for a freeway in a window for all the trigger types,
 * {@code SELECT * FROM <enter_table_name> WHERE (isFirst = true or isLast = true) ORDER BY window}
 *
- * <p> To reduce the number of results for each query we can add additional where clauses.
+ * <p>To reduce the number of results for each query we can add additional where clauses.
 * For example, to see the results of the default trigger,
 * {@code SELECT * FROM <enter_table_name> WHERE trigger_type = "default" AND freeway = "5" AND
 * window = "<window_interval>"}
 *
- * <p> The example will try to cancel the pipelines on the signal to terminate the process (CTRL-C)
+ * <p>The example will try to cancel the pipelines on the signal to terminate the process (CTRL-C)
 * and then exit.
 */
@@ -170,13 +170,13 @@ public class TriggerExample {
 * 5 | 60 | 10:27:20 | 10:27:25
 * 5 | 60 | 10:29:00 | 11:11:00
 *
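
As an illustration of the triggering strategies compared in the queries above, a pipeline in the spirit of TriggerExample might configure the "sequential" variant roughly as follows. This is a sketch against the Dataflow SDK 1.x windowing API; the 30-minute window, the five-minute early-firing delay, and the one-day allowed lateness are illustrative values, not the example's exact constants:

    import com.google.cloud.dataflow.sdk.transforms.windowing.AfterProcessingTime;
    import com.google.cloud.dataflow.sdk.transforms.windowing.AfterWatermark;
    import com.google.cloud.dataflow.sdk.transforms.windowing.FixedWindows;
    import com.google.cloud.dataflow.sdk.transforms.windowing.Window;
    import com.google.cloud.dataflow.sdk.values.KV;
    import com.google.cloud.dataflow.sdk.values.PCollection;
    import org.joda.time.Duration;

    // On-time panes fire when the watermark passes the end of the window,
    // early (speculative) panes fire every five minutes, and late panes fire
    // for data arriving up to one day behind the watermark.
    static PCollection<KV<String, Integer>> sequentialTriggered(
        PCollection<KV<String, Integer>> flowInfo) {
      return flowInfo.apply(
          Window.<KV<String, Integer>>into(FixedWindows.of(Duration.standardMinutes(30)))
              .triggering(AfterWatermark.pastEndOfWindow()
                  .withEarlyFirings(AfterProcessingTime.pastFirstElementInPane()
                      .plusDelayOf(Duration.standardMinutes(5)))
                  .withLateFirings(AfterProcessingTime.pastFirstElementInPane()))
              .withAllowedLateness(Duration.standardDays(1))
              .accumulatingFiredPanes());
    }

Swapping accumulatingFiredPanes() for discardingFiredPanes() is what distinguishes the accumulation-mode comparison in the queries above.
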

- * <p> Dataflow tracks a watermark which records up to what point in event time the data is
+ * <p>Dataflow tracks a watermark which records up to what point in event time the data is
 * complete. For the purposes of the example, we'll assume the watermark is approximately 15m
 * behind the current processing time. In practice, the actual value would vary over time based
 * on the system's knowledge of the current PubSub delay and contents of the backlog (data
 * that has not yet been processed).
 *
- * <p> If the watermark is 15m behind, then the window [10:00:00, 10:30:00) (in event time) would
+ * <p>If the watermark is 15m behind, then the window [10:00:00, 10:30:00) (in event time) would
 * close at 10:44:59, when the watermark passes 10:30:00.
 */
 static class CalculateTotalFlow

diff --git a/examples/src/main/java8/com/google/cloud/dataflow/examples/complete/game/GameStats.java b/examples/src/main/java8/com/google/cloud/dataflow/examples/complete/game/GameStats.java
index cc77997bef..63b875af02 100644
--- a/examples/src/main/java8/com/google/cloud/dataflow/examples/complete/game/GameStats.java
+++ b/examples/src/main/java8/com/google/cloud/dataflow/examples/complete/game/GameStats.java
@@ -65,21 +65,21 @@
 * New concepts: session windows and finding session duration; use of both
 * singleton and non-singleton side inputs.
 *

- * <p> This pipeline builds on the {@link LeaderBoard} functionality, and adds some "business
+ * <p>This pipeline builds on the {@link LeaderBoard} functionality, and adds some "business
 * intelligence" analysis: abuse detection and usage patterns. The pipeline derives the Mean user
 * score sum for a window, and uses that information to identify likely spammers/robots. (The robots
 * have a higher click rate than the human users). The 'robot' users are then filtered out when
 * calculating the team scores.
 *
- * <p> Additionally, user sessions are tracked: that is, we find bursts of user activity using
+ * <p>Additionally, user sessions are tracked: that is, we find bursts of user activity using
 * session windows. Then, the mean session duration information is recorded in the context of
 * subsequent fixed windowing. (This could be used to tell us what games are giving us greater
 * user retention).
 *
- * <p> Run {@link injector.Injector} to generate pubsub data for this pipeline. The Injector
+ * <p>Run {@link injector.Injector} to generate pubsub data for this pipeline. The Injector
 * documentation provides more detail.
 *
- * <p> To execute this pipeline using the Dataflow service, specify the pipeline configuration
+ * <p>To execute this pipeline using the Dataflow service, specify the pipeline configuration
 * like this:
 * <pre>{@code
 *   --project=YOUR_PROJECT_ID
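
The session analysis described in this file's Javadoc might wire up its windowing roughly like the following sketch (Dataflow SDK 1.x API; the gap duration is illustrative):

    import com.google.cloud.dataflow.sdk.transforms.windowing.Sessions;
    import com.google.cloud.dataflow.sdk.transforms.windowing.Window;
    import com.google.cloud.dataflow.sdk.values.KV;
    import com.google.cloud.dataflow.sdk.values.PCollection;
    import org.joda.time.Duration;

    // Group each user's events into session windows separated by gapMinutes of
    // inactivity; downstream transforms can then measure session duration.
    static PCollection<KV<String, Integer>> inSessionWindows(
        PCollection<KV<String, Integer>> userEvents, int gapMinutes) {
      return userEvents.apply(
          Window.<KV<String, Integer>>into(
              Sessions.withGapDuration(Duration.standardMinutes(gapMinutes))));
    }
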
diff --git a/examples/src/main/java8/com/google/cloud/dataflow/examples/complete/game/HourlyTeamScore.java b/examples/src/main/java8/com/google/cloud/dataflow/examples/complete/game/HourlyTeamScore.java
index ff9b247b01..82904c2f18 100644
--- a/examples/src/main/java8/com/google/cloud/dataflow/examples/complete/game/HourlyTeamScore.java
+++ b/examples/src/main/java8/com/google/cloud/dataflow/examples/complete/game/HourlyTeamScore.java
@@ -46,7 +46,7 @@
  * domain, following {@link UserScore}. In addition to the concepts introduced in {@link UserScore},
  * new concepts include: windowing and element timestamps; use of {@code Filter.byPredicate()}.
  *
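
The {@code Filter.byPredicate()} usage mentioned above might look like this sketch, which drops events stamped at or after a cutoff in the spirit of the {@code --stopMin} option discussed below. GameEvent is a hypothetical element type with a millisecond event timestamp; the SDK 1.x Filter transform accepts a Java 8 lambda as a SerializableFunction:

    import com.google.cloud.dataflow.sdk.transforms.Filter;
    import com.google.cloud.dataflow.sdk.values.PCollection;
    import org.joda.time.Instant;

    // Keep only events that occurred strictly before stopTime.
    static PCollection<GameEvent> beforeCutoff(
        PCollection<GameEvent> events, final Instant stopTime) {
      return events.apply("FilterEndTime", Filter.byPredicate(
          (GameEvent e) -> e.getTimestampMillis() < stopTime.getMillis()));
    }
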
- * <p> This pipeline processes data collected from gaming events in batch, building on {@link
+ * <p>This pipeline processes data collected from gaming events in batch, building on {@link
 * UserScore} but using fixed windows. It calculates the sum of scores per team, for each window,
 * optionally allowing specification of two timestamps before and after which data is filtered out.
 * This allows a model where late data collected after the intended analysis window can be included,
@@ -55,7 +55,7 @@
 * {@link UserScore} pipeline. However, our batch processing is high-latency, in that we don't get
 * results from plays at the beginning of the batch's time period until the batch is processed.
 *
- * <p> To execute this pipeline using the Dataflow service, specify the pipeline configuration
+ * <p>To execute this pipeline using the Dataflow service, specify the pipeline configuration
 * like this:
 * <pre>{@code
 *   --project=YOUR_PROJECT_ID
@@ -66,7 +66,7 @@
 * }</pre>
 * where the BigQuery dataset you specify must already exist.
 *
- * <p> Optionally include {@code --input} to specify the batch input file path.
+ * <p>Optionally include {@code --input} to specify the batch input file path.
 * To indicate a time after which the data should be filtered out, include the
 * {@code --stopMin} arg. E.g., {@code --stopMin=2015-10-18-23-59} indicates that any data
 * timestamped after 23:59 PST on 2015-10-18 should not be included in the analysis.

diff --git a/examples/src/main/java8/com/google/cloud/dataflow/examples/complete/game/LeaderBoard.java b/examples/src/main/java8/com/google/cloud/dataflow/examples/complete/game/LeaderBoard.java
index f332a2da10..ae1491d136 100644
--- a/examples/src/main/java8/com/google/cloud/dataflow/examples/complete/game/LeaderBoard.java
+++ b/examples/src/main/java8/com/google/cloud/dataflow/examples/complete/game/LeaderBoard.java
@@ -57,26 +57,26 @@
 * early/speculative results; using .accumulatingFiredPanes() to do cumulative processing of late-
 * arriving data.
 *

- * <p> This pipeline processes an unbounded stream of 'game events'. The calculation of the team
+ * <p>This pipeline processes an unbounded stream of 'game events'. The calculation of the team
 * scores uses fixed windowing based on event time (the time of the game play event), not
 * processing time (the time that an event is processed by the pipeline). The pipeline calculates
 * the sum of scores per team, for each window. By default, the team scores are calculated using
 * one-hour windows.
 *
- * <p> In contrast-- to demo another windowing option-- the user scores are calculated using a
+ * <p>In contrast-- to demo another windowing option-- the user scores are calculated using a
 * global window, which periodically (every ten minutes) emits cumulative user score sums.
 *
- * <p> In contrast to the previous pipelines in the series, which used static, finite input data,
+ * <p>In contrast to the previous pipelines in the series, which used static, finite input data,
 * here we're using an unbounded data source, which lets us provide speculative results, and allows
 * handling of late data, at much lower latency. We can use the early/speculative results to keep a
 * 'leaderboard' updated in near-realtime. Our handling of late data lets us generate correct
 * results, e.g. for 'team prizes'. We're now outputting window results as they're
 * calculated, giving us much lower latency than with the previous batch examples.
 *
- * <p> Run {@link injector.Injector} to generate pubsub data for this pipeline. The Injector
+ * <p>Run {@link injector.Injector} to generate pubsub data for this pipeline. The Injector
 * documentation provides more detail on how to do this.
 *
- * <p> To execute this pipeline using the Dataflow service, specify the pipeline configuration
+ * <p>To execute this pipeline using the Dataflow service, specify the pipeline configuration
 * like this:
 * <pre>{@code
 *   --project=YOUR_PROJECT_ID
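
The globally windowed, periodically firing user-score path described in this file's Javadoc corresponds to a configuration along these lines (a sketch; the allowed lateness value is illustrative, the ten-minute period comes from the text):

    import com.google.cloud.dataflow.sdk.transforms.windowing.AfterProcessingTime;
    import com.google.cloud.dataflow.sdk.transforms.windowing.GlobalWindows;
    import com.google.cloud.dataflow.sdk.transforms.windowing.Repeatedly;
    import com.google.cloud.dataflow.sdk.transforms.windowing.Window;
    import com.google.cloud.dataflow.sdk.values.KV;
    import com.google.cloud.dataflow.sdk.values.PCollection;
    import org.joda.time.Duration;

    // One global window that emits a cumulative pane every ten minutes.
    static PCollection<KV<String, Integer>> cumulativeUserScores(
        PCollection<KV<String, Integer>> userScores) {
      return userScores.apply(
          Window.<KV<String, Integer>>into(new GlobalWindows())
              .triggering(Repeatedly.forever(
                  AfterProcessingTime.pastFirstElementInPane()
                      .plusDelayOf(Duration.standardMinutes(10))))
              .accumulatingFiredPanes()
              .withAllowedLateness(Duration.standardDays(1)));
    }
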
diff --git a/examples/src/main/java8/com/google/cloud/dataflow/examples/complete/game/UserScore.java b/examples/src/main/java8/com/google/cloud/dataflow/examples/complete/game/UserScore.java
index ae4f8f5d06..f0215cf8aa 100644
--- a/examples/src/main/java8/com/google/cloud/dataflow/examples/complete/game/UserScore.java
+++ b/examples/src/main/java8/com/google/cloud/dataflow/examples/complete/game/UserScore.java
@@ -49,15 +49,15 @@
  * BigQuery; using standalone DoFns; use of the sum by key transform; examples of
  * Java 8 lambda syntax.
  *
- * <p> In this gaming scenario, many users play, as members of different teams, over the course of a
+ * <p>In this gaming scenario, many users play, as members of different teams, over the course of a
 * day, and their actions are logged for processing. Some of the logged game events may be late-
 * arriving, if users play on mobile devices and go transiently offline for a period.
 *
- * <p> This pipeline does batch processing of data collected from gaming events. It calculates the
+ * <p>This pipeline does batch processing of data collected from gaming events. It calculates the
 * sum of scores per user, over an entire batch of gaming data (collected, say, for each day). The
 * batch processing will not include any late data that arrives after the day's cutoff point.
 *
- * <p> To execute this pipeline using the Dataflow service and static example input data, specify
+ * <p>To execute this pipeline using the Dataflow service and static example input data, specify
 * the pipeline configuration like this:
 * <pre>{@code
 *   --project=YOUR_PROJECT_ID
@@ -68,7 +68,7 @@
 * }</pre>
 * where the BigQuery dataset you specify must already exist.
 *
- * <p> Optionally include the --input argument to specify a batch input file.
+ * <p>Optionally include the --input argument to specify a batch input file.
 * See the --input default value for example batch data file, or use {@link injector.Injector} to
 * generate your own batch data.
 */

diff --git a/examples/src/main/java8/com/google/cloud/dataflow/examples/complete/game/injector/Injector.java b/examples/src/main/java8/com/google/cloud/dataflow/examples/complete/game/injector/Injector.java
index 6d7b901544..39584f403a 100644
--- a/examples/src/main/java8/com/google/cloud/dataflow/examples/complete/game/injector/Injector.java
+++ b/examples/src/main/java8/com/google/cloud/dataflow/examples/complete/game/injector/Injector.java
@@ -41,22 +41,22 @@
 * This is a generator that simulates usage data from a mobile game, and either publishes the data
 * to a pubsub topic or writes it to a file.
 *

- * <p> The general model used by the generator is the following. There is a set of teams with team
+ * <p>The general model used by the generator is the following. There is a set of teams with team
 * members. Each member is scoring points for their team. After some period, a team will dissolve
 * and a new one will be created in its place. There is also a set of 'Robots', or spammer users.
 * They hop from team to team. The robots are set to have a higher 'click rate' (generate more
 * events) than the regular team members.
 *
- * <p> Each generated line of data has the following form:
+ * <p>Each generated line of data has the following form:
 * username,teamname,score,timestamp_in_ms,readable_time
 * e.g.:
 * user2_AsparagusPig,AsparagusPig,10,1445230923951,2015-11-02 09:09:28.224
 *
- * <p> The Injector writes either to a PubSub topic, or a file. It will use the PubSub topic if
+ * <p>The Injector writes either to a PubSub topic, or a file. It will use the PubSub topic if
 * specified. It takes the following arguments:
 * {@code Injector project-name (topic-name|none) (filename|none)}.
 *
- * <p> To run the Injector in the mode where it publishes to PubSub, you will need to authenticate
+ * <p>To run the Injector in the mode where it publishes to PubSub, you will need to authenticate
 * locally using project-based service account credentials to avoid running over PubSub
 * quota.
 * See https://developers.google.com/identity/protocols/application-default-credentials
@@ -75,7 +75,7 @@
 *
 * The pubsub topic will be created if it does not exist.
 *
- * <p> To run the injector in write-to-file-mode, set the topic name to "none" and specify the
+ * <p>To run the injector in write-to-file-mode, set the topic name to "none" and specify the
 * filename:
 * <pre>{@code
 * Injector <project-name> none <filename>
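
A consumer of the generated data could split a line in the format shown above back into its fields as in this sketch (a hypothetical helper; the Injector itself only writes this format, it does not parse it):

    // Parses "username,teamname,score,timestamp_in_ms,readable_time".
    // The limit of 5 keeps any commas in the readable time inside the last field.
    static String[] parseEventLine(String line) {
      String[] parts = line.split(",", 5);
      String user = parts[0];
      String team = parts[1];
      int score = Integer.parseInt(parts[2]);
      long timestampMs = Long.parseLong(parts[3]);
      return new String[] {user, team, Integer.toString(score), Long.toString(timestampMs)};
    }
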
diff --git a/maven-archetypes/examples/src/main/resources/archetype-resources/src/main/java/DebuggingWordCount.java b/maven-archetypes/examples/src/main/resources/archetype-resources/src/main/java/DebuggingWordCount.java
index d0fe12611c..0d5afdef12 100644
--- a/maven-archetypes/examples/src/main/resources/archetype-resources/src/main/java/DebuggingWordCount.java
+++ b/maven-archetypes/examples/src/main/resources/archetype-resources/src/main/java/DebuggingWordCount.java
@@ -158,7 +158,7 @@ public void processElement(ProcessContext c) {
    * 

Inherits standard configuration options and all options defined in * {@link WordCount.WordCountOptions}. */ - public static interface WordCountOptions extends WordCount.WordCountOptions { + public interface WordCountOptions extends WordCount.WordCountOptions { @Description("Regex filter pattern to use in DebuggingWordCount. " + "Only words matching this pattern will be counted.") diff --git a/maven-archetypes/examples/src/main/resources/archetype-resources/src/main/java/WindowedWordCount.java b/maven-archetypes/examples/src/main/resources/archetype-resources/src/main/java/WindowedWordCount.java index 29921e235f..48b54069c4 100644 --- a/maven-archetypes/examples/src/main/resources/archetype-resources/src/main/java/WindowedWordCount.java +++ b/maven-archetypes/examples/src/main/resources/archetype-resources/src/main/java/WindowedWordCount.java @@ -178,7 +178,7 @@ private static TableReference getTableReference(Options options) { * table and the PubSub topic, as well as the {@link WordCount.WordCountOptions} support for * specification of the input file. */ - public static interface Options + public interface Options extends WordCount.WordCountOptions, DataflowExampleUtils.DataflowExampleUtilsOptions { @Description("Fixed window duration, in minutes") @Default.Integer(WINDOW_SIZE) diff --git a/maven-archetypes/examples/src/main/resources/archetype-resources/src/main/java/WordCount.java b/maven-archetypes/examples/src/main/resources/archetype-resources/src/main/java/WordCount.java index 150b60d2d2..c787585e25 100644 --- a/maven-archetypes/examples/src/main/resources/archetype-resources/src/main/java/WordCount.java +++ b/maven-archetypes/examples/src/main/resources/archetype-resources/src/main/java/WordCount.java @@ -158,7 +158,7 @@ public PCollection> apply(PCollection lines) { * *

Inherits standard configuration options. */ - public static interface WordCountOptions extends PipelineOptions { + public interface WordCountOptions extends PipelineOptions { @Description("Path of the file to read from") @Default.String("gs://dataflow-samples/shakespeare/kinglear.txt") String getInputFile(); @@ -172,7 +172,7 @@ public static interface WordCountOptions extends PipelineOptions { /** * Returns "gs://${YOUR_STAGING_DIRECTORY}/counts.txt" as the default destination. */ - public static class OutputFactory implements DefaultValueFactory { + class OutputFactory implements DefaultValueFactory { @Override public String create(PipelineOptions options) { DataflowPipelineOptions dataflowOptions = options.as(DataflowPipelineOptions.class); diff --git a/maven-archetypes/examples/src/main/resources/archetype-resources/src/main/java/common/DataflowExampleUtils.java b/maven-archetypes/examples/src/main/resources/archetype-resources/src/main/java/common/DataflowExampleUtils.java index ccf1f9e8b9..e663224db9 100644 --- a/maven-archetypes/examples/src/main/resources/archetype-resources/src/main/java/common/DataflowExampleUtils.java +++ b/maven-archetypes/examples/src/main/resources/archetype-resources/src/main/java/common/DataflowExampleUtils.java @@ -67,7 +67,7 @@ public class DataflowExampleUtils { /** * Define an interface that supports the PubSub and BigQuery example options. */ - public static interface DataflowExampleUtilsOptions + public interface DataflowExampleUtilsOptions extends DataflowExampleOptions, ExamplePubsubTopicOptions, ExampleBigQueryTableOptions { } diff --git a/maven-archetypes/examples/src/main/resources/archetype-resources/src/main/java/common/ExampleBigQueryTableOptions.java b/maven-archetypes/examples/src/main/resources/archetype-resources/src/main/java/common/ExampleBigQueryTableOptions.java index bef5bfdd83..45546b1717 100644 --- a/maven-archetypes/examples/src/main/resources/archetype-resources/src/main/java/common/ExampleBigQueryTableOptions.java +++ b/maven-archetypes/examples/src/main/resources/archetype-resources/src/main/java/common/ExampleBigQueryTableOptions.java @@ -43,7 +43,7 @@ public interface ExampleBigQueryTableOptions extends DataflowPipelineOptions { /** * Returns the job name as the default BigQuery table name. */ - static class BigQueryTableFactory implements DefaultValueFactory { + class BigQueryTableFactory implements DefaultValueFactory { @Override public String create(PipelineOptions options) { return options.as(DataflowPipelineOptions.class).getJobName() diff --git a/maven-archetypes/examples/src/main/resources/archetype-resources/src/main/java/common/ExamplePubsubTopicOptions.java b/maven-archetypes/examples/src/main/resources/archetype-resources/src/main/java/common/ExamplePubsubTopicOptions.java index 525de69cdd..f217821f1c 100644 --- a/maven-archetypes/examples/src/main/resources/archetype-resources/src/main/java/common/ExamplePubsubTopicOptions.java +++ b/maven-archetypes/examples/src/main/resources/archetype-resources/src/main/java/common/ExamplePubsubTopicOptions.java @@ -37,7 +37,7 @@ public interface ExamplePubsubTopicOptions extends DataflowPipelineOptions { /** * Returns a default Pub/Sub topic based on the project and the job names. 
*/ - static class PubsubTopicFactory implements DefaultValueFactory { + class PubsubTopicFactory implements DefaultValueFactory { @Override public String create(PipelineOptions options) { DataflowPipelineOptions dataflowPipelineOptions = diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/AvroSource.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/AvroSource.java index cbecba92be..f9f07a2311 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/AvroSource.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/AvroSource.java @@ -122,6 +122,7 @@ * }

* *

Permissions

+ * *

Permission requirements depend on the {@link PipelineRunner} that is used to execute the * Dataflow job. Please refer to the documentation of corresponding {@link PipelineRunner}s for * more details. diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/BigQueryIO.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/BigQueryIO.java index 61b66e0a98..8c120c4439 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/BigQueryIO.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/BigQueryIO.java @@ -141,6 +141,7 @@ * BigQuery tables. * *

Table References

+ * *

A fully-qualified BigQuery table name consists of three components: *

    *
  • {@code projectId}: the Cloud project id (defaults to @@ -162,6 +163,7 @@ *
* *

Reading

+ * *

To read from a BigQuery table, apply a {@link BigQueryIO.Read} transformation. * This produces a {@link PCollection} of {@link TableRow TableRows} as output: *

{@code
@@ -186,6 +188,7 @@
  * Pipeline construction will fail with a validation error if neither or both are specified.
  *
  * 

Writing

+ * *

To write to a BigQuery table, apply a {@link BigQueryIO.Write} transformation. * This consumes a {@link PCollection} of {@link TableRow TableRows} as input. *

{@code
@@ -209,6 +212,7 @@
  * dispositions are not supported in streaming mode.
  *
  * 

Sharding BigQuery output tables

+ * *

A common use case is to dynamically generate BigQuery table names based on * the current window. To support this, * {@link BigQueryIO.Write#to(SerializableFunction)} @@ -234,6 +238,7 @@ *

Per-window tables are not yet supported in batch mode. * *
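
The table-sharding hook described above takes a function from window to table name. A minimal sketch of such a function, assuming the SDK 1.x types named in the text (the table prefix and date pattern are illustrative):

    import com.google.cloud.dataflow.sdk.transforms.SerializableFunction;
    import com.google.cloud.dataflow.sdk.transforms.windowing.BoundedWindow;
    import com.google.cloud.dataflow.sdk.transforms.windowing.IntervalWindow;
    import org.joda.time.format.DateTimeFormat;

    // Routes each window's rows to a table named after the window's start time.
    static SerializableFunction<BoundedWindow, String> perWindowTable(
        final String tablePrefix) {
      return new SerializableFunction<BoundedWindow, String>() {
        @Override
        public String apply(BoundedWindow window) {
          // Assumes fixed/sliding windowing, so the window is an IntervalWindow.
          String day = DateTimeFormat.forPattern("yyyy_MM_dd")
              .print(((IntervalWindow) window).start());
          return tablePrefix + ".results_" + day;
        }
      };
    }

    // usage sketch:
    // rows.apply(BigQueryIO.Write.to(perWindowTable("my-project:my_dataset")).withSchema(schema));
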

Permissions

+ * *

Permission requirements depend on the {@link PipelineRunner} that is used to execute the * Dataflow job. Please refer to the documentation of corresponding {@link PipelineRunner}s for * more details. diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/BoundedSource.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/BoundedSource.java index 170b56beb1..fd1a85656a 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/BoundedSource.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/BoundedSource.java @@ -87,11 +87,13 @@ public abstract List> splitIntoBundles( * operations, such as progress estimation and dynamic work rebalancing. * *

Boundedness

+ * *

Once {@link #start} or {@link #advance} has returned false, neither will be called * again on this object. * *

Thread safety

- * All methods will be run from the same thread except {@link #splitAtFraction}, + * + *

All methods will be run from the same thread except {@link #splitAtFraction}, * {@link #getFractionConsumed}, {@link #getCurrentSource}, {@link #getSplitPointsConsumed()}, * and {@link #getSplitPointsRemaining()}, all of which can be called concurrently * from a different thread. There will not be multiple concurrent calls to @@ -108,7 +110,8 @@ public abstract List> splitIntoBundles( * {@link #getCurrentSource} which do not change between {@link #splitAtFraction} calls. * *

Implementing {@link #splitAtFraction}

- * In the course of dynamic work rebalancing, the method {@link #splitAtFraction} + * + *

In the course of dynamic work rebalancing, the method {@link #splitAtFraction} * may be called concurrently with {@link #advance} or {@link #start}. It is critical that * their interaction is implemented in a thread-safe way, otherwise data loss is possible. * @@ -263,14 +266,17 @@ public long getSplitPointsRemaining() { * (including items already read). * *
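
One way to honor this contract is to funnel both the reading path and the splitting path through the SDK's OffsetRangeTracker, whose operations are internally synchronized. A minimal sketch under that assumption (the offsets here are hypothetical):

    import com.google.cloud.dataflow.sdk.io.range.OffsetRangeTracker;

    // Reads and splits both consult the same tracker, so a concurrent
    // splitAtFraction() can never lose or duplicate records.
    class TrackedReads {
      private final OffsetRangeTracker tracker = new OffsetRangeTracker(0L, 1000L);

      /** Called from the reading thread for every record. */
      boolean tryClaimRecord(long recordOffset) {
        // Returns false once a split has moved the end of the range to or
        // before recordOffset; the reader must then stop.
        return tracker.tryReturnRecordAt(true, recordOffset);
      }

      /** Called from the worker's splitting thread. */
      boolean trySplit(double fraction) {
        long splitOffset = (long) (1000L * fraction);
        // Atomic with respect to tryReturnRecordAt; fails harmlessly if the
        // reader has already consumed a record at or past splitOffset.
        return tracker.trySplitAtPosition(splitOffset);
      }
    }
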

Usage

+ * *

Reader subclasses can use this method for convenience to access unchanging properties of * the source being read. Alternatively, they can cache these properties in the constructor. + * *

The framework will call this method in the course of dynamic work rebalancing, e.g. after * a successful {@link BoundedSource.BoundedReader#splitAtFraction} call. * *

Mutability and thread safety

- * Remember that {@link Source} objects must always be immutable. However, the return value of - * this function may be affected by dynamic work rebalancing, happening asynchronously via + * + *

Remember that {@link Source} objects must always be immutable. However, the return value + * of this function may be affected by dynamic work rebalancing, happening asynchronously via * {@link BoundedSource.BoundedReader#splitAtFraction}, meaning it can return a different * {@link Source} object. However, the returned object itself will still itself be immutable. * Callers must take care not to rely on properties of the returned source that may be diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/DatastoreIO.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/DatastoreIO.java index 004db8f88c..162453cd48 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/DatastoreIO.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/DatastoreIO.java @@ -78,7 +78,7 @@ import javax.annotation.Nullable; /** - *

{@link DatastoreIO} provides an API to Read and Write {@link PCollection PCollections} of + * {@link DatastoreIO} provides an API to Read and Write {@link PCollection PCollections} of * Google Cloud Datastore * {@link Entity} objects. * diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/FileBasedSink.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/FileBasedSink.java index 87102e1bac..976bc86269 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/FileBasedSink.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/FileBasedSink.java @@ -185,6 +185,7 @@ private static String getFileExtension(String usersExtension) { * {@link FileBasedSink#fileNamingTemplate}. * *

Temporary Bundle File Handling:

+ * *

{@link FileBasedSink.FileBasedWriteOperation#temporaryFileRetention} controls the behavior * for managing temporary files. By default, temporary files will be removed. Subclasses can * provide a different value to the constructor. diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/OffsetBasedSource.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/OffsetBasedSource.java index 267b88fa53..170c7dd342 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/OffsetBasedSource.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/OffsetBasedSource.java @@ -258,7 +258,9 @@ public OffsetBasedReader(OffsetBasedSource source) { * Returns the starting offset of the {@link Source.Reader#getCurrent current record}, * which has been read by the last successful {@link Source.Reader#start} or * {@link Source.Reader#advance} call. + * *

If no such call has been made yet, the return value is unspecified. + * *

See {@link RangeTracker} for description of offset semantics. */ protected abstract long getCurrentOffset() throws NoSuchElementException; diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/PubsubIO.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/PubsubIO.java index 48114e61c1..875e25bcd8 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/PubsubIO.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/PubsubIO.java @@ -62,6 +62,7 @@ * and consume unbounded {@link PCollection PCollections}. * *

Permissions

+ * *

Permission requirements depend on the {@link PipelineRunner} that is used to execute the * Dataflow job. Please refer to the documentation of corresponding * {@link PipelineRunner PipelineRunners} for more details. diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/Sink.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/Sink.java index c7bb68beb1..af244d933a 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/Sink.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/Sink.java @@ -69,6 +69,7 @@ * * *

WriteOperation

+ * *

{@link WriteOperation#initialize} and {@link WriteOperation#finalize} are conceptually called * once: at the beginning and end of a Write transform. However, implementors must ensure that these * methods are idempotent, as they may be called multiple times on different machines in the case of @@ -89,6 +90,7 @@ * these mutations will not be visible in finalize). * *

Bundle Ids:

+ * *

In order to ensure fault-tolerance, a bundle may be executed multiple times (e.g., in the * event of failure/retry or for redundancy). However, exactly one of these executions will have its * result passed to the WriteOperation's finalize method. Each call to {@link Writer#open} is passed @@ -108,6 +110,7 @@ * of output file names that it can then merge or rename using some bundle naming scheme. * *

Writer Results:

+ * *

{@link WriteOperation}s and {@link Writer}s must agree on a writer result type that will be * returned by a Writer after it writes a bundle. This type can be a client-defined object or an * existing type; {@link WriteOperation#getWriterResultCoder} should return a {@link Coder} for the diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/TextIO.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/TextIO.java index 40a71132c1..124152e6f2 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/TextIO.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/TextIO.java @@ -109,6 +109,7 @@ * }

* *

Permissions

+ * *

When run using the {@link DirectPipelineRunner}, your pipeline can read and write text files * on your local drive and remote text files on Google Cloud Storage that you have access to using * your {@code gcloud} credentials. When running in the Dataflow service using @@ -249,7 +250,7 @@ public Bound from(String filepattern) { /** * Returns a new transform for reading from text files that's like this one but - * that uses the given {@link Coder Coder} to decode each of the + * that uses the given {@link Coder} to decode each of the * lines of the file into a value of type {@code X}. * *

Does not modify this object. @@ -634,8 +635,8 @@ public Bound withoutSharding() { /** * Returns a transform for writing to text files that's like this one - * but that uses the given {@link Coder Coder} to encode each of - * the elements of the input {@link PCollection PCollection} into an + * but that uses the given {@link Coder} to encode each of + * the elements of the input {@link PCollection} into an * output text line. Does not modify this object. * * @param the type of the elements of the input {@link PCollection} @@ -881,7 +882,7 @@ public Coder getDefaultOutputCoder() { * A {@link com.google.cloud.dataflow.sdk.io.FileBasedSource.FileBasedReader FileBasedReader} * which can decode records delimited by newline characters. * - * See {@link TextSource} for further details. + *

See {@link TextSource} for further details. */ @VisibleForTesting static class TextBasedReader extends FileBasedReader { @@ -1013,7 +1014,7 @@ protected boolean readNextRecord() throws IOException { /** * Decodes the current element updating the buffer to only contain the unconsumed bytes. * - * This invalidates the currently stored {@code startOfSeparatorInBuffer} and + *

This invalidates the currently stored {@code startOfSeparatorInBuffer} and * {@code endOfSeparatorInBuffer}. */ private void decodeCurrentElement() throws IOException { diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/Write.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/Write.java index 5881951996..3aec6fff22 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/Write.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/Write.java @@ -67,7 +67,7 @@ *

Example usage with runner-controlled sharding: * *

{@code p.apply(Write.to(new MySink(...)));}
- + * *

Example usage with a fixed number of shards: * *

{@code p.apply(Write.to(new MySink(...)).withNumShards(3));}
diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/datastore/DatastoreIO.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/datastore/DatastoreIO.java index 2ac1797155..403036e4f3 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/datastore/DatastoreIO.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/datastore/DatastoreIO.java @@ -19,7 +19,7 @@ import com.google.cloud.dataflow.sdk.annotations.Experimental; /** - *

{@link DatastoreIO} provides an API for reading from and writing to + * {@link DatastoreIO} provides an API for reading from and writing to * Google Cloud Datastore over different * versions of the Cloud Datastore Client libraries. * diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/datastore/DatastoreV1.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/datastore/DatastoreV1.java index 3af3932978..3cb7a6b4d5 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/datastore/DatastoreV1.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/datastore/DatastoreV1.java @@ -88,7 +88,7 @@ import javax.annotation.Nullable; /** - *

{@link DatastoreV1} provides an API to Read, Write and Delete {@link PCollection PCollections} + * {@link DatastoreV1} provides an API to Read, Write and Delete {@link PCollection PCollections} * of Google Cloud Datastore version v1 * {@link Entity} objects. * diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/range/RangeTracker.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/range/RangeTracker.java index 84359f1aa1..cd67b85548 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/range/RangeTracker.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/range/RangeTracker.java @@ -185,12 +185,14 @@ public interface RangeTracker { /** * Atomically determines whether a record at the given position can be returned and updates * internal state. In particular: + * *

    *
  • If {@code isAtSplitPoint} is {@code true}, and {@code recordStart} is outside the current * range, returns {@code false}; *
  • Otherwise, updates the last-consumed position to {@code recordStart} and returns * {@code true}. *
+ *
 * <p>This method MUST be called on all split point records. It may be called on every record.
 */
 boolean tryReturnRecordAt(boolean isAtSplitPoint, PositionT recordStart);

diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/options/DataflowPipelineDebugOptions.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/options/DataflowPipelineDebugOptions.java
index 202003e275..12f8d40037 100644
--- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/options/DataflowPipelineDebugOptions.java
+++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/options/DataflowPipelineDebugOptions.java
@@ -228,8 +228,7 @@ public Dataflow create(PipelineOptions options) {
 * thrashing or out of memory. The location of the heap file will either be echoed back
 * to the user, or the user will be given the opportunity to download the heap file.
 *
- * <p>
- * CAUTION: Heap dumps can be of comparable size to the default boot disk. Consider increasing
+ * <p>CAUTION: Heap dumps can be of comparable size to the default boot disk. Consider increasing
 * the boot disk size before setting this flag to true.
 */
 @Description("If {@literal true}, save a heap dump before killing a thread or process "

diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/options/PipelineOptions.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/options/PipelineOptions.java
index 701fe47976..f9c26f6136 100644
--- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/options/PipelineOptions.java
+++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/options/PipelineOptions.java
@@ -91,7 +91,7 @@
 *

Defining Your Own PipelineOptions

* - * Defining your own {@link PipelineOptions} is the way for you to make configuration + *

Defining your own {@link PipelineOptions} is the way for you to make configuration * options available for both local execution and execution via a {@link PipelineRunner}. * By having PipelineOptionsFactory as your command-line interpreter, you will provide * a standardized way for users to interact with your application via the command-line. @@ -115,7 +115,7 @@ * *
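
A minimal sketch of such a user-defined options interface and its command-line parsing (the option name and default are illustrative):

    import com.google.cloud.dataflow.sdk.options.Default;
    import com.google.cloud.dataflow.sdk.options.Description;
    import com.google.cloud.dataflow.sdk.options.PipelineOptions;
    import com.google.cloud.dataflow.sdk.options.PipelineOptionsFactory;

    public interface MyOptions extends PipelineOptions {
      @Description("Path of the input file")
      @Default.String("gs://my-bucket/input.txt")
      String getInputFile();
      void setInputFile(String value);
    }

    public static void main(String[] args) {
      // Parsed from the command line, e.g. --inputFile=gs://other/file.txt
      MyOptions options =
          PipelineOptionsFactory.fromArgs(args).withValidation().as(MyOptions.class);
      System.out.println(options.getInputFile());
    }
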

Restrictions

* - * Since PipelineOptions can be "cast" to multiple types dynamically using + *

Since PipelineOptions can be "cast" to multiple types dynamically using * {@link PipelineOptions#as(Class)}, a property must conform to the following set of restrictions: *

    *
  • Any property with the same name must have the same return type for all derived @@ -156,7 +156,7 @@ * *

    Registration Of PipelineOptions

    * - * Registration of {@link PipelineOptions} by an application guarantees that the + *

    Registration of {@link PipelineOptions} by an application guarantees that the * {@link PipelineOptions} is composable during execution of their {@link Pipeline} and * meets the restrictions listed above or will fail during registration. Registration * also lists the registered {@link PipelineOptions} when {@code --help} diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/options/PipelineOptionsReflector.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/options/PipelineOptionsReflector.java index 094590a322..77cf91592f 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/options/PipelineOptionsReflector.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/options/PipelineOptionsReflector.java @@ -67,7 +67,7 @@ static Set getOptionSpecs(Class o /** * Retrieve metadata for the full set of pipeline options visible within the type hierarchy * closure of the set of input interfaces. An option is "visible" if: - *

    + * *

      *
    • The option is defined within the interface hierarchy closure of the input * {@link PipelineOptions}.
    • diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/options/ProxyInvocationHandler.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/options/ProxyInvocationHandler.java index 5070a52569..8ea2a3a49d 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/options/ProxyInvocationHandler.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/options/ProxyInvocationHandler.java @@ -237,7 +237,7 @@ public boolean equals(Object obj) { * Backing implementation for {@link PipelineOptions#as(Class)}. * * @param iface The interface that the returned object needs to implement. - * @return An object that implements the interface . + * @return An object that implements the interface {@code }. */ synchronized T as(Class iface) { checkNotNull(iface); @@ -389,7 +389,7 @@ private String displayDataString(Object value) { /** * Marker interface used when the original {@link PipelineOptions} interface is not known at * runtime. This can occur if {@link PipelineOptions} are deserialized from JSON. - *

      + * *

      Pipeline authors can ensure {@link PipelineOptions} type information is available at * runtime by registering their {@link PipelineOptions options} interfaces. See the "Registration" * section of {@link PipelineOptions} documentation. diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/dataflow/AssignWindows.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/dataflow/AssignWindows.java index 093783de85..846de43bb6 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/dataflow/AssignWindows.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/dataflow/AssignWindows.java @@ -28,14 +28,14 @@ * A primitive {@link PTransform} that implements the {@link Window#into(WindowFn)} * {@link PTransform}. * - * For an application of {@link Window#into(WindowFn)} that changes the {@link WindowFn}, applies + *

      For an application of {@link Window#into(WindowFn)} that changes the {@link WindowFn}, applies * a primitive {@link PTransform} in the Dataflow service. * - * For an application of {@link Window#into(WindowFn)} that does not change the {@link WindowFn}, + *

      For an application of {@link Window#into(WindowFn)} that does not change the {@link WindowFn}, * applies an identity {@link ParDo} and sets the windowing strategy of the output * {@link PCollection}. * - * For internal use only. + *

      For internal use only. * * @param the type of input element */ diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/EvaluatorKey.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/EvaluatorKey.java index 307bc5cdb5..f6aafcc1b3 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/EvaluatorKey.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/EvaluatorKey.java @@ -22,8 +22,8 @@ /** * A (Transform, Pipeline Execution) key for stateful evaluators. * - * Source evaluators are stateful to ensure data is not read multiple times. Evaluators are cached - * to ensure that the reader is not restarted if the evaluator is retriggered. An + *

      Source evaluators are stateful to ensure data is not read multiple times. Evaluators are + * cached to ensure that the reader is not restarted if the evaluator is retriggered. An * {@link EvaluatorKey} is used to ensure that multiple Pipelines can be executed without sharing * the same evaluators. */ diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/ExecutorServiceParallelExecutor.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/ExecutorServiceParallelExecutor.java index 4855af6496..88c93f795c 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/ExecutorServiceParallelExecutor.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/ExecutorServiceParallelExecutor.java @@ -272,7 +272,7 @@ public final void handleThrowable(CommittedBundle inputBundle, Throwable t) { /** * An internal status update on the state of the executor. * - * Used to signal when the executor should be shut down (due to an exception). + *

      Used to signal when the executor should be shut down (due to an exception). */ @AutoValue abstract static class ExecutorUpdate { diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/InMemoryWatermarkManager.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/InMemoryWatermarkManager.java index 80dbd9917a..9d145e8c09 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/InMemoryWatermarkManager.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/InMemoryWatermarkManager.java @@ -167,7 +167,7 @@ public boolean isAdvanced() { /** * Returns the {@link WatermarkUpdate} that is a result of combining the two watermark updates. * - * If either of the input {@link WatermarkUpdate WatermarkUpdates} were advanced, the result + *

      If either of the input {@link WatermarkUpdate WatermarkUpdates} were advanced, the result * {@link WatermarkUpdate} has been advanced. */ public WatermarkUpdate union(WatermarkUpdate that) { @@ -640,7 +640,7 @@ public Iterable> apply(WindowedValue input) { * latestTime argument and put in in the result with the same key, then remove all of the keys * which have no more pending timers. * - * The result collection retains ordering of timers (from earliest to latest). + *

      The result collection retains ordering of timers (from earliest to latest). */ private static Map, List> extractFiredTimers( Instant latestTime, Map, NavigableSet> objectTimers) { diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/InProcessCreate.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/InProcessCreate.java index 48ef560f88..802851a8db 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/InProcessCreate.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/InProcessCreate.java @@ -44,8 +44,8 @@ * An in-process implementation of the {@link Values Create.Values} {@link PTransform}, implemented * using a {@link BoundedSource}. * - * The coder is inferred via the {@link Values#getDefaultOutputCoder(PInput)} method on the original - * transform. + *

      The coder is inferred via the {@link Values#getDefaultOutputCoder(PInput)} method on the + * original transform. */ class InProcessCreate extends ForwardingPTransform> { private final Create.Values original; diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/InProcessEvaluationContext.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/InProcessEvaluationContext.java index b9bea5b2f4..7669f4ba97 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/InProcessEvaluationContext.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/InProcessEvaluationContext.java @@ -351,7 +351,7 @@ public ReadyCheckingSideInputReader createSideInputReader( * Create a {@link CounterSet} for this {@link Pipeline}. The {@link CounterSet} is independent * of all other {@link CounterSet CounterSets} created by this call. * - * The {@link InProcessEvaluationContext} is responsible for unifying the counters present in + *

      The {@link InProcessEvaluationContext} is responsible for unifying the counters present in * all created {@link CounterSet CounterSets} when the transforms that call this method * complete. */ diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/InProcessExecutionContext.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/InProcessExecutionContext.java index 31557d35fa..46894cdaf5 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/InProcessExecutionContext.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/InProcessExecutionContext.java @@ -26,8 +26,8 @@ /** * Execution Context for the {@link InProcessPipelineRunner}. * - * This implementation is not thread safe. A new {@link InProcessExecutionContext} must be created - * for each thread that requires it. + *

      This implementation is not thread safe. A new {@link InProcessExecutionContext} must be + * created for each thread that requires it. */ class InProcessExecutionContext extends BaseExecutionContext { diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/InProcessPipelineRunner.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/InProcessPipelineRunner.java index c51a2a027b..9f19c588ef 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/InProcessPipelineRunner.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/InProcessPipelineRunner.java @@ -157,11 +157,10 @@ public interface CommittedBundle { * Return a new {@link CommittedBundle} that is like this one, except calls to * {@link #getElements()} will return the provided elements. This bundle is unchanged. * - *

      - * The value of the {@link #getSynchronizedProcessingOutputWatermark() synchronized processing - * output watermark} of the returned {@link CommittedBundle} is equal to the value returned from - * the current bundle. This is used to ensure a {@link PTransform} that could not complete - * processing on input elements properly holds the synchronized processing time to the + *

      The value of the {@link #getSynchronizedProcessingOutputWatermark() synchronized + * processing output watermark} of the returned {@link CommittedBundle} is equal to the value + * returned from the current bundle. This is used to ensure a {@link PTransform} that could not + * complete processing on input elements properly holds the synchronized processing time to the * appropriate value. */ CommittedBundle withElements(Iterable> elements); @@ -315,7 +314,7 @@ private BundleFactory createBundleFactory(InProcessPipelineOptions pipelineOptio /** * The result of running a {@link Pipeline} with the {@link InProcessPipelineRunner}. * - * Throws {@link UnsupportedOperationException} for all methods. + *

      Throws {@link UnsupportedOperationException} for all methods. */ public static class InProcessPipelineResult implements PipelineResult { private final InProcessExecutor executor; @@ -372,7 +371,7 @@ public AggregatorValues getAggregatorValues(Aggregator aggregator) * {@link InProcessPipelineOptions#isShutdownUnboundedProducersWithMaxWatermark()} set to false, * this method will never return. * - * See also {@link InProcessExecutor#awaitCompletion()}. + *

      See also {@link InProcessExecutor#awaitCompletion()}. */ public State awaitCompletion() throws Throwable { if (!state.isTerminal()) { diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/InProcessTransformResult.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/InProcessTransformResult.java index fcd37a09a1..996d51b6b1 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/InProcessTransformResult.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/InProcessTransformResult.java @@ -58,7 +58,7 @@ interface InProcessTransformResult { /** * Returns the Watermark Hold for the transform at the time this result was produced. * - * If the transform does not set any watermark hold, returns + *

      If the transform does not set any watermark hold, returns * {@link BoundedWindow#TIMESTAMP_MAX_VALUE}. */ Instant getWatermarkHold(); @@ -66,7 +66,7 @@ interface InProcessTransformResult { /** * Returns the State used by the transform. * - * If this evaluation did not access state, this may return null. + *

      If this evaluation did not access state, this may return null. */ @Nullable CopyOnAccessInMemoryStateInternals getState(); diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/TransformEvaluator.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/TransformEvaluator.java index be54ec20ee..f4e9d028f7 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/TransformEvaluator.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/TransformEvaluator.java @@ -35,7 +35,7 @@ interface TransformEvaluator { /** * Finish processing the bundle of this {@link TransformEvaluator}. * - * After {@link #finishBundle()} is called, the {@link TransformEvaluator} will not be reused, + *

      After {@link #finishBundle()} is called, the {@link TransformEvaluator} will not be reused, * and no more elements will be processed. * * @return an {@link InProcessTransformResult} containing the results of this bundle evaluation. diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/ViewEvaluatorFactory.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/ViewEvaluatorFactory.java index e2aabc25de..9b296dbdc7 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/ViewEvaluatorFactory.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/ViewEvaluatorFactory.java @@ -130,8 +130,8 @@ protected PTransform, PCollectionView> delegate() { /** * An in-process implementation of the {@link CreatePCollectionView} primitive. * - * This implementation requires the input {@link PCollection} to be an iterable, which is provided - * to {@link PCollectionView#fromIterableInternal(Iterable)}. + *

      This implementation requires the input {@link PCollection} to be an iterable, which is + * provided to {@link PCollectionView#fromIterableInternal(Iterable)}. */ public static final class WriteView extends PTransform>, PCollectionView> { diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/worker/IsmFormat.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/worker/IsmFormat.java index 289a77beb1..b2a3fd592f 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/worker/IsmFormat.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/worker/IsmFormat.java @@ -331,7 +331,7 @@ int getNumberOfShardKeyCoders(List keyComponents) { /** * Computes the shard id for the given key component(s). * - * The shard keys are encoded into their byte representations and hashed using the + *

The shard keys are encoded into their byte representations and hashed using the * 32-bit murmur3 algorithm, x86 variant (little-endian variant), * using {@code 1225801234} as the seed value. We ensure that shard ids for * @@ -344,7 +344,7 @@ public int hash(List<?> keyComponents) { /** * Computes the shard id for the given key component(s). * - *

Mutates {@code keyBytes} such that when returned, it contains the encoded * version of the key components. */ int encodeAndHash(List<?> keyComponents, RandomAccessData keyBytesToMutate) { @@ -684,7 +684,7 @@ public int hashCode() { /** * A coder for {@link IsmShard}s. * - *
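For illustration only (not part of this patch): a minimal sketch of the hashing scheme described above, assuming Guava's Hashing API; shardFor and numShards are hypothetical names.

  import com.google.common.hash.Hashing;

  // Hash the already-encoded key bytes with the same murmur3_32 seed the
  // format uses, then map the result onto a hypothetical shard space.
  static int shardFor(byte[] encodedKeyBytes, int numShards) {
    int hash = Hashing.murmur3_32(1225801234).hashBytes(encodedKeyBytes).asInt();
    return (hash & Integer.MAX_VALUE) % numShards;
  }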

      The shard descriptor is encoded as: *

        *
      • id (variable length integer encoding)
      • *
      • blockOffset (variable length long encoding)
      • diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/testing/SerializableMatchers.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/testing/SerializableMatchers.java index da5171e21f..85b1d5b072 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/testing/SerializableMatchers.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/testing/SerializableMatchers.java @@ -1046,14 +1046,13 @@ public String toString() { * the {@link Matcher} returned by {@link SerializableSupplier#get() get()} when it is invoked * during matching (which may occur on another machine, such as a Dataflow worker). * - * + *
        {@code
* return fromSupplier(new SerializableSupplier<Matcher<T>>() {
        -   *   *     @Override
*     public Matcher<T> get() {
            *       return new MyMatcherForT();
            *     }
            * });
        -   * 
        +   * }
        */ public static SerializableMatcher fromSupplier( SerializableSupplier> supplier) { diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/testing/SourceTestUtils.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/testing/SourceTestUtils.java index 0b7fb3ea8c..8dbc600b8f 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/testing/SourceTestUtils.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/testing/SourceTestUtils.java @@ -287,7 +287,8 @@ public SplitAtFractionResult(int numPrimaryItems, int numResidualItems) { * Asserts that the {@code source}'s reader either fails to {@code splitAtFraction(fraction)} * after reading {@code numItemsToReadBeforeSplit} items, or succeeds in a way that is * consistent according to {@link #assertSplitAtFractionSucceedsAndConsistent}. - *

        Returns SplitAtFractionResult. + * + *

        Returns SplitAtFractionResult. */ public static SplitAtFractionResult assertSplitAtFractionBehavior( diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/ApproximateQuantiles.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/ApproximateQuantiles.java index 75ec7a8e1d..a995431b7a 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/ApproximateQuantiles.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/ApproximateQuantiles.java @@ -208,7 +208,7 @@ PTransform, PCollection>> globally(int numQuantiles) { *

* *

The default error bound is {@code 1 / N}, though in practice - * the accuracy tends to be much better.

See + * the accuracy tends to be much better. See * {@link #create(int, Comparator, long, double)} for * more information about the meaning of {@code epsilon}, and * {@link #withEpsilon} for a convenient way to adjust it. diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/Combine.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/Combine.java index a1d124ca78..bd1ccf977f 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/Combine.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/Combine.java @@ -672,7 +672,7 @@ public void verifyDeterministic() throws NonDeterministicException { * An abstract subclass of {@link CombineFn} for implementing combiners that are more * easily and efficiently expressed as binary operations on ints * - *
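For illustration only (not part of this patch): a minimal usage sketch; values is a hypothetical PCollection<Integer>.

  // Requests 5 approximate quantiles (min, 25th percentile, median, 75th
  // percentile, max), computed with the default error bound of 1/N.
  PCollection<List<Integer>> quartiles =
      values.apply(ApproximateQuantiles.<Integer>globally(5));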

It uses {@code int[0]} as the mutable accumulator. + *

It uses {@code int[0]} as the mutable accumulator. */ public abstract static class BinaryCombineIntegerFn extends CombineFn<Integer, int[], Integer> { @@ -754,7 +754,7 @@ public Counter getCounter(String name) { * An abstract subclass of {@link CombineFn} for implementing combiners that are more * easily and efficiently expressed as binary operations on longs. * - *
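For illustration only (not part of this patch): a minimal sketch of a combiner expressed as a binary operation on ints, here a sum.

  Combine.BinaryCombineIntegerFn sum = new Combine.BinaryCombineIntegerFn() {
    @Override
    public int apply(int left, int right) {
      return left + right;  // the binary operation being combined
    }

    @Override
    public int identity() {
      return 0;  // identity element: apply(identity(), x) == x
    }
  };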

It uses {@code long[0]} as the mutable accumulator. + *

It uses {@code long[0]} as the mutable accumulator. */ public abstract static class BinaryCombineLongFn extends CombineFn { /** @@ -834,7 +834,7 @@ public Counter getCounter(String name) { * An abstract subclass of {@link CombineFn} for implementing combiners that are more * easily and efficiently expressed as binary operations on doubles. * - *

It uses {@code double[0]} as the mutable accumulator. + *

It uses {@code double[0]} as the mutable accumulator. */ public abstract static class BinaryCombineDoubleFn extends CombineFn { diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/CombineFns.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/CombineFns.java index 44004660d9..b9ee5aeeeb 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/CombineFns.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/CombineFns.java @@ -66,7 +66,7 @@ public class CombineFns { *

The same {@link TupleTag} cannot be used in a composition multiple times. * *

Example: - *

{ @code
+   * 
{@code
* PCollection<KV<K, Integer>> latencies = ...;
    *
* TupleTag<Integer> maxLatencyTag = new TupleTag<Integer>();
@@ -74,7 +74,6 @@ public class CombineFns {
    *
* SimpleFunction<Integer, Integer> identityFn =
*     new SimpleFunction<Integer, Integer>() {
-   *       @Override
    *       public Integer apply(Integer input) {
    *           return input;
    *       }};
@@ -87,7 +86,6 @@ public class CombineFns {
* PCollection<T> finalResultCollection = maxAndMean
    *     .apply(ParDo.of(
*         new DoFn<KV<K, CoCombineResult>, T>() {
-   *           @Override
    *           public void processElement(ProcessContext c) throws Exception {
*             KV<K, CoCombineResult> e = c.element();
    *             Integer maxLatency = e.getValue().get(maxLatencyTag);
@@ -96,7 +94,7 @@ public class CombineFns {
    *             c.output(...some T...);
    *           }
    *         }));
-   * } 
+ * }
*/ public static ComposeKeyedCombineFnBuilder composeKeyed() { return new ComposeKeyedCombineFnBuilder(); @@ -109,7 +107,7 @@ public static ComposeKeyedCombineFnBuilder composeKeyed() { *

The same {@link TupleTag} cannot be used in a composition multiple times. * *

Example: - *

{ @code
+   * 
{@code
* PCollection<Integer> globalLatencies = ...;
    *
* TupleTag<Integer> maxLatencyTag = new TupleTag<Integer>();
@@ -117,7 +115,6 @@ public static ComposeKeyedCombineFnBuilder composeKeyed() {
    *
* SimpleFunction<Integer, Integer> identityFn =
*     new SimpleFunction<Integer, Integer>() {
-   *       @Override
    *       public Integer apply(Integer input) {
    *           return input;
    *       }};
@@ -130,7 +127,6 @@ public static ComposeKeyedCombineFnBuilder composeKeyed() {
* PCollection<T> finalResultCollection = maxAndMean
    *     .apply(ParDo.of(
*         new DoFn<CoCombineResult, T>() {
-   *           @Override
    *           public void processElement(ProcessContext c) throws Exception {
    *             CoCombineResult e = c.element();
    *             Integer maxLatency = e.get(maxLatencyTag);
@@ -139,7 +135,7 @@ public static ComposeKeyedCombineFnBuilder composeKeyed() {
    *             c.output(...some T...);
    *           }
    *         }));
-   * } 
+ * }
*/ public static ComposeCombineFnBuilder compose() { return new ComposeCombineFnBuilder(); diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/CombineWithContext.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/CombineWithContext.java index e9f29fccb4..d7439b1fc5 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/CombineWithContext.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/CombineWithContext.java @@ -63,7 +63,7 @@ public interface RequiresContextInternal {} * A combine function that has access to {@code PipelineOptions} and side inputs through * {@code CombineWithContext.Context}. * - * See the equivalent {@link CombineFn} for details about combine functions. + *

See the equivalent {@link CombineFn} for details about combine functions. */ public abstract static class CombineFnWithContext extends CombineFnBase.AbstractGlobalCombineFn @@ -180,7 +180,7 @@ public void populateDisplayData(DisplayData.Builder builder) { * A keyed combine function that has access to {@code PipelineOptions} and side inputs through * {@code CombineWithContext.Context}. * - * See the equivalent {@link KeyedCombineFn} for details about keyed combine functions. + *
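For illustration only (not part of this patch): a minimal sketch of a CombineFnWithContext, a sum whose output is scaled by a side input; scaleView is a hypothetical PCollectionView<Integer>.

  CombineFnWithContext<Integer, int[], Integer> scaledSum =
      new CombineFnWithContext<Integer, int[], Integer>() {
        @Override
        public int[] createAccumulator(Context c) {
          return new int[1];
        }

        @Override
        public int[] addInput(int[] accum, Integer input, Context c) {
          accum[0] += input;
          return accum;
        }

        @Override
        public int[] mergeAccumulators(Iterable<int[]> accums, Context c) {
          int[] merged = new int[1];
          for (int[] a : accums) {
            merged[0] += a[0];
          }
          return merged;
        }

        @Override
        public Integer extractOutput(int[] accum, Context c) {
          // The Context provides side inputs and PipelineOptions.
          return accum[0] * c.sideInput(scaleView);
        }
      };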

See the equivalent {@link KeyedCombineFn} for details about keyed combine functions. */ public abstract static class KeyedCombineFnWithContext extends CombineFnBase.AbstractPerKeyCombineFn diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/DoFn.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/DoFn.java index fffe769995..c9dc9f433a 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/DoFn.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/DoFn.java @@ -298,7 +298,7 @@ public abstract class ProcessContext extends Context { * timestamps can only be shifted forward to future. For infinite * skew, return {@code Duration.millis(Long.MAX_VALUE)}. * - *

Note that producing an element whose timestamp is less than the + *

Note that producing an element whose timestamp is less than the * current timestamp may result in late data, i.e. returning a non-zero * value here does not impact watermark calculations used for firing * windows. diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/PTransform.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/PTransform.java index 8499bcb141..38e0312857 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/PTransform.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/PTransform.java @@ -287,9 +287,9 @@ protected Coder getDefaultOutputCoder() throws CannotProvideCoderException { * Returns the default {@code Coder} to use for the output of this * single-output {@code PTransform} when applied to the given input. * - * @throws CannotProvideCoderException if none can be inferred. - * *

By default, always throws. + * + * @throws CannotProvideCoderException if none can be inferred. */ protected Coder getDefaultOutputCoder(@SuppressWarnings("unused") InputT input) throws CannotProvideCoderException { @@ -300,9 +300,9 @@ protected Coder getDefaultOutputCoder(@SuppressWarnings("unused") InputT inpu * Returns the default {@code Coder} to use for the given output of * this single-output {@code PTransform} when applied to the given input. * - * @throws CannotProvideCoderException if none can be inferred. - * *

By default, always throws. + * + * @throws CannotProvideCoderException if none can be inferred. */ public Coder getDefaultOutputCoder( InputT input, @SuppressWarnings("unused") TypedPValue output) diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/ParDo.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/ParDo.java index 29bfb7f549..2134de6474 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/ParDo.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/ParDo.java @@ -101,7 +101,7 @@ * If this method is not overridden, this call may be optimized away. * * - * Each of the calls to any of the {@link DoFn DoFn's} processing + *
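For illustration only (not part of this patch): a minimal sketch of overriding the always-throwing default, for a hypothetical PTransform<PCollection<Integer>, PCollection<String>> whose output coder is always known.

  @Override
  protected Coder<?> getDefaultOutputCoder(PCollection<Integer> input) {
    return StringUtf8Coder.of();  // never throws: the output coder is fixed
  }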

Each of the calls to any of the {@link DoFn DoFn's} processing * methods can produce zero or more output elements. All of the * output elements from all of the {@link DoFn} instances * are included in the output {@link PCollection}. diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/RemoveDuplicates.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/RemoveDuplicates.java index 8913138abb..4388c2d5b8 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/RemoveDuplicates.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/RemoveDuplicates.java @@ -105,7 +105,7 @@ public Void apply(Iterable<T> iter) { * A {@link RemoveDuplicates} {@link PTransform} that uses a {@link SerializableFunction} to * obtain a representative value for each input element. * - *

Construct via {@link RemoveDuplicates#withRepresentativeValueFn(SerializableFunction)}. * * @param the type of input and output element * @param the type of representative values used to dedup @@ -143,8 +143,9 @@ public T apply(T left, T right) { * Return a {@code WithRepresentativeValues} {@link PTransform} that is like this one, but with * the specified output type descriptor. * - * Required for use of {@link RemoveDuplicates#withRepresentativeValueFn(SerializableFunction)} - * in Java 8 with a lambda as the fn. + *

Required for use of + * {@link RemoveDuplicates#withRepresentativeValueFn(SerializableFunction)} in Java 8 with a + * lambda as the fn. * * @param type a {@link TypeDescriptor} describing the representative type of this * {@code WithRepresentativeValues} diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/WithKeys.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/WithKeys.java index c06795c703..4c1ce7f073 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/WithKeys.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/WithKeys.java @@ -98,7 +98,7 @@ private WithKeys(SerializableFunction fn, Class keyClass) { /** * Return a {@link WithKeys} that is like this one with the specified key type descriptor. * - * For use with lambdas in Java 8, either this method must be called with an appropriate type + *

For use with lambdas in Java 8, either this method must be called with an appropriate type * descriptor or {@link PCollection#setCoder(Coder)} must be called on the output * {@link PCollection}. */ diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/display/DisplayData.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/display/DisplayData.java index 0dadedf7ea..599245f911 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/display/DisplayData.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/display/DisplayData.java @@ -164,7 +164,7 @@ public interface Builder { * } *
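For illustration only (not part of this patch): a minimal sketch of the Java 8 usage just described; words is a hypothetical PCollection<String>.

  PCollection<KV<Integer, String>> keyed = words.apply(
      WithKeys.of((String word) -> word.length())
              .withKeyType(new TypeDescriptor<Integer>() {}));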

* - * Using {@code include(subcomponent)} will associate each of the registered items with the + *

Using {@code include(subcomponent)} will associate each of the registered items with the * namespace of the {@code subcomponent} being registered. To register display data in the * current namespace, such as from a base class implementation, use * {@code subcomponent.populateDisplayData(builder)} instead. @@ -233,7 +233,7 @@ public String getNamespace() { /** * The key for the display item. Each display item is created with a key and value - * via {@link DisplayData#item). + * via {@link DisplayData#item}. */ @JsonGetter("key") public String getKey() { @@ -269,8 +269,8 @@ public Object getValue() { * value. For example, the {@link #getValue() value} for {@link Type#JAVA_CLASS} items contains * the full class name with package, while the short value contains just the class name. * - * A {@link #getValue() value} will be provided for each display item, and some types may also - * provide a short-value. If a short value is provided, display data consumers may + *

A {@link #getValue() value} will be provided for each display item, and some types may + * also provide a short-value. If a short value is provided, display data consumers may * choose to display it instead of or in addition to the {@link #getValue() value}. */ @JsonGetter("shortValue") diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/windowing/AfterWatermark.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/windowing/AfterWatermark.java index 6ffa85c6bd..5fd8554c90 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/windowing/AfterWatermark.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/windowing/AfterWatermark.java @@ -32,7 +32,7 @@ import javax.annotation.Nullable; /** - *

{@code AfterWatermark} triggers fire based on progress of the system watermark. This time is a + * {@code AfterWatermark} triggers fire based on progress of the system watermark. This time is a * lower-bound, sometimes heuristically established, on event times that have been fully processed * by the pipeline. * @@ -56,7 +56,7 @@ * *

The watermark is the clock that defines {@link TimeDomain#EVENT_TIME}. * - * Additionaly firings before or after the watermark can be requested by calling + *

Additionally firings before or after the watermark can be requested by calling * {@code AfterWatermark.pastEndOfWindow.withEarlyFirings(OnceTrigger)} or * {@code AfterWatermark.pastEndOfWindow.withLateFirings(OnceTrigger)}. * diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/windowing/Never.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/windowing/Never.java index 9fa8498220..12a3cc707c 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/windowing/Never.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/windowing/Never.java @@ -23,9 +23,8 @@ /** * A trigger which never fires. * - *
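For illustration only (not part of this patch): a minimal sketch combining early and late firings; window is a hypothetical Window.Bound transform.

  window.triggering(AfterWatermark.pastEndOfWindow()
      .withEarlyFirings(AfterProcessingTime.pastFirstElementInPane()
          .plusDelayOf(Duration.standardMinutes(1)))       // speculative panes
      .withLateFirings(AfterPane.elementCountAtLeast(1)))  // one pane per late element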

- * Using this trigger will only produce output when the watermark passes the end of the - * {@link BoundedWindow window} plus the {@link Window#withAllowedLateness() allowed + *

Using this trigger will only produce output when the watermark passes the end of the + * {@link BoundedWindow window} plus the {@link Window#withAllowedLateness allowed * lateness}. */ public final class Never { diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/windowing/PaneInfo.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/windowing/PaneInfo.java index 0fdf590a87..3cb556852f 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/windowing/PaneInfo.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/windowing/PaneInfo.java @@ -94,7 +94,7 @@ public final class PaneInfo { * And a {@code LATE} pane cannot contain locally on-time elements. * * - * However, note that: + *

However, note that: *

    *
  1. An {@code ON_TIME} pane may contain locally late elements. It may even contain only * locally late elements. Provided a locally late element finds its way into an {@code ON_TIME} @@ -256,7 +256,7 @@ public long getIndex() { /** * The zero-based index of this trigger firing among non-speculative panes. * - *

    This will return 0 for the first non-{@link Timing#EARLY} timer firing, 1 for the next one, + *

    This will return 0 for the first non-{@link Timing#EARLY} timer firing, 1 for the next one, * etc. * *

    Always -1 for speculative data. diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/windowing/Window.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/windowing/Window.java index 60b9e5d904..70ed44c71d 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/windowing/Window.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/windowing/Window.java @@ -42,7 +42,7 @@ * {@link com.google.cloud.dataflow.sdk.transforms.GroupByKey GroupByKeys}, * including one within composite transforms, will group by the combination of * keys and windows. - + * *
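For illustration only (not part of this patch): a minimal sketch of inspecting pane timing inside a DoFn's processElement.

  @Override
  public void processElement(ProcessContext c) {
    PaneInfo pane = c.pane();
    if (pane.getTiming() == PaneInfo.Timing.EARLY) {
      // Speculative pane: getNonSpeculativeIndex() returns -1 here.
    } else if (pane.getNonSpeculativeIndex() == 0) {
      // First non-speculative firing for this window.
    }
  }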

    See {@link com.google.cloud.dataflow.sdk.transforms.GroupByKey} * for more information about how grouping with windows works. * diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/AvroUtils.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/AvroUtils.java index 15acc8899d..35c11f6c88 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/AvroUtils.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/AvroUtils.java @@ -193,7 +193,7 @@ static String formatTimestamp(String timestamp) { /** * Utility function to convert from an Avro {@link GenericRecord} to a BigQuery {@link TableRow}. * - * See + *

    See * "Avro format" for more information. */ public static TableRow convertGenericRecordToTableRow(GenericRecord record, TableSchema schema) { diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/BaseExecutionContext.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/BaseExecutionContext.java index 6a0ccf3531..19ba8d6256 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/BaseExecutionContext.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/BaseExecutionContext.java @@ -38,16 +38,17 @@ *

    BaseExecutionContext is generic to allow implementing subclasses to return a concrete subclass * of {@link StepContext} from {@link #getOrCreateStepContext(String, String, StateSampler)} and * {@link #getAllStepContexts()} without forcing each subclass to override the method, e.g. - *

    - * @Override
    + *
    + * 
    {@code
    + * {@literal @}Override
      * StreamingModeExecutionContext.StepContext getOrCreateStepContext(...) {
      *   return (StreamingModeExecutionContext.StepContext) super.getOrCreateStepContext(...);
      * }
    - * 
    + * }
    * *

    When a subclass of {@code BaseExecutionContext} has been downcast, the return types of * {@link #createStepContext(String, String, StateSampler)}, - * {@link #getOrCreateStepContext(String, String, StateSampler}, and {@link #getAllStepContexts()} + * {@link #getOrCreateStepContext(String, String, StateSampler)}, and {@link #getAllStepContexts()} * will be appropriately specialized. */ public abstract class BaseExecutionContext diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/BigQueryServices.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/BigQueryServices.java index b13feeade5..ec96009494 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/BigQueryServices.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/BigQueryServices.java @@ -105,7 +105,7 @@ JobStatistics dryRunQuery(String projectId, JobConfigurationQuery queryConfig) /** * Gets the specified {@link Job} by the given {@link JobReference}. * - * Returns null if the job is not found. + *

    Returns null if the job is not found. */ Job getJob(JobReference jobRef) throws IOException, InterruptedException; } diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/DoFnRunner.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/DoFnRunner.java index 81ccbf1fe8..e0ce0ae097 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/DoFnRunner.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/DoFnRunner.java @@ -47,7 +47,7 @@ public interface ReduceFnExecutor { /** * Gets this object as a {@link DoFn}. * - * Most implementors of this interface are expected to be {@link DoFn} instances, and will + *

    Most implementors of this interface are expected to be {@link DoFn} instances, and will * return themselves. */ DoFn, KV> asDoFn(); diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/DoFnRunnerBase.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/DoFnRunnerBase.java index 51237a03b8..0c2465161e 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/DoFnRunnerBase.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/DoFnRunnerBase.java @@ -56,7 +56,7 @@ /** * A base implementation of {@link DoFnRunner}. * - *

    Sub-classes should override {@link #invokeProcessElement}. + *

    Sub-classes should override {@link #invokeProcessElement}. */ public abstract class DoFnRunnerBase implements DoFnRunner { diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/ExposedByteArrayOutputStream.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/ExposedByteArrayOutputStream.java index d8e4d50714..8bb931ae21 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/ExposedByteArrayOutputStream.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/ExposedByteArrayOutputStream.java @@ -56,6 +56,7 @@ private void fallback() { * Write {@code b} to the stream and take the ownership of {@code b}. * If the stream is empty, {@code b} itself will be used as the content of the stream and * no content copy will be involved. + * *

    Note: After passing any byte array to this method, it must not be modified again. * * @throws IOException diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/PerKeyCombineFnRunners.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/PerKeyCombineFnRunners.java index 6606c5451f..a11b275a4f 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/PerKeyCombineFnRunners.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/PerKeyCombineFnRunners.java @@ -49,7 +49,7 @@ public class PerKeyCombineFnRunners { /** * An implementation of {@link PerKeyCombineFnRunner} with {@link KeyedCombineFn}. * - * It forwards functions calls to the {@link KeyedCombineFn}. + *

It forwards function calls to the {@link KeyedCombineFn}. */ private static class KeyedCombineFnRunner implements PerKeyCombineFnRunner { @@ -145,7 +145,7 @@ public AccumT compact(K key, AccumT accumulator, PipelineOptions options, /** * An implementation of {@link PerKeyCombineFnRunner} with {@link KeyedCombineFnWithContext}. * - *

It forwards function calls to the {@link KeyedCombineFnWithContext}. */ private static class KeyedCombineFnWithContextRunner implements PerKeyCombineFnRunner { diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/PubsubClient.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/PubsubClient.java index 2b24dd9891..e544bde333 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/PubsubClient.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/PubsubClient.java @@ -85,6 +85,7 @@ private static Long asMsSinceEpoch(@Nullable String timestamp) { /** * Return the timestamp (in ms since unix epoch) to use for a Pubsub message with {@code * attributes} and {@code pubsubTimestamp}. + *

    If {@code timestampLabel} is non-{@literal null} then the message attributes must contain * that label, and the value of that label will be taken as the timestamp. * Otherwise the timestamp will be taken from the Pubsub publish timestamp {@code @@ -298,6 +299,7 @@ public static TopicPath topicPathFromName(String projectId, String topicName) { /** * A message to be sent to Pubsub. + * *
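For illustration only (not part of this patch): a minimal sketch of the selection rule just described; extractTimestamp is a hypothetical helper, asMsSinceEpoch is the private helper shown above, and checkArgument is Guava's Preconditions.checkArgument.

  private static long extractTimestamp(
      @Nullable String timestampLabel, Map<String, String> attributes, String pubsubTimestamp) {
    if (timestampLabel != null) {
      String value = attributes.get(timestampLabel);
      checkArgument(value != null, "message is missing timestamp attribute %s", timestampLabel);
      return asMsSinceEpoch(value);  // the attribute value wins when a label is configured
    }
    return asMsSinceEpoch(pubsubTimestamp);  // otherwise use the publish time
  }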

    NOTE: This class is {@link Serializable} only to support the {@link PubsubTestClient}. * Java serialization is never used for non-test clients. */ @@ -356,6 +358,7 @@ public int hashCode() { /** * A message received from Pubsub. + * *

    NOTE: This class is {@link Serializable} only to support the {@link PubsubTestClient}. * Java serialization is never used for non-test clients. */ diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/PubsubTestClient.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/PubsubTestClient.java index df8c1a72e6..c3a5a4e959 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/PubsubTestClient.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/PubsubTestClient.java @@ -42,7 +42,7 @@ public class PubsubTestClient extends PubsubClient { /** * Mimic the state of the simulated Pubsub 'service'. * - * Note that the {@link PubsubTestClientFactory} is serialized/deserialized even when running + *

    Note that the {@link PubsubTestClientFactory} is serialized/deserialized even when running * test pipelines. Meanwhile it is valid for multiple {@link PubsubTestClient}s to be created * from the same client factory and run in parallel. Thus we can't enforce aliasing of the * following data structures over all clients and must resort to a static. diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/RandomAccessData.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/RandomAccessData.java index 6c96c8e703..359e40ff64 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/RandomAccessData.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/RandomAccessData.java @@ -44,8 +44,8 @@ * also provides random access to bytes stored within. This wrapper allows users to finely * control the number of byte copies that occur. * - * Anything stored within the in-memory buffer from offset {@link #size()} is considered temporary - * unused storage. + *

    Anything stored within the in-memory buffer from offset {@link #size()} is considered + * temporary unused storage. */ @NotThreadSafe public class RandomAccessData { @@ -54,7 +54,7 @@ public class RandomAccessData { * This follows the same encoding scheme as {@link ByteArrayCoder}. * This coder is deterministic and consistent with equals. * - * This coder does not support encoding positive infinity. + *

    This coder does not support encoding positive infinity. */ public static class RandomAccessDataCoder extends AtomicCoder { private static final RandomAccessDataCoder INSTANCE = new RandomAccessDataCoder(); @@ -192,7 +192,7 @@ public int commonPrefixLength(RandomAccessData o1, RandomAccessData o2) { * is strictly greater than this. Note that if this is empty or is all 0xFF then * a token value of positive infinity is returned. * - * The {@link UnsignedLexicographicalComparator} supports comparing {@link RandomAccessData} + *

The {@link UnsignedLexicographicalComparator} supports comparing {@link RandomAccessData} * with support for positive infinity. */ public RandomAccessData increment() throws IOException { diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/TimerInternals.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/TimerInternals.java index 8cbd393da6..57518b1ac3 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/TimerInternals.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/TimerInternals.java @@ -111,7 +111,7 @@ public interface TimerInternals { *

2. However, it will never be behind the global input watermark for any following computation. *
* - *

In pictures: + *

In pictures: *

    *  |              |       |       |       |
    *  |              |   D   |   C   |   B   |   A
diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/common/Counter.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/common/Counter.java
index 5dda81189e..0b73d382b0 100644
--- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/common/Counter.java
+++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/common/Counter.java
@@ -364,11 +364,11 @@ public boolean committing() {
   /**
    * Changes the counter from {@code CommitState.COMMITTING} to {@code CommitState.COMMITTED}.
    *
-   * @return true if successful.
-   *
    * 

False return indicates that the counter was updated while the commit is pending. * That counter update might or might not have been committed. The {@code commitState} has to * stay in {@code CommitState.DIRTY}. + * + * @return true if successful. */ public boolean committed() { return commitState.compareAndSet(CommitState.COMMITTING, CommitState.COMMITTED); } diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/values/PInput.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/values/PInput.java index c3805c7e49..77afe9d378 100644 --- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/values/PInput.java +++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/values/PInput.java @@ -46,7 +46,7 @@ public interface PInput { Collection<? extends PValue> expand(); /** - *

After building, finalizes this {@code PInput} to make it ready for + * After building, finalizes this {@code PInput} to make it ready for * being used as an input to a {@link com.google.cloud.dataflow.sdk.transforms.PTransform}. * *

Automatically invoked whenever {@code apply()} is invoked on diff --git a/sdk/src/test/java/com/google/cloud/dataflow/sdk/testing/SystemNanoTimeSleeper.java b/sdk/src/test/java/com/google/cloud/dataflow/sdk/testing/SystemNanoTimeSleeper.java index d8507f79b0..4a727bc7a7 100644 --- a/sdk/src/test/java/com/google/cloud/dataflow/sdk/testing/SystemNanoTimeSleeper.java +++ b/sdk/src/test/java/com/google/cloud/dataflow/sdk/testing/SystemNanoTimeSleeper.java @@ -28,7 +28,7 @@ * href="https://blogs.oracle.com/dholmes/entry/inside_the_hotspot_vm_clocks"> * article goes into further detail about this issue. * - * This {@link Sleeper} uses {@link System#nanoTime} + *

This {@link Sleeper} uses {@link System#nanoTime} * as the timing source and {@link LockSupport#parkNanos} as the wait method. * Note that usage of this sleeper may impact performance because * of the relatively more expensive methods being invoked when compared to From 66f664ce8fadd5765d98912101be60d605c4680f Mon Sep 17 00:00:00 2001 From: Dan Halperin Date: Tue, 13 Dec 2016 08:13:47 -0800 Subject: [PATCH 2/2] Kafka contrib: checkstyle fixes --- .../google/cloud/dataflow/contrib/kafka/KafkaIO.java | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/contrib/kafka/src/main/java/com/google/cloud/dataflow/contrib/kafka/KafkaIO.java b/contrib/kafka/src/main/java/com/google/cloud/dataflow/contrib/kafka/KafkaIO.java index 47335bdbc9..67d523a1ac 100644 --- a/contrib/kafka/src/main/java/com/google/cloud/dataflow/contrib/kafka/KafkaIO.java +++ b/contrib/kafka/src/main/java/com/google/cloud/dataflow/contrib/kafka/KafkaIO.java @@ -106,7 +106,7 @@ *

Although most applications consume a single topic, the source can be configured to consume * multiple topics or even a specific set of {@link TopicPartition}s. * - *

To configure a Kafka source, you must specify at the minimum Kafka bootstrapServers + *

To configure a Kafka source, you must specify at the minimum Kafka bootstrapServers * and one or more topics to consume. The following example illustrates various options for * configuring the source: * @@ -153,7 +153,7 @@ * *
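For illustration only (not part of this patch): a minimal read sketch using the builder methods this module provides; broker addresses, topic names, and coder choices are hypothetical.

  pipeline.apply(KafkaIO.read()
      .withBootstrapServers("broker_1:9092,broker_2:9092")
      .withTopics(ImmutableList.of("topic_a"))
      // Coders for the consumed keys and values.
      .withKeyCoder(StringUtf8Coder.of())
      .withValueCoder(StringUtf8Coder.of()));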

Writing to Kafka

* - * KafkaIO sink supports writing key-value pairs to a Kafka topic. Users can also write just the + *

KafkaIO sink supports writing key-value pairs to a Kafka topic. Users can also write just the * values. To configure a Kafka sink, you must specify at the minimum Kafka * bootstrapServers and the topic to write to. The following example illustrates various * options for configuring the sink: @@ -175,7 +175,7 @@ * ); * }

* - * Often you might want to write just values without any keys to Kafka. Use {@code values()} to + *

Often you might want to write just values without any keys to Kafka. Use {@code values()} to * write records with a default empty (null) key: * *

{@code
@@ -578,7 +578,7 @@ private TypedRead(
     }
 
     /**
-     * Creates an {@link UnboundedSource<KafkaRecord<K, V>, ?>} with the configuration in {@link
+     * Creates an {@link UnboundedSource} with the configuration in {@link
      * TypedRead}. Primary use case is unit tests, should not be used in an application.
      */
     @VisibleForTesting
@@ -751,7 +751,7 @@ public int compare(TopicPartition tp1, TopicPartition tp2) {
     /**
      * Returns one split for each of the Kafka partitions.
      *
-     * 

It is important to sort the partitions deterministically so that we can support + *

It is important to sort the partitions deterministically so that we can support * resuming a split from the last checkpoint. The Kafka partitions are sorted by {@code <topic, partition>}. */