[BEAM-2676] move BeamSqlRow and BeamSqlRowType to sdk/java/core #3633

mingmxu · 2017-07-24T23:11:44Z

This PR contains two parts:

1. Remove the word "SQL" from the class names:
   BeamSqlRow --> BeamRow
   BeamSqlRowType --> BeamRowType
2. Move the classes from package org.apache.beam.dsls.sql.schema to org.apache.beam.sdk.sd ("sd" stands for structured data), in module beam-sdks-java-core.

Hint:
The 4 files are changed to remove dependencies on Calcite/BeamSql; the others were updated automatically by the IDE because of the class renames:

mingmxu · 2017-07-24T23:34:53Z

R: @robertwb

robertwb

Looking good. Just a couple of questions.

robertwb · 2017-07-25T00:23:44Z

sdks/java/core/src/main/java/org/apache/beam/sdk/sd/package-info.java

+ * <p>Similar as the <em>row</em> concept in database, {@link org.apache.beam.sdk.sd.BeamRow}
+ * represents one row element in a {@link org.apache.beam.sdk.values.PCollection<BeamRow>}.
+ * Limited SQL types are supported now, visit
+ * <a href="https://beam.apache.org/blog/2017/07/21/sql-dsl.html#data-type">data types</a>


I assume this blog post is pending? Will the date embedded in the URL change?

The URL will not change as it's determined as below:

+---
+layout: post
+title: "Use Beam SQL DSL to build a pipeline"
+date: 2017-07-21 00:00:00 -0800
+excerpt_separator:
+categories: blog
+authors:
+ - mingmxu
+---

robertwb · 2017-07-25T00:29:15Z

sdks/java/core/src/main/java/org/apache/beam/sdk/sd/package-info.java

+ * <a href="https://beam.apache.org/blog/2017/07/21/sql-dsl.html#data-type">data types</a>
+ * for more details.
+ */
+package org.apache.beam.sdk.sd;


I assume "sd" stands for "structured data" or similar?

Right, sd stands for structured data.

robertwb · 2017-07-25T00:34:46Z

sdks/java/core/src/main/java/org/apache/beam/sdk/sd/BeamRowType.java

  public abstract List<String> getFieldsName();
  public abstract List<Integer> getFieldsType();

-  public static BeamSqlRowType create(List<String> fieldNames, List<Integer> fieldTypes) {
-    return new AutoValue_BeamSqlRowType(fieldNames, fieldTypes);
+  public static BeamRowType create(List<String> fieldNames, List<Integer> fieldTypes) {


This interface (taking a list of names, then a parallel list of types) seems fragile and prone to off-by-one errors. If columns are always referred to by name (not index) in the API, could this just be a Map? Otherwise, some kind of an OrderedMap (or list of pairs)?

That's a good point; it doesn't verify that the sizes of the names and types lists match. I'll open another task to address it. A check like fieldNames.size() == fieldTypes.size() is enough, I think.

Columns in BeamRow are accessed both by name and by index.

Created BEAM-2678.

That would help, but it's still messy to have the relationship fall to simply having the same index in a parallel list. I'll comment on the bug, as this is somewhat orthogonal to this PR.

Yup, a List<KV<FieldName, Types>> may be better.
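To make the two options discussed above concrete, here is a minimal sketch that both validates the parallel lists and stores the columns as an ordered list of (name, type) pairs, so lookups by name and by index remain possible. The class RowTypeSketch, its nested Field class, and the methods getFieldCount, getField, and indexOf are hypothetical names chosen for illustration; this is not the actual BeamRowType, nor necessarily what BEAM-2678 ended up doing.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch for illustration only; not the actual BeamRowType. */
final class RowTypeSketch {
  /** One column: a name paired with a java.sql.Types constant. */
  public static final class Field {
    public final String name;
    public final int sqlType;

    public Field(String name, int sqlType) {
      this.name = name;
      this.sqlType = sqlType;
    }
  }

  // Ordered list of pairs, so a name can never drift away from its type.
  private final List<Field> fields;

  private RowTypeSketch(List<Field> fields) {
    this.fields = Collections.unmodifiableList(fields);
  }

  /** Accepts the existing parallel-list signature, but rejects mismatched sizes up front. */
  public static RowTypeSketch create(List<String> fieldNames, List<Integer> fieldTypes) {
    if (fieldNames.size() != fieldTypes.size()) {
      throw new IllegalArgumentException(
          "fieldNames has " + fieldNames.size()
              + " entries but fieldTypes has " + fieldTypes.size());
    }
    List<Field> fields = new ArrayList<>();
    for (int i = 0; i < fieldNames.size(); i++) {
      fields.add(new Field(fieldNames.get(i), fieldTypes.get(i)));
    }
    return new RowTypeSketch(fields);
  }

  public int getFieldCount() {
    return fields.size();
  }

  /** Access by index. */
  public Field getField(int index) {
    return fields.get(index);
  }

  /** Access by name: returns the column index, or -1 if no column has that name. */
  public int indexOf(String name) {
    for (int i = 0; i < fields.size(); i++) {
      if (fields.get(i).name.equals(name)) {
        return i;
      }
    }
    return -1;
  }
}
```

Keeping the factory signature unchanged means callers would not need to be touched, while the internal pairing removes the reliance on matching indexes across two lists.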

robertwb · 2017-07-25T00:39:46Z

sdks/java/core/src/main/java/org/apache/beam/sdk/sd/BeamRowCoder.java


 import java.io.IOException;
 import java.io.InputStream;
 import java.io.OutputStream;
+import java.sql.Types;


Is this a safe dependency?

Yes, java.sql.Types is part of the JDK.
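As a small illustration of why this import adds no third-party dependency: java.sql.Types only defines int constants, which matches the List<Integer> used for field types in this PR. The helper class SqlTypeNames and its describe method below are hypothetical and exist only for this sketch.

```java
import java.sql.Types;

/** Hypothetical helper for illustration; not part of the PR. */
final class SqlTypeNames {
  /** Maps a few java.sql.Types constants (plain ints) to readable names. */
  static String describe(int sqlType) {
    switch (sqlType) {
      case Types.INTEGER:
        return "INTEGER";
      case Types.BIGINT:
        return "BIGINT";
      case Types.VARCHAR:
        return "VARCHAR";
      default:
        return "UNSUPPORTED(" + sqlType + ")";
    }
  }
}
```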

mingmxu · 2017-07-25T06:17:20Z

retest this please

robertwb

LGTM, pending tests passing.

coveralls · 2017-07-25T21:40:29Z

Changes Unknown when pulling c33150d on XuMingmin:BEAM-2676 into apache:DSL_SQL.

mingmxu · 2017-07-26T17:09:25Z

@robertwb could you merge this task?

Also #3628 as I want to use CombineFnTester to test an aggregation function in Beam SQL.

mingmxu · 2017-07-26T22:41:22Z

R: + @lukecwik

coveralls · 2017-07-27T00:09:48Z

Changes Unknown when pulling 5e65d6f on XuMingmin:BEAM-2676 into apache:DSL_SQL.

mingmxu · 2017-07-27T21:48:17Z

Moved back to DSL/SQL, as there are some concerns that this is SQL-specific rather than a generic Tuple solution.

coveralls · 2017-07-27T23:18:05Z

Changes Unknown when pulling b56234d on XuMingmin:BEAM-2676 into apache:DSL_SQL.

mingmxu · 2017-07-27T23:25:56Z

retest this please

mingmxu · 2017-07-28T04:48:56Z

retest this please

coveralls · 2017-07-28T06:29:56Z

Changes Unknown when pulling b56234d on XuMingmin:BEAM-2676 into apache:DSL_SQL.

robertwb · 2017-07-31T18:13:17Z

Could you point me to where these concerns were discussed?

mingmxu · 2017-07-31T18:28:06Z

You can find the comments in the JIRA ticket.

…


mingmxu · 2017-08-02T07:01:42Z

Closing this PR; I'll prepare a new one to avoid the huge rebase work.

move BeamSqlRow and BeamSqlRowType to sdk/java/core

5b38211

mingmxu force-pushed the BEAM-2676 branch from 8e93828 to 5b38211 on July 24, 2017 23:15

robertwb reviewed Jul 25, 2017


robertwb approved these changes Jul 25, 2017


fix an error in JavaDoc.

c33150d

move to package org.apache.beam.sdk.values

5e65d6f

move back to DSL/SQL

b56234d

mingmxu closed this Aug 2, 2017
