[BEAM-2515] BeamSql: refactor the MockedBeamSqlTable and related tests#3478
xumingming wants to merge 5 commits into apache:DSL_SQL
R: @xumingmin
takidau
left a comment
Nice. A couple javadoc requests, but otherwise looks solid.
    this.rows.add(row);
  }

  public RowsBuilder addRows(final Object... args) {
 */
public class MockedBoundedTable extends MockedTable {
  public static final ConcurrentLinkedQueue<BeamSqlRow> CONTENT = new ConcurrentLinkedQueue<>();
  private List<BeamSqlRow> rows = new ArrayList<>();
    return new MockedBoundedTable(buildBeamSqlRecordType(args));
  }

  public MockedBoundedTable addRows(Object... args) {
 * Mocked table for bounded data sources.
 */
public class MockedBoundedTable extends MockedTable {
  public static final ConcurrentLinkedQueue<BeamSqlRow> CONTENT = new ConcurrentLinkedQueue<>();
mingmxu
left a comment
The new mock package is much clearer. Just one question about the test data set: I see each unit test class prepares its own table. Is it possible to test on the same set?
public class MockedUnboundedTable extends MockedTable {
  private List<Pair<Duration, List<BeamSqlRow>>> timestampedRows = new ArrayList<>();
  /** rows flow out from this table with the specified watermark instant. */
  private final List<Pair<Duration, List<BeamSqlRow>>> timestampedRows = new ArrayList<>();
For L102-L105, would it be better to add the data first, and then advance the watermark?
Actually, I did it on purpose. Let's look at some examples. First example:
    addRows(
        Duration.ZERO,
        1, 1, 1, FIRST_DATE,
        1, 2, 2, FIRST_DATE
    )
This means the data is added after the specified watermark (ZERO), that is, just after the first window starts. Another example:
    .addRows(
        WINDOW_SIZE.minus(Duration.standardSeconds(1)),
        2, 2, 3, SECOND_DATE,
        2, 3, 3, SECOND_DATE
    )
This means the data is emitted just before the first window closes. One more example:
    .addRows(
        WINDOW_SIZE.plus(Duration.standardSeconds(1)),
        2, 2, 3, SECOND_DATE,
        2, 3, 3, SECOND_DATE
    )
This means the data is emitted just after the first window closes, and after the second window starts.
Very handy and easy to use, right? :)
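The ordering described above (advance the watermark first, then emit the rows) can be pictured with a small stand-alone sketch. It does not use Beam: `TimelineSketch`, its `addRows` method, and the event strings are hypothetical names chosen for illustration, `java.time.Duration` stands in for the `Duration` type used in the PR, and the four-fields-per-row width is an assumption taken from the examples.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a mocked unbounded table; not the Beam class. */
final class TimelineSketch {
  // Assumption: every row has four fields, as in the examples above.
  private static final int FIELDS_PER_ROW = 4;

  private final List<String> events = new ArrayList<>();

  /** Records the watermark advance first, then one "emit" event per row. */
  TimelineSketch addRows(Duration watermarkOffset, Object... flatValues) {
    if (flatValues.length % FIELDS_PER_ROW != 0) {
      throw new IllegalArgumentException("values do not fill whole rows");
    }
    events.add("advance watermark to +" + watermarkOffset.getSeconds() + "s");
    for (int i = 0; i < flatValues.length; i += FIELDS_PER_ROW) {
      Object[] row = Arrays.copyOfRange(flatValues, i, i + FIELDS_PER_ROW);
      events.add("emit " + Arrays.toString(row));
    }
    return this;
  }

  /** Returns the recorded events in the order they would be replayed. */
  List<String> events() {
    return events;
  }
}
```

Chaining two `addRows` calls yields a timeline in which each batch of rows sits directly after the watermark advance it was declared with, which is the behaviour the reply argues for.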
@xumingmin For the test data set: sometimes we need to test the normal case, where the test should pass, and sometimes the exception case, where the test needs to fail; the test data for these two kinds of cases will differ. From another perspective: if we want to use one common test data set to test every case of every relational algebra operation (join, union, values, etc.), the data set will need a lot of diversity. Any change to the common data set will then affect all of the tests, which makes them brittle and brings an unnecessary maintenance burden. I prefer to prepare a separate data set for each test.
Retest this please.
Make sense for different functions like JOIN/UNION/..., would be helpful to separate the
I agree, the data set can be reused to test DSL methods, but let's leave that to the DSL unit tests PR, when it is needed?
@lukecwik can you help merge this one? @xumingmin and @takidau have already reviewed and approved this PR; it just needs a merge. (Another issue is blocked by this one.)
Merged, please close PR |
Summary:
- MockedBeamSqlTable to MockedBoundedTable
- RowsBuilder to build rows in unit tests.
- PAssert rather than assertEquals when testing PCollections.
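To illustrate the row-building idea from the summary, here is a minimal sketch of how a flat varargs call such as `addRows(Object... args)` might be split into rows. It is not the class from this PR: `RowsBuilderSketch` is a hypothetical name, and its `of` takes only column names (an assumption), whereas the PR's builder derives a record type from its arguments.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical, simplified row builder; not the class from this PR. */
final class RowsBuilderSketch {
  private final List<String> fieldNames;
  private final List<List<Object>> rows = new ArrayList<>();

  private RowsBuilderSketch(List<String> fieldNames) {
    this.fieldNames = fieldNames;
  }

  /** Declares the columns, e.g. of("order_id", "site_id", "price"). */
  static RowsBuilderSketch of(String... fieldNames) {
    return new RowsBuilderSketch(Arrays.asList(fieldNames));
  }

  /** Splits a flat list of values into rows, one value per declared column. */
  RowsBuilderSketch addRows(Object... values) {
    int width = fieldNames.size();
    if (width == 0 || values.length % width != 0) {
      throw new IllegalArgumentException(
          "expected a multiple of " + width + " values, got " + values.length);
    }
    List<Object> flat = Arrays.asList(values);
    for (int i = 0; i < flat.size(); i += width) {
      // Copy the slice so each row is independent of the argument array.
      rows.add(new ArrayList<>(flat.subList(i, i + width)));
    }
    return this;
  }

  /** Returns a read-only view of the rows built so far. */
  List<List<Object>> getRows() {
    return Collections.unmodifiableList(rows);
  }
}
```

Rejecting a value count that does not divide evenly by the column count catches a mistyped test fixture at build time instead of producing a silently misaligned row.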