-
Notifications
You must be signed in to change notification settings - Fork 96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix to use physical name instead of logical name to retrieve available interval #226
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs CHANGELOG
.collect(Collectors.toMap( | ||
Function.identity(), | ||
column -> availableIntervals.get(getSchema().getPhysicalColumnName(column.getName())) | ||
)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should wrap down 1 more level (ie. unwrap the .collect(Collectors.toMap(
line, and corresponding ));
line
return getSchema().getColumns().stream() | ||
.collect(Collectors.toMap( | ||
Function.identity(), | ||
column -> availableIntervals.get(getSchema().getPhysicalColumnName(column.getName())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be getOrDefault
?
@@ -106,7 +104,7 @@ public SimplifiedIntervalList getAvailableIntervals(DataSourceConstraint constra | |||
return metadataService.getAvailableIntervalsByTable(name); | |||
} | |||
|
|||
protected Set<String> getColumnNames() { | |||
return columnNames; | |||
protected Set<String> getColumnPhysicalNames() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we rename this physicalColumnNames
? columnPhysicalNames
feels a bit backwards, for some reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same reason as we don't have physicalColumn names
* | ||
* @param dataSourceConstraint The data source constarint to copy from | ||
*/ | ||
public DataSourceConstraint(DataSourceConstraint dataSourceConstraint) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be protected
, not public
* | ||
* @return the physical name of all the columns | ||
*/ | ||
public Set<String> getAllColumnPhysicalNames() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we rename this physicalColumnNames
? columnPhysicalNames
feels a bit backwards, for some reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have PhysicalColumn
s, we do have Column
physical names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see it more as having the "physical" "column names", rather than "physical column" "names", but I suppose it's not a big difference either way 😄
* @param dataSourceConstraint Data soruce contraint containing all the column names as logical names | ||
* @param physicalTableSchema A map from logical column name to physical column names | ||
*/ | ||
public PhysicalDataSourceConstraint(DataSourceConstraint dataSourceConstraint, PhysicalTableSchema physicalTableSchema) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than depending on the PhysicalTableSchema
, we should just take the one mapping function we care about:
Function<String, String> logicalToPhysicalNameMapper
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also need all the columns from it, might as well pass the entire schema.
97bfd0d
to
e2a36f1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think moving the toNames... mapping to schema will greatly simplify this.
It'll also need tests.
CHANGELOG.md
Outdated
- [Fix to use physical name instead of logical name to retrieve available interval](https://github.com/yahoo/fili/pull/226) | ||
* `getAllAvailbleIntervals` in `ConcreteAvailability` no longer filters out unconfigured columns, instead table's `getAllAvailbleIntervals` does | ||
* `getAvailbleIntervals` in `Availbality` now takes `PhysicalDataSourceConstraint` instead of `DataSourceConstraint` | ||
* `Availbility` no longer takes a set of columns on the table, only table needs to know |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spelling error in Availability
CHANGELOG.md
Outdated
* `getAvailbleIntervals` in `Availbality` now takes `PhysicalDataSourceConstraint` instead of `DataSourceConstraint` | ||
* `Availbility` no longer takes a set of columns on the table, only table needs to know | ||
* `getAllAvailbleIntervals` in `Availbility` now returns a map of column physical name to interval list instead of column to interval list | ||
* `TestDataSourceMetadataService` now takes map from string to list of intervals instead of column to list of intervals for constructor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Availbility -> Availability
public SimplifiedIntervalList getAvailableIntervals(PhysicalDataSourceConstraint constraint) { | ||
Map<String, List<Interval>> allAvailableIntervals = getAllAvailableIntervals(); | ||
|
||
return constraint.getAllColumnPhysicalNames().stream() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a change in behavior and I don't think it is correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I think @garyluoex is right, in that it should work as long as PhysicalDataSourceConstraint::buildWithSchemaUnion
does the right thing. That said, I think the JavaDoc on this method needs significant upgrades to reflect the new assumptions this method has, in that the unioning needs to have already been done to the inbound constraint
object.
And with this new formulation, there is significant overlap between this method and the method in ConcreteAvailability
, even though they do it a little differently. Here are the relevant portions:
// ConcreteAvailability
.map(physicalName -> allAvailableIntervals.getOrDefault(physicalName, Collections.emptyList()))
.map(intervals -> (Collection<Interval>) intervals)
.reduce(null, IntervalUtils::getOverlappingSubintervals)
// PermissiveAvailability
.map(columnName -> allAvailableIntervals.getOrDefault(columnName, Collections.emptyList()))
.flatMap(List::stream)
.collect(SimplifiedIntervalList.getCollector());
If that core can be lifted into a common protected helper in ConcreteAvailability
, that would be nice.
Set<String> schemaColumnNames = physicalTableSchema.getColumns().stream() | ||
.map(Column::getName) | ||
.collect(Collectors.toSet()); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should just be a get method on physical table schema
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that there is such a method, this code should use it 😉
e2a36f1
to
8b6212e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move:
-
Set<String> schemaColumnNames = physicalTableSchema.getColumns().stream()
-
.map(Column::getName)
-
.collect(Collectors.toSet());
behavior to the schema class as a get method.
Also, I believe permissive availability is incorrect. Constraints should be ignored for getAvailableIntervals.
* Instead of returning the intersection of all available intervals, this method returns the union of them. | ||
* | ||
* @param ignoredConstraints Data constraint containing columns and api filters. Constrains are ignored, because | ||
* @param constraint Data constraint containing columns and api filters. Constrains are ignored, because |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was this changed? I'm pretty sure that the original 'ignored' was intentional.
Also, spelling 'Constraints'
@@ -49,15 +50,14 @@ class PermissiveAvailabilitySpec extends Specification { | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test is incorrect to design. Create a 'long_interval' column and confirm that even when not selecte that the getAvailableIntervals()l is always the long_interval or the union of the long_interval with any non overlapping intervals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What? I'm not sure I understand the problem here...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically a column that doesn't participate at all should still overwhelm everything else if it's big in time.
this.allDimensions = dataSourceConstraint.getAllDimensions(); | ||
this.allDimensionNames = this.getAllDimensionNames(); | ||
this.allColumnNames = this.getAllColumnNames(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since all private instance variables are final, should we use defensive copy? @cdeszaq @michael-mclawhorn
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, protected
and private
constructors like this are relying on the fact that the external interfaces are ensuring the collections like this are immutable, so there's usually no need to defensively copy collections for those sorts of copy-constructors.
tldr: In this case, it's OK to use the item directly, not need to defensively copy
* @param physicalTableSchema A map from logical column name to physical column names | ||
*/ | ||
public PhysicalDataSourceConstraint( | ||
DataSourceConstraint dataSourceConstraint, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we missing @NotNull
?
*/ | ||
public PhysicalDataSourceConstraint( | ||
DataSourceConstraint dataSourceConstraint, | ||
PhysicalTableSchema physicalTableSchema |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we missing @NotNull
?
|
||
[d1, d2, m1, m2, m3].each { | ||
availabilityMap1.put(toColumn(it), [interval1].toSet()) | ||
availabilityMap2.put(toColumn(it), [interval2].toSet()) | ||
availabilityMap1.put(toColumn(it).getName(), [interval1].toSet()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess [interval1] as Set
is more readable?
(new Column('ignored')): [new Interval('2010-01-01/2500-12-31')] as Set | ||
(columnPhysicalName1): [interval1] as Set, | ||
(columnPhysicalName2): [interval2] as Set, | ||
'hidden_column' : [interval1] as Set |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
small - colon allignment
(column3): [], | ||
(columnPhysicalName1): [interval1], | ||
(columnPhysicalName2): [interval2], | ||
'hidden_column' : [interval1], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
small - colon allignment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't go deeply into the tests, but I got through the rest of it.
this.allColumnPhysicalNames = dataSourceConstraint.getAllColumnNames().stream() | ||
.filter(schemaColumnNames::contains) | ||
.map(physicalTableSchema::getPhysicalColumnName) | ||
.collect(Collectors.toSet()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be made unmodifiable, so that when we expose it through the getter, it's not messed with. Collectors.collectinAndThen
is a useful tool for this task.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still needs to be addressed
|
||
this.allColumnPhysicalNames = schemaColumnNames.stream() | ||
.map(physicalTableSchema::getPhysicalColumnName) | ||
.collect(Collectors.toSet()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be made unmodifiable, so that when we expose it through the getter, it's not messed with. Collectors.collectinAndThen
is a useful tool for this task.
*/ | ||
private PhysicalDataSourceConstraint( | ||
PhysicalTableSchema physicalTableSchema, | ||
DataSourceConstraint dataSourceConstraint |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should definitely not have 2 constructors that only differ by the order of their parameters. We should find a better way to do this.
One idea that comes to mind would be to have a private constructor that takes the dataSourceConsraint
and the collection of allColumnPhysicalNames
directly, and then allow the "smarts" to be handled elsewhere.
} | ||
|
||
/** | ||
* Private Constructor that union columns in constraint and schema, use buildWithSchemaUnion to invoke. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This constructor doesn't look like it does a union of the columns in the constraint and the schema. It looks like it's only pulling columns from the schema. Am I missing something?
.collect(Collectors.toSet()); | ||
|
||
this.allColumnPhysicalNames = schemaColumnNames.stream() | ||
.map(physicalTableSchema::getPhysicalColumnName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At least part of this processing is shared between the two constructors:
this.allColumnPhysicalNames = <columnNames>.stream()
.map(physicalTableSchema::getPhysicalColumnName)
.collect(Collectors.collectingAndThen(Collectors.toSet(), Collections::unmodifiableSet));
It would be nice to remove this duplication.
* | ||
* @return a new physical data source constraint with all columns in schema | ||
*/ | ||
public static PhysicalDataSourceConstraint buildWithSchemaUnion( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to never be called by a test. Perhaps it should be?
@@ -79,12 +82,21 @@ public DateTime getTableAlignment() { | |||
|
|||
@Override |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should JavaDoc be added here to call out the filtering that happens so that only columns that are in the Schema for this table are returned?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still needs to be addressed I think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added comment on the interface
return getAvailability().getAvailableIntervals(PhysicalDataSourceConstraint.buildWithSchemaUnion( | ||
constraint, | ||
getSchema() | ||
)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should wrap differently:
return getAvailability().getAvailableIntervals(
PhysicalDataSourceConstraint.buildWithSchemaUnion(constraint, getSchema())
);
In general, the outer-most param list is where the wrapping should occur, rather than the inner-most, in a case like this where there are nested calls.
|
||
// Need to ensure requestColumns is not empty in order to prevent returning null by reduce operation | ||
return new SimplifiedIntervalList( | ||
requestColumns.stream() | ||
.map(columnName -> allAvailableIntervals.getOrDefault(columnName, Collections.emptyList())) | ||
.map(physicalName -> allAvailableIntervals.getOrDefault(physicalName, Collections.emptyList())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to use new SimplifiedIntervalList()
instead of Collections.emptyList()
for the default value here? It would bring it in line with what getAllAvailableIntervals()
is doing in the BasePhysicalTable
.
It may also allow for some optimizations or smarts with the rest of this stream, possibly letting us drop the wrapping construction of new SimplifiedIntervalList
that is ultimately getting returned here if we can build one from this stream directly...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would change things to Collections.emptyList
since it is shorter and looks less expensive than creating a new SimplifiedIntervalList
, let me know otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should use a SIL as it's default. The biggest reasons is that it allows the following map
and reduce
calls to use SIL-aware methods which can be faster, since an SIL has known ordering and compactness qualities that a List
does not. It also allows the SIL produced by that reduction to be returned directly, rather than having to convert a List
into an SIL.
So, while Collections.emptyList()
may seem shorter and faster than new SimplifiedIntervalList()
, the net result may actually be that using SIL everywhere is faster overall.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still needs addressing?
public SimplifiedIntervalList getAvailableIntervals(PhysicalDataSourceConstraint constraint) { | ||
Map<String, List<Interval>> allAvailableIntervals = getAllAvailableIntervals(); | ||
|
||
return constraint.getAllColumnPhysicalNames().stream() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I think @garyluoex is right, in that it should work as long as PhysicalDataSourceConstraint::buildWithSchemaUnion
does the right thing. That said, I think the JavaDoc on this method needs significant upgrades to reflect the new assumptions this method has, in that the unioning needs to have already been done to the inbound constraint
object.
And with this new formulation, there is significant overlap between this method and the method in ConcreteAvailability
, even though they do it a little differently. Here are the relevant portions:
// ConcreteAvailability
.map(physicalName -> allAvailableIntervals.getOrDefault(physicalName, Collections.emptyList()))
.map(intervals -> (Collection<Interval>) intervals)
.reduce(null, IntervalUtils::getOverlappingSubintervals)
// PermissiveAvailability
.map(columnName -> allAvailableIntervals.getOrDefault(columnName, Collections.emptyList()))
.flatMap(List::stream)
.collect(SimplifiedIntervalList.getCollector());
If that core can be lifted into a common protected helper in ConcreteAvailability
, that would be nice.
52ece84
to
da27321
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think all my concerns are answered.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's just 3 things outstanding from my perspective. I've commented on those items indicating they still need to be addressed.
Also, this needs to be rebased onto master.
da27321
to
f038837
Compare
f038837
to
f5e3d97
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few more small changes
// get a map from availability to its available metric columns intersected with configured metric columns | ||
// i.e. metricColumns | ||
availabilitiesToAvailableColumns = physicalTables.stream() | ||
// map each metric to its corresponding availability object |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment change is losing the fact that this is intersected with the configured (ie. passed in) metric columns.
This comment change is also losing the fact that this is building Map<Availability, Set<metric column>>
, rather than Map<metric column, Availability>
, which seems to be what the comment is saying
Map.Entry::getValue | ||
) | ||
); | ||
return availabilityToMetricNames.entrySet().stream() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could just use .values()
and skip the extra map
operation
getName().asName(), | ||
getColumnNames() | ||
return String.format("PermissiveAvailability with table name = %s", | ||
getName().asName() | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't need to wrap?
.flatMap(List::stream) | ||
.collect(SimplifiedIntervalList.getCollector()); | ||
public SimplifiedIntervalList getAvailableIntervals(PhysicalDataSourceConstraint ignoredConstraint) { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can remove blank line
* @param apiFilters Map of dimension to its set of API filters | ||
* @param allColumnPhysicalNames Set of all column physical names | ||
*/ | ||
protected PhysicalDataSourceConstraint( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we have this constructor? It looks like it does exactly the same thing as the private constructor just above, except unpacks the parameters. Can we remove this one?
Set<MetricColumn> availableColumns2 = availabilitiesToAvailableColumns.get(availability2) | ||
availableColumns2.contains(metricColumn2) | ||
Set<String> availableColumns2 = availabilitiesToAvailableColumns.get(availability2) | ||
availableColumns2.contains(metric2) | ||
!availableColumns2.contains(tableColumn1) | ||
!availableColumns2.contains(tableColumn2) | ||
} | ||
|
||
@Unroll | ||
def "checkDuplicateValue returns duplicate metric column names or empty, if no duplicates, from 2 physical tables in the case of #caseDescription"() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test name needs to be updated
143b2f7
to
5e684e6
Compare
@@ -41,7 +38,7 @@ public PhysicalDataSourceConstraint( | |||
} | |||
|
|||
/** | |||
* Constructor, use with care, beware of caching behavior. | |||
* Constructor, use with care, beware of inconsistent caching behavior. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'm still confused. What inconsistencies are possible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a couple small questions. Otherwise this is good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost there from my point of view.
I think we should take on handling the 'requested but not mapped' metric column bug even if the SchemaTableResolver is currently protecting us from the worst risk.
@@ -163,6 +171,36 @@ class MetricUnionAvailabilitySpec extends Specification { | |||
|
|||
} | |||
|
|||
def "constructSubConstraint correctly intersect metrics configured on the table with the metrics supported by availability's underlying datasource"() { | |||
given: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
subject/verb agreement: 'intersects'
expect: | ||
metricUnionAvailability.constructSubConstraint(physicalDataSourceConstraint).get(availability1).getMetricNames() == [metric1] as Set | ||
metricUnionAvailability.constructSubConstraint(physicalDataSourceConstraint).get(availability2).getMetricNames() == [metric2] as Set | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The variable name for the ignored column in the mocks is confusable with the ignored request column. Change the name to something distinctive like 'unrequested' on the mock.
This test reveals another, existing problem. We should be defensive that the 'ignored2' gets handled correctly in getAvailableIntervals().
Probably a test before streaming that PhysicalDataSourceConstraint.allColumnPhysicalNames is a subset of metricNames.
To align with ConcreteAvailability it should return empty SIL.
(And a test for that case)
5e684e6
to
52598ac
Compare
public SimplifiedIntervalList getAvailableIntervals(PhysicalDataSourceConstraint constraint) { | ||
|
||
// If there are columns requested that are not configured on this table | ||
if (constraint.getAllColumnPhysicalNames().size() != constraint.getAllColumnNames().size()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will likely always be true, due to the defaulting behavior of how mapping from logicalToPhysicalName
works. We probably need to look at the union of the columns in the subconstraints to see if any of the requested columns are not present in the subconstraints.
… data source constraint
if (!constraint.getAllColumnNames().stream().allMatch(tableColumnNames::contains)) { | ||
String message = String.format( | ||
"Received invalid request requesting for columns: %s that is not available in this table: %s", | ||
invalidColumnNames.stream().collect(Collectors.joining(",")), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than computing invalidColumnNames
every time, let's replace the formatting stream here with the computation for the invalid names:
Set<String> constraintColumnNames = constraint.getAllColumnNames();
// ...
constraintColumnNames.stream().filter(name -> !tableColumnNames.contains(name)).collect(Collectors.joining(","))
new Column("MissingName") | [] as Set | ||
} | ||
|
||
@Unroll |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for @Unroll
here, since there's no where
block and no templated test name.
Just those few trivial things, and this will merge. |
PhysicalDataSourceConstraint
class to capture physical names of columns for retrieving availble intervalgetAllAvailbleIntervals
inConcreteAvailability
no longer filters out unconfigured columns, instead table'sgetAllAvailbleIntervals
doesgetAvailbleIntervals
inAvailbality
now takesPhysicalDataSourceConstraint
instead ofDataSourceConstraint
Availbility
no longer takes a set of columns on the table, only table needs to knowgetAllAvailbleIntervals
inAvailbility
now returns a map of column physical name to interval list instead of column to interval listTestDataSourceMetadataService
now takes map from string to list of intervals instead of column to list of intervals for constructor