-
Notifications
You must be signed in to change notification settings - Fork 96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Partition table config #259
Conversation
6cd1154
to
3ef487e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs changelog
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small things.
ZonedTimeGrain timeGrain, | ||
Set<FieldName> metricNames, | ||
Set<? extends DimensionConfig> dimensionConfigs, | ||
Map<TableName, Map<String, Set<String>>> tablePartDefinitions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do have a DimensionName
now =)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll come back to this.
@@ -35,7 +37,8 @@ public PhysicalTableSchema( | |||
super(timeGrain, columns); | |||
this.timeGrain = timeGrain; | |||
|
|||
this.logicalToPhysicalColumnNames = Collections.unmodifiableMap(logicalToPhysicalColumnNames); | |||
this.logicalToPhysicalColumnNames = Collections.unmodifiableMap(new LinkedHashMap<> | |||
(logicalToPhysicalColumnNames)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we wrap this better?
return Objects.equals(metricNames, that.metricNames) | ||
&& Objects.equals(availabilitiesToMetricNames, that.availabilitiesToMetricNames); | ||
} | ||
return super.equals(obj); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we using super's equal if two objects are not the same?
public static final PaginationParameters ONE_RESULT = new PaginationParameters( | ||
1, | ||
1 | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can unwrap?
searchProvider.hasAnyRows(_) >> searchRowsExist | ||
|
||
expect: "filter matches when there are no constraints or they exist and return rows from search provider" | ||
expected == filter.emptyConstraintOrAnyRows(dimension, constraintMap) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Expected should be on the right hand side. IDE will display result on the left hand side as actual and right hand side as expected I think.
} | ||
|
||
expect: "True as long as the dimension that doesnt return rows is not constrained" | ||
dimensionIdFilter.apply(constraint) == success |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should probably rename success to something like, expected or participate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing too big
* @param timeGrain Zoned time grain of the table | ||
* @param metricNames The Set of metric names on the table | ||
* @param dimensionConfigs Set of dimensions on the table as dimension configs | ||
* @param tablePartDefinitions A map from table names to the dimension value matches for that table |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we expand a bit here on what the components of the Map are? Map<String, Set<String>>
doesn't tell us what the key and value should be.
*/ | ||
public DefaultingVolatileIntervalsService( | ||
DefaultingDictionary<PhysicalTable, VolatileIntervalsFunction> intervalsFunctions | ||
) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we removing this constructor outright instead of deprecating it? (call out in changelog)
* | ||
* @return true if at least one row is returned. | ||
*/ | ||
default boolean hasAnyRows(Set<ApiFilter> filters) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Call out addition in changelog.
@@ -139,4 +140,31 @@ public String getPhysicalColumnName(String logicalName) { | |||
protected void setAvailability(Availability availability) { | |||
this.availability = availability; | |||
} | |||
|
|||
@Override | |||
public boolean equals(final Object obj) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
final
not needed
|
||
@Override | ||
public boolean equals(final Object obj) { | ||
if (obj instanceof BasePhysicalTable) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should check instance equality 1st?
TableName.of("partition"), | ||
DAY_UTC, | ||
Collections.emptySet(), | ||
Collections.emptySet(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[] as Set
PartitionCompositeTable expected = new PartitionCompositeTable( | ||
TableName.of("partition"), | ||
DAY_UTC, | ||
(Set<Column>) Collections.emptySet(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[] as Set
keyField.name >> keyFieldFieldName | ||
nonKeyField.name >> nonKeyFieldName | ||
} | ||
def "healthy initialization returns correct state"() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
blank line between methods
import com.yahoo.bard.webservice.util.SimplifiedIntervalList | ||
|
||
import org.joda.time.DateTime | ||
import org.joda.time.Interval | ||
|
||
import spock.lang.Specification | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
keep
@@ -59,7 +63,7 @@ class DefaultingVolatileIntervalsServiceSpec extends Specification { | |||
allGrainQuery.intervals >> fullRange | |||
allGrainQuery.granularity >> AllGranularity.INSTANCE | |||
|
|||
intervalsFunctions = new DefaultingDictionary<>(defaultFunction) | |||
intervalsFunctions = new HashMap<>() | |||
intervalsFunctions.put(table2, mockFunction2) | |||
intervalsFunctions.put(table3, mockFunction3) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just use map literals, now that this is a simple Map?
Patched bug on volatile equals.
Clean up to Strings and Equals. CHANGELOG
ba062f9
to
5e655d2
Compare
* @param timeGrain Zoned time grain of the table | ||
* @param metricNames The Set of metric names on the table | ||
* @param dimensionConfigs Set of dimensions on the table as dimension configs | ||
* @param tablePartDefinitions A map from table names to a map of dimension names to sets of values for those |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 spaces between name and description
|
||
@Override | ||
public Set<TableName> getDependentTableNames() { | ||
return tablePartDefinitions.keySet(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest putting tablePartDefinitions.keySet();
to a variable so we don't have to call .keySet()
multiple times.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is keySet expensive? I'm not sure it makes sense to keep a copy of the keySet and the original collection. Also, I think the builder might modify the tablePartDefinitions class during construction and having a copy of the keyset might result in inconsistency.
entry -> dictionaries.getPhysicalDictionary().get(entry.getKey().asName()), | ||
entry -> new DimensionIdFilter(toDimensionValuesMap( | ||
entry.getValue(), | ||
dictionaries.getDimensionDictionary() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will it be better to assign dictionaries.getDimensionDictionary()
and dictionaries.getPhysicalDictionary()
to variables?
* @param dimensionNameMap The configuration map from dimension names to sets of dimension key values. | ||
* @param dimensionDictionary The dictionary of dimensions to use for binding. | ||
* | ||
* @return A map of dimensions to dimension key values. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cdeszaq Lower case "A"?
&& Objects.equals(availability, that.availability); | ||
} else { | ||
return false; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
else {
return false;
}
can be simplified to
return false;
dimensionKeySelectFilters = dimensionMappingValues.entrySet().stream() | ||
.map(entry -> new AbstractMap.SimpleEntry<>( | ||
entry.getKey(), | ||
new ApiFilter(entry.getKey(), entry.getKey().getKey(), FilterOperation.in, entry.getValue()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we are getting key multiple times, could we assign entry.getKey()
to a variable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's likely not worth it here, since this is a fairly simple constructor, and the call is likely very cheap. The added complexity isn't worth it, imho.
* @param dimension The dimension whose rows are being tested on. | ||
* @param constraintFilters The api filters from the constraint | ||
* | ||
* @return True if for this dimension there are rows matching the query filters AND the embedded filters. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once space between "return" and "True". "true" should be lowercased
|
||
@Override | ||
public boolean equals(Object obj) { | ||
if (obj instanceof DimensionIdFilter) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we add this == obj
check?
* @param value The value being appended to the set. | ||
* @param <T> The type of the set | ||
* | ||
* @return The new set containing the additional value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lowercase "The"
|
||
TableName dataSource1 = TableName.of("part1_ds1") | ||
TableName dataSource2 = TableName.of("part1_ds2") | ||
TableName dataSource3 = TableName.of("part2_ds1") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is part2Ds1
better for style?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since these aren't java identifiers I think readability is the biggest concern and this way is clearly more readable.
Equals optimizations
Added dimension value based partition feature.
Configuration of dimension value partitions is done by supplying a map of table names to maps of dimension names to maps of dimension values.
The entry for a given table name describes a set of filters, by dimension. If each dimension successfully applies when combined with the restrictions on a table, then the partition is considered to pertain to the query. A partition which pertains is used for restricting availability.