Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix to use physical name instead of logical name to retrieve available interval #226

Merged
merged 8 commits into from
Apr 17, 2017
8 changes: 4 additions & 4 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,10 +10,10 @@ Current
### Added:

- [Fix to use physical name instead of logical name to retrieve available interval](https://github.com/yahoo/fili/pull/226)
* Added `PhysicalDataSourceConstraint` class to capture physical names of columns for retrieving availble interval
* Added `PhysicalDataSourceConstraint` class to capture physical names of columns for retrieving available intervals

- [BaseCompositePhysicalTable](https://github.com/yahoo/fili/pull/242)
* `ConcretePhysicalTable` provides common operations, such as validating coarsest ZonedTimeGrain, for composite
* `BaseCompositePhysicalTable` provides common operations, such as validating coarsest ZonedTimeGrain, for composite
tables.

- [Add Reciprocal `satisfies()` relationship complementing `satisfiedBy()` on Granularity](https://github.com/yahoo/fili/issues/222)
Expand Down Expand Up @@ -81,10 +81,10 @@ Current
### Changed:

- [Fix to use physical name instead of logical name to retrieve available interval](https://github.com/yahoo/fili/pull/226)
* `getAllAvailbleIntervals` in `ConcreteAvailability` no longer filters out unconfigured columns, instead table's `getAllAvailbleIntervals` does
* `getAllAvailbleIntervals` in `ConcreteAvailability` no longer filters out un-configured columns, instead table's `getAllAvailbleIntervals` does
* `getAvailbleIntervals` in `Availbality` now takes `PhysicalDataSourceConstraint` instead of `DataSourceConstraint`
* `Availability` no longer takes a set of columns on the table, only table needs to know
* `getAllAvailbleIntervals` in `Availability` now returns a map of column physical name to interval list instead of column to interval list
* `getAllAvailbleIntervals` in `Availability` now returns a map of column physical name string to interval list instead of column to interval list
* `TestDataSourceMetadataService` now takes map from string to list of intervals instead of column to list of intervals for constructor
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Availbility -> Availability


- [Reduced number of queries sent by `LuceneSearchProvider` by 50% in the common case](https://github.com/yahoo/fili/pull/234)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,7 @@
/**
* An implementation of BasePhysicalTable that contains multiple tables.
*/
public class BaseCompositePhysicalTable extends BasePhysicalTable {
public abstract class BaseCompositePhysicalTable extends BasePhysicalTable {

private static final Logger LOG = LoggerFactory.getLogger(BaseCompositePhysicalTable.class);

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -68,7 +68,10 @@ public SimplifiedIntervalList getAvailableIntervals(PhysicalDataSourceConstraint
// Need to ensure requestColumns is not empty in order to prevent returning null by reduce operation
return new SimplifiedIntervalList(
requestColumns.stream()
.map(physicalName -> allAvailableIntervals.getOrDefault(physicalName, Collections.emptyList()))
.map(physicalName -> allAvailableIntervals.getOrDefault(
physicalName,
new SimplifiedIntervalList()
))
.map(intervals -> (Collection<Interval>) intervals)
.reduce(null, IntervalUtils::getOverlappingSubintervals)
);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -124,10 +124,9 @@ public Set<TableName> getDataSourceNames() {
}

/**
* Combines and returns intervals of all availabilities' metric columns.
* Retrieve all available intervals for all columns across all the underlying datasources.
* <p>
* Intervals of the same metric column are associated with the same metric column key. Overlapping intervals under
* the same metric column key are collapsed into single interval.
* Available intervals for the same columns are unioned into a <tt>SimplifiedIntervalList</tt>
*
* @return a map of column to all of its available intervals in union
*/
Expand All @@ -148,9 +147,15 @@ public Map<String, List<Interval>> getAllAvailableIntervals() {
}

@Override
public SimplifiedIntervalList getAvailableIntervals(PhysicalDataSourceConstraint constraints) {
public SimplifiedIntervalList getAvailableIntervals(PhysicalDataSourceConstraint constraint) {

// If there are columns requested that are not configured on this table
if (constraint.getAllColumnPhysicalNames().size() != constraint.getAllColumnNames().size()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will likely always be true, due to the defaulting behavior of how mapping from logicalToPhysicalName works. We probably need to look at the union of the columns in the subconstraints to see if any of the requested columns are not present in the subconstraints.

return new SimplifiedIntervalList();
}

return new SimplifiedIntervalList(
constructSubConstraint(constraints).entrySet().stream()
constructSubConstraint(constraint).entrySet().stream()
.map(entry -> entry.getKey().getAvailableIntervals(entry.getValue()))
.map(simplifiedIntervalList -> (Set<Interval>) new HashSet<>(simplifiedIntervalList))
.reduce(null, IntervalUtils::getOverlappingSubintervals)
Expand All @@ -169,7 +174,7 @@ public String toString() {
}

/**
* Validates whether the metric columns are unique accross each of the underlying datasource.
* Validates whether the metric columns are unique across each of the underlying datasource.
*
* @param availabilityToMetricNames A map from <tt>Availability</tt> to set of <tt>MetricColumn</tt>
* contained in that <tt>Availability</tt>
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,8 @@ public PhysicalDataSourceConstraint(
}

/**
* Constructor, use with care, beware of inconsistent caching behavior.
* Constructor, use with care, beware of possible inconsistent between underlying dimension and metrics property
* and allColumnPhysicalName.
*
* @param dataSourceConstraint Data source constraint containing all the column names as logical names
* @param allColumnPhysicalNames The physical names of the columns
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -4,11 +4,12 @@ package com.yahoo.bard.webservice.table.availability

import com.yahoo.bard.webservice.data.config.names.TableName
import com.yahoo.bard.webservice.data.metric.MetricColumn
import com.yahoo.bard.webservice.data.time.ZonedTimeGrain
import com.yahoo.bard.webservice.table.Column
import com.yahoo.bard.webservice.table.PhysicalTable
import com.yahoo.bard.webservice.table.PhysicalTableSchema
import com.yahoo.bard.webservice.table.resolver.DataSourceConstraint
import com.yahoo.bard.webservice.table.resolver.PhysicalDataSourceConstraint
import com.yahoo.bard.webservice.table.resolver.PhysicalDataSourceConstraintSpec
import com.yahoo.bard.webservice.util.SimplifiedIntervalList

import org.joda.time.Interval
Expand Down Expand Up @@ -171,14 +172,14 @@ class MetricUnionAvailabilitySpec extends Specification {

}

def "constructSubConstraint correctly intersect metrics configured on the table with the metrics supported by availability's underlying datasource"() {
def "constructSubConstraint correctly intersects metrics configured on the table with the metrics supported by availability's underlying datasource"() {
given:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

subject/verb agreement: 'intersects'

availability1.getAllAvailableIntervals() >> [
(metric1): []
]
availability2.getAllAvailableIntervals() >> [
(metric2): [],
'ignored1': []
'un_requested': []
]

metricUnionAvailability = new MetricUnionAvailability(physicalTables, [metricColumn1, metricColumn2] as Set)
Expand All @@ -187,10 +188,10 @@ class MetricUnionAvailabilitySpec extends Specification {
[] as Set,
[] as Set,
[] as Set,
[metric1, metric2, 'ignored2'] as Set,
[metric1, metric2, 'un_configured'] as Set,
[] as Set,
[] as Set,
[metric1, metric2, 'ignored2'] as Set,
[metric1, metric2, 'un_configured'] as Set,
[:]
)

Expand Down Expand Up @@ -251,4 +252,38 @@ class MetricUnionAvailabilitySpec extends Specification {
['2017-01-01/2017-02-01'] | ['2017-01-01/2017-01-15'] | ['2017-01-01/2017-01-15'] | "full back overlap (0/10, 0/5)"
['2017-01-01/2017-02-01'] | ['2017-01-15/2017-01-25'] | ['2017-01-15/2017-01-25'] | "fully contain (0/10, 3/9)"
}

def "getAvailableInterval requesting un-configured column on the table will result in empty available interval"() {
given:
Interval interval = new Interval('2000-01-01/3000-01-01')
availability1.getAllAvailableIntervals() >> [
(metric1): [interval]
]
availability2.getAllAvailableIntervals() >> [
(metric2): [interval]
]

availability1.getAvailableIntervals(_ as PhysicalDataSourceConstraint) >> new SimplifiedIntervalList([interval])
availability2.getAvailableIntervals(_ as PhysicalDataSourceConstraint) >> new SimplifiedIntervalList([interval])

metricUnionAvailability = new MetricUnionAvailability(physicalTables, [metricColumn1, metricColumn2] as Set)

DataSourceConstraint dataSourceConstraint = new DataSourceConstraint(
[] as Set,
[] as Set,
[] as Set,
[metric1, metric2, 'un_configured'] as Set,
[] as Set,
[] as Set,
[metric1, metric2, 'un_configured'] as Set,
[:]
)

PhysicalTableSchema physicalTableSchema = new PhysicalTableSchema(Mock(ZonedTimeGrain), [new Column(metric1), new Column(metric2)], [:])

PhysicalDataSourceConstraint physicalDataSourceConstraint = new PhysicalDataSourceConstraint(dataSourceConstraint, physicalTableSchema)

expect:
metricUnionAvailability.getAvailableIntervals(physicalDataSourceConstraint) == new SimplifiedIntervalList()
}
}