Skip to content
This repository has been archived by the owner on Jul 11, 2022. It is now read-only.

Commit

Permalink
[1125439] Duplicate baselines metric entries
Browse files Browse the repository at this point in the history
Although the root cause is still unknown the relevant code paths did not
protect against generating multiple baselines for a single schedule, even
though that should be invalid.  Here are some changes that are more in line
with the 1-1 relationship that should exist between schedule and
baseline (optional for schedule, required for baseline)
- Make the db index on baseline's scheduleId unique. There should only be
  one baseline for a schedule. This will prevent the problem in the BZ, possibly
  at the expense of failure when trying to store a duplicate, but that would
  perhaps help us pinpoint a problematic code-path.
- Use Sets (or a Map) as opposed to Lists in several places in the local API.
- Simplify and I think make more efficient the query for selecting schedules
  without a current baseline. Make it return distinct ids (this is a hedge,
  I don't think it would return duplicates, but this ensures that constraint).
  Replace a left join with a NOT EXISTS sub select.
- Chunk the storage of baselines to prevent large Txs, don't pass entities
  across Tx boundary, fetch all Schedules in one query, and (maybe) simplify the
  entity manager interaction.
- Simplify some of the TransactionAttribute decls
- Mark MeasurementSchedule.baseline as an optional 1-1 relationship, just to make
  it explicit on the entity.
- Fix some local SLSB jdoc
  • Loading branch information
jshaughn committed Aug 5, 2014
1 parent ca1a0fd commit 3686686
Show file tree
Hide file tree
Showing 9 changed files with 144 additions and 96 deletions.
2 changes: 1 addition & 1 deletion modules/core/dbutils/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,7 @@
<description>Database schema setup, upgrade and other utilities</description>

<properties>
<db.schema.version>2.155</db.schema.version>
<db.schema.version>2.156</db.schema.version>
<rhq.ds.type-mapping>${rhq.test.ds.type-mapping}</rhq.ds.type-mapping>
<rhq.ds.server-name>${rhq.test.ds.server-name}</rhq.ds.server-name>
<rhq.ds.db-name>${rhq.test.ds.db-name}</rhq.ds.db-name>
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -71,7 +71,7 @@
<index name="RHQ_MEAS_BASELINE_CTIME_IDX">
<field ref="BL_COMPUTE_TIME"/>
</index>
<index name="RHQ_MEAS_BASELINE_SID_IDX">
<index name="RHQ_MEAS_BASELINE_SID_IDX" unique="true">
<field ref="SCHEDULE_ID"/>
</index>
</table>
Expand Down
12 changes: 12 additions & 0 deletions modules/core/dbutils/src/main/scripts/dbupgrade/db-upgrade.xml
Original file line number Diff line number Diff line change
Expand Up @@ -2431,6 +2431,18 @@
<schemaSpec version="2.155">
<schema-alterColumn table="RHQ_BUNDLE_DESTINATION" column="DEPLOY_DIR" nullable="true"/>
</schemaSpec>

<schemaSpec version="2.156">
<schema-directSQL ignoreError="true">
<statement desc="Dropping non-unique index RHQ_MEAS_BASELINE_SID_IDX">
DROP INDEX RHQ_MEAS_BASELINE_SID_IDX
</statement>
<statement desc="Recreating RHQ_MEAS_BASELINE_SID_IDX as unique index">
CREATE UNIQUE INDEX RHQ_MEAS_BASELINE_SID_IDX ON RHQ_MEASUREMENT_BLINE ( SCHEDULE_ID )

This comment has been minimized.

Copy link
@loleary

loleary Aug 11, 2014

Contributor

This query will fail if the environment fell victim to a duplicate baseline. You need to remove the duplicates if they exist otherwise the upgrade will result in no index at all.

The query that I have tested that will address this is provided but keep in mind that it must be executed before dropping the original index or the query execution time will be greatly increased.

DELETE FROM rhq_measurement_bline
WHERE  (id, schedule_id) NOT IN (SELECT id, schedule_id
                                 FROM   (SELECT Max(id) id, schedule_id
                                         FROM   rhq_measurement_bline
                                         GROUP  BY schedule_id) tokeep);

This comment has been minimized.

Copy link
@jshaughn

jshaughn Aug 12, 2014

Author Contributor

I hadn't originally thought of the db update as a repair mechanism, but I can see how it could be useful. The inner select is the entire table you want, which could be large, but shouldn't be so large as to be prohibitive.

I can add this.

</statement>
</schema-directSQL>
</schemaSpec>

</dbupgrade>
</target>
</project>
Original file line number Diff line number Diff line change
Expand Up @@ -104,7 +104,7 @@
+ "SELECT SUM(1000.0 / ms.interval) * 60.0 " //
+ " FROM MeasurementSchedule ms " //
+ " WHERE ms.enabled = true " //
+ " AND ms.definition.name <> '" + MeasurementDefinition.AVAILABILITY_NAME + "' " //
+ " AND ms.definition.name <> '" + MeasurementDefinition.AVAILABILITY_NAME + "' " //
+ " AND ms.resource.inventoryStatus = :status"), //
@NamedQuery(name = MeasurementSchedule.DISABLE_ALL, query = "" //
+ "UPDATE MeasurementSchedule ms " //
Expand Down Expand Up @@ -215,7 +215,7 @@ public class MeasurementSchedule implements Serializable {
* merge-merge cascade).
*/
// Remove now performed by bulk delete
@OneToOne(mappedBy = "schedule", cascade = { CascadeType.MERGE }, fetch = FetchType.LAZY)
@OneToOne(mappedBy = "schedule", cascade = { CascadeType.MERGE }, fetch = FetchType.LAZY, optional = true)
private MeasurementBaseline baseline;

/**
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -40,6 +40,7 @@
import org.rhq.core.domain.measurement.NumericType;
import org.rhq.core.domain.measurement.composite.MeasurementOOBComposite;
import org.rhq.core.domain.resource.Agent;
import org.rhq.core.domain.resource.InventoryStatus;
import org.rhq.core.domain.resource.Resource;
import org.rhq.core.domain.resource.ResourceCategory;
import org.rhq.core.domain.resource.ResourceType;
Expand Down Expand Up @@ -434,11 +435,13 @@ private void setupResources() {

platform = new Resource("platform1", "testAutoBaseline Platform One", platformType);
platform.setUuid("" + new Random().nextInt());
platform.setInventoryStatus(InventoryStatus.COMMITTED);
em.persist(platform);
platform.setAgent(agent);

platform2 = new Resource("platform2", "testAutoBaseline Platform Two", platformType);
platform2.setUuid("" + new Random().nextInt());
platform2.setInventoryStatus(InventoryStatus.COMMITTED);
// deleteResource removes the agent, so we can't have two direct platforms for it, make one a child of the other
platform.addChildResource(platform2);
em.persist(platform2);
Expand Down

Large diffs are not rendered by default.

Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,8 @@
*/
package org.rhq.enterprise.server.measurement;

import java.util.List;
import java.util.Map;
import java.util.Set;

import javax.ejb.Local;

Expand Down Expand Up @@ -66,35 +67,34 @@ public interface MeasurementBaselineManagerLocal extends MeasurementBaselineMana
/**
* <strong>Note</strong> This method exists only for transaction demarcation.
*
* @return A list of schedules that do not have baselines. This list is not assumed
* to be an exhaustive list of schedules that lack a baseline. As such, this method
* will be called repeatedly during baseline calculations to get all of the necessary
* schedules.
* @return The Set of of enabled, numeric, schedules that do not have baselines.
*/
List<Integer> getSchedulesWithoutBaselines();
Set<Integer> getSchedulesWithoutBaselines();

/**
* Given a list of schedules, this method calculates and stores baselines using the
* Given a list of scheduleIds, this method calculates and stores baselines using the
* amount of 1 hr data specified and older than the time specified.
* <br/><br/>
* <strong>Note</strong> This method exists only for transaction demarcation.
*
* @param schedules The schedules that do not yet have baselines
* @param scheduleIds The schedules that do not yet have baselines
* @param olderThan Use 1 hr data prior to this time
* @param amountOfData The amount of data to use for calculating baselines. This value
* is treated as a duration. For example, a value of 259200000
* would be treated as 3 days.
*/
void calculateBaselines(List<Integer> schedules, long olderThan, long amountOfData);
void calculateBaselines(Set<Integer> scheduleIds, long olderThan, long amountOfData);

/**
* Persists the newly calculated baselines.
* <br/><br/>
* <strong>Note</strong> This method exists only for transaction demarcation.
*
* @param baselines The baselines to persist.
* @param scheduleIds the scheduleIds for whom we want to persist baselines (a subset of baselines.keyset,
* we may not save them all in one call to this method)
* @param baselines Map of scheduleIds to The baselines to persist.
*/
void saveNewBaselines(List<MeasurementBaseline> baselines);
void saveNewBaselines(Set<Integer> scheduleIds, Map<Integer, MeasurementBaseline> baselines);

MeasurementBaseline getBaselineIfEqual(Subject subject, int groupId, int definitionId);

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -24,8 +24,9 @@
*/
package org.rhq.server.metrics;

import java.util.ArrayList;
import java.util.List;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
Expand All @@ -47,15 +48,14 @@ public MetricsBaselineCalculator(MetricsDAO metricsDAO) {
this.metricsDAO = metricsDAO;
}

public List<MeasurementBaseline> calculateBaselines(List<Integer> schedules, long startTime,
long endTime) {
List<MeasurementBaseline> calculatedBaselines = new ArrayList<MeasurementBaseline>();
public Map<Integer, MeasurementBaseline> calculateBaselines(Set<Integer> scheduleIds, long startTime, long endTime) {
Map<Integer, MeasurementBaseline> calculatedBaselines = new HashMap<Integer, MeasurementBaseline>();

MeasurementBaseline measurementBaseline;
for (Integer schedule : schedules) {
measurementBaseline = this.calculateBaseline(schedule, startTime, endTime);
for (Integer scheduleId : scheduleIds) {
measurementBaseline = this.calculateBaseline(scheduleId, startTime, endTime);
if (measurementBaseline != null) {
calculatedBaselines.add(measurementBaseline);
calculatedBaselines.put(scheduleId, measurementBaseline);
}
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -30,9 +30,11 @@
import static org.testng.Assert.assertEquals;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.Set;

import org.powermock.api.mockito.PowerMockito;
import org.powermock.core.classloader.annotations.PrepareForTest;
Expand Down Expand Up @@ -74,14 +76,15 @@ public void noCalculationTest() throws Exception {
when(mockMetricsDAO.findAggregatedSimpleOneHourMetric(eq(1), eq(0), eq(1))).thenReturn(
new ArrayList<AggregateSimpleNumericMetric>());

int expectedScheduleId = 2567;
Set expectedScheduleId = new HashSet(1);
expectedScheduleId.add(2567);

//create object to test and inject required dependencies
MetricsBaselineCalculator objectUnderTest = new MetricsBaselineCalculator(new MetricsDAO(mockSession,
metricsConfiguration));

//run code under test
List<MeasurementBaseline> result = objectUnderTest.calculateBaselines(Arrays.asList(expectedScheduleId), 0, 1);
Map<Integer, MeasurementBaseline> result = objectUnderTest.calculateBaselines(expectedScheduleId, 0, 1);

//verify the results (Assert and mock verification)
assertEquals(result.size(), 0, "No baselines expected");
Expand Down Expand Up @@ -129,6 +132,8 @@ public void randomDataTest() throws Exception {
long expectedStartTime = 135;
long expectedEndTime = 246;
long beforeComputeTime = System.currentTimeMillis();
Set expectedScheduleIdSet = new HashSet(1);
expectedScheduleIdSet.add(expectedScheduleId);

//tell the method story as it happens: mock dependencies and configure
//those dependencies to get the method under test to completion.
Expand All @@ -146,7 +151,7 @@ public void randomDataTest() throws Exception {
metricsConfiguration));

//run code under test
List<MeasurementBaseline> result = objectUnderTest.calculateBaselines(Arrays.asList(expectedScheduleId),
Map<Integer, MeasurementBaseline> result = objectUnderTest.calculateBaselines(expectedScheduleIdSet,
expectedStartTime, expectedEndTime);

//verify the results (Assert and mock verification)
Expand Down Expand Up @@ -189,6 +194,8 @@ public void noMinMaxDataTest() throws Exception {
int expectedScheduleId = 567;
long expectedStartTime = 135;
long expectedEndTime = 246;
Set expectedScheduleIdSet = new HashSet(1);
expectedScheduleIdSet.add(expectedScheduleId);

//tell the method story as it happens: mock dependencies and configure
//those dependencies to get the method under test to completion.
Expand All @@ -206,7 +213,7 @@ public void noMinMaxDataTest() throws Exception {
metricsConfiguration));

//run code under test
List<MeasurementBaseline> result = objectUnderTest.calculateBaselines(Arrays.asList(expectedScheduleId),
Map<Integer, MeasurementBaseline> result = objectUnderTest.calculateBaselines(expectedScheduleIdSet,
expectedStartTime, expectedEndTime);

//verify the results (Assert and mock verification)
Expand Down

0 comments on commit 3686686

Please sign in to comment.