Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TransactionalIndexedCollection and UniqueIndex #67

Closed
kimptoc opened this issue Jun 7, 2016 · 10 comments
Closed

TransactionalIndexedCollection and UniqueIndex #67

kimptoc opened this issue Jun 7, 2016 · 10 comments
Assignees

Comments

@kimptoc
Copy link
Contributor

kimptoc commented Jun 7, 2016

When I try to update an object with a unique index via update on a TransactionalIndexedCollection, I get the error below.

Code is here: https://github.com/kimptoc/cqengine-test/blob/master/src/main/java/TestUpdateOnTransIndexColl.java

Exception in thread "Thread-0" com.googlecode.cqengine.index.unique.UniqueIndex$UniqueConstraintViolatedException: The application has attempted to add a duplicate object to the UniqueIndex on attribute 'Id', potentially causing inconsistencies between indexes. UniqueIndex should not be used with attributes which do not uniquely identify objects. Problematic attribute value: 'uniqueId', problematic duplicate object: {Version=2, Size=1, Height=3, Id=uniqueId, Width=4, Distance=2}
    at com.googlecode.cqengine.index.unique.UniqueIndex.addAll(UniqueIndex.java:233)
    at com.googlecode.cqengine.engine.CollectionQueryEngine$12.perform(CollectionQueryEngine.java:1116)
    at com.googlecode.cqengine.engine.CollectionQueryEngine.forEachIndexDo(CollectionQueryEngine.java:1197)
    at com.googlecode.cqengine.engine.CollectionQueryEngine.addAll(CollectionQueryEngine.java:1113)
    at com.googlecode.cqengine.ConcurrentIndexedCollection.doAddAll(ConcurrentIndexedCollection.java:443)
    at com.googlecode.cqengine.TransactionalIndexedCollection.update(TransactionalIndexedCollection.java:221)
    at com.googlecode.cqengine.TransactionalIndexedCollection.update(TransactionalIndexedCollection.java:150)
    at Updater.run(Updater.java:49)
    at java.lang.Thread.run(Thread.java:745)

Is that to be expected?

Cheers.

@npgall
Copy link
Owner

npgall commented Jun 7, 2016

Can you create a test case which shows the problem?

@kimptoc
Copy link
Contributor Author

kimptoc commented Jun 8, 2016

Sure - it works ok with Concurrent and ObjectLocking versions of IndexedCollection - its just the Transactional one that fails.

https://github.com/kimptoc/cqengine-test/blob/master/src/main/java/uniqueindex/UniqueWithConcurrent.java

@kimptoc
Copy link
Contributor Author

kimptoc commented Jun 8, 2016

Having looked at TransactionalIndexedCollection in more detail, I see that its update method does add then remove, whereas the other collections do a remove and then add. Although I don't quite see how this change would cause this issue...

@npgall
Copy link
Owner

npgall commented Jun 8, 2016

Hi Chris,

This is a good question.

Yes when replacing different versions of the same object, the TransactionalIndexedCollection does add the new objects, before it removes the old ones. It's intentionally like this because it needs to have both versions of the objects in the collection at the same time. It just hides either the old or the new ones depending on which stage of MVCC is applicable.

So I guess the problem here, might be that you have a UniqueIndex on some kind of primary key attribute, but that attribute does not account for the version? The UniqueIndex expects the attribute to uniquely identify an object in the collection, but in this case I guess due to MVCC you can have two objects in the collection whose primary key attribute returns the same unique value.

So to workaround you can either (1) use a HashIndex instead of the UniqueIndex as HashIndex does permit duplicates, or (2) somehow ensure that your primary key attribute accounts for the version of your object, or (3) avoid using the TransactionalIndexedCollection.

I don't think there is an easy workaround for this with MVCC. So I think this issue needs to be documented. I will keep this issue open until then. Thanks for reporting this!

Niall

@npgall npgall self-assigned this Jun 8, 2016
@kimptoc
Copy link
Contributor Author

kimptoc commented Jun 8, 2016

Thanks for clarifying - I went for option 3 :)

@devinrsmith
Copy link
Contributor

Just ran into this issue as well. I'm using a unique compound attribute on two values as the key to a state (to manage state transitions).

unique ( compound( A, B ) )
add (A, B, STATE_1)
update (A, B, STATE_1) -> (A, B, STATE_2)

The update fails with a constraint violation when using TransactionalIndexedCollection.

@kimptoc
Copy link
Contributor Author

kimptoc commented Jul 18, 2016

Hi,

Just an update on this - had hoped with to switch to the UniqueIndex/TransactionalIndexedCollection with v2.7 of CQEngine - but now get the error about "sets of objectsToRemove and objectsToAdd are not disjoint" - line 429 of TransactionalIndexedCollection. Which they aren't as its just the old/new versions and they don't include the version in the key.

So, will continue with not using this collection.

Cheers,
Chris

@npgall
Copy link
Owner

npgall commented Jul 18, 2016

I don't think there is an easy solution to this. However there is a middle ground.

The tradeoff is, for MVCC, the new object must be added to the collection before the old object is removed. But OTOH the key selling point of UniqueIndex is that it uses less memory by eliminating the data structures required to store more than one object per key.

If you are looking for a middle ground, you can try the following.

package com.googlecode.cqengine;

import com.googlecode.cqengine.attribute.Attribute;
import com.googlecode.cqengine.index.hash.HashIndex;
import com.googlecode.cqengine.index.support.Factory;
import com.googlecode.cqengine.index.unique.UniqueIndex;
import com.googlecode.cqengine.resultset.stored.StoredResultSet;
import com.googlecode.cqengine.resultset.stored.StoredSetBasedResultSet;

import java.util.Collections;
import java.util.concurrent.ConcurrentHashMap;

/**
 * @author niall.gallagher
 */
public class SemiUniqueHashIndexFactory {

    /**
     * Creates a new {@link HashIndex} on the specified attribute, where the attribute is expected to uniquely identify
     * an object in the collection <i>most of the time</i>.
     * <p/>
     * This can be used as a less-strict alternative to {@link UniqueIndex}, especially when used with a
     * {@link TransactionalIndexedCollection} which implements an MVCC algorithm. The MVCC algorithm sometimes needs
     * to store multiple versions of the same object in the collection at the same time, which would violate the
     * constraints of that index.
     * <p/>
     * This configures the {@link HashIndex} with {@link CompactValueSetFactory}, and as such it will use less memory
     * than the default configuration of {@link HashIndex}. However it will use somewhat more memory than
     * {@link UniqueIndex}.
     *
     * @param attribute The attribute on which the index will be built
     * @param <O> The type of the object containing the attribute
     * @return A {@link HashIndex} on this attribute
     */
    public static <A, O> HashIndex<A, O> onSemiUniqueAttribute(Attribute<O, A> attribute) {
        return HashIndex.onAttribute(new HashIndex.DefaultIndexMapFactory<A, O>(), new CompactValueSetFactory<O>(), attribute);

    }

    /**
     * Creates a value set which is tuned to reduce memory overhead, with an expectation that only a single or
     * a very small number of object(s) will be stored in each bucket.
     * <p/>
     * Additionally, this is tuned with the <b>expectation of low concurrency for writes</b>. Concurrent writes are
     * safe, but they might block each other.
     */
    public static class CompactValueSetFactory<O> implements Factory<StoredResultSet<O>> {
        @Override
        public StoredResultSet<O> create() {
            return new StoredSetBasedResultSet<O>(Collections.<O>newSetFromMap(new ConcurrentHashMap<O, Boolean>( 1, 1.0F, 1)));
        }
    }
}

I have added this to CQEngine mainline, so in the next release this will be built into HashIndex via HashIndex.onSemiUniqueAttribute().

@npgall
Copy link
Owner

npgall commented Aug 12, 2016

Closing this, as I have updated the documentation in UniqueIndex to document this.

CQEngine 2.7.1 has now been released which includes HashIndex.onSemiUniqueAttribute() to work around this issue.

@npgall npgall closed this as completed Aug 12, 2016
@kimptoc
Copy link
Contributor Author

kimptoc commented Aug 12, 2016

Thanks - not had a chance to try this yet. Definitely on my list to try out :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants