Issue seq #3744

matanshukry · 2015-03-10T22:08:13Z

User perspective

Usage

Create a sequence

CREATE SEQUENCE <sequence> [TYPE <CACHED|ORDERED>] [START <value>] [INCREMENT <value>] [CACHE <value>]

Sequence types:

Ordered: Each call to .next() will result in a new value.
Cached: The sequence will cache N items on each node, thus improving the performance if many .next() calls are required. However, this may create holes.

Drop a sequence

DROP SEQUENCE <sequence>

Using a sequence

SELECT #<sequence>.<function>

Possible functions:

next() - retrieve the next value.
current() - get the current value
reset() - reset the sequence value to it's initial value.

Notes

Prefix

I decided to use the hash sign(#) prefix for sequences, although that is easy to replace. The reason to use any sign was due to code architecture, however in my opinion it is also the better option; When not using any prefix, it can collide with table columns.

Distributed

Distributed sequences are not 100% working now. They will most probably work, just be careful not to use them too much (although i'm hoping we can finish the work with distributed sequences until the next release).

Code perspective

Test

I added two test cases:

SequenceTest: Check a sequence simply "works". That is, calling .next() return next value, .current() return the current, and reset() resets the value. Simple.
Distributed test: This test case actually test 3 issues:
** Ordered sequence: check two ordered sequence, under simple load, will generate ordered values.
** Cached sequence: Check two cached sequences, with a specific cache value, will generate values with
range same as cache value (showing that one of the sequences cached a range of values)
* Parallel Sync: Attempt to create multiple (2) threads, where each thread will use a different node (and databases, as according to multi-threading rules in orientdb), and generate unique N(100) values, each. The result should be 2N(200) unique values.

Notes:

The test it self cause many OConcurrentModificationException, which are printed to the console. This is valid, and the test does succeed.
The parallel sync test is currently not working. I am still working on it, although it seems like an edge case, and a regular normal-load of usage of the sequences, even distributed, shouldn't cause this problem.
In order to pass the test though, it has a flag which is set to false, hence not testing this specific test (parallel sync) at all.
I am not entirely sure what the problem is, but there seem to be more than one:
The initial problem was the one I fixed in this PR: Added a test case(distributed) and a fix #3738
The second problem was with OConcurrentModificationException, which seems to be now fixed (fixed recently by lvca)
There still seems to be a problem now, where when an update to a field is happening distributed, the results are conflicted: one node return the same version as sent, and the other a higher one. I am still looking into it, and will post as soon as possible.

Although the distributed part of sequences is not complete, I decided to still open this PR in case someone more experienced with the project might shed some light on the distributed errors.

lvca · 2015-03-11T00:23:07Z

This is related to #367.

lvca · 2015-03-11T00:23:16Z

Hey @matanshukry this PR is impressive! The only doubt is on using the # as prefix. We already use it for RIDs.

I see Postgres uses only nextval() function (http://www.postgresql.org/docs/8.1/static/sql-createsequence.html), while MySQL seems to have no control over sequences, but rather you can use them in schema only (auto-increment columns). Oracle has both currval and nextval functions, but not as functions (http://docs.oracle.com/cd/B28359_01/server.111/b28286/pseudocolumns002.htm#SQLRF51138) and no prefix for them.

lvca · 2015-03-11T00:23:37Z

@luigidellaquila WDYT about the syntax?

luigidellaquila · 2015-03-11T06:26:44Z

I didn't go through all the code, but I wonder if it's possible to do
something like

insert into Foo (a, b) values (#mySequence.next(), 'bar')

I don't like the # sign very much, but it does not create problems in terms
of parsing (a RID has a ":" between numbers, while a sequence has a "."),
let me just think a little bit if we can find a better sign.
Aside of this, implementing this syntax in the new parser is trivial.

Luigi

2015-03-11 1:23 GMT+01:00 Luca Garulli notifications@github.com:

@luigidellaquila https://github.com/luigidellaquila WDYT about the
syntax?

—
Reply to this email directly or view it on GitHub
#3744 (comment)
.

tglman · 2015-03-11T10:12:24Z

we may need an integration with the distributed as wel, or declare that this feature is on single node mode, but sounds limitating

by the way ipressive ;)

lvca · 2015-03-11T10:21:01Z

I like also the @matanshukry's idea about the OExposedMethod annotation. We could adopt in all our public APIs. And maybe this is the right time to introduce also an attribute on it called "stability" can have one of the following ENUM values (inspired by Node.js: https://nodejs.org/api/documentation.html):

DEPRECATED, EXPERIMENTAL, UNSTABLE, STABLE, DEPRECATED

Well, node.js propose also API-FROZEN, LOCKED but it could be misleading for users. WDYT?

lvca · 2015-03-11T10:37:22Z

Created a new issue for that: #3749

matanshukry · 2015-03-11T15:19:37Z

luigidellaquila - Actually I forgot about the RID syntax. We may be able to distinguish between the two, but I really prefer to avoid inserting the sequence test into the 'from'(field) test. Using a one sign prefix is better than trying to test the string. Also, a prefix isn't a bad way to go.

In my opinion choosing a better prefix sign is the preferred choice.

lvca · 2015-03-11T16:00:53Z

@matanshukry The usage of # in OrientDB is associated to RIDs. We can use something else. Some proposals:

(1) "sequence:" as prefix:

select from sequence:mySequence.next()

We already support the prefix: class, cluster and index.

(2) Just rely on functions, like other RDBMSs:

select from nextval(mySequence)
select from currentval(mySequence)

matanshukry · 2015-03-12T12:42:07Z

Not a big fan of global functions. Any chance we can use seq prefix instead?

select from seq:mySeq.next()

tglman · 2015-03-20T14:59:07Z

something like:
select next from sequence:mySeq
select current from sequence:mySeq

but i think can be tricky the implementation of the 'next' in this way, the case of the 'current' is fine.

WDYT ?

lvca · 2015-04-01T22:02:09Z

Hey @matanshukry we'd like to release tomorrow 2.1-rc1. What's the status of your contrib? WDYT about the syntax for retrieve the next/current?

matanshukry · 2015-04-04T02:30:12Z

I like the syntax, and I can change it in a few days. But I don't think we should release sequences until we can verify they work correctly.

As I wrote (in my long post, I admit), the distributed test is not working. And until it is resolved (or a good explanation on simply how to use it), I don't think we should use sequences.

I will change the syntax in a few days anyway though, so you'll be able to merge it if you want (probably later than 2.1-rc1 by now, but still) but my recommendation would be not to use it until the test is resolved.

That said, I would love if anyone else can take a look at the test, perhaps I'm not using the classes correctly (I tried many ways so I doubt it, but who knows), or even shed some light on the case.

lvca · 2015-04-04T13:57:46Z

@matanshukry Which is the test that failed and need our help?

matanshukry · 2015-04-04T19:30:39Z

@lvca - it's the distributed test ( ServerClusterLocalSequenceTest )

lvca · 2015-04-04T22:45:05Z

I have just run ServerClusterLocalSequenceTest and pass.

matanshukry · 2015-04-04T22:55:02Z

I forgot I added a variable to skip it;

Change the variable "RUN_PARALLEL_SYNC_TEST" to true, and run again (Assuming you didn't change it before ?)

lvca · 2015-04-05T00:15:27Z

ok, found the distributed problem: when node 1 says version 5 and node 2 says version 6, this is a OConcurrentModificationException, but it's raised an ODistributedException. I changed the code to catch this case and seems to finish correctly. The only weird thing is that 50% of the times the result is not 200 numbers but 197 or 198. Seems some sequenced number is retrieved on both thread. On high contention the number of retry (10) was too low. Just to demonstrate the problem I increased it to 1000, just for test. I've also merged your PR 3738 and pushed everything on "pr/3744" branch. Do you have time to look why the same sequence number is returned to 2 threads?

matanshukry · 2015-04-05T02:41:39Z

I remember working intensively for a day or two trying to find out, and the only reason I could think of is the missing case of distributed locks. I can't say for sure if the problem was between different nodes by the way, and not the local node.

The fact that the local node also gets an update message after updating a document, might be problematic too.

I could test the theory by creating some sort of distributed lock (or perhaps higher scope lock for the local node) and test it, but my time is a bit short now. I can't promise anything less than two weeks.

matanshukry · 2015-04-13T16:17:18Z

After looking at the code a bit, I agree with tlgman about the implementation; Making next() work in such a case is a bit tricky. Any chance we can use

select sequence:.next()

or something like that (the idea is using the entire syntax is in the select clause, rather than the from clause) ?

lvca · 2015-05-14T00:44:00Z

@luigidellaquila WDYT about the syntax?

lvca · 2015-05-14T08:06:17Z

I have good news on this. The concurrent sequence in a cluster is semantically correct. The problem is the high contention + replication + MVCC that practically let fail it. The test passes if:

I change the internal retry from 10 to 100
put a random pause between 0 and 50 ms:

 OSequence seq = db.getMetadata().getSequenceLibrary().getSequence(sequenceName);
          for (int j = 0; j < SEQ_RUN_COUNT; ++j) {
            try {
              long value = seq.next();
              res.add(value);
              Thread.sleep(new Random().nextInt(20));
            } catch (OConcurrentModificationException ex) {
              failures.incrementAndGet();
            }
          }

Catch the Distributed Exception about failure and treat it as a retry.

I published it as "sequence" branch.

lvca · 2015-05-14T08:10:50Z

I think on this use case it would be much faster and cheap acquiring a distributed lock to avoid contentions. OrientDB provides it by OHazelcastPlugin.getLock(name) where name could be the following "sequence..". In this way it would be unique even in case of multiple DBs. If you can get a Distributed lock, acquire it, do the change and then release it, you switch from optimistic approach (MVCC) to a pessimistic one. WDYT about providing this behavior as a setting?

matanshukry · 2015-08-04T18:59:49Z

That could be a good feature, but I feel like the optimistic approach (MVCC) should work.

Although your changes to the code would work, I see no reason why we need to add a pause to make it work; It sound like an internal flaw.

I tried increasing the internal try and catching it too, by the way; it wasn't enough.

Conflicts: core/src/main/java/com/orientechnologies/orient/core/db/document/ODatabaseDocumentTx.java core/src/main/java/com/orientechnologies/orient/core/metadata/OMetadataDefault.java tools/src/main/java/com/orientechnologies/orient/console/OConsoleDatabaseApp.java

lvca · 2015-08-29T14:59:40Z

Hey guys, I'm going to test this PR if distributed stuff is working now, but we didn't resolve the problem about syntax. I think the smoothest integration would be creating a new "sequence" target and execute this kind of query:

select next() from sequence:InvoiceId

So if you want it in your update you can do:

update invoice set id = (select next() from sequence:InvoiceId), date = date()

matanshukry added 4 commits March 10, 2015 23:26

Added sequences

efe3ff4

Fixed sql command sequence create/drop syntax

7403da1

Merge remote-tracking branch 'source/develop' into issueSeq

f306388

Merged develop and fixed a few things

403c35b

lvca self-assigned this Mar 11, 2015

lvca added this to the 2.1 milestone Mar 11, 2015

lvca added enhancement contributor friendly labels Mar 11, 2015

lvca modified the milestones: 2.1-rc1, 2.1-rc2 Apr 1, 2015

lvca modified the milestones: 2.2, 2.1-rc2 Apr 13, 2015

lvca added the approved feature label Apr 30, 2015

lvca added a commit that referenced this pull request Aug 31, 2015

Merged PR #3744

579879f

lvca closed this Sep 18, 2015

matanshukry deleted the issueSeq branch January 1, 2016 23:27

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Issue seq #3744

Issue seq #3744

matanshukry commented Mar 10, 2015

lvca commented Mar 11, 2015

lvca commented Mar 11, 2015

lvca commented Mar 11, 2015

luigidellaquila commented Mar 11, 2015

tglman commented Mar 11, 2015

lvca commented Mar 11, 2015

lvca commented Mar 11, 2015

matanshukry commented Mar 11, 2015

lvca commented Mar 11, 2015

matanshukry commented Mar 12, 2015

tglman commented Mar 20, 2015

lvca commented Apr 1, 2015

matanshukry commented Apr 4, 2015

lvca commented Apr 4, 2015

matanshukry commented Apr 4, 2015

lvca commented Apr 4, 2015

matanshukry commented Apr 4, 2015

lvca commented Apr 5, 2015

matanshukry commented Apr 5, 2015

matanshukry commented Apr 13, 2015

lvca commented May 14, 2015

lvca commented May 14, 2015

lvca commented May 14, 2015

matanshukry commented Aug 4, 2015

lvca commented Aug 29, 2015

Issue seq #3744

Issue seq #3744

Conversation

matanshukry commented Mar 10, 2015

User perspective

Usage

Create a sequence

Drop a sequence

Using a sequence

Notes

Prefix

Distributed

Code perspective

Test

lvca commented Mar 11, 2015

lvca commented Mar 11, 2015

lvca commented Mar 11, 2015

luigidellaquila commented Mar 11, 2015

tglman commented Mar 11, 2015

lvca commented Mar 11, 2015

lvca commented Mar 11, 2015

matanshukry commented Mar 11, 2015

lvca commented Mar 11, 2015

matanshukry commented Mar 12, 2015

tglman commented Mar 20, 2015

lvca commented Apr 1, 2015

matanshukry commented Apr 4, 2015

lvca commented Apr 4, 2015

matanshukry commented Apr 4, 2015

lvca commented Apr 4, 2015

matanshukry commented Apr 4, 2015

lvca commented Apr 5, 2015

matanshukry commented Apr 5, 2015

matanshukry commented Apr 13, 2015

lvca commented May 14, 2015

lvca commented May 14, 2015

lvca commented May 14, 2015

matanshukry commented Aug 4, 2015

lvca commented Aug 29, 2015