MODE-650 #2

Merged
merged 78 commits into from Apr 2, 2013

Conversation

vasilievip

  • Merge with latest master
  • Few tweaks to code

rhauch and others added 30 commits February 18, 2013 13:51
… options.

The garbage collection options can now be specified in the repository
configuration. Each repository registers its own garbage collection
processes with a named thread pool, and these processes are stopped
when the repository is shut down.
…stem

Added the garbage collection configuration attributes to ModeShape's
XSD schema and subsystem, and added a unit test that verifies the
attributes are read correctly.
… simultaneously from multiple threads. The solution was to add an additional method to JcrSession#PreSave which is executed only after Infinispan locks are obtained on the changed nodes.
Made several improvements to the recent changes that add validation
of SNS *after* locks were obtained.
Added a new specialization of WorkspaceCache called
TransactionalWorkspaceCache that is used in place of WorkspaceCache
any time a ReadOnlySessionCache or WritableSessionCache is used
during the scope of a running user transaction.

Normally, each RepositoryCache instance has for each workspace a
single WorkspaceCache instance shared by all sessions. However, in
the case where a session is running within a user transaction, it is
possible for the session (or other sessions that are also participating
in the same user transaction) to persist node changes in the document
store but to not commit the transaction. This means that such
persisted-but-not-committed node representations should be visible to
the sessions running within the transaction but should not be visible
to sessions outside of the transaction. Because the shared
WorkspaceCache lazily loads the requested nodes using the transactional
scope of the requestor, this means that the WorkspaceCache may load
these persisted-but-not-committed node representations and make them
available to other sessions (outside the transaction boundary) and
thereby leak transactionally-scoped data. Similarly,
a WorkspaceCache that already has the persisted-and-committed node
representations from other sessions will expose those already-cached
forms to sessions running within the transaction, thereby causing
some of the sessions in the same user transaction to not see the
transactionally-scoped (e.g., persisted-but-not-committed) changes.

Therefore, such sessions running within user transactions need a
transactionally-scoped WorkspaceCache instance rather than sharing
the common WorkspaceCache instance. However, the ModeShape
infrastructure is not set up to handle lots of WorkspaceCache instances
for the same workspace, so we have to be careful about any
transaction-specific caches.

This new design changes the AbstractSessionCache to maintain the original
reference to the real (shared) WorkspaceCache, but to also have another
WorkspaceCache reference that may change (depending upon whether there
is a current user transaction) to a transient TransactionalWorkspaceCache
instance specific to the transaction.

Originally, I tried to have the TransactionalWorkspaceCache do no
caching whatsoever, but this caused problems within the
WritableSessionCache's persistChanges logic, which needs access to
the persisted representation of the node prior to when each changed
node was actually changed. So, the TWC simply uses an in-memory
ConcurrentHashMap. (The alternative was to use another Infinispan
cache such as the one used for the shared workspace caches, but this
would dramatically increase the complexity of the system and would
require an additional cache container configured with different settings
than the normal workspace caches and to use a different naming
convention.)

Several new test cases were added to verify that multiple sessions
within the same user transaction do indeed all see the same
transactionally-scoped data, while other sessions that are outside
of the user transaction do indeed not see any of the transactionally-
scoped changes until commit occurs.
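
The visibility rule described above can be modelled in a few lines. This is only a simplified sketch, not ModeShape's code: the names `NodeCache`, `SharedCache`, `TxScopedCache` and `SessionCacheSketch` are hypothetical, node documents are reduced to strings, and the document store is folded into the caches. It assumes the transaction-scoped cache is a plain `ConcurrentHashMap` that falls back to the shared cache.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a workspace cache; not ModeShape's real interface.
interface NodeCache {
    String getNode(String key);
}

// One instance per workspace, shared by all sessions; holds committed data only.
class SharedCache implements NodeCache {
    private final Map<String, String> committed = new ConcurrentHashMap<>();

    void commit(String key, String doc) { committed.put(key, doc); }

    @Override
    public String getNode(String key) { return committed.get(key); }
}

// Transient cache for a single user transaction, backed by an in-memory map.
class TxScopedCache implements NodeCache {
    private final SharedCache shared;
    private final Map<String, String> txLocal = new ConcurrentHashMap<>();

    TxScopedCache(SharedCache shared) { this.shared = shared; }

    // Records a persisted-but-not-committed node representation.
    void persistUncommitted(String key, String doc) { txLocal.put(key, doc); }

    @Override
    public String getNode(String key) {
        String doc = txLocal.get(key);
        return doc != null ? doc : shared.getNode(key);
    }

    // On commit, the transaction-scoped data becomes visible to everyone.
    void commit() {
        txLocal.forEach(shared::commit);
        txLocal.clear();
    }
}

// Keeps the original shared reference plus a second reference that may change.
class SessionCacheSketch {
    private final SharedCache shared;
    private NodeCache active;

    SessionCacheSketch(SharedCache shared) {
        this.shared = shared;
        this.active = shared;
    }

    void joinTransaction(TxScopedCache tx) { active = tx; }

    void leaveTransaction() { active = shared; }

    String getNode(String key) { return active.getNode(key); }
}
```

In this model, two sessions that join the same `TxScopedCache` see each other's uncommitted changes, while a session that still points at the `SharedCache` sees them only after `commit()`.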
Other fixes for MODE-1819 actually fixed the problem, while this
change only adds test cases and test dependencies on Atomikos to
verify the expected behavior now works. Also added similar tests
that use the JBoss Transactions library.
…written strings are always UTF-8 encoded and added UTF-8 as an explicit response header.
…nd also to generate full links when displaying the contents of folders.
Null check in case the node no longer exists in the cache.
… of the same node in a session

Fix the issue by making sure that nodeTypes with the same name are actually updated properly in the cache.
Before this fix, when a change was made via a WritableSessionCache,
any changes persisted to the store were also used to directly purge
any previously-cached node representations from the corresponding
WorkspaceCache. Thus, when the RepositoryCache received the changes,
it simply forwarded them to all workspaces except the one in which
the changes originated.

However, in a clustered environment, the changes originate on one
process and are sent to all other processes. The logic described
above works fine in the same process in which the changes originated,
but it doesn't correctly propagate the changes in the other processes.

The fix was pretty simple: determine if the changes came from the same
process. If so, then use the current logic. If not, always forward the
changes to all WorkspaceCache instances.

I also modified the ClusteredRepositoryTest to look in process A for
nodes changed in process B. This failed before the logic was fixed,
and now works with the fix.
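
The routing decision can be sketched as a small pure function. The class `ChangeRouter`, its parameters, and the use of plain strings for process and workspace identifiers are all hypothetical simplifications, not the actual RepositoryCache code.

```java
import java.util.ArrayList;
import java.util.List;

class ChangeRouter {
    /**
     * Returns the names of the workspaces whose caches should receive a change set.
     * Changes from the local process skip the originating workspace (its cache was
     * already purged directly); changes from a remote process go to every workspace.
     */
    static List<String> targets(List<String> workspaces,
                                String originWorkspace,
                                String originProcess,
                                String localProcess) {
        boolean sameProcess = localProcess.equals(originProcess);
        List<String> result = new ArrayList<>();
        for (String workspace : workspaces) {
            if (sameProcess && workspace.equals(originWorkspace)) {
                continue; // already handled by the session that made the change
            }
            result.add(workspace);
        }
        return result;
    }
}
```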
…nate Search/Lucene to 4.2/3.6.

Because the new version of Infinispan does not allow duplicate MBeans with the same name by default, the tests had to be updated to explicitly turn this option on for each cache used.
Removed the @Ignore from the JDBCRepositoryIntegrationTest, which had
to be added to the changes for MODE-1826 since the fix for
MODE-1769 has not yet been corrected but will be when Infinispan
5.2.1.Final is used.
- the former 7.1.1.kit has been removed (together with assemblies) because ISPN 5.2 is not compatible with AS7.1.1
- removed the modeshape-integration-tests folder as that was ported from 2.x and either the corresponding tests have been ported in various other places in 3.x or they no longer apply (connectors)
- changed our AS7 distribution kit to as72
- since integration with AS7.2.0 is still WIP, the integration modules have been commented out
…S7.2 server and re-enabled the integration tests modules. However, for a local build to work, one needs for the time being a locally installed (via Maven) version of AS7.2.

 In the process of updating the kit, a couple of other things were updated:
 - exception processing around WritableSessionCache#persistChanges
 - logging of exceptions / errors in the ModeShape Webdav Servlet
…v6 profiles. By default, neither IPv4 nor IPv6 will be forced, meaning that whatever the underlying OS supports will be used (see http://docs.oracle.com/javase/1.5.0/docs/guide/net/ipv6_guide/index.html for more information)

Also, it seems that on Windows (at least) JGroups 3.2 is a lot faster when using IPv6 than it is when using IPv4, so a new OS-dependent profile was added to modeshape-jcr. Because Arquillian doesn't seem to work with IPv6, IPv4 was enforced for the integration tests.
rhauch and others added 27 commits March 11, 2013 17:53
Improved the behavior of JCR-SQL2 and JCR-QOM queries that use
columns not in the SELECT clause. Such columns are necessary
within the query processing to evaluate the ordering, but are not
to be exposed in the query results. One particular case needed
by the recent XPath improvements (to handle order-by clauses that
involve a property on the child node) resulted in the ORDER BY
column coming from a selector that was not even included in the
SELECT.

A test case recently added in other MODE-1680 commits was modified
to verify the XPath and JCR-SQL2 behavior is as expected. Also added
another query test that uses only JCR-SQL2.
… are stored

Added a few bits of logic to ensure that any DocumentEditor or ArrayEditor
(which wrap Document and Array objects) are not stored. There were a few
methods that might have been storing them.
ModeShape's implementations of 'javax.jcr.Item' all implement this
new interface. So applications can check and cast a Node or Property
(or Version or VersionHistory) to 'Namespaced' so that these methods
can be called.

Added several unit tests to verify the behavior.
…o that new kit and integration tests work against EAP 6.1.Alpha1
It is now possible to define a node type that overrides a property
definition of a supertype, as long as the new definition is as-constrained
or more-constrained than that of the supertype.

For example, a property definition in a supertype N might define
constraints on a STRING property definition, with enumerated values
of "A", "B", "C" and "D". A subtype N' can override this property
definition with constraints that are a subset of this (e.g., "A", "C"),
ensuring that any node of type N' is also still a valid node of type N.
However, not all nodes of type N are valid nodes of type N'.

ModeShape determines which property definitions are
"as- or more-constrained than" another property definition using rules
that are dependent upon the property type of the definitions. If the
property types are the same, the rules are as follows:

- STRING and URI: the constraint literal strings (which are regular
expressions) in the overriding definition must also appear exactly
as a constraint literal string in the overridden definition.
- LONG, DOUBLE, DECIMAL, DATE: the constraints are ranges, and each of the
ranges in the overriding definition must be equal to or wholly contained within
one of the ranges in the overridden definition.
- BINARY: the constraints are ranges for the binary value sizes, and
each of the size ranges in the overriding definition must be equal to
or wholly contained within one of the size ranges in the overridden definition.
- NAME: the constrained names in the overriding definition must be
a subset of those found in the overridden definition.
- PATH: each constrained path (including an optional '*' descendant
wildcard) in the overriding definition must either exactly match a
path in the overridden definition or, if the optional '*' descendant
wildcard is used, be a path that is below one of the paths in the
overridden definition.
- REFERENCE: the constrained node type names in the overriding definition must be
a subset of those found in the overridden definition.

If the property types of the overriding and overridden property definitions
are different, then a simple string comparison is used to ensure that
each of the constraint literal strings of the overriding definition
are also constraint literal strings for the overridden definition.
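
Two of the rules above, literal subsets and range containment, can be sketched as follows. This is a rough illustration rather than ModeShape's implementation: `LongRange` and `ConstraintComparisonSketch` are hypothetical names, ranges are simplified to closed `long` intervals, and the sketch assumes that an empty constraint list means "unconstrained".

```java
import java.util.List;
import java.util.Set;

// Hypothetical closed interval; real JCR range constraints also allow open bounds.
final class LongRange {
    final long min;
    final long max;

    LongRange(long min, long max) {
        this.min = min;
        this.max = max;
    }

    boolean contains(LongRange other) {
        return min <= other.min && other.max <= max;
    }
}

class ConstraintComparisonSketch {
    // STRING-style rule: every overriding literal must also appear in the overridden set.
    static boolean literalsAsOrMoreConstrained(Set<String> overriding, Set<String> overridden) {
        if (overridden.isEmpty()) return true;   // assumption: supertype is unconstrained
        if (overriding.isEmpty()) return false;  // assumption: subtype dropped all constraints
        return overridden.containsAll(overriding);
    }

    // LONG-style rule: every overriding range must fit inside some overridden range.
    static boolean rangesAsOrMoreConstrained(List<LongRange> overriding, List<LongRange> overridden) {
        if (overridden.isEmpty()) return true;
        if (overriding.isEmpty()) return false;
        for (LongRange candidate : overriding) {
            boolean covered = false;
            for (LongRange base : overridden) {
                if (base.contains(candidate)) {
                    covered = true;
                    break;
                }
            }
            if (!covered) return false;
        }
        return true;
    }
}
```

With the supertype literals "A", "B", "C", "D" from the example, a subtype restricted to "A" and "C" passes the literal check, while one that adds "E" does not.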

A number of unit tests were added to verify the logic that determines
whether a property definition is as- or more-constrained than another
property definition. Also, several unit tests were added to check that
nodes that use overridden definitions are properly validated.

Note that even when the node type has residual property definitions,
if a non-residual property definition applies but is not constrained,
the residual property definitions are **NOT** used. Thus, a property
"foo" on a node with a (primary or mixin) node type that has a "foo"
property definition will be validated with the "foo" property definition;
if the property value does not satisfy the constraints, then a
ConstraintViolationException is thrown despite the fact that a residual
property definition might theoretically apply. This behavior is indeed
compliant with the specification.
Per IRC conversation with Oleg, if repositoryId is short form (e.g. "repositoryName", rather than "repositoryName:workspace"),
then workspace(String) should return null. This allows login to use the default workspace.
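
A minimal sketch of that parsing rule, assuming the workspace name is whatever follows the first ':' in the identifier. The class name `RepositoryIdSketch` is hypothetical and the real method lives elsewhere.

```java
class RepositoryIdSketch {
    // Returns the workspace part of "repositoryName:workspace",
    // or null for the short form so that login falls back to the default workspace.
    static String workspace(String repositoryId) {
        int separator = repositoryId.indexOf(':');
        return separator < 0 ? null : repositoryId.substring(separator + 1);
    }
}
```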

Adds jetty-maven-plugin to run ModeShapeCmisClientTest.

https://issues.jboss.org/browse/MODE-1832
…which allows any web application packaged under the "deployments" folder of the ModeShape main module to be deployed by the subsystem. This replaces the previous "static" way of copying web applications directly from the kit into the server's "deployments" directory.
…tly removed

When the query results are being processed/produced, any node that is not
found is simply ignored (treated as nonexistent from the perspective
of the query). However, in certain cases, the timing of a removal could
occur within the method that obtains the document from the cache, causing
an exception. This was rectified to return null in such cases. Thus,
the existing code in the BasicTupleCollector that handles null nodes
will continue to work in all cases.

I was not able to come up with a test case able to replicate this scenario,
however. That's because the failure occurs within a single WorkspaceCache
method, after the SchematicEntry is obtained but before the SchematicEntry's
content document can be obtained. A test case cannot reliably ensure
this condition occurs (without adding junk code within the WorkspaceCache).
…modules" folder. To make this easier to change in the future, a root property was added which controls the location.
JGroups needs to know about the classloader that loads the ModeShape
event classes. The easiest way to do this is to extend ObjectInputStream
to supply and use a classloader for deserialization. This is especially
important in AS7/EAP and OSGi, where a single classloader is not used
for all the components.
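
The general technique looks roughly like this. The class name `ClassLoaderAwareObjectInputStream` is hypothetical and this is not the class added by the commit; only the standard `java.io.ObjectInputStream#resolveClass` hook is relied on.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectStreamClass;

// ObjectInputStream that resolves classes through a caller-supplied class loader first.
class ClassLoaderAwareObjectInputStream extends ObjectInputStream {
    private final ClassLoader loader;

    ClassLoaderAwareObjectInputStream(InputStream in, ClassLoader loader) throws IOException {
        super(in);
        this.loader = loader;
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {
        try {
            // Look in the supplied loader (e.g. the module that owns the event classes).
            return Class.forName(desc.getName(), false, loader);
        } catch (ClassNotFoundException e) {
            // Fall back to the default lookup, which also handles primitive types.
            return super.resolveClass(desc);
        }
    }
}
```

A caller would wrap the incoming byte stream in this class and pass the class loader of the module that defines the serialized types.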
The internal session nodes that track transient changes were not
correctly recording how a property was removed. If a property were
added and immediately removed (all before saving), the add was
cleared from the transient state but the remove was still kept.
(This was due to the fact that it's more difficult to track removes,
since we're using a single structure to track adds and sets, so
it's not really easy to tell when a property was added and then
removed.) This change works around that constraint by using the
persisted state of the node to determine if any added/set property
already exists in the persisted state. If not, then the remove
is considered to undo the prior add/set.
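
The workaround can be illustrated with a toy model of a transient node. Everything here is hypothetical (the class `TransientNodeSketch`, string-valued properties, a set of persisted property names); it only shows the idea of consulting the persisted state when a property is removed.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class TransientNodeSketch {
    private final Set<String> persisted;                          // names already in the store
    private final Map<String, String> changed = new HashMap<>(); // adds and sets share one map
    private final Set<String> removed = new HashSet<>();

    TransientNodeSketch(Set<String> persistedPropertyNames) {
        this.persisted = new HashSet<>(persistedPropertyNames);
    }

    void setProperty(String name, String value) {
        removed.remove(name);
        changed.put(name, value);
    }

    void removeProperty(String name) {
        changed.remove(name);
        if (persisted.contains(name)) {
            removed.add(name); // a real removal that must be saved
        }
        // Otherwise the property never reached the store, so the remove
        // simply undoes the earlier add and nothing is recorded.
    }

    boolean hasChanges() {
        return !changed.isEmpty() || !removed.isEmpty();
    }
}
```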

A single test was added to perform the add/remove prior to a save.
However, due to the eventing mechanism, it's not possible for the
test to verify that the node was not re-indexed. (Simple debugging
and trace logging was used to verify that such changes no longer
cause an update to the node's indexes. If there are better ideas
for how the test can verify, please say so.)
Changed how the DatabaseBinaryStore configures and uses the database,
so that the various SQL statements (both DDL and DML) are read from
property files rather than be dynamically created. This means that it
is far easier to see and understand exactly the kinds of statements
that are issued for a given DBMS. It also allows the database table
and statements to be altered slightly for specific DBMSes.

Several DBMS-specific property files (containing the SQL statements)
were added to accommodate the variations in the previously-generated
statements. There is a fair amount of duplication, but each file
is self-contained, is very readable, and can be customized as needed
(including by the user, since additional property files can be created
to override those ModeShape provides out-of-the-box).
IMO, these benefits outweigh the negative aspects of the duplication.

Since several DBMSes (including MySQL, PostgreSQL, Sybase
and SQLServer) consider 'usage' to be a reserved keyword, the 'usage'
column was renamed to 'usage_flag' in the create-table and other
statements for all of these DBMSes. (This shouldn't be a problem,
since the use of reserved words should have prevented creation
of the table in the first place.) The statements for the other DBMSes
still use 'usage', but these DBMSes don't consider 'usage' to be a
reserved keyword.

This change also reduced the amount of "general-purpose" JDBC-oriented
code, such as eliminating the SQLBuilder code.

Finally, the Database class now creates PreparedStatement objects
during initialization, and these are reused throughout the lifetime
of the DatabaseBinaryStore. This will increase performance compared
with creating a PreparedStatement every time one is needed.
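
As a rough illustration of reading statements from a property file, the sketch below loads key/value pairs with `java.util.Properties`. The class name `SqlStatements`, the key `insert_content`, and the table and column layout are made up for the example and are not taken from ModeShape's files.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class SqlStatements {
    private final Properties statements = new Properties();

    // In a real store the text would come from a per-DBMS resource file;
    // a string is used here to keep the sketch self-contained.
    SqlStatements(String propertiesText) {
        try {
            statements.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Looks up one statement by key and fails loudly if the file does not define it.
    String get(String key) {
        String sql = statements.getProperty(key);
        if (sql == null) {
            throw new IllegalArgumentException("No statement for key: " + key);
        }
        return sql;
    }
}
```

Each looked-up string could then be passed once to `Connection.prepareStatement` during initialization and the resulting statement reused.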
… is removed. Refactored the general remove & recover operations, as they are very similar.
…indexing operations were submitted on a transaction in state COMMITTED, causing Hibernate Search to not see an active transaction.
okulikov added a commit that referenced this pull request Apr 2, 2013
@okulikov okulikov merged commit 98aca22 into okulikov:MODE-650 Apr 2, 2013
okulikov pushed a commit that referenced this pull request Sep 2, 2014
…dexes

There are 2 changes:

1) The engine now internally alters the LIKE expression in a path comparison
to include a same-name-sibling index for all literal segments of the path.
For example, the path `/a/b/c_/d[%]` is converted to `/a[1]/b[1]/c_[1]/d[%]`,
while `/a/%/d[%]` is converted to `/a[1]/%/d[%]`. Obviously any segment that
already has a SNS index (literal or wildcard) is left alone.

2) When extracting from each row the left-hand values for the LIKE comparison,
same-name-sibling indexes are added to all segments.

These changes replace a slightly different technique that did not do #1 but
for #2 extracted the idiomatic path (with SNS indexes only where required)
and a second value with SNS indexes for all segments. This worked when the
LIKE expression specified SNS indexes (wildcard or literals) for all segments,
or when the LIKE expression had no SNS indexes at all. But this would not work
when the LIKE expression contained some segments with and others without SNS
indexes (e.g., `/a/b/c[%]`).

Several new tests validate the behavior, and no regressions were found.
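
The first change can be approximated with a short string rewrite. This is a sketch only, under the assumption that a segment counts as "literal" when it contains no '%' and does not already end in a bracketed index; the class name `LikePathSketch` is hypothetical and the engine's real parsing is likely more careful.

```java
class LikePathSketch {
    // Adds "[1]" to every literal segment of a LIKE path expression that has no SNS index.
    static String addSnsIndexes(String likePath) {
        String[] segments = likePath.split("/", -1);
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < segments.length; i++) {
            if (i > 0) {
                result.append('/');
            }
            String segment = segments[i];
            result.append(segment);
            boolean literal = !segment.isEmpty()
                    && !segment.contains("%")   // wildcard segments are left alone
                    && !segment.endsWith("]");  // segments with an index are left alone
            if (literal) {
                result.append("[1]");
            }
        }
        return result.toString();
    }
}
```

Applied to the two examples from the commit message, this yields `/a[1]/b[1]/c_[1]/d[%]` and `/a[1]/%/d[%]`, and the previously unsupported mixed case `/a/b/c[%]` becomes `/a[1]/b[1]/c[%]`.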