Feature/geode 12#70

Closed
jinmeiliao wants to merge 456 commits into apache:feature/GEODE-12 from jinmeiliao:feature/GEODE-12

@jinmeiliao
Member

-- rename pulse to gemfire-pulse
-- fix the test script and tests so that it won't depend on external environment variables
-- fix the test categories, make sure the automation tests run without failure.

shroman and others added 30 commits November 11, 2015 12:39
…nect

network-partition-detection was taking too long to initiate.  This adds checks for IOExceptions to the Transport class to initiate member checks, shrinking the time to detect partitions.

There were still problems with auto-reconnect not being able to join while the old member ID was still in the view.  It would also sometimes install a view and think it had joined when it had not, causing other members to reject messages from the new "member" and resulting in a hung test.  GMSJoinLeave now rejects view messages that don't contain an appropriate member ID during the join process, and installView is smarter about what views it will accept as well.

The view creator was being stubborn about exiting during shutdown.  I've added additional checks to it so that it won't accidentally create another view when GMSJoinLeave is in the process of stopping.
The OffHeapReference interface has been removed.
Use the StoredObject interface instead.

The XD SRC_TYPE constants have been renamed to
unused and a comment added explaining why we might
want to keep the SRC_TYPE/ChunkType feature around
for future off-heap extensions.
The "soplog" code was a partial implementation of a concurrent
LSM tree stored on local disk. This is not currently used anywhere
so is being cleaned up.  The interfaces used by the HDFS feature
have not been deleted.
TCPConduit's Connection.java was not initiating suspect processing when a member crashed.  This was due to not having the check in the normal (amt < 0) check for a socket error.

In testing this fix with ReconnectDUnitTest I found that the change exposed some problems in GMSJoinLeave that were keeping reconnect from happening as fast as it should:

1. The reconnecting member was processing a RemoveMember message intended for its old incarnation.  This caused it to invoke forceDisconnect() but the concurrent join() attempt did not notice this and continued to try to connect until it timed out.

2. ViewCreator was removing the new member from the view if its old ID was being declared crashed in the same view because of the way InternalDistributedMember.compareTo() works with viewless identifiers.

This change-set also gets rid of a bunch of references to JGroups scattered around in the code and removes references to JGroups classes from GMSMembershipManager, moving the code requiring these refs to the quorum checker.
There was a catch clause of a CancelException that was causing us not to
reply to a function call if a CacheClosedException was thrown from the
function. That caused a hang waiting for replies.
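
The hang described above comes from a code path that returns without replying. A minimal sketch of the idea follows, assuming hypothetical stand-in classes (`ReplySketch`, a local `CancelException` and `CacheClosedException`, and a list standing in for the reply channel). It is not the actual Geode code, only an illustration of sending exactly one reply on every path:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-ins; none of these names are taken from the Geode source.
final class ReplySketch {
    static class CancelException extends RuntimeException {
        CancelException(String message) { super(message); }
    }

    static class CacheClosedException extends CancelException {
        CacheClosedException(String message) { super(message); }
    }

    // Runs the function and always records exactly one reply for the caller.
    static void executeAndReply(Supplier<String> function, List<String> replies) {
        String reply;
        try {
            reply = "result:" + function.get();
        } catch (CancelException e) {
            // Returning here without replying would leave the caller waiting forever.
            // Instead, report the failure back as the reply.
            reply = "exception:" + e.getClass().getSimpleName();
        }
        replies.add(reply);
    }

    public static void main(String[] args) {
        List<String> replies = new ArrayList<>();
        executeAndReply(() -> "ok", replies);
        executeAndReply(() -> { throw new CacheClosedException("cache closed"); }, replies);
        System.out.println(replies);
    }
}
```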
The build was using the executable git command to populate version
information, which created big blocks of Gradle code and could be
unstable. All Git executable commands have been changed to use the
Gradle Git plugin. If the directory is not a valid Git workspace, the
build logs a warning and uses default values to populate version
information.

Tested with and without Git workspace
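
The fallback behaviour can be sketched in plain Java. This is only an illustration under stated assumptions: the real change lives in the Gradle build and uses the Gradle Git plugin, while the class name, property keys, and default values below are hypothetical.

```java
import java.io.File;
import java.util.Properties;

// Hypothetical sketch; the property keys and defaults are assumptions, not the build's real ones.
final class VersionInfoSketch {
    static final String DEFAULT_VALUE = "UNKNOWN";

    // Returns version properties, falling back to defaults outside a Git workspace.
    static Properties versionInfo(File projectDir) {
        Properties props = new Properties();
        File gitMarker = new File(projectDir, ".git");
        if (!gitMarker.exists()) {
            System.err.println("WARNING: " + projectDir
                    + " is not a Git workspace; using default version information");
            props.setProperty("Source-Revision", DEFAULT_VALUE);
            props.setProperty("Source-Repository", DEFAULT_VALUE);
            return props;
        }
        // A real build would ask the Git plugin for these values; the sketch
        // only records that a workspace was found.
        props.setProperty("Source-Revision", "from-git");
        props.setProperty("Source-Repository", "from-git");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(versionInfo(new File(".")));
    }
}
```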
This internal class was added for a use case that is no longer
supported. It has a bug, in that the client may never have connected to
the server that it is supposed to be able to look up and invoke the
function on. Deleting the class rather than trying to fix the bug
because the class is no longer needed.
… connection

The initiation of member verification wasn't working correctly.  It wasn't
being triggered most of the time and when it was the check passed because
it was done so soon after the other member crashed that the timestamp
checks in GMSJoinLeave didn't think enough time had passed to declare the
member dead.  This changes the check to request a heartbeat first.
Merge remote-tracking branch 'origin/develop' into feature/GEODE-77

Conflicts:
	gemfire-core/build.gradle
	gemfire-core/src/main/java/com/gemstone/gemfire/internal/cache/LocalRegion.java
	gemfire-jgroups/build.gradle
Fixed various typos, fleshed-out some links and descriptive text.
Modified the README.md instructions for setting up the local webpage environment.
jinmeiliao and others added 23 commits January 7, 2016 09:46
…n watcher since log4j now has almost the same implementation.
Separated that logic, as the test may look at this count more than once.
…d to

wait for client termination before validating the HAStatsCleanup.
…dDispatch

removed an old assertion that is no longer valid and rewrote the code to
allow for the possibility of a null entry in the queue.

This commit also includes renaming of one of the old BugXXX unit test classes
that I'm currently working on to have a name that matches what it is testing.
This failure could also happen with the other test in this class:  a
client does put() operations that force the scheduling of a fetch of
PR metadata.  The client asserts that the metadata has been fetched,
but since the fetching is asynchronous it's possible for the client
to complete its put() operations and execute the assertion before the
asynchronous operation has completed.

The fix adds a WaitCriterion for the assertion.
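
The idea behind waiting on a criterion, rather than asserting immediately, can be sketched with plain JDK classes. This is not Geode's `WaitCriterion` API; `AwaitSketch`, `waitFor`, and the `metadataFetched` flag are hypothetical names used only to show polling until an asynchronous operation completes or a timeout expires.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for a wait criterion; not the Geode test utility.
final class AwaitSketch {
    // Polls the condition until it holds or the timeout elapses.
    static boolean waitFor(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out without the condition becoming true
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return condition.getAsBoolean();
            }
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicBoolean metadataFetched = new AtomicBoolean(false);
        // Simulates the asynchronous metadata fetch finishing a little later.
        Thread fetcher = new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            metadataFetched.set(true);
        });
        fetcher.start();
        // Asserting right away could fail; waiting gives the fetch time to finish.
        if (!waitFor(metadataFetched::get, 5_000, 10)) {
            throw new AssertionError("metadata was not fetched in time");
        }
        fetcher.join();
        System.out.println("metadata fetched before the assertion ran");
    }
}
```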
* Add DistributedTest @Category to DistributedTestCase
* Rename disabled tests and use @Ignore instead.
* Add ContainerTest, HydraTest categories.
* Ensure every *Test has a @Category.
* Disable performance tests that perform no assertions.
* Modify build to check all tests for categories.
* Modify build to use **/*Test.class pattern for all testing tasks.
* Modify DUnitLauncher to get name of ChildVM.class from the class itself.
…failed with AssertionFailedError

Order by needs to be specified in the query
…CompactRangeIndex

Fixed QueryMonitor test hook to pause at the correct point in code
The failure was caused by LifecycleListenerJUnitTest setting
a system property and then not unsetting it when it completed.
testClose now asserts that the system property must be initially
false when it runs.
LifecycleListenerJUnitTest now unsets the property in a finally
block after setting it.
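
The set-then-restore pattern described above can be sketched as follows. The helper name `runWithProperty` and the property key `sketch.lifecycle.flag` are hypothetical; the actual property used by LifecycleListenerJUnitTest is not shown in this commit message.

```java
// Hypothetical helper; the class and property names are assumptions for illustration.
final class SystemPropertyGuard {
    // Sets a system property for the duration of body, then restores the prior state.
    static void runWithProperty(String key, String value, Runnable body) {
        String previous = System.getProperty(key);
        System.setProperty(key, value);
        try {
            body.run();
        } finally {
            // Runs even if body throws, so later tests never see the leaked value.
            if (previous == null) {
                System.clearProperty(key);
            } else {
                System.setProperty(key, previous);
            }
        }
    }

    public static void main(String[] args) {
        String key = "sketch.lifecycle.flag";
        runWithProperty(key, "true", () ->
                System.out.println("inside: " + System.getProperty(key)));
        System.out.println("after: " + System.getProperty(key));
    }
}
```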
Due to the batched nature of tombstone garbage-collection it's possible
that only a small portion of the tombstones in this test will be removed.
The test has been changed to stop expecting that all tombstones will
be removed.
…use they were not run in the original script. Have the tests not depend on the system variables defined in the gradle script.
…s that are governed by other licenses. fix the rat.gradle to add these files in the correct license section. Remove css/js that are not used by pulse.
…git work directory. Add the pulseversion.properties into rat ignore since this is created at build time.
@jdeppe-pivotal
Contributor

It looks like the integration tests are still failing. We should either fix those or mark them as Ignore. If the selenium tests (which require a local browser) can be made to work, then consider introducing a new test category; something like UITest or ManualTest.

@asfgit asfgit closed this in 0236b7e Jan 21, 2016
@jinmeiliao jinmeiliao deleted the feature/GEODE-12 branch January 22, 2016 17:40
pdxrunner pushed a commit to pdxrunner/geode that referenced this pull request Jan 25, 2016
bschuchardt pushed a commit to bschuchardt/geode that referenced this pull request Jul 10, 2020
update the readme with instructions for profiling