jewel: MDS goes damaged on blacklist (failed to read JournalPointer: -108 ((108) Cannot send after transport endpoint shutdown) #11413

Merged
2 commits merged into ceph:jewel-next on Nov 14, 2016

@ghost ghost commented Oct 11, 2016

@ghost ghost self-assigned this Oct 11, 2016
@ghost ghost added this to the jewel milestone Oct 11, 2016
@ghost ghost added bug-fix core cephfs Ceph File System and removed core labels Oct 11, 2016
ghost pushed a commit that referenced this pull request Oct 13, 2016
…ed to read JournalPointer: -108 ((108) Cannot send after transport endpoint shutdown)

Reviewed-by: Loic Dachary <ldachary@redhat.com>

ghost commented Oct 13, 2016

rc/test/cli/osdmaptool/print-nonexistent.t: passed
src/test/cli/osdmaptool/test-map-pgs.t: failed
--- /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/cli/osdmaptool/test-map-pgs.t
+++ ./src/test/cli/osdmaptool/test-map-pgs.t.err
@@ -40,6 +40,7 @@
 # it is almost impossible to get the same stats with random and crush
 # if they are, it most probably means something went wrong somewhere
   $ test "$STATS_CRUSH" != "$STATS_RANDOM"
+  [1]
 #
 # cleanup
 #
src/test/cli/osdmaptool/tree.t: passed

@ghost ghost changed the title jewel: MDS goes damaged on blacklist (failed to read JournalPointer: -108 ((108) Cannot send after transport endpoint shutdown) DNM: jewel: MDS goes damaged on blacklist (failed to read JournalPointer: -108 ((108) Cannot send after transport endpoint shutdown) Oct 13, 2016

jcsp commented Oct 26, 2016

@dachary that test failure looks really unlikely to be anything to do with this change

@jcsp jcsp changed the title DNM: jewel: MDS goes damaged on blacklist (failed to read JournalPointer: -108 ((108) Cannot send after transport endpoint shutdown) jewel: MDS goes damaged on blacklist (failed to read JournalPointer: -108 ((108) Cannot send after transport endpoint shutdown) Oct 26, 2016

ghost commented Oct 26, 2016

jenkins test this please

@ghost ghost changed the base branch from jewel to jewel-next November 9, 2016 09:58
John Spray added 2 commits November 9, 2016 12:14
The MDS is a client to the OSDs, and responds
to blacklists by respawning itself.  Usually
respawns of a daemonized process result in a PID
change, but it's not guaranteed, and it's definitely
not the case when someone runs in foreground (e.g.
teuthology).

Using a random nonce makes sure we won't match
against an existing blacklist entry from a failed
instance of an MDS daemon with the same name as us.

Related to: http://tracker.ceph.com/issues/17236
Signed-off-by: John Spray <john.spray@redhat.com>

(cherry picked from commit 5ba6128)

Conflicts:
	src/ceph_mds.cc : Messenger::create() prototype is different
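
To illustrate the reasoning in the commit message above, here is a minimal sketch of why a PID-derived nonce can collide with an old blacklist entry while a random one almost never will. It is a simplified model, not Ceph code: `Addr`, `pid_nonce`, `random_nonce` and `is_blacklisted` are hypothetical names, the address and port are made up, and the sketch assumes a blacklist entry matches only when ip, port and nonce are all equal and that the respawned daemon reuses the same ip and port.

```python
import os
import random
from typing import NamedTuple, Set


class Addr(NamedTuple):
    """Hypothetical stand-in for a messenger address (ip, port, nonce)."""
    ip: str
    port: int
    nonce: int


def pid_nonce() -> int:
    # Behaviour the commit message implies was used before: nonce taken from the PID.
    return os.getpid()


def random_nonce() -> int:
    # Behaviour after the change: a random 64-bit value, independent of the PID.
    return random.SystemRandom().getrandbits(64)


def is_blacklisted(addr: Addr, blacklist: Set[Addr]) -> bool:
    # Assumption: an entry matches only on an exact (ip, port, nonce) triple.
    return addr in blacklist


if __name__ == "__main__":
    # A failed instance was blacklisted under its PID-derived address.
    blacklist = {Addr("10.0.0.1", 6800, pid_nonce())}

    # Respawn in the foreground: same PID, so the same address is produced.
    same_pid = Addr("10.0.0.1", 6800, pid_nonce())
    print("pid nonce blacklisted:", is_blacklisted(same_pid, blacklist))

    # Respawn with a random nonce: a match is possible only by 64-bit collision.
    fresh = Addr("10.0.0.1", 6800, random_nonce())
    print("random nonce blacklisted:", is_blacklisted(fresh, blacklist))
```

In this model the foreground case is the one that breaks with a PID nonce, because the process keeps its PID across the respawn.
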
EBLACKLISTED was being incorrectly handled as an
indication of metadata damage.

Fixes: http://tracker.ceph.com/issues/17236
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit 19bb8c0)
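
The distinction this second commit draws can be sketched as a small decision function. This is an illustration under stated assumptions rather than the actual MDS code: `Action` and `on_journal_pointer_read` are hypothetical names, the value 108 is taken from the "-108" in this PR's title, and treating every other error as damage is a simplification for the example.

```python
from enum import Enum

# Value taken from the "-108" shown in this PR's title.
EBLACKLISTED = 108


class Action(Enum):
    PROCEED = "proceed"
    RESPAWN = "respawn"
    MARK_DAMAGED = "mark damaged"


def on_journal_pointer_read(r: int) -> Action:
    """Hypothetical helper: r is 0 on success or a negative errno."""
    if r == 0:
        return Action.PROCEED
    if r == -EBLACKLISTED:
        # Being blacklisted says nothing about the on-disk metadata,
        # so respawn instead of reporting damage.
        return Action.RESPAWN
    # Simplification: any other read error is treated as damage here.
    return Action.MARK_DAMAGED


if __name__ == "__main__":
    for code in (0, -EBLACKLISTED, -5):
        print(code, on_journal_pointer_read(code).value)
```

The point of the sketch is the ordering: the blacklist check comes before the generic error path, so a client-side rejection is never reported as metadata damage.
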

ghost commented Nov 9, 2016

rebased

ghost pushed a commit that referenced this pull request Nov 9, 2016
…ed to read JournalPointer: -108 ((108) Cannot send after transport endpoint shutdown)

Reviewed-by: Loic Dachary <ldachary@redhat.com>

ghost commented Nov 14, 2016

@jcsp does this backport look good to merge? It passed the fs suite http://tracker.ceph.com/issues/17851#note-5 (except for valgrind failures and one java test failure, which also fail on the jewel branch). Note that it will not be included in 10.2.4, which is why it targets jewel-next.

@ghost ghost assigned jcsp Nov 14, 2016
@ghost ghost merged commit 44f75dc into ceph:jewel-next Nov 14, 2016
This pull request was closed.
Labels
bug-fix cephfs Ceph File System