Problem: Occasional corrupt journal likely triggered by sleep/wake cycle #341
Comments
Which version of chronicle queue are you using?
On 20 Feb 2017 14:50, "Trevor Bernard" wrote:
I've been able to trigger this on OSX and Linux, usually after a few sleep/wake cycles.
SingleChronicleQueueBuilder(path).build()
08:55:16.130 [engine-tailer] DEBUG n.o.c.q.i.s.SingleChronicleQueueExcerpts$StoreTailer - moveToIndex: 433e 2
08:55:16.141 [engine-tailer] DEBUG n.o.c.q.i.s.SingleChronicleQueueExcerpts$StoreTailer - moveToIndex: 433e 2
08:55:16.151 [engine-tailer] DEBUG n.o.c.q.i.s.SingleChronicleQueueExcerpts$StoreTailer - moveToIndex: 433e 2
08:55:16.161 [engine-tailer] DEBUG n.o.c.q.i.s.SingleChronicleQueueExcerpts$StoreTailer - moveToIndex: 433e 2
08:55:16.172 [engine-tailer] DEBUG n.o.c.q.i.s.SingleChronicleQueueExcerpts$StoreTailer - moveToIndex: 433e 2
08:55:16.182 [engine-tailer] DEBUG n.o.c.q.i.s.SingleChronicleQueueExcerpts$StoreTailer - moveToIndex: 433e 2
08:55:16.192 [engine-tailer] DEBUG n.o.c.q.i.s.SingleChronicleQueueExcerpts$StoreTailer - moveToIndex: 433e 2
08:55:16.203 [engine-tailer] DEBUG n.o.c.q.i.s.SingleChronicleQueueExcerpts$StoreTailer - moveToIndex: 433e 2
I don't understand the root cause, but it happens far more often than just after a sleep/wake cycle. I'm trying to find a minimal test case.
@trevorbernard Hi, have you been able to identify a test case? Does this problem still happen with the latest queue version?
@dpisklov not a minimal test case, but it's a very reproducible one. A few awake/sleep cycles of my laptop and closing/re-opening the application/chronicle can put it in this state. I don't think it corrupts the journal so much as it just spams the logs. Writing to the journal/rolling over usually fixes the problem. This also happens in v4.5.27.
@trevorbernard are you able to provide a failing test case for this that you can commit via a pull request? I see that your thread name is [engine-tailer]; were you using Chronicle Engine? We have not been able to reproduce this issue, so we may have to close it, but if you can help us out with a test case, that would be very helpful. Thank you in advance.
@RobAustin I'm using the chronicle queue -- that's just the name of my thread. Unfortunately, I don't have a minimal test case. This is my tailer code. We're operating in an SPSC environment.

    (defn engine-tailer-thread
      [^String journal-path listener]
      (let [pauser (LongPauser. 1 100 500 10000 TimeUnit/MICROSECONDS)]
        (doto (Thread.
               #(with-open [lock (AffinityLock/acquireLock)
                            queue (.build (SingleChronicleQueueBuilder. journal-path))]
                  (let [tailer (.toEnd (.createTailer queue))]
                    (log/info "Starting Matching Engine Tailer...")
                    (while (not (Thread/interrupted))
                      (try
                        (if (p/process-engine-event tailer listener)
                          (.reset pauser)
                          (.pause pauser))
                        (catch Throwable t
                          (log/error t "Uncaught exception in tailer"))))
                    ;; Should we system exit here? Tailer should never shutdown
                    (log/info "Matching Engine Tailer has been shutdown..."))))
          (.setName "engine-tailer"))))

If I recreate this issue with a fresh dev chronicle, I'll submit that in lieu of a minimal test case.
Thanks for this pseudo code (I appreciate the time you have taken in writing this up); however, real Java code is preferable. Submitting a failing test case will make it more likely that this gets fixed.
@RobAustin it's not pseudo code but the actual Clojure code we use. When I have free cycles, I'll try to create a reproducible test case in Java.
@trevorbernard what do you mean by a corrupt journal? Our rolling mechanism writes an EOF marker to the end of a queue file when it needs to roll over to a new cycle (e.g. if you use RollCycles.HOURLY rolling, it rolls at the end of each hour).
Thanks :-)
This prints out every 10ms until something is written to the appender or the log rolls over.
I think it's fixed in 4.6.xx. Can you try with the latest version? (Unfortunately it's not on Maven Central, but you should be able to easily build it locally and use the jar.)
Can you point me to the commit please? I'll just cherry pick it on top of v4.5.27. I probably won't be using 4.6.xx until it's stabilized and on central.
It's not that easy; in 4.6 the file format is different.
4.6.xx is stable for general-purpose use. It is used by a number of our clients, so you can use it in production without a problem. It was decided some time ago that we publish recent versions to a private repo available to our clients, so if you want support for Chronicle Queue, you can contact sales@chronicle.software to get a tailored solution. At the moment you can just build from the latest tag to assess it for your app.
@trevorbernard BTW 4.6.55 is on Maven Central (although users are encouraged to use the BOM file; chronicle-bom-1.15.6 is the version you need, as it also specifies all the correct dependencies).
@dpisklov From what I gather it stands for bill of materials? How do I use it? So far
Add this in the dependencyManagement section of your pom file:
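A minimal sketch of such a BOM import, for illustration only. The thread names only the version (chronicle-bom-1.15.6); the Maven coordinates `net.openhft:chronicle-bom` and `net.openhft:chronicle-queue` used here are assumptions and should be checked against Maven Central.

```xml
<!-- Sketch only: groupId/artifactId values are assumed, not taken from this thread. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>net.openhft</groupId>
      <artifactId>chronicle-bom</artifactId>
      <version>1.15.6</version>
      <!-- type=pom with scope=import pulls in the BOM's managed versions -->
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>

<dependencies>
  <dependency>
    <groupId>net.openhft</groupId>
    <artifactId>chronicle-queue</artifactId>
    <!-- no <version> element: the version is resolved from the imported BOM -->
  </dependency>
</dependencies>
```

With an import like this, the BOM pins mutually compatible versions of the Chronicle libraries, so individual dependencies do not each need their own version.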
And then you can omit the version in your dependencies section, and whenever
PS: if it works in the latest version, do you mind closing this issue? Thanks
Can confirm, no longer seeing this issue with
@trevorbernard Great, thanks!