Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8229517: Support for optional asynchronous/buffered logging #3135

Closed
wants to merge 39 commits into from
Closed
Changes from 12 commits
Commits
Show all changes
39 commits
Select commit Hold shift + click to select a range
0697587
8229517: Support for optional asynchronous/buffered logging
Mar 22, 2021
a3945d0
8229517: Support for optional asynchronous/buffered logging
Mar 22, 2021
e23657f
8229517: Support for optional asynchronous/buffered logging
Mar 22, 2021
2871894
8229517: Support for optional asynchronous/buffered logging
Mar 22, 2021
760ec76
8229517: Support for optional asynchronous/buffered logging
Mar 28, 2021
bcefbec
8229517: Support for optional asynchronous/buffered logging
Mar 28, 2021
09c058d
8229517: Support for optional asynchronous/buffered logging
Apr 1, 2021
9211121
8229517: Support for optional asynchronous/buffered logging
Apr 9, 2021
81b2a0c
fix runtime/logging/RedefineClasses.java crashed with -XX:+AsyncLogging
Apr 9, 2021
54aadfa
Revert "fix runtime/logging/RedefineClasses.java crashed with -XX:+As…
Apr 12, 2021
f453d03
Inject the number of dropped messages since last dumpping.
Apr 15, 2021
3d74a32
Fix build issue with `--disable-precompiled-headers`
Apr 15, 2021
2bd104f
Support Jcmd pid VM.log disable
Apr 16, 2021
edb15c6
Remove LogAsyncInterval completely
Apr 20, 2021
11c2b7a
Resolve rank conflict between tty_lock and LogAsyncFlusher's _lock.
Apr 21, 2021
b2ad511
Fix a race condition bug on LogAsyncFlusher termination.
Apr 22, 2021
13c1c2b
Merge branch 'master' into JDK-8229517
Apr 22, 2021
4cf4a57
Refactor LogAsyncFlusher::abort()
Apr 23, 2021
1f06be3
Accurate Decorations for AsyncLogging.
Apr 23, 2021
e2d6f9e
Fix bugs/style/typo based on reviewers' feedbacks.
Apr 23, 2021
78d8f2c
Revert "Accurate Decorations for AsyncLogging."
Apr 29, 2021
e035ee3
Reimplement Accurate Decorations for AsyncLogging.
Apr 29, 2021
98ae33f
Implement the global option -Xlog:async per CSR.
May 3, 2021
1d5e540
Use LogTagSetMapping<LogTag::__NO_TAG>::tagset() instead of NULL poin…
May 5, 2021
c67d74e
Change option AsyncLogBufferEntries to AsyncLogBufferSize.
May 7, 2021
e3ee430
Merge branch 'master' into JDK-8229517
May 8, 2021
908d2a3
Update based on the feedbacks from reviwers.
May 11, 2021
7df8b1e
Fix build on Windows.
May 12, 2021
8f7b3b1
Refactor Initialize() and Termiante().
May 13, 2021
cade304
Replace Mutex with Semaphore.
May 14, 2021
e19e90a
refactor code and update per reviewers' feedback.
May 16, 2021
be83aaf
Implement the new discard policy: drop the incoming message.
May 17, 2021
348edfc
Update according to reviewer's feedback.
May 20, 2021
2381ba1
Add threadump support for AsyncLog Thread.
May 22, 2021
cad42d7
Update according to reviewers' feedback (Part-2).
May 22, 2021
9800a8c
flush() waits until all pending logging IO operations are done.
May 24, 2021
a25b8d6
Biased to the thread which is taking buffer lock.
May 24, 2021
2ffc33f
Remove AsyncLogWriter::flush() from os::shutdown().
May 25, 2021
b3d36cf
Correct typos and add a comment for a corner case.
May 27, 2021
File filter
Filter by extension
Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
@@ -727,6 +727,7 @@ bool os::create_thread(Thread* thread, ThreadType thr_type,
case os::vm_thread:
case os::pgc_thread:
case os::cgc_thread:
case os::asynclog_thread:
case os::watcher_thread:
if (VMThreadStackSize > 0) stack_size = (size_t)(VMThreadStackSize * K);
break;
@@ -0,0 +1,180 @@
/*
* Copyright Amazon.com Inc. or its affiliates. All Rights Reserved.
* DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
*
* This code is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License version 2 only, as
* published by the Free Software Foundation.
*
* This code is distributed in the hope that it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
* FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
* version 2 for more details (a copy is included in the LICENSE file that
* accompanied this code).
*
* You should have received a copy of the GNU General Public License version
* 2 along with this work; if not, write to the Free Software Foundation,
* Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
*
* Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
* or visit www.oracle.com if you need additional information or have any
* questions.
*
*/
#include "precompiled.hpp"
#include "jvm.h"
#include "logging/logAsyncFlusher.hpp"
#include "logging/logFileOutput.hpp"
#include "logging/logHandle.hpp"
#include "runtime/atomic.hpp"

void AsyncLogMessage::writeback() {
if (_message != NULL) {
// should cache this object somehow
LogDecorations decorations(_level, _tagset, _decorators);
_output.write_blocking(decorations, _message);
This conversation was marked as resolved by navyxliu

This comment has been minimized.

@tstuefe

tstuefe Mar 27, 2021
Member Outdated

Would this mean that time dependent decorators get resolved at print time, not when the log happen?

This comment has been minimized.

@navyxliu

navyxliu Mar 28, 2021
Author Member Outdated

yes, it is. I just realize that this does distort the timestamps of relevant decorators.

I explain to Volker why I choose to compress LogDecorations to LogDecorators. #3135 (comment)

This issue is similar to "safepoint-bias" in Java profiler. Those timestamps "bias" to the flusher, but the error is limited to AsyncLogInterval. We could spend extra efforts to keep the original timestamps. Is it worth it?

}
}

void LogAsyncFlusher::enqueue_impl(const AsyncLogMessage& msg) {

This comment has been minimized.

@tstuefe

tstuefe May 13, 2021
Member Outdated

Could you rename this to enqueue_locked? We use this name at other places for functions which factor out locking to the caller.

This comment has been minimized.

@navyxliu

navyxliu May 13, 2021
Author Member Outdated

okay.

assert_lock_strong(&_lock);

This comment has been minimized.

@tstuefe

tstuefe May 13, 2021
Member Outdated

Can we enqueue (so, log) from within the async logger itself? If not, add an assert here that this should not be called from the logger thread?

This comment has been minimized.

@navyxliu

navyxliu May 13, 2021
Author Member Outdated

enqueue_impl() is private. it's not accessible from outside.
only public member functions can be seen by clients.

This comment has been minimized.

@tstuefe

tstuefe May 14, 2021
Member Outdated

enqueue_impl() is private. it's not accessible from outside.
only public member functions can be seen by clients.

What I meant is, can logging - hence enqueing - be done from the async thread itself? Can it happen that the logger thread itself logs, would that be a problem, and if yes, should we guard against it?

This comment has been minimized.

@navyxliu

navyxliu May 21, 2021
Author Member Outdated

Actually, it could happen. This line is the very first 'log' from AsyncLog Thread.

  void pre_run() override {
    NonJavaThread::pre_run();
    log_debug(logging, thread)("starting AsyncLog Thread tid = " INTX_FORMAT, os::current_thread_id());
  }

This logsite is handled exactly same as other logsites. grab the lock, enqueue the cloned message and return -1.


if (_buffer.size() >= AsyncLogBufferSize) {
const AsyncLogMessage* h = _buffer.front();
assert(h != NULL, "sanity check");

if (h->message() != nullptr) {
bool p_created;
uintx* counter = _stats.add_if_absent(h->output(), 0, &p_created);
*counter = *counter + 1;

if (Verbose) {
// Temporarily turn off SerializeVMOutput so defaultStream will not
// invoke set_owner(self) for tty_lock.
FlagSetting t(SerializeVMOutput, false);
// The writing below can not guarantee non-blocking because tty may be piped by the filesystems
// or throttled by XOFF, so only dump the dropping message in Verbose mode.
tty->print_cr("asynclog dropping message: %s", h->message());
}
This conversation was marked as resolved by navyxliu

This comment has been minimized.

@tstuefe

tstuefe May 13, 2021
Member Outdated

I'm not sure this is needed or sensible. If we are flooded, we would then proceed to print every dropped message to tty? So now we flood tty?

I'd hope exhausting the output buffer is rare, and easily remedied by telling the user to increase the buffer or stop using async mode.

}

_buffer.pop_front();
}
assert(_buffer.size() < AsyncLogBufferSize, "_buffer is over-sized.");
_buffer.push_back(msg);

size_t sz = _buffer.size();
if (sz == (AsyncLogBufferSize >> 1) || sz == AsyncLogBufferSize) {
_lock.notify();
}
}

void LogAsyncFlusher::enqueue(LogFileOutput& output, const LogDecorations& decorations, const char* msg) {
AsyncLogMessage m(output, decorations, msg);

{ // critical area
MutexLocker ml(&_lock, Mutex::_no_safepoint_check_flag);
enqueue_impl(m);
}
}

// LogMessageBuffer consists of a multiple-part/multiple-line messsage.
// the mutex here gurantees its interity.
void LogAsyncFlusher::enqueue(LogFileOutput& output, LogMessageBuffer::Iterator msg_iterator) {
MutexLocker ml(&_lock, Mutex::_no_safepoint_check_flag);

for (; !msg_iterator.is_at_end(); msg_iterator++) {
AsyncLogMessage m(output, msg_iterator.decorations(), msg_iterator.message());
enqueue_impl(m);
}
}

LogAsyncFlusher::LogAsyncFlusher()
: _should_terminate(false),
_lock(Mutex::tty, "async-log-monitor", true /* allow_vm_block */, Mutex::_safepoint_check_never),
_stats(17 /*table_size*/) {
if (os::create_thread(this, os::asynclog_thread)) {
os::start_thread(this);
This conversation was marked as resolved by navyxliu

This comment has been minimized.

@tstuefe

tstuefe May 13, 2021
Member Outdated

Here is some error handling needed if thread creation failed. UL should drop to sync mode then.

This comment has been minimized.

@navyxliu

navyxliu May 13, 2021
Author Member Outdated

make sense! I will take a look at the error handling.

}
}

bool AsyncLogMapIterator::do_entry(LogFileOutput* output, uintx* counter) {
LogDecorators decorators = output->decorators();
decorators.without(LogDecorators::tags_decorator);
LogDecorations decorations(LogLevel::Warning, decorators);
const int sz = 128;
char out_of_band[sz];

if (*counter > 0) {
jio_snprintf(out_of_band, sz, UINTX_FORMAT " messages dropped...", *counter);
output->write_blocking(decorations, out_of_band);
*counter = 0;
}

return true;
}

void LogAsyncFlusher::flush() {
LinkedListImpl<AsyncLogMessage, ResourceObj::C_HEAP, mtLogging> logs;

{ // critical area
MutexLocker ml(&_lock, Mutex::_no_safepoint_check_flag);
_buffer.pop_all(&logs);

AsyncLogMapIterator iter;
_stats.iterate(&iter);
}

LinkedListIterator<AsyncLogMessage> it(logs.head());
while (!it.is_empty()) {
AsyncLogMessage* e = it.next();
e->writeback();
}
}

void LogAsyncFlusher::run() {
while (!_should_terminate) {
{
MonitorLocker m(&_lock, Mutex::_no_safepoint_check_flag);
m.wait(LogAsyncInterval);
}
flush();
}
}

LogAsyncFlusher* LogAsyncFlusher::_instance = nullptr;

void LogAsyncFlusher::initialize() {
This conversation was marked as resolved by navyxliu

This comment has been minimized.

@simonis

simonis Mar 26, 2021
Member Outdated

I don't think you need this. Please merge the initialization into LogAsyncFlusher::instance() (see my comments on your changes in init.cpp).

This comment has been minimized.

@tstuefe

tstuefe May 7, 2021
Member Outdated

In contrast to Volker I think an explicit initialization is fine and follows the established pattern for other subsystems. However, I would:

  • assert for _instance == NULL and remove the condition. This should only be called once.
  • no need for Atomics here since we do this during VM init.

And while writing this, I realize this would conflict with the gtest fixture where you switch this on or off several time in one process. I propose you switch those gtests - only those testing the LogAsyncFlusher, not those testing the Deque etc - as TEST_OTHER_VM.

Which arguably we should do for all logging gtests, not only yours, if they fiddle with global VM settings. In other words, it should be possible to run the gtests with -Xlog:... options and the gtests should still pass, including all log tests. Today, this is not the case.

if (!_instance) {
_instance = new LogAsyncFlusher();
}
}

// Termination
// 1. issue an atomic store-&-fence to close the logging window.
// 2. flush itself in-place
// 3. signal the flusher thread to exit
// 4. (optional) deletes this in post_run()
void LogAsyncFlusher::terminate() {
if (_instance != NULL) {
LogAsyncFlusher* self = _instance;

// make sure no new log entry will be enqueued after.
Atomic::release_store_fence<LogAsyncFlusher*, LogAsyncFlusher*>(&_instance, nullptr);
self->flush();
{
MonitorLocker m(&self->_lock, Mutex::_no_safepoint_check_flag);
self->_should_terminate = true;
m.notify();
}
}
}

LogAsyncFlusher* LogAsyncFlusher::instance() {
if (Thread::current_or_null() != nullptr) {
This conversation was marked as resolved by navyxliu

This comment has been minimized.

@tstuefe

tstuefe May 7, 2021
Member Outdated

I don't understand why this depends on the existence of a current thread? Ideally we should be able to log just fine in corner cases, e.g. when thread is detached or if we have an error (during signal handling e..g.)

This comment has been minimized.

@navyxliu

navyxliu May 10, 2021
Author Member Outdated

LogAsyncFlusher::enqueue needs Thread::current() to use _lock.

void Mutex::lock_without_safepoint_check() {
  lock_without_safepoint_check(Thread::current());
}

This check is for a very rare corner case. This can be trigger using the following cmdline.
java -Xlog:'thread+smr=debug:file=/tmp/t.log' -Xlog:async -version
the root cause is the following code in thread

void ThreadsSMRSupport::smr_delete(JavaThread *thread) {
  elapsedTimer timer;
  if (EnableThreadSMRStatistics) {
    timer.start();
  }

  wait_until_not_protected(thread);

  delete thread; // Thread::current() will trigger assert(current != __null) failed: Thread::current() called on detached thread after ~Thread()
  if (EnableThreadSMRStatistics) {
    timer.stop();
    uint millis = (uint)timer.milliseconds();
    ThreadsSMRSupport::inc_deleted_thread_cnt();
    ThreadsSMRSupport::add_deleted_thread_times(millis);
    ThreadsSMRSupport::update_deleted_thread_time_max(millis);
  }
  // logsite will crash  in async mode because we can't use Thread::current() here. 
  log_debug(thread, smr)("tid=" UINTX_FORMAT ": ThreadsSMRSupport::smr_delete: thread=" INTPTR_FORMAT " is deleted.", os::current_thread_id(), p2i(thread));
}

This comment has been minimized.

@tstuefe

tstuefe May 13, 2021
Member Outdated

Sigh. I get it. Another example that a good logging system should be independent from the infrastructure of the surrounding system. Any chance this patch could use raw locking primitives, pthread_mutex resp. CreateMutex, without having to rewrite everything?

Ironically I though async logging would reduce the number of log call sites where the conditions for logging are not met, by moving most of the stack out of the caller into a separate thread. But the Thread::current condition is a new limitation then compared to sync logging.

If you don't want to change the Mutex:

So you have some rare log sites which won't allow you to use asynchronous logging since you cannot use locks, and you solve this by returning NULL, which then falls back to synchronous logging? What happens if you mix synchronous and asynchronous logging like this? Does this work?

From a code point of view your reasoning is very difficult to deduce. Can you please change this: remove the THread::current condition from the singleton getter, and add an explicit check for Thread::current at its caller site. In async mode either just omit printing any output needing Thread::current, or relegate to synchronous printing. And add a good comment there.

This comment has been minimized.

@navyxliu

navyxliu May 13, 2021
Author Member Outdated

Thank you for reviewing. you raise the exact problem which keep me from sleeping...

From a code point of view your reasoning is very difficult to deduce. Can you please change this: remove the THread::current condition from the singleton getter, and add an explicit check for Thread::current at its caller site. In async mode either just omit printing any output needing Thread::current, or relegate to synchronous printing. And add a good comment there.

That's exactly the current approach. It works but async logging and synchronous logging may overlap in some rare windows such as my example of smr_delete.

LogAsyncFlusher* LogAsyncFlusher::instance() {
  // thread may has been detached, then Thread::current() used by Mutex is inavailable.
  // eg. ThreadsSMRSupport::smr_delete() uses log_debug() after delete thread.
  if (Thread::current_or_null() != nullptr) {
    return _instance;
  } else {
    return nullptr;
  }
}

One drawback is that I have to give up the FileLocker optimization for the entire interval async logging(99.99% time)
for those "rare" windows :( Once I remove FileLocker for FILE*, it may get stuck when two logging writers overlap!

I am evaluating different approaches. eg. the 2-synchronization-points design or even replace Mutex with a semaphore-based lock like ConfigurationLock

return _instance;
} else {
// current thread may has been detached.
return nullptr;
}
}
@@ -0,0 +1,157 @@
/*
* Copyright Amazon.com Inc. or its affiliates. All Rights Reserved.
* DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
*
* This code is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License version 2 only, as
* published by the Free Software Foundation.
*
* This code is distributed in the hope that it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
* FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
* version 2 for more details (a copy is included in the LICENSE file that
* accompanied this code).
*
* You should have received a copy of the GNU General Public License version
* 2 along with this work; if not, write to the Free Software Foundation,
* Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
*
* Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
* or visit www.oracle.com if you need additional information or have any
* questions.
*
*/
#ifndef SHARE_LOGGING_ASYNC_FLUSHER_HPP
#define SHARE_LOGGING_ASYNC_FLUSHER_HPP
#include "logging/logDecorations.hpp"
#include "logging/logFileOutput.hpp"
#include "logging/logMessageBuffer.hpp"
#include "memory/resourceArea.hpp"
#include "runtime/mutexLocker.hpp"
#include "runtime/nonJavaThread.hpp"
#include "utilities/hashtable.hpp"
#include "utilities/linkedlist.hpp"

template <typename E>
class LinkedListDeque : private LinkedListImpl<E, ResourceObj::C_HEAP, mtLogging> {
This conversation was marked as resolved by navyxliu

This comment has been minimized.

@simonis

simonis Mar 26, 2021
Member Outdated

The name LinkedListDeque implies that this is a general purpose Deque implementation which is not true. It's fine to have an implementation for your very specific needs (otherwise it should probably be in its own file under share/utilities/dequeue.hpp). But to make this explicitly clear to the reader, can you please rename it to something like AsyncFlusherDeque and specify it's semantics in a comment on top of the class. E.g this class doesn't support the usage of the inherited add() method because that would break the size() functionality.

This comment has been minimized.

@navyxliu

navyxliu Apr 23, 2021
Author Member Outdated

I have changed the class declaration to
template <typename E, MEMFLAGS F> class LinkedListDeque

Per your request, I have moved all logging logics out of it, so it can be used as a generic LinkedListDeque. I think it's fair enough to keep its name. In logging logics, I use a typedef version here:
typedef LinkedListDeque<AsyncLogMessage, mtLogging> AsyncLogBuffer;

private:
LinkedListNode<E>* _tail;
size_t _size;

public:
LinkedListDeque() : _tail(NULL), _size(0) {}
void push_back(const E& e) {
if (!_tail)
This conversation was marked as resolved by navyxliu

This comment has been minimized.

@simonis

simonis Mar 26, 2021
Member Outdated

I think the convention is to use curly braces even for one-line blocks (as you've done in your other code).

_tail = this->add(e);
else
_tail = this->insert_after(e, _tail);

++_size;
}

void pop_all(LinkedList<E>* logs) {

This comment has been minimized.

@tstuefe

tstuefe May 6, 2021
Member Outdated

If you do your own Deque as proposed above, I would change this to a simple "take_over" operation where you hand over _head and _tail to the new list and NULL them out in the old one. Way easier to read. You do not need the ability of "move()" to append to a non-empty receiver list.

logs->move(static_cast<LinkedList<E>* >(this));
_tail = NULL;
_size = 0;
}

void pop_front() {
LinkedListNode<E>* h = this->unlink_head();
if (h == _tail) {
_tail = NULL;
}

if (h != NULL) {
--_size;
this->delete_node(h);
}
}

size_t size() const { return _size; }

const E* front() const {
return this->_head == NULL ? NULL : this->_head->peek();
}

const E* back() const {
return _tail == NULL ? NULL : _tail->peek();
}
};

class AsyncLogMessage {
LogFileOutput& _output;
mutable char* _message;
LogDecorators _decorators;
LogLevelType _level;
const LogTagSet& _tagset;

public:
AsyncLogMessage(LogFileOutput& output, const LogDecorations& decorations, const char* msg)
: _output(output), _decorators(output.decorators()),
_level(decorations.get_level()), _tagset(decorations.get_logTagSet()) {
// allow to fail here, then _message is NULL
This conversation was marked as resolved by navyxliu

This comment has been minimized.

@simonis

simonis Mar 26, 2021
Member Outdated

Why do you extract and store the LogDecorators, _level and _tagset set separately and re-create the LogDecorations in AsyncLogMessage::writeback()? Is it to save memory (because LogDecorators are much smaller than the LogDecorations) at the expense of time for recreating?

This comment has been minimized.

@navyxliu

navyxliu Mar 26, 2021
Author Member Outdated

Saving memory is my intension. To keep thing simple, I used to copy objects directly. Then I found LogDecorations consists of a char array(256bytes) . This information can be compressed into an uint mask, which is LogDecorators.

Because AsyncLogMessage is the payload, which hotspot stores thousands of them in the buffer, size matters.

_message = os::strdup(msg, mtLogging);
This conversation was marked as resolved by navyxliu

This comment has been minimized.

@simonis

simonis Mar 26, 2021
Member Outdated

If you think msg can't be NULL here please add an assertion, otherwise please handle it.

This comment has been minimized.

@navyxliu

navyxliu Apr 23, 2021
Author Member Outdated

I deliberately design this code snippet tolerate error.

os::strdup uses malloc. The default behavior is returning NULL when it fails. it's fine that _message is NULL. the payload is "null" and nothing will be emitted eventually.

If I handle error(lacking memory) here, the it may then recursively generates more logs.

}

~AsyncLogMessage() {
if (_message != NULL) {
os::free(_message);
_message = NULL;
}
}

AsyncLogMessage(const AsyncLogMessage& o)
:_output(o._output), _decorators(o._decorators), _level(o._level), _tagset(o._tagset) {
_message = o._message;
o._message = NULL; // transfer the ownership of _message to this
}

This conversation was marked as resolved by navyxliu

This comment has been minimized.

@simonis

simonis Mar 26, 2021
Member Outdated

Maybe add an explicit copy assignment operator with a ShouldNotReachHere to make sure AsyncLogMessages are not assigned unintentionally.

This comment has been minimized.

@navyxliu

navyxliu Apr 23, 2021
Author Member Outdated

I do use copy constructor here. The container typedef LinkedListDeque<AsyncLogMessage, mtLogging> AsyncLogBuffer; stores objects instead of raw pointers on purpose. I create objects and push it to the container in equeue().

It's always easier to manage memory with copying byval. There's no smart_pointer in hotspot, I had to explicitly manage memory if I would store raw pointers. That makes code error-prone and messy.

void writeback();

bool equals(const AsyncLogMessage& o) const {
return (&_output == &o._output) && (_message == o._message || !strcmp(_message, o._message));
}
This conversation was marked as resolved by navyxliu

This comment has been minimized.

@simonis

simonis Mar 26, 2021
Member Outdated

strcmp() is not defined for NULL but you can have _message == NULL if you've transferred ownership in the copy constructor.

This comment has been minimized.

@navyxliu

navyxliu Mar 30, 2021
Author Member Outdated

yes, this is subtle bug! thanks!
I thought that if _message is NULL, then o._message must be NULL, then it will be true for _message == o._message. I was wrong.

This comment has been minimized.

@navyxliu

navyxliu Apr 23, 2021
Author Member Outdated

fixed.


const char* message() const { return _message; }
LogFileOutput* output() const { return &_output; }
};

typedef LinkedListDeque<AsyncLogMessage> AsyncLogBuffer;
typedef KVHashtable<LogFileOutput*, uintx, mtLogging> AsyncLogMap;

This comment has been minimized.

@tstuefe

tstuefe May 6, 2021
Member Outdated

This is just to keep statistics of dropped messages per output, right?

I think it would be a lot simpler just to add a reset-able counter to LogFileOutput directly. I don't even think it has to be uintx. 32bit should be sufficient (also, uintx makes no sense - do we somehow expect less message drops on 32bit platforms?)

This comment has been minimized.

@navyxliu

navyxliu May 7, 2021
Author Member Outdated

Yes, it is for the statistics of dropped messages.

AsyncLogMap looks a little cumbersome indeed, but the good side is that I isolate async logging things in logAsyncFlusher.hpp/cpp. I manage not to pollute shared unified logging code.

Another reason is that I can deliver the zero-cost promise when async logging is off. If I move the counter to LogFileOutput, it will increase those object size a little bit. In current implementation, the singleton object LogAsyncFlusher is not created if -Xlog:async is absent.

Size wise, I think it's okay to to have 8 byte. if a network-based device disappeared, who know how long asynclog thread would block. bigger counter is better. Cost is almost same because the number of logFileOutput objects is usually less than 4.

This comment has been minimized.

@tstuefe

tstuefe May 7, 2021
Member Outdated

I respect your "no costs promise" concern, but really this is no problem. We create exactly one LogFileOutput object per file sink; so usually we have one, maybe two of those if async=true. I think the costs of those counters would be bearable :)

I understand the concern of not polluting the UL code base. But here I think it does not apply, since you bring your patch upstream anyway, so async logging is now part of the UL code base.

wrt to counter size, okay lets use 64bit. But lets really use 64bit, also in 32bit VMs (so, use uint64_t instead of uintx).

This comment has been minimized.

@navyxliu

navyxliu May 12, 2021
Author Member Outdated

Hi, @tstuefe ,
I agree with you that uintx is not a good type. uint32_t is big enough.

I found it's very tricky to move those counters to LogFileOutput objects. after than, I still need HashTable as a set to traverse all LogFileObject objects.

Alternatively, there's a member variable _outputs in LogConfiguration, which refers to all LogOutput objects. I have to obtain ConfigurationLock first before reading it because jcmd could modify outputs at any moment. Grabbing ConfigurationLock in LogAsyncFlusher::flush may lead to a dead lock with _lock. This part of code is really subtle. I tried different approaches. no one is more straight-forward than a hashmap.

struct AsyncLogMapIterator {
bool do_entry(LogFileOutput* output, uintx* counter);
};

// Flusher is a NonJavaThread which manages a FIFO capacity-bound buffer.
class LogAsyncFlusher : public NonJavaThread {
private:
This conversation was marked as resolved by navyxliu

This comment has been minimized.

@simonis

simonis Mar 26, 2021
Member Outdated

As far as I know, PeriodicTask is designed for short running task. But LogAsyncFlusher::task() will now call AsyncLogMessage::writeback() which does blocking I/O and can block for quite some time (that's why we have this change in the first place :). How does this affect the other periodic tasks and the WatcherThread. What's the worst case scenario if the WatcherThread is blocked? Is this any better than before?

This comment has been minimized.

@navyxliu

navyxliu Mar 30, 2021
Author Member Outdated

ack. other reviewers also raise this question.
I propose a dedicated nonjavathread to flush log.

static LogAsyncFlusher* _instance;

volatile bool _should_terminate;
// The semantics of _lock is like JVM monitor.
// This thread sleeps and only wakes up by the monitor if any of events happen.
// 1. buffer is half-full
// 2. buffer is full
// 3. timeout defined by LogAsyncInterval
//
// It also roles as a lock to consolidate buffer's MT-safety.
Monitor _lock;
AsyncLogMap _stats; // statistics of dropping messages.
AsyncLogBuffer _buffer;

LogAsyncFlusher();
void enqueue_impl(const AsyncLogMessage& msg);
void run() override;
char* name() const override { return (char*)"AsyncLog Thread"; }
public:
void enqueue(LogFileOutput& output, const LogDecorations& decorations, const char* msg);
void enqueue(LogFileOutput& output, LogMessageBuffer::Iterator msg_iterator);
void flush();

static LogAsyncFlusher* instance();
// None of following functions are thread-safe.
static void initialize();
static void terminate();
};

#endif // SHARE_LOGGING_ASYNC_FLUSHER_HPP
@@ -1,5 +1,5 @@
/*
* Copyright (c) 2015, 2020, Oracle and/or its affiliates. All rights reserved.
* Copyright (c) 2015, 2021, Oracle and/or its affiliates. All rights reserved.
* DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
*
* This code is free software; you can redistribute it and/or modify it
@@ -541,6 +541,7 @@ void LogConfiguration::print_command_line_help(outputStream* out) {
out->print_cr(" filecount=.. - Number of files to keep in rotation (not counting the active file)."
" If set to 0, log rotation is disabled."
" This will cause existing log files to be overwritten.");
out->print_cr(" async=true|false - write asynchronously or not.");
This conversation was marked as resolved by navyxliu

This comment has been minimized.

@simonis

simonis Mar 26, 2021
Member Outdated

Mention the default here which should be "false".

This comment has been minimized.

@navyxliu

navyxliu Apr 23, 2021
Author Member Outdated

fixed.

out->cr();

out->print_cr("Some examples:");
@@ -1,5 +1,5 @@
/*
* Copyright (c) 2015, 2020, Oracle and/or its affiliates. All rights reserved.
* Copyright (c) 2015, 2021, Oracle and/or its affiliates. All rights reserved.
* DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
*
* This code is free software; you can redistribute it and/or modify it
@@ -33,7 +33,13 @@
const char* volatile LogDecorations::_host_name = NULL;

LogDecorations::LogDecorations(LogLevelType level, const LogTagSet &tagset, const LogDecorators &decorators)
: _level(level), _tagset(tagset) {
: _level(level), _tagset(&tagset) {
create_decorations(decorators);
}

LogDecorations::LogDecorations(LogLevelType level, const LogDecorators &decorators)
: _level(level), _tagset(nullptr) {
assert(!decorators.is_decorator(LogDecorators::tags_decorator), "_tagset can't be NULL");
create_decorations(decorators);
}

@@ -131,7 +137,11 @@ char* LogDecorations::create_level_decoration(char* pos) {
}

char* LogDecorations::create_tags_decoration(char* pos) {
int written = _tagset.label(pos, DecorationsBufferSize - (pos - _decorations_buffer));
int written = 0;

if (_tagset != nullptr) {
_tagset->label(pos, DecorationsBufferSize - (pos - _decorations_buffer));
}
ASSERT_AND_RETURN(written, pos)
}