
ARTEMIS-2676 PageCursorProviderImpl::cleanup can save decoding pages without large messages #3044

Closed

Conversation

@franz1981 (Contributor) commented Mar 23, 2020

PageCursorProviderImpl::cleanup calls PageCursorProviderImpl::finishCleanup, which fully reads pages (when they are not present in the PageCache) just to trigger large-message deletes.
The decoding phase could be skipped, and possibly the page read could be avoided as well.
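For context, a minimal sketch of the idea (class and method names are hypothetical, not the actual PageCursorProviderImpl code): peek the large-message flag in the encoded record and only decode records that actually carry a large message. It assumes the record starts with a long transactionID followed by a boolean large-message flag, as the PagedMessageImpl::encode snippet quoted later in this thread shows for Core messages; as also discussed later, that flag does not currently cover AMQP large messages.

   // Illustrative sketch only: CleanupSketch and its methods are hypothetical;
   // only ActiveMQBuffer and the record layout (long transactionID, then a
   // boolean flag) come from the snippets in this thread.
   import java.util.List;
   import java.util.function.Consumer;
   import java.util.function.Function;

   import org.apache.activemq.artemis.api.core.ActiveMQBuffer;

   final class CleanupSketch {

      private CleanupSketch() {
      }

      // Cheap header peek: reads a single byte instead of decoding the whole message.
      static boolean peekIsLargeMessage(ActiveMQBuffer encodedRecord) {
         // skip the 8-byte transactionID written before the boolean large-message flag
         return encodedRecord.getByte(encodedRecord.readerIndex() + Long.BYTES) != 0;
      }

      // Only records flagged as large messages get fully decoded and handed to the
      // delete callback; plain messages are skipped entirely.
      static <M> void cleanupLargeMessages(List<ActiveMQBuffer> encodedRecords,
                                           Function<ActiveMQBuffer, M> decoder,
                                           Consumer<M> largeMessageDelete) {
         for (ActiveMQBuffer encodedRecord : encodedRecords) {
            if (peekIsLargeMessage(encodedRecord)) {
               largeMessageDelete.accept(decoder.apply(encodedRecord));
            }
         }
      }
   }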

@franz1981 (Contributor Author)

I'm planning to add another commit to try to avoid reading the Page entirely, or I could just send another PR for that.

@wy96f @clebertsuconic it seems that I cannot assign reviewers, so I'm calling you out directly ;)

@clebertsuconic (Contributor)

Could you please wait for my PR to be merged first? I'm changing these methods as well, and I want to keep a version closer to what we have now, since I'm cherry-picking to a production branch based on a previous version.

@franz1981 (Contributor Author)

Sure bud, there is no hurry, and this change should be simple enough to rebase against yours when you are done ;)

@franz1981 (Contributor Author) commented Mar 24, 2020

I've run the CI tests and this change seems to work fine.
Performance- and stability-wise it is a nice improvement over the original version, although not perfect: ideally we would skip reading the Page at all, but we now save a lot of CPU time and heap space by avoiding useless message decoding 👍

@franz1981 (Contributor Author) commented Mar 24, 2020

@clebertsuconic
I need some help understanding the impact of the change I've made on Page::delete:

  • before: it was calling Message::usageDown on each message
  • now: it calls Message::usageDown only for large messages

I see that ddd8ed4 introduced usageDown being triggered on any message, while previously decrementDelayDeletionCount was called only on large messages. Was that correct?
Is it OK if I call it just on large messages?
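For reference, a sketch of the before/after behaviour described in the bullets above (hypothetical names, not the actual Page::delete code; only the Message::usageDown call and the Core large-message check quoted later in the thread are taken from the discussion):

   import java.util.List;

   import org.apache.activemq.artemis.api.core.ICoreMessage;
   import org.apache.activemq.artemis.api.core.Message;

   // Hypothetical illustration of the change on Page::delete: usageDown is now
   // invoked only for large messages instead of unconditionally.
   final class DeleteSketch {

      private DeleteSketch() {
      }

      static void releaseOnDelete(List<Message> messages) {
         for (Message message : messages) {
            // before this PR: message.usageDown() was called for every message here
            if (isLargeMessage(message)) {
               message.usageDown();   // only large messages need the usage decrement
            }
         }
      }

      // placeholder check; whether this covers non-Core (AMQP) large messages is
      // exactly the question debated later in this thread
      static boolean isLargeMessage(Message message) {
         return message instanceof ICoreMessage && ((ICoreMessage) message).isLargeMessage();
      }
   }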

@franz1981 (Contributor Author) commented Mar 24, 2020

@clebertsuconic has confirmed that Message::usageDown is meant to be used just on large messages and has no meaningful effect on non-large messages, which means I can leave the logic as implemented in this PR.
I will add some documentation around Message::usageUp/usageDown to state that explicitly and avoid future misuse.

@clebertsuconic (Contributor)

@franz1981 there's a conflict with master. Can you rebase it, please?

@franz1981 (Contributor Author)

@clebertsuconic yep, there have been some new changes around that part recently, so tomorrow I will take care of the rebasing ;)

@clebertsuconic (Contributor) commented Mar 31, 2020

I could fix the rebasing; however, your implementation is not valid.

Up until now, with CoreMessages we used a special record type on paging, used only for Large Messages.

With AMQP, I didn't use that record at all. Instead I used the Persister from the AMQP protocol. So you would have to check the Persister type, at the proper record position.

Besides, the following implementation is not self-documenting... if things change, this will likely introduce bugs (lack of encapsulation of this rule):

   public static boolean isLargeMessage(ActiveMQBuffer buffer) {
      // skip transactionID
      final boolean isLargeMessage = buffer.getByte(buffer.readerIndex() + Long.BYTES) != 0;
      return isLargeMessage;
   }

You would need to encapsulate this through PagePosition for Core records, or check the Persister, and do the proper encapsulation if you really want to implement such a feature.
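One possible shape of that encapsulation, as a sketch only (PagedMessageRecord is a hypothetical name; the layout knowledge is the same one-byte peek from the snippet above): keep the "long transactionID, then boolean flag" rule inside a single paging type instead of a free-floating check in the cursor provider.

   import org.apache.activemq.artemis.api.core.ActiveMQBuffer;

   // Hypothetical sketch: the same peek as above, but owned by a dedicated type so
   // the Core record layout is encoded in exactly one place and is easy to find.
   final class PagedMessageRecord {

      private final ActiveMQBuffer encoded;

      PagedMessageRecord(ActiveMQBuffer encoded) {
         this.encoded = encoded;
      }

      // layout knowledge lives here: a long transactionID precedes the boolean flag
      boolean isLargeMessage() {
         return encoded.getByte(encoded.readerIndex() + Long.BYTES) != 0;
      }
   }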

@clebertsuconic (Contributor)

Look at the AMQPLargeMessagePersister; you would need to check that record... but please make sure you have the proper operations on the OO model. It's not so much about OO here as about organizing the code in such a way that we can find the operation when we need it.

I think it's valid to check if the record is for Paging, but you would have to do it in an organized way.

@franz1981 (Contributor Author) commented Apr 1, 2020

@clebertsuconic I see this check on PagedMessageImpl:

   public boolean isLargeMessage() {
      return message instanceof ICoreMessage && ((ICoreMessage)message).isLargeMessage();
   }

and

   @Override
   public void encode(final ActiveMQBuffer buffer) {
      buffer.writeLong(transactionID);

      boolean isLargeMessage = isLargeMessage();

      buffer.writeBoolean(isLargeMessage);

Why can't we fix check/encode/decode to correctly write the isLargeMessage prefix field in an OO way, instead of doing it ad hoc only when the message is Core, as it is now?
The only thing to take care of is warning users not to upgrade the broker while there are paged large messages (with AMQP).
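For illustration, a sketch of the encode-side direction being asked about here. It assumes the protocol-agnostic Message API exposes (or would be given) an isLargeMessage() method that AMQP large messages also answer; that assumption and the class name below are hypothetical, not necessarily what the PR ends up doing.

   import org.apache.activemq.artemis.api.core.ActiveMQBuffer;
   import org.apache.activemq.artemis.api.core.Message;

   // Hypothetical sketch of writing the flag in an OO way, so an AMQP large
   // message is marked the same way as a Core one and the cleanup peek stays
   // valid. Message.isLargeMessage() across protocols is an assumption here.
   final class PagedMessageEncodeSketch {

      private PagedMessageEncodeSketch() {
      }

      static void encode(ActiveMQBuffer buffer, long transactionID, Message message) {
         buffer.writeLong(transactionID);
         // flag derived from the message itself instead of an ICoreMessage instanceof
         buffer.writeBoolean(message.isLargeMessage());
         // ... the rest of the PagedMessage encoding would follow here
      }
   }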

I've added a new commit that tries to minimize the changes and takes this route. I will add some tests to validate the new behaviour: please let me know how it looks 👍

@franz1981 (Contributor Author)

@clebertsuconic I've tried to come up with a test covering both AMQP large and standard messages: let me know if the way I create and write AMQPLargeMessage is wrong.
While working on it I found a test failure in the new test AmqpPageTest::testLargeMessagePageWithNIO while checking Assert.assertEquals(largeMessages ? 1 : 0, pagedMessage.getMessage().getUsage());
I have pushed a change trying to fix this, but maybe the assert is just wrong for AMQP and only valid for Core large messages.

@franz1981 (Contributor Author) commented Apr 1, 2020

I've pushed the change on the usage count to #3055, plus some of the test enhancements I have here, so please don't merge this yet. I will rebase over master once #3055 is merged.

@franz1981 force-pushed the ARTEMIS-2676 branch 2 times, most recently from 4a08a38 to 996bcdb on April 2, 2020 07:37
   final PagedMessageImpl msg = new PagedMessageImpl(encodedSize, storageManager);
   msg.decode(fileBufferWrapper);
   assert fileBuffer.get(endPosition) == Page.END_BYTE : "decoding cannot change end byte";
   msg.initMessage(storage);
@franz1981 (Contributor Author) commented on the lines above:

@clebertsuconic I see that sometimes we use storage and sometimes the storage manager member stored on the Page... any idea which one would be the correct one to use?

@franz1981 (Contributor Author)

I've rebased against master: the logic should be correct, but I would like to add some unit tests around the introduced logic too :)
Let me know if the logic seems sound.

@franz1981 (Contributor Author)

I see this as a fairly major performance improvement on paging, but I still want to add some more tests on the binary compatibility of PagedMessage decoding, which I see we currently don't cover in tests.

@franz1981 (Contributor Author)

I will try to add the missing tests in the next few days.

@clebertsuconic (Contributor)

I was thinking about it... I need this as part of this release.

The change you made here wouldn't break compatibility if released now... but it would break if released later.

Up till now we didn't have any other type of large message, so if we release now the byte(2) on PagedMessages makes a lot of sense.

If we released later, we would have to deal with previous versions where you may have an AMQPLargeMessage with byte(1), while this PR expects byte(2).

So, I am merging this now.

Besides, I have done a full pass on this PR as well.

@asfgit closed this in ceceb66 on Apr 16, 2020.