ARTEMIS-2513 Large message's copy may be interfered by other threads #2859

Closed

gaohoward
Contributor

LargeMessageImpl.copy(long) needs to open the underlying
file in order to read and copy bytes into the new copied message.
However, another thread can come in and close
the file in the middle of the copy, making it fail
with a "channel is null" error.

This happens when a large message is sent to a JMS
topic (multicast address). While it is being delivered to multiple
subscribers, one consumer finishes delivery and closes the
underlying file afterwards, while another consumer rolls back
the message, which is eventually moved to the DLQ (which calls
the copy method above). So there is a chance of hitting this bug.
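
To make the race concrete, here is a minimal sketch of one way to avoid it: do the copy through a private file handle that no other thread can close. This is not the actual Artemis code. `PrivateHandleCopySketch`, `SharedBody` and `copyTo` are hypothetical names, and plain `java.nio` channels stand in for the broker's own file abstraction.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only; not the real LargeMessageImpl.
final class PrivateHandleCopySketch {

    /** Stand-in for a large message body whose file is shared by several consumers. */
    static final class SharedBody {
        private final Path path;
        private FileChannel shared; // handle that delivery threads open and close

        SharedBody(Path path) {
            this.path = path;
        }

        synchronized void open() throws IOException {
            if (shared == null) {
                shared = FileChannel.open(path, StandardOpenOption.READ);
            }
        }

        synchronized void close() throws IOException {
            if (shared != null) {
                shared.close();
                shared = null;
            }
        }

        synchronized boolean isOpen() {
            return shared != null;
        }

        /**
         * Copies the body through a channel private to this call, so a
         * concurrent close() on the shared handle cannot interrupt the copy.
         */
        void copyTo(Path target) throws IOException {
            try (FileChannel in = FileChannel.open(path, StandardOpenOption.READ);
                 FileChannel out = FileChannel.open(target, StandardOpenOption.CREATE,
                         StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
                long position = 0;
                long size = in.size();
                while (position < size) {
                    position += in.transferTo(position, size - position, out);
                }
            } // both private channels are always closed here
        }
    }

    public static void main(String[] args) throws Exception {
        Path source = Files.createTempFile("large-message", ".msg");
        Path target = Files.createTempFile("large-message-copy", ".msg");
        Files.write(source, new byte[1 << 20]);

        SharedBody body = new SharedBody(source);
        body.open();

        // Simulates a consumer that finishes delivery and closes the shared handle.
        Thread closer = new Thread(() -> {
            try {
                body.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        closer.start();
        body.copyTo(target);
        closer.join();

        System.out.println("copied " + Files.size(target) + " bytes");
    }
}
```

The design choice in this sketch is that the copy never touches the shared handle, so it neither depends on nor changes whether other consumers have the file open.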

@gaohoward
Contributor Author

Please hold this; it's getting test failures in Jenkins.

      file.close(false);
      newMessage.getFile().close();
   } finally {
      cloneFile.close();
   }
Contributor

Do we need to close the underlying file of the new large message after the copy?

Contributor Author

Yes, it should be closed, as in the original code. Otherwise it will cause a file leak.

Contributor

@clebertsuconic @gaohoward I don't understand why we only close the file when the old file was not originally open. In your case, where other consumers open the file and deliver, the file of the new large message might not be closed. This would result in a file leak and possibly data corruption if the broker crashes. WDYT?
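
The concern above can be illustrated with a small sketch. It is only one reading of the review comment, not the actual Artemis code: `CloseAfterCopySketch`, `Handle` and both copy methods are hypothetical, and the handle merely tracks an open flag instead of touching a real file.

```java
// Hypothetical illustration of the close-after-copy question; not Artemis code.
final class CloseAfterCopySketch {

    /** Stand-in for a file handle that only records whether it is open. */
    static final class Handle {
        private boolean open;

        void open() {
            open = true;
        }

        void close() {
            open = false;
        }

        boolean isOpen() {
            return open;
        }
    }

    /**
     * Behavior as questioned in the review: both closes depend on whether
     * the source was open before the copy started.
     */
    static void copyClosingOnlyIfSourceWasClosed(Handle source, Handle target) {
        boolean originallyOpen = source.isOpen();
        if (!originallyOpen) {
            source.open();
        }
        target.open();
        // ... bytes would be copied from source to target here ...
        if (!originallyOpen) {
            source.close();
            target.close();
        }
        // If the source was already open, the target is left open as well.
    }

    /**
     * Assumed alternative: the target is always closed, and the source is
     * returned to the state it was in before the copy.
     */
    static void copyAlwaysClosingTarget(Handle source, Handle target) {
        boolean originallyOpen = source.isOpen();
        if (!originallyOpen) {
            source.open();
        }
        try {
            target.open();
            // ... bytes would be copied from source to target here ...
        } finally {
            target.close();
            if (!originallyOpen) {
                source.close();
            }
        }
    }
}
```

With a source that another consumer already opened, the first method leaves the target handle open, which is the leak described above, while the second closes the target regardless and leaves the source open for its other user.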

Contributor Author

TBH I don't know either. I just kept the old behavior.

Contributor

@wy96f I'm doing some work with large messages, and I want to double check that as well.

Contributor

@wy96f / @gaohoward Actually, I will merge this right away, as whatever I do would clash here.

I will also make sure I address how we open and close files.

@clebertsuconic
Contributor

There are files leaking in your test.

@gaohoward
Contributor Author

@clebertsuconic Yes, I'm fixing it. HornetQ has the same issue; I'll fix that too.

@gaohoward
Contributor Author

I'll run Jenkins again.

@gaohoward
Contributor Author

@clebertsuconic Jenkins has 5 failures, but they are unlikely to be related, and those failed tests pass locally for me. So it's safe now.

@clebertsuconic
Contributor

@gaohoward can I keep this open for 1 or 2 days?

I'm doing some work on large messages, and I want to understand why we have that code to close files or keep them open.
