Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ConcurrentModificationException with nonblocking logReader.readNext(true) #1544

Closed
infodog opened this issue Jul 15, 2018 · 3 comments
Closed

Comments

@infodog
Copy link

infodog commented Jul 15, 2018

BUG REPORT

  1. Please describe the issue you observed:
  • What did you do?

I do a nonblocking read from the server with three bookie, my version is bookkeeper 4.7.1
my code is

reader = dlm.getInputStream(nextTxId);
                    while (true) {
                        try {
                            LogRecord record = reader.readNext(false);
                            if (record == null) {
                                reader.close();
                                reader = null;
                                break;
                            }
                            if (record.getTransactionId() > getLocalLastTxId()) {
                                LOG.info("updatelog " + this.nodeName + "," + this.instanceName + ":" + "newTransactionId:" + record.getTransactionId() + ",localLastTxId:" + getLocalLastTxId());
                                updateLog(record);
                            }
                        } catch (LogEmptyException t) {
                            System.out.println("no record in log.");
                        } catch (Throwable ioe) {
                            // handle the exception
                            if(reader!=null){
                                reader.close();
                            }
                            nextTxId = getLocalLastTxId();
                            reader = dlm.getInputStream(nextTxId);
                        }

                    }

and I observe an error log

2018-07-15 23:05:01  [ DLM-/pigeon40namespace-OrderedScheduler-1-0:8345 ] - [ ERROR ]  Unexpected throwable caught  org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:38)  
java.util.ConcurrentModificationException
	at java.util.HashMap$HashIterator.nextNode(HashMap.java:1437)
	at java.util.HashMap$ValueIterator.next(HashMap.java:1466)
	at org.apache.distributedlog.impl.ZKLogSegmentMetadataStore.lambda$notifyLogSegmentsUpdated$1(ZKLogSegmentMetadataStore.java:493)
	at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:266)
	at java.util.concurrent.FutureTask.run(FutureTask.java)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.lang.Thread.run(Thread.java:745)

and then no more data is read.

BP

This is the master ticket for tracking BP-<bp-number> :

[a short description for this BP]

Proposal PR - #<pr-number>

@sijie
Copy link
Member

sijie commented Jul 17, 2018

@infodog I think the problem might be due to some code blocks missing synchronization. will take a look at this tomorrow.

sijie added a commit to sijie/bookkeeper that referenced this issue Jul 18, 2018
…r.readNext(true)

*Motivation*

Fixes apache#1544. ConcurrentModificationException is thrown when trying to access a non-thread-safe hashmap from different threads.

*Changes*

Make sure accessing to this non-thread-safe hashmap is under synchronized block.

*Tests*

It is a bit tricky to reproduce this race condition in a unit test or an integration test. so not going to attempt adding any tests.
@sijie sijie self-assigned this Jul 18, 2018
@infodog
Copy link
Author

infodog commented Jul 18, 2018

@sijie Now I switch to blocking read and then everything is fine. I want to prevent blocking for a long time, I call lastShardTxId = dlm.getLastTxId(); when I reeach the lastShardTxId, I will sleep for a while 100ms and then call dlm.getLastTxId(); to check if there 's new log entry to read.

My question is: Is this a good practice or not?

@sijie
Copy link
Member

sijie commented Jul 18, 2018

@infodog that is totally okay. getLastTxId is basically reading from the bookies, which they are designed for high throughput write and reads.

@sijie sijie closed this as completed in ec7fb62 Jul 23, 2018
sijie added a commit that referenced this issue Jul 23, 2018
…logReader.readNext(true)

Descriptions of the changes in this PR:

*Motivation*

Fixes #1544. ConcurrentModificationException is thrown when trying to access a non-thread-safe hashmap from different threads.

*Changes*

Make sure accessing to this non-thread-safe hashmap is under synchronized block.

*Tests*

It is a bit tricky to reproduce this race condition in a unit test or an integration test. so not going to attempt adding any tests.

Master Issue: #1544

Author: Sijie Guo <sijie@apache.org>

Reviewers: Enrico Olivelli <eolivelli@gmail.com>, Jia Zhai <None>

This closes #1558 from sijie/issue_1544, closes #1544

(cherry picked from commit ec7fb62)
Signed-off-by: Sijie Guo <sijie@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants