Improve batch writer throughput #1120

Open
keith-turner opened this issue Apr 23, 2019 · 5 comments
Comments

@keith-turner
Contributor

keith-turner commented Apr 23, 2019

https://issues.apache.org/jira/browse/ACCUMULO-4154
https://issues.apache.org/jira/browse/ACCUMULO-1962

@jzgithub1
Contributor

I am investigating this issue now.

jzgithub1 added a commit to jzeiberg/accumulo that referenced this issue May 8, 2019
jzgithub1 added a commit to jzeiberg/accumulo that referenced this issue Aug 7, 2019
jzgithub1 added a commit to jzeiberg/accumulo that referenced this issue Aug 29, 2019
@jzgithub1
Contributor

jzgithub1 commented Sep 4, 2019

I am stepping away from this effort until after the 2.1.0 release so I can concentrate on the issues for that release. The changes I made in #1152 do not sufficiently implement the 2-layer queuing strategy proposed in the original JIRA ticket, ACCUMULO-4154. I will take up implementing the algorithm proposed there after the 2.1.0 release.

@ctubbsii
Member

ctubbsii commented Sep 4, 2019

@jzgithub1 There is not currently a release plan in place for 2.1, and certainly not a feature freeze, so any issue that is resolved today is an issue that has the potential to be included in 2.1.

@jzgithub1
Contributor

Thank you, @ctubbsii. I looked at the 'To Do' list in the 2.1.0 project in the Projects tab and did not see this ticket listed. Thank you for letting me know that the list is not a release plan.

@ctubbsii
Member

ctubbsii commented Sep 4, 2019

Sorry for the confusion. The projects are helpful for planning and triage, but are not set in stone. They are just our attempt to track what we expect might, or could, be done by that release.

dlmarion added a commit to dlmarion/accumulo that referenced this issue Sep 10, 2021
Modified the TabletServerBatchWriter to use concurrent data structures so that
mutations can be added and binned simultaneously, which allowed the
synchronized modifier to be removed from several methods. Specifically, we:

  Removed startProcessing(), which added queued mutations to the MutationWriter
  Removed the BatchWriterLatencyTimer thread, which called startProcessing() if it had not been called within the latency interval
  Removed calls to startProcessing() from flush and close
  Removed the call to startProcessing() made when used memory reaches half of max memory while adding a mutation to the BatchWriter
  Modified TSBW.MutationSet to use a ConcurrentHashMap instead of a HashMap
  Modified some variables to use Atomic datatypes so that the synchronized modifier could be removed from addMutation
  Removed the TSBW.FailedMutations object, which requeued failed mutations on a 500ms interval; requeues are now done immediately
  Moved the binning task to the BatchWriter constructor and modified it so that it is always running
  Added SendTasks to the sendThreadPool ordered by server, starting with the servers that have the most work

Closes apache#1120
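
For readers unfamiliar with the pattern, the ideas in the commit message above (a ConcurrentHashMap in place of a HashMap, atomic counters in place of synchronized methods, and sending to the busiest servers first) can be illustrated with a minimal sketch. This is not the Accumulo code: the class `ConcurrentMutationSet`, its methods, and the use of raw `byte[]` values in place of real mutations are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a per-server mutation set; NOT Accumulo's TSBW.MutationSet.
final class ConcurrentMutationSet {
  // Server name -> mutations (modeled here as raw bytes) waiting to be sent.
  private final Map<String, Queue<byte[]>> byServer = new ConcurrentHashMap<>();
  // Running total of queued bytes, updated atomically instead of under a lock.
  private final AtomicLong memoryUsed = new AtomicLong();

  // Safe to call from many threads at once; no synchronized modifier is needed.
  void add(String server, byte[] mutation) {
    byServer.computeIfAbsent(server, s -> new ConcurrentLinkedQueue<>()).add(mutation);
    memoryUsed.addAndGet(mutation.length);
  }

  long memoryUsed() {
    return memoryUsed.get();
  }

  // Returns server names ordered by queued mutation count, largest first.
  // The counts are a snapshot and may be stale if writers are still adding.
  List<String> serversByMostWork() {
    Map<String, Integer> sizes = new HashMap<>();
    for (Map.Entry<String, Queue<byte[]>> e : byServer.entrySet()) {
      sizes.put(e.getKey(), e.getValue().size());
    }
    List<String> servers = new ArrayList<>(sizes.keySet());
    servers.sort(Comparator.comparingInt((String s) -> sizes.get(s)).reversed());
    return servers;
  }
}
```

A sender could iterate over `serversByMostWork()` and submit one task per server to its thread pool in that order. Sorting against a snapshot of the sizes keeps the comparator consistent even while other threads continue to add mutations.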