Add Worksheet.values() batch function to the API #135

Closed
wants to merge 1 commit into from

Conversation

akbertram

For use-cases where rows are written sequentially, this avoids the
overhead of resizing the cells array each time a new cell is added,
as well as repeating all of the bounds checks for each cell.
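
To make the intended saving concrete, here is a minimal sketch of the two write paths. It is not fastexcel's actual code: `RowBuffer`, `resizeCount`, and the grow-to-exact-size policy are hypothetical, chosen only to show where per-cell writes repeat work that a batch write does once.

```java
import java.util.Arrays;

/** Hypothetical stand-in for a worksheet's row storage; NOT fastexcel's implementation. */
public class RowBuffer {
    static final int MAX_COLS = 16_384; // Excel's column limit

    private Object[] cells = new Object[0];
    int resizeCount = 0; // counts array reallocations, for illustration only

    /** Per-cell write: a bounds check and a possible resize on every call. */
    void value(int c, Object v) {
        checkColumn(c);
        if (c >= cells.length) {
            // Assumption: the array grows to exactly the size needed.
            cells = Arrays.copyOf(cells, c + 1);
            resizeCount++;
        }
        cells[c] = v;
    }

    /** Batch write: one bounds check and at most one resize for the whole row. */
    void values(Object... row) {
        if (row.length == 0) {
            return;
        }
        checkColumn(row.length - 1);
        if (row.length > cells.length) {
            cells = Arrays.copyOf(cells, row.length);
            resizeCount++;
        }
        System.arraycopy(row, 0, cells, 0, row.length);
    }

    Object get(int c) {
        return cells[c];
    }

    private static void checkColumn(int c) {
        if (c < 0 || c >= MAX_COLS) {
            throw new IllegalArgumentException("column out of range: " + c);
        }
    }

    public static void main(String[] args) {
        Object[] row = {1, 2, 3, 4};

        RowBuffer oneByOne = new RowBuffer();
        for (int c = 0; c < row.length; c++) {
            oneByOne.value(c, row[c]);
        }

        RowBuffer batch = new RowBuffer();
        batch.values(row);

        // prints "4 vs 1"
        System.out.println(oneByOne.resizeCount + " vs " + batch.resizeCount);
    }
}
```

Whether this difference matters in practice depends on how the real array grows and on how much time is spent elsewhere, for example in serializing the XML output.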

@rzymek
Collaborator

rzymek commented Dec 20, 2020

I did a micro benchmark comparison of:

private static final int NB_ROWS = 10000;
private static Object[] row = IntStream.range(0, 200).boxed().toArray();

@Benchmark
public int oneByOne() throws IOException {
  CountingOutputStream count = new CountingOutputStream(new NullOutputStream());
  Workbook wb = new Workbook(count, "Perf", "1.0");
  Worksheet ws = wb.newWorksheet("Sheet 1");
  for (int r = 0; r < NB_ROWS; ++r) {
    for (int c = 0; c < row.length; c++) {
      ws.value(r, c, row[c]);
    }
    if (r % 1000 == 0) {
      ws.flush();
    }
  }
  wb.finish();
  return count.getCount();
}

against

@Benchmark
public int wholeRowReverse() throws IOException {
  CountingOutputStream count = new CountingOutputStream(new NullOutputStream());
  Workbook wb = new Workbook(count, "Perf", "1.0");
  Worksheet ws = wb.newWorksheet("Sheet 1");
  for (int r = 0; r < NB_ROWS; ++r) {
    ws.values(r, row);
    if (r % 1000 == 0) {
      ws.flush();
    }
  }
  wb.finish();
  return count.getCount();
}

and the results unfortunately do not show any difference in performance:

WriterMultipleRows.oneByOne           ss    5  0.987 ± 0.039   s/op
WriterMultipleRows.wholeRowReverse    ss    5  0.988 ± 0.121   s/op

@akbertram
Author

akbertram commented Dec 20, 2020 via email

@rzymek
Collaborator

rzymek commented Jan 2, 2021

Great! Keeping my fingers crossed that you'll find some place that can be optimized!
Microbenchmark infrastructure for fastexcel is set up in the https://github.com/dhatim/fastexcel/tree/master/e2e subproject. Just create a class that extends BenchmarkLauncher, mark its methods with JMH's @Benchmark annotation, and run the class as a JUnit test. Feel free to drop me a line at rzymek@gmail.com if you need any help.
As for this PR, I'm going to close it, as the new API method does not bring a noticeable performance benefit.

@rzymek rzymek closed this Jan 2, 2021
2 participants