Parse sheets in ODS files once (to improve performance) #320

Merged
stevendaniels merged 1 commit into roo-rb:master from orhantoy:feature/parse-ods-sheets-once Jun 19, 2016

Contributor

orhantoy commented Jun 16, 2016

This greatly improves performance for large files: instead of parsing the sheets every time a cell is accessed, this commit parses them once, in the constructor.

In my tests with a 60K-line ODS spreadsheet, this change brought the running time down to about 70 seconds. Before this change, processing had not finished after 10-20 minutes. Others report more than 4 hours of processing in #113.
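
To illustrate the idea, here is a minimal sketch of the difference between parsing on every cell access and parsing once in the constructor. The class names (`PerAccessReader`, `ParseOnceReader`), the `parse_count` counter, and the array-of-rows input are hypothetical and only stand in for the real ODS parsing; this is not roo's actual code.

```ruby
# Hypothetical reader that rebuilds the whole cell table on every access,
# which is roughly the behaviour this change moves away from.
class PerAccessReader
  attr_reader :parse_count

  def initialize(rows)
    @rows = rows
    @parse_count = 0
  end

  def cell(row, col)
    table = parse_sheet # full parse for each lookup
    table[[row, col]]
  end

  private

  # Stand-in for the expensive sheet parsing; counts how often it runs.
  def parse_sheet
    @parse_count += 1
    table = {}
    @rows.each_with_index do |values, r|
      values.each_with_index { |value, c| table[[r + 1, c + 1]] = value }
    end
    table
  end
end

# Hypothetical reader that parses once in the constructor and then
# answers every cell lookup from the cached table.
class ParseOnceReader < PerAccessReader
  def initialize(rows)
    super
    @table = parse_sheet
  end

  def cell(row, col)
    @table[[row, col]]
  end
end
```

With N cell lookups, the first reader does N full parses while the second does one, which is why the gap grows so quickly with sheet size.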

I am not sure who the active maintainer is but I'll just ping you, @stevendaniels, to start with 😄

coveralls commented Jun 16, 2016

Coverage Status

Coverage remained the same at 94.348% when pulling 31fb15c on orhantoy:feature/parse-ods-sheets-once into 211f89b on roo-rb:master.

@stevendaniels stevendaniels merged commit 63ad941 into roo-rb:master Jun 19, 2016
2 checks passed
continuous-integration/travis-ci/pr: The Travis CI build passed
coverage/coveralls: Coverage remained the same at 94.348%
Contributor

stevendaniels commented Jun 19, 2016

@orhantoy Thanks!

@orhantoy orhantoy deleted the orhantoy:feature/parse-ods-sheets-once branch Jun 20, 2016
jsonn pushed a commit to jsonn/pkgsrc that referenced this pull request Oct 15, 2016
## [2.5.1] 2016-08-26
### Fixed
- Fixed NameError. [337](roo-rb/roo#337)

## [2.5.0] 2016-08-21
### Fixed
- Remove tempdirs via finalizers on garbage collection. This cleans them up in all known cases, rather than just when the #close method is called. The #close method can be used to clean up early. [329](roo-rb/roo#329)
- Fixed README.md typo [318](roo-rb/roo#318)
- Parse sheets in ODS files once to improve performance [320](roo-rb/roo#320)
- Fix some Cell conversion issues [324](roo-rb/roo#324) and [331](roo-rb/roo#331)
- Improved memory performance [332](roo-rb/roo#332)
- Added `no_hyperlinks` option to improve streaming performance [319](roo-rb/roo#319) and [333](roo-rb/roo#333)

### Deprecations
- Roo::Base::TEMP_PREFIX should be accessed via Roo::TEMP_PREFIX
- The private Roo::Base#make_tempdir is now available at the class level in
  classes that use tempdirs, added via Roo::Tempdir
### Added
- Discard hyperlink lookups to allow streaming parsing without loading whole files

## [2.4.0] 2016-05-14
### Fixed
- Fixed opening spreadsheets with charts [315](roo-rb/roo#315)
- Fixed memory issues for Roo::Utils.number_to_letter [308](roo-rb/roo#308)
- Fixed Roo::Excelx::Cell::Number to recognize floating point numbers [306](roo-rb/roo#306)
- Fixed version number in Readme.md [304](roo-rb/roo#304)

### Added
- Added initial support for HTML formatting [278](roo-rb/roo#278)