Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Empty cell information #178

Merged
merged 2 commits into from
May 13, 2022
Merged

Empty cell information #178

merged 2 commits into from
May 13, 2022

Conversation

JanMarvin
Copy link
Owner

[c++/empty cells] Empty cell information is now displayed with "" instead of "openxlsx_NA" as before. Previously this variable was used to help various wbWorkbook functions to distinguish between empty cells, missing cells and cells to be ignored. Fortunately, this is no longer needed. Either a cell contains useful information or it can be ignored. Empty strings are written as <is><t></t></is> or they are still sharedstrings. Row and column information cannot be empty, similarly openxml has no empty attributes (as far as I know).

This relieves us of the time-consuming burden of initializing each field of cc with the long string "openxlsx_NA" (it was intentionally that long to prevent someone from randomly selecting it in an xlsx file). However, since this string was used fundamentally in several places, this change should be monitored closely. Especially for files that are known to be strange (missing lines, missing data, empty strings, or uninitialized cells).

There is another string called "_openxlsx_NA". This string has another use that is still required, and writing this string from openxlsx2 will still corrupt the workbook as indicated in the description.

This is more of a heads up for you on who to blame @jmbarbone, I don't expect any negative effects from this PR (just positive, cleaner cc table and slight speedups due to skipping the initialization, and our tests and verifications have indicated no problem 😃 ).

@codecov-commenter
Copy link

codecov-commenter commented May 13, 2022

Codecov Report

Merging #178 (2b8be90) into main (4bd8032) will decrease coverage by 0.03%.
The diff coverage is 94.11%.

@@            Coverage Diff             @@
##             main     #178      +/-   ##
==========================================
- Coverage   77.74%   77.71%   -0.04%     
==========================================
  Files          36       36              
  Lines        7424     7414      -10     
==========================================
- Hits         5772     5762      -10     
  Misses       1652     1652              
Impacted Files Coverage Δ
R/wb_styles.R 47.25% <0.00%> (ø)
src/openxlsx2_types.h 100.00% <ø> (ø)
R/wb_functions.R 88.11% <88.88%> (ø)
R/class-worksheet.R 77.27% <100.00%> (ø)
src/helper_functions.cpp 98.05% <100.00%> (-0.09%) ⬇️
src/load_workbook.cpp 92.41% <100.00%> (-0.06%) ⬇️
src/write_file.cpp 94.23% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4bd8032...2b8be90. Read the comment docs.

…tead of "_openxlsx_NA_" as before. Previously this variable was used to help various `wbWorkbook` functions to distinguish between empty cells, missing cells and cells to be ignored. Fortunately, this is no longer needed. Either a cell contains useful information or it can be ignored. Empty strings are written as `<is><t></t></is>` or they are still sharedstrings. Row and column information cannot be empty, similarly openxml has no empty attributes (as far as I know).

This relieves us of the time-consuming burden of initializing each field of cc with the long string "_openxlsx_NA_" (it was intentionally that long to prevent someone from randomly selecting it in an xlsx file). However, since this string was used fundamentally in several places, this change should be monitored closely. Especially for files that are known to be strange (missing lines, missing data, empty strings, or uninitialized cells).

There is another string called "_openxlsx_NA". This string has another use that is still required, and writing this string from openxlsx2 will still corrupt the workbook as indicated in the description.
@JanMarvin JanMarvin merged commit 8d275ae into main May 13, 2022
@JanMarvin JanMarvin deleted the empty_cell_information branch May 13, 2022 00:54
@JanMarvin JanMarvin mentioned this pull request May 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants