Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reindents lines #8

Merged
merged 2 commits into from
Nov 23, 2021
Merged

reindents lines #8

merged 2 commits into from
Nov 23, 2021

Conversation

jmbarbone
Copy link
Collaborator

@jmbarbone jmbarbone commented Nov 22, 2021

To avoid trying it ignore all of these re-indents in other PRs I'm going to try to get them out of the way.

No code is changed, just adding/removing spaces.

I wouldn't bother cleaning up the ReferenceClass methods since these are being handled in the rc-cleanup branch

@JanMarvin
Copy link
Owner

Rcppexports is recreated when running build, therefore it wont remain that way, otherwise no problem.
Just a word of warning: Most of the functions unrelated to loading and saving workbooks are probably out of order. Outside of those in the wb_functions.R all others should be assumed not to work until otherwise determined. Currently one should not expect the full openxlsx functionality.

R/RcppExports.R Outdated Show resolved Hide resolved
@jmbarbone
Copy link
Collaborator Author

Rcppexports is recreated when running build, therefore it wont remain that way, otherwise no problem. Just a word of warning: Most of the functions unrelated to loading and saving workbooks are probably out of order. Outside of those in the wb_functions.R all others should be assumed not to work until otherwise determined. Currently one should not expect the full openxlsx functionality.

Thanks. I did notice that quite a bit of testing had been removed, so I'm going to be a little careful about these. But I have made some steps to organize the testing so these things should be a bit easier to see and deal with. Also gives some opportunity to review the functions for improvements/clean-up.

@JanMarvin
Copy link
Owner

Yes, I saw that you were beginning to add back tests for certain functions. I love tests, but for openxlsx2 I have rewritten most of the data importing and exporting process to a data structure that is hopefully capable of handling more Excel features. Unfortunately much of the old code was written specifically for this part of the data structure. Most of the old code modifies part of workbook areas that are attached to a certain data structure.

Re-writing this data structure was the ugly part that took quite some time and I haven't found the motivation to hot glue everything back into the new structure. My hope was to keep and rewrite the good parts of openxlsx and to drop the bad parts or rewrite them from scratch.

@JanMarvin
Copy link
Owner

JanMarvin commented Nov 22, 2021

For instance this is just what I can recall from the top of my head

cell data

In the new openxlsx2 structure, we have all cell information in a single data frame.

(row_r it the numeric row, c_r is the cell column name, c_s is the cell style, c_t is the cell type, v is the cell value, f is a cell formula f_t, f_ref and f_si are formula related (cell type, cell reference and a text string), is entire inlineStr content. There might be other cell attributes, but either they were not used frequently or I didn't understand their meaning, therefore I didn't include fields for them. I use the _openxlsx_NA_ string to differentiate between missing values in the xml files e.g. NA .)

> xlsxFile <- system.file("extdata", "readTest.xlsx", package = "openxlsx2")
> wb2 <- loadWorkbook(xlsxFile)
> wb2$worksheets[[1]]$sheet_data$cc
   row_r c_r           c_s           c_t             v                   f           f_t         f_ref          f_si            is
1      1   A _openxlsx_NA_             s          2096       _openxlsx_NA_ _openxlsx_NA_ _openxlsx_NA_ _openxlsx_NA_ _openxlsx_NA_
2      1   B _openxlsx_NA_             s          2097       _openxlsx_NA_ _openxlsx_NA_ _openxlsx_NA_ _openxlsx_NA_ _openxlsx_NA_
3      1   D _openxlsx_NA_             s          2098       _openxlsx_NA_ _openxlsx_NA_ _openxlsx_NA_ _openxlsx_NA_ _openxlsx_NA_

Previously this was scattered in single vectors

> xlsxFile <- system.file("extdata", "readTest.xlsx", package = "openxlsx2")
> wb1 <- openxlsx::loadWorkbook(xlsxFile)
> df <- data.frame(f = wb1$worksheets[[1]]$sheet_data$f, 
+                  v = wb1$worksheets[[1]]$sheet_data$v,
+                  t = wb1$worksheets[[1]]$sheet_data$t)
> df
                                f            v  t
1                            <NA>         2096  1
2                            <NA>         2097  1
3                            <NA>         2098  1

row data

Every row attribute is a list (we might actually switch to simply import the xml-node and modify if only if needed, now that we have a working xml-library under the hood, converting c++ lists to R lists is quite costly. Currently nearly half the time of a complex import file is spent creating the style attributes.):

> wb2$worksheets[[1]]$sheet_data$row_attr[[1]]
$r
[1] "1"

$spans
[1] "1:8"

$`x14ac:dyDescent`
[1] "0.25"

Previously we had this (edit: meaning the output of the following line) and maybe information scattered elsewhere (e.g. wb1$outlineLevels for outline levels).

> wb1$worksheets[[1]]$sheet_data$rows
 [1]  1  1  1

Now if we want to have outline levels in openxlsx2 all we have to do is adding outline levels to the row list. Assign it as a named element to the row_attr list.

@jmbarbone jmbarbone merged commit aed775b into main Nov 23, 2021
@JanMarvin JanMarvin deleted the reindent-lines branch January 1, 2022 16:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants