Can you add some parameters? #3

Closed
zoushucai opened this issue Jun 9, 2023 · 3 comments

@zoushucai

This package is indeed very good. It recently solved my problem of reading large xlsx files, but I hope the author can add more custom parameters, such as:

  1. Specify the data type for each column.
  2. Specify the NA values.
  3. When reading data, do not introduce scientific notation. In particular, if a column contains both text and numbers, the column is read as text by default, but the numbers in it come out in scientific notation.
  4. There seems to be an encoding issue? (test file)
> mm = SheetReader::read_xlsx(path = f2, sheet = 1)
> head(mm$Profit)
[1] "本期利润" "没有单位"
[3] NA                                 "1.72925e+08"                     
[5] NA                                 NA  
@fhenz
Owner

fhenz commented Jun 15, 2023

Thank you,
I have pushed a fix for 4.; there was an issue with XML-escaped Unicode characters. If you have devtools, you can install the fixed version via install_github("fhenz/SheetReader-r"). I will probably only upload a new CRAN version once I have also addressed some of your other points.

I think 1. and 2. are both good ideas; I will try to implement something similar to what readxl has.
3. is a bit tricky because Excel doesn't differentiate between integers and real numbers when storing values, but I should be able to solve this more elegantly once I implement 1. (it would then be solved by specifying string/text as the column data type; would that be sufficient?).
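
Independent of any column-type support in the package, a mixed text/number column such as Profit can be post-processed in plain R. The sketch below is only an illustration: `plain_numbers` is a hypothetical helper, not part of SheetReader, and it can only restore the digits that survived in the string (a value already shortened to "1.72925e+08" cannot get its lost precision back).

```r
# Hypothetical helper (NOT part of SheetReader): rewrite numeric-looking
# strings without scientific notation; other text and NA stay untouched.
plain_numbers <- function(x) {
  num <- suppressWarnings(as.numeric(x))  # non-numeric text becomes NA
  is_num <- !is.na(num)
  # Format element by element so each value keeps its own number of decimals.
  x[is_num] <- vapply(
    num[is_num],
    function(v) format(v, scientific = FALSE, digits = 15, trim = TRUE),
    character(1)
  )
  x
}

# Sample values taken from the output shown in this issue.
profit <- c("本期利润", "没有单位", NA, "1.72925e+08", NA, NA)
fixed <- plain_numbers(profit)
fixed[4]  # "172925000"
```

The text cells and the NA entries are returned unchanged, so the helper can be applied to a whole column after reading.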

@zoushucai
Author

Thank you for your reply. Indeed, if 1 is resolved, then 3 can in theory be resolved as well.

fhenz added a commit that referenced this issue Mar 1, 2024
@fhenz
Owner

fhenz commented Mar 1, 2024

A new parameter col_types has been added that allows specifying the data types for columns via a named or unnamed character vector, e.g. read_xlsx([...], col_types=c("Profit"="text")).
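
For the file from the original report, a call might look like the sketch below. The file name `f2` is a placeholder (the real path was never given in this thread), and only the "text" type and the named-vector form mentioned above are used; other type names and the exact semantics of the unnamed form are not shown here because they are not spelled out in this issue.

```r
# Placeholder path: the actual test file from this issue is not known.
f2 <- "test_file.xlsx"

# Named character vector: column name -> data type, as described above.
col_types <- c("Profit" = "text")

# Guarded so the sketch does nothing if the package or file is missing.
if (requireNamespace("SheetReader", quietly = TRUE) && file.exists(f2)) {
  mm <- SheetReader::read_xlsx(path = f2, sheet = 1, col_types = col_types)
  print(head(mm$Profit))
}
```

Reading Profit as text is the approach suggested earlier in this thread for point 3, since the values are then no longer converted as numbers.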

@fhenz fhenz closed this as completed Mar 1, 2024