get many _x000d if cell contain \n #16
Comments
Hi @Jack-Myth, I'm afraid I can't recreate your issue. Can you show the code you are using, as well as an Excel spreadsheet? Also, can you tell me what system you are using (operating system, version, etc.)?
I made a small test myself; here are my results. I made a spreadsheet called "TestBook.xlsx", with one sheet called "Sheet1", with the following contents: (screenshot not preserved)
Then I ran the following code:
#include <iostream>
#include <iomanip>
#include <OpenXLSX/OpenXLSX.h>
using namespace std;
using namespace OpenXLSX;
int main() {
XLDocument doc;
doc.OpenDocument("./TestBook.xlsx");
auto wks = doc.Workbook().Worksheet("Sheet1");
cout << "In TestBook:" << endl;
cout << wks.Cell("A1").Value().Get<std::string>() << endl;
cout << wks.Cell("A2").Value().Get<std::string>() << endl;
cout << wks.Cell("A3").Value().Get<std::string>() << endl;
wks.Cell("B1").Value() = "B1: Line1";
wks.Cell("B2").Value() = "B2: Line1\n B2: Line 2";
wks.Cell("B3").Value() = "B1: Line1\n B2: Line 2\n B2: Line 3";
doc.SaveDocument();
doc.CloseDocument();
return 0;
}
...and the "TestBook.xlsx" after running the code: (screenshot not preserved)
Thanks for your reply.
It's very simple and just prints the first row, but with it you can see the error: (screenshot not preserved)
By the way, I found that an .xlsx file saved by OpenXLSX can't be read by the Python library "xlrd", but it can be read by "openpyxl". For now I load the file with openpyxl and re-save it so that xlrd can read it, but openpyxl is very slow, so re-saving takes a lot of time. Would you mind spending a little time on this problem?
I looked into this, and it turns out that "_x000D_" is the escaped form of the Unicode code point U+000D, i.e. "\r" (carriage return), not "\n". The difference is that "\n" jumps to the beginning of the next line, whereas "\r" jumps to the beginning of the current line.
The problem actually lies in the spreadsheet you use, not in OpenXLSX, because Excel stores the "\r" character as an escaped string (i.e. "_x000D_"), not as the "\r" character itself. So it is difficult to come up with a good solution to the problem. Indeed, other Excel libraries have had the same problem (for example: jmcnamara/libxlsxwriter#189).
Also, when I open the test.xlsx file in Excel, it behaves strangely. When I select the cell, the contents are reformatted, as if the "\r" characters were replaced with "\n".
Solution
If you are not able to fix the spreadsheet itself (maybe you imported data from a text file?), then you can do something like the following:
#include <iostream>
#include <regex>
#include <OpenXLSX/OpenXLSX.h>
using namespace std;
using namespace OpenXLSX;
int main() {
XLDocument doc;
doc.OpenDocument("./test.xlsx");
auto sht = doc.Workbook().Worksheet("Sheet1");
auto text = std::regex_replace(sht.Cell("A1").Value().AsString(), regex("_x000D_"), "\n");
cout << text << endl;
return 0;
}
This will take the contents of a cell and replace all instances of the substring "_x000D_" with "\n". You can also replace it with "\r", but that is almost certainly not what you want to do (actually, Python/xlrd does that, resulting in only the last line of the cell contents being printed; this makes it very difficult to understand what is going on).
Regarding your question about opening .xlsx files made with OpenXLSX with Python/xlrd, I have fixed that now. Please let me know if everything works out for you.
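The regex above only handles the carriage-return escape. As a rough sketch of a more general approach, the helpers below decode any "_xHHHH_" escape in a string that has already been read from a cell, and then normalize line endings. The function names (`appendUtf8`, `hexValue`, `decodeXmlEscapes`, `normalizeNewlines`) are hypothetical and not part of OpenXLSX. Accepting lowercase hex digits is an assumption made for leniency, based on the lowercase "_x000d" reported in this issue.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of OpenXLSX): append one code point from the
// Basic Multilingual Plane to `out` as UTF-8. Surrogate values are not
// special-cased in this sketch.
void appendUtf8(std::string& out, unsigned cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Returns the value of a hex digit, or -1 if `c` is not a hex digit.
int hexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper: replace every "_xHHHH_" escape with the character it
// encodes. Anything that does not match the pattern is copied unchanged.
std::string decodeXmlEscapes(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    std::size_t i = 0;
    while (i < in.size()) {
        // An escape is exactly 7 characters: '_', 'x', four hex digits, '_'.
        if (i + 7 <= in.size() && in[i] == '_' && in[i + 1] == 'x' && in[i + 6] == '_') {
            unsigned cp = 0;
            bool ok = true;
            for (std::size_t k = i + 2; k < i + 6; ++k) {
                const int v = hexValue(in[k]);
                if (v < 0) { ok = false; break; }
                cp = cp * 16 + static_cast<unsigned>(v);
            }
            if (ok) {
                appendUtf8(out, cp);
                i += 7;  // skip the whole escape, including the closing '_'
                continue;
            }
        }
        out += in[i++];
    }
    return out;
}

// Hypothetical helper: turn "\r\n" pairs and lone "\r" characters into "\n".
std::string normalizeNewlines(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '\r') {
            out += '\n';
            if (i + 1 < in.size() && in[i + 1] == '\n') ++i;  // swallow the LF of a CRLF pair
        } else {
            out += in[i];
        }
    }
    return out;
}
```

With these in place, something like `normalizeNewlines(decodeXmlEscapes(sht.Cell("A1").Value().AsString()))` could stand in for the `regex_replace` call. Because the closing underscore of an escape is consumed, an escaped underscore such as "_x005F_x000D_" decodes to the literal text "_x000D_" rather than to a carriage return.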
Sorry for the delay. The last few months have been really busy! I am unable to reproduce the error you mention. I have tested it with Visual Studio 2019 Community Edition, using both the 32- and 64-bit compilers, with no problems. According to these references, the std::remove and std::rename functions are available in Visual Studio: (links not preserved)
I will update the library to check the return values of std::remove and std::rename. If an error code is returned, the library will throw an exception.
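A minimal sketch of what such a check might look like, assuming thin wrappers around the two standard functions. The names `checkedRename` and `checkedRemove` are hypothetical, and this is not the code that ended up in the library. Both std::rename and std::remove return zero on success and a non-zero value on failure.

```cpp
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical wrapper: rename a file and throw if std::rename reports failure.
// errno is included in the message on the assumption that the platform sets it
// (POSIX does; the C++ standard itself does not require it).
void checkedRename(const std::string& from, const std::string& to) {
    if (std::rename(from.c_str(), to.c_str()) != 0) {
        throw std::runtime_error("rename failed: " + from + " -> " + to +
                                 ": " + std::strerror(errno));
    }
}

// Hypothetical wrapper: delete a file and throw if std::remove reports failure.
void checkedRemove(const std::string& path) {
    if (std::remove(path.c_str()) != 0) {
        throw std::runtime_error("remove failed: " + path +
                                 ": " + std::strerror(errno));
    }
}
```

Callers can then catch std::runtime_error around the save step instead of silently continuing with a missing or stale file.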
@Jack-Myth I came across the same issue. I've read your
I'm trying to use OpenXLSX to read XLSX files. It's a great library, but I found something strange: if a cell's value contains multiple lines, the "\n" is read as x000d.
"abc
abc" will become "abc_x000d__000d..abc"
Any idea?