wrong currency parsing #78
Based on some small tests (you can play around with the number formatting library: http://oss.sheetjs.com/ssf/) I think the issue here boils down to the codepage encoding of the number format and the lack of localization (to be sure, the number 12345.6789 in the US version renders as I would like to take a peek at those files and figure out the best strategy for the localization. Can you make a sample sheet with a cell from that format and save it as (in this particular order):
If it warns before saving in any of the formats, that's OK. If you could share the files with me (either email or putting them somewhere) I can take a look. I deeply appreciate your help :) |
See attached files. "Ñ." instead of local currency symbol is not really big issue but it Note that there are two additional sheets with objects map - to be ignored. Thank you. On 01.07.2014 11:35, SheetJSDev wrote:
|
Unfortunately GitHub doesn't support attachments in the email. Not to worry though -- I was able to produce a file by changing my computer settings to Russian. The XLSX https://github.com/SheetJS/js-xlsx/blob/master/bits/47_styxml.js#L57
The other half of the problem (thousands separator and decimal character) is locale specific. To see this, in Windows "Region and Language" settings, the separators are in the "Additional Settings" pane. If you look at the file (unzip the xlsx and look at the file xl/styles.xml), you'll see the format is stored assuming that the comma is the thousand separator and the dot is the decimal. The fix probably looks like this:
@elad In Hebrew, are numbers/currencies written right-to-left or left-to-right? |
@sysarchitect you've opened a big can of worms. I took a look at the number format test, testing with different location settings, and found:
Good news is that this setting can be controlled very easily in Excel 2013, so it shouldn't be too hard to create a list for each of the locales. |
Hello, Glad to hear. For most purposes 2013 is enough I think. On 04.07.2014 8:00, SheetJSDev wrote:
|
@sysarchitect @elad when you save as "CSV", does Excel save with semicolons or commas? For example, this is what I saw when saving the number_format baseline as csv in Russian: https://github.com/SheetJS/test_files/blob/master/number_format_russian.0.csv |
For me the default is commas, but I don't use a localized version of Excel, so I don't know if that bit of information is of any help. :) |
@elad I should have tested Hebrew before asking -- Excel is horribly inconsistent. For example, this is the date Excel saves it as Which is the correct rendering? The US date is "October 18, 1933" |
For Hebrew CSV files, I think I found out you had to have a BOM character first. At least that's what I had to do in my code for the files to actually have readable contents. The first rendering ( The first thing I would try would be to put in a BOM character, if that doesn't work we can debug further. |
@elad mystery solved: when saving a file as CSV, Excel attempts to use the local codepage. That's actually controlled by a different setting (for "non-Unicode applications", strange since Excel is clearly Unicode-aware). So when I generated the baseline using codepage 1252 (the standard US codepage) the Hebrew characters are invalid (so they were rendered as
|
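As a small illustration of the BOM suggestion above, here is a minimal sketch that prepends a UTF-8 byte order mark to CSV text before it is written out. The helper name `withBom` is hypothetical and not part of js-xlsx; the sketch only shows the byte-level effect and makes no claim about how Excel handles every locale or codepage setting.

```javascript
// Hypothetical helper (not part of js-xlsx): prepend a UTF-8 byte order mark
// so a reader can recognize the encoding instead of guessing a local codepage.
const UTF8_BOM = "\uFEFF";

function withBom(csvText) {
  // Avoid doubling the BOM if the text already starts with one.
  return csvText.startsWith(UTF8_BOM) ? csvText : UTF8_BOM + csvText;
}

// Encoding the string as UTF-8 turns U+FEFF into the three bytes EF BB BF.
const bomBytes = new TextEncoder().encode(withBom("שלום,1234\n"));
```

The resulting bytes can be written to a file unchanged; the text after the first three bytes is ordinary UTF-8.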
Hello, In Excel: Saved in CSV as: 1234567;1234567 On 08.07.2014 22:50, SheetJSDev wrote:
|
@sysarchitect I haven't forgotten this :) With regards to the XLSX With regards to the actual format processing, there are two sub-problems. A) Determine the location information from the file. This is the status:
B) Use the location information to generate properly formatted text. The current snag is that the date information is localized as well. For example, consider the month format Also, can you directly send me a throwaway set of files (XLS, XLSX, XLSB, XML) using the problematic formats that I can add to the test suite? Replying to this email unfortunately doesn't forward the attachments, so you have to send it to dev -- sheetjs -- com |
Hello, H.N.Y. is less expected than new version )
I tested on XLSX only.
Thank you. On 30.07.2014 17:47, SheetJSDev wrote:
|
@sysarchitect Feel free to send the code so we can review :) The "github way" to do this is to fork the repo (hit the fork button), commit changes, push them to your fork and create a "pull request". Alternatively, you can just paste the function body in a reply and we can take a look. If you want to add it a reply, add three backticks (
|
Hello, OK
Where ID is the Excel row number. It is necessary for back messages to The ideal solution would be to implement XML function that returns the My modification of sheet_to_csv function:
sysarchitect On 31.07.2014 16:57, SheetJSDev wrote:
|
@sysarchitect if I understand what you want to do correctly, you can make your own function that uses sheet_to_json (this function can be in its own script, so you don't have to change xlsx.js or xls.js to do this):
If you need UTF-16 encoding, then use codepage:
|
Hello, We're going to production with our project in one to two months. Best Regards, Ilya Loskutov On 09.07.2014 10:14, Ilya Loskutov wrote:
|
@sysarchitect we are very close to pushing a boatload of logic to fix currency as well as date/time and other localization issues. Stay tuned :) Long story short, the SSF module will mirror the C localization functions (e.g. setlocale, LC_NUMERIC). After checking every locale (Windows does not make it easy to switch regions and languages) I have a rough sense for how Windows locale information affects the formatting, and it appears to mirror the C localization system. |
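To make the idea of locale-driven separators concrete, here is a rough sketch in the spirit of C's `LC_NUMERIC` settings. It is not SSF code: the `LOCALES` table, the `formatNumber` function, and the separator values are all assumptions chosen for illustration (for instance, a plain space is used as the Russian group separator).

```javascript
// Hypothetical locale table; the separator values are illustrative assumptions,
// not data extracted from Windows or from SSF.
const LOCALES = {
  "en-US": { decimal: ".", group: "," },
  "ru-RU": { decimal: ",", group: " " },
};

function formatNumber(value, places, localeName) {
  const loc = LOCALES[localeName];
  if (!loc) throw new Error("unknown locale: " + localeName);
  const sign = value < 0 ? "-" : "";
  // toFixed always emits "." as the decimal point, independent of system locale.
  const [intPart, fracPart] = Math.abs(value).toFixed(places).split(".");
  // Insert the group separator before each trailing run of three digits.
  const grouped = intPart.replace(/\B(?=(\d{3})+$)/g, loc.group);
  return sign + grouped + (fracPart ? loc.decimal + fracPart : "");
}

// formatNumber(12345.6789, 2, "en-US") gives "12,345.68"
// formatNumber(12345.6789, 2, "ru-RU") gives "12 345,68"
```

Keeping the separators in a table rather than in the format string matches the observation earlier in the thread that the stored format assumes a comma for thousands and a dot for decimals, with the locale applied only at render time.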
Glad to hear ) On 13.10.2014 9:44, SheetJSDev wrote:
|
Hello I have looked through this thread to find an answer, but semicolon to comma is the closest to what I have gotten because what happens to me is the following.
As you can see, the XLSX for some reason surrounds the cell that contains the "." with quotes (which I don't mind), but my problem is that it changes it to a comma, which makes it really hard to split cell by cell so that I can recreate it. |
@mandros1 I've found same issue, any solution for this? |
@MiqueiasGFernandes yikes, a long time has passed since I posted this, so I don't even remember what was the project/case I had issue with as described above. I do remember that I ultimately did a hacky fix by splitting and replacing, but don't remember what exactly. I also do know that I didn't solve it using this library, but I think this might be fixable by adapting some configurations or passing the format type for the parser to use surely (I just wasn't keen on reading the documentation back then). |
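For anyone hitting the same splitting problem, one possible workaround is to split each CSV line with a quote-aware scanner instead of a plain `split(",")`. The sketch below is a standalone helper under that assumption; `splitCsvLine` is a hypothetical name, it is not part of this library, and it does not handle quoted cells that span multiple lines.

```javascript
// Hypothetical helper: split one CSV line into cells, treating delimiters
// inside double-quoted cells as literal text.
function splitCsvLine(line, delimiter = ",") {
  const cells = [];
  let cell = "";
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        cell += '"'; // a doubled quote inside a quoted cell is an escaped quote
        i++;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        cell += ch;
      }
    } else if (ch === '"') {
      inQuotes = true; // opening quote
    } else if (ch === delimiter) {
      cells.push(cell);
      cell = "";
    } else {
      cell += ch;
    }
  }
  cells.push(cell);
  return cells;
}

// splitCsvLine('a,"1,5",b') gives ["a", "1,5", "b"]
```

Passing `";"` as the second argument covers the semicolon-delimited output reported earlier in the thread for Russian settings.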
Hello,
Cell formatted as shown (Russian currency format) parses as:
Can you fix please? Thank you.