Could not convert from SJIS to UTF-8 #5

Closed
huynh-duc-mulodo opened this issue Aug 31, 2017 · 7 comments

Comments

@huynh-duc-mulodo

huynh-duc-mulodo commented Aug 31, 2017

When I used the code below for the conversion, the data was not converted correctly. The output was displayed as strange characters instead of Japanese:
「観é�³ Kå­�,ç�·,ç��å·� Yå­�,女,」

var file = fileUploader.files[0];
var reader = new FileReader();
// Register the handler before starting the read
reader.onload = function() {
     var originalText = reader.result; // binary string read from the SJIS file
     console.log(originalText);
     var inputArray = str2Array(originalText); //sjis
     var outputArray = Encoding.convert(inputArray, 'UTF8', 'SJIS');
     var result = Encoding.codeToString(outputArray);
     console.log(result);
};
reader.readAsBinaryString(file);

function str2Array(str) {
     var array = [], i, il = str.length;
     for (i = 0; i < il; i++) {
          array.push(str.charCodeAt(i));
     }
     return array;
}
@huynh-duc-mulodo
Author

Actually, the codeToString method should not be used in this case. It seems to be meant for 16-bit-based conversion.

@polygonplanet
Owner

What is the originalText string?

@huynh-duc-mulodo
Author

Hello, originalText is the data read from the file, which is SJIS-encoded.

@polygonplanet
Owner

I think the character encoding you are looking for is probably "Unicode".
"Unicode" is JavaScript's internal string encoding; "UTF-8" is not.

var outputArray = Encoding.convert(inputArray, 'UNICODE', 'SJIS');
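
To illustrate the distinction above, here is a minimal sketch in plain JavaScript (it does not use the library). It assumes the character "観" (U+89B3), whose UTF-8 form is the three bytes E8 A6 B3. The variable names are made up for this example.

```javascript
// "観" is U+89B3: three bytes in UTF-8, but a single 16-bit code unit
// in a JavaScript string.
var utf8Bytes = [0xE8, 0xA6, 0xB3];
var codeUnits = [0x89B3];

// Each UTF-8 byte is treated as its own character, which yields mojibake.
var wrong = String.fromCharCode.apply(null, utf8Bytes);

// A 16-bit code unit maps directly to the intended character.
var right = String.fromCharCode.apply(null, codeUnits);

console.log(wrong.length); // 3
console.log(right.length); // 1
```

This is why an array converted to 'UTF8' looks garbled when it is passed straight to a function that builds a string from character codes, while an array converted to 'UNICODE' does not.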

@huynh-duc-mulodo
Author

huynh-duc-mulodo commented Aug 31, 2017

Actually, I had already tried var outputArray = Encoding.convert(inputArray, 'UNICODE', 'SJIS');. But I need to convert to UTF8 because I'm importing the data into the Salesforce platform. So the way I fixed this problem was to stop using Encoding.codeToString. Instead, I came up with the following:

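function arrayToString(uintArray) {
    // Turn each UTF-8 byte into one character of a binary string
    var encodedString = String.fromCharCode.apply(null, uintArray),
        // Percent-encode the bytes with escape(), then decode them as UTF-8
        decodedString = decodeURIComponent(escape(encodedString));
    return decodedString;
}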

@polygonplanet
Owner

Encoding.codeToString uses String.fromCharCode.apply(null, code) internally. It just takes care not to cause a RangeError when the string is large.
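
The RangeError mentioned above can occur because apply spreads every array element into a separate function argument, and engines limit how many arguments a call may take. One common way around this is to convert the array in chunks. The sketch below shows that general technique only; it is not the library's actual implementation, and the function name and the default chunk size of 8192 are assumptions chosen for illustration.

```javascript
// Hypothetical helper: build a string from character codes in chunks so
// that no single apply() call receives too many arguments.
function codesToStringChunked(codes, chunkSize) {
  chunkSize = chunkSize || 8192; // assumed default, not taken from the library
  var parts = [];
  for (var i = 0; i < codes.length; i += chunkSize) {
    // slice() works for both plain arrays and typed arrays
    parts.push(String.fromCharCode.apply(null, codes.slice(i, i + chunkSize)));
  }
  return parts.join('');
}

// Usage: a large input is processed without one huge apply() call.
var big = new Array(300000).fill(0x41);
console.log(codesToStringChunked(big).length); // 300000
```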

@huynh-duc-mulodo
Author

Thanks for reminding me of the potential problem. I should use the library's method, with a small change:

function arrayToString(uintArray) {
    var encodedString = Encoding.codeToString(uintArray);
    var decodedString = decodeURIComponent(escape(encodedString));
    return decodedString;
}
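
Since escape is a legacy function, a possible alternative for turning UTF-8 bytes into a string is the standard TextDecoder API. This is only a sketch, assuming an environment that provides TextDecoder (modern browsers and current Node.js versions do); the function name is made up for this example and is not part of the library.

```javascript
// Hypothetical alternative to decodeURIComponent(escape(...)):
// decode an array of UTF-8 byte values into a JavaScript string.
function utf8BytesToString(bytes) {
  return new TextDecoder('utf-8').decode(Uint8Array.from(bytes));
}

// Usage: the UTF-8 bytes of "観音" (U+89B3, U+97F3)
var decoded = utf8BytesToString([0xE8, 0xA6, 0xB3, 0xE9, 0x9F, 0xB3]);
console.log(decoded.length); // 2
```

Unlike the approach above, this does not build an intermediate binary string, so it avoids the argument-count concern as well.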
