Having problem with Chinese Characters in Windows environment #329
Can you print out your |
Here it is:
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: i386-w64-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=Chinese_People's Republic of China.936
[2] LC_CTYPE=Chinese_People's Republic of China.936
[3] LC_MONETARY=Chinese_People's Republic of China.936
[4] LC_NUMERIC=C
[5] LC_TIME=Chinese_People's Republic of China.936

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] tools_3.0.2

And the result was generated with slidify 0.3.3 |
It seems to work with the latest version of slidify. I checked online using the slidify playground at http://slidify.github.io/playground. Make sure to remove the line with You can install the latest version of slidify and slidifyLibraries by running devtools::install_github(c('slidify', 'slidifyLibraries'), 'ramnathv') Before you slidify your deck, make sure to delete the |
I hit the same problem after installing the latest version with your code. Since Linux and OS X handle Chinese text without trouble, it is not surprising that the slidify playground works. But is the slidify playground running under Windows? I suspect the way it deals with UTF8 and GBK is the main problem. |
You are right. I believe the issue is a combination of Windows + Encoding. Let me see if I can test under Windows and get back on this. |
Most Chinese users are suffering from it because Windows is still the most popular OS in China. A lot of users would benefit from fixing this issue :) |
Can you try this @hetong007 ? It runs the slidify(knit("index.Rmd", encoding = 'GBK'), knit_deck = FALSE) |
I used that code on the I also tried |
Okay. Let me try to isolate the problem here. If you run |
This is what I got from running it on the
This is what I got from running it on the
|
You need to explicitly pass the encoding to |
Sorry, but the result still remains the same :( |
Okay. Can you save your Rmd file and provide me a link to it? Don't copy paste it as I want to ensure that it is saved with the correct encoding. Since you are having trouble using |
@yihui is not a Windows user, maybe he chose to ignore those errors before :( Here is a repo I just created with the Rmd files |
Chinese programmers suffer from encoding-related problems every day. Thank you and good luck! :) |
I think I know what the problem is, but it will take me a while to find out where the character encoding got messed up. The encoding of this page https://github.com/hetong007/temp_files/blob/master/index-GBK.html is not UTF-8, but it contains the spec @hetong007 I rarely use Windows myself, but that does not mean I do not care about Windows users :) |
@yihui, I understand why slidify fails on this file. The The failure of <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/> I am thinking @hetong007 needs to convert the entire document to |
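A minimal sketch of the conversion idea discussed above, assuming the source file is GBK and the target is UTF-8 (so the bytes match the `charset=utf-8` declared in the meta tag). `convert_to_utf8` is a hypothetical helper written for illustration; it is not part of slidify, knitr, or markdown.

```r
# Hypothetical helper: re-encode a whole text file to UTF-8.
# Assumes the file really is in the encoding given by `from`.
convert_to_utf8 <- function(src, dest, from = "GBK") {
  # Read the file as raw bytes so the current locale cannot interfere.
  raw_bytes <- readBin(src, what = "raw", n = file.size(src))
  # iconv() accepts a list of raw vectors and returns a character vector.
  text <- iconv(list(raw_bytes), from = from, to = "UTF-8")
  if (is.na(text)) {
    stop("could not convert '", src, "' from ", from, " to UTF-8")
  }
  # Write the UTF-8 bytes back out unchanged.
  writeBin(charToRaw(text), dest)
  invisible(dest)
}
```

For example, `convert_to_utf8("index-GBK.Rmd", "index-UTF8.Rmd")` would produce a UTF-8 copy, with both file names standing in for whatever files are actually being used.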
I'll take a look at @kohske's PR rstudio/markdown#49 and rstudio/markdown#50. The problem should be at least alleviated after the encoding problem is gone in the markdown package, although there are still other places that may have to be fixed. |
Thanks @yihui. I will look forward to these fixes. I presume that these issues are non-existent with |
FYI, here is the fix of encoding for markdown, slidify, and knitrBootstrap. The below is the test script and markdown files: kohske |
I tested the UTF8 file including GBK characters (below) and slidify works perfectly on Windows!! Note that before running |
Thanks @kohske. This is a really significant contribution as it opens up things for a large group of users. I will run through the tests and merge this weekend. Can you add yourself as a contributor in the DESCRIPTION file? |
Are you using RStudio? If yes, what version? If you can paste a screenshot of the output you get, that would be useful for me to figure out what might be going on. |
@ramnathv Okay, thanks. Note that MBCS-compatible slidify requires MBCS-compatible markdown package. |
@kohske After the code |
@hetong007 This is due to DESCRIPTION of knitrBootstrap. |
@ramnathv I am using the newest RStudio, i.e. 0.98.692. Under The output information is
d> slidify("Douban_Folksonomy-master/index.Rmd", encoding="UTF8")
processing file: index.Rmd
|.................................................................| 100%
ordinary text without R code
output file: index.md
Copying files to libraries/frameworks/io2012...
Copying files to libraries/highlighters/highlight.js...
Copying files to libraries/widgets/bootstrap...
Warning messages:
1: In readLines(con, ...) : incomplete final line found on 'index.Rmd'
2: In readLines(con, ...) : incomplete final line found on 'index.Rmd'
And the first page looks like
The second page looks like
Compared with this original version, the difference is easy to see. |
@hetong007 Obviously the libraries in the original repository are quite old. The results are the same as those generated with the newer version under Mac OS X. |
@kohske is right. I updated the default stylesheets for io2012, adding the bottle green background in the title slide and the blue color for slide titles. You can always modify it, if you prefer a different appearance of the slides. |
Thanks to @kohske for so diligently plugging away on this. Encoding issues are not the most pleasant ones to be working on, but are so critical. I will try to merge this pull request this weekend, after ensuring that it doesn't break any other features of slidify. @kohske, please add yourself as a contributor in the DESCRIPTION! |
@ramnathv I did it, thanks. |
Thanks to @kohske, I just merged in some changes that provide for better encoding support. You can install it from the
library(devtools)
install_github("ramnathv/slidify@fix-encode")
Can you install it and test if it solves the encoding issues you had mentioned here? |
This fixes everything on my system, although I am now using Win 7 instead of Win XP; I hope that doesn't matter. I created two
library(devtools)
install_github("ramnathv/slidify@fix-encode")
# setwd(...)
library(slidify)
# GBK-encoded source (CP936 is the Windows code page for GBK)
slidify('index.Rmd', encoding='CP936')
# UTF-8-encoded source
slidify('index-UTF8.Rmd', encoding='UTF8')
The result is great. |
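Since the right `encoding` value depends on how each file was saved, here is a rough sketch of how one might choose it automatically. `pick_encoding` is a hypothetical helper, not a slidify function, and it assumes that any file that is not valid UTF-8 is in the fallback encoding (CP936 here), which will not hold for every file.

```r
# Hypothetical helper: guess which encoding name to pass on.
# Returns "UTF8" when the file's bytes form valid UTF-8, else the fallback.
pick_encoding <- function(path, fallback = "CP936") {
  bytes <- readBin(path, what = "raw", n = file.size(path))
  # validUTF8() (R >= 3.3.0) checks the bytes without converting them.
  if (validUTF8(rawToChar(bytes))) "UTF8" else fallback
}
```

The returned string could then be supplied as the `encoding` argument in calls like the two above. Note that a pure ASCII file is also valid UTF-8, so it is reported as "UTF8".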
Thanks @ramnathv, everything works perfectly with Japanese_Japan.CP932 and UTF8 under Win7. |
Thanks for all your efforts! @hetong007 @ramnathv @kohske |
All credit should go to @kohske for painstakingly working on fixing encoding related issues. |
Is the fix-encode branch ready to be merged, then? |
Yes. I will be merging it this weekend, when I will be working on slidify. |
Chinese characters are encoded as UTF8 on Linux/OS X, but as GBK on Windows. Slidify currently has trouble handling the difference between UTF8 and GBK.
One can clone my repo Douban_Folksonomy to reproduce the following result. A properly generated HTML version (built under Ubuntu 12.04) is available here. I am using Windows XP, but the same problem occurs on Windows 7 as well.
Here are the first few lines in my 'index.Rmd' file:
On Windows, if my 'index.Rmd' file is encoded as UTF8, the slidify function throws an error, with unrecognized Chinese characters.
The output obviously shows different characters, and of course nobody could understand the latter one.
If I switch to GBK for the Chinese characters, the slidify function works:
But the HTML contains unrecognized characters:
Compare with the proper version:
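The UTF8/GBK mismatch described above can be illustrated outside of slidify with base R alone. This is only a sketch of the general mechanism, not a reproduction of what slidify does internally. The sample string "中文" is written with Unicode escapes so the script itself does not depend on how it is saved.

```r
# The two characters of "中文", written as escapes to stay encoding-neutral.
x <- "\u4e2d\u6587"

# The same text is stored as different bytes in the two encodings.
utf8_bytes <- charToRaw(enc2utf8(x))
gbk_bytes <- iconv(x, from = "UTF-8", to = "GBK", toRaw = TRUE)[[1]]

# GBK bytes are not a valid UTF-8 sequence, so a reader that assumes
# UTF-8 cannot recover the original characters from them.
gbk_read_as_utf8_ok <- validUTF8(rawToChar(gbk_bytes))

# Declaring the real source encoding recovers the text.
recovered_bytes <- charToRaw(iconv(list(gbk_bytes), from = "GBK", to = "UTF-8"))
```

Here `utf8_bytes` has six bytes and `gbk_bytes` has four, which is why a file saved in one encoding and read as the other ends up with unreadable characters.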