R studio doesn't have good Hindi support for typing and reading #4027

Closed
vikram-rawat opened this issue Dec 11, 2018 · 13 comments
Labels
ace bug encoding stale

Comments

@vikram-rawat vikram-rawat commented Dec 11, 2018

I opened an issue on community.rstudio.com, and they asked me to open one here.

https://community.rstudio.com/t/r-studio-doesnt-have-good-hindi-support/19721/4

I have been using RStudio for many years and never noticed this until recently, when I decided to write my book in RStudio through bookdown and chose Hindi as my language.

It turns out that when I type Hindi, the cursor moves ahead of the words, and it becomes very hard to tell which word you are on, especially when you want to delete something.

I then switched to VS Code and found it had no such problem with Hindi Unicode text. Neither did Atom. Only RStudio seems to have this problem.

Just find any Hindi text on Google, paste it in, and try to delete part of it, and you will see what I mean. Here is some random text for you.

मोहनदास करमचन्द गांधी (२ अक्टूबर १८६९ - ३० जनवरी १९४८) भारत एवं भारतीय स्वतंत्रता आंदोलन के एक प्रमुख राजनैतिक एवं आध्यात्मिक नेता थे। वे सत्याग्रह (व्यापक सविनय अवज्ञा) के माध्यम से अत्याचार के प्रतिकार के अग्रणी नेता थे, उनकी इस अवधारणा की नींव सम्पूर्ण अहिंसा के सिद्धान्त पर रखी गयी थी जिसने भारत को आजादी दिलाकर पूरी दुनिया में जनता के नागरिक अधिकारों एवं स्वतन्त्रता के प्रति आन्दोलन के लिये प्रेरित किया। उन्हें दुनिया में आम जनता महात्मा गांधी के नाम से जानती है। संस्कृत भाषा में महात्मा अथवा महान आत्मा एक सम्मान सूचक शब्द है। गांधी को महात्मा के नाम से सबसे पहले १९१५ में राजवैद्य जीवराम कालिदास ने संबोधित किया था।[1]। उन्हें बापू (गुजराती भाषा में બાપુ बापू यानी पिता) के नाम से भी याद किया जाता है। सुभाष चन्द्र बोस ने ६ जुलाई १९४४ को रंगून रेडियो से गांधी जी के नाम जारी प्रसारण में उन्हें राष्ट्रपिता कहकर सम्बोधित करते हुए आज़ाद हिन्द फौज़ के सैनिकों के लिये उनका आशीर्वाद और शुभकामनाएँ माँगीं थीं।[2] प्रति वर्ष २ अक्टूबर को उनका जन्म दिन भारत में गांधी जयंती के रूप में और पूरे विश्व में अन्तर्राष्ट्रीय अहिंसा दिवस के नाम से मनाया जाता है।

Please fix this if possible.

And there is a problem with viewing a data frame which has a column written in Hindi.

Contributor

@kevinushey kevinushey commented Dec 11, 2018

Thanks for the bug report! I've reproduced this; we'll see if we can support this in a future release of RStudio. (Currently, RStudio only supports monospaced fonts, which IIUC makes Hindi more difficult to support)
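
As a rough illustration of why a one-cell-per-character layout can struggle with Devanagari, the sketch below counts the Unicode code points in a Hindi word and flags the ones that attach to a preceding consonant instead of standing alone. The sample word and the code point range used to detect attaching marks are assumptions made for this example; this is not a diagnosis of what RStudio or Ace actually does internally.

```r
# Illustrative word only: "हिन्दी", written with \u escapes so the source stays ASCII.
word <- "\u0939\u093f\u0928\u094d\u0926\u0940"

# Unicode code points that make up the word.
cp <- utf8ToInt(word)

# Rough assumption: treat U+093A..U+094F (except U+093D, avagraha) as the
# dependent vowel signs and virama that attach to a preceding consonant.
is_mark <- cp >= 0x093A & cp <= 0x094F & cp != 0x093D

n_code_points <- length(cp)
n_marks <- sum(is_mark)

# One row per code point, showing which ones attach to a neighbour.
print(data.frame(code_point = sprintf("U+%04X", cp), attaches = is_mark))
```

The word has six code points, three of which attach to a neighbouring consonant, so it is drawn as fewer visual units than it has characters. An editor that advances the cursor by one fixed-width cell per code point would plausibly drift ahead of the rendered text, which matches the symptom described above.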

Member

@jmcphers jmcphers commented Dec 11, 2018

@kevinushey Is this issue separate from #3698?

Contributor

@kevinushey kevinushey commented Dec 11, 2018

I think it's different, since the font here isn't monospace, as opposed to being a monospace font with unicode connecting characters.

Author

@vikram-rawat vikram-rawat commented Dec 12, 2018

Thanks for looking into this. Please also keep in mind that the same thing happens with a character column containing Hindi text.

I was trying to analyze election data from India and had trouble reading the column in an R Notebook.

However, ggplot2 was able to display it perfectly well, while RStudio did not print it correctly.

@ronblum ronblum added the bug label Dec 17, 2018
Author

@vikram-rawat vikram-rawat commented Dec 29, 2018

I tried to read an image file in R today, and it turns out that I can read the result fine when it is returned as text, but not when it is returned as a data.frame.

text <- ocr(image = "image/CMB0760301.PDF_1.png", engine = 'hin')

This command produces the following output:
[image]

But if I request data frame output instead:

text_dt <- ocr_data(image = "image/CMB0760301.PDF_1.png", engine = 'hin')

[image]

Please help me out if possible.

This is the image:

[image: cmb0760301 pdf_1]

Please let me know how to fix it.

@SandeepShaw2017 SandeepShaw2017 commented Jun 26, 2019

I am facing a similar issue when writing Hindi text to a text file. The output looks like this:

U+0968><U+096C>; <U+0916><U+0940> <U+0939> _ 3" -“/ <U+0967>-<U+096B>. > 9

Please suggest how to write a file in Hindi from R.
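
One possible approach, sketched here under assumptions: `<U+....>` escapes like the ones above typically appear when R cannot represent the characters in the session's native encoding, which is common on Windows with a non-UTF-8 locale. Writing the text explicitly as UTF-8 bytes may avoid that. The sample strings and the temporary file path below are hypothetical, and this has not been checked against the setup in the comment above.

```r
# Hypothetical sample text: "नमस्ते" and "भारत", written with \u escapes.
hindi <- c("\u0928\u092e\u0938\u094d\u0924\u0947",
           "\u092d\u093e\u0930\u0924")

# Hypothetical output location; replace with a real path as needed.
path <- tempfile(fileext = ".txt")

# Convert to UTF-8 and write the bytes unchanged (useBytes = TRUE), so R does
# not try to re-encode the text into the native locale encoding.
con <- file(path, open = "wb")
writeLines(enc2utf8(hindi), con, useBytes = TRUE)
close(con)

# Read the file back, declaring that its contents are UTF-8.
back <- readLines(path, encoding = "UTF-8")
round_trip_ok <- identical(back, hindi)
```

If `round_trip_ok` is `TRUE`, the file holds the Hindi text as UTF-8. Any program that opens the file still needs to read it as UTF-8 for the text to display correctly.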

@KapilKhanal KapilKhanal commented May 28, 2020

Was this resolved in the new RStudio release?

Member

@jmcphers jmcphers commented May 28, 2020

It was not, sorry. Does it work as you expect in the Ace Kitchen Sink?

https://ace.c9.io/build/kitchen-sink.html

@KapilKhanal KapilKhanal commented Jun 2, 2020

I am having the same problems there. Thank you for the link; I did not know about this service before.

@stale stale bot commented Feb 5, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs, per https://github.com/rstudio/rstudio/wiki/Issue-Grooming. Thank you for your contributions.

@stale stale bot added the stale label Feb 5, 2021

@stale stale bot commented Feb 19, 2021

This issue has been automatically closed due to inactivity.

@stale stale bot closed this as completed Feb 19, 2021

@kulbhushanchand kulbhushanchand commented Feb 22, 2021

I'm facing the same problem, i.e. incorrect display of Hindi text in RStudio, which makes .Rmd files difficult to edit.
For example, the screenshot below shows non-editable markup appearing as yellow lines.

[image]

@aartimalik aartimalik commented Feb 16, 2022

Same issue. Please resolve this.


8 participants