source_codepage autodetect with enca program. #1838

Closed
mc-butler opened this issue Nov 20, 2009 · 29 comments
Labels: area: core, prio: medium

Comments

@mc-butler

Important

This issue was migrated from Trac:

Origin https://midnight-commander.org/ticket/1838
Reporter ASM (@BASM)
Keywords enca, codepage, encoding, autodetect, source_codepage

Hello.

I wrote a stupid patch to autodetect source_codepage. It works on open/view/edit events.

It uses the enca package (http://gitorious.org/enca, for fedora: https://admin.fedoraproject.org/pkgdb/packages/name/enca)

Autodetecting source_codepage is very convenient. Please don't ignore it.
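
For readers who have not used enca, the idea of the patch can be sketched roughly as follows. This is an illustration in Python, not the patch itself (the attached patch is not shown here). The function names `run_enca` and `detect_codepage` are hypothetical; the sketch assumes enca's `-i` option (print the iconv name of the detected charset) and `-L` option (language hint).

```python
import subprocess
from typing import Callable, Optional, Sequence


def run_enca(argv: Sequence[str]) -> Optional[str]:
    """Run the command and return its stdout, or None if it is missing or fails."""
    try:
        result = subprocess.run(list(argv), capture_output=True, text=True, check=False)
    except OSError:
        # enca is not installed or cannot be executed.
        return None
    if result.returncode != 0:
        # Any non-zero exit status is treated as "not detected" in this sketch.
        return None
    return result.stdout


def detect_codepage(
    path: str,
    language: Optional[str] = None,
    runner: Callable[[Sequence[str]], Optional[str]] = run_enca,
) -> Optional[str]:
    """Ask enca for the charset of `path`; return None when nothing usable comes back."""
    argv = ["enca", "-i"]
    if language:
        argv += ["-L", language]
    argv.append(path)
    output = runner(argv)
    if not output:
        return None
    name = output.strip()
    return name or None
```

The `runner` parameter exists only so the sketch can be exercised without enca installed; a caller would use the returned name as source_codepage when opening, viewing, or editing the file.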

Note

Original attachments:

@mc-butler

Changed by andrew_b (@aborodin) on Nov 20, 2009 at 11:45 UTC (comment 1)

  • Milestone changed from 4.7.0 to 4.7

@mc-butler

Changed by ASM (@BASM) on Nov 20, 2009 at 13:37 UTC

Fix: if the codepage is not detected, don't show a warning.

@mc-butler

Changed by osgx (osgxdvyg@….com) on Nov 24, 2009 at 10:44 UTC (comment 2)

There must be a way to disable auto-detection or to manually select the codepage if enca fails to get it right.

@mc-butler

Changed by slavazanko (@slavaz) on Nov 24, 2009 at 10:57 UTC (comment 3)

Yes, right. But this is an enhancement (not a bugfix or code cleanup). Therefore this ticket is not for the '4.7.0' milestone. Just wait :)

@mc-butler

Changed by ASM (@BASM) on Nov 24, 2009 at 11:45 UTC (comment 4)

I tested my patch and made a few notes:

  • The MC menu needs these additions:
    • Enca on/off
    • Enca language
    • Maybe a default source_codepage encoding (used if enca fails)
  • Don't use mc.ext for enca; it is unnecessary. The 'file' program is often wrong about utf-8 files and reports them as 'data'. If enca is on, it needs to be run for all source files. If enca is wrong, display_codepage (or another) needs to be set.
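
The three proposed settings (on/off, language, fallback) suggest a simple decision order. A minimal sketch, assuming hypothetical names (`pick_source_codepage`, `detect`, `fallback`) and assuming that "off" means the current codepage is left untouched:

```python
from typing import Callable, Optional


def pick_source_codepage(
    enca_enabled: bool,
    detect: Callable[[], Optional[str]],
    fallback: Optional[str],
) -> Optional[str]:
    """Return the codepage to use, or None to keep the current selection."""
    if not enca_enabled:
        # Assumption: with autodetection off, nothing is changed.
        return None
    detected = detect()
    if detected:
        return detected
    # Detection failed, so use the configured default (which may itself be None).
    return fallback
```

Here `detect` stands for whatever calls enca with the configured language; passing it in as a callable keeps the decision logic separate from the external program.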

Any other ideas?

There must be a way to disable auto-detection or to manually select the codepage if enca fails to get it right.

Is't work. (F9->Command->Encoding)

---
I'm not a good coder, but I'm working on that. I'll try to rewrite the patch.
Sorry for my English, but I'm working on that too. :-)

@mc-butler

Changed by slavazanko (@slavaz) on Nov 24, 2009 at 11:52 UTC (comment 5)

ASM: do you have your own public git repo? For example, on http://github.com/ or on
http://repo.or.cz/

It is better to develop in your own branch, because then you will be the maintainer of your idea and anybody can send patches to you instead of... :)

@mc-butler

Changed by ASM (@BASM) on Nov 26, 2009 at 13:03 UTC (comment 6)

Hello, folks!

I added enca support (autodetect). A new option needs to be added to the ini file.
What should it be named? How should it be used?

For example, something like:

  • codepage_autodetect= (empty): do not use enca
  • codepage_autodetect=ru: use enca, with the language set to Russian
  • codepage_autodetect=off: do not use enca
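
The three proposed values could be interpreted along these lines. This is only a sketch of the proposal as written in this comment, not of what was eventually merged; `AutodetectSetting` and `parse_codepage_autodetect` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AutodetectSetting:
    enabled: bool
    language: Optional[str] = None  # language hint passed to enca, e.g. "ru"


def parse_codepage_autodetect(value: Optional[str]) -> AutodetectSetting:
    """Map the raw ini value to a setting: empty or 'off' disables enca."""
    text = (value or "").strip().lower()
    if text in ("", "off"):
        return AutodetectSetting(enabled=False)
    # Any other value is taken as the enca language.
    return AutodetectSetting(enabled=True, language=text)
```

Treating a missing key the same as an empty value keeps autodetection opt-in, which matches the request earlier in the thread for a way to disable it.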

@mc-butler

Changed by ASM (@BASM) on Nov 26, 2009 at 13:56 UTC (comment 7)

codepage_autodetect -> autodetect_codeset

@mc-butler

Changed by ASM (@BASM) on Nov 27, 2009 at 9:50 UTC (comment 8)

I released it at http://github.com/BASM/mc-basm/tree/ASM/1838_enca.

  • added autodetect parameter,
  • added enca support,
  • added default codepage, used if enca fails.

@mc-butler

Changed by slavazanko (@slavaz) on Nov 27, 2009 at 10:55 UTC

@mc-butler

Changed by slavazanko (@slavaz) on Nov 27, 2009 at 10:56 UTC

Please use a consistent coding style.

@mc-butler

Changed by ASM (@BASM) on Nov 27, 2009 at 14:53 UTC (comment 9)

  • Code style fixed,
  • Bugs fixed.

I think language autodetection is done.

@mc-butler

Changed by ASM (@BASM) on Nov 27, 2009 at 14:57 UTC (comment 10)

s/language/encoding/

@mc-butler

Changed by ASM (@BASM) on Nov 30, 2009 at 10:35 UTC

Fixed version.

@mc-butler

Changed by ASM (@BASM) on Nov 30, 2009 at 10:38 UTC (comment 11)

My branch is published at: http://github.com/BASM/mc-basm/tree/1838_enca

@mc-butler

Changed by slavazanko (@slavaz) on Feb 5, 2010 at 13:11 UTC (comment 12)

  • Severity changed from no branch to on review
  • Status changed from new to accepted
  • Owner set to slavazanko
  • Version changed from 4.7.0-pre4 to master

Created & rebased branch 1838_add_enca_support

Initial [b5b119ffac2bca5edb30aaec98a32b6fa9bab031]

Review, please.

@mc-butler

Changed by slavazanko (@slavaz) on Feb 5, 2010 at 13:14 UTC (comment 13)

  • Votes set to slavazanko

@mc-butler

Changed by angel_il (@ilia-maslakov) on Feb 5, 2010 at 13:48 UTC (comment 14)

Changeset: [71d58780d127cc8d2426e00b40f8ffc0a4854c9d] (forced update)

@mc-butler

Changed by andrew_b (@aborodin) on Feb 5, 2010 at 18:08 UTC (comment 15)

  • Votes slavazanko deleted

The code was fixed and the documentation was updated. Please review again.
[dd93a53afc6ec190a95ef6d83b9a72e9cea5f3e4] -- code
[cc432bd67f4c2edfecd39b60e22f6158d5127005] -- documentation

@mc-butler

Changed by slavazanko (@slavaz) on Feb 8, 2010 at 13:44 UTC (comment 16)

  • Votes set to slavazanko

@mc-butler

Changed by andrew_b (@aborodin) on Feb 8, 2010 at 13:46 UTC (comment 17)

  • Severity changed from on review to approved
  • Votes changed from slavazanko to slavazanko andrew_b

@mc-butler

Changed by slavazanko (@slavaz) on Feb 8, 2010 at 13:51 UTC (comment 18)

  • Resolution set to fixed
  • Status changed from accepted to testing
  • Severity changed from approved to merged
  • Votes changed from slavazanko andrew_b to committed-master

Merged into master: [ac60804]

@mc-butler

Changed by slavazanko (@slavaz) on Feb 8, 2010 at 13:52 UTC (comment 19)

  • Status changed from testing to closed

@mc-butler

Changed by andrew_b (@aborodin) on May 17, 2010 at 7:27 UTC (comment 20)

  • Resolution fixed deleted
  • Milestone changed from 4.7 to 4.7.3
  • Type changed from enhancement to defect
  • Status changed from closed to reopened
  • Severity changed from merged to on review
  • Votes committed-master deleted

Charset autodetection is partially broken in recent master (4.7.2-46-g7843203).
In my 8-bit KOI8-R locale, the UTF-8 charset is not autodetected in either the editor or the viewer. 8-bit charsets are autodetected.
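
To illustrate why UTF-8 can be told apart from 8-bit charsets such as KOI8-R at all, here is a small sketch using a strict UTF-8 decode. It is not the fix made for this ticket (that code is not shown in this thread); `looks_like_utf8` is a hypothetical helper.

```python
def looks_like_utf8(data: bytes) -> bool:
    """True if data is valid UTF-8 and contains at least one non-ASCII byte."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        # Typical 8-bit text (e.g. KOI8-R Cyrillic) is not valid UTF-8.
        return False
    # Pure ASCII decodes fine but says nothing about the charset.
    return any(byte >= 0x80 for byte in data)
```

For example, the Russian word "Привет" encoded as UTF-8 passes this check, while the same word encoded as KOI8-R fails the strict decode.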

To fix this issue, the 1838_codeset_autodetect_fix branch was created. Parent branch is master.
Initial [e053e1d29d79d92de96d0fcdaae1cfd231d6bcd9]

@mc-butler

Changed by andrew_b (@aborodin) on May 17, 2010 at 7:28 UTC (comment 21)

  • Owner changed from slavazanko to andrew_b
  • Status changed from reopened to accepted

@mc-butler

Changed by slavazanko (@slavaz) on May 17, 2010 at 7:46 UTC (comment 22)

  • Votes set to slavazanko

@mc-butler

Changed by angel_il (@ilia-maslakov) on May 27, 2010 at 9:50 UTC (comment 23)

  • Severity changed from on review to approved
  • Votes changed from slavazanko to slavazanko angel_il

@mc-butler

Changed by andrew_b (@aborodin) on May 27, 2010 at 10:03 UTC (comment 24)

  • Severity changed from approved to merged
  • Resolution set to fixed
  • Votes changed from slavazanko angel_il to committed-master
  • Status changed from accepted to testing

Merged to master.
[12969b4]

git log --pretty=oneline 6117c5b..12969b4

@mc-butler

Changed by andrew_b (@aborodin) on May 27, 2010 at 10:03 UTC (comment 25)

  • Status changed from testing to closed
