rchardet is a character encoding auto-detection library for Ruby. It is a port of the auto-detection code in Mozilla. Encoding auto-detection means taking a sequence of bytes in an unknown character encoding and attempting to determine that encoding so you can read the text. It’s like cracking a code when you don’t have the decryption key.
+This fork is compatible with ruby 1.9, and runs in production at [](
confidence = cd['confidence'] # 0.0 <= confidence <= 1.0
+Encoding Detection Strategy
+rchardet isn’t a very reliable way to determine a file’s encoding and should be used as a last resort. There are plenty of ways to detect a file’s encoding before falling back on rchardet. For instance, you can read and detect the BOM (byte order mark), or look for hints in the text you’re working on: don’t the headers or footers have `charset="utf-8"` somewhere?
+You can read an [introductory blog post to our encoding detection strategy](
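The two cheaper checks mentioned above (BOM sniffing and looking for a `charset` hint) could be sketched roughly as follows. This is only an illustration, not part of rchardet: the helper names `encoding_from_bom` and `encoding_from_charset_hint` are hypothetical, and the 1024-byte window scanned for a hint is an arbitrary assumption.

```ruby
# Known byte order marks. The 4-byte UTF-32 marks come first so that
# UTF-32LE (FF FE 00 00) is not mistaken for UTF-16LE (FF FE).
BOMS = [
  ["\x00\x00\xFE\xFF".b, 'UTF-32BE'],
  ["\xFF\xFE\x00\x00".b, 'UTF-32LE'],
  ["\xEF\xBB\xBF".b,     'UTF-8'],
  ["\xFE\xFF".b,         'UTF-16BE'],
  ["\xFF\xFE".b,         'UTF-16LE']
].freeze

# Hypothetical helper: returns the encoding name implied by a leading BOM,
# or nil when the content does not start with one.
def encoding_from_bom(bytes)
  head = bytes.byteslice(0, 4).to_s.b
  match = BOMS.find { |bom, _name| head.start_with?(bom) }
  match && match[1]
end

# Hypothetical helper: looks for a declaration such as charset="utf-8"
# near the start of the content and returns the declared name, or nil.
def encoding_from_charset_hint(bytes)
  head = bytes.byteslice(0, 1024).to_s.b
  m = head.match(/charset\s*=\s*["']?([A-Za-z0-9_\-]+)/i)
  m && m[1]
end
```

Only when both helpers return `nil` would it make sense to fall back on statistical detection.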
+I suggest you open the file whose encoding you want to detect as `ASCII-8BIT`.
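+``` ruby
+# Read the raw bytes without assuming any text encoding
+file_content = File.open(self.file_path, external_encoding: 'ASCII-8BIT') { |f| f.read }
+# Returns a hash holding the detected encoding and its confidence
+encoding = CharDet.detect(file_content)
+```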
+You don’t know your file’s encoding yet, so which encoding should you open it with? Ruby defines the encoding `ASCII-8BIT`, with an alias of `BINARY`, which does not correspond to any real character encoding. It is intended for binary data and for text whose encoding is unknown.
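To make the role of `ASCII-8BIT` a bit more concrete, here is a small sketch using only Ruby's standard `String` and `Encoding` APIs. The `relabel` helper and the `RAW` constant are hypothetical names introduced for the illustration; the sample bytes are "café" as ISO-8859-1 would store it.

```ruby
# BINARY and ASCII-8BIT are two names for the same Encoding object.
SAME_ENCODING = Encoding::BINARY.equal?(Encoding::ASCII_8BIT)

# Hypothetical helper: relabels a copy of the same bytes with another
# encoding. force_encoding changes the label only, never the bytes.
def relabel(bytes, encoding_name)
  bytes.dup.force_encoding(encoding_name)
end

# "café" as ISO-8859-1 bytes; \xE9 on its own is not valid UTF-8.
RAW = relabel("caf\xE9", 'ASCII-8BIT')

# Any byte sequence is valid when labelled as binary...
RAW.valid_encoding?                      # true
# ...but the same bytes are rejected when labelled as UTF-8...
relabel(RAW, 'UTF-8').valid_encoding?    # false
# ...and decode cleanly once the right source encoding is known.
relabel(RAW, 'ISO-8859-1').encode('UTF-8')
```

Reading as binary therefore postpones the decision: nothing is lost or rejected until you pick an encoding.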
+Once you’ve detected the encoding, you can convert the content to UTF-8:
+``` ruby
+converter = Encoding::Converter.new(encoding['encoding'].upcase, "UTF-8")
+utf8_content = converter.convert(file_content)
+```
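Because detection is only a guess, it may be worth checking the reported confidence before converting. The sketch below is one possible way to do that, not rchardet's own API: `to_utf8` is a hypothetical helper, the `0.5` threshold is an arbitrary assumption, and `detection` is assumed to be a hash with `'encoding'` and `'confidence'` keys like the one `CharDet.detect` returns.

```ruby
# Hypothetical helper: converts raw bytes to UTF-8 using a detection
# result, or returns nil when the guess should not be trusted.
# `detection` is assumed to look like
#   { 'encoding' => 'ISO-8859-1', 'confidence' => 0.73 }
def to_utf8(bytes, detection, min_confidence = 0.5)
  name = detection && detection['encoding']
  confidence = detection ? detection['confidence'].to_f : 0.0
  return nil if name.nil? || confidence < min_confidence

  source = begin
    Encoding.find(name)
  rescue ArgumentError
    return nil # Ruby does not know an encoding by this name
  end

  # Label the bytes with the detected encoding, then transcode.
  # Bytes that cannot be converted are replaced instead of raising.
  bytes.dup.force_encoding(source).encode('UTF-8', invalid: :replace, undef: :replace)
end
```

A caller might use it as `to_utf8(file_content, CharDet.detect(file_content))` and treat a `nil` result as "encoding unknown, ask the user or keep the bytes as they are".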
Running tests

