Introduction

Compact Encoding Detection (CED for short) is a library written in C++ that scans the given raw bytes and detects the most likely text encoding.

Basic usage:

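#include <cstring>  // strlen

#include "compact_enc_det/compact_enc_det.h"

const char* text = "Input text";
bool is_reliable;    // output: whether the detector is confident in its result
int bytes_consumed;  // output: how many input bytes were examined

Encoding encoding = CompactEncDet::DetectEncoding(
        text, static_cast<int>(strlen(text)),
        nullptr, nullptr, nullptr,  // no URL, HTTP charset, or meta charset hints
        UNKNOWN_ENCODING,           // no encoding hint
        UNKNOWN_LANGUAGE,           // no language hint
        CompactEncDet::WEB_CORPUS,  // treat the input as web text
        false,                      // do not ignore 7-bit mail encodings
        &bytes_consumed,
        &is_reliable);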
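
The basic usage above measures the input with strlen, which stops at the first zero byte. Raw bytes read from a file (UTF-16 text, for example) can contain zero bytes, so it is usually safer to pass an explicit length. The following is a minimal sketch of that idea, not part of the library: ReadFileBytes, DetectEncodingId, and the HAVE_CED macro are hypothetical names introduced here, and the __has_include guard (C++17) is only an assumption that lets the snippet compile even when the CED header is not on the include path.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Include CED only when its header is available (assumption: C++17 or a
// compiler that supports __has_include). HAVE_CED is a hypothetical macro.
#if defined(__has_include)
#  if __has_include("compact_enc_det/compact_enc_det.h")
#    include "compact_enc_det/compact_enc_det.h"
#    define HAVE_CED 1
#  endif
#endif

// Hypothetical helper: reads a whole file as raw bytes. Binary mode avoids
// newline translation; a missing file yields an empty string.
std::string ReadFileBytes(const std::string& path) {
  std::ifstream in(path, std::ios::binary);
  return std::string(std::istreambuf_iterator<char>(in),
                     std::istreambuf_iterator<char>());
}

// Hypothetical wrapper: returns the numeric value of the detected encoding,
// or -1 when the CED header was not found at compile time.
int DetectEncodingId(const std::string& bytes, bool* is_reliable) {
#ifdef HAVE_CED
  int bytes_consumed = 0;
  Encoding encoding = CompactEncDet::DetectEncoding(
      bytes.data(), static_cast<int>(bytes.size()),  // explicit length, not strlen
      nullptr, nullptr, nullptr,
      UNKNOWN_ENCODING,
      UNKNOWN_LANGUAGE,
      CompactEncDet::WEB_CORPUS,
      false,
      &bytes_consumed,
      is_reliable);
  return static_cast<int>(encoding);
#else
  (void)bytes;
  *is_reliable = false;
  return -1;
#endif
}
```

A caller would read the file once with ReadFileBytes, pass the result to DetectEncodingId, and check is_reliable before trusting the returned value.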

How to build

You need CMake to build the package. After unzipping the source code, run autogen.sh to build everything automatically. The script also downloads the Google Test framework, which is needed to build the unit tests.

$ cd compact_enc_det
$ ./autogen.sh
...
$ bin/ced_unittest

On Windows, run cmake . to download the test framework and generate project files for Visual Studio.

D:\packages\compact_enc_det> cmake .
