
Releases: Joungkyun/libchardet

1.0.6

13 May 20:13

Security Issues

In 1.0.5 and earlier, the chardet and chardet_r APIs could access freed heap memory (heap-use-after-free). This has been fixed (#18). Thanks to @gaoxiang-ut.

Changes:

  • fixed #9 configure.ac needs subdir-objects

  • fixed #10 autogen failure caused by AM_PROG_AR with automake 1.11.1

  • fixed #12 No include guard

  • fixed #13 bom member has been added to the DetectObj structure

    • New unicode language model : BOCU-1, GB-18030, SCSU, UTF-1, UTF-7, UTF-EBCDIC
    diff --git a/src/chardet.h b/src/chardet.h
    index 84975a3..f603a37 100644
    --- a/src/chardet.h
    +++ b/src/chardet.h
    @@ -89,6 +89,7 @@ extern "C" {
        typedef struct DetectObject {
            char * encoding;
            float confidence;
    +       short bom;
        } DetectObj;
    
        CHARDET_API char * detect_version (void);
    
    #ifdef CHARDET_BOM_CHECK
        printf ("#1 %s : %s : %f : %d\n", string, obj->encoding, obj->confidence, obj->bom);
    #else
        printf ("#1 %s : %s : %f\n", string, obj->encoding, obj->confidence);
    #endif
  • fixed #14 short EUC-KR text could not be detected

  • fixed #15 support automake style 'make check'

  • fixed #18 SECURITY! Invalid memory access (heap-use-after-free) (@gaoxiang-ut)
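
The notes for #13 do not say how the bom member is computed. As a rough illustration of what a byte order mark check involves, here is a minimal sketch in C. The helper name sketch_bom_length is hypothetical and is not part of libchardet; it only covers the UTF-8, UTF-16 and UTF-32 marks, not the other encodings listed above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT part of libchardet: returns the length in bytes
 * of a Unicode byte order mark at the start of buf, or 0 if none is found. */
size_t sketch_bom_length(const unsigned char *buf, size_t len)
{
    static const unsigned char utf32be[] = { 0x00, 0x00, 0xFE, 0xFF };
    static const unsigned char utf32le[] = { 0xFF, 0xFE, 0x00, 0x00 };
    static const unsigned char utf8[]    = { 0xEF, 0xBB, 0xBF };
    static const unsigned char utf16be[] = { 0xFE, 0xFF };
    static const unsigned char utf16le[] = { 0xFF, 0xFE };

    assert(buf != NULL || len == 0);

    /* UTF-32 is tested before UTF-16 because the UTF-32LE mark
     * (FF FE 00 00) begins with the UTF-16LE mark (FF FE). */
    if (len >= 4 && memcmp(buf, utf32be, 4) == 0) return 4;
    if (len >= 4 && memcmp(buf, utf32le, 4) == 0) return 4;
    if (len >= 3 && memcmp(buf, utf8, 3) == 0)    return 3;
    if (len >= 2 && memcmp(buf, utf16be, 2) == 0) return 2;
    if (len >= 2 && memcmp(buf, utf16le, 2) == 0) return 2;
    return 0;
}
```

A caller could skip the returned number of bytes before handing the rest of the buffer to a detector. Whether libchardet does anything similar internally is an assumption, not something these notes confirm.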

1.0.5

11 May 15:37

Changes:

  • #8 fixed: UTF-16/32 could not be detected.

    • This was a binary-safety problem.
    • To solve it, the detect_r and detect_handledata_r APIs were added.
    • The CHARDET_BINARY_SAFE constant indicates whether detect_r and detect_handledata_r are supported.
    #ifdef CHARDET_BINARY_SAFE
            if ( detect_r (str[i], strlen (str[i]), &obj) == CHARDET_OUT_OF_MEMORY )
    #else
            if ( detect (str[i], &obj) == CHARDET_OUT_OF_MEMORY )
    #endif
            {
                fprintf (stderr, "On handle processing, occurred out of memory\n");
                return CHARDET_OUT_OF_MEMORY;
            } 
    
    #ifdef CHARDET_BINARY_SAFE
            if ( detect_handledata_r (&d, str[i], strlen (str[i]), &obj) == CHARDET_OUT_OF_MEMORY )
    #else
            if ( detect_handledata (&d, str[i], &obj) == CHARDET_OUT_OF_MEMORY )
    #endif
            {
                fprintf (stderr, "On handle processing, occurred out of memory\n");
                return CHARDET_OUT_OF_MEMORY;
            }
  • Merged uchardet's improvements

    • #6 fixed extended character range on EUC-KR and EUC-TW
      • can detect CP949 (for example, "똠방각하", "뷁")
      • can detect extended EUC-TW ("灣,是,台" and so on)
    • #2, #5 Improve single-byte charset detection confidence algorithm
    • #4 New single-byte language model
      • Arabic
      • Danish
      • Esperanto
      • German
      • Spanish
      • Turkish
      • Vietnamese
  • #3 Update language model of Greek, Hungarian and Thai

  • fixed wrong macro in man pages (martin.gansser@gmail.com)
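
The binary-safety problem behind #8 comes from NUL-terminated string handling: UTF-16 and UTF-32 text usually contains zero bytes, so an API that takes only a char pointer stops reading at the first one. The sketch below shows how much of a buffer such an API would see. The helper name sketch_visible_length is hypothetical and is not part of libchardet.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT part of libchardet: returns how many of the
 * len bytes in buf a NUL-terminated string API would read, that is,
 * the offset of the first zero byte, or len if there is none. */
size_t sketch_visible_length(const char *buf, size_t len)
{
    const char *nul;

    assert(buf != NULL || len == 0);

    nul = len ? (const char *)memchr(buf, '\0', len) : NULL;
    return nul ? (size_t)(nul - buf) : len;
}
```

For "Hi" encoded as UTF-16LE (bytes 48 00 69 00), the helper returns 1 although the buffer holds 4 bytes. Passing the real byte count as an explicit length argument, as detect_r and detect_handledata_r accept, avoids this truncation.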

1.0.4

23 May 00:23
  • fixed duplicated path on chardet.pc @8ddca58f
  • Windows support
    • support windows native library with MinGW @215952ce
    • support Code::Blocks project @215952ce
    • fixed build error and shared library on cygwin @b201aa5

1.0.3

23 May 00:24
  • add chardet.pc (Lee ByungYoung <darklin20@gamil.com>) @2fd56976
  • applied automake @0fdd1e28
  • add English man page @da00ca24
  • fixed comparison in JpCntx.cpp @7f624fb2

1.0.2

23 May 00:31
  • support visibility attribute on gcc4 @1f2d6ec5
  • add version information api @d42e4c01

1.0.1

23 May 00:30
  • fixed wrong detection of TIS-620 charset @f828c136