Invalid UTF-8 in some language files. #1

Open
NathanGibbs3 opened this issue Feb 14, 2019 · 2 comments
Labels: bug (Something isn't working), invalid (This doesn't seem right), LCB-TechDebt (Issue exists in Legacy Code Base. We inherited it.), Prod (Observed in Production Environment.), Translation (Issues related to Language Translation Data)


@NathanGibbs3
Owner

NathanGibbs3 commented Feb 14, 2019

The following files in the languages directory contain byte sequences that are not valid UTF-8.

languages/czech.lang.php
languages/danish.lang.php
languages/finnish.lang.php
languages/french.lang.php
languages/italian.lang.php
languages/norwegian.lang.php
languages/russian.lang.php
languages/swedish.lang.php
languages/turkish.lang.php

This was breaking code coverage report submissions.
They failed with "Malformed UTF-8 characters, possibly incorrectly encoded" errors.
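
One way to locate the offending bytes is to attempt a strict UTF-8 decode and record where it fails. The following is a minimal Python sketch, not the tooling used in this repository; the function names `find_invalid_utf8` and `scan_language_dir` and the `*.lang.php` glob are assumptions made for illustration.

```python
from pathlib import Path


def find_invalid_utf8(data: bytes):
    """Return (offset, byte) pairs for every position where strict UTF-8 decoding fails."""
    problems = []
    pos = 0
    while pos < len(data):
        try:
            data[pos:].decode("utf-8")
            break  # the remainder decodes cleanly
        except UnicodeDecodeError as err:
            bad = pos + err.start  # err.start is relative to the slice
            problems.append((bad, data[bad]))
            pos = bad + 1  # skip the bad byte and keep scanning
    return problems


def scan_language_dir(directory):
    """Map file name -> list of invalid bytes for each *.lang.php file (hypothetical layout)."""
    report = {}
    for path in sorted(Path(directory).glob("*.lang.php")):
        problems = find_invalid_utf8(path.read_bytes())
        if problems:
            report[path.name] = problems
    return report
```

Calling `scan_language_dir("languages")` would return an empty dict once every file decodes cleanly, which could serve as a check before re-enabling coverage for the excluded files.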

Depends on: #11 Dependency Type: Soft

@NathanGibbs3 NathanGibbs3 added the bug and help wanted labels Feb 14, 2019
@NathanGibbs3
Owner Author

The listed files have been excluded from code coverage until they can be fixed.

NathanGibbs3 added a commit that referenced this issue Feb 15, 2019
NathanGibbs3 added a commit that referenced this issue Feb 15, 2019
NathanGibbs3 added a commit that referenced this issue Feb 15, 2019
NathanGibbs3 added a commit that referenced this issue Feb 26, 2019
@NathanGibbs3 NathanGibbs3 added the invalid label Mar 5, 2019
NathanGibbs3 added a commit that referenced this issue Mar 7, 2019
NathanGibbs3 added a commit that referenced this issue Mar 7, 2019
NathanGibbs3 added a commit that referenced this issue Mar 7, 2019
NathanGibbs3 added a commit that referenced this issue Mar 7, 2019
NathanGibbs3 added a commit that referenced this issue Mar 7, 2019
NathanGibbs3 added a commit that referenced this issue Mar 7, 2019
20190213 Code Coverage #1
@NathanGibbs3 NathanGibbs3 added the LCB-TechDebt label Mar 11, 2019
@NathanGibbs3 NathanGibbs3 reopened this May 14, 2019
@NathanGibbs3
Owner Author

Will most likely use Enca to try to iron this out after Issue #11 is closed.

NathanGibbs3 added a commit that referenced this issue May 29, 2019
          File(s): includes/base_lang.inc.php
            Class: UILang
Data Structure(s): UILang->Caps
                   Language Capitalization flag.
                   Dependent on Spacing.
   Language files: All
                   Migration of 1 to 1 word TD item $UI_AD_PWD to
                   $UI_CW_Pw for use in cascading translation.
                   Finnish, Swedish, Turkish
                   Replaced invalid UTF-8 character F6 with
                   multibyte C3B6.
             Code: Support UILang & Legacy translation data formats.
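
The F6 to C3B6 replacement above matches what you get when a single ISO-8859-1 byte is re-encoded as UTF-8: code points from U+0080 to U+07FF become a lead byte `110xxxxx` and a trail byte `10yyyyyy`. The sketch below assumes the stray byte is a Latin-1 code point; `latin1_byte_to_utf8` is a hypothetical helper, not code from the repository.

```python
def latin1_byte_to_utf8(byte: int) -> bytes:
    """Re-encode one ISO-8859-1 high byte (0x80-0xFF) as its two-byte UTF-8 form."""
    if not 0x80 <= byte <= 0xFF:
        raise ValueError("expected a single byte in the range 0x80-0xFF")
    lead = 0xC0 | (byte >> 6)     # 110xxxxx: top bits of the code point
    trail = 0x80 | (byte & 0x3F)  # 10yyyyyy: low six bits of the code point
    return bytes([lead, trail])


for b in (0xF6, 0xE4, 0xE5):
    # Cross-check the bit arithmetic against Python's own codecs.
    assert latin1_byte_to_utf8(b) == bytes([b]).decode("latin-1").encode("utf-8")

print(latin1_byte_to_utf8(0xF6).hex().upper())  # C3B6
```

This only holds when the original file really was Latin-1. For other single-byte charsets the same byte can stand for a different character, so the target sequence differs.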
NathanGibbs3 added a commit that referenced this issue May 30, 2019
                   Partial fix for Issue #1 in czech TD.
                   Misc. Output fixes.

          File(s): includes/base_lang.inc.php
            Class: UILang
Data Structure(s): UILang->CPA
   Language files: Support additional UILang capabilities. 😄
                   Replaced invalid UTF-8 character ED with
                   multibyte C3AD.
             Code: Support UILang & Legacy TD formats.
                   Minor output fixes.
@NathanGibbs3 NathanGibbs3 moved this from Needs triage to Low priority in Translation Data Migration / UILang Class Development. May 30, 2019
NathanGibbs3 added a commit that referenced this issue Jun 27, 2019
                   Closes #30
                   Partial fixes for Issue #1 TD for czech,
                   swedish, & turkish.
                   Misc. Output fixes.

          File(s): includes/base_lang.inc.php
            Class: UILang
      Function(s): Phrase()
                   Add flag to use &nbsp; spacing alternative when
                   needed.
Data Structure(s): UILang->CPA
                   UILang->CWA
   Language files: Support additional UILang capabilities. 😄
                   Replaced invalid UTF-8 character E1 with
                   multibyte C3A1.
                   Migrate 1 to 1 word translation Items from
                   UILang->CPA to UILang->CWA. Closes #30
             Code: Support UILang & Legacy TD formats.
                   Minor output fixes.
NathanGibbs3 added a commit that referenced this issue Jun 28, 2019
                   Complete fix for Issue #1 TD for finnish.
                   Partial fixes for Issue #1 TD for czech,
                   swedish, & turkish.
                   Misc. Output fixes.

          File(s): includes/base_lang.inc.php
            Class: UILang
Data Structure(s): UILang->CPA
   Language files: Support additional UILang capabilities. 😄
                   Replaced invalid UTF-8 character E4 with
                   multibyte C3A4.
                   Replaced invalid UTF-8 character FD with
                   multibyte C3BD.
                   Replaced invalid UTF-8 character FE with
                   multibyte C3BE.
             Code: Support UILang & Legacy TD formats.
                   Minor output fixes.
NathanGibbs3 added a commit that referenced this issue Jun 29, 2019
                   Partial fixes for Issue #1 TD for czech,
                   french & turkish.
                   Misc. Output fixes.

          File(s): includes/base_lang.inc.php
            Class: UILang
Data Structure(s): UILang->CPA
   Language files: Support additional UILang capabilities. 😄
                   Replaced invalid UTF-8 character E9 with
                   multibyte C3A9.
                   Replaced invalid UTF-8 character FC with
                   multibyte C3BC.
             Code: Support UILang & Legacy TD formats.
                   Minor output fixes.
NathanGibbs3 added a commit that referenced this issue Jul 1, 2019
                   Partial fix of Issue #1 TD for french,
                   italian, & turkish.
                   Misc. Output fixes.

          File(s): includes/base_lang.inc.php
            Class: UILang
Data Structure(s): UILang->CWA
                   Laid out some index elements for future TD Items.
   Language files: Support additional UILang capabilities. 😄
                   Replaced invalid UTF-8 character E0 with
                   multibyte C3A0.
                   Replaced invalid UTF-8 character D6 with
                   multibyte C396.
             Code: Support UILang & Legacy TD formats.
                   Minor output fixes.
NathanGibbs3 added a commit that referenced this issue Jul 1, 2019
                   Converted russian TD file from windows-1251
                   charset to UTF-8 charset via enca.
                   enconv -L ru -x UTF-8 russian.lang.php
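
For readers without enca installed, a whole-file conversion like this one can be approximated in Python when the source charset is already known. This is a sketch under that assumption. It does no charset detection (which enca does), and `convert_cp1251_to_utf8` and `convert_file` are hypothetical names.

```python
from pathlib import Path


def convert_cp1251_to_utf8(raw: bytes) -> bytes:
    """Decode bytes as windows-1251 and re-encode them as UTF-8."""
    # Strict decoding raises UnicodeDecodeError on bytes that cp1251 leaves undefined.
    return raw.decode("cp1251").encode("utf-8")


def convert_file(path) -> None:
    """Rewrite a file in place; assumes it is windows-1251 and has not been converted yet."""
    p = Path(path)
    p.write_bytes(convert_cp1251_to_utf8(p.read_bytes()))
```

Running `convert_file` twice on the same file would double-encode it, so it is worth checking that the file fails a strict UTF-8 decode before converting.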
@NathanGibbs3 NathanGibbs3 added the Prod label and removed the help wanted label Jul 1, 2019
NathanGibbs3 added a commit that referenced this issue Jul 2, 2019
                   Partial fix of Issue #1 TD for danish,
                   & norwegian.
                   Misc. Output fixes.

          File(s): includes/base_lang.inc.php
            Class: UILang
Data Structure(s): UILang->CWA
   Language files: Support additional UILang capabilities. 😄
                   Replaced invalid UTF-8 character E6 with
                   multibyte C3A6.
             Code: Support UILang & Legacy TD formats.
                   Minor output fixes.
        Update(s): TD Migration Matrix.
                   Unit Tests.
NathanGibbs3 added a commit that referenced this issue Jul 3, 2019
                   _MARCG, _APRIL, _MAY, _JUNE, _JULY, _AUGUST,
                   _SEPTEMBER, _OCTOBER, _NOVEMBER, _DECEMBER.
                   Partial fix of Issue #1 TD for czech, french,
                   & italian.
                   Misc. Output fixes.

          File(s): includes/base_lang.inc.php
            Class: UILang
Data Structure(s): UILang->CWA
                   UILang->ILocale
                   Auto Generated Locale.
      Function(s): Class Init
                   Auto Generate Long/Short Month Names via Intl with
                   fall back to installed Locale(s), then to TD file
                   in either UILang or legacy format. 😄
   Language files: Support additional UILang capabilities. 😄
                   Replaced invalid UTF-8 character C8 with
                   multibyte C388.
                   Replaced invalid UTF-8 character DA with
                   multibyte C39A.
                   Replaced invalid UTF-8 character E8 with
                   multibyte C3A8.
             Code: Support UILang & Legacy TD formats.
                   Minor output fixes.
        Update(s): TD Migration Matrix.
                   Unit Tests.
NathanGibbs3 added a commit that referenced this issue Jul 4, 2019
                   Partial fix of Issue #1 TD for czech,
                   danish, & norwegian.
                   Misc. Output fixes.

          File(s): includes/base_lang.inc.php
            Class: UILang
Data Structure(s): UILang->CWA
   Language files: Support additional UILang capabilities. 😄
                   Replaced invalid UTF-8 character D8 with
                   multibyte C398.
                   Replaced invalid UTF-8 character EC with
                   multibyte C3AC.
                   Replaced invalid UTF-8 character F8 with
                   multibyte C3B8.
             Code: Support UILang & Legacy TD formats.
                   Minor output fixes.
        Update(s): TD Migration Matrix.
                   Unit Tests.
NathanGibbs3 added a commit that referenced this issue Jul 15, 2019
                   Complete fix of Issue #1 TD for norwegian.
                   Partial fix of Issue #1 TD for czech,
                   danish, french, italian, swedish, & turkish.
                   Misc. Output fixes.

          File(s): includes/base_lang.inc.php
            Class: UILang
Data Structure(s): UILang->CWA
   Language files: Support additional UILang capabilities. 😄
                   Replaced invalid UTF-8 character DD with
                   multibyte C39D.
                   Replaced invalid UTF-8 character E5 with
                   multibyte C3A5.
                   Replaced invalid UTF-8 character F9 with
                   multibyte C3B9.
             Code: Support UILang & Legacy TD formats.
                   Minor output fixes.
        Update(s): TD Migration Matrix.
                   Unit Tests.
NathanGibbs3 added a commit that referenced this issue Jul 19, 2019
                   Partial fix of Issue(s) #1 & #56 for czech TD.
                   Misc. Output fixes.

          File(s): includes/base_lang.inc.php
            Class: UILang
Data Structure(s): UILang->CWA
   Language files: Support additional UILang capabilities. 😄
                   Issue #1 Czech
                   Replaced invalid UTF-8 character AE with
                   multibyte C5BD.
                   Replaced invalid UTF-8 character B9 with
                   multibyte C5A1.
                   Replaced invalid UTF-8 character BE with
                   multibyte C5BE.
                   Issue #56 Czech
                   Replace incorrectly converted ISO-8859-2 character
                   C8 to UTF-8 C388 with C48C.
                   Replace incorrectly converted ISO-8859-2 character
                   D8 to UTF-8 C398 with C598.
                   Replace incorrectly converted ISO-8859-2 character
                   F9 to UTF-8 C3B9 with C5AF.
             Code: Support UILang & Legacy TD formats.
                   Minor output fixes.
        Update(s): TD Migration Matrix.
                   Unit Tests.
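
The Issue #56 replacements above undo a conversion that treated ISO-8859-2 bytes as ISO-8859-1. The round trip can be sketched in Python: map each character back to its Latin-1 byte, then decode that byte as ISO-8859-2. The helper name `repair_latin1_misconversion` is hypothetical, and this is not necessarily how the files were actually fixed.

```python
def repair_latin1_misconversion(text: str, intended: str = "iso-8859-2") -> str:
    """Reinterpret characters that came from a Latin-1 decode of `intended`-encoded bytes."""
    repaired = []
    for ch in text:
        try:
            # Recover the original single byte, then decode it with the intended charset.
            repaired.append(ch.encode("latin-1").decode(intended))
        except UnicodeEncodeError:
            # Not representable in Latin-1, so it was not produced by the bad conversion.
            repaired.append(ch)
    return "".join(repaired)


fixed = repair_latin1_misconversion("\u00c8")  # U+00C8, UTF-8 C388
print(fixed.encode("utf-8").hex().upper())     # C48C
```

Characters on which the two charsets agree pass through unchanged, which is why only some bytes needed a second fix. The approach cannot tell a wrongly converted character from a legitimately intended Latin-1 one, so a human review of the result is still needed.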
NathanGibbs3 added a commit that referenced this issue Apr 26, 2020
                   Partial fix of Issue(s) #1 & #56 for Czech TD.
                   Partial fix of Issue #56 for Turkish TD.
                   Misc. Output fixes.

          File(s): includes/base_lang.inc.php
            Class: UILang
Data Structure(s): UILang->CWA
   Language files: Support additional UILang capabilities. 😄
                   Issue #1 Czech
                   Replaced invalid UTF-8 character EF with
                   multibyte C48F.
                   Replaced invalid UTF-8 character F3 with
                   multibyte C3B3.
                   Replaced invalid UTF-8 character FA with
                   multibyte C3BA.
                   Issue #56 Czech
                   Replace incorrectly converted ISO-8859-2 character
                   E8 to UTF-8 with C48D.
                   Replace incorrectly converted ISO-8859-2 character
                   EC to UTF-8 with C49B.
                   Replace incorrectly converted ISO-8859-2 character
                   F8 to UTF-8 with C599.
                   Issue #56 Turkish
                   Replace incorrectly converted ISO-8859-2 character
                   FD to UTF-8 with C4B1.
             Code: Support UILang & Legacy TD formats.
                   Minor output fixes.
        Update(s): TD Migration Matrix.
                   Unit Tests.
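
Which UTF-8 sequence is correct for a stray byte depends on the charset the file was originally written in. As an illustration, the sketch below decodes the byte FD under three ISO-8859 variants using Python's standard codecs. The FD to C4B1 mapping recorded for Turkish is the one ISO-8859-9 produces, while ISO-8859-1 and ISO-8859-2 both yield C3BD. `utf8_hex_for` is a hypothetical helper.

```python
def utf8_hex_for(byte: int, charset: str) -> str:
    """Decode one byte in `charset` and return the hex of its UTF-8 encoding."""
    return bytes([byte]).decode(charset).encode("utf-8").hex().upper()


for charset in ("iso-8859-1", "iso-8859-2", "iso-8859-9"):
    print(charset, utf8_hex_for(0xFD, charset))
# iso-8859-1 C3BD
# iso-8859-2 C3BD
# iso-8859-9 C4B1
```

The same check applied to EF gives C48F under ISO-8859-2, matching the Czech replacement listed above.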
@NathanGibbs3 NathanGibbs3 self-assigned this Apr 28, 2020
@NathanGibbs3 NathanGibbs3 added the Translation label May 15, 2020
@NathanGibbs3 NathanGibbs3 added this to the 1.4.7 milestone Oct 17, 2020