UT_SONAR_TEST_REPORTER: wrong/missing encoding information #676

Closed
pesse opened this Issue May 21, 2018 · 8 comments


pesse commented May 21, 2018

At the moment, the charset encoding declared in the result XML of ut_sonar_test_reporter is incorrect (or missing?).

See utPLSQL/utPLSQL-cli#78

We should add the charset encoding based on the current session's charset setting.
We should also check other reporters that might be affected by a charset mismatch or missing charset information.


lwasylow commented May 25, 2018

Hi @pesse, did you check this on the develop branch? For 3.1.2, the failure message was changed to no longer use a CDATA section; the message is now converted to XML instead. I wonder if that will help.


pesse commented May 25, 2018

Can you point to the commit?


lwasylow commented May 25, 2018

It's in PR #669.


pesse commented May 25, 2018

To be honest, I can't see how this would solve the problem.


jgebal commented May 25, 2018

It will not help when there are special characters in, for example, the test description.


jgebal commented May 25, 2018

The trouble is, I can't find any useful info on how to detect the current encoding and put it into the XML.
I think there are 2 ways to move on from here:

  1. Encode everything in UTF-8, so that we don't need to specify the encoding. That is an issue for characters beyond UTF-8.
  2. Encode in the current session encoding, but I have no idea how to get that encoding in a format compatible with the values XML expects for encoding.

jgebal commented Jun 4, 2018

After some digging around here is what I've found.

Resources

Info on the allowed and valid values for encoding in XML is here.
That page links to iana.org, which has the list of all character-set names.

Converting between Oracle and IANA names

Searching Google for an Oracle-to-IANA lookup led me to the utl_i18n.map_charset function.

With that function we can convert an Oracle character-set name to its IANA name.

select utl_i18n.map_charset(value) iana_charset,
       value oracle_charset
  from v$nls_valid_values x
  where parameter = 'CHARACTERSET';

We could now use the query below to get the XML/HTML encoding based on the database character set.

select utl_i18n.map_charset(value), value from v$nls_parameters where parameter = 'NLS_CHARACTERSET';

The trouble, however, is that we don't know the client-side encoding.
NLS_LANG fundamentals says:

Note:
SELECT USERENV ('language') FROM DUAL; gives the session's <Language>_<territory> but the DATABASE character set not the client, so the value returned is not the client's complete NLS_LANG setting!

The client character set part of NLS_LANG is NOT shown in any system table or view.
On Windows you have two possible options, normally the NLS_LANG is set in the registry, but it can also be set in the environment, however this is not often done and generally not recommended to do so. The value in the environment takes precedence over the value in the registry and is used for ALL Oracle_Homes on the server if defined as a system environment variable.

Conclusions?

We need to take one of 2 actions:

  • provide NLS_LANG from the client to utPLSQL, so that reporters can be updated to generate the proper XML/HTML encoding based on the client
  • add the encoding on the client side, using utl_i18n.map_charset to convert from the Oracle to the IANA character-set name.

Note
Not all Oracle character-sets have a corresponding mapping.
For the character-sets that have no corresponding IANA name, see: select value from v$nls_valid_values where parameter = 'CHARACTERSET' and utl_i18n.map_charset(value) is null;
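
A minimal sketch of how the missing-mapping case from the note could be handled when building an XML declaration. The function name xml_declaration_for is hypothetical and not part of utPLSQL; the sketch assumes that utl_i18n.map_charset yields NULL for unmapped names, as the query in the note implies, and that leaving the encoding attribute out is an acceptable fallback.

```sql
-- Hypothetical helper for illustration only; not part of utPLSQL.
create or replace function xml_declaration_for(a_oracle_charset in varchar2)
  return varchar2
is
  l_iana_name varchar2(100);
begin
  -- Map the Oracle character-set name to its IANA name.
  -- The context and flag arguments are the documented defaults, spelled out.
  l_iana_name := utl_i18n.map_charset(
                   a_oracle_charset,
                   utl_i18n.generic_context,
                   utl_i18n.oracle_to_iana
                 );

  if l_iana_name is null then
    -- Assumption: with no IANA name available, omit the encoding attribute
    -- rather than write a value XML parsers would not recognize.
    return '<?xml version="1.0"?>';
  end if;

  return '<?xml version="1.0" encoding="' || l_iana_name || '"?>';
end;
/

-- Usage: build the declaration for the database character set.
select xml_declaration_for(value) as xml_declaration
  from v$nls_parameters
 where parameter = 'NLS_CHARACTERSET';
```

Keep in mind that an XML document without an encoding declaration (and without a byte order mark) is read as UTF-8 by conforming parsers, so the fallback is only safe if the output really is UTF-8. It also still reflects the database character set, not the client's.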


pesse commented Jun 7, 2018

New plan after ongoing discussion in utPLSQL/utPLSQL-cli#78:

  • Create new parameter a_encoding in ut_runner.run
  • Fix XML-reporters to write starting xml-tag <?xml version="1.0" encoding="$a_encoding"?>
  • Fix HTML-reporters to write <meta charset="$a_encoding"> in head section

I guess the latter two tasks will require adding a new parameter to the ut_output_reporter type. This might also need additional care in the java-api for compatibility.
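
A rough sketch of what the header generation in the plan above might look like. Only the parameter name a_encoding comes from the plan; the package encoding_header_sketch and both function names are hypothetical, and the real reporters may be structured quite differently. The sketch also assumes the caller passes a valid IANA name, since no validation is done.

```sql
-- Hypothetical helpers for illustration only; names are not taken from utPLSQL.
create or replace package encoding_header_sketch as
  function xml_header(a_encoding in varchar2 := null) return varchar2;
  function html_meta(a_encoding in varchar2 := null) return varchar2;
end encoding_header_sketch;
/

create or replace package body encoding_header_sketch as

  -- XML declaration; the encoding attribute is only written when provided.
  function xml_header(a_encoding in varchar2 := null) return varchar2 is
  begin
    return '<?xml version="1.0"'
        || case when a_encoding is not null
                then ' encoding="' || a_encoding || '"'
           end
        || '?>';
  end xml_header;

  -- Meta tag for the HTML head section; NULL when no encoding was provided.
  function html_meta(a_encoding in varchar2 := null) return varchar2 is
  begin
    return case when a_encoding is not null
                then '<meta charset="' || a_encoding || '">'
           end;
  end html_meta;

end encoding_header_sketch;
/

-- Usage (run with server output enabled):
begin
  dbms_output.put_line(encoding_header_sketch.xml_header('ISO-8859-1'));
  dbms_output.put_line(encoding_header_sketch.html_meta('ISO-8859-1'));
end;
/
```

Defaulting a_encoding to NULL keeps existing callers working unchanged, which may ease the compatibility concern mentioned for the java-api.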

jgebal added this to the 3.1.2 milestone Jun 12, 2018

jgebal closed this in #697 Jun 12, 2018
