Changes to the XMLRPC system to support UTF-8 encoded problems #956

Merged
mgage merged 3 commits into openwebwork:WeBWorK-2.15 from the add_lang_to_xmlrpc branch on Jun 26, 2019

Conversation

taniwallach
Member

Changes to the XMLRPC system to allow it to handle UTF-8 encoded problems.

Rendering already works. However, submitting UTF-8 encoded text that is not English (Latin1) as form values (in particular, as answers) does not work yet. Because of this, the values set on the submit buttons are currently forced back into English.

clients/sendXMLRPC.pl was extended to take a language setting as an argument.

A sample Hebrew UTF-8 problem was added as clients/t/test-utf8-hebrew.pg to allow testing. On my system, both of the following commands give reasonable behavior, so long as no special characters are used in the answers:

  • ./sendXMLRPC.pl -b -f simple t/test-utf8-hebrew.pg
  • ./sendXMLRPC.pl -b -l heb -f simple t/test-utf8-hebrew.pg

lib/WebworkClient.pm contains code to set the HTML language and text direction both at the level of the "page" and at the level of the DIV element enclosing a problem (based on data set in the PG file).

lib/WeBWorK/Localize.pm received a new method getLangHandle() used by lib/WebworkClient.pm to get access to maketext.

lib/WeBWorK/Utils/DetermineProblemLangAndDirection.pm was extended to allow forcing options not available via a request hash when the XMLRPC system is rendering problems.

Small related changes and some cleanup work were done to several of the formats in lib/WebworkClient/, including changes to support UTF-8, to standardize the title setting, and to update the copyright dates. Additional settings were added to the <form> element and to the "problem-content" DIV element to be more similar to what is generated by the regular ContentGenerator system. forcePortNumber was added as a new hidden field. The value settings of the submit buttons are set by variables so they can get localized text. At present, this is not in active use, as UTF-8 values render properly but then cause problems when the buttons are clicked and the non-English values are sent as form data.

This PR also includes changes to allow forcing a port number to be used by the XMLRPC system. This is intended to help when using the XMLRPC system with a Docker container or a host which requires using a port number for the site_url and form_action_url. This is managed by setting the value of forcePortNumber in the credentials and/or in the address used to access html2xml.
Ex: http://localhost:8080/webwork2/html2xml?&answersSubmitted=0&language=en&sourceFilePath=some_pg_file.pg&problemSeed=123567890&displayMode=MathJax&courseID=daemon_course&userID=daemon&course_password=daemon&outputformat=simple&forcePortNumber=8080
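
As a rough illustration of what forcing a port number onto a URL such as site_url or form_action_url could involve, here is a minimal sketch. The helper name `apply_forced_port` is hypothetical and is not taken from the PR; the actual code in lib/WebworkClient.pm may handle this quite differently. The sketch only covers plain host names, not IPv6 literals.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the PR): add or replace the port of an
# http/https URL. Assumes a plain host name; IPv6 literals are not handled.
sub apply_forced_port {
    my ($url, $port) = @_;

    # Leave the URL untouched when no usable port number was supplied.
    return $url unless defined $port && $port =~ /^\d+$/;

    # Keep scheme and host, drop any existing port, then append the forced one.
    $url =~ s{^(https?://[^/:]+)(?::\d+)?}{${1}:${port}};
    return $url;
}

print apply_forced_port('http://localhost/webwork2/html2xml', 8080), "\n";
```

A URL that already carries a port gets that port replaced, and an undefined or non-numeric forcePortNumber leaves the URL unchanged.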

@taniwallach
Member Author

Pay attention to #955 and the workaround given there when trying to test the XMLRPC system with the 2.15 branch.

@taniwallach
Member Author

Although this works fine for rendering, submitting answers with non-Latin characters causes trouble.

Since that would not have worked at all before the UTF-8 support was added, leaving that issue unresolved by this PR is still significant progress on getting XMLRPC working with foreign languages.

@taniwallach
Member Author

taniwallach commented Jun 21, 2019

Some thoughts on the issue with non-English characters in answers / form field data:

  1. WW usually decodes the incoming form data using Encode::decode_utf8() in lib/WeBWorK/Request.pm.
  2. I am guessing that this data is already in Perl's unicode internal format when it is being put into an internal XMLRPC call sent as an internal "web" request from the first stage processing code to the second stage processing code. When that happens
  3. My guess is that we need to either prevent the form data from being passed through Encode::decode_utf8() as it enters the first stage, or pass it through Encode::encode_utf8() again before it is passed from the first stage to the second stage.
    • I was not successful yet in determining how and where either of these approaches would occur in the code (but did try a few initial things with no success).

At present I see two error messages when submitting non-English in an answer (using the simple theme):

  1. Wide character in subroutine entry at /usr/local/share/perl/5.22.1/XMLRPC/Lite.pm line 181.
  2. Instead of the problem text:
    - Unable to decode problem text
    - xmlrpcCall to renderProblem returned no result for PG-FILENAME
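
For reference, "Wide character in subroutine entry" is the message Perl gives when a byte-oriented routine is handed a string containing characters above 255. The sketch below reproduces it with MIME::Base64 purely as a stand-in; whether line 181 of XMLRPC/Lite.pm calls this particular routine is an assumption that has not been checked here. It also shows that encoding to UTF-8 bytes first avoids the error.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);
use MIME::Base64 qw(encode_base64);

# The Hebrew answer from the log, as a decoded (character) string.
my $answer = "\x{05E2}\x{05D2}\x{05DB}\x{05E2}";

# Handing characters above 255 to a byte-oriented routine dies.
my $ok    = eval { encode_base64($answer); 1 };
my $error = $ok ? '' : $@;
print "failed: $error" unless $ok;

# Encoding to UTF-8 bytes first gives the routine what it expects.
my $encoded = encode_base64(encode_utf8($answer), '');
print "base64 of the UTF-8 bytes: $encoded\n";
```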

If I hack the form to change the form value for outputformat to debug before submitting, and have the following line in lib/WebworkClient.pm uncommented (to generate the debug output):

       # my $pretty_print_self  = pretty_print($self);

I can see that the form data is in 3 locations:

  1. Stored directly in the %inputs_ref hash under the "field name".
  2. Stored directly in the %request_object hash under the "field name".
  3. Buried in a copy of the %inputs_ref hash which is stored inside %envir hash which is inside the %request_object hash.
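
Since the same value sits at several depths, any re-encoding before the internal call would probably have to walk the whole structure. A minimal sketch of that idea follows. `encode_deep` is a hypothetical helper, not existing WeBWorK code, and the toy hash only imitates the layout described above. It assumes every plain string is a decoded character string; applying it to bytes that are already UTF-8 would double-encode them.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);

# Hypothetical helper: deep-copy a nested structure, encoding every plain
# string value to UTF-8 bytes. Hash keys are left alone; objects and other
# reference types are passed through unchanged.
sub encode_deep {
    my ($data) = @_;
    my $type = ref($data);

    if ($type eq 'HASH') {
        my %copy;
        $copy{$_} = encode_deep($data->{$_}) for keys %$data;
        return \%copy;
    }
    if ($type eq 'ARRAY') {
        return [ map { encode_deep($_) } @$data ];
    }
    return $data if $type || !defined $data;
    return encode_utf8($data);
}

# Toy stand-in for the locations listed above.
my $answer = "\x{05E2}\x{05D2}\x{05DB}\x{05E2}";
my %request_object = (
    AnSwEr0002 => $answer,
    envir      => { inputs_ref => { AnSwEr0002 => $answer } },
);

my $wire_ready = encode_deep(\%request_object);
printf "top level: %d bytes, nested: %d bytes\n",
    length($wire_ready->{AnSwEr0002}),
    length($wire_ready->{envir}{inputs_ref}{AnSwEr0002});
```

Working on a copy keeps the decoded values available to the first-stage code, at the cost of duplicating the structure for each call.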

In /var/log/apache2/error.log inside the Docker container running these calls I see lines like:

WebworkClient.pm 294 xmlrpcCall sent to http://localhost
WebworkClient.pm 295 xmlrpcCall issued with command renderProblem
[Fri Jun 21 06:45:17.674642 2019] [perl:warn] [pid 2031] [client 172.18.0.1:54960] [/webwork2/html2xml] Use of uninitialized value in join or string at /opt/webwork/webwork2/lib/WebworkClient.pm line 296., referer: http://localhost:8080/webwork2/html2xml
[Fri Jun 21 06:45:17.674674 2019] [perl:warn] [pid 2031] [client 172.18.0.1:54960] [/webwork2/html2xml] Use of uninitialized value in join or string at /opt/webwork/webwork2/lib/WebworkClient.pm line 296., referer: http://localhost:8080/webwork2/html2xml
WebworkClient.pm 296 input is: MuLtIaNsWeR_AnSwEr0002_1  MaTrIx_MuLtIaNsWeR_AnSwEr0002_1_0_1  session_key E6ZolnuBUg4Y7FqfUPxe5BCNKnhk0NnN MaTrIx_AnSwEr0002_0_2 עגכע MaTrIx_AnSwEr0002_1_0  problem_state HASH(0x5587be7cc3c0) WWsubmit Check Answers MaTrIx_MuLtIaNsWeR_AnSwEr0002_1_1_0  extra_packages_to_load ARRAY(0x5587be813368) courseName daemon_course envir HASH(0x5587bd982b90) outputformat simple command renderProblem MaTrIx_MuLtIaNsWeR_AnSwEr0002_1_2_2  source  displayMode MathJax MaTrIx_AnSwEr0002_1_1  showSummary 1 MaTrIx_MuLtIaNsWeR_AnSwEr0002_1_0_2  previous_AnSwEr0002  pathToProblemFile  problem-result-score 0.1 problemSource  AnSwEr0001 B0 MaTrIx_AnSwEr0002_0_1  problemSeed 123567890 MaTrIx_AnSwEr0002_2_1  courseID daemon_course userID daemon MaTrIx_AnSwEr0002_1_2  MaTrIx_AnSwEr0002_2_2  course_password daemon modules_to_evaluate ARRAY(0x5587be5e61e8) forcePortNumber 8080 mode  course  library_name Library MaTrIx_MuLtIaNsWeR_AnSwEr0002_1_2_1  MaTrIx_MuLtIaNsWeR_AnSwEr0002_1_2_0  answersSubmitted 1 AnSwEr0002  answer_form_submitted 1 language heb sourceFilePath SplitAsSymAntisym.pg MaTrIx_MuLtIaNsWeR_AnSwEr0002_1_1_2  MaTrIx_MuLtIaNsWeR_AnSwEr0002_1_1_1  MaTrIx_AnSwEr0002_2_0 
WebworkClient.pm 297 xmlrpcCall renderProblem initiated webwork webservice object XMLRPC::Lite=HASH(0x5587be7b5b48)
There were a lot of errors
Errors: 
 Wide character in subroutine entry at /usr/local/share/perl/5.22.1/XMLRPC/Lite.pm line 181.

 End Errors
xmlrpcCall to renderProblem returned no result for SplitAsSymAntisym.pg

outputting sensitive information when pretty_print_self is allowed.
We block specific keys from CourseEnvironment as well as the entire
expected seed_ce.
@taniwallach
Member Author

taniwallach commented Jun 21, 2019

I added a second commit to the PR to sanitize the output of the debug format when

     # my $pretty_print_self  = pretty_print($self);

is uncommented. The code explicitly prevents a range of sensitive fields in any included CourseEnvironment from being output, and blocks the entire seed_ce CourseEnvironment from being output.
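
The general shape of such filtering can be sketched as below. This is not the code from the commit: the helper name `sanitize`, the `[removed]` marker, and the key names in `%blocked` are all hypothetical, chosen only to show the idea of blocking individual keys as well as a whole sub-structure such as seed_ce.

```perl
use strict;
use warnings;

# Hypothetical block list; the keys actually filtered by the commit are
# not listed in this thread.
my %blocked = map { ($_ => 1) } qw(database_password session_key course_password seed_ce);

# Return a deep copy of a nested hash in which every blocked key, including
# a whole sub-hash such as seed_ce, is replaced by a marker string.
sub sanitize {
    my ($data) = @_;
    return $data unless ref($data) eq 'HASH';

    my %clean;
    for my $key (keys %$data) {
        $clean{$key} = $blocked{$key} ? '[removed]' : sanitize($data->{$key});
    }
    return \%clean;
}

# Toy object standing in for $self.
my %self = (
    courseID        => 'daemon_course',
    course_password => 'daemon',
    seed_ce         => { dbLayout => 'example' },
    envir           => { session_key => 'abc', displayMode => 'MathJax' },
);

my $safe = sanitize(\%self);
print join(', ', map { "$_=" . (ref $safe->{$_} ? 'HASH' : $safe->{$_}) } sort keys %$safe), "\n";
```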

@mgage
Sponsor Member

mgage commented Jun 26, 2019

I see the following minor error messages when I run this:

Unknown PerlIO layer "uft8" at /Volumes/WW_test/opt/webwork/webwork2/../pg//lib/PGcore.pm line 34.
"our" variable @path_list redeclared at ./sendXMLRPC.pl line 406.
"my" variable $credentials_string masks earlier declaration in same scope at ./sendXMLRPC.pl line 408.
"our" variable $UNIT_TESTS_ON redeclared at ./sendXMLRPC.pl line 575.
home directory .

I see you noticed these as well, Tani. Can you fix them in your pull request, or would you like me to fix them?

Otherwise this looks good.

@taniwallach
Member Author

@mgage - I will fix the minor issues in a different PR, together with the fix to #955, to keep the PR mostly focused on UTF-8 support, and some small additional features to the XMLRPC system.

taniwallach added a commit to taniwallach/webwork2 that referenced this pull request Jun 26, 2019
…t.pm

to fix the main problem reported in:
    openwebwork#955
as well as the warnings mentioned there, which were also mentioned
in openwebwork#956 but which are not
related to the main theme of that PR.
@mgage mgage merged commit 5c5de53 into openwebwork:WeBWorK-2.15 Jun 26, 2019
@taniwallach
Member Author

The minor issues were just fixed in

Those PRs also fix #955

@taniwallach taniwallach deleted the add_lang_to_xmlrpc branch June 26, 2019 20:22
@taniwallach
Member Author

See:
Fixed by #962 and #963 for branches develop and WeBWorK-2.15.

The issue about Unknown PerlIO layer "uft8" was fixed in https://github.com/openwebwork/pg/pull/413 for the PG-2.15 branch, and openwebwork/pg#414 was just made to put the same fix in PG develop.
