Skip to content

Conversation

krizex
Copy link
Member

@krizex krizex commented Apr 24, 2017

This PR aim to fix a potential bug of WLB in xapi.
For more details please refer to CA-250858 and check the two test cases that created to reproduce and verify the fix of this issue.
Signed-off-by: Yang Qian yang.qian@citrix.com

@krizex
Copy link
Member Author

krizex commented Apr 24, 2017

@jonludlam could you please have a look and merge it if possible?

@lindig
Copy link
Contributor

lindig commented Apr 24, 2017

Please expand the log message what the problem was and why this is fixing it. The ticket doesn't contain it either.

if send_len > 0 then
begin
send (String.sub s 0 send_len);
Buffer.add_substring recv_buf s send_len end_str_len
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, no data was added to recv_buf but now it is. Could you explain why? The same is happening below at Buffer.add_string recv_buf s when |s| <= |end_str|.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will update the ticket to add more info about the scenario.

Copy link
Member Author

@krizex krizex Apr 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we receive the end_str in two recv operations, we should not miss it.
For example, we receive the data as following:

  1. <XmlDataSet>......</XmlData
  2. Set><OtherTag>....</OtherTag>
    Actually we receive the end_str </XmlDataSet>, but the old code miss it.

begin
let s = Buffer.contents recv_buf in
(* debug "%s %d" s !recv_state; *)
debug "state:%d, content len: %d, contents: %s" !recv_state (String.length s) s;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency, a space could be added after state:.

| [] ->
send s;
Buffer.clear recv_buf
(* send data as much as possiable, just keep the length of `end_str` data to avoid broken tag *)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

possible* ... also send as much data as possible would be better here I think

@krizex
Copy link
Member Author

krizex commented Apr 24, 2017

@lindig
Issue details update on the ticket description, please navigate to CA-250858 for more details.
Thanks.

@krizex
Copy link
Member Author

krizex commented Apr 24, 2017

Code updated per your comments. @lindig @Frezzle thanks.

@lindig
Copy link
Contributor

lindig commented Apr 24, 2017

I think this tries to fix a problem resulting from bad design: trying to parse a XML file that is read in blocks with some hand-crafted scanner. The problem at hand is parsing at block boundaries. I have little confidence that this will really solve the issue. It would be better to implement the functionality with either an XML parser, or an automaton-based scanner (generated with OCamlLex or Menhir) that takes care of buffering.

@krizex
Copy link
Member Author

krizex commented Apr 24, 2017

@lindig
The original author left comments in the file says

"The response could potentially be large (megabytes), so we have to stream to avoid OCaml's 16MB string limit. "

So I think we cannot implement this parse work with an XML parser since it may need more than 16M buffer.

Regarding the state machine, it just has 3 states and works fine these years, so I think it's not complicated enough that we should generate the state machine with lex&yacc.

@jonludlam
Copy link
Contributor

The 16M limit is no longer relevant now we've switched to a 64 bit dom0.

@krizex
Copy link
Member Author

krizex commented Apr 24, 2017

@jonludlam If so, I think we can refine the parser code in wlb_reports.ml with xml parser.

BTW, I'm interested in the 16M limit, can you give more details about it?

@jonludlam
Copy link
Contributor

There's a bit here about it:

https://rwmj.wordpress.com/2009/08/05/ocaml-internals-part-2-strings-and-other-types/

I think we should stop worrying about the 16MiB thing and get rid of these functions and use the standard parser. I'd much rather see a pull request for that than to accept this one.

@lindig
Copy link
Contributor

lindig commented Apr 24, 2017

OCaml uses 10 header bits, leaving 22 bits on a 32-bit architecture and 54 bits on a 64-bit architecture. These (22, 54) bits are the number of words. On 32-bit, a word is 4 bytes and on 64-bit, 8 bytes. So on 32-bits you have 2^22*4=2^24 maximum string length but 2^54*8=2^57 on 64-bit.

@krizex
Copy link
Member Author

krizex commented Apr 27, 2017

Hi @lindig @jonludlam @Frezzle , I have refined the code by using XML parser instead of the handwritten state machine to parse the XML response. Regression test passed in Job 1752871 and 1752872. Please help review this PR, thanks.

Copy link
Contributor

@lindig lindig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to see a commit message that explains what problem this re-factoring solves.

let send_state = ref 1 in
let entity = ref "" in

let fill () =
Copy link
Contributor

@lindig lindig Apr 27, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need a fixed-size string to read input? Would using Buffer not be better such that we read all input into a Buffer.t value and grow it as needed? I'm not familiar with the Xml parser but can't the XML parser read a file directly? My understanding was that using a fixed-size buffer created the problem of parsing across buffer boundaries in the first place.

The code is fine. The 16k block is only used to build the Buffer with the entire input and the parser only sees the final result, not the blocks.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will refine the commit message.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit message has been refined. Please have a look @lindig

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to suggest the following commit message:

This commit refactors the parsing of WLB XML messages. 

* It avoids a problem with the previous disign: XML was read in blocks and parsing it could fail at block boundaries
* Parsing of XML is now delegated to an XML parser.
* The "send" function no longer escapes special characters and emits the message as is

…k report

from WLB response was cut
1. Refactor parser code. Replace hand-written state machine with OCaml XML
parser
2. Refine the code of `send`, since XML parser has escape the content, there is
no need to manually escape the content in `send`

Signed-off-by: Yang Qian <yang.qian@citrix.com>
Copy link
Contributor

@lindig lindig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this code passes the tests, I'm happy to accept it. Using an XML parser improved the design considerably.

@jonludlam
Copy link
Contributor

Yup, this looks a lot nicer now.

@lindig lindig merged commit d7f780b into xapi-project:master May 2, 2017
@krizex krizex deleted the CA-250858 branch May 16, 2017 07:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants