Improve Error Reporting for Schema Validation #116
Write down a plan
Add notes about replacing transmute
Add notes about whether calling xmlResetError is needed
Revert "Add notes about replacing transmute" This reverts commit 7e7dab4.
I've also spent some time working on a solution and came up with the same approach. I will test your patch and send you more comments if needed.
/// Human-readable informative error message
pub fn message(&self) -> &str {
I think we should leave this function in place for at least a couple of revisions, for backwards compatibility.
src/error.rs
Outdated
fn drop(&mut self) {
    unsafe { bindings::xmlResetError(self.0) }
/// Returns the provided c_str as Some(String), or None if the provided pointer is null.
unsafe fn convert_to_owned(c_str: *mut c_char) -> Option<String> {
The name of the function might be clearer like this: `ptr_to_string`.
src/error.rs
Outdated
        return None;
    }

    let raw_str = CStr::from_ptr(c_str);
This is the only unsafe statement, might be more clear to have:
let raw_str = unsafe { CStr::from_ptr(c_str) };
Instead of making the whole function unsafe
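For illustration, here is a minimal sketch of what that narrower `unsafe` scope could look like, using the `ptr_to_string` name suggested earlier in this review. The body is an assumption, not the code from this PR: in particular, the lossy UTF-8 conversion and the safety contract in the comment are guesses at what the real function relies on.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

/// Returns the provided C string as Some(String), or None if the pointer is null.
fn ptr_to_string(c_str: *mut c_char) -> Option<String> {
    if c_str.is_null() {
        return None;
    }

    // SAFETY (assumed contract): a non-null pointer refers to a valid,
    // NUL-terminated C string that stays alive for the duration of this call.
    let raw_str = unsafe { CStr::from_ptr(c_str) };

    // Invalid UTF-8 sequences are replaced with U+FFFD rather than rejected.
    Some(raw_str.to_string_lossy().into_owned())
}
```

One trade-off worth noting: a non-`unsafe` signature means the compiler no longer reminds callers that the pointer must be valid, so this shape is probably best kept private to the module.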
src/error.rs
Outdated
impl StructuredError {
    /// Copies the error information stored at `error_ptr` into a new `StructuredError`
    pub fn from_raw(error_ptr: *mut bindings::_xmlError) -> Self {
        unsafe {
It might be clearer to use small `unsafe` blocks in the respective assignments instead of one big `unsafe` block.
@JDSeiler thank you for the thorough work in this PR, and thanks to @imcsk8 for the unexpected review. This is already more than the usual PR burden on this wrapper repo, I appreciate the contributions. Maybe it is a sign of the crate starting to mature. As to the questions by @JDSeiler :
Aside: What actually stuck out to me is libxml's decision to add trailing
src/error.rs
Outdated
/// StructuredError `message` field directly.
#[deprecated(since="0.3.3", note="Please use the `message` field directly instead.")]
pub fn message(&self) -> &String {
    self.message.as_ref().unwrap()
To keep the original `&str` type, I think this is a workable variation:
pub fn message(&self) -> &str {
self.message.as_deref().unwrap_or("")
}
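A self-contained sketch of why this variation compiles, assuming a cut-down `StructuredError` with only an `Option<String>` message field (the real struct has more fields): `as_deref` borrows the stored `String` as `&str`, and the fallback `""` is a `&'static str`, so neither branch refers to a local value.

```rust
// Hypothetical, reduced version of the struct, for illustration only.
struct StructuredError {
    message: Option<String>,
}

impl StructuredError {
    fn message(&self) -> &str {
        // Option<String> -> Option<&str>, borrowing from `self`;
        // the empty-string default lives for 'static.
        self.message.as_deref().unwrap_or("")
    }
}
```

Callers that need to tell "no message" apart from "empty message" would still have to read the field directly, since this accessor collapses both cases into `""`.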
Ah excellent! I was about to push a change where the implementation was `self.message.as_ref().unwrap().as_str()`. I didn't think it was even possible to get both the `&str` return type and a default value working, since the compiler was quite mad about returning references to local values.
I'll push up that change shortly.
I'll be happy to help with maintaining. What do you need me to do?
@imcsk8 basically what you just did in reviewing here, whenever you have time/interest. I am currently available enough to ship minor releases, but I have been a little easy-going with spending time in developing the crate. Btw this is also how @triptec got added in here some time back, when he had a season of libxml work. And since this is currently a care-free open source crate you can also just sit on the rights without doing anything, it keeps the crate safer in case I'm missing for whatever reasons.
@imcsk8 and in the interest of speed, I just added you as an admin to the repo (you should get an invite). Feel free to merge this PR in as a warm-up, if the general setup sounds reasonable enough.
LGTM
I approved but also left a couple of suggestions in case you want to add them to the PR
let line = if error.line == 0 {
    None
} else {
    Some(error.line)
};
This might be too much, but here is a more idiomatic way of doing these assignments:
let line = match error.line {
    0 => None,
    _ => Some(error.line)
};
let col = if error.int2 == 0 {
    None
} else {
    Some(error.int2)
};
let col = match error.int2 {
    0 => None,
    _ => Some(error.int2)
};
Well, the `if`-`else` variant is the first snippet in the `std::option` doc page, so it's not really criminal... This one may come down to taste.
> Well, the `if`-`else` variant is the first snippet in the `std::option` doc page, so it's not really criminal... This one may come down to taste.

You're right, it's a matter of taste 👍
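For anyone comparing the options later, the variants discussed in this thread can be put side by side as below. The function names are made up for this sketch, and the third form (`Option::filter`) was not proposed in the review; it is included only as another spelling of the same "zero means absent" rule.

```rust
// `if`/`else`, as in the PR.
fn nonzero_if_else(value: i32) -> Option<i32> {
    if value == 0 {
        None
    } else {
        Some(value)
    }
}

// `match`, as in the review suggestion (binding the value instead of re-reading it).
fn nonzero_match(value: i32) -> Option<i32> {
    match value {
        0 => None,
        n => Some(n),
    }
}

// Not from the review: wrap first, then drop the zero case.
fn nonzero_filter(value: i32) -> Option<i32> {
    Some(value).filter(|&n| n != 0)
}
```

All three return the same result for every input, so the choice really is stylistic.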
Thanks @JDSeiler for the patch and thanks @dginev for the quick review and merge!!
I'm using this library heavily and I would like to help wherever I can.
Cheers!
I have some prior Rust experience, but I'd consider myself an intermediate Rust programmer at best, and I have very little experience with C and FFI. Point being, I'm very open to feedback on these changes.
Motivation
Closes #115

Previously, `StructuredError` was a thin wrapper around a raw pointer to libxml2's `xmlError` struct. This was problematic because libxml2 does not allocate separate `xmlError` structs. Instead, it uses an effectively global (or thread-global) `xmlError` that is rewritten (but does not move) every time a new error is generated. As a result, all of the errors produced by this library pointed to the same memory and had the same contents: the last error libxml2 produced.

The following code from `libxml2` was consulted to confirm its behavior:
- [...] `schannel` in libxml) [...] `to`.
- `to` is not a fresh `xmlError`, but rather a reference to the `xmlLastError` global. This either points to a file-level `xmlError` defined in `globals.c`, or a struct member that's part of a thread-specific global context. In either case, it does not appear that `xmlLastError` is ever allocated more than once.

Description of Changes
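A sketch of the before and after, in plain Rust with no libxml2 involved. Everything here is a stand-in invented for illustration: `Slot` and `LAST_ERROR` play the role of the single reused error slot, `AliasedError` mimics the old pointer wrapper, and `OwnedError` mimics a struct that copies its data at construction.

```rust
use std::cell::RefCell;

// Stand-in for the one error slot that gets overwritten on every new error.
#[derive(Default)]
struct Slot {
    message: String,
    line: i32,
}

thread_local! {
    static LAST_ERROR: RefCell<Slot> = RefCell::new(Slot::default());
}

// Simulates the library reporting a new error by rewriting the slot in place.
fn raise(message: &str, line: i32) {
    LAST_ERROR.with(|slot| {
        let mut slot = slot.borrow_mut();
        slot.message = message.to_string();
        slot.line = line;
    });
}

// Old shape: holds no data of its own and always reads the shared slot.
struct AliasedError;

impl AliasedError {
    fn message(&self) -> String {
        LAST_ERROR.with(|slot| slot.borrow().message.clone())
    }
}

// New shape: copies the slot's contents once, at construction time.
struct OwnedError {
    message: Option<String>,
    line: Option<i32>,
}

impl OwnedError {
    fn capture() -> Self {
        LAST_ERROR.with(|slot| {
            let slot = slot.borrow();
            OwnedError {
                message: Some(slot.message.clone()),
                line: if slot.line == 0 { None } else { Some(slot.line) },
            }
        })
    }
}
```

If two errors are raised in a row, an `AliasedError` taken after the first one reports the second error's message, while an `OwnedError` captured at the same moment keeps the first.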
I replaced the wrapper around the raw pointer with a more traditional Rust struct so that each individual error could be preserved. Because the struct no longer has any ties to the underlying C-managed data once constructed, the `Drop` implementation was removed.

Not all of the fields in the `xmlError` struct had obvious utility, or even safe ways of managing them. The fields omitted are:
- `str1`, `str2`, `str3`: I couldn't find any indication as to what these would be used for. Perhaps they are "empty buckets" for users to put their own custom error messages into? It didn't seem worth spending the CPU cycles on looking at them.
- `int1`: Similarly, I couldn't find any information on this aside from the cryptic "extra number information".
- `ctxt`, `node`: Both of these are `void *` and I have 0 clue how to safely include them without massively complicating things.

Regarding enums, the `xmlErrorLevel` field was converted to a normal Rust enum because it's small. The `code` and `domain` fields were included in their raw forms because they might be useful to someone. However, the `xmlErrorDomain` and `xmlParserError` C enums are 30 and 700+ (!!) members respectively, so it didn't seem wise to write out Rust enum versions of them.

Draft -> Ready for Review Checklist
- [...] `XmlErrorLevel::from_raw` function be? Do we trust that `libxml2` is always going to provide something in the range `[0, 3]`, and the catch-all branch can be something like `unreachable!()`?
- [...] `Option<String>` acceptable for the `message` and `filename` fields? What about the lossy conversion to utf-8? I went through a few different options here. The underlying C string could be a null pointer (at least for `filename`) and of course there's no guarantee that the underlying C string is valid utf-8.
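On the first checklist question, one alternative to `unreachable!()` is to hand the unexpected value back to the caller instead of panicking. The sketch below is an assumption, not the PR's code: the variant names and the `Result<Self, i32>` return type are invented here, and only the `[0, 3]` range comes from the description above (it matches libxml2's `XML_ERR_NONE`, `XML_ERR_WARNING`, `XML_ERR_ERROR`, `XML_ERR_FATAL`).

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum XmlErrorLevel {
    None,
    Warning,
    Error,
    Fatal,
}

impl XmlErrorLevel {
    /// Converts a raw level; values outside [0, 3] are returned as Err(raw).
    fn from_raw(raw: i32) -> Result<Self, i32> {
        match raw {
            0 => Ok(XmlErrorLevel::None),
            1 => Ok(XmlErrorLevel::Warning),
            2 => Ok(XmlErrorLevel::Error),
            3 => Ok(XmlErrorLevel::Fatal),
            // No panic here: the caller decides how to treat an unknown level.
            other => Err(other),
        }
    }
}
```

Whether this is preferable depends on how much the crate wants to trust the C side; a panic is simpler, while a fallible conversion keeps a future libxml2 change from aborting the caller's program.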