Utf8Error when trying to parse 2902.dat #21
Comments
This is the code I'm running to test with.

    use std::path::Path;
    use weldr::{parse, FileRefResolver, ResolveError, SourceMap};

    struct MyCustomResolver;

    impl FileRefResolver for MyCustomResolver {
        fn resolve(&self, filename: &str) -> Result<Vec<u8>, ResolveError> {
            let catalog_path = Path::new(r"C:\Users\Public\Documents\LDraw");
            let base_paths = vec![
                catalog_path.join("p"),
                catalog_path.join("p").join("48"),
                catalog_path.join("parts"),
                catalog_path.join("parts").join("s"),
            ];
            for prefix in &base_paths {
                let full_path = prefix.join(filename);
                if let Ok(bytes) = std::fs::read(&full_path) {
                    return Ok(bytes);
                }
            }
            Err(ResolveError::new_raw(filename))
        }
    }

    fn main() {
        let resolver = MyCustomResolver {};
        let mut source_map = SourceMap::new();
        for file in std::fs::read_dir(r"C:\Users\Public\Documents\LDraw\parts").unwrap() {
            let file = file.unwrap().path().to_string_lossy().to_string();
            match parse(&file, &resolver, &mut source_map) {
                Ok(file_ref) => {
                    let root = file_ref.get(&source_map);
                    let count = root.iter(&source_map).count();
                    println!("File: {:?}, Commands: {}", root.filename, count);
                }
                Err(e) => println!("error parsing: {file:?}: {e}"),
            }
        }
    }
It looks like the issue is actually in how "0 Cran creusé" is represented in |
I've not touched that code for a long time, but it looks from the description like this is an issue with a non-canonical encoding of a UTF-8 string, which Rust |
I'm reworking the parsing code on my fork to use the latest version of nom with functions instead of macros. This also removes the unwrap calls on invalid UTF-8 and returns an error instead. I should have a PR ready to review soon. I've already received a response from the moderators on the LDraw forums. The files in question do not conform to the spec and will be fixed in the next parts update, so there's no need to do any kind of sanitization in weldr. |
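To illustrate the general idea of returning an error instead of unwrapping, here is a minimal sketch. It is not the code from the fork or from weldr: `DecodeError` and `decode_line` are hypothetical names invented for this example, and the sample bytes assume a Latin-1 encoded "é" (the single byte 0xE9).

```rust
use std::fmt;
use std::str::{self, Utf8Error};

/// Hypothetical error type for this sketch; not weldr's actual error type.
#[derive(Debug, PartialEq)]
enum DecodeError {
    /// The input stopped being valid UTF-8 at this byte offset.
    InvalidUtf8 { valid_up_to: usize },
}

impl fmt::Display for DecodeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DecodeError::InvalidUtf8 { valid_up_to } => {
                write!(f, "invalid UTF-8 after byte {}", valid_up_to)
            }
        }
    }
}

impl From<Utf8Error> for DecodeError {
    fn from(e: Utf8Error) -> Self {
        DecodeError::InvalidUtf8 { valid_up_to: e.valid_up_to() }
    }
}

/// Hypothetical helper: decode raw bytes and propagate a failure
/// to the caller instead of panicking with `unwrap()`.
fn decode_line(bytes: &[u8]) -> Result<&str, DecodeError> {
    Ok(str::from_utf8(bytes)?)
}

fn main() {
    // Assumed sample input: "é" stored as the Latin-1 byte 0xE9.
    match decode_line(b"0 Cran creus\xE9\n") {
        Ok(text) => println!("decoded: {text}"),
        Err(e) => println!("error: {e}"),
    }
}
```

With this shape, a caller iterating over a whole parts directory can report the bad file and move on, rather than aborting on the first non-conforming one.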
Nice, thanks for all of that! |
I just checked, |
I was looking for robust and efficient ldraw parsers in Rust and found this crate. From my initial experience, the library has been very well documented and easy to use. I noticed in #2 that the code likely hasn't been run against the entire parts library yet. I've already found some panics.
I'm getting the error
Utf8Error { valid_up_to: 10, error_len: Some(1) }
at this section of code (weldr/lib/src/lib.rs, lines 414 to 420 in aae1bff). Rust doesn't seem to complain about the formatting when I try to convert the file to UTF-8 with std::fs::read_to_string. This happens on the recently downloaded LDraw as well as BrickLink Studio parts libraries. I can upload the dat file if needed.
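For anyone reading the error fields: `valid_up_to` is the number of leading bytes that are valid UTF-8, and `error_len` is the length of the invalid sequence that follows. The sketch below shows how such values can arise. It is an assumption-laden illustration, not the contents of 2902.dat: the byte string is a made-up stand-in that assumes "é" was saved as the single Latin-1 byte 0xE9 and followed by a line ending, and `utf8_error_fields` is a hypothetical helper.

```rust
use std::str;

/// Hypothetical helper: returns `(valid_up_to, error_len)` when the bytes
/// fail UTF-8 validation, or `None` when they are valid UTF-8.
fn utf8_error_fields(bytes: &[u8]) -> Option<(usize, Option<usize>)> {
    str::from_utf8(bytes)
        .err()
        .map(|e| (e.valid_up_to(), e.error_len()))
}

fn main() {
    // Assumed input: Latin-1 "é" (0xE9) followed by CRLF. 0xE9 starts a
    // three-byte UTF-8 sequence, but '\r' is not a continuation byte.
    let latin1 = b"Cran creus\xE9\r\n";
    println!("{:?}", utf8_error_fields(latin1)); // prints Some((10, Some(1)))

    // The same text encoded as UTF-8 validates without error.
    let utf8 = "Cran creusé\r\n".as_bytes();
    println!("{:?}", utf8_error_fields(utf8)); // prints None
}
```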