Behavior change parsing Windows file paths using path_segments_mut.extend #1077

@westonpace

  • Note that this crate implements the URL Standard not RFC 1738 or RFC 3986

Describe the bug

Here is a snippet to reproduce. It only relies on the tempfile crate and the url crate.

    #[test]
    fn test_url_parsing() {
        let tmpdir = tempfile::tempdir().unwrap();
        let tmp_path = tmpdir.path().to_str().unwrap();
        let tmp_path = tmp_path.replace("\\", "%5C");
        let mut url = url::Url::parse("file://").unwrap();
        url.path_segments_mut().unwrap().pop_if_empty().extend(std::iter::once(tmp_path));
        println!("{:?}", url);
        url.to_file_path().unwrap();
    }

This test passes in version 2.5.4 and fails in version 2.5.7. The path of the printed URL differs between the two versions: 2.5.4 puts a / after the drive letter, 2.5.7 does not.

2.5.4: /C:/%255CUsers%255Cwesto%255CAppData%255CLocal%255CTemp%255C.tmpSjrKkh
2.5.7: /C:%255CUsers%255Cwesto%255CAppData%255CLocal%255CTemp%255C.tmpw9u6Zh

The latter URL fails the to_file_path call. This behavior change may be intentional; I am not certain. The path_segments_mut().unwrap().pop_if_empty().extend(...) pattern comes from the object_store crate here: https://github.com/apache/arrow-rs-object-store/blob/b82979d44a916cf1615e719c8e80800766fd6efe/src/local.rs#L257 (I am encountering this in lance, a downstream user of object_store).
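
To make the difference between the two printed paths concrete, here is a small std-only sketch that compares their first path segments. It rests on an assumption that is not confirmed against the url source: that to_file_path on Windows expects the first segment to be a bare drive letter such as "C:". The helpers first_segment and is_drive_letter are hypothetical names for illustration and are not part of the url crate.

```rust
/// Returns the first segment of a URL path such as "/C:/foo/bar".
/// Hypothetical helper for illustration; not part of the url crate.
fn first_segment(path: &str) -> &str {
    // Drop the single leading '/', then take everything up to the next '/'.
    let trimmed = path.strip_prefix('/').unwrap_or(path);
    trimmed.split('/').next().unwrap_or("")
}

/// True if `segment` is exactly an ASCII letter followed by ':'.
/// Assumption: this is the shape a Windows file URL needs in its first segment.
fn is_drive_letter(segment: &str) -> bool {
    let bytes = segment.as_bytes();
    bytes.len() == 2 && bytes[0].is_ascii_alphabetic() && bytes[1] == b':'
}
```

Under that assumption, the 2.5.4 path has "C:" as its first segment and passes the check, while the 2.5.7 path has the drive letter fused with the rest of the escaped path in a single segment and does not.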