Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Debug implementation for proj::Proj #72

Merged
merged 1 commit into from
Feb 12, 2021
Merged

Add Debug implementation for proj::Proj #72

merged 1 commit into from
Feb 12, 2021

Conversation

frewsxcv
Copy link
Member

Closes #64

let proj = Proj::new(wgs84).unwrap();
let debug_string = format!("{:?}", proj);
assert_eq!(
"Proj { id: Some(\"longlat\"), description: Some(\"PROJ-based coordinate operation\"), definition: Some(\"proj=longlat datum=WGS84 no_defs ellps=WGS84 towgs84=0,0,0\"), has_inverse: true, accuracy: -1.0 }",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that if PROJ changes any of the text here in across PROJ versions, this test will fail. Not sure how concerned we should be about that.

@@ -571,8 +607,7 @@ impl Proj {
/// # Safety
/// This method contains unsafe code.
pub fn def(&self) -> Result<String, ProjError> {
let rv = unsafe { proj_pj_info(self.c_proj) };
_string(rv.definition)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This previous code assumed that definition field would always be not null in the proj_pj_info result, which might be the case. But in my testing, proj_pj_info will sometimes return null for the id and description string fields. The PROJ docs doesn't provide clarity on null values for the proj_pj_info API, so for safety I wrapped all the String fields in an Option.

So in this instance, I replaced a potential null pointer dereference with a panic via expect

@@ -124,15 +124,18 @@ impl Area {
}

/// Easily get a String from the external library
pub(crate) fn _string(raw_ptr: *const c_char) -> Result<String, ProjError> {
let c_str = unsafe { CStr::from_ptr(raw_ptr) };
pub(crate) unsafe fn _string(raw_ptr: *const c_char) -> Result<String, ProjError> {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this function takes in a pointer and dereferences it, it's possible for someone (one of us since it's pub(crate)) to misuse it and reference invalid memory and cause undefined behavior, so we need to mark it as unsafe

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latest force push marks a couple functions in network.rs as unsafe for the same reason. They surfaced in CI because _string gets used within those network functions.

_string(rv)
unsafe {
let rv = proj_errno_string(code);
_string(rv)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Continuing my previous comment about _string safety– because we use proj_errno_string and _string in conjunction here, this function is safe. No matter what value of code we pass to our error_message function, we'll never reference invalid memory (under the assumption proj_errno_string always returns a valid string pointer) so we don't need to mark this function as unsafe

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense! Would you preserve this rationale in a comment?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

under the assumption proj_errno_string always returns a valid string pointer

Lo and behold, proj_errno_string does not always return a valid string pointer:

https://github.com/OSGeo/PROJ/blob/5e077729274f5d28e137e1a41f7d3350146614ef/src/strerrno.cpp#L12-L14

https://github.com/OSGeo/PROJ/blob/5e077729274f5d28e137e1a41f7d3350146614ef/src/strerrno.cpp#L37-L75

Specifically, it returns a null pointer if code == 0, so we should handle that here by adding a if rv.is_null() check. I opened a new GitHub Issue for that since I feel this pull request has already exceeded the initial scope I intended: #73. So prior to this pull request, if one called error_message(0), we would dereference a null pointer in _string. After this pull request, we'll panic because of the assert I added in _string.

Comment on lines +579 to +583
let id = if pj_info.id.is_null() {
None
} else {
Some(_string(pj_info.id).expect("PROJ built an invalid string"))
};
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the 1.50.0 release, we could use bool::then here, but I think it might be a little too soon to avoid breakage for downstream users. We should codify our policy around Rust versions at some point...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right. rust versions. 😄

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Too soon! But yes, we should.

Copy link
Member

@michaelkirk michaelkirk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks helpful!

_string(rv)
unsafe {
let rv = proj_errno_string(code);
_string(rv)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense! Would you preserve this rationale in a comment?

version: _string(pinfo.version)?,
searchpath: _string(pinfo.searchpath)?,
})
unsafe {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I've gathered from your other changes in this PR is that you advocate marking the function as unsafe fn when the caller of the function needs to ensure some invariant is upheld, whereas if all unsafe invariants are intended to be handled within the function, so the caller needn't consider them, you don't mark the function as unsafe fn.

Is that right? I've never really considered it, but that makes sense to me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you advocate marking the function as unsafe fn when the caller of the function needs to ensure some invariant is upheld, whereas if all unsafe invariants are intended to be handled within the function, so the caller needn't consider them, you don't mark the function as unsafe fn

Yeah exactly. And I think this section of TRPL covers that better than I can explain:

https://doc.rust-lang.org/nightly/book/ch19-01-unsafe-rust.html#creating-a-safe-abstraction-over-unsafe-code

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm permanently confused about when / when not to mark functions unsafe.

Copy link
Member Author

@frewsxcv frewsxcv Feb 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When determining whether a function should be marked unsafe, I ask myself: when a user calls this function, is it possible for the user to provide inputs to the function such that it could result in any of the behaviors Rust considers undefined, which includes:

Dereferencing (using the * operator on) a dangling or unaligned raw pointer.

and

...raw pointer read from uninitialized memory, ...

So consider this function (taken directly from the FFI page in the Nomicon):

fn kaboom(ptr: *const i32) -> i32 {
    *ptr
}

I'm able to call this function with whatever pointer I want and it will attempt to read from that memory location (e.g. something like: kaboom(0x1a1a as *const i32) will attempt to read a i32 at memory address 0x1a1a). So I am able to call this function with a certain input that will cause it to perform undefined behavior (in this case read from uninitialized memory). Thus we need to mark the function as unsafe.

Coming back to PROJ-land, consider this example:

/// Print the description of a CRS given a definition string
fn crs_description(crs_definition: CString) {
    let pj_ptr = proj_sys::proj_create(0, crs_definition.as_ptr());
    let pj_proj_info = proj_sys::proj_pj_info(pj_ptr);
    println!("Description: {}", _string(pj_proj_info.description));
}

Is it possible to call this function and have it succeed? Yes! Pass it a valid crs_definition that we know has a description and it should print.

Is it possible to call this function that will result in undefined behavior? Yes! Pass it a crs_definition PROJ doesn't know about and proj_sys::proj_create will return a null pointer, and then proj_sys::proj_pj_info will attempt to deference the null pointer, which is considered undefined. Thus we would need to mark this function as unsafe. And since it's marked as unsafe we would add a section in the docs for this function # Safety that describes how you could use this function in a safe way (only passing it valid CRS definitions).

And one more example:

/// Print the description of a CRS given a definition string
fn crs_description(crs_definition: CString) {
    let pj_ptr = proj_sys::proj_create(0, crs_definition.as_ptr());
    if pj_ptr.is_null() {
        return;
    }
    let pj_proj_info = proj_sys::proj_pj_info(pj_ptr);
    if pj_proj_info.description.is_null() {
        return;
    }
    println!("Description: {}", _string(pj_proj_info.description));
}

Is it possible to call this function that will result in undefined behavior? Assuming I'm not overlooking anything, I don't think so! You can pass it whatever string you want and we shouldn't encounter any undefined behavior. Thus we don't need to mark it as unsafe.

Hope that helps a little!

@@ -142,7 +142,7 @@ pub(crate) unsafe extern "C" fn network_open(
}

/// Where the ACTUAL work happens, taking advantage of Rust error-handling etc
fn _network_open(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comment below regarding the changes in this file: https://github.com/georust/proj/pull/72/files#r574620593

@urschrei
Copy link
Member

This is great, thanks for catching the potential null pointer deref. Are you happy for this to merge as-is?

@frewsxcv frewsxcv merged commit 81b233a into master Feb 12, 2021
@frewsxcv frewsxcv deleted the debug branch February 12, 2021 17:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add debug representation for proj::Proj
3 participants