Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Debug implementation for proj::Proj #72

Merged
merged 1 commit into from
Feb 12, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
22 changes: 9 additions & 13 deletions src/network.rs
Original file line number Diff line number Diff line change
Expand Up @@ -142,7 +142,7 @@ pub(crate) unsafe extern "C" fn network_open(
}

/// Where the ACTUAL work happens, taking advantage of Rust error-handling etc
fn _network_open(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comment below regarding the changes in this file: https://github.com/georust/proj/pull/72/files#r574620593

unsafe fn _network_open(
_: *mut PJ_CONTEXT,
url: *const c_char,
offset: c_ulonglong,
Expand Down Expand Up @@ -173,14 +173,12 @@ fn _network_open(
error_handler(&mut res, eh_rb)?;
// Write the initial read length value into the pointer
let contentlength = res.content_length().ok_or(ProjError::ContentLength)? as usize;
unsafe { out_size_read.write(contentlength) };
out_size_read.write(contentlength);
let headers = res.headers().clone();
// Copy the downloaded bytes into the buffer so it can be passed around
unsafe {
&res.bytes()?
.as_ptr()
.copy_to_nonoverlapping(buffer as *mut u8, contentlength.min(size_to_read))
};
&res.bytes()?
.as_ptr()
.copy_to_nonoverlapping(buffer as *mut u8, contentlength.min(size_to_read));
// Store req into the handle so new ranges can be queried
let hd = HandleData::new(req, headers, None);
// heap-allocate the struct and cast it to a void pointer so it can be passed around to PROJ
Expand All @@ -189,10 +187,8 @@ fn _network_open(
let opaque: *mut PROJ_NETWORK_HANDLE = void as *mut PROJ_NETWORK_HANDLE;
// If everything's OK, set the error string to empty
let err_string = "";
unsafe {
out_error_string.copy_from_nonoverlapping(err_string.as_ptr().cast(), err_string.len());
out_error_string.add(err_string.len()).write(0);
}
out_error_string.copy_from_nonoverlapping(err_string.as_ptr().cast(), err_string.len());
out_error_string.add(err_string.len()).write(0);
Ok(opaque)
}

Expand Down Expand Up @@ -237,14 +233,14 @@ pub(crate) unsafe extern "C" fn network_get_header_value(
}

/// Network callback: get header value
fn _network_get_header_value(
unsafe fn _network_get_header_value(
_: *mut PJ_CONTEXT,
handle: *mut PROJ_NETWORK_HANDLE,
header_name: *const c_char,
_: *mut c_void,
) -> Result<*const c_char, ProjError> {
let lookup = _string(header_name)?.to_lowercase();
let mut hd = unsafe { &mut *(handle as *mut c_void as *mut HandleData) };
let mut hd = &mut *(handle as *mut c_void as *mut HandleData);
let hvalue = hd
.headers
.get(&lookup)
Expand Down
108 changes: 88 additions & 20 deletions src/proj.rs
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@ use proj_sys::{
proj_trans, proj_trans_array, PJconsts, PJ_AREA, PJ_CONTEXT, PJ_COORD, PJ_DIRECTION_PJ_FWD,
PJ_DIRECTION_PJ_INV, PJ_INFO, PJ_LP, PJ_XY,
};
use std::fmt::Debug;
use std::fmt::{self, Debug};

#[cfg(feature = "network")]
use proj_sys::proj_context_set_enable_network;
Expand Down Expand Up @@ -124,15 +124,18 @@ impl Area {
}

/// Easily get a String from the external library
pub(crate) fn _string(raw_ptr: *const c_char) -> Result<String, ProjError> {
let c_str = unsafe { CStr::from_ptr(raw_ptr) };
pub(crate) unsafe fn _string(raw_ptr: *const c_char) -> Result<String, ProjError> {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this function takes in a pointer and dereferences it, it's possible for someone (one of us since it's pub(crate)) to misuse it and reference invalid memory and cause undefined behavior, so we need to mark it as unsafe

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latest force push marks a couple functions in network.rs as unsafe for the same reason. They surfaced in CI because _string gets used within those network functions.

assert!(!raw_ptr.is_null());
let c_str = CStr::from_ptr(raw_ptr);
Ok(str::from_utf8(c_str.to_bytes())?.to_string())
}

/// Look up an error message using the error code
fn error_message(code: c_int) -> Result<String, ProjError> {
let rv = unsafe { proj_errno_string(code) };
_string(rv)
unsafe {
let rv = proj_errno_string(code);
_string(rv)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Continuing my previous comment about _string safety– because we use proj_errno_string and _string in conjunction here, this function is safe. No matter what value of code we pass to our error_message function, we'll never reference invalid memory (under the assumption proj_errno_string always returns a valid string pointer) so we don't need to mark this function as unsafe

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense! Would you preserve this rationale in a comment?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

under the assumption proj_errno_string always returns a valid string pointer

Lo and behold, proj_errno_string does not always return a valid string pointer:

https://github.com/OSGeo/PROJ/blob/5e077729274f5d28e137e1a41f7d3350146614ef/src/strerrno.cpp#L12-L14

https://github.com/OSGeo/PROJ/blob/5e077729274f5d28e137e1a41f7d3350146614ef/src/strerrno.cpp#L37-L75

Specifically, it returns a null pointer if code == 0, so we should handle that here by adding a if rv.is_null() check. I opened a new GitHub Issue for that since I feel this pull request has already exceeded the initial scope I intended: #73. So prior to this pull request, if one called error_message(0), we would dereference a null pointer in _string. After this pull request, we'll panic because of the assert I added in _string.

}
}

/// Set the bounding box of the area of use
Expand Down Expand Up @@ -196,15 +199,17 @@ pub trait Info {
/// # Safety
/// This method contains unsafe code.
fn info(&self) -> Result<Projinfo, ProjError> {
let pinfo: PJ_INFO = unsafe { proj_info() };
Ok(Projinfo {
major: pinfo.major,
minor: pinfo.minor,
patch: pinfo.patch,
release: _string(pinfo.release)?,
version: _string(pinfo.version)?,
searchpath: _string(pinfo.searchpath)?,
})
unsafe {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I've gathered from your other changes in this PR is that you advocate marking the function as unsafe fn when the caller of the function needs to ensure some invariant is upheld, whereas if all unsafe invariants are intended to be handled within the function, so the caller needn't consider them, you don't mark the function as unsafe fn.

Is that right? I've never really considered it, but that makes sense to me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you advocate marking the function as unsafe fn when the caller of the function needs to ensure some invariant is upheld, whereas if all unsafe invariants are intended to be handled within the function, so the caller needn't consider them, you don't mark the function as unsafe fn

Yeah exactly. And I think this section of TRPL covers that better than I can explain:

https://doc.rust-lang.org/nightly/book/ch19-01-unsafe-rust.html#creating-a-safe-abstraction-over-unsafe-code

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm permanently confused about when / when not to mark functions unsafe.

Copy link
Member Author

@frewsxcv frewsxcv Feb 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When determining whether a function should be marked unsafe, I ask myself: when a user calls this function, is it possible for the user to provide inputs to the function such that it could result in any of the behaviors Rust considers undefined, which includes:

Dereferencing (using the * operator on) a dangling or unaligned raw pointer.

and

...raw pointer read from uninitialized memory, ...

So consider this function (taken directly from the FFI page in the Nomicon):

fn kaboom(ptr: *const i32) -> i32 {
    *ptr
}

I'm able to call this function with whatever pointer I want and it will attempt to read from that memory location (e.g. something like: kaboom(0x1a1a as *const i32) will attempt to read a i32 at memory address 0x1a1a). So I am able to call this function with a certain input that will cause it to perform undefined behavior (in this case read from uninitialized memory). Thus we need to mark the function as unsafe.

Coming back to PROJ-land, consider this example:

/// Print the description of a CRS given a definition string
fn crs_description(crs_definition: CString) {
    let pj_ptr = proj_sys::proj_create(0, crs_definition.as_ptr());
    let pj_proj_info = proj_sys::proj_pj_info(pj_ptr);
    println!("Description: {}", _string(pj_proj_info.description));
}

Is it possible to call this function and have it succeed? Yes! Pass it a valid crs_definition that we know has a description and it should print.

Is it possible to call this function that will result in undefined behavior? Yes! Pass it a crs_definition PROJ doesn't know about and proj_sys::proj_create will return a null pointer, and then proj_sys::proj_pj_info will attempt to deference the null pointer, which is considered undefined. Thus we would need to mark this function as unsafe. And since it's marked as unsafe we would add a section in the docs for this function # Safety that describes how you could use this function in a safe way (only passing it valid CRS definitions).

And one more example:

/// Print the description of a CRS given a definition string
fn crs_description(crs_definition: CString) {
    let pj_ptr = proj_sys::proj_create(0, crs_definition.as_ptr());
    if pj_ptr.is_null() {
        return;
    }
    let pj_proj_info = proj_sys::proj_pj_info(pj_ptr);
    if pj_proj_info.description.is_null() {
        return;
    }
    println!("Description: {}", _string(pj_proj_info.description));
}

Is it possible to call this function that will result in undefined behavior? Assuming I'm not overlooking anything, I don't think so! You can pass it whatever string you want and we shouldn't encounter any undefined behavior. Thus we don't need to mark it as unsafe.

Hope that helps a little!

let pinfo: PJ_INFO = proj_info();
Ok(Projinfo {
major: pinfo.major,
minor: pinfo.minor,
patch: pinfo.patch,
release: _string(pinfo.release)?,
version: _string(pinfo.version)?,
searchpath: _string(pinfo.searchpath)?,
})
}
}

/// Check whether network access for [resource file download](https://proj.org/resource_files.html#where-are-proj-resource-files-looked-for) is currently enabled or disabled.
Expand Down Expand Up @@ -544,7 +549,14 @@ impl Proj {
let south = unsafe { out_south_lat_degree.assume_init() };
let east = unsafe { out_east_lon_degree.assume_init() };
let north = unsafe { out_north_lat_degree.assume_init() };
let name = unsafe { out_area_name.assume_init() };
let name = unsafe {
let name = out_area_name.assume_init();
if !name.is_null() {
Some(_string(name)?)
} else {
None
}
};

let area = if west != -1000.0 && south != -1000.0 && east != -1000.0 && north != -1000.0
{
Expand All @@ -557,12 +569,36 @@ impl Proj {
} else {
None
};
let name = if !name.is_null() {
Some(_string(name)?)
Ok((area, name))
}
}

fn pj_info(&self) -> PjInfo {
unsafe {
let pj_info = proj_pj_info(self.c_proj);
let id = if pj_info.id.is_null() {
None
} else {
Some(_string(pj_info.id).expect("PROJ built an invalid string"))
};
Comment on lines +579 to +583
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the 1.50.0 release, we could use bool::then here, but I think it might be a little too soon to avoid breakage for downstream users. We should codify our policy around Rust versions at some point...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right. rust versions. 😄

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Too soon! But yes, we should.

let description = if pj_info.description.is_null() {
None
} else {
Some(_string(pj_info.description).expect("PROJ built an invalid string"))
};
Ok((area, name))
let definition = if pj_info.definition.is_null() {
None
} else {
Some(_string(pj_info.definition).expect("PROJ built an invalid string"))
};
let has_inverse = pj_info.has_inverse == 1;
PjInfo {
id,
description,
definition,
has_inverse,
accuracy: pj_info.accuracy,
}
}
}

Expand All @@ -571,8 +607,7 @@ impl Proj {
/// # Safety
/// This method contains unsafe code.
pub fn def(&self) -> Result<String, ProjError> {
let rv = unsafe { proj_pj_info(self.c_proj) };
_string(rv.definition)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This previous code assumed that definition field would always be not null in the proj_pj_info result, which might be the case. But in my testing, proj_pj_info will sometimes return null for the id and description string fields. The PROJ docs doesn't provide clarity on null values for the proj_pj_info API, so for safety I wrapped all the String fields in an Option.

So in this instance, I replaced a potential null pointer dereference with a panic via expect

Ok(self.pj_info().definition.expect("proj_pj_info did not provide a definition"))
}

/// Project geodetic coordinates (in radians) into the projection specified by `definition`
Expand Down Expand Up @@ -826,6 +861,27 @@ impl Proj {
}
}

struct PjInfo {
id: Option<String>,
description: Option<String>,
definition: Option<String>,
has_inverse: bool,
accuracy: f64,
}

impl fmt::Debug for Proj {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
let pj_info = self.pj_info();
f.debug_struct("Proj")
.field("id", &pj_info.id)
.field("description", &pj_info.description)
.field("definition", &pj_info.definition)
.field("has_inverse", &pj_info.has_inverse)
.field("accuracy", &pj_info.accuracy)
.finish()
}
}

impl Drop for Proj {
fn drop(&mut self) {
unsafe {
Expand Down Expand Up @@ -933,6 +989,18 @@ mod test {
"proj=longlat datum=WGS84 no_defs ellps=WGS84 towgs84=0,0,0"
);
}

#[test]
fn test_debug() {
let wgs84 = "+proj=longlat +datum=WGS84 +no_defs";
let proj = Proj::new(wgs84).unwrap();
let debug_string = format!("{:?}", proj);
assert_eq!(
"Proj { id: Some(\"longlat\"), description: Some(\"PROJ-based coordinate operation\"), definition: Some(\"proj=longlat datum=WGS84 no_defs ellps=WGS84 towgs84=0,0,0\"), has_inverse: true, accuracy: -1.0 }",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that if PROJ changes any of the text here in across PROJ versions, this test will fail. Not sure how concerned we should be about that.

debug_string
);
}

#[test]
#[should_panic]
// This failure is a bug in libproj
Expand Down