New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lib: Introduce GString #389

Open
wants to merge 3 commits into
base: master
from

Conversation

Projects
None yet
6 participants
@philn
Contributor

philn commented Nov 16, 2018

This new module keeps track of CStrings to help avoid allocating new Strings in
gir's generated code.

@philn philn force-pushed the philn:cstringholder branch 2 times, most recently from 6dc40d0 to 7541346 Nov 16, 2018

@philn

This comment has been minimized.

Contributor

philn commented Nov 16, 2018

@sdroege I think this is ready for review finally :)

Show resolved Hide resolved src/cstringholder.rs Outdated
Show resolved Hide resolved src/cstringholder.rs Outdated

@philn philn force-pushed the philn:cstringholder branch from 7541346 to dcba9dc Nov 16, 2018

@GuillaumeGomez

This comment has been minimized.

Member

GuillaumeGomez commented Nov 16, 2018

Awesome! 👍

Show resolved Hide resolved src/cstringholder.rs Outdated
pub struct CStringHolder {
slice: Box<CStr>,
ptr: *const c_char,
owned: bool,

This comment has been minimized.

@sdroege

sdroege Nov 16, 2018

Member

I would only do this for owned strings, otherwise it doesn't not really make sense or does it? Do you have an example usage for a non-owned string?

This comment has been minimized.

@philn

philn Nov 16, 2018

Contributor

In the unit-tests of the gstreamer subcrate I had to manually create CStringHolders.

This comment has been minimized.

@sdroege

sdroege Nov 16, 2018

Member

You have a link?

pub fn as_str(&self) -> &str {
let bytes = self.slice.to_bytes();
match std::str::from_utf8(bytes) {

This comment has been minimized.

@sdroege

sdroege Nov 16, 2018

Member

Here you could do

  let bytes = slice::from_raw_parts(ptr, len); // calculate the length via strlen in the constructor
  let cstr = CString::from_bytes_with_nul_unchecked(bytes);
  cstr.to_str().unwrap()

This comment has been minimized.

@philn

philn Nov 16, 2018

Contributor

That doesn't work, CStr::from_bytes_with_nul_unchecked expects an &[u8] but bytes is &[i8]

This comment has been minimized.

@sdroege

sdroege Nov 16, 2018

Member

Cast the *const c_char to a *const u8 before from_raw_parts. It's i8 or u8 depending on architecture

}
}
impl From<String> for CStringHolder {

This comment has been minimized.

@sdroege

sdroege Nov 16, 2018

Member

This one would involve copying, not sure where it would be useful to use?

This comment has been minimized.

@philn

philn Nov 16, 2018

Contributor

Yeah let's remove it, doesn't seem useful indeed :)

Show resolved Hide resolved src/cstringholder.rs Outdated
Show resolved Hide resolved src/cstringholder.rs Outdated
Show resolved Hide resolved src/cstringholder.rs Outdated
Show resolved Hide resolved src/translate.rs Outdated
Show resolved Hide resolved src/translate.rs Outdated
Show resolved Hide resolved src/lib.rs
@EPashkin

This comment has been minimized.

Member

EPashkin commented Nov 16, 2018

Note: building with rust 1.28.0 produce error:

error[E0658]: access to extern crates through prelude is experimental (see issue #44660)
  --> D:/eap/rust/0/glib\src\cstringholder.rs:44:15
   |
44 |         match std::str::from_utf8(bytes) {
   |               ^^^
@philn

This comment has been minimized.

Contributor

philn commented Nov 16, 2018

Hopefully this code won't be part of the next patch iteration :)

@philn philn force-pushed the philn:cstringholder branch 3 times, most recently from d6d3a81 to 54b0486 Nov 18, 2018

@philn

This comment has been minimized.

Contributor

philn commented Nov 18, 2018

Here's my WIP gstreamer fork adding support for CStringHolder too, https://gitlab.freedesktop.org/philn/gstreamer-rs/commits/master

@philn

This comment has been minimized.

Contributor

philn commented Nov 18, 2018

@cgwalters

This comment has been minimized.

cgwalters commented Nov 19, 2018

I only glanced at this but it looks like there's some overlap with https://crates.io/crates/c_utf8 ? I started using that crate in projectatomic/rpm-ostree#1588

@philn

This comment has been minimized.

Contributor

philn commented Nov 20, 2018

This was briefly discussed on IRC, the outcome was that for this specific task we would prefer to avoid adding a new external dependency.

@philn philn force-pushed the philn:cstringholder branch from 54b0486 to e68fc03 Nov 25, 2018

@philn

This comment has been minimized.

Contributor

philn commented Nov 25, 2018

PR updated. I suspect CI might fail again though, will keep an eye on it :)

@philn philn force-pushed the philn:cstringholder branch from e68fc03 to e896713 Nov 25, 2018

@EPashkin

Don't fully understand safe use this class or not 😢

};
let ptr = slice.as_ptr();
assert!(!ptr.is_null());
Self { ptr, length: s.len() - 1, owned: false }

This comment has been minimized.

@EPashkin

EPashkin Nov 25, 2018

Member

With impl FromGlibPtrNone<*const c_char> for CStringHolder { this function worried me,
what happened if C string freed?

This comment has been minimized.

@sdroege

sdroege Nov 25, 2018

Member

CStringHolder can only really be implemented in a useful way of from_glib_full

This comment has been minimized.

@EPashkin

EPashkin Nov 25, 2018

Member

IMHO from_glib_borrow also can be safely implemented for signal string parameters.
Problem with string, returned from function as from_glib_none, I don't see safe way to remove copy.

Show resolved Hide resolved src/cstringholder.rs Outdated
Show resolved Hide resolved src/cstringholder.rs Outdated
let bytes = std::slice::from_raw_parts(self.ptr as *const u8, self.length + 1);
let cstr = CStr::from_bytes_with_nul_unchecked(bytes);
cstr.to_str().unwrap()
}

This comment has been minimized.

@EPashkin

EPashkin Nov 25, 2018

Member

Not sure about safety here too.
Also not too many operation for "as_xx" functions? As I remember it preferred be almost 0-cost

Show resolved Hide resolved src/cstringholder.rs Outdated
fn to_glib_none(&'a self) -> Stash<'a, *const c_char, Self> {
unsafe {
let tmp = CString::from_raw(self.ptr as *mut i8);
Stash(tmp.as_ptr(), tmp)

This comment has been minimized.

@EPashkin

EPashkin Nov 25, 2018

Member

CString created for all call to_glib_none?
On owned=true you can just use CStringHolder as storage,
on owned=false depend of lifetime.

This comment has been minimized.

@philn

philn Dec 1, 2018

Contributor

I don't understand what you mean there :) The stash would expect a CStringHolder then, not a reference. Also I'm not sure how the storage can be changed depending on owned value.

This comment has been minimized.

@EPashkin

EPashkin Dec 1, 2018

Member

You right, storage can't changed depending on owned value, sorry.

This comment has been minimized.

@EPashkin

EPashkin Dec 1, 2018

Member

Actually it can be something like Result<&CStringHolder, CString> or tupple with 2 options, but not sure about &CStringHoder part

This comment has been minimized.

@philn

philn Dec 1, 2018

Contributor

Seems to work with a tuple

Show resolved Hide resolved src/cstringholder.rs Outdated
@EPashkin

This comment has been minimized.

Member

EPashkin commented Nov 25, 2018

Maybe in not_owned case use CString field as internal storage?

@EPashkin

This comment has been minimized.

Member

EPashkin commented Nov 25, 2018

@philn Reason to add CStringHolder is minimize memory copy when receiving C string?

@sdroege

This comment has been minimized.

Member

sdroege commented Nov 25, 2018

Maybe in not_owned case use CString field as internal storage?

CString would involve a copy, CStr would be unsafe. Strings that are transfer none we must always directly copy to new memory.

@EPashkin

This comment has been minimized.

Member

EPashkin commented Nov 25, 2018

Assuming that CStringHolder need to minimize memory copy when receiving C string
IMHO it need implemented like AnyBox: https://github.com/gtk-rs/glib/blob/master/src/boxed.rs#L259-L263 with 3 separated cases:
for from_glib_full it store pointer and drop it,
for from_glib_borrow it store pointer and not drop it,
for from_glib_none it store String or CString with copy data (this is bad because too many string returned that way).

@EPashkin

This comment has been minimized.

Member

EPashkin commented Nov 25, 2018

Name CStringHolder IMHO misleading, something like GStr(ing)?Holder is better.

Show resolved Hide resolved src/cstringholder.rs Outdated
@sdroege

This comment has been minimized.

Member

sdroege commented Nov 26, 2018

What @EPashkin said in #389 (comment) seems like the best way forward here. I also agree that the name is a bit suboptimal. I would've called it GString.

@EPashkin

This comment has been minimized.

Member

EPashkin commented Dec 6, 2018

Looks good, but I still prefer use from_glib_full instead GString::new etc. in manual code.

pub struct GString {
s: Option<CString>,
ptr: *const c_char,
length: usize

This comment has been minimized.

@EPashkin

EPashkin Dec 6, 2018

Member

Maybe better use enum instead struct ?
I do not insist.

Show resolved Hide resolved src/gstring.rs Outdated
Show resolved Hide resolved src/gstring.rs Outdated
Show resolved Hide resolved src/gstring.rs Outdated
@EPashkin

This comment has been minimized.

Member

EPashkin commented Dec 6, 2018

@philn Hm, seems you lost usage in trampolines aka from_glib_borrow.

let ptr = data.into_raw();
unsafe {
let holder = GString::new(ptr);

This comment has been minimized.

@EPashkin

EPashkin Dec 7, 2018

Member

This line cause strange test failure in appveyor: no gstring::tests::test_holder in log and error: test failed, to rerun pass '--lib' as result.
Its because holder tried call g_free on drop, so std::mem::forget(holder); or change to borrowed constructor needed

unsafe {
let holder = GString::new(ptr);
assert_eq!(holder.as_str(), "foo");
let foo: Box<str> = holder.into();

This comment has been minimized.

@EPashkin

EPashkin Dec 7, 2018

Member

If impl From<GString> for Box<str> called on "owned ptr" case, then seems it needed mem::forget too,
maybe impl From<GString> for String too, but not sure

This comment has been minimized.

@philn

philn Dec 9, 2018

Contributor

I'm sorry, I'm not sure to understand what you mean, can you please clarify?

This comment has been minimized.

@EPashkin

EPashkin Dec 9, 2018

Member

current impl From<GString> for Box<str> drops consumed CString at end,
for owned case returned Box<str> points to g_freed location.
Note: this test passed pointer to rust owned string in GString::new, better use glib duplicate,
to test "owned" case, "borrowed" case can use rust

This comment has been minimized.

@EPashkin

EPashkin Dec 9, 2018

Member

Missed let ptr_copy = glib_ffi::g_strdup(ptr); in this test

@EPashkin

This comment has been minimized.

Member

EPashkin commented Dec 7, 2018

Found other problem:
We have https://gtk-rs.org/docs/gtk/trait.ScaleExt.html#tymethod.connect_format_value:
signal returning string (Ex. https://github.com/gtk-rs/examples/blob/master/src/bin/gtktest.rs#L78-L81),
with gtk-rs/gir#666 it will be changed to return GString,
so closure need be more complex, as seems no conversion from nonstatic String to GString.

Also I don't see way to prevent this change in gir

@philn

This comment has been minimized.

Contributor

philn commented Dec 7, 2018

I have a gir change for signal trampolines returning String...

@EPashkin

This comment has been minimized.

Member

EPashkin commented Dec 7, 2018

@sdroege, @GuillaumeGomez, @philn Maybe it too late,
but IMHO worth of using this class for all function string results is arguable,
as it add extra problem to users (usually need extra call) and give benefit (no memory copy) in about half usages (Ex. in gtk-3.0.gir: 112none, const char* vs 60full, char*).
GString helps as trampoline parameters for sure, I found only 12 cases of it in gtk.

So we better decide use it/not/in which cases before @philn do more works on it.

For me pros slightly less that cons but feels almost equals.

@sdroege

This comment has been minimized.

Member

sdroege commented Dec 7, 2018

I'll review this another time tomorrow :) sorry for the delay

@philn

This comment has been minimized.

Contributor

philn commented Dec 8, 2018

@EPashkin please look at the broader picture. The GTK bindings are not the only users of the GLib bindings, consider the GStreamer bindings for instance.

@EPashkin

This comment has been minimized.

Member

EPashkin commented Dec 8, 2018

@philn Problem that I don't know which benefit from this class for GStreamer too.
Can you tell more?

@sdroege

This comment has been minimized.

Member

sdroege commented Dec 8, 2018

but IMHO worth of using this class for all function string results is arguable,
as it add extra problem to users (usually need extra call) and give benefit (no memory copy)

There will always be a memory copy in case of transfer none return values, that's correct. We only prevent copies in transfer full cases (and in callbacks as those borrow the strings).

What do you mean with users have to call functions? As long as they use it in places where a &str is required, nothing should be needed (as this here is Deref<Target=str>). If they need a proper String then into() or similar is required, that's correct, but that seems like an exceptional case.

(Don't merge yet btw, I'll review the latest code later. Not now :) )

Show resolved Hide resolved src/lib.rs
Show resolved Hide resolved src/gstring.rs Outdated
Show resolved Hide resolved src/gstring.rs
Show resolved Hide resolved src/gstring.rs
Show resolved Hide resolved src/gstring.rs Outdated
Show resolved Hide resolved src/gstring.rs Outdated
@EPashkin

This comment has been minimized.

Member

EPashkin commented Dec 8, 2018

@sdroege You right, I missed Deref<Target=str> already implemented and used .as_str() directly.
Updated examples.

Show resolved Hide resolved src/gstring.rs

philn added some commits Dec 9, 2018

lib: Introduce GString
This new module keeps track of C strings to help avoid allocating new Strings in
gir's generated code.

@philn philn force-pushed the philn:cstringholder branch from c948dc1 to c084ce4 Dec 9, 2018

@philn

This comment has been minimized.

Contributor

philn commented Dec 9, 2018

Ok, the PR is updated. I'm not sure I addressed all the comments but I did my best 😂

@EPashkin

Thanks.
IMHO only "last big" problem is "borrowed" case

unsafe fn from_glib_borrow(ptr: *const c_char) -> Self {
GString::new(ptr)
}
}

This comment has been minimized.

@EPashkin

EPashkin Dec 9, 2018

Member

Borrowed ptr can't be g_freed

Show resolved Hide resolved src/gstring.rs
unsafe {
let holder = GString::new(ptr);
assert_eq!(holder.as_str(), "foo");
let foo: Box<str> = holder.into();

This comment has been minimized.

@EPashkin

EPashkin Dec 9, 2018

Member

Missed let ptr_copy = glib_ffi::g_strdup(ptr); in this test

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment