-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ctypes implementation #2364
base: main
Are you sure you want to change the base?
Ctypes implementation #2364
Conversation
I just pushed a commit to get stuff compiling. Excited to review this! |
One thing to note is that some things are directly translation from PyPy, others from CPython's. Also there's that difference I said on Gitter, regarding declaring a class like: class Custom(_SimpleCData):
pass
...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: class must define a '_type_' attribute which here will succeed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's what I have so far; I think it would be better to use more of libffi's middle
API in ctypes::function
, just to avoid as much unsafe
as possible for something like ctypes 🙃 . Now that the functions are copied from cpython, it should be easier to refactor into something more idiomatic.
vm/src/stdlib/ctypes/basics.rs
Outdated
#[pyclass(module = "_ctypes", name = "_CData")] | ||
pub struct PyCData { | ||
_objects: AtomicCell<Vec<PyObjectRef>>, | ||
_buffer: PyRwLock<Vec<u8>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the buffer is supposed to be a raw pointer and a length, so that you can mutate the original data it if you want to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something like
struct RawBuffer{
inner: *mut u8,
size: usize
}
?
vm/src/stdlib/mod.rs
Outdated
#[cfg(any(unix, windows, target_os = "wasi"))] | ||
modules.insert("_ctypes".to_owned(), Box::new(ctypes::make_module)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of duplicating same cfg, it can be grouped like:
#[cfg(any(unix, windows, target_os = "wasi"))]
{
modules.insert(os::MODULE_NAME.to_owned(), Box::new(os::make_module));
modules.insert("_ctypes".to_owned(), Box::new(ctypes::make_module));
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although I don't know if ctypes would be available on wasi, I don't think it has dll functionality yet
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would libffi work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, wasm doesn't really have any sort of support for loading another wasm module at runtime
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe interesting, Blazor does load DLLs at runtime from what I can see (from this blog). How feasible is something similar for Rust?
vm/src/stdlib/ctypes/function.rs
Outdated
} | ||
void => { | ||
ptr::null_mut() | ||
arg(&ptr::null::<usize>()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've replace this for
ptr::null::<c_void>()
since it seemed more "correct", which I'changed after committing this. So in a next change batch it will be replaced.
@@ -252,11 +242,17 @@ struct PyCDataBuffer { | |||
// This trait will be used by all types | |||
impl Buffer for PyCDataBuffer { | |||
fn obj_bytes(&self) -> BorrowedValue<[u8]> { | |||
PyRwLockReadGuard::map(self.data.borrow_value(), |x| x.as_slice()).into() | |||
PyRwLockReadGuard::map(self.data.borrow_value(), |x| unsafe { | |||
slice::from_raw_parts(x.inner, x.size) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if this is copying things
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, it isn't. slice is a sort of view
@@ -265,14 +261,21 @@ impl Buffer for PyCDataBuffer { | |||
&self.options | |||
} | |||
} | |||
pub struct RawBuffer { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@coolreader18 is this what you have in mind?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The safe expression of this type in Rust is Box<[u8]>
which is exactly equivalent to this struct. Is this intended work to avoid somethiing like ownership check? I think there must be a better way.
vm/src/stdlib/ctypes/function.rs
Outdated
} | ||
pointer => { | ||
usize::try_from_object(vm, obj).unwrap() as *mut c_void | ||
arg(&usize::try_from_object(vm, obj).unwrap()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also replace this with
arg(&(usize::try_from_object(vm, obj)? as *mut usize as *mut c_void))
vm/src/stdlib/ctypes/array.rs
Outdated
buffer | ||
.obj_bytes() | ||
.chunks(4) | ||
.map(|c| NativeEndian::read_u32(c)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use u32::from_ne_bytes
for this; also, there's already a byteorder dependency in vm/Cargo.toml
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, what function is this supposed to be? CArray doesn't have a value property from what I can see
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use
u32::from_ne_bytes
for this; also, there's already a byteorder dependency in vm/Cargo.toml
Oh boy, I missed that one, I've found the bt
and le
counterparts. I can simply ignore the whole ByteOrder
crate, then.
Also, what function is this supposed to be? CArray doesn't have a value property from what I can see
These properties come from it's metaclass (PyCArrayType_Type
), which are CharArray_getsets
and WCharArray_getsets
fn obj_bytes(&self) -> BorrowedValue<[u8]> { | ||
// @TODO: This is broken | ||
PyRwLockReadGuard::map(self.data.borrow_value(), |x| unsafe { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This need to be fixed, "casting" the associated type somehow
What are the major blockers for this PR? I think I want to give this a try :) |
Unfortunately there are several problems with the state of things and I haven't been able to continue working on this (and will not be able to do so for a while). The main problem is how it behaves differently from CPython's and PyPy's, a meta class should be added for Struct, CSimpleType, Pointer, ... (all but CFuntion), so that you can do |
3334aa9
to
4725a80
Compare
@youknowone I tried to sync with master a while ago, but there was a problem and I got too busy to look at that again, so thanks for that! Are you planning into work on this? If so, I think there are some tests I ported from CPython that I altered just to see if the implementation was working-ish. To be fully compatible to CPython and PyPy, the types must have a metaclass that's not I am happy to help with discussion and pointers, but I'm afraid I don't have much time to code :/ |
@darleybarreto Hi, thank you for the comment. I am not planning for working on ctypes at the moment, but looking for a way to merge it as it is for future contributors - because you already worked really a lot of it - if it doesn't break things too much. How do you think about it? |
You mean merging as is? |
yes, if we can manage it not to break CI |
I think that we need to do a few things to help future contributions in this code before merging:
Undoubtedly class _CDataMeta(type):
...
def __mul__(self, length):
return create_array_type(self, length)
...
class SimpleType(_CDataMeta):
...
class SimpleCData(..., metaclass=SimpleType):
... I couldn't find a way to make Point |
Thank you for detailed explanation. The metaclass issue is recently solved. Here is the test: $ cat t.py
class _CDataMeta(type):
def __mul__(self, length):
return create_array_type(self, length)
class SimpleType(_CDataMeta):
pass
class SimpleCData(object, metaclass=SimpleType):
pass
print(type(SimpleCData))
assert type(SimpleCData) == SimpleType
$ cargo run t.py
Finished dev [unoptimized + debuginfo] target(s) in 0.06s
Running `target/debug/rustpython t.py`
<class '__main__.SimpleType'> I will look in the code for 3 |
Oh, I'm sorry, I meant to do the same thing in the Rust side. CPython does it in the C side of things by manually filling the type: #define MOD_ADD_TYPE(TYPE_EXPR, TP_TYPE, TP_BASE) \
do { \
PyTypeObject *type = (TYPE_EXPR); \
Py_SET_TYPE(type, TP_TYPE); \
type->tp_base = TP_BASE; \
if (PyModule_AddType(mod, type) < 0) { \
return -1; \
} \
} while (0)
MOD_ADD_TYPE(&Simple_Type, &PyCSimpleType_Type, &PyCData_Type); Here |
vm/src/stdlib/ctypes/array.rs
Outdated
} | ||
} | ||
|
||
fn slice_adjust_size(length: isize, start: &mut isize, stop: &mut isize, step: isize) -> isize { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't compare precisely, but this function might be a duplication of inner_indices
in slice.rs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They have some resemblance, but are different nonetheless.
vm/src/stdlib/ctypes/array.rs
Outdated
_type_: new_simple_type(Either::A(&outer_type), vm)?.into_ref(vm), | ||
_length_: length, | ||
_buffer: PyRwLock::new(RawBuffer { | ||
inner: Vec::with_capacity(length * itemsize).as_mut_ptr(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this line is safe. The new Vec
will be removed right after this line and the given pointer will be a dangling pointer.
// @TODO: Is this copying? | ||
|
||
let buffered = if copy { | ||
unsafe { slice::from_raw_parts_mut(buffer.as_mut_ptr(), buffer.len()) } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not a copy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here
let buffered = if copy {
unsafe { slice::from_raw_parts_mut(buffer.as_mut_ptr(), buffer.len()) }
.as_mut_ptr()
} else {
buffer.as_mut_ptr()
};
The idea is to copy the bytes from the buffer if copy
, or return a view otherwise. Should I use to_contiguous
to copy it?
vm/src/stdlib/ctypes/basics.rs
Outdated
#[pyimpl] | ||
pub trait PyCDataFunctions: PyValue { | ||
#[pymethod] | ||
fn size_of_instances(zelf: PyRef<Self>, vm: &VirtualMachine) -> PyResult<PyObjectRef>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fn size_of_instances(zelf: PyRef<Self>, vm: &VirtualMachine) -> PyResult<PyObjectRef>; | |
fn size_of_instances(zelf: PyRef<Self>, vm: &VirtualMachine) -> PyResult; |
PyResult
== PyResult<PyObjectRef>
@@ -252,11 +242,17 @@ struct PyCDataBuffer { | |||
// This trait will be used by all types | |||
impl Buffer for PyCDataBuffer { | |||
fn obj_bytes(&self) -> BorrowedValue<[u8]> { | |||
PyRwLockReadGuard::map(self.data.borrow_value(), |x| x.as_slice()).into() | |||
PyRwLockReadGuard::map(self.data.borrow_value(), |x| unsafe { | |||
slice::from_raw_parts(x.inner, x.size) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, it isn't. slice is a sort of view
vm/src/stdlib/ctypes/basics.rs
Outdated
PyCDataFunctions::alignment_of_instances(zelf.into_ref(vm), vm) | ||
} | ||
Either::B(obj) if obj.has_class_attr("alignment_of_instances") => { | ||
let size_of = vm.get_attribute(obj, "alignment_of_instances").unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let size_of = vm.get_attribute(obj, "alignment_of_instances").unwrap(); | |
let size_of = vm.get_attribute(obj, "alignment_of_instances")?; |
is this unwrap()
intended?
vm/src/stdlib/ctypes/function.rs
Outdated
if vm.isinstance(&argtypes, &vm.ctx.types.list_type).is_ok() | ||
|| vm.isinstance(&argtypes, &vm.ctx.types.tuple_type).is_ok() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if vm.isinstance(&argtypes, &vm.ctx.types.list_type).is_ok() | |
|| vm.isinstance(&argtypes, &vm.ctx.types.tuple_type).is_ok() | |
if vm.isinstance(&argtypes, &vm.ctx.types.list_type).and_then(|_| vm.isinstance(&argtypes, &vm.ctx.types.tuple_type)).map_err(|e| vm.new_type_error(format!( | |
"_argtypes_ must be a sequence of types, {} found.", | |
argtypes.to_string() | |
)))? { |
by watching |
Let me get the CPython definitions as example MOD_ADD_TYPE(&Struct_Type, &PyCStructType_Type, &PyCData_Type);
MOD_ADD_TYPE(&Union_Type, &UnionType_Type, &PyCData_Type);
MOD_ADD_TYPE(&PyCPointer_Type, &PyCPointerType_Type, &PyCData_Type);
MOD_ADD_TYPE(&PyCArray_Type, &PyCArrayType_Type, &PyCData_Type);
MOD_ADD_TYPE(&Simple_Type, &PyCSimpleType_Type, &PyCData_Type);
MOD_ADD_TYPE(&PyCFuncPtr_Type, &PyCFuncPtrType_Type, &PyCData_Type); Here we have |
I'm not sure why, but I cannot comment on the following reply
The idea of |
What happens when the buffer is destroyed? Does it deletes only the pointer(as a reference) or also deletes the data it contains (as an owner)? If it does both, then does it need a flag to distinguish them? |
I forgot where these two functions are used, but based on the docs: |
@youknowone So the idea is to do that for all files in |
Yes, that's possible to adapt to all files. It is up to you for your convenience. |
@youknowone I basically patched all your comments, you wrote:
For this code fn obj_bytes(&self) -> BorrowedValue<[u8]> {
PyRwLockReadGuard::map(self.data.borrow_value(), |x| unsafe {
slice::from_raw_parts(x.inner, x.size)
})
.into()
} What did you meant there? |
There was a question asking if it does copy. The comment is an answer about it. Creating an slice object doesn't copy anything. |
I added a C file that should be compiled as a shared lib and bundled together with the interpreter. Its import _ctypes_test |
|
Adding more meta impls.
@youknowone I haven't figured out a nice way to implement the data = b'data'
ubyte = c_ubyte * len(data)
byteslike = ubyte.from_buffer_copy(data) # an instance of ubyte made from data by copying things
My implementation of #[pyclassmethod]
fn from_buffer_copy(
cls: PyRef<Self>,
obj: PyObjectRef,
offset: OptionalArg,
vm: &VirtualMachine,
) -> PyResult {
let buffer = try_buffer_from_object(vm, &obj)?;
let opts = buffer.get_options().clone();
let (size, offset) = Self::buffer_check(cls, opts, offset, vm); //ignore this for now
let src_buffer = buffer.obj_bytes();
let empty_instance = ... //creates a new empty instance
let dst_buffer = empty_instance.obj_bytes_mut();
dst_buffer.copy_from_slice(&src_buffer[offset..offset+size]);
Ok(empty_instance.as_object().clone())
} So the idea is simply get the bytes from |
I don't know well about your requirements, but I guess you can do same way as how |
I think this PR is (practically) not possible to rebase anymore in this state. Do you mind if we squash the commits into single commit for rebase? |
Not at all :) I'm afraid I can't contribute in the following months. I can help others eventually, tho. |
Is this effort still ongoing / has this ctypes implementation been abandoned for another? |
Co-authored-by: Jeong YunWon <jeong@youknowone.org> Co-authored-by: Rodrigo Oliveira <rodrigo.redcode@gmail.com> Co-authored-by: Darley Barreto <darleybarreto@gmail.com> Co-authored-by: Noah <33094578+coolreader18@users.noreply.github.com>
Co-authored-by: Jeong YunWon <jeong@youknowone.org> Co-authored-by: Rodrigo Oliveira <rodrigo.redcode@gmail.com> Co-authored-by: Darley Barreto <darleybarreto@gmail.com> Co-authored-by: Noah <33094578+coolreader18@users.noreply.github.com>
Hi folks, I think we should close this, it's too old for any merging attempt. |
This is still a best trial of ctypes now. I wouldn't close it unless we actually merge any ctypes implementation. For anyone who will try another ctypes implementation, I hope it can be gradually mergable, not to be suffered by huge size of rebase. |
Agreed, the rebase effort was slowed by refactors that have happened since this pr, I'm planning to start incremental work on this soon. |
@darleybarreto and I are implementing the
ctypes
module. This an initial implementation for review. At this stage, we are focusing on Linux platforms to in the future extend to other platforms likectypes
fromcpython
does.