Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RetroCore API improvements #17

Closed
wants to merge 10 commits into from

Conversation

InquisitiveCoder
Copy link
Contributor

@InquisitiveCoder InquisitiveCoder commented Nov 29, 2022

Introduction

You'll probably want to turn off whitespace changes to weed out indentation changes in the code.

This PR is a collection of miscellaneous API improvements I tinkered with over the weekend. The goals for these changes, in rough order of importance, were:

  1. Expose the full functionality of the libretro API.
  2. Future-proof the API against new environment commands.
  3. Enforce more of libretro's rules using types.
  4. Make the crate's abstractions as close to zero-cost as possible.
  5. Minor project housekeeping (e.g. update the gitignore.)

I wanted to get these breaking changes out of the way now before moving on to adding more environment commands and features.

Major Changes

Environment overhaul

Scoped environments

Addresses #2 and lays the groundwork for #16. The new implementation exposes an environment trait for every retro_* function, so that we can limit which non-raw environment commands are available. The *_raw methods are always available, since they're the user's escape hatch for functionality the crate doesn't implement yet.

I initially tried to implement this by adding a PhantomData parameter to the environment, but this required casting in every RetroInstance callback as well as polluting the API with a lot of marker structs. I found the trait-based implementation much more straightfoward, since the C callback can implement all of the environment traits simultaneously, and it allows users to unit test their RetroCore since they no longer have a hard-coded dependency on the C function pointer.

Reworked raw methods

The base environment trait defines its raw_* methods in terms of Rust types, not pointers, and there's a new method fn mut_ref_raw<T>(&mut self, key: u32, data: &mut T) for commands that mutate structs.

All raw methods are unsafe

All raw methods should be unsafe since they allow the user to violate libretro's requirements and we can't account for future additions to the environment callback.

RetroCore API improvements

Exposed every retro_* function in RetroCore

Because libretro can add new environment commands at any time, and those commands may require being called during specific retro_* callbacks, the only way to expose libretro's full functionality and future-proof the API is to surface all of the retro_* to the user.

Other benefits of this change are less glue code in RetroInstance and a much tighter correspondence between the Rust API and C API, which should help users cross-reference the documentation.

Removed redundant method parameters.

Some functions in the C API pass an array pointer and the array's size as separate parameters. In RetroCore, the array pointers were translated into Rust slices, but the redundant size parameter wasn't removed. Since slices carry their length, I've removed these size parameters.

Refine c_uint parameters.

libretro's ABI uses unsigned int for all integers, even if its parameters would fit into a byte or short. To the best of my ability, I've replaced most c_uint/u32 parameters with the narrowest fixed-size type that will accept all of its values.

Additionally, two of these parameters pass user-defined values that are established in environment commands. These have been replaced with two associated types:

pub trait RetroCore: Sized {
  type SpecialGameType: TryFrom<u8>;
  type SubsystemMemoryType: TryFrom<u8>;

  fn load_game_special(&mut self, env: &mut impl LoadGameSpecialEnvironment, game_type: Self::SpecialGameType, info: RetroGame) -> bool;

  fn get_memory_data(&mut self, env: &mut impl GetMemoryDataEnvironment, id: RetroMemoryType<Self::SubsystemMemoryType>) -> Option<&mut [u8]>;
  // etc.

This will allow users to use their own enums, provided the enum is convertible into a u8. Cores that have no need for this functionality can use the provided empty tuple struct NotApplicable:

impl RetroCore for Emulator {
  type SpecialGameType = NotApplicable;
  type SubsystemMemoryType = NotApplicable;
  // etc.
}

Overhauled string handling

Some C strings returned by libretro are not guaranteed to be UTF-8 encoded. The crate should expose these as &CStr and let the user decide if they want to assume they're UTF-8 or handle them some other way. Convenience methods have been provided to convert Option<&CStr> to Option<&str> in one step.

Additionally, the API uses the CUtf8 crate when exposing UTF-8 C Strings, since this type preserves the fact that the string slice is both UTF-8 and nul-terminated. This allows for trivial conversions to both &CStr or &str. &str can't do this because once a &CStr is converted to &str, you've discarded the fact that it was originally nul-terminated.

Internal changes

Zero-cost struct conversions

Rust versions of the C structs generated by bindgen are now implemented as #declare(transparent) newtype wrappers. The public API still uses idiomatic Rust types, but the conversion to a libretro struct should be a no-op:

/// Rust interface for [retro_system_timing].
#[repr(transparent)]
#[derive(Clone, Debug)]
pub struct RetroSystemTiming(retro_system_timing);

impl RetroSystemTiming {
  /// Main constructor.
  pub fn new(fps: f64, sample_rate: f64) -> Self { Self(retro_system_timing { fps, sample_rate }) }
  pub fn fps(&self) -> f64 { self.0.fps }
  pub fn sample_rate(&self) -> f64 { self.0.sample_rate }
  pub fn into_inner(self) -> retro_system_timing { self.0 }
}

Modularized the crate

lib.rs was getting crowded, so wherever I overhauled a type I moved it to a new file. For the sake of making the diff easier to verify, any code that only required minor changes was left in lib.rs.

Removed libc

To the best of my knowledge, we weren't using it for anything that core::ffi::* didn't already provide.

Updated bindgen version

I updated bindgen to the latest version. This resulted in 2 or 3 lines in RetroInstance having some casts adjusted.

core over std

I've replaced references to std with core wherever possible, which should help us be more conscious of places where we're doing allocations and may allow the crate to function in a no_std context.

Afterword

Sorry for the big PR. I wasn't expecting to submit this much at once, but as I was experimenting one thing just led to another. As usual, let me know where you think a change missed the mark.

* Exposed every callback in RetroCore for future-proofing.
* Implemented callback-specific environments.
* Distinguished environment gets/sets of values and structs.
* Refactored Rust structs to be newtypes.
* Removed redundant size params from RetroCore interface.
* Added associated types to RetroCore for game and memory types.
* Updated bindgen version.
* Removed libc dependendency.
* Replaced all uses of std with core for no_std support.
* Updated gitignore.
* Updated example project.
Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a lot to consider with this PR, I've left some initial comments. The biggest thing here is that is would break a lot of things, which I'm not completely opposed to, but it needs to be justified.

We should consider adding a prelude.rs as well as re-export the types from lib.rs so that they can still be imported as before. This would mitigate a lot of the concern about breaking the API.

Aside: Are you using rustfmt? The formatting looks different than what it would generate for me. I want to make sure style differences don't generate unnecessary churn in code diffs.

@@ -4,6 +4,8 @@ macro_rules! libretro_core {
#[doc(hidden)]
mod __libretro_rs_gen {
use super::*;
use core::ffi::*;
use core::ffi::c_char;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use core::ffi::c_char; is redundant since there's also use core::ffi::*; 😄

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'd think that, but there's a second star import bringing in c_char so I need that to disambiguate it :) Figuring out where the second c_char is coming from is still on my to-do list...it's tough when it happens inside a macro.

Re: Formatting, I've been relying on CLion for formatting, I'll be sure to use rustfmt moving forward.

Re: The large amount of breakage, I'm aware I just upended a ton of the crate's API, but if you were planning on release a major version of it anyways, I think it makes sense to rip off the bandaid and do all the necessary changes to future-proof it now instead of waiting for someone to request a backwards-incompatible change because they can't get at some functionality they need.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'd think that, but there's a second star import bringing in c_char so I need that to disambiguate it :) Figuring out where the second c_char is coming from is still on my to-do list...it's tough when it happens inside a macro.

The preference would be to then just import the things we need from core::ffi, since it would be confusing otherwise (case and point: I just got it wrong 😉).

Re: Formatting, I've been relying on CLion for formatting, I'll be sure to use rustfmt moving forward.

👍🏼 if you don't have it already, I also recommend editorconfig as that'll keep some basic editor settings aligned project-wide. I assume CLion just supports it out of the box though.

Re: The large amount of breakage, I'm aware I just upended a ton of the crate's API, but if you were planning on release a major version of it anyways, I think it makes sense to rip off the bandaid and do all the necessary changes to future-proof it now instead of waiting for someone to request a backwards-incompatible change because they can't get at some functionality they need.

Having slept on it, I think the best way to approach this would be to divvy up this PR into a few separate ones. A few concrete reasons for doing so:

  1. This is "all or nothing", I can either accept all of it or reject all of it. Considering there are a few things I'd like to at least discuss in depth, my inclination as of now would be to reject it but I don't really want to do that since there's a lot of good here.
  2. This is hard to consider all at once, and covers a lot of different areas.
  3. I'd like to establish a workflow where issues are made to discuss changes first, then PRs follow to address the issues. I'm guilty of making PRs like this, so do as I say, not as I do 😜 I think this will be a better way to collaborate and save time if a particular change is too radical or if it needs to "soak" a while.

I would almost immediately accept the .gitignore changes, and the libretro_core! macro changes (with the note I just left about star imports in that file). The rest will need some more hashing out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair. I still wanted to open the PR so you could see all the changes together and try them out.

Normally I prefer opening PRs for individual changes but I honestly didn't know how this was going to turn out going into it. I'm still fairly inexperienced when it comes to Rust API design and how traits and ownership will work out in practice.

I take it you want to start with the environment changes?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair. I still wanted to open the PR so you could see all the changes together and try them out.

Open as many as you feel is right, and we can work through them 😄

Normally I prefer opening PRs for individual changes but I honestly didn't know how this was going to turn out going into it. I'm still fairly inexperienced when it comes to Rust API design and how traits and ownership will work out in practice.

You're doing a great job! No need to apologize. I am very grateful for your contributions!

I take it you want to start with the environment changes?

Again, whatever you feel is right. Just aim for more bite sized PRs and we'll be able to merge them more rapidly.

Thanks again for putting this together, it is very much appreciated!

instance_mut(|instance| instance.on_get_memory_data(id))
}

#[no_mangle]
extern "C" fn retro_get_memory_size(id: libc::c_uint) -> libc::size_t {
extern "C" fn retro_get_memory_size(id: c_uint) -> usize {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a cursory search, there seems to be a lot of uncertainty around libc::size_t being equivalent to usize, even within official rust code and documents. I think we should leave this as libc::size_t (or the ffi equivalent) until this is resolved.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From following the discussions on Strict Provenance, Rust conflating size_t and uintptr_t wasn't ideal, but the distinction seems to only matter in really exotic architectures. The only one I know of is CHERI, which is specifically mentioned in the Strict Provenance discussion. It's a research architecture that has a 64-bit size_t but 128-bit intptr_t since it stores metadata in its pointers to track the regions of memory the pointer is allowed to access. The Strict Provenance tracking issue doesn't mention any other architectures like this. And usize certainly shouldn't be smaller than size_t since that'd mean there's objects larger than the address space.

Case in point, Bindgen is currently using usize to represent size_t in libretro.h; as long as Bindgen is doing that, there's no point in using libc::size_t elsewhere, since the create will only compile as long as those two types are aliases of each other. And a peek at the libc crate's repo on Github shows that size_t is consistently aliased to usize (either directly, or transitively through uintptr_t.)

One last data point is the experimental core::ffi::c_size_t:

Equivalent to C’s size_t type, from stddef.h (or cstddef for C++).

This type is currently always usize, however in the future there may be platforms where this is not the case.

This suggests there's currently no platforms Rust supports where usize and size_t are different (and the hypothetical future platforms are probably a reference to CHERI.)

( $head:expr , $( $tail:expr ),+ ) => {
Extensions(Some(c_utf8!(concat!($head, $("|", $tail),+))))
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could change the methods that accept Extensions parameters to instead accept impl Into<Extensions> and provide impl From<Vec<&'static str>> for Extensions then you can call methods like needs_extensions(vec!["z64", "n64"]). Which would then obviate the need for a custom macro and provide more flexibility for other Into<Extensions> implementations, even custom ones outside of this crate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea

impl <T> GetMemoryDataEnvironment for T where T: RetroEnvironment {}

pub trait GetMemorySizeEnvironment: RetroEnvironment {}
impl <T> GetMemorySizeEnvironment for T where T: RetroEnvironment {}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume these traits are a first step towards #2? I have some thoughts on how to do this that differ from this implementation. Let's make those changes in a separate PR, so we can consider them in isolation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. I did try the PhantomData approach but this turned out easier/less error-prone to implement. Particularly, it didn't require any casting.

@@ -46,7 +45,7 @@ macro_rules! libretro_core {
}

#[no_mangle]
extern "C" fn retro_set_environment(cb: retro_environment_t) {
extern "C" fn retro_set_environment(cb: EnvironmentCallback) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you provide the motivation behind this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

retro_environment_t is an alias for Option<fn ...>; the original code was passing that Option all the way to the RetroEnvironment struct and checking if it was Some on every use of get_raw. I removed the Option since the argument to retro_set_environment should always be a valid function pointer.

I could leave it as retro_environment_t and unwrap it immediately if you feel strongly about it, but I don't see the point in trying to guard against a bad frontend. There's just too many ways the frontend could pass bad data and cause undefined behavior if it really wanted to. We're already playing with fire every time we use a pointer or the environment callback. I think we may as well assume the frontend behaves correctly, just like we do with the Rust standard library.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could leave it as retro_environment_t and unwrap it immediately if you feel strongly about it, but I don't see the point in trying to guard against a bad frontend. There's just too many ways the frontend could pass bad data and cause undefined behavior if it really wanted to. We're already playing with fire every time we use a pointer or the environment callback. I think we may as well assume the frontend behaves correctly, just like we do with the Rust standard library.

I actually think the exact opposite here. The library should handle the potential (admittedly, edge case) ugly bits about interoping with libretro front ends. If a front end or other malicious actor wanted to send nullptr here, we should handle that gracefully and not bother user code with it. I think the best we can do in this case is .expect("the argument for 'cb' cannot be a null pointer."), I'm also not sure how the FFI would handle a method that we've declared is never null being passed a null value. I believe it's always a good practice to validate the inputs you're being given before using them, my $0.02

@ghost
Copy link

ghost commented Dec 2, 2022

@InquisitiveCoder going to close this as we discussed opening a few smaller PRs. I am eagerly anticipating them 😄

@ghost ghost closed this Dec 2, 2022
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant