Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix IDs v4 #244

Merged
merged 44 commits into from
Sep 15, 2021
Merged

Fix IDs v4 #244

merged 44 commits into from
Sep 15, 2021

Conversation

marioortizmanero
Copy link
Collaborator

@marioortizmanero marioortizmanero commented Aug 30, 2021

Description

I've finally understood what the Rust compiler was trying to tell me. This PR implements type-safe ID wrappers, it is the only way I've managed to do it. Everything is thoroughly explained in rspotify-model/src/idtypes.rs, but if you have any questions please do let me know. In summary, I've ended up being able to make Id an object-safe trait, so that it can be used as dyn Id. This way, it is possible to take e.g. Vec<dyn PlayableId>.

This closes #203

Motivation and Context

Now that I've implemented it, the question is: are type-safe IDs worth it? There is currently a partial implementation of IDs in the master branch in which I'm basing this PR. We have to either finish said implementation, or remove it completely in a different PR.

Advantages:

  • Type safety: with this, it is impossible for the user to make a logic error related to IDs. The user can't mistakenly pass an ID instead of an URI or vice-versa. IDs are grouped up by types now as well; one can't pass the URI of an artist instead of the URI of a track. @Felix-Droop commented in bug: token refresh doesn't work #235 that he did like strongly typed IDs, for example.
  • Strongly typed IDs also makes endpoints self-documented. Taking a TrackId as a parameter makes it pretty clear what kind of ID is required, in comparison with &str, in which case it would have to be specified in the documentation (or by context).
  • All IDs have useful methods available for use: Id::uri, Id::url, Id::_type
  • The model is now more concise. Things like FullTrack only need an id field; uri and _type can be removed.

Disadvantages:

  • IDs may be more cumbersome to use; there is a considerable difference between endpoint(id) and endpoint(&TrackId::from_uri(id)?).
  • Library complexity, though IDs are much easier to use than I thought initially; it's just a trait and an implementation for each type. This PR only adds about ~120 lines of code, many of which are documentation and tests.
  • IDs are Strings instead of str, which require an allocation. If this ends up being important in terms of performance, we could look into crates like smol_str (which may not allocate until the string is longer than X bytes), or even Cow. So performance may not be that much of a problem; we can look into it in the future.
  • We lose the ability to let the compiler deduce the kind of ID we're dealing with, i.e. Id::from_id won't work anymore. This is only a disadvantage over the current implementation in master, not over not using type-safe IDs at all.

Dependencies

Nothing new

Type of change

Please delete options that are not relevant.

  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Though it is a breaking change, the only breakage over master occurs when Id::from_id and similars were used directly. Instead, a concrete type has to be provided now, like ArtistId::from_id. The other main breakage is in rspotify's model; some fields have been removed and modified so that they use this new feature instead of just strings. Additionally, IdBuf has been removed, since now it's always owned anyway.

How Has This Been Tested?

There are new tests in rspotify-model/src/idtypes.rs. Take a look in there. The previous tests still pass.

@marioortizmanero
Copy link
Collaborator Author

marioortizmanero commented Aug 30, 2021

Something I have yet to decide: should I remove the _type fields in rspotify-model now that most of them have an ID with an already available type? Sounds good to me, right? Ok this was in the meta issue. I've removed the unnecessary _type fields.

@ramsayleung
Copy link
Owner

Sorry for not reviewing earlier, I took two hours to understand the context and problem came from #203. Just in case I don't get your point, I am going to explain the problem and your solution in my words.

The below endpoint takes uris as a parameter with the type of impl IntoIterator<Item = &'a Id<T>> + Send + 'a

    async fn start_uris_playback<'a, T: PlayableIdType + 'a>(
        &self,
        uris: impl IntoIterator<Item = &'a Id<T>> + Send + 'a,
        device_id: Option<&str>,
        offset: Option<crate::model::Offset<T>>,
        position_ms: Option<u32>,
    )
    // if we pass the below uris to `start_uris_playback`, the type of `uris` will be deduced to `impl IntoIterator<Item = &'a Id<Episode>>`.
    let uris = &[
        EpisodeId::from_uri("spotify:episode:7zC6yTmY67f32WOShs0qbb").unwrap(),
        EpisodeId::from_uri("spotify:episode:3hCrwz19QlXY0myYh0XgmL").unwrap(),
        TrackId::from_uri("spotify:track:0hAMkY2kwdXPPDfQ1e3BmJ").unwrap()
    ];
	// It will fail to compile since `Track` isn't compatible with `Episode`(not the same type)

Your solution to fix this issue is that replacing Id<T> with explicit ID type?

@marioortizmanero
Copy link
Collaborator Author

marioortizmanero commented Sep 11, 2021

Hi Ramsay,

Sorry, the original issue is probably too confusing and lengthy to understand now. I went in circles when trying to implement these IDs and had multiple attempts until I fully understood it. So please ignore it, I'll make a quick summary:

To fully cover all the endpoints with type-safe IDs, we need to be able to have IDs of types known at runtime. This is important so that we can have a vector of any kind of ID, such as Vec<Box<dyn Id>>. This should also work for more specific groups of IDs like Vec<Box<dyn PlayableId>>. Previously, we used a generic parameter T: PlayableId and then Vec<T>, but notice that these two are far from equivalent; Vec<Box<dyn PlayableId>> allows a list of episodes, tracks, or similars all mixed up, and Vec<T> allows a list of either episodes, tracks, or similars. Which is not what we want; the current implementation of start_uris_playback is wrong because you can't pass a track and an episode in the same list. The example you wrote showcases this perfectly, so yes, it's correct.

Now that we know the requirements, we need a solution. In master we have Id<T> where T is the type of Id (say, Artist, Album, etc). But something like Vec<Id<Box<dyn PlayableId>>> is impossible, so we need to rework it. What ended up working for me was to just use dynamic dispatching. We just need an object-safe trait and then we can do something like Vec<Box<dyn ObjectSafeTrait>>. So I basically turned Id into an object-safe trait, and implemented it for each type of Id, resulting in ArtistId, AlbumId, etc, which can be downcasted into dyn Id, meaning that start_uris_playback is now correct.

Regarding your comment in #241, the thing is that in the first iteration I tried to do this in a different way without dynamic dispatching. I simply added an Unknown type so that we can use Vec<Id<Unknown>> instead, which certainly allows a list of episodes, tracks, or similars all mixed up. You just need to be able to freely convert Id<Track> and Id<Episode> into Id<Unknown> and then you can mix them up because they're the same type, Unknown. The problem is that (a) it's impossible to convert Id<Track> into Id<Unknown> and (b) when converting Id<T> into Id<Unknown> you lose information about the type of the ID, which is important to e.g. call the uri method (otherwise you'd get spotify:unknown:xxxx). This is not so important to understand because it's just a wrong approach I tried to follow. It only explains why dynamic dispatch is the only possible solution for this.

If you have any more doubts please let me know.

@ramsayleung
Copy link
Owner

ramsayleung commented Sep 14, 2021

But something like Vec<Id<Box<dyn PlayableId>>> is impossible, so we need to rework it.

Because of the requirements of zero-cost abstraction, then Box needs to allocate memory on heap?

PS: I get your point after adding the below comment, as the original Id is a struct, there is no way to have something like dyn Id. The trait is similar with interface in Java, the struct is different from the class in Java, there is no way to cast between two struct

We just need an object-safe trait and then we can do something like Vec<Box<dyn ObjectSafeTrait>>. So I basically turned Id into an object-safe trait, and implemented it for each type of Id, resulting in ArtistId, AlbumId, etc, which can be downcasted into dyn Id, meaning that start_uris_playback is now correct.

You turned Id into object-safe trait by convert it from struct Id to trait Id and implement Id for each type Id, do I get your point correctly?

You just need to be able to freely convert Id and Id into Id and then you can mix them up because they're the same type, Unknown. The problem is that (a) it's impossible to convert Id into Id and (b) when converting Id into Id you lose information about the type of the ID, which is important to e.g. call the uri method

It sounds like the downcast problem in OOP language(C++, or Java), results in losing information of derived object by casting derived object to base object

CHANGELOG.md Outdated Show resolved Hide resolved

#[inline]
unsafe fn from_id_unchecked(id: &str) -> Self {
$name(id.to_owned())
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a little unsure that is it possible to construct a $name(for example AlbumId) from String? It seems there is no such constructor?

Copy link
Collaborator Author

@marioortizmanero marioortizmanero Sep 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's a tuple struct (defined as struct $name(String)), which you can initialize with $name(my_string). It's basically a newtype over String.

/// IDs or just `Box<dyn Id>` in case the type wasn't known at compile time.
/// This is why this trait includes both [`Self::_type`] and
/// [`Self::_type_static`], and why all of the static methods use `where Self:
/// Sized`.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then, after including the Self::_type and Self::_type_static for Id, the compiler can deduce the specific type of Id?

Copy link
Collaborator Author

@marioortizmanero marioortizmanero Sep 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These methods are just getters for the Id's type just like what Id::_type was before this PR (which returns a Type). See the next comment as to why _type is necessary.

///
/// The string passed to this method must be made out of alphanumeric
/// numbers only; otherwise undefined behaviour may occur.
unsafe fn from_id_unchecked(id: &str) -> Self
Copy link
Owner

@ramsayleung ramsayleung Sep 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for not getting your point, what are the use cases of _type_static and from_id_unchecked, are they necessary?

Copy link
Collaborator Author

@marioortizmanero marioortizmanero Sep 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, the thing is that Id is now a trait. And just like how the Rspotify clients work, its structs need to provide getters and similars so that the implementations in Id work for all their types.

  • from_id_unchecked serves as a constructor. Since the user shouldn't initialize the ID directly (they should use from_id and similars), it's marked as unsafe. But otherwise there would be no way to construct the Id within the trait (how else would from_id be implemented, for instance?). Note that we need object safety, so it uses where Self: Sized

  • _type serves as a getter for the Type, and so does _type_static. The main difference is that _type is a method, and _type_static is a static function (it doesn't take self). This is because there are two different use-cases to get the type: in constructors, which are static, and in regular methods:

    • In the from_id constructor (static function) if we want to get the type of the ID, we need a static function that provides it. We can't use a method that takes &self because the type is not constructed yet. So we just call Self::_type_static.
    • In the uri method if we want to get the type of the ID, we need either a static function or a method. Any of these would actually work, because the type is already constructed. The problem here is that we want to be able to use this method with a dyn Id. Which requires that uri is in Id's vtable. So we can't use a static function in this case, uri must be a method. And in order for uri to be a method, it can't make calls to static functions. So Self::_type_static won't work, we need self._type.

    You can see yourself how _type is necessary by trying to use _type_static in uri instead. You'll get a compiler error that explains it:

    fn uri(&self) -> String {
        format!("spotify:{}:{}", Self::_type_static(), self.id())
    }

    results in:

    error[E0277]: the size for values of type `Self` cannot be known at compilation time
       --> rspotify-model/src/idtypes.rs:159:34
        |
    159 |         format!("spotify:{}:{}", Self::_type_static(), self.id())
        |                                  ^^^^^^^^^^^^^^^^^^ doesn't have a size known at compile-time
        |
    note: required by `idtypes::Id::_type_static`
       --> rspotify-model/src/idtypes.rs:140:5
        |
    140 | /     fn _type_static() -> Type
    141 | |     where
    142 | |         Self: Sized;
        | |____________________^
    help: consider further restricting `Self`
        |
    158 |     fn uri(&self) -> String where Self: Sized {
        |                             ^^^^^^^^^^^^^^^^^
    
    For more information about this error, try `rustc --explain E0277`.
    error: could not compile `rspotify-model` due to previous error
    

    which can be fixed with:

    fn uri(&self) -> String where Self: Sized {
        format!("spotify:{}:{}", Self::_type_static(), self.id())
    }

    But that won't work as we want it to because uri isn't in the vtable:

    error: the `uri` method cannot be invoked on a trait object
       --> src/clients/oauth.rs:258:54
        |
    258 |         let uris = track_ids.into_iter().map(|id| id.uri()).collect::<Vec<_>>();
        |                                                      ^^^
        |
       ::: /home/mario/Programming/rspotify/rspotify-model/src/idtypes.rs:158:41
        |
    158 |     fn uri(&self) -> String where Self: Sized {
        |                                         ----- this has a `Sized` requirement
    
    error: the `uri` method cannot be invoked on a trait object
       --> src/clients/oauth.rs:976:50
        |
    976 |             "uris": uris.into_iter().map(|id| id.uri()).collect::<Vec<_>>(),
        |                                                  ^^^
        |
       ::: /home/mario/Programming/rspotify/rspotify-model/src/idtypes.rs:158:41
        |
    158 |     fn uri(&self) -> String where Self: Sized {
        |                                         ----- this has a `Sized` requirement
    
    error: could not compile `rspotify` due to 2 previous errors
    

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I (almost) understand your design, these from_id, from_uri and from_id_or_uri functions both are static constructor, all Id types can call these constructors without warring about vtable(of course you could make these functions as virtual function, and all Id types implement them, but it's clumsy).

As for type and id methods, it's necessary to make them as virtual function, while they vary between different Id

Copy link
Collaborator Author

@marioortizmanero marioortizmanero Sep 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I (almost) understand your design, these from_id, from_uri and from_id_or_uri functions both are static constructor, all Id types can call these constructors without warring about vtable(of course you could make these functions as virtual function, and all Id types implement them, but it's clumsy).

Exactly. There's no point on making these constructors virtual functions in the vtable because you couldn't use them. dyn Id requires an already initialized Id, so a virtual constructor makes no sense AFAIK. What we want is methods like uri to be virtual so that we can use them in the endpoints when e.g. taking Vec<dyn Id> and turning it into a vector of URIs.

As for type and id methods, it's necessary to make them as virtual function, while they vary between different Id

Yup.

@@ -34,16 +32,13 @@ pub struct SimplifiedPlaylist {
pub collaborative: bool,
pub external_urls: HashMap<String, String>,
pub href: String,
pub id: String,
pub id: PlaylistId,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems these Id types have not implemented the customized serialize and deserialize function, is it ok to replace String with PlaylistId, could serde handle the serialization/deserialization of these Id type properly?

It's same to the other Id type.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooh it's probably wrong now that you mention it, as the deserialization may even accept an uri on an invalid id. I'll take a look later on.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be fixed after the last commit

pub positions: Vec<u32>,
}

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it ok to remove this new constructor function?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I don't get why it was needed in the first place if all the fields are public. You can just construct it with TrackPostitions { id, positions } just like how it works for every other model.

CHANGELOG.md Show resolved Hide resolved

// Deserialization may take either an Id or an URI, so its
// implementation has to be done manually.
impl<'de> Deserialize<'de> for $name {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if it's necessary to implement the serialize function manually? Otherwise, the serialized output will be different from the original Json data?

Copy link
Collaborator Author

@marioortizmanero marioortizmanero Sep 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thing is that when deserializing we want to be more flexible and accept both IDs and URIs. And when serializing we can only choose one, so IDs make the most sense to me. What do you suggest otherwise?

rspotify-model/src/show.rs Outdated Show resolved Hide resolved
rspotify-model/src/show.rs Outdated Show resolved Hide resolved
rspotify-model/src/show.rs Outdated Show resolved Hide resolved
rspotify-model/src/show.rs Outdated Show resolved Hide resolved
@@ -253,7 +253,7 @@ pub trait OAuthClient: BaseClient {
async fn playlist_replace_tracks<'a>(
&self,
playlist_id: &PlaylistId,
track_ids: impl IntoIterator<Item = &'a TrackId> + Send + 'a,
track_ids: impl IntoIterator<Item = &'a dyn PlayableId> + Send + 'a,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it's an endpoint to manipulate track, it doesn't need to be dyn PlayableId. TrackId is enough, right?

Copy link
Collaborator Author

@marioortizmanero marioortizmanero Sep 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've just made a commit that leaves this and similar occurrences as a TODO for another PR: 98c299c. This is because most of the playlist endpoints work with both tracks and episodes, and we don't currently take that into account consistently. This change is correct, but there are many similar ones that should be fixed as well. I've left it as a TODO to avoid making big changes so that this can be merged finally.

@ramsayleung
Copy link
Owner

I think this PR is ready to merge :)

@marioortizmanero
Copy link
Collaborator Author

Great! Thanks for the review :)

@marioortizmanero marioortizmanero merged commit 89b3721 into master Sep 15, 2021
@marioortizmanero marioortizmanero deleted the fix-ids-4 branch September 15, 2021 15:20
@marioortizmanero marioortizmanero mentioned this pull request Sep 15, 2021
87 tasks
@ramsayleung
Copy link
Owner

Ha, thanks for your contribution, you are the guy worked hard :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Endpoints with lists of Ids don't accept multiple of them
2 participants