Skip to content

Refactor SpotifyId#459

Closed
alcore wants to merge 1 commit intolibrespot-org:devfrom
alcore:refactor-id
Closed

Refactor SpotifyId#459
alcore wants to merge 1 commit intolibrespot-org:devfrom
alcore:refactor-id

Conversation

@alcore
Copy link
Copy Markdown
Contributor

@alcore alcore commented Apr 10, 2020

This PR includes a refactoring of SpotifyId.

In my particular use case I am dealing with a relatively large amount of track IDs (GBs) as playback history - those need to be base62 encoded onto the wire in batches of up to 100 and the current encoding algo was at my QPS prohibitively expensive as it performs 128-bit division + modulo, which compiles down to an unoptimized runtime call.

It now instead uses 64-bit arithmetic (which could relatively trivially be optimized for 32-bit arches behind a build condition) with an algo in use in the cryptocurrency ecosystem for their base58 encoding, further optimized for our case.

On rustc 1.44.0-nightly (b543afca9 2020-04-05), Windows 10, i7 4770k @ 4.4GHz (windows/x86-64) the result is as follows:

benchmark          old ns/op     new ns/op     speedup
to_base62()           3343.2         164.6        ~20x

I then went ahead to pick some further low hanging fruit in the other methods, without resorting to LUTs nor SIMD intrinsics. I.e. the change to the base62 encoding algo is the only non-trivial piece of code contained.

benchmark          old ns/op     new ns/op     speedup
from_base62()          362.7          31.2        ~11x
to_base16()            183.4          75.2         ~2x
from_base16()          144.3          27.8         ~5x
from_uri()             759.9          38.3        ~20x
FileId::to_base16()   1905.4          72.0        ~26x

from_uri() had an overhead of ~400ns/op on top of its from_base62() call, which is now reduced to 7ns, with some trivial inline parsing based on our knowledge of the inputs. FileId gain was primarily due to it needlessly allocating 20 Strings - one for each byte.

I included simple tests and rudimentary docs, which felt less welcome than the code - given there are none of either in the entire codebase ;-)


Rationale

I realize librespot itself in a player scenario barely uses those codepaths and that the PR introduces a bit of additional complexity. I would still like to be able to use this as a library and reuse the types in my project, given they do what I want -- just not efficiently enough.

Additional

  • This conflicts with the change proposed in Ability to get the Spotify URI from a SpotifyId #440 as it includes a to_uri() method. However, it moves part of the logic to the SpotifyAudioType enum (and is 100ns faster, albeit less idiomatic) where I feel it should belong;
  • If breaking API changes are perfectly fine pre-1.0, I'd like to - in a subsequent PR - as suggested in Podcasts  #381 (comment), add a SpotifyObject struct to encapsulate the ID and object type instead.
    This would involve:
    • swapping the SpotifyAudioType for an SpotifyObjectType enum and filling it with other known types (including non-playable ones), keeping it unitary (with a placeholder Unknown tag for unrecognized types);
    • moving to/from_uri() from SpotifyId to SpotifyObject. For consistency I'd love to rename ID to SpotifyObjectId in that case as well;
    • adding a SpotifyContext to encapsulate a SpotifyObjectType in a given context (including the URI);
    • From and Into implementations could be given and be clear. In this PR I refrained as it is currently not clear whether a SpotifyId should convert to/from an URI or the base62 encoding. That is, if bumping MSRV is a go, as those need to be TryInto/TryFrom;

A 'go' for this would be welcome as I'm eager to get coding.

- perf:
  * base62 encoding is an order of magnitude faster (~20x);
  * base16/62 enc/dec and from_uri are several times faster (~2-20x);
  * Let FileId::to_base16() reuse the hex encoder (~20x);

- changes:
  * Add to_uri() method;
  * Make from_uri() error handling consistent;
  * Move audio type from string matching to a SpotifyAudioType factory (private);
  * Implent From/Into<&str> for SpotifyAudioType;
  * Add representation sizes as associated constants (private);

- cs/docs:
  * Add rudimentary docs for most public funcs;
  * Add trivial test cases for the codecs;
Copy link
Copy Markdown
Contributor

@willstott101 willstott101 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The actual changes LGTM. I think it's good to have someone working with librespot for non-audio tasks as it can only help make the whole thing more reliable, with fewer assumptions. Thanks for going into such depth.

There are just a couple of comments I've made inline.

W.R.T. your suggestions on where to go next: I think it all sounds reasonable, and inspired by comments made by @ashthespy which can only be good ;)

If you're happy to get going, go for it.

Comment thread core/src/spotify_id.rs
}
}

#[inline]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's my understanding that #[inline] doesn't do anything unless the function is used across crates without LTO (which it can't be since it's not pub). Is there any point of it here?

Comment thread core/src/spotify_id.rs
}
}

#[cfg(test)]
Copy link
Copy Markdown
Contributor

@willstott101 willstott101 May 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the added complexity is fine since you've taken the time to add tests.

But could you add running the tests to the travis config?

And if it's not a huge PITA, it would be good to see a commit with the tests passing for the old implementation to prove correctness - at least for the methods that already existed.

@ashthespy
Copy link
Copy Markdown
Member

@alcore Sorry I seem to have missed this! Thanks for the effort, and appreciate the benchmarking!
Please go ahead with your SpotifyObjectId and SpotifyContext as well, we can handle the api changes depending on your implementation..

@sashahilton00
Copy link
Copy Markdown
Member

Just triggering a rebuild. The API changes look fine to me as well. @alcore are you planning on refactoring SpotifyObjectId and SpotifyContext as well?

@Johannesd3
Copy link
Copy Markdown
Contributor

Any reason why this doesn't get merged?

Comment thread core/src/spotify_id.rs
Comment on lines +146 to +149
let mut id = match SpotifyId::from_base62(&src[id_i..]) {
Ok(v) => v,
Err(e) => return Err(e),
};
Copy link
Copy Markdown
Contributor

@Johannesd3 Johannesd3 Jan 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is exactly what the question mark operator does.

Comment thread core/src/spotify_id.rs
Comment on lines +122 to +127
if src.len() != SpotifyId::SIZE {
return Err(SpotifyIdError);
};

let mut arr: [u8; 16] = Default::default();
arr.copy_from_slice(&data[0..16]);
let mut dst = [0u8; SpotifyId::SIZE];
dst.copy_from_slice(src);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could use try_into to convert a slice into a fixed-size array.

@sashahilton00
Copy link
Copy Markdown
Member

Wasn't merged because no response received from OP, and likely requires a review to account for potential conflicts/side effects as described in the original comment. Also API change is breaking, though this is less of an issue.

@sashahilton00 sashahilton00 added breaking includes a breaking change enhancement labels Jan 25, 2021
@Johannesd3
Copy link
Copy Markdown
Contributor

I don't think the change is breaking, the function signatures are the same. It's a pity that this isn't merged, the code seems well-written and well-documented... If @alcore isn't interested anymore, I will perhaps create a new pr based on this one (if it's ok).

@sashahilton00
Copy link
Copy Markdown
Member

@Johannesd3 feel free. I'm happy to merge a new PR based off this one, but it needs to be updated to work with changes to dev since it was originally submitted I believe.

@Johannesd3 Johannesd3 mentioned this pull request Jan 29, 2021
@sashahilton00
Copy link
Copy Markdown
Member

Closing in favour of #587

@sashahilton00 sashahilton00 removed the breaking includes a breaking change label Jan 30, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants