Refactor SpotifyId by alcore · Pull Request #459 · librespot-org/librespot

alcore · 2020-04-10T17:12:58Z

This PR includes a refactoring of SpotifyId.

In my particular use case I am dealing with a relatively large amount of track IDs (GBs) as playback history - those need to be base62 encoded onto the wire in batches of up to 100 and the current encoding algo was at my QPS prohibitively expensive as it performs 128-bit division + modulo, which compiles down to an unoptimized runtime call.

It now instead uses 64-bit arithmetic (which could relatively trivially be optimized for 32-bit arches behind a build condition) with an algo in use in the cryptocurrency ecosystem for their base58 encoding, further optimized for our case.

On rustc 1.44.0-nightly (b543afca9 2020-04-05), Windows 10, i7 4770k @ 4.4GHz (windows/x86-64) the result is as follows:

benchmark          old ns/op     new ns/op     speedup
to_base62()           3343.2         164.6        ~20x

I then went ahead to pick some further low hanging fruit in the other methods, without resorting to LUTs nor SIMD intrinsics. I.e. the change to the base62 encoding algo is the only non-trivial piece of code contained.

benchmark          old ns/op     new ns/op     speedup
from_base62()          362.7          31.2        ~11x
to_base16()            183.4          75.2         ~2x
from_base16()          144.3          27.8         ~5x
from_uri()             759.9          38.3        ~20x
FileId::to_base16()   1905.4          72.0        ~26x

from_uri() had an overhead of ~400ns/op on top of its from_base62() call, which is now reduced to 7ns, with some trivial inline parsing based on our knowledge of the inputs. FileId gain was primarily due to it needlessly allocating 20 Strings - one for each byte.

I included simple tests and rudimentary docs, which felt less welcome than the code - given there are none of either in the entire codebase ;-)

Rationale

I realize librespot itself in a player scenario barely uses those codepaths and that the PR introduces a bit of additional complexity. I would still like to be able to use this as a library and reuse the types in my project, given they do what I want -- just not efficiently enough.

Additional

This conflicts with the change proposed in Ability to get the Spotify URI from a SpotifyId #440 as it includes a to_uri() method. However, it moves part of the logic to the SpotifyAudioType enum (and is 100ns faster, albeit less idiomatic) where I feel it should belong;
If breaking API changes are perfectly fine pre-1.0, I'd like to - in a subsequent PR - as suggested in Podcasts #381 (comment), add a SpotifyObject struct to encapsulate the ID and object type instead.
This would involve:
- swapping the SpotifyAudioType for an SpotifyObjectType enum and filling it with other known types (including non-playable ones), keeping it unitary (with a placeholder Unknown tag for unrecognized types);
- moving to/from_uri() from SpotifyId to SpotifyObject. For consistency I'd love to rename ID to SpotifyObjectId in that case as well;
- adding a SpotifyContext to encapsulate a SpotifyObjectType in a given context (including the URI);
- From and Into implementations could be given and be clear. In this PR I refrained as it is currently not clear whether a SpotifyId should convert to/from an URI or the base62 encoding. That is, if bumping MSRV is a go, as those need to be TryInto/TryFrom;

A 'go' for this would be welcome as I'm eager to get coding.

- perf: * base62 encoding is an order of magnitude faster (~20x); * base16/62 enc/dec and from_uri are several times faster (~2-20x); * Let FileId::to_base16() reuse the hex encoder (~20x); - changes: * Add to_uri() method; * Make from_uri() error handling consistent; * Move audio type from string matching to a SpotifyAudioType factory (private); * Implent From/Into<&str> for SpotifyAudioType; * Add representation sizes as associated constants (private); - cs/docs: * Add rudimentary docs for most public funcs; * Add trivial test cases for the codecs;

willstott101

The actual changes LGTM. I think it's good to have someone working with librespot for non-audio tasks as it can only help make the whole thing more reliable, with fewer assumptions. Thanks for going into such depth.

There are just a couple of comments I've made inline.

W.R.T. your suggestions on where to go next: I think it all sounds reasonable, and inspired by comments made by @ashthespy which can only be good ;)

If you're happy to get going, go for it.

willstott101 · 2020-05-10T17:24:18Z

    }
 }
+
+#[inline]


It's my understanding that #[inline] doesn't do anything unless the function is used across crates without LTO (which it can't be since it's not pub). Is there any point of it here?

willstott101 · 2020-05-10T17:35:12Z

+    }
+}
+
+#[cfg(test)]


I think the added complexity is fine since you've taken the time to add tests.

But could you add running the tests to the travis config?

And if it's not a huge PITA, it would be good to see a commit with the tests passing for the old implementation to prove correctness - at least for the methods that already existed.

ashthespy · 2020-05-11T20:05:33Z

@alcore Sorry I seem to have missed this! Thanks for the effort, and appreciate the benchmarking!
Please go ahead with your SpotifyObjectId and SpotifyContext as well, we can handle the api changes depending on your implementation..

sashahilton00 · 2020-07-24T01:52:14Z

Just triggering a rebuild. The API changes look fine to me as well. @alcore are you planning on refactoring SpotifyObjectId and SpotifyContext as well?

Johannesd3 · 2021-01-23T19:58:54Z

Any reason why this doesn't get merged?

Johannesd3 · 2021-01-24T09:49:06Z

+        let mut id = match SpotifyId::from_base62(&src[id_i..]) {
+            Ok(v) => v,
+            Err(e) => return Err(e),
+        };


This is exactly what the question mark operator does.

Johannesd3 · 2021-01-24T09:49:17Z

+        if src.len() != SpotifyId::SIZE {
            return Err(SpotifyIdError);
        };

-        let mut arr: [u8; 16] = Default::default();
-        arr.copy_from_slice(&data[0..16]);
+        let mut dst = [0u8; SpotifyId::SIZE];
+        dst.copy_from_slice(src);


You could use try_into to convert a slice into a fixed-size array.

sashahilton00 · 2021-01-24T23:30:12Z

Wasn't merged because no response received from OP, and likely requires a review to account for potential conflicts/side effects as described in the original comment. Also API change is breaking, though this is less of an issue.

Johannesd3 · 2021-01-25T16:50:47Z

I don't think the change is breaking, the function signatures are the same. It's a pity that this isn't merged, the code seems well-written and well-documented... If @alcore isn't interested anymore, I will perhaps create a new pr based on this one (if it's ok).

sashahilton00 · 2021-01-25T22:17:33Z

@Johannesd3 feel free. I'm happy to merge a new PR based off this one, but it needs to be updated to work with changes to dev since it was originally submitted I believe.

sashahilton00 · 2021-01-30T02:23:09Z

Closing in favour of #587

willstott101 suggested changes May 10, 2020

View reviewed changes

sashahilton00 closed this Jul 24, 2020

sashahilton00 reopened this Jul 24, 2020

Johannesd3 reviewed Jan 24, 2021

View reviewed changes

sashahilton00 added breaking includes a breaking change enhancement labels Jan 25, 2021

Johannesd3 mentioned this pull request Jan 29, 2021

Refactor SpotifyId #587

Merged

sashahilton00 closed this Jan 30, 2021

sashahilton00 removed the breaking includes a breaking change label Jan 30, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor SpotifyId#459

Refactor SpotifyId#459
alcore wants to merge 1 commit intolibrespot-org:devfrom
alcore:refactor-id

alcore commented Apr 10, 2020

Uh oh!

willstott101 left a comment

Uh oh!

willstott101 May 10, 2020

Uh oh!

willstott101 May 10, 2020 •

edited

Loading

Uh oh!

ashthespy commented May 11, 2020

Uh oh!

sashahilton00 commented Jul 24, 2020

Uh oh!

Johannesd3 commented Jan 23, 2021

Uh oh!

Johannesd3 Jan 24, 2021 •

edited

Loading

Uh oh!

Johannesd3 Jan 24, 2021

Uh oh!

sashahilton00 commented Jan 24, 2021

Uh oh!

Johannesd3 commented Jan 25, 2021

Uh oh!

sashahilton00 commented Jan 25, 2021

Uh oh!

sashahilton00 commented Jan 30, 2021

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Conversation

alcore commented Apr 10, 2020

Rationale

Additional

Uh oh!

willstott101 left a comment

Choose a reason for hiding this comment

Uh oh!

willstott101 May 10, 2020

Choose a reason for hiding this comment

Uh oh!

willstott101 May 10, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ashthespy commented May 11, 2020

Uh oh!

sashahilton00 commented Jul 24, 2020

Uh oh!

Johannesd3 commented Jan 23, 2021

Uh oh!

Johannesd3 Jan 24, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Johannesd3 Jan 24, 2021

Choose a reason for hiding this comment

Uh oh!

sashahilton00 commented Jan 24, 2021

Uh oh!

Johannesd3 commented Jan 25, 2021

Uh oh!

sashahilton00 commented Jan 25, 2021

Uh oh!

sashahilton00 commented Jan 30, 2021

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

willstott101 May 10, 2020 •

edited

Loading

Johannesd3 Jan 24, 2021 •

edited

Loading