This repository has been archived by the owner on Jul 1, 2023. It is now read-only.

True randomization for spaced repetition #89

Closed
Kwintenvdb opened this issue Oct 14, 2020 · 7 comments

Comments

@Kwintenvdb

First of all thanks for creating this wonderful app and making it open source.

I am using it to train on a list of 4000 words sorted by how commonly they are used in the language: the most common words are at the top of the list and the less common ones at the bottom. Since I imported the list in this order through Google Sheets, this is also the order in which the words appear during a spaced repetition review. I don't care much about learning the words in that order, and would prefer a mix of rarer and more common words.

The shuffling of the vocabulary list implemented in 9f84897 isn't true randomization on the database level, since the set of words is first fetched from the database and only shuffled afterwards. In other words, it would fetch the first 10 most commonly used words, then shuffle them. That is not very useful when you're already at an intermediate level of the language.
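To illustrate the difference, here is a minimal sketch using Python's built-in sqlite3 module. The `vocabulary` table, its `freq_rank` column, and both function names are hypothetical and not taken from the app's code; they only mimic a list imported in frequency order.

```python
import random
import sqlite3

# Hypothetical schema: freq_rank mirrors the import order (1 = most common word).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vocabulary (freq_rank INTEGER PRIMARY KEY, word TEXT)")
conn.executemany(
    "INSERT INTO vocabulary (freq_rank, word) VALUES (?, ?)",
    [(i, f"word{i}") for i in range(1, 4001)],
)


def fetch_then_shuffle(limit):
    """Fetch the first `limit` rows, then shuffle them in memory."""
    rows = conn.execute(
        "SELECT freq_rank FROM vocabulary ORDER BY freq_rank LIMIT ?", (limit,)
    ).fetchall()
    ranks = [row[0] for row in rows]
    random.shuffle(ranks)
    return ranks


def random_in_database(limit):
    """Let the database pick `limit` rows in random order."""
    rows = conn.execute(
        "SELECT freq_rank FROM vocabulary ORDER BY RANDOM() LIMIT ?", (limit,)
    ).fetchall()
    return [row[0] for row in rows]


shuffled = fetch_then_shuffle(10)   # always some permutation of ranks 1..10
sampled = random_in_database(10)    # ranks drawn from the whole 1..4000 range
print(sorted(shuffled))
print(sorted(sampled))
```

The first function can only ever return the ten most common words, in a different order each time; the second can return any word in the table.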

I could of course archive the most common words that I already know well, but I think true randomization would still be a very helpful feature.

@minhloi
Collaborator

minhloi commented Oct 15, 2020

I can add randomization at the database level, but it might have a significant performance impact when your database becomes large.

The database engine does not provide a way to pick random rows directly. The only way to get random rows is to assign a random number to each row, sort the whole table by those random numbers, and pick the top rows.

https://stackoverflow.com/questions/580639/how-to-randomly-select-rows-in-sql

@Kwintenvdb
Author

Kwintenvdb commented Oct 15, 2020

I'm not sure I agree that it would be as expensive as you assume. I've prepared a very rudimentary example here: https://www.db-fiddle.com/f/giSzk8GW8ZjicBs3zGc1gm/1

Even ordering 20k rows by random (which is far more data than I would expect the vast majority of users to have in a single set of flashcards) executes in about 10ms here.

Even so, ORDER BY RAND() isn't the only solution, and you could for example also assign a randomly generated number to each flashcard as they are stored in the DB, and sort by those later.
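A minimal sketch of that second idea, again with Python's sqlite3 module. The `flashcards` table, the `shuffle_key` column, the index, and the helper names are all assumptions made for illustration, not the app's actual schema.

```python
import random
import sqlite3

# Hypothetical schema: every flashcard gets a random key when it is stored.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flashcards (id INTEGER PRIMARY KEY, word TEXT, shuffle_key REAL)"
)
# An index on the key lets the database read rows in key order without a full sort.
conn.execute("CREATE INDEX idx_flashcards_shuffle_key ON flashcards (shuffle_key)")


def add_flashcard(word):
    """Store a flashcard together with a randomly generated sort key."""
    conn.execute(
        "INSERT INTO flashcards (word, shuffle_key) VALUES (?, ?)",
        (word, random.random()),
    )


def next_review_batch(limit):
    """Return the first `limit` flashcards in stored-key order."""
    return conn.execute(
        "SELECT id, word FROM flashcards ORDER BY shuffle_key LIMIT ?", (limit,)
    ).fetchall()


for i in range(1000):
    add_flashcard(f"word{i}")

batch = next_review_batch(10)
print(batch)
```

The trade-off is that the order stays the same between queries until the keys are regenerated, whereas ORDER BY RANDOM() gives a fresh order on every query.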

Either way it could very well be implemented as an optional feature in the spaced repetition review so that users can choose whether they want this random ordering or not.

As mentioned, this feature is very important to me, and I'd be willing to help out with the development where needed.

EDIT: I just realized you are querying a local SQLite database instead of the MySQL server in the backend. I've tested the same example on a much larger dataset without any noticeable performance hitch. I cannot attest to how well this performs on mobile devices running this query locally though.

EDIT2: I've tested this on a local Sqlite client running on an iPhone 11 Pro. Doing a SELECT * FROM data ORDER BY RANDOM() LIMIT 50 over a table with 1 million rows takes 130ms. Of course you will probably have a little more overhead from the Sqlite client used in the app, but considering how I wouldn't expect any set of flashcards to have over 10k rows, this should still be blazing fast.
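For anyone who wants to repeat this kind of measurement, here is a rough sketch with Python's sqlite3 module. The `data` table name comes from the query above; the `word` column and the 100,000-row size are assumptions chosen to keep the example quick, and timings will vary by device and SQLite client.

```python
import sqlite3
import time

# In-memory table named "data" as in the query above; the columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (id INTEGER PRIMARY KEY, word TEXT)")
conn.executemany(
    "INSERT INTO data (word) VALUES (?)",
    ((f"word{i}",) for i in range(100_000)),
)

# Time only the random selection itself, not the table setup.
start = time.perf_counter()
rows = conn.execute("SELECT * FROM data ORDER BY RANDOM() LIMIT 50").fetchall()
elapsed_ms = (time.perf_counter() - start) * 1000

print(f"{len(rows)} rows in {elapsed_ms:.1f} ms")
```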

@minhloi
Collaborator

minhloi commented Oct 15, 2020

@Kwintenvdb This is only possible because we are querying everything from the local database. The remote database cannot handle this bottleneck. I think it will work fine for most users (might be slow on old Android devices). This feature is easy to add. I will implement it soon.

@Kwintenvdb
Author

Fantastic! Looking forward to using this feature.

@minhloi
Collaborator

minhloi commented Oct 16, 2020

Fixed in aa1b9a1

@minhloi minhloi closed this as completed Oct 16, 2020
@Kwintenvdb
Author

Awesome! Thanks a lot. How long does it generally take for a new release to be published on the App Store?

@minhloi
Collaborator

minhloi commented Oct 17, 2020

It's already available for Android now. For iOS, it sometimes takes a few days.
