Spaced repetition system #11

Open
hwgilbert16 opened this issue Jun 24, 2023 · 6 comments · May be fixed by #158
Labels
help wanted Contributors are welcome on this request large This will be a large addition upcoming feature This feature has been accepted and will be worked on

Comments

@hwgilbert16
Owner

A spaced repetition system (SRS) is a large part of many other flashcard applications, such as Anki. Scholarsome’s implementation of an SRS will be more user-friendly while still being just as powerful.

@hwgilbert16 hwgilbert16 added the upcoming feature This feature has been accepted and will be worked on label Jun 24, 2023
@hwgilbert16 hwgilbert16 added discuss Ideas are needed before work can be done help wanted Contributors are welcome on this request large This will be a large addition labels Jul 9, 2023
@hwgilbert16 hwgilbert16 mentioned this issue Jul 9, 2023
5 tasks
@hwgilbert16
Owner Author

See comment for info on the status of SRS #58 (comment)

@otfried

otfried commented Feb 8, 2024

I have used a number of different flash card apps with spaced repetition, and for me the one offered by Memrise works best. Unfortunately, they are no longer interested in user-made or community decks...

I never got used to Anki - it requires too much effort to classify cards into very easy - easy - hard - very hard, etc., and the repetition interval grows too fast - very soon I had intervals of two years or so :-(

My suggestion would be to stick with "Know" and "Don't know" when testing words. Memrise stores the time a card was last tested, the total number of times it was tested, how many times the user answered correctly, the current repetition interval (which seems to have been tuned quite nicely), and whether the word is "difficult" (this seems to be based on the number of "Don't know" answers, perhaps as a percentage of the number of tests.) You can then later review the difficult words separately, and in smaller groups.

@hwgilbert16
Owner Author

> I never got used to Anki - it requires too much effort to classify cards into very easy - easy - hard - very hard, etc., and the repetition interval grows too fast

Yes, this is something I have been wrestling with, and it is why SRS has been in the works for over 6 months.

The issue you mention with Anki is the exact reason why I haven't implemented their method of SRS. When you're working on scales of thousands of cards, having to measure your response between 5 different options for every card requires a lot of thinking and is inefficient.

> Memrise stores the time a card was last tested, the total number of times it was tested, how many times the user answered correctly, the current repetition interval

Sadly, this isn't a perfect method either. Memrise's method of repetition is based on machine learning. Realistically, this isn't practical to implement within Scholarsome. The reason it's so effective is that it collects many different factors and uses all of them to gauge how well content is being learned. It's unrealistic to expect people who self-host to dedicate a significant amount of computational power to training a model, considering it would have to be done for every single deck and for every single user.

When an SRS does eventually get implemented, it will likely be reminiscent of Anki but with some tweaking and with significant user customizability. I'm thinking the Leitner system will be the best one to implement.
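
For reference, the core of the Leitner system fits in a few lines: a card moves up one box when it is known and drops back to the first box when it is not, and higher boxes are reviewed less often. The sketch below is only an illustration of that rule, not Scholarsome code. The names `LeitnerCard` and `review` and the choice of five boxes are assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical number of boxes; Leitner setups commonly use three to five.
NUM_BOXES = 5


@dataclass
class LeitnerCard:
    term: str
    box: int = 1  # new cards start in the first (most frequently reviewed) box


def review(card: LeitnerCard, knew_it: bool) -> LeitnerCard:
    """Apply the Leitner rule to a single card after one review."""
    if knew_it:
        # Promote by one box, but never past the last box.
        card.box = min(card.box + 1, NUM_BOXES)
    else:
        # A missed card goes all the way back to the first box.
        card.box = 1
    return card
```

Only two answers ("Know" and "Don't know") are needed, which avoids the multi-option grading discussed above. How often each box is reviewed would be the user-configurable part.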

In the future, it's possible an ML-based algorithm will be developed and offered as opt-in, but I don't foresee this happening for a long while.

@otfried

otfried commented Feb 9, 2024

I have used Memrise for 10 years, and never noticed it using machine learning. Do they claim they do? I suspect that's a marketing gimmick - there have never been any changes in the intervals, independent of whether I was active or not, and they are the same in all of my (many) decks.

Their algorithm is actually quite simple - one figures out the intervals they are using after a while, and one can actually verify them by looking at the network traffic.

For each card, Memrise stores the following attributes:

  1. creation time
  2. time the card was last tested
  3. time the card is due to be tested
  4. total number of times the card was tested (incremented every time, even within one session)
  5. total number of times the answer was correct in a test
  6. streak length (incremented for every correct test, reset on incorrect test)
  7. the current interval

The intervals they use are:
4h 12h 1d 6d 12d 24d 48d 96d 180d

A new card gets a streak of 0 and an interval of 4h. On each correct test, the streak increases by one, the interval is increased to the next one, but 180d is the last value - it does not increase from there.

On an incorrect test, the interval is set to 12h (unless it already was 12h, then it is set to 4h). The streak length is then set to -1. The card will be retested later in the same session, until it is answered correctly - then the streak will be set to 1 again. (So a test session does not end until all cards in the test have been answered correctly. If one aborts a test session, then the -1 streak length stays on the card and indicates that it needs to be tested immediately again in the next test session.)
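
The rules above can be sketched roughly as follows. This is an illustration of the behaviour described in this comment, not Memrise's actual code. `CardState` and `record_answer` are made-up names, and two details are assumptions because the description leaves them open: the due time is taken to be the test time plus the new interval, and a correct answer on a card with streak -1 sets the streak to 1 without advancing the interval.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Interval ladder as observed in this thread (not an official specification).
INTERVALS = [
    timedelta(hours=4), timedelta(hours=12), timedelta(days=1),
    timedelta(days=6), timedelta(days=12), timedelta(days=24),
    timedelta(days=48), timedelta(days=96), timedelta(days=180),
]


@dataclass
class CardState:
    """The seven per-card attributes listed above."""
    created: datetime
    last_tested: Optional[datetime] = None
    due: Optional[datetime] = None
    total_tests: int = 0
    total_correct: int = 0
    streak: int = 0
    interval: timedelta = INTERVALS[0]  # a new card starts at 4h


def record_answer(card: CardState, correct: bool, now: datetime) -> CardState:
    card.total_tests += 1  # counted on every test, even within one session
    card.last_tested = now
    if correct:
        card.total_correct += 1
        if card.streak < 0:
            # Assumption: the in-session retest clears the -1 marker and
            # keeps the already-reset interval.
            card.streak = 1
        else:
            card.streak += 1
            idx = INTERVALS.index(card.interval)
            # Move to the next interval; 180d is the ceiling.
            card.interval = INTERVALS[min(idx + 1, len(INTERVALS) - 1)]
        # Assumption: due time is the test time plus the current interval.
        card.due = now + card.interval
    else:
        # Reset to 12h, or to 4h if the interval already was 12h.
        if card.interval == INTERVALS[1]:
            card.interval = INTERVALS[0]
        else:
            card.interval = INTERVALS[1]
        card.streak = -1  # marks the card as needing an immediate retest
        card.due = now
    return card
```

Making `INTERVALS` a per-user or per-deck setting would be enough to let users tune the schedule.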

Memrise adds some additional bells and whistles: For instance, it has the concept of "nearly correct test" - when the edit distance between correct answer and the entered answer is small enough, it accepts that as nearly correct. In that case, the streak is reset, a new test is scheduled for 4h later, but you actually keep your current interval - that's why the difference between last tested and due to be tested is not always the interval. Memrise also has an algorithm to detect "difficult" cards. I think both "nearly correct" and "difficult" are not really important.
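
If one did want a "nearly correct" category, a plain Levenshtein edit distance would be one way to do it. The sketch below is an assumption-laden example, not what Memrise does: the threshold `max_distance=1`, the case-insensitive comparison, and the names `edit_distance` and `grade` are all hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, or substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


def grade(answer: str, expected: str, max_distance: int = 1) -> str:
    """Classify an answer; the threshold is a hypothetical default."""
    # Assumption: surrounding whitespace and letter case are ignored.
    given = answer.strip().lower()
    target = expected.strip().lower()
    if given == target:
        return "correct"
    if edit_distance(given, target) <= max_distance:
        return "nearly_correct"
    return "incorrect"
```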

The Memrise intervals work quite well for me. Notice the big gap between 1d and 6d. But of course one could actually make the sequence of interval times configurable for the user.

I have used something like the Leitner system when I studied with paper cards many years ago. Sebastian Leitner did not have a computer and electronic cards, so he designed a system that achieves the effect of spaced learning using paper cards and boxes. Today, with the ability to store attributes for our cards, we can simply store the due time for each card, and when creating a test session, we query the database for all cards whose due time has passed. That also works when users do not practice at regular intervals (regular practice is the assumption underlying Leitner's implementation).
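
As a rough sketch of the due-time approach, here is what the query could look like with an in-memory SQLite database. The table and column names (`cards`, `due_at`) and the helper functions are invented for this example and say nothing about Scholarsome's actual schema. Due times are stored as Unix timestamps so that the comparison is a plain integer comparison.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical cards table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE cards ("
        "id INTEGER PRIMARY KEY, term TEXT NOT NULL, due_at INTEGER NOT NULL)"
    )
    return db


def add_card(db: sqlite3.Connection, term: str, due_at: datetime) -> None:
    # Store the due time as whole seconds since the Unix epoch.
    db.execute(
        "INSERT INTO cards (term, due_at) VALUES (?, ?)",
        (term, int(due_at.timestamp())),
    )


def due_cards(db: sqlite3.Connection, now: datetime) -> list:
    """Return the terms of all cards whose due time has passed, oldest first."""
    rows = db.execute(
        "SELECT term FROM cards WHERE due_at <= ? ORDER BY due_at, id",
        (int(now.timestamp()),),
    )
    return [row[0] for row in rows]
```

A user who skips a week simply finds more rows returned by `due_cards` at the next session; nothing else needs to be rescheduled.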

@hwgilbert16
Owner Author

> Do they claim they do? I suspect that's a marketing gimmick - there have never been any changes in the intervals, independent of whether I was active or not, and they are the same in all of my (many) decks.

I must have misread the basis of their algorithm. You're right, it seems they do not integrate any ML features.

The method you describe for Memrise seems promising. Looks like it fixes many of Anki's shortfalls and adds some more on top of it. I'll definitely take this into account when I restart work on the SRS implementation. Thank you for the information.

@hwgilbert16 hwgilbert16 removed discuss Ideas are needed before work can be done on hold labels Apr 15, 2024
@hwgilbert16 hwgilbert16 linked a pull request May 26, 2024 that will close this issue
@gabbard

gabbard commented Jun 2, 2024

I'd recommend looking at the FSRS algorithm used by recent versions of Anki, which is pretty simple to implement and is almost universally considered greatly superior to their older algorithm (in my own experience it greatly lowered my review burden while maintaining retention rates): https://github.com/open-spaced-repetition/fsrs4anki/wiki/The-Algorithm

Projects
Status: Long-term Todo

3 participants