Integrate FSRS into Anki as an optional feature #2443

Closed
dae opened this issue Mar 16, 2023 · 81 comments · Fixed by #2654

Comments

@dae
Member

dae commented Mar 16, 2023

Currently users need to paste code into the custom scheduling section of the deck options in order to use FSRS. It may make sense to integrate this code directly into Anki, to make it easier to use.

According to Jarrett, the scheduling code does not change frequently: #2421 (comment)

Users may wish to apply different weights to different decks. The existing custom scheduling code relies on deck name matches (prefix matches?) in order to determine the weights. We could perhaps store the weights in the deck presets - this would allow a deck and its subdecks to share the same weights. Alternatively the weights could be stored in individual decks, which is the most flexible, but will lead to extra storage. The weights do not appear to be particularly large: #2421 (comment)

This would be an optional feature, and would require the add-on/weights trained externally for now. If it is successful, it may make sense to look into a Rust-based training method in the future, so that users don't need to rely on external sites/programs.

@L-M-Sherlock @RumovZ any thoughts?

@L-M-Sherlock
Contributor

I would like to know where the code of FSRS could be integrated into Anki. FSRS only modifies the customData and the scheduledDays of ReviewState.

@dae
Member Author

dae commented Mar 17, 2023

It would probably require changes to states/review.rs to apply the FSRS scheduling if an option had been set; the actual FSRS code would best be placed in a new file.

@L-M-Sherlock
Contributor

Here is some code that I think is relevant:

scheduled_days: ctx.fuzzed_graduating_interval_good(),

scheduled_days: ctx.fuzzed_graduating_interval_good(),

scheduled_days: ctx.fuzzed_graduating_interval_easy(),

scheduled_days: self.review.failing_review_interval(ctx),

let (hard_interval, good_interval, easy_interval) = self.passing_review_intervals(ctx);

scheduled_days: self.failing_review_interval(ctx),

If FSRS is enabled, the scheduler will call a new function in a file, maybe named fsrs.rs.

@dae
Member Author

dae commented Mar 18, 2023

Do you have thoughts on whether we should put the weights in deck presets or individual decks?

@L-M-Sherlock
Contributor

Do you have thoughts on whether we should put the weights in deck presets or individual decks?

Deck presets are better. The current built-in algorithm also reads its configurations from deck presets.

@L-M-Sherlock
Contributor

it may make sense to look into a Rust-based training method in the future, so that users don't need to rely on external sites/programs.

My friend and I have tried tch, which provides Rust bindings for the C++ API of PyTorch. But it requires libtorch, which is about 200 MB in size.

@dae
Member Author

dae commented Aug 2, 2023

Yeah, that's too large I'm afraid. If it were to get integrated into Anki, it would need to be a smaller library, and preferably one that's implemented in native Rust. I don't know much about machine learning, so I don't know what sort of things your code relies on, or how easy/practical it would be to port the existing code to one of the options listed on https://www.arewelearningyet.com/

@L-M-Sherlock
Contributor

I have tried a small library, tinygrad. Now we can get rid of torch: open-spaced-repetition/fsrs-optimizer-tiny (github.com). It only requires numpy and pandas (and we also can find a lightweight alternative for pandas).

However, tinygrad is Python-based and still in alpha. There is still a long way to go.

@dae
Member Author

dae commented Aug 5, 2023

Sorry, I wasn't clear before - when I mentioned native Rust, I meant rather than a Rust wrapper for a C/C++ library. Python-based libraries aren't really an option, as any code added to Anki's core needs to work on the mobile clients too.

@L-M-Sherlock
Contributor

Rust wrapper for a C/C++ library

Most of them are bindings for TensorFlow and PyTorch, so they are very large. Most native Rust frameworks are experimental.

@user1823
Contributor

user1823 commented Aug 5, 2023

If the devs are successful in incorporating the optimizer into Anki, it would be undoubtedly a huge achievement.

But, for now, I think that the devs should focus on incorporating the scheduler and the features from the helper add-on into Anki.

If the scheduler and helper add-on code is incorporated into Anki, it would

  • make it much easier to configure the scheduler. Currently, many users make mistakes when entering the parameter values (w) into the scheduler code, and many also struggle to configure it differently for different decks.
  • enable things like creating filtered decks based on FSRS properties.

Also, given that training is something that the user doesn't need to do very frequently (only once a month, as L-M-Sherlock suggests), using an external site/program for it would not be very inconvenient.

@L-M-Sherlock
Contributor

L-M-Sherlock commented Aug 7, 2023

I agree with @user1823. @dae, my friend and I have implemented a Rust version of the FSRS scheduler. Here is the repo: https://github.com/open-spaced-repetition/rs-fsrs. Maybe it would be helpful.

@dae
Member Author

dae commented Aug 18, 2023

To summarize what is required here, as I currently understand it:

  • We add a field to DeckConfig to store the vector of weights, which should be a vector of at least a certain length.
  • We introduce a setting in DeckConfig that controls whether FSRS is on (or maybe better: use the presence/absence of the weights?)
  • We update scheduler/states/review.rs:next_states() to provide alternate values when FSRS is enabled, based on the logic in https://github.com/open-spaced-repetition/rs-fsrs. Might be nice to keep the actual FSRS scheduling logic in a separate file or files.

I presume that's the minimum required to enable usage of FSRS without using custom scheduling. The existing add-on could write the weights into the deck config, until we are able to integrate the weight calculation inside Anki directly.
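
The first two bullets above could be sketched roughly as follows. This is only an illustration under stated assumptions: `FsrsPresetConfig`, its field names, and `MIN_WEIGHTS` are hypothetical and do not reflect Anki's actual `DeckConfig` schema, and the minimum length of 17 is simply the number of values in the example configuration posted in this thread. The sketch uses an explicit flag rather than inferring the state from the presence of weights.

```rust
/// Hypothetical subset of a deck preset. The names are illustrative only
/// and are not Anki's actual DeckConfig fields.
#[derive(Debug, Clone, PartialEq)]
pub struct FsrsPresetConfig {
    /// Explicit on/off switch, stored separately from the weights.
    pub fsrs_enabled: bool,
    /// Trained weights; empty until the user or an add-on supplies them.
    pub fsrs_weights: Vec<f32>,
}

/// Assumed minimum length, taken from the 17-value example in this thread.
pub const MIN_WEIGHTS: usize = 17;

impl FsrsPresetConfig {
    /// Returns the weights to schedule with, or None when the caller
    /// should fall back to the existing scheduling logic.
    pub fn active_weights(&self) -> Option<&[f32]> {
        if self.fsrs_enabled && self.fsrs_weights.len() >= MIN_WEIGHTS {
            Some(&self.fsrs_weights)
        } else {
            None
        }
    }
}
```

With a separate flag, switching FSRS off leaves `fsrs_weights` untouched, so switching it back on does not require retraining.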

@L-M-Sherlock
Contributor

L-M-Sherlock commented Aug 18, 2023

  • We add a field to DeckConfig to store the vector of weights, which should be a vector of at least a certain length.

And requestRetention. Here is a typical configuration for FSRS:

{
    "deckName": "global config for FSRS4Anki",
    "w": [0.4, 0.6, 2.4, 5.8, 4.93, 0.94, 0.86, 0.01, 1.49, 0.14, 0.94, 2.18, 0.05, 0.34, 1.26, 0.29, 2.61],
    "requestRetention": 0.9,
    "maximumInterval": 36500,
}

The "deckName" and "maximumInterval" can be removed when we enable usage of FSRS without using custom scheduling.

@RumovZ
Collaborator

RumovZ commented Aug 18, 2023

use the presence/absence of the weights

That would mean you'd lose the weights after toggling FSRS off and on again, though, as far as I understand, which is a bit inconvenient.

I assume the card-specific parameters would remain in custom_data?

@dae
Member Author

dae commented Aug 19, 2023

That would mean you'd lose the weights

Yep, good point.

would remain in custom_data

Yeah, I think that's probably best for now, as a DB schema change will be moderately expensive, and would require changes to the sync protocol.

Of the parameters shown in the screenshot here:

https://forums.ankiweb.net/t/how-to-use-the-next-generation-spaced-repetition-algorithm-fsrs-on-anki/25415/223?u=dae

  • I presume v=helper has been used to identify what wrote the details; maybe we could shorten this to something like v=1 for Anki, or omit it entirely?
  • @L-M-Sherlock what is seed used for? What are its min/max values, and would it be problematic if we used a shorter name?

@L-M-Sherlock
Contributor

  • I presume v=helper has been used to identify what wrote the details; maybe we could shorten this to something like v=1 for Anki, or omit it entirely?

v denotes version. I store it just to help locate problems, because the data has two sources: the scheduler and the helper. If we implement a reschedule feature, it would be useful to store the source.

  • what is seed used for? What are its min/max values, and would it be problematic if we used a shorter name?

The seed is the fuzz factor. The reason I store it is that the random number generators in JavaScript and Python are different. I need to store the fuzz factor to keep the scheduler (JS script) and the helper (Python add-on) consistent.
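
To illustrate why a stored factor keeps two implementations in agreement, here is a minimal sketch. The function `apply_fuzz` and the way the factor maps onto the range are assumptions for illustration, not the code used by the scheduler or the helper: the point is only that any client applying the same stored factor to the same bounds reaches the same interval without needing a shared random number generator.

```rust
/// Picks an interval in [lower, upper] from a stored fuzz factor.
/// Hypothetical helper: the factor is assumed to lie in [0, 1).
pub fn apply_fuzz(fuzz_factor: f32, lower: u32, upper: u32) -> u32 {
    debug_assert!(lower <= upper);
    // Number of candidate intervals in the inclusive range.
    let span = (upper - lower + 1) as f32;
    let offset = (fuzz_factor.clamp(0.0, 1.0) * span).floor() as u32;
    // Guard against a factor of exactly 1.0 stepping past the range.
    (lower + offset).min(upper)
}
```

For example, a stored factor of 0.5 with bounds 10 and 14 always selects 12, whichever client performs the calculation.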

@Expertium

Suggestion: allow users to have different parameters for different decks or apply the same parameters to all decks.

@dae
Member Author

dae commented Aug 21, 2023

That's the plan. #2443 (comment)

@user1823
Contributor

user1823 commented Sep 5, 2023

Congratulations on successfully integrating the FSRS optimizer into Anki (#2633).

For the UX, I have a suggestion.

  • Create a new option near the top of the Deck Options page for enabling/disabling FSRS.
  • Use the state of this option to decide whether to display specific deck options or not. For example,
    • Don't display the Optimize option when FSRS is disabled.
    • Don't display graduating interval, easy interval, new interval, starting ease, interval modifier, easy bonus, and hard interval when FSRS is enabled.

Also, the user should be allowed to set a requestRetention other than the one suggested by the optimizer. The suggested requestRetention is aimed at minimizing the workload but the user must be allowed to adjust the scheduler so as to achieve their learning goals (as is possible via the custom scheduling version of FSRS).

Lastly, I suggest that Check should be renamed to something like Evaluate.

@Expertium

This might be a bit too much to ask, but I think the SuperMemo approach of choosing the requested R (in SuperMemo, "forgetting index" is just one minus requested retention) via a slider is really good. It ensures that the user cannot enter non-numerical characters and also ensures that the user cannot set the requested retention (or forgetting index) higher or lower than certain values. In SuperMemo, the forgetting index cannot be set lower than 3% (requested retention 97%) or higher than 20% (requested retention 80%), and I think that's a perfectly reasonable range and we should do the same.
image

@user1823
Contributor

user1823 commented Sep 5, 2023

I think the SuperMemo approach of choosing the requested R via a slider is really good. … In SuperMemo, the forgetting index cannot be set lower than 3% (requested retention 97%) or higher than 20% (requested retention 80%), and I think that's a perfectly reasonable range and we should do the same.

I agree that we should only allow requestRetention values in the range 80% to 97%. But I don't like the idea of using a slider because using a slider to set a value is less convenient than simply typing it out.

@galantra

galantra commented Sep 5, 2023

It could be both, a slider and a number field.

image

Each would need to adjust automatically to the value of the other.

A click on the circle would set it to the calculated optimal value.

@dae
Member Author

dae commented Sep 5, 2023

Create a new option near the top of the Deck Options page for enabling/disabling FSRS.

A toggle already exists in the advanced section. It would definitely make sense to hide certain options when it is enabled, or display warnings (eg if a learning step is longer than a day and fsrs is enabled, warn the user that it is not advised).

The requested retention is editable by the user, and the plan is to make it use a spinbox like the other number fields, with appropriate min and max limits. I'm not sure we really need a slider in addition to a spinbox.
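
A spinbox with limits amounts to clamping the entered value. The sketch below assumes the 80% to 97% range proposed earlier in this thread; the constant and function names are hypothetical, and the limits Anki finally adopts may differ. It also reports whether the input was changed, so a UI could tell the user instead of silently altering the value.

```rust
/// Assumed limits, taken from the 80%-97% range proposed in this thread.
pub const MIN_RETENTION: f32 = 0.80;
pub const MAX_RETENTION: f32 = 0.97;

/// Returns the value to store and a flag saying whether the input
/// had to be adjusted to fit the permitted range.
pub fn clamp_desired_retention(input: f32) -> (f32, bool) {
    let clamped = input.clamp(MIN_RETENTION, MAX_RETENTION);
    (clamped, clamped != input)
}
```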

@Expertium

Expertium commented Sep 6, 2023

Another suggestion: if it's possible to run the optimizer on each individual deck, then it should run only if the deck has a certain number of reviews (say, 1000), otherwise, if the user is trying to run the optimizer on a deck with very few reviews, an error message should pop up, telling the user that the data is insufficient to find optimal parameters accurately, and then the parameters of the parent deck should be used instead.
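
The fallback proposed above might look something like this. It is a sketch of the suggestion only, not of Anki's behavior: `MIN_REVIEWS`, `WeightSource`, and `choose_weights` are hypothetical names, the threshold of 1000 is the figure floated in the comment, and the optimizer is passed in as a closure so the example stays self-contained.

```rust
/// Threshold suggested in the comment above; an assumption, not a tested value.
pub const MIN_REVIEWS: usize = 1000;

/// Where the returned weights came from, so the UI can explain the outcome.
#[derive(Debug, PartialEq)]
pub enum WeightSource {
    Optimized,
    ParentFallback,
}

/// Runs the optimizer only when there is enough review history;
/// otherwise reuses the parent deck's weights.
pub fn choose_weights(
    review_count: usize,
    optimize: impl FnOnce() -> Vec<f32>,
    parent_weights: &[f32],
) -> (Vec<f32>, WeightSource) {
    if review_count >= MIN_REVIEWS {
        (optimize(), WeightSource::Optimized)
    } else {
        (parent_weights.to_vec(), WeightSource::ParentFallback)
    }
}
```

A caller receiving `WeightSource::ParentFallback` could show the "insufficient data" message described in the comment.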

@Expertium

Perhaps we should be making FSRS a global option instead of per deck-preset? The weights would still be per-preset.

I thought that's the plan.

@user1823
Contributor

I like the idea, but I think it might be problematic when we enable some decks to use SM-2 and some to use FSRS, as the sorting will be incorrect when there's a mix of SM-2 and FSRS decks. Perhaps we should be making FSRS a global option instead of per deck-preset?

In my opinion, this is an issue only when the user has configured overlapping presets such that the parent deck is assigned one preset (where FSRS is enabled), and the subdeck is assigned another preset (where FSRS is disabled).

Making FSRS a global option will work. But for the current issue, I think that it is sufficient to display a warning while setting "Review Sort Order" to Ascending/Descending R if any of the subdecks of the decks affected by the preset have FSRS disabled.

Thinking more about it, the decks affected by a preset can also be changed later. So, the above warning should also be displayed if the user changes the preset used by a deck and this change causes the above problem.

Obviously, all this can be prevented just by making FSRS a global option. But deciding whether to do this depends on finding out how common it is for users to enable FSRS for specific decks only.

@user1823
Contributor

I wanted to ask a question related to presets and preset-specific weights.

Suppose that I have one parent deck that has one preset. Then, I have several subdecks, each with different presets. Can I use the same weights for all these decks? In other words, is it possible to have subdecks use weights of the parent deck even if they have different presets?

The reason for using different presets for subdecks is to control the number of new cards I get from each of them. But, in terms of their difficulty, they have similar cards. Evidence suggests that FSRS works better if you use one set of weights for all the cards rather than several sets of weights if the cards are similar in terms of their difficulty.

@user1823
Contributor

add "Ascending R" and "Descending R" to the "Review sort order" menu

This suggestion reminds me of the discussion that took place while developing the "Advance" and "Postpone" features in the FSRS Helper add-on.

Sorting cards based only on the retention (R) creates problems because different cards may have different request retention. A card may have low R, but this card might not be a priority because it might have low request retention as well. So, it might be better to use the sorting formulas used in postpone.py and advance.py of the helper add-on.

https://github.com/open-spaced-repetition/fsrs4anki-helper/blob/e709cd48910155814a9c212be741afbfd11fb761/schedule/postpone.py#L94-L96

https://github.com/open-spaced-repetition/fsrs4anki-helper/blob/e709cd48910155814a9c212be741afbfd11fb761/schedule/advance.py#L95-L97

@Expertium

I think that's needlessly complicated, and users may start asking whether sorting is bugged if they are shown a card with a low R, then a card with a higher R, and then a card with a lower R again.

@user1823
Contributor

user1823 commented Sep 22, 2023

users may start asking whether sorting is bugged

But I am not suggesting that we should name this new option as Ascending/Descending R. We can continue using the current name (Relative Overdueness) or a similar name because the concept is the same, just the calculation is more accurate. Here, instead of using the scheduled interval (which includes fuzz), we use the calculated interval (which doesn't include fuzz).

If you didn't get what I said, let me tell you that these values are related as follows:
$$\frac{\frac{1}{R} - 1}{\frac{1}{R_0} - 1} = \frac{\text{Elapsed days}}{\text{Calculated interval}}$$

Here, $R$ = current retention and $R_0$ = request retention.
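
The identity can be checked numerically. The sketch below assumes the FSRS v4 forgetting curve, $R(t) = (1 + t / (9S))^{-1}$ with stability $S$, which is not stated in this thread; under that assumption the calculated interval is the time at which $R$ falls to $R_0$. The function names are illustrative.

```rust
/// Assumed FSRS v4 forgetting curve: R(t) = 1 / (1 + t / (9 * S)).
pub fn retrievability(elapsed_days: f64, stability: f64) -> f64 {
    1.0 / (1.0 + elapsed_days / (9.0 * stability))
}

/// Interval at which R falls to the request retention (no fuzz, no rounding).
pub fn calculated_interval(stability: f64, request_retention: f64) -> f64 {
    9.0 * stability * (1.0 / request_retention - 1.0)
}

/// Left-hand side of the identity: (1/R - 1) / (1/R0 - 1).
pub fn overdueness_from_retention(r: f64, r0: f64) -> f64 {
    (1.0 / r - 1.0) / (1.0 / r0 - 1.0)
}
```

With $S = 10$ days and $R_0 = 0.9$, the calculated interval is 10 days, so a card last reviewed 15 days ago has a ratio of 1.5 whether it is computed from $R$ and $R_0$ or from elapsed days over the calculated interval.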

@Expertium

If it's just a different way of calculating overdueness, then I guess it's fine. @meliache what do you think?

@meliache
Contributor

meliache commented Sep 23, 2023

@Expertium I kept out of the discussion because much more informed people got involved. I was not aware of the equivalence that @user1823 mentioned; I didn't think that, with FSRS, relative overdueness sorting and R sorting (given a fixed $R_0$) are equivalent.

Not using fuzz in the calculation seems to make sense. So I now agree we should keep the relative overdueness sorting option by name as-is and just consider improving the calculation based on the FSRS-calculated interval instead of the scheduled interval (or equivalently via this $\frac{1/R-1}{1/R_0-1}$ formula).

Beyond fuzz another difference between the two sides of the equation above can arise when the FSRS deck contains cards where the interval was not scheduled by FSRS. With the custom scheduling method only new cards are rescheduled and with the FSRS helper there is the option to only reschedule cards studied in the last 7 days. So currently you can have an FSRS deck where old cards use intervals scheduled by the old algorithm. Not sure how that will be when FSRS is in the core.

Also of course there is the option to manually set the due-date of a card, should that affect the "relative overdueness"? It would be weird if that action affects the "relative overdueness" sorting for non-FSRS cards but not for FSRS cards. Therefore if possible I would prefer sorting based on a relative overdueness based on the ideal calculated (not actually scheduled) interval for all cards, even non-FSRS ones. If we do that, of course that should be documented in the info popup card for the sorting options.

@dae
Member Author

dae commented Sep 23, 2023

Suppose that I have one parent deck that has one preset. Then, I have several subdecks, each with different presets. Can I use the same weights for all these decks? In other words, is it possible to have subdecks use weights of the parent deck even if they have different presets?

With the v3 scheduler, you can use the same preset for all of your decks, and give each deck its own daily limit. If you wished to keep using separate presets, you'd need to copy the weights into each preset, and use a custom search when computing them.

@Expertium

Also of course there is the option to manually set the due-date of a card, should that affect the "relative overdueness"?

@L-M-Sherlock what do you think? If we change sorting by relative overdueness so that it uses current R and requested R when FSRS is enabled, how should this be handled?

@L-M-Sherlock
Contributor

@user1823
Contributor

user1823 commented Sep 24, 2023

I was trying the Beta release and noticed two issues:

  1. The weights generated by Anki are very different from those generated by the python optimizer.

    • Weights generated by Anki: 1.1408, 2.1175, 9.0984, 90.9615, 4.7957, 1.1697, 1.2771, 0.0010, 1.6411, 0.1000, 1.0769, 2.1627, 0.0928, 0.4849, 1.0269, 0.0000, 3.4083
    • Weights generated by python optimizer: 1.1366, 1.4954, 20.8954, 41.342, 4.5015, 1.6955, 2.114, 0.0, 1.7707, 0.1551, 1.1898, 1.5944, 0.1889, 0.6854, 0.0115, 0.0, 6.6447
    • Deck to reproduce: Default.zip (change file extension to .apkg)
    • Timezone (for Python optimizer): Asia/Calcutta
    • Other settings are Default

    This issue is not caused by the bug where Anki calculates weights for all the reviews (not respecting the preset).

  2. Anki doesn't seem to reschedule the cards after updating the weights and/or desired retention.

@Expertium

Wait, beta is already released? Can you give me a link?
Also, it's probably better to create a new issue for everything related to FSRS and the beta version

@user1823
Contributor

Wait, beta is already released? Can you give me a link?

https://github.com/ankitects/anki/releases/tag/23.10beta1

@dae
Member Author

dae commented Sep 24, 2023

Re: rescheduling - I wonder if that's the best idea? The advantage of the current approach is that users can enable FSRS without penalty. I don't want them to enable FSRS and then suddenly have hundreds more reviews to do - I feel like that should be some separate operation.

@user1823
Contributor

I feel like that should be some separate operation.

I agree.

@Expertium

Expertium commented Sep 24, 2023

Here's my feedback so far

  1. Maximum interval and desired retention do not have descriptions, even though hovering the mouse over them shows a question mark (left clicking on them does nothing)
    image

  2. The number of reviews is the same for any preset, meaning that the optimizer doesn't actually work on a per-preset basis, and goes through the entire collection instead.
    image

  3. This one is especially surprising: I get absolutely horrendous mouse stuttering (I'm not sure how else to describe it) in this Anki version. I have a 144 hz monitor and it feels like 30 hz when I'm moving my mouse. I never had this issue with Anki before.

  4. As user1823 noted, there is no option to reschedule all cards.

Right now, I would consider this version to be borderline unusable.

@user1823
Contributor

After clicking Compute with this search string, Anki automatically entered these weights.

image

These weights don't seem to be the default weights. So, are these weights calculated from this one card only? If so, I suggest entering the default weights (instead of the computed weights) when the number of reviews is small.

@L-M-Sherlock
Contributor

2. The number of reviews is the same for any preset, meaning that the optimizer doesn't actually work on a per-preset basis, and goes through the entire collection instead.

You need to enter a search query to collect the training review logs, such as deck:"xxxx".

@user1823
Contributor

I have some more suggestions (for the UX):

  • show a popup when the user sets desired retention beyond the permissible range. "The permissible range for desired retention is 0.80 to 0.97." Currently, the value changes without any feedback. This seems to be a poor UX.
  • The same goes for calculating optimal retention. Show a tooltip saying, "The calculated optimal retention has been entered into the desired retention box."
  • Remove the 100.00% that remains below the compute button. I initially thought that the calculated optimal retention was 100% for me. If you don't want to remove it, change it to "XX.XX% complete."

Also, a question: How to search for memory states in the browser?

The number of reviews is the same for any preset, meaning that the optimizer doesn't actually work on a per-preset basis, and goes through the entire collection instead.

@Expertium, this bug has been fixed and the next beta should work fine. Use preset:"Preset name" until then.

@L-M-Sherlock
Contributor

4. As user1823 noted, there is no option to reschedule all cards.

It's automatic. When you modify the weights and click save, Anki will re-calculate all cards' memory states.

@user1823
Contributor

It's automatic. When you modify the weights and click save, Anki will re-calculate all cards' memory states.

It re-calculates only the memory states, not the intervals and due dates.

@Expertium

You need to enter a search query to collect the training review logs, such as deck:"xxxx".

That is very confusing, and not all users will know the syntax for that. It would be much better to just make the optimizer work on the selected preset.

@L-M-Sherlock
Contributor

Also, a question: How to search for memory states in the browser?

For example: prop:s<5, prop:d>3, prop:r<0.5.

not the intervals and due dates.

Yeah, but it wouldn't induce backlog, like auto-reschedule after each review.

@Expertium

Question: does this version use online learning, or do you need to manually re-run the optimizer occasionally?

@L-M-Sherlock
Contributor

By the way, this issue has gone off-topic. I recommend posting your questions and feature requests in the forum: https://forums.ankiweb.net/

@L-M-Sherlock
Contributor

L-M-Sherlock commented Sep 24, 2023

Question: does this version use online learning, or do you need to manually re-run the optimizer occasionally?

It still requires the user to re-run the optimizer occasionally.

@dae
Member Author

dae commented Sep 25, 2023

I get absolutely horrendous mouse stuttering

Perhaps changing the video driver in the preferences screen will help. If not, try the qt5 version.
