-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Review logs that requires special treatment when calculating the memory states from review history #63
Comments
I thought we were covering such cases in convert_to_fsrs_items() and the ported version in Anki. Is that not the case? |
There are some difference between the filters in FSRS Optimizer (implemented in Pandas) and FSRS Helper (implemented in std lib). I cannot guarantee that |
Well, perhaps we should :-) If we have one function that deals with all of this, we'll know our inputs are safe to feed into any of our API. |
Yeah. And it's tricky, because I'm not very familiar with the legacies of the system of revlogs. I will collect some cases for test. |
But I don't know the structure of input from Anki. Could you give me an example? |
Yes. Is it possbile to output the review history of a card in its Rust structure? I can write an add-on to print the review logs of a card, but it's a list of object in Python rather than Rust. |
There's partial info available via card stats (assuming card id 1): mw.col.backend.card_stats(1).revlog To see the full Rust structure, try the above after running Anki with the following patch: diff --git a/rslib/src/stats/card.rs b/rslib/src/stats/card.rs
index d058c2233..7884a02c9 100644
--- a/rslib/src/stats/card.rs
+++ b/rslib/src/stats/card.rs
@@ -94,6 +94,7 @@ fn average_and_total_secs_strings(revlog: &[RevlogEntry]) -> (f32, f32) {
fn stats_revlog_entry(
entry: &RevlogEntry,
) -> anki_proto::stats::card_stats_response::StatsRevlogEntry {
+ dbg!(entry);
anki_proto::stats::card_stats_response::StatsRevlogEntry {
time: entry.id.as_secs().0,
review_kind: entry.review_kind.into(), These exceptions might be better handled not in the fsrs-rs crate, but in the Anki crate, since they're Anki-specific. Anki's port of the conversion code is here: https://github.com/ankitects/anki/blob/3742fa9f0c3118de1c23bf27198975eea62e09bb/rslib/src/scheduler/fsrs/weights.rs#L98 |
I plan to write some temporary tests in fsrs-rs and remove them after you implement the conversion code in Anki. |
That's fine if you find it easier, but it might be more efficient to add the tests directly to the anki crate if you're able to build it. Up to you though - either way works for me. :-) |
You can refer to these tests. |
@dae, I think that there must be some differences between the optimizer and the re-scheduler. The reason:
|
I made some comparisons of the approaches used in the optimizer and the rescheduler for dealing with the special cases. Here is the result of my analysis. Reviews of the same card on the same dayShould be dealt similarly in the rescheduler and the optimizer - keep only the first review and filter out the rest
Reviews in filtered deck when reschedule cards based on my answers in this deck is disabled.Should be dealt similarly in the rescheduler and the optimizer - Filter out these reviews
Manual SchedulingShould be dealt similarly in the rescheduler and the optimizer
First review type is not LEARN / Incomplete review historyDealt differently in reschedule & optimize.
Suspended cardsDealt differently in reschedule and optimizer
|
Here: Lines 61 to 66 in 4df08dd
I have refactored them in this branch: https://github.com/open-spaced-repetition/fsrs-rs/blob/Test/deal-with-revlog-from-Anki/src/temporary_test.rs |
Thanks, that's helpful. I've just pushed a draft of Anki integration you can play with, and I'll try return to this tomorrow. |
I think we may be ok now?
|
According to open-spaced-repetition/fsrs4anki-helper#16 (comment), @user1823 proposed that it is better to skip only those cards whose first review kind isn't learning AND which don't have any again rating. For example: |
Do you mean the starting point should be min(last_learning_idx, first_again_idx)? For both training and review? |
I think it only works for review. Maybe @user1823 has more ideas. And I think it's OK to skip them both in training and review, because these cases are very rare. |
The main reason this might occur is for long-term / heavy Anki users, who have removed the oldest few years of their review history to keep their collection size down & stay inside the AnkiWeb sync limits. Quite rare, but might be best to deal with. |
Yeah. For these cases, we should treat the first relearning review as the first learning review. |
To filter out just the CRAM revlogs, doesn't the condition need to be the following? entry.review_kind != Filtered || entry.ease_factor != 0 |
|
@dae, is this case (incomplete review history) dealt with in the latest beta of Anki? I suggest using this approach to deal with such cards. But obviously, if you have a better idea to deal with them, it would be fine too. |
I've changed the code in beta 2 so that a lack of learning steps no longer causes the memory state to be as if the card is new. If you have further suggestions, I'm open to them. |
If all such cards are rescheduled, it can lead to problem like this (where the interval decreased from 1.04 years to just 2 days). Edit: I have also verified that this problem exists in Beta 2. But, we also don't want to treat all such cards as new because then FSRS wouldn't work for them. So, I suggest re-reading this comment of mine, where I have discussed the issue in a greater detail. |
Once Anki no longer calculates the memory states for such cards, we will also need to figure out how to deal with such cards when they come up for review. I didn't face this problem till now because I do my reviews on AnkiDroid where FSRS is not supported and SM-2 schedules the cards instead. But when, FSRS is the only scheduler available, we will need to make a special function for such cards. Possible options:
https://github.com/open-spaced-repetition/fsrs4anki/blob/5a34201f4d5b3a9cc1a5da4511b6bc32b7c6e909/fsrs4anki_scheduler.js#L228-L241 For a textual description of this function, refer to https://github.com/open-spaced-repetition/fsrs4anki/wiki/How-does-the-scheduler-work%3F#calculate-memory-states |
Re https://forums.ankiweb.net/t/anki-23-10-beta/34912/38 and comments above: During inference, I think we may need to produce a memory state if the card has any reps on it, as it will be surprising to the user otherwise. If the user has cards with a truncated review history, or cards they used 'set due date' on to introduce without going through the new queue, excluding cards without learning steps would mean such cards could never have a memory state assigned, and the initial stability/difficulty would be used at each review based on the current rating, so intervals would never grow. We could change the criteria from "must have learning" to "must have Again", but that would mean the ratings would not grow above the initial intervals until the user rates such cards Again, which will be confusing and look like a bug in FSRS/Anki. Wouldn't it be better to just use whatever review came first? In the set-due-date case that would correctly reflect what would have happened if the user had gone through the new queue, and in the truncated revlog case, the initial stability may be wrong, but is hopefully shifted back towards actual by the subsequent reviews in the history. Regarding writing a separate function instead, that would be a bit tricky, as the current code assumes a missing memory state when a card is not new means the memory state needs to be calculated from the revlog. I think it might be simpler to try and handle such cards in the same way other cards are handled (i.e. giving them an approximate memory state). WDYT? |
Here is a method to convert Anki's interval and ease into FSRS's stability and difficulty: https://github.com/open-spaced-repetition/fsrs4anki/wiki/How-does-the-scheduler-work%3F#interval-to-stability |
fsrs-js didn't have access to the revlog history, which we do here - do you still feel converting from SM-2 ease is safer than trying to infer the memory state from incomplete logs? No preference on my end, just curious. |
There are too many complex cases for incomplete logs. It's hard to deal with those corner cases one by one. So I prefer to convert memory state from SM-2 ease and interval. |
I completely agree. With this, we can also get rid of the check for the presence of an To summarise, we are going to deal with such cards in the following way:
@L-M-Sherlock, what about using this approach in the add-on too (for Anki versions up to 2.1.66)? |
Another question that needs to be answered: Which interval is used in the following line of code? The interval before the first rating? Or the interval obtained after the first rating? Line 88 in d389762
In the first case, we will obtain the memory states before the first rating, and we will have to calculate the memory states after the first rating using the regular FSRS formulas. In my opinion, the first option is better because the memory states will tend to be more accurate (the earlier we can calculate the memory states, the more accurate the results will be) and it can deal with cards having no review history at the time of review (explained in the next para). Consider the following card. I reviewed it today and it had no review history before today. Assume that I had started using native FSRS yesterday. Now, if Anki doesn't use the first approach (calculating memory states from interval and ease before the first rating), it won't be able to schedule this card. However, I think that the interval before the first review will be available only during review (due to technical limitations). So, we might want to use a hybrid approach.
|
Context: open-spaced-repetition/fsrs-rs#63 (comment) I think that I have added most of the code required to make it work. You will just need to do some final touches so that it can become functional. I hope that editing this PR requires less effort than writing all the code yourself.
The memory state is derived when saving weights or when reviewing a card without state, so it should work as you expect. |
Cards without any review history don't get updated by the deck options, but when it comes time to review, the memory state is first derived from the ease factor and interval, and then that state is changed by the rating the user provided. |
Reviews of the same card in the same day
FSRS doesn't accept interval < 1, so we need to filter out those review logs with the same date and only keep the first one.
Reviews in filtered deck when
reschedule cards based on my answers in this deck
is disabled.These revlogs should be filtered out from review history.
These revlogs'
review_kind
isREVLOG_CRAM
, so it's easy to filter out them.Manual reset
There are two kinds of manual reset:
Set Due Date
andForget
. Theirreview_kind
isREVLOG_RESCHED
. But the fronter's ease (factor) is nonzero. The latter's ease (factor) is zero.Set Due Date
These revlogs should be filtered out from review history, too. Because I think it's equal to CTRL+J (Reschedule repetition) in SuperMemo: https://supermemopedia.com/wiki/Ctrl%2BJ_vs._Ctrl%2BShift%2BR
Forget
We should remove the review logs in front of
forget
andforget
itself. Then we need to treat the first review log afterforget
as the first learning, whose rating will be used to initialize the memory state.Incomplete revlogs
Due to some legacies, The manual reset were not recorded in the old version of Anki/AnkiDroid. So we need to deal with this case via another way. Normally, the previous revlog of a
REVLOG_LRN
shouldn't beREVLOG_REV
orREVLOG_RELRN
. If so, we need to remove those revlogs. In a word, we should find out the the first REVLOG_LRN in the last successiveREVLOG_LRNs
Besides, some users may use Transfer scheduling data from one card to another, which would generate more weird review history.
We need to deal with it case by case. Another more convenient method is to skip these cards.
@user1823 and @Expertium, if I miss anything, please let me know.
The text was updated successfully, but these errors were encountered: