Unrealistically long intervals after rescheduling #49

Closed
mantixero opened this issue Mar 16, 2023 · 30 comments · Fixed by #51, #53 or open-spaced-repetition/fsrs4anki#205
Labels
invalid This doesn't seem right

Comments

@mantixero

mantixero commented Mar 16, 2023

I've been using the FSRS scheduler and helper since mid-December last year (when Anki 2.1.55 came out), but hadn't paid attention to what the longer intervals were for some of my cards, since the intervals for the cards coming up for review seemed realistic enough. However, after updating my scheduler code from v3.9.6 to v3.14.3 last night, making sure to leave in my custom parameters from the optimizer that I ran in December (which were based on ~14 years of reviews), I happened to finally look at some of the intervals for cards I wasn't seeing. To my surprise, the intervals for over 7000 of my cards are longer than a decade, some approaching as much as 50 years. I can tell just from a cursory look at the material on the cards that there's no way I'd remember the information that long (i.e. for the rest of my life). Again, since I only noticed the problem after updating and rescheduling, I can't say exactly when the issue occurred. Here is an example:

egregious-interval_1

I don't see how a card could go from an interval of 2.4 months to 31.17 years after just one review, but it probably has something to do with the calculation of the "s=" value in Custom Data. I'm noticing that most if not all of the cards with decades-long intervals have been reviewed for a period of time, neglected for several years, then manually repositioned as new cards to relearn since I basically needed to start over on my reviews. My Japanese kanji deck in particular has many cards like this, as I relearned kanji several times with years of neglected study in between, but it isn't limited to that one deck, as you can see in the screenshot below:

egregious-interval_list

If you need a copy of my collection to examine, I can provide one.

@L-M-Sherlock
Member

I see a large gap between 2016 and 2023 in your review history. What happened? Why is there not a manual reset in your review history?

@mantixero
Author

mantixero commented Mar 16, 2023

It could be that my way of resetting the card intervals was incorrect. I think what I did when manually resetting was select all relevant cards in the browser, right click, and choose "Forget". However, I don't remember what options were there in Anki at the time, and even if they're the same as now, I don't remember if I checked the box next to "Reset repetition and lapse counts". Whatever I chose, it looks like Anki correctly returned the cards to the "Learn" state, and kept the intervals at realistic values until I used the FSRS Helper. Here's another example of another card with more reviews in its history:

egregious-interval_2

@L-M-Sherlock
Member

I think what I did when manually resetting was select all relevant cards in the browser, right click, and choose "Forget".

If you press "Forget", Anki generates a review log entry with the type Manual, which is visible in the browser. I tried it in my collection:

image

The FSRS4Anki helper already handles this case. But your cards' review logs don't contain a Manual entry, so the rescheduling couldn't handle them correctly.

@mantixero
Author

mantixero commented Mar 17, 2023

It looks like I must have used "Reschedule" for resetting the card intervals after all. According to the Anki changelogs, it seems that the "Forget" option didn't exist until 2.1.41: https://github.com/ankitects/anki/releases/tag/2.1.41

reschedule-split

Also mentioned in that same changelog is added support for searching for rescheduled cards. Whether or not this means that the "Manual" review type was introduced here as an addition to the "Learn" and "Review" types for this purpose is unclear. It does hint that the "Manual" type was added sometime after I performed my reschedulings.

Hopefully this doesn't mean that anyone who rescheduled cards as new via Anki before March 2021 is unable to successfully take advantage of FSRS. Perhaps using a different condition for resetting intervals like "card was returned to the Learn type without a 1 Rating" would be better than relying on a recently added review type like Manual.
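The condition proposed above ("card was returned to the Learn type without a 1 Rating") could be sketched roughly as follows. This is only an illustration, not the helper's actual code: the `Entry` type and `implicit_reset_indices` function are hypothetical, and the revlog type codes (0=Learn, 1=Review, 2=Relearn, 3=Filtered, 4=Manual) are an assumption about how Anki stores review logs.

```python
from typing import NamedTuple

# Assumed revlog type codes: 0=Learn, 1=Review, 2=Relearn, 3=Filtered, 4=Manual.
LEARN, REVIEW, RELEARN, FILTERED, MANUAL = range(5)


class Entry(NamedTuple):
    """Hypothetical, simplified view of one review-log row."""
    type: int    # one of the codes above
    rating: int  # 1=Again .. 4=Easy


def implicit_reset_indices(history: list[Entry]) -> list[int]:
    """Return indices where a card re-enters Learn with no lapse to explain it.

    `history` must be in chronological order. An entry is flagged when it has
    type Learn and the entry just before it was a Review not rated 1 (Again).
    """
    resets = []
    for i in range(1, len(history)):
        prev, cur = history[i - 1], history[i]
        if cur.type == LEARN and prev.type == REVIEW and prev.rating != 1:
            resets.append(i)
    return resets
```

A card whose history goes Learn, Review, Review, Learn, Review would be flagged at the second Learn entry, while a normal lapse (Review rated 1 followed by Relearn) would not be flagged.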

@L-M-Sherlock L-M-Sherlock added the invalid This doesn't seem right label Mar 20, 2023
@L-M-Sherlock
Member

I plan to fix it, so I need some revlogs from your collection. Could you share them with me?

@L-M-Sherlock
Member

Patch: fsrs4anki-helper.zip

@mantixero
Author

mantixero commented Mar 22, 2023

Fantastic! I'll try the patch today, but here's my collection in case you still need it: https://www.mediafire.com/file/g1b88lkkcriqedx/collection-2023-03-22@06-07-31.colpkg/file

Update 1: I tried rescheduling with the zip you provided, and while I have 90 reviews in my kanji deck now instead of 2, there are still the long intervals on many cards, including the 50-year one. If you use my collection, you should be able to reproduce the issue.

Update 2: Looking at which cards in my kanji deck came back for review, a common theme seems to be that they all entered the "Relearn" state at some point due to a 1 Rating in my reviews. However, the thing that the FSRS Helper seems to be ignoring for cards without a Relearn state in their history is the part in brackets, where a card enters a "Learn" state without a 1 Rating (i.e. Rescheduling cards as new via older Anki versions that didn't set the status type to "Manual").

Screenshot 2023-03-22 075103

@L-M-Sherlock
Member

there are still the long intervals on many cards, including the 50-year one

That would be a separate problem. Could you share the analysis generated by the optimizer with me?

@mantixero
Author

mantixero commented Mar 22, 2023

Are those the w values generated from the Google Colab? I haven't run the optimizer since December 2022. Should I run it again?

These are the w values I got in December: 3.6766, 5.0, 3.6745, -1.4617, -3.2559, 0.0071, 2.7289, -0.1992, 1.3385, 2.2487, -0.0375, 0.5479, 0.1641

@L-M-Sherlock L-M-Sherlock linked a pull request Mar 22, 2023 that will close this issue
@L-M-Sherlock
Member

Oh, I found the problem. The optimizer also doesn't know about the resetting of cards. I recommend setting revlog_start_date to '2020-01-01' in the optimizer to avoid this problem.
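For readers unfamiliar with what a start-date cutoff does, here is a minimal sketch of the idea. It is not the optimizer's code: `start_date_to_ms` and `filter_by_start_date` are hypothetical names, the date is interpreted as UTC midnight (the optimizer may handle time zones differently), and it assumes each revlog id is the review's timestamp in epoch milliseconds.

```python
from datetime import datetime, timezone


def start_date_to_ms(date_str: str) -> int:
    """Convert 'YYYY-MM-DD' (taken as UTC midnight) to epoch milliseconds."""
    dt = datetime.strptime(date_str, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(dt.timestamp() * 1000)


def filter_by_start_date(revlog_ids: list[int], start_date: str) -> list[int]:
    """Keep only revlog ids at or after the cutoff.

    Assumes each id is the review time in epoch milliseconds.
    """
    cutoff = start_date_to_ms(start_date)
    return [rid for rid in revlog_ids if rid >= cutoff]
```

Note that a cutoff like this discards every earlier review, including perfectly normal ones, which is the trade-off discussed later in the thread.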

@mantixero
Author

I ran the optimizer again with the new revlog_start_date setting of "2020-01-01", and got this result: 1.9755, 2.285, 2.495, -0.1222, -2.2217, 0.0009, 2.0, -0.0532, 1.5, 2.5552, -0.0921, 0.5222, 1.3403
I'm still using the fsrs4anki-helper.zip you linked, and maybe I need to update that, but after rescheduling with the new w values, I still have 9878 cards with intervals of 10 years or more, with the most egregious intervals shown here:

Screenshot 2023-03-23 092838

@L-M-Sherlock
Member

I will track this problem in a new issue.

@L-M-Sherlock
Member

L-M-Sherlock commented Mar 23, 2023

Update 2: Looking at which cards in my kanji deck came back for review, a common theme seems to be that they all entered the "Relearn" state at some point due to a 1 Rating in my reviews. However, the thing that the FSRS Helper seems to be ignoring for cards without a Relearn state in their history is the part in brackets, where a card enters a "Learn" state without a 1 Rating (i.e. Rescheduling cards as new via older Anki versions that didn't set the status type to "Manual").

I don't know why this card's first review log type is Review.

image

@L-M-Sherlock
Member

And why does this card have a Relearn after a Learn?

image

@mantixero
Author

Yes, both of those look strange. Those reviews were from back in the Anki 1.0 days, so it's hard to say what Anki was thinking at the time. If you need to ignore any review information from Anki 1.0, I'm personally fine with that, although I can't speak for everyone else who has used Anki this long.

@L-M-Sherlock
Member

OK, I decided to reset the card's interval and due date if its revlog is strange.

Could you help me test it?

fsrs4anki-helper.zip

@mantixero
Author

I tried rescheduling all cards in my collection with the .zip file you linked, and there are now 433 kanji cards up for review (compared to 130 or so before), but there are still many cards with long intervals, as you can see:

Screenshot 2023-03-23 111715

@L-M-Sherlock
Member

L-M-Sherlock commented Mar 23, 2023

I tried rescheduling all cards in my collection with the .zip file you linked, and there are now 433 kanji cards up for review (compared to 130 or so before), but there are still many cards with long intervals, as you can see:

I found that every review of these cards has a rating greater than or equal to 3. The long intervals may be reasonable for these cards.

@mantixero
Author

mantixero commented Mar 23, 2023

Whether it's reasonable to assume that I can recall the said information in 10-50 years is something I can only guess at, but I will try to put it in perspective.

I think the chances are increased if my flashcards were only testing recognition, but most if not all of my cards are production (cloze deletion) style cards, which are much more difficult to answer than cards that merely ask if I recognize the information on the front of the cards. I have to recall the correct information from memory and place it in the blanks, not just nod my head in understanding of what is already presented.

What I can say for sure is that, using Anki's v2 scheduler with an interval modifier of 0.82 for many years until December 2022, I had a recall rate of ~93% for mature (cloze deletion) cards. Targeting that same retention rate (0.93) with the FSRS scheduler, I can reasonably guess that my chance of forgetting the information on those cards with 10- to 50-year intervals is significantly greater than 7%, which means my recall rate will end up significantly worse than with Anki's v2 scheduler.

I'm happy with Anki's v2 scheduler results, but the number of reviews during the initial intervals made it hard to study more than 5 to 10 new cards per day, which is why I decided to try FSRS: I was hoping that FSRS could give me similar results with fewer reviews, allowing me to add more new cards per day. I have a hunch that the majority of my reviews could be decreased by adjusting the 3rd-8th review intervals rather than the 9th+ intervals on very mature cards. This is just my guess, and it would take many years of interval testing to prove.

I guess my question would be: does the FSRS algorithm treat cloze deletion cards differently than recognition cards? If not, I think it needs to account for the difference in difficulty between the two styles. Setting Anki's v2 interval modifier to 0.82 was my way of adjusting for the added difficulty of such cards.
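To make the retention argument above concrete (a 0.93 target versus 10- to 50-year intervals), here is a rough sketch of how a target retention maps to an interval. It assumes an exponential forgetting curve in which the "s=" stability value is the number of days after which predicted recall drops to 90%. Whether that matches the scheduler version discussed here is an assumption, and both function names are hypothetical.

```python
import math


def predicted_recall(days_elapsed: float, stability: float) -> float:
    """Predicted recall probability after `days_elapsed` days.

    Assumed curve: recall is exactly 0.9 when days_elapsed == stability.
    """
    return 0.9 ** (days_elapsed / stability)


def interval_for_retention(stability: float, target_retention: float) -> float:
    """Days until predicted recall falls to `target_retention`."""
    return stability * math.log(target_retention) / math.log(0.9)
```

Under this assumed curve, a 0.93 target yields an interval of roughly 0.69 times the stability, so a decades-long interval implies an even larger estimated stability. If the stability estimate is inflated by an abnormal review history, the real recall at the due date would fall well below the target.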

@L-M-Sherlock
Member

does the FSRS algorithm treat cloze deletion cards differently than recognition cards?

FSRS treats all card types consistently.

@mantixero
Author

mantixero commented Mar 24, 2023

That should be fine as long as the intervals are optimized based on my actual historical patterns of remembering/forgetting those same cards. Maybe having to ignore the 10+ years of reviews occurring before 1-1-2020 is why the mature intervals don't seem realistic (greater than 7% chance of forgetting). I'd be happy to work with you in testing various tweaks, because I'd like FSRS to get it right, and there are probably very few research subjects who have 10+ years of SRS data for cloze deletion cards specifically.

@mantixero
Author

I'd like to go back to Anki's v2 scheduler, but even after setting that preference in Anki, I still have all the cards with unrealistically long intervals. How can I get their intervals back down to v2 scheduler values?

@L-M-Sherlock
Member

You can use "Set Due Date" in the card browser.

@mantixero
Author

Wouldn't that just set the due date to something within the next 7 days? I don't want to study 7000 cards this week, I just want the intervals to go back to what they would have been if I had used the v2 scheduler for the past 5 months instead of FSRS.

@L-M-Sherlock
Member

So you want a feature that resets the interval and returns the card to its last normal scheduling. I will deal with it in a new issue.

@user1823
Contributor

user1823 commented Apr 7, 2023

@mantixero

I have a possible solution for the problem of FSRS not working correctly for you.

The issue results from the fact that some (or most) of your cards have abnormal review history, due to which the optimizer is not working correctly.

Also, setting the revlog_start_date to 2020-01-01 in the optimizer doesn't work, probably because it means having to ignore the 10+ years of reviews occurring before 2020-01-01.

So, one possible solution I can think of is that you can move the cards with abnormal review history to another deck and then use only the cards with "normal" review history to train the optimizer. When you do this, keep the revlog_start_date setting at the default value. Probably, this would mean that the optimizer produces a reasonable w value that would produce the expected intervals.

@user1823
Contributor

user1823 commented Apr 7, 2023

If the w value produced by the optimizer is correct, I think that even the cards with abnormal review history would be assigned correct intervals by the helper because of the patches developed by @L-M-Sherlock (#51 and #53).

@mantixero
Author

Thanks for the suggestion. I think your idea would indeed result in more accurate w values, but finding all cards with abnormal review history is quite a daunting task given that I have 58,613 cards in my collection. I would hope that the Helper fixes (#51 and #53) could also be applied to the Optimizer so that incomplete revlogs would be ignored, and only results after a card's last Review to Learning shift would be taken into account.

@L-M-Sherlock
Member

I have a simple solution to filter these cards in the optimizer. Just ignore those cards reviewed before a specific date.

@user1823
Contributor

user1823 commented Apr 8, 2023

I have a simple solution to filter these cards in the optimizer. Just ignore those cards reviewed before a specific date.

This approach doesn't work, because @mantixero did not reset all their cards on a single day.

Also, when @mantixero tried with revlog_start_date in the optimizer set to 2020-01-01, the intervals were still very long. This was probably because more than 10 years of review history was ignored when calculating the w values.

In the patches #51 and #53, you made the helper ignore the problematic part of the revlogs.

Probably, if the optimizer also ignores the problematic part of the revlogs but still trains on the "normal" reviews made during the same time period, it would come up with better w values.
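One way to picture this suggestion: instead of a single global cutoff date, trim each card's history individually so that only the reviews from its last Review-to-Learn shift onward are used for training. The sketch below is hypothetical and is not how the optimizer or patches #51 and #53 are implemented. The row layout `(card_id, timestamp_ms, type, rating)` and the type codes 0=Learn and 1=Review are assumptions.

```python
from collections import defaultdict

LEARN, REVIEW = 0, 1  # assumed revlog type codes


def usable_reviews(rows):
    """Group rows by card and drop everything before the last implicit reset.

    rows: iterable of (card_id, timestamp_ms, type, rating) tuples.
    Returns {card_id: [rows from the last Review -> Learn shift onward]}.
    Cards with no such shift keep their full history.
    """
    by_card = defaultdict(list)
    for row in sorted(rows, key=lambda r: (r[0], r[1])):
        by_card[row[0]].append(row)

    trimmed = {}
    for card_id, history in by_card.items():
        start = 0
        for i in range(1, len(history)):
            # A Learn entry directly after a Review entry marks a reset.
            if history[i - 1][2] == REVIEW and history[i][2] == LEARN:
                start = i
        trimmed[card_id] = history[start:]
    return trimmed
```

Because the trimming is per card, normal reviews made in the same period as the problematic ones are still available for training, which a date-based cutoff cannot guarantee.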

3 participants