[Feature Request] More accurate default parameters using Anki user's data and help from Dae #493
Comments
Even if the data isn't very useful for obtaining better default parameters, it can be useful for spaced repetition research. By the way, here is the link to Dae's comment: open-spaced-repetition/fsrs-rs#95 (comment)
@L-M-Sherlock, once you and Dae aren't so busy, I suggest working on this. Aside from finding more accurate default parameters, this can also help benchmark FSRS and other algorithms more accurately. Currently, the benchmark repo has around 70 collections. If that number increased to 1000, that would be amazing.
I wanted to mention the sample size we need to achieve statistically significant results: assuming 10 million Anki users, with a 95% confidence level and a 5% margin of error, we need 385 collections. With a 3% margin of error, we need 1067 collections. At the current sample size (70) and a 95% confidence level, the margin of error is 11.71%. If you want to play with the values, you can use this online calculator: http://www.raosoft.com/samplesize.html
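For reference, these figures follow from the standard sample-size formula for a proportion at the worst case p = 0.5, with a finite-population correction. Below is a small sketch that reproduces them; the population of 10 million Anki users is the assumption stated above, and everything else is just that textbook formula:

```python
from math import ceil, sqrt

Z = 1.96                 # z-score for a 95% confidence level
P = 0.5                  # worst-case proportion (maximizes the required sample size)
POPULATION = 10_000_000  # assumed number of Anki users, as stated above

def required_sample_size(moe: float) -> int:
    """Collections needed for a given margin of error, with finite-population correction."""
    x = Z ** 2 * P * (1 - P) / moe ** 2
    return ceil(POPULATION * x / (x + POPULATION - 1))

def margin_of_error(n: int) -> float:
    """Margin of error for a sample of n collections."""
    return Z * sqrt(P * (1 - P) / n) * sqrt((POPULATION - n) / (POPULATION - 1))

print(required_sample_size(0.05))    # 385
print(required_sample_size(0.03))    # 1067
print(f"{margin_of_error(70):.2%}")  # ~11.71%
```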
I have prepared a sample set of 20k collections. You can extract it with 'tar xaf ...'. It is a random sample of collections with 5000+ revlog entries, so it should contain a mix of older (still active) users and newer users. Entries are pre-sorted in (cid, id) order. Please download a copy, as I'd like to remove it from the current location in a few weeks. You are welcome to re-host it elsewhere if you wish, but please preserve the LICENSE file if you do so.
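One practical consequence of the (cid, id) pre-sort is that a script can stream one card's history at a time without re-sorting the whole dataset. A minimal sketch of that pattern follows; the file name and column names are hypothetical, since the real layout is whatever the archive's example.py documents:

```python
import csv
from itertools import groupby

# Hypothetical file name and columns; the actual layout is described by the
# archive's own example.py, not by this sketch.
with open("revlog.csv", newline="") as f:
    reader = csv.DictReader(f)
    # Rows are pre-sorted by (cid, id), so groupby yields each card's reviews
    # as one contiguous, chronologically ordered run.
    for cid, rows in groupby(reader, key=lambda row: row["cid"]):
        reviews = list(rows)
        # ... feed `reviews` for this card into the optimizer
```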
That's great, thank you!
Great! I will update the benchmark tomorrow.
I downloaded and unzipped it. Its size is 56.6 GB. The main problem is I don't know the …
Dang. I was too focused on ensuring privacy, and forgot about that part. I will need to rebuild the archive.
Ok, I've replaced the archive with a new version. example.py has been updated, and you can now access next_day_at, which can be used to derive the cutoff hour (see RevlogEntry::days_elapsed).
What about the …
next_day_at can be used to determine the day a review log falls on without ever considering timezone or rollover hour. If the Python optimizer requires a timezone + rollover hour, I presume you could feed it UTC, and then determine the rollover hour in UTC based on next_day_at.
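A minimal sketch of that suggestion, assuming next_day_at is a Unix timestamp in seconds marking when the collection's next day begins (only the field name next_day_at and RevlogEntry::days_elapsed come from the thread; the rest is an assumption):

```python
from datetime import datetime, timezone

def rollover_hour_utc(next_day_at: int) -> int:
    # The hour (in UTC) at which the collection's day rolls over, recovered
    # from next_day_at; this could be fed to the Python optimizer together
    # with a UTC timezone.
    return datetime.fromtimestamp(next_day_at, tz=timezone.utc).hour

def days_before_cutoff(review_ts: int, next_day_at: int) -> int:
    # Whole days between a review and the next-day cutoff, in the spirit of
    # RevlogEntry::days_elapsed: only next_day_at is needed, no timezone or
    # rollover hour. Assumes review_ts is also a Unix timestamp in seconds.
    return (next_day_at - review_ts) // 86400
```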
Maybe rewrite all algorithms (and the benchmarking code) in Rust? Of course, the Rust version of FSRS will be slightly different, and the Rust version of LSTM can be different too, but I think with a dataset this big, speed is more important.
@L-M-Sherlock, to reduce the size of the data, I think you filtered out many revlog entries, such as manual entries, entries before a Forget, outliers, etc. There is no doubt that this was important for this benchmark experiment. However, I think we should preserve a copy of the dataset without filtering any revlog entries for future research.
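To make concrete what such filtering might look like: the sketch below is only an illustration, not the benchmark's actual pre-processing, and treating ease == 0 as the marker for a manual entry (e.g. a Forget) is an assumption about the revlog schema:

```python
def filter_card_revlog(entries):
    # Keep only the reviews after the card's most recent manual entry
    # (assumed here to be marked by ease == 0, e.g. a Forget reset);
    # everything before it no longer describes the card's current memory state.
    last_manual = max((i for i, e in enumerate(entries) if e["ease"] == 0), default=-1)
    return entries[last_manual + 1:]
```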
But my Google Drive doesn't have enough storage space to preserve the copy.
@L-M-Sherlock, since Dae is now working on the Anki 23.12 beta and you have finished benchmarking FSRS v4, please give Dae the new default parameters based on 700+ million reviews.
Which module is related to your feature request?
Scheduler, Optimizer
Is your feature request related to a problem? Please describe.
I can't find the exact comment by @dae, but I'm sure I saw a comment saying that due to the way Anki licensing works, it's possible to use review data from way more users than just those who submitted their collections for research via the Google Form. So it's possible to run the optimizer on hundreds or even thousands of collections. This could help find the best default parameters. Of course, it's hard to say whether it's practically worth it because of diminishing returns as the number of collections increases.