Achieve parity with the Python optimizer #88

dae · 2023-09-28T05:17:10Z

Currently the generated weights from this crate are slightly behind the Python optimizer. Not urgent, but in the long run, it would be nice if this crate could perform as well.

https://github.com/open-spaced-repetition/fsrs-benchmark#weighted-by-number-of-reviews

Any thoughts on what might be driving the differences?

L-M-Sherlock · 2023-09-28T05:34:38Z

The current outlier filter only removes outliers in the pretrain set. The Python version removes them in the train set, too: https://github.com/open-spaced-repetition/fsrs-optimizer/blob/94be1a22d93e84e121186ea7181f55bc826639d3/src/fsrs_optimizer/fsrs_optimizer.py#L654-L664
The Python version splits the train set, optimizes the weights for each split, and averages the resulting weights to get the final result: https://github.com/open-spaced-repetition/fsrs-optimizer/blob/94be1a22d93e84e121186ea7181f55bc826639d3/src/fsrs_optimizer/fsrs_optimizer.py#L1050-L1065
The Python version applies additive smoothing in the pretrain step: https://github.com/open-spaced-repetition/fsrs-optimizer/blob/94be1a22d93e84e121186ea7181f55bc826639d3/src/fsrs_optimizer/fsrs_optimizer.py#L847-L849
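
To make the second and third points concrete, here is a minimal sketch of averaging per-split weights and of additive smoothing. It is not the code from fsrs-optimizer: the names `average_weights` and `smoothed_recall`, the pseudo-count `alpha`, the prior value, and the sample numbers are all assumptions made for illustration.

```python
from statistics import fmean


def average_weights(per_split_weights):
    """Element-wise mean of the weight vectors found on each split."""
    if not per_split_weights:
        raise ValueError("need at least one split")
    length = len(per_split_weights[0])
    if any(len(w) != length for w in per_split_weights):
        raise ValueError("all weight vectors must have the same length")
    return [fmean(column) for column in zip(*per_split_weights)]


def smoothed_recall(successes, total, prior_recall, alpha=1.0):
    """Additive smoothing: pull an observed recall rate toward a prior.

    `alpha` acts as a pseudo-count of reviews at the prior rate
    (hypothetical default; the real optimizer may use another value).
    """
    return (successes + alpha * prior_recall) / (total + alpha)


# Hypothetical weights obtained from three splits of the train set.
splits = [[1.0, 2.0, 4.0], [2.0, 2.0, 5.0], [3.0, 2.0, 6.0]]
final_weights = average_weights(splits)

# A group with 1 success out of 1 review no longer reports 100% recall.
small_sample = smoothed_recall(successes=1, total=1, prior_recall=0.5)
```

The smoothing matters most for groups with very few reviews, where a raw success rate of 0% or 100% would otherwise dominate the pretrain fit.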

user1823 · 2023-09-28T09:11:04Z

The current outlier filter only removes outliers in the pretrain set. The Python version removes them in the train set, too

In my opinion, the current outlier filter is too aggressive (it removes too many reviews). This is fine for the pretrain function (because the pretrain method is fragile). However, the train function should have access to more reviews. So, I think that we should build a less aggressive outlier filter for the train function.

There is no hurry in doing this. But, I wanted to make this point here so that you don't waste time and effort in applying the current outlier filter to the trainset.

user1823 · 2023-10-02T05:37:48Z

The Python optimizer selects the parameters from the epoch with the minimum loss. I don't think that the Rust optimizer behaves similarly.

Related commit: open-spaced-repetition/fsrs4anki@1719dae
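
As a rough sketch of what "select the parameters from the epoch with the minimum loss" means, the loop below keeps a copy of the best parameters seen so far. The function `train_keep_best` and its `step` and `evaluate` callbacks are hypothetical stand-ins, not the Python optimizer's actual code, and which loss is monitored is also an assumption.

```python
import copy


def train_keep_best(params, epochs, step, evaluate):
    """Run `epochs` updates and return the parameters with the lowest loss.

    `step(params)` returns updated parameters; `evaluate(params)` returns
    a loss. Both are hypothetical callbacks standing in for a real loop.
    """
    best_params = copy.deepcopy(params)
    best_loss = evaluate(params)
    for _ in range(epochs):
        params = step(params)
        loss = evaluate(params)
        if loss < best_loss:
            best_loss = loss
            best_params = copy.deepcopy(params)
    return best_params, best_loss


# Toy setup: training descends toward x = 4, but the monitored loss is
# minimal at x = 3, so later epochs are worse than an earlier one.
def evaluate(p):
    return (p[0] - 3.0) ** 2


def step(p):
    grad = 2.0 * (p[0] - 4.0)
    return [p[0] - 0.25 * grad]


best_params, best_loss = train_keep_best([0.0], epochs=5, step=step, evaluate=evaluate)
```

Here the last epoch ends near x = 3.875, but the returned parameters are the ones from the second epoch, where the monitored loss was lowest.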

L-M-Sherlock · 2023-10-02T06:08:57Z

The Python optimizer selects the parameters from the epoch with the minimum loss. I don't think that the Rust optimizer behaves similarly.

I also noticed this difference, but the framework doesn't allow me to implement a similar feature.

dae · 2023-10-02T06:14:34Z

We could always switch to our own training loop if need be. @nathanielsimard I presume something like this is not supported out of the box at the moment?

nathanielsimard · 2023-10-02T18:10:08Z

I'm going to prioritize adding early stopping within burn-train. I think it's time we support that feature!

nathanielsimard · 2023-10-02T18:14:15Z

Burn Issue: tracel-ai/burn#841

user1823 · 2023-10-21T16:16:18Z

In case you aren't already aware, I wanted to let you know that Burn now supports early stopping when training, implemented in tracel-ai/burn#878. Probably, now is the time to implement early stopping in the Rust version of the FSRS optimizer.
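
For readers unfamiliar with the idea, early stopping can be sketched as a patience counter over the per-epoch loss. This is a generic illustration and not Burn's API: the function `epochs_until_stop`, the `patience` parameter, and the loss values are assumptions.

```python
def epochs_until_stop(losses, patience):
    """Return how many epochs run before training stops.

    Training stops once `patience` consecutive epochs fail to improve
    on the best loss seen so far.
    """
    best = float("inf")
    stale = 0
    for epoch, loss in enumerate(losses, start=1):
        if loss < best:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return len(losses)


# Hypothetical per-epoch losses: improvement stalls after the third epoch.
history = [0.9, 0.7, 0.6, 0.65, 0.61, 0.62, 0.5]
stopped_at = epochs_until_stop(history, patience=2)
```

Note that stopping early and returning the best-epoch parameters are separate concerns; a stopped run still needs to keep the parameters from its best epoch.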

By the way, the comparison between the Python and the Rust versions of the optimizer is somewhat unfair.

The Python version filters several reviews before (training and) evaluation which are not filtered out by the Rust version. Examples include:

Reviews of cards that are filtered by the outlier filter in pre-train.
Reviews of cards with incomplete revlogs.

In my opinion, evaluation should happen for these reviews, which means that the behavior of the Rust version is desirable. However, we should keep in mind that it is inconsistent with the Python version.

dae · 2023-10-22T00:12:18Z

tracel-ai/burn#882 will need to be merged before we can update burn

L-M-Sherlock · 2023-10-22T13:19:38Z

The Python version filters several reviews before (training and) evaluation which are not filtered out by the Rust version. Examples include:

The Python version doesn't filter any reviews before training and evaluation in the benchmark. Here is the code:

https://github.com/open-spaced-repetition/fsrs-benchmark/blob/8aacff613116d70ab21325628235a38024d57d48/script.py#L120-L148

dae mentioned this issue Sep 30, 2023

Run retention calculation in parallel #91

Merged

asukaminato0721 mentioned this issue Oct 1, 2023

Laplace smoothing #94

Merged

L-M-Sherlock mentioned this issue Oct 1, 2023

Feat/stratified k fold #95

Merged

L-M-Sherlock linked a pull request Nov 20, 2023 that will close this issue

Feat/filter outlier in trainset #119

Merged

L-M-Sherlock closed this as completed in #119 Nov 21, 2023

This was referenced Nov 21, 2023

Feat/filter outlier in trainset #119

Merged

Better outlier filter for trainset #121

Closed
