AnkiMorphs V3 megathread #222

Closed
mortii opened this issue Apr 15, 2024 · 6 comments

mortii commented Apr 15, 2024

There are some improvements to the current features and design that will break backwards compatibility, and they should be released at the same time:

  • alternative algorithm (alternative algorithm #191)
  • settings design adjustments (settings redesign/adjustments #218)
  • improve discard changes message box
  • improve new configs detected message box
  • Add option to disable keyboard shortcuts (Add option to disable keyboard shotcuts #244)
  • Toolbar
    • Rename "U" and "A" in the toolbar -> "L" and "I" (lemma and inflection).
    • Add options to hide either of those stats from the toolbar ^
    • Add option to hide the recalc link
  • add new tag (Fresh vocab note TAG request #235)
  • Tags reset (option to remove tags that ankimorphs uses. #209)
    • create background operation tag remover
    • display warning dialog with option to reset tags when
      • changing morphemizer
      • changing field
      • switching from lemma to inflection evaluation
        • when switching same session
        • when switching after saving 'lemma' option
  • don't trigger tags reset after discarding changes
  • Future proof the config variable names by making them more abstract. Prefixing the tab name (e.g. recalc_offset_new_cards) makes them too rigid. Fixing this would cause a very hard backwards compatibility break.
  • changes to extra fields (Changes to extra-fields #221)(Suspend Short Sentences #195)
    • add "all morphs" field
    • add "all morphs count" field
    • redo "unknown morphs" field
    • redo "unknown morphs count" field
    • rename am-unknowns -> am-unknown-morphs
    • rename am-unknowns-count -> am-unknown-morphs-count
    • add "am-study-morphs" (previously am-unknowns) (AnkiMorphs V3 #224, reply in thread)
    • update browse am-unknowns -> browse am-study-morphs
  • card data
    • update only when state is 'new'
      • due and tags
      • am-all-morphs
      • am-all-morphs-count
      • am-study-morphs
      • am-score
      • am-score-terms
    • update when state is 'review'
      • potentially remove 'am-not-ready'
      • potentially remove 'am-ready'
      • potentially add 'am-fresh-morphs'
      • potentially remove 'am-fresh-morphs'
    • always update
      • am-unknown-morphs
      • am-unknown-morphs-count
      • am-highlighted
  • always update both lemma and inflection learning intervals
  • check for lemma in text_highlighting.py
  • update known morphs reader
  • readability report uses learning interval based on user config
  • study plan builder:
    • remove morph-inflection header from only lemma file
    • remove *-priority columns
  • known morphs exporter
    • optional inflection column
    • optional occurrences column
    • update headers
  • rework tests
    • activate mypy
    • give fake configs dynamic keys
    • recalc
      • lemma score == inflection score
      • lemma known -> inflection known
      • offset due
        • lemma
        • inflection
      • known morphs
        • successfully raised exception
        • successfully read
          • old known morphs format
          • new known morphs format
      • ignore names
      • monolith collection
      • default morph priority (Fixed morph priority bug #253)
    • morph priority
      • new frequency files format
        • lemma and inflection columns
          • evaluate on lemma
          • evaluate on inflection
        • only lemma column
          • evaluate on lemma
      • new study plan format
        • minimal -> evaluate on lemma
        • full -> evaluate on inflection
      • raise exception when
        • no headers
        • minimal formats -> evaluating on inflection
        • full study plan -> evaluating on lemma
      • card collection priority
        • evaluate on lemma
        • evaluate on inflection
    • review
    • text highlighting
    • readability report
      • evaluate based on lemma
      • evaluate based on inflection
    • generating known morphs file
      • lemma + count
      • lemma + inflection + count
    • generating study plan
      • lemma only
      • lemma + inflection
    • add and integrate pytest-cov
  • remove fields from the Cards sql-table
  • bump anki version to 24.06.3 (7839431)
  • anonymize arguments instead of deleting them
  • clean up caching.py
  • clean up morph_priority_utils.py
    • make the error messages more specific
    • splinter
  • Add link to "(none) note filter" error message #227
  • add config tests
  • misc algorithm tweaking
    • weights
      • add weights for all the terms
        • $W_{\text{total}}^{\text{all}}$
        • $W_{\text{total}}^{\text{unknown}}$
        • $W_{\text{total}}^{\text{learning}}$
        • $W_{\text{average}}^{\text{all}}$
        • $W_{\text{average}}^{\text{learning}}$
        • $W_{\text{target}}^{\text{all}}$
        • $W_{\text{target}}^{\text{learning}}$
      • increase the weight spinboxes max value to 999
      • rename the weight configs to include the word 'weight'
      • add a note about why there is no $W_{\text{average}}^{\text{unknown}}$
    • coefficients
      • make algorithm coefficients double spinboxes
      • change the step size to '0.1'
      • make the respective configs floats
      • add a new algorithm test with floats
    • targets
      • rename 'target distance' -> 'target difference'
    • change the default values
      • learning morphs distance a coefficients -> 6
      • both target weights -> 10
      • update test collections
  • add dependabot ignore
    • PyQt6
    • PyQt6-Qt6
    • PyQt6-WebEngine
    • PyQt6-WebEngine-Qt6
    • PyQt6_sip
    • numpy
  • update code comments in card_score.py
  • update shortcuts
    • recalc
    • settings
    • progression
  • update min size of settings window
  • update terminology: frequency file -> priority file
    • change directory name in anki profile
    • replace in the guide
    • change frequency file generator
      • rename
      • change default name of output file
    • update tests
  • tweak progression window layout
  • update guide
    • algorithm explanation
    • settings
      • index
      • general tab
      • note filters
      • extra fields
        • am-all-morphs
        • am-all-morphs-count
        • am-score
        • am-score-terms
        • am-unknowns -> am-unknown-morphs
        • am-unknowns-count -> am-unknown-morphs-count
        • am-study-morphs
        • update card template with am-study-morphs
      • tags
      • preprocess
      • card handling
      • algorithm
      • toolbar
      • shortcuts
      • remove recalc tab
    • changes to anki
    • prioritizing
      • frequency files have an increased max size
      • generate new frequency files
        • Cantonese
        • Catalan
        • Chinese
        • Croatian
        • Danish
        • Dutch
        • English
        • Finnish
        • French
        • German
        • Greek (Modern)
        • Italian
        • Japanese
          • anime
          • news
        • Korean
        • Lithuanian
        • Macedonian
        • Norwegian
        • Polish
        • Portuguese
        • Romanian
        • Russian
        • Slovenian
        • Spanish
        • Swedish
        • Ukrainian
    • known morphs
    • generators
    • tweak progression window guide
    • reset tags
  • manual testing on anki 24.06.3
    • macos
      • install from update
        • new configs detected
      • pop-up windows
        • settings
        • reset tags
        • progression
        • generators
        • recalc error: note filter guide
        • invalid priority file
    • windows 11
      • install from update
        • new configs detected
      • pop-up windows
        • settings
        • reset tags
        • progression
        • generators
        • recalc error: note filter guide
        • invalid priority file
  • update version string
  • remove all print statements
  • update contributors
  • rebase/squash and merge
  • add arabic diacritical marks to space morphemizer (Arabic diacritical marks splitting single words into two separate morphs #250)
  • release updated card collection
@mortii mortii added the enhancement New feature or request label Apr 15, 2024
@mortii mortii added this to the v3.0.0 milestone Apr 15, 2024
@mortii mortii self-assigned this Apr 15, 2024
@mortii mortii pinned this issue Apr 15, 2024

mortii commented Apr 16, 2024

This issue is mostly a placeholder/tracker for the roadmap. Any feedback/questions/discussions can be had here: #224, or in any of the relevant discussions/issues linked above.

Repository owner locked and limited conversation to collaborators Apr 16, 2024

mortii commented May 18, 2024

Okay, I have a (very badly coded) test version that has implemented the new algorithm and an option to choose between lemma priority and inflection priority!

This version can be found in the latest algorithm branch, or you can download it from google drive (github doesn't like .addon files): ankimorphs-v3-0-0-testing-1

The algorithm's "morph targets" settings look complicated, but all you do is define a range that receives no punishment. On either side of that range you can specify a punishment curve (ax^2+bx+c). The default is this:

Screenshot from 2024-05-18 11-49-09

(graph link)

Any and all feedback would be very much appreciated!

Originally posted by @mortii in #191 (comment)

Changelog

Frequency files

Screenshot from 2024-05-20 11-35-58

Frequency files now have two acceptable versions, a minimal version and a full version:

  • Minimal: only contains a column of morph lemmas. This is useful for inflection-heavy languages like Korean, where inflections are basically just noise, and it results in significantly smaller file sizes.
  • Full: contains morph lemma, inflection, and their respective priorities. This format allows for easy switching between lemma and inflection priorities, and it is recommended for most languages.
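
As a rough illustration of how the two formats could be told apart, here is a minimal Python sketch that inspects the header row of a CSV file. The header names (`lemma`, `inflection`) and the function `detect_format` are hypothetical and chosen only for this example; they are not taken from the AnkiMorphs source.

```python
import csv
import io

# Hypothetical header names, used only for this illustration.
LEMMA_HEADER = "lemma"
INFLECTION_HEADER = "inflection"


def detect_format(csv_text: str) -> str:
    """Return "minimal" or "full" based on the header row of the file."""
    reader = csv.reader(io.StringIO(csv_text))
    headers = next(reader, None)
    # A file without a usable header row cannot be classified.
    if not headers or LEMMA_HEADER not in headers:
        raise ValueError("file has no recognizable header row")
    # A full file also carries an inflection column.
    return "full" if INFLECTION_HEADER in headers else "minimal"
```

A minimal file would only be usable when evaluating on lemmas, whereas a full file supports either choice.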

Algorithm settings tab

Screenshot from 2024-05-20 11-38-07

It seems rather complicated, but hopefully it's simpler than it appears. Let's break it down.

Morph Priority

  • lemma: all morphs that have the same lemma have the same priority, e.g.:
    • "walk": 53
    • "walked": 53
    • "walking": 53
  • inflection: different inflections have different priorities, e.g.:
    • "walk": 53
    • "walked": 345
    • "walking": 763
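
The difference between the two options can be sketched in Python as a lookup that either ignores or uses the inflection. The dictionaries and the function `morph_priority` below are hypothetical and only reuse the "walk" numbers from the example above; the actual add-on may store priorities differently.

```python
# Illustrative data taken from the "walk" example above.
LEMMA_PRIORITIES = {"walk": 53}
INFLECTION_PRIORITIES = {
    ("walk", "walk"): 53,
    ("walk", "walked"): 345,
    ("walk", "walking"): 763,
}


def morph_priority(lemma: str, inflection: str, evaluate_lemma: bool) -> int:
    """Look up a morph's priority by lemma only, or by (lemma, inflection)."""
    if evaluate_lemma:
        # Every inflection of the same lemma shares one priority.
        return LEMMA_PRIORITIES[lemma]
    # Each inflection has its own priority.
    return INFLECTION_PRIORITIES[(lemma, inflection)]
```

With this data, "walked" resolves to 53 under lemma evaluation and to 345 under inflection evaluation.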

Weights

The first pass of the algorithm contains 5 adjustable terms:

    score = (
        unknown_morphs_total_priority_score
        + all_morphs_avg_priority_score
        + all_morphs_total_priority_score
        + learning_morphs_target_difference_score
        + all_morphs_target_difference_score
    )

The respective weights allow you to adjust how much impact each term has on the score. If you don't want a term to have any impact, you can turn its weight all the way down to zero.
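
One way to picture the weights is as multipliers applied to each term before summing. The sketch below assumes exactly that; the function `weighted_score` and the dictionary layout are hypothetical, and the real implementation may combine the terms differently.

```python
def weighted_score(terms: dict[str, float], weights: dict[str, float]) -> float:
    """Sum the score terms, each scaled by its weight (assumed multiplicative)."""
    return sum(weights[name] * value for name, value in terms.items())


# Example values are made up for illustration.
terms = {
    "unknown_morphs_total_priority_score": 10.0,
    "all_morphs_avg_priority_score": 5.0,
}
weights = {
    "unknown_morphs_total_priority_score": 2.0,
    "all_morphs_avg_priority_score": 0.0,  # a weight of zero removes the term
}
score = weighted_score(terms, weights)
```

Here the second term contributes nothing because its weight is zero, so only the first term affects the result.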

Morph Targets

This allows you to define an acceptable range where no punishment is given at all, i.e. the ideal sentence length. On either side of that ideal length you can customize how much punishment will be given, by providing the coefficients of a basic quadratic equation: ax^2+bx+c

The default morph target punishments graph looks like this:
Screenshot from 2024-05-18 11-49-09
Where:

  • the ideal morph length is between 4-6 (no punishment)
  • if the morph length is greater than 6, give a quadratic punishment that grows with each extra morph (1x^2+0x+0)
  • if the morph length is less than 4, give a constant punishment for each missing morph, i.e. a linear curve (0x^2+1x+0)
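
The default curve can be sketched in Python as follows. This is only an illustration: it assumes x is the number of morphs outside the target range, and the function and parameter names are hypothetical rather than taken from the AnkiMorphs source.

```python
def target_difference_penalty(
    morph_count: int,
    lower_target: int = 4,
    upper_target: int = 6,
    above_coeffs: tuple[float, float, float] = (1.0, 0.0, 0.0),
    below_coeffs: tuple[float, float, float] = (0.0, 1.0, 0.0),
) -> float:
    """Return the punishment for a card containing `morph_count` morphs."""
    # Inside the target range there is no punishment at all.
    if lower_target <= morph_count <= upper_target:
        return 0.0
    if morph_count > upper_target:
        # x = number of extra morphs above the range
        x = morph_count - upper_target
        a, b, c = above_coeffs
    else:
        # x = number of missing morphs below the range
        x = lower_target - morph_count
        a, b, c = below_coeffs
    # Evaluate the quadratic ax^2 + bx + c
    return a * x**2 + b * x + c
```

Under these assumptions, a card with 8 morphs (2 above the range) gets a punishment of 4, while a card with 2 morphs (2 below the range) gets a punishment of 2.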

"am-score-terms" extra field

Screenshot from 2024-05-20 12-20-15

This new extra-field shows all the individual terms of the score given to cards, making it easier to debug/adjust the algorithm to fit your needs.

Screenshot from 2024-05-20 12-22-53


mortii commented May 20, 2024

New testing build: 3.0.0-testing-2 (google drive)

Changelog

  • "lower morph targets" settings are now capped to always be lower than the "upper morph targets"


mortii commented Jun 20, 2024

New testing build: 3.0.0-testing-3 (google drive)

Alternatively, check out the known-lemma branch.

Changelog

  • if "Morph Priority: Lemma" is selected:
    • cards can now be skipped on review if their morphs' lemmas are already known
    • all inflections are set to known during recalc if their lemma is known
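
The second point can be illustrated with a small Python sketch: given a set of known lemmas, every (lemma, inflection) pair whose lemma is in that set is treated as known. The function `known_inflections` and the data layout are hypothetical and not taken from the add-on's recalc code.

```python
def known_inflections(
    morphs: list[tuple[str, str]], known_lemmas: set[str]
) -> set[tuple[str, str]]:
    """Return the (lemma, inflection) pairs whose lemma is already known."""
    return {
        (lemma, inflection)
        for lemma, inflection in morphs
        if lemma in known_lemmas
    }


# Made-up example data.
morphs = [("walk", "walked"), ("walk", "walking"), ("run", "ran")]
result = known_inflections(morphs, known_lemmas={"walk"})
```

In this example both inflections of "walk" end up known, while "ran" does not because its lemma "run" is not in the known set.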

In future versions the "Morph Priority: Lemma" option will be renamed, and it will be moved to a new "General" tab in the settings.

EDIT:
This version has a bug in the calculation of the morph priorities. This has been fixed in the v3 branch which will hopefully be released shortly.


mortii commented Jul 31, 2024

V3 will be released on Friday at ~20:00 GMT+1

If you are upgrading directly from any of the test versions, you will probably experience a crash because of a change in the configs. To fix or prevent that crash, please do the following:

1. Restart Anki without add-ons (hold the shift key while opening Anki)
2. Restore the default configs of 'AnkiMorphs' (and 'ankimorphs' if you have that):
   Tools -> add-ons -> select add-on -> config -> restore defaults
3. Delete the 'ankimorphs_profile_settings.json' file in the Anki profile folder
4. Restart Anki


mortii commented Aug 2, 2024

@mortii mortii closed this as completed Aug 2, 2024
@mortii mortii unpinned this issue Aug 2, 2024