[prakriya] Optimize the tripadi module #12

Closed
akprasad opened this issue Dec 28, 2022 · 1 comment
Labels
performance Enhancement that improves performance without changing functionality

Comments

@akprasad
Contributor

Profiling indicates that the tripadi module is slow.

Many of the rules in the tripadi need to iterate over every character in the string so that they can apply various sandhi changes. Currently, we create a new CompactString for each of these rules. My rough guess is that we create a dozen such strings for each word we derive, even if none of the rules have scope to apply. CompactString shouldn't heap-allocate in most cases, but the copy work required here is still slow.

Once profiling confirms that these copies are the problem, we should avoid them. Two approaches come to mind:

  1. Instead of creating a new string, iterate over the Term strings and manage indices carefully.

  2. Store one copy of the string and rebuild it only if a rule applies. The code would follow the basic pattern of ItPrakriya, e.g., by extending the Prakriya struct with new data and helper methods.
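As a rough illustration of approach (1), here is a minimal sketch that walks the characters of each term in place and reports positions as (term index, byte offset) pairs, so no joined string is ever built. The `Term` struct, `char_positions`, and `find_first` are hypothetical stand-ins for illustration, not the project's actual types or helpers.

```rust
/// Hypothetical stand-in for the real Term type; only the text matters here.
struct Term {
    text: String,
}

/// Iterates over every character of every term without building a joined string.
/// Yields (term index, byte offset within that term, character).
fn char_positions(terms: &[Term]) -> impl Iterator<Item = (usize, usize, char)> + '_ {
    terms.iter().enumerate().flat_map(|(i, t)| {
        t.text.char_indices().map(move |(offset, c)| (i, offset, c))
    })
}

/// Returns the (term index, byte offset) of the first character matching `pred`.
fn find_first(terms: &[Term], pred: impl Fn(char) -> bool) -> Option<(usize, usize)> {
    char_positions(terms)
        .find(|&(_, _, c)| pred(c))
        .map(|(i, offset, _)| (i, offset))
}
```

The cost of this style is that every rule has to reason about term boundaries itself, which is the "manage indices carefully" part.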

I think (2) is generally cleaner, and it has the side effect of improving our APIs.
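A minimal sketch of what (2) could look like, assuming a simplified `Prakriya` that caches the joined text. The field `cached` and the methods `try_rule` and `rebuild` are hypothetical names for illustration; the real struct and the ItPrakriya pattern may differ.

```rust
/// Hypothetical stand-in for the real Term type.
struct Term {
    text: String,
}

/// Simplified Prakriya that keeps one cached copy of the joined text.
struct Prakriya {
    terms: Vec<Term>,
    cached: String,
}

impl Prakriya {
    fn new(terms: Vec<Term>) -> Self {
        let cached: String = terms.iter().map(|t| t.text.as_str()).collect();
        Prakriya { terms, cached }
    }

    /// Borrowed view of the cached text; no copy is made.
    fn text(&self) -> &str {
        &self.cached
    }

    /// Runs `rule` only if `matches` accepts the cached text, and rebuilds
    /// the cache only when the rule reports that it changed something.
    fn try_rule(
        &mut self,
        matches: impl Fn(&str) -> bool,
        rule: impl FnOnce(&mut [Term]) -> bool,
    ) -> bool {
        if !matches(&self.cached) {
            return false; // Rule has no scope to apply: no copy, no rebuild.
        }
        let changed = rule(&mut self.terms);
        if changed {
            self.rebuild();
        }
        changed
    }

    /// Refills the cache from the terms, reusing the existing buffer.
    fn rebuild(&mut self) {
        self.cached.clear();
        for t in &self.terms {
            self.cached.push_str(&t.text);
        }
    }
}
```

With this shape, a rule that does not match costs one read of the cached text rather than a fresh string per rule.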

akprasad added the performance label on Dec 28, 2022
@akprasad
Contributor Author

This has been fixed locally. I don't see a performance improvement, sadly, but the resulting API is cleaner.
