[prakriya] Optimize the `tripadi` module #12

akprasad · 2022-12-28T19:10:20Z

Profiling indicates that the tripadi module is slow.

Many of the rules in the tripadi need to iterate over every character in the string so that they can apply various sandhi changes. Currently, we create a new CompactString for each of these rules. My rough guess is that we create a dozen such strings for each word we derive, even if none of the rules have scope to apply. CompactString shouldn't heap allocate in most cases, since short strings are stored inline, but the copy work required here is still slow.

Once we confirm that this is a problem with profiling, we should avoid the extra copies here. Two approaches that come to mind:

1. Instead of creating a new string, iterate over the Term strings and manage indices carefully.
2. Store one copy of the string and rebuild it only if a rule applies. The code would follow the basic pattern of ItPrakriya, e.g., by extending the Prakriya struct with new data and helper methods.
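
A rough sketch of what the first approach might look like, to make the index bookkeeping concrete. Everything here is an assumption for illustration: `Term`, `char_indices`, `find_char`, and `replace_first` are hypothetical names rather than the actual crate API, and a plain `String` stands in for `CompactString`.

```rust
/// Hypothetical stand-in for a term in a derivation.
struct Term {
    text: String,
}

/// Yields `(term index, byte offset within that term, char)` for every
/// character across all terms, without building a concatenated string.
fn char_indices(terms: &[Term]) -> impl Iterator<Item = (usize, usize, char)> + '_ {
    terms
        .iter()
        .enumerate()
        .flat_map(|(i, t)| t.text.char_indices().map(move |(j, c)| (i, j, c)))
}

/// Returns the `(term index, byte offset)` of the first occurrence of `target`.
fn find_char(terms: &[Term], target: char) -> Option<(usize, usize)> {
    char_indices(terms)
        .find(|&(_, _, c)| c == target)
        .map(|(i, j, _)| (i, j))
}

/// Replaces the first occurrence of `from` with `to` directly in the owning
/// term. Returns whether a change was made. No temporary string is created.
fn replace_first(terms: &mut [Term], from: char, to: char) -> bool {
    match find_char(terms, from) {
        Some((i, j)) => {
            let mut buf = [0u8; 4];
            let replacement: &str = to.encode_utf8(&mut buf);
            terms[i].text.replace_range(j..j + from.len_utf8(), replacement);
            true
        }
        None => false,
    }
}
```

The cost of this approach is that every rule has to reason in `(term, offset)` pairs, which is where the careful index management comes in.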

I think (2) is generally cleaner, and it has the side effect of improving our APIs.
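
A minimal sketch of the second approach, under stated assumptions: `Term`, `CachedPrakriya`, `replace_first`, and the `rebuilds` counter are hypothetical names invented for this example and are not the existing Prakriya or ItPrakriya API. A plain `String` stands in for `CompactString`.

```rust
/// Hypothetical stand-in for a term in a derivation.
struct Term {
    text: String,
}

/// Hypothetical wrapper that keeps one cached copy of the joined text.
struct CachedPrakriya {
    terms: Vec<Term>,
    /// Concatenation of all term texts, rebuilt only after a change.
    joined: String,
    /// Number of times `joined` was rebuilt (kept only for illustration).
    rebuilds: usize,
}

impl CachedPrakriya {
    fn new(terms: Vec<Term>) -> Self {
        let mut p = CachedPrakriya { terms, joined: String::new(), rebuilds: 0 };
        p.rebuild();
        p
    }

    /// Refreshes the cached string, reusing its existing buffer.
    fn rebuild(&mut self) {
        self.joined.clear();
        for t in &self.terms {
            self.joined.push_str(&t.text);
        }
        self.rebuilds += 1;
    }

    /// Read-only view that rules can scan without copying.
    fn text(&self) -> &str {
        &self.joined
    }

    /// Replaces the first occurrence of `from` with `to`. The cache is
    /// rebuilt only if the rule applied; otherwise nothing is copied.
    fn replace_first(&mut self, from: char, to: char) -> bool {
        let pos = match self.joined.find(from) {
            Some(pos) => pos,
            None => return false,
        };
        // Map the byte position in the cached string back to its term.
        let mut start = 0;
        for t in &mut self.terms {
            let end = start + t.text.len();
            if pos < end {
                let local = pos - start;
                let mut buf = [0u8; 4];
                let replacement: &str = to.encode_utf8(&mut buf);
                t.text.replace_range(local..local + from.len_utf8(), replacement);
                break;
            }
            start = end;
        }
        self.rebuild();
        true
    }
}
```

With this shape, a rule that does not apply costs one scan of the cached string and no copy, and the helper methods hide the mapping between string positions and terms.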

akprasad · 2023-10-19T04:18:28Z

This has been fixed locally. I don't see a performance improvement, sadly, but the resulting API is cleaner.

akprasad added the `performance` label (Enhancement that improves performance without changing functionality) on Dec 28, 2022

akprasad closed this as completed Oct 19, 2023
