feat: improve performance when detecting country codes #274
Description
This pull request delivers a 10x speedup when the country for a phone number is unknown.
Ran tests locally, and they all pass.
Background
Currently, when parsing international phone numbers, this library allocates a large number of regexes (256 countries) and unnecessarily matches against all 256, although in practice at most 1 can match.
Country codes have 1, 2, or 3 digits, and have the interesting property that shorter codes are not prefixes of longer codes.
The global_phone library takes advantage of this to optimize country code detection. By taking the one-, two-, and three-digit prefixes of the number, it's possible to do a hash-based lookup instead of cycling through all countries.
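A minimal sketch of the prefix lookup described above. This is not the code from global_phone or from this pull request: the `COUNTRY_CODES` table and the `detect_country` helper are hypothetical names used only for illustration, and the table is heavily abbreviated.

```ruby
# Hypothetical, abbreviated table mapping calling codes to a country.
# Simplification: real calling codes can be shared by several countries
# (for example "1"), so a real table would need to handle that.
COUNTRY_CODES = {
  '1'   => 'US',
  '7'   => 'RU',
  '44'  => 'GB',
  '49'  => 'DE',
  '351' => 'PT',
  '380' => 'UA'
}.freeze

# Returns [country, national_number], or nil when no calling code matches.
# Because no calling code is a prefix of a longer one, the first hit among
# the 1-, 2-, and 3-digit prefixes is the only possible hit.
def detect_country(number)
  digits = number.delete('^0-9') # keep digits only
  (1..3).each do |len|
    country = COUNTRY_CODES[digits[0, len]]
    return [country, digits[len..-1]] if country
  end
  nil
end

p detect_country('+44 20 7946 0958')
p detect_country('+351 912 345 678')
```

The cost is bounded by three hash lookups regardless of how many countries the table holds.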
The Fix
This pull request applies the technique described above to optimize detect_and_parse. As a result, instead of creating 256 regexes and matching all of them every time a phone number with an unknown country code is parsed, it now performs only 3 hash lookups.
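To make the contrast concrete, here is a rough sketch of the two strategies side by side. It does not reproduce the real detect_and_parse: `CODES`, `detect_by_regex_scan`, and `detect_by_lookup` are hypothetical names, and the three-entry table stands in for the full country data.

```ruby
# Hypothetical, abbreviated calling-code table.
CODES = { '1' => 'US', '44' => 'GB', '351' => 'PT' }.freeze

# "Before" style: build one regex per country and test every one of them,
# even though at most one can match.
def detect_by_regex_scan(digits)
  CODES.select { |code, _| Regexp.new("\\A#{code}").match?(digits) }.values
end

# "After" style: at most three hash lookups, with no regexes allocated.
def detect_by_lookup(digits)
  (1..3).each do |len|
    country = CODES[digits[0, len]]
    return [country] if country
  end
  []
end

%w[14155550100 442079460958 351912345678 999].each do |digits|
  puts "#{digits}: #{detect_by_regex_scan(digits)} vs #{detect_by_lookup(digits)}"
end
```

Both helpers return the same country for the same input; the difference is that the scan does work proportional to the number of countries while the lookup does a fixed amount.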
Benchmarks
This optimization yields a 10x speedup when the country code is unknown!
Added a new benchmark in spec/phonelib_ips_bench.rb, which can be run with rspec.
Before
After
Now the library will perform similarly whether a country code is provided or needs to be detected.
If we combine this with the work in:
it should make both cases even faster and comparable in performance (only 1.03x slower).
Memory Usage
After this pull request, this use case allocates 5x less memory, so GC pressure will be mitigated as well.
Before
After