Add basic_str_validation cache #4555

Merged
2 commits merged into qutebrowser:master on Apr 18, 2019

@jgkamat jgkamat commented Jan 30, 2019

Currently, basic_str_validation is the most expensive part of string validation, especially for longer strings. Unfortunately, it is called quite frequently (e.g. for the user agent) during request interception, so it needs to be cheap.

The 'proper' fix for this would be to subclass String (and all other immutable types), add a flag that is set once the value is fully validated, and check that flag instead. Ideally, we could do this for dicts too, so we wouldn't need to do the traversal, which is extremely costly and gets even worse if people have bound a lot of keys.
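To make the flag idea concrete, here is a minimal sketch in Python. `ValidatedStr`, `basic_check`, and `validate` are hypothetical names, not anything from the config code, and the unprintable-character check is only an assumed stand-in for the real validation.

```python
class ValidatedStr(str):
    """Hypothetical str subclass that carries a 'validated' flag."""

    validated = False


def basic_check(value: str) -> None:
    """Assumed stand-in for the expensive check: scan every character."""
    if any(not ch.isprintable() for ch in value):
        raise ValueError("value contains unprintable characters")


def validate(value: str) -> ValidatedStr:
    """Validate once, then rely on the flag for later calls."""
    if isinstance(value, ValidatedStr) and value.validated:
        # Already validated: skip the character scan entirely.
        return value
    basic_check(value)
    result = ValidatedStr(value)
    result.validated = True
    return result


user_agent = validate("Mozilla/5.0 (X11; Linux x86_64)")
same_again = validate(user_agent)  # returns the same object, no re-scan
```

Because the flag lives on the string object itself, a value that has already passed validation is returned without being scanned again. The cost is that every immutable config type would need a similar subclass.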

If you'd like, I could try to solve this a different way. This shouldn't take up much more memory because these strings are sitting in the config system anyway, but I really don't like sticking caches on everything instead of the proper fix.
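For comparison, the cache-based approach can be sketched with `functools.lru_cache`. This is an illustration under assumptions, not the code in this PR: `check_printable`, the cache size, and the control-character check are all hypothetical.

```python
import functools


@functools.lru_cache(maxsize=512)  # cache size is an arbitrary assumption
def check_printable(value: str) -> None:
    """Raise ValueError if the string contains control characters."""
    if any(ord(ch) < 32 or ord(ch) == 0x7F for ch in value):
        raise ValueError(f"{value!r} may not contain unprintable chars")


# A long string, similar in spirit to a user agent that is read repeatedly.
user_agent = "Mozilla/5.0 (X11; Linux x86_64) " * 50

check_printable(user_agent)  # first call scans the whole string
check_printable(user_agent)  # second call is served from the cache

info = check_printable.cache_info()
print(info.hits, info.misses)
```

`lru_cache` does not store raised exceptions, so only strings that pass the check are served from the cache; invalid strings are re-checked on every call.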

I also made a profile for looking up default bindings 😢.

Before:

-------------------------------------- benchmark: 4 tests -------------------------------------
Name (time in us)                    Min                    Max                Median          
-----------------------------------------------------------------------------------------------
test_get_str_benchmark[0]         3.6000 (1.0)          12.8400 (1.0)          3.7600 (1.0)    
test_get_str_benchmark[1]        12.8600 (3.57)         31.0200 (2.42)        13.1100 (3.49)   
test_get_str_benchmark[2]       762.0310 (211.68)      982.2980 (76.50)      791.0010 (210.37) 
test_get_dict_benchmark       5,548.5810 (>1000.0)  25,766.7170 (>1000.0)  5,615.2400 (>1000.0)
-----------------------------------------------------------------------------------------------

After:

-------------------------------------- benchmark: 4 tests -------------------------------------
Name (time in us)                    Min                    Max                Median          
-----------------------------------------------------------------------------------------------
test_get_str_benchmark[2]         2.8500 (1.0)          25.3700 (2.08)         3.0200 (1.0)    
test_get_str_benchmark[1]         2.9090 (1.02)         23.0900 (1.89)         3.0500 (1.01)   
test_get_str_benchmark[0]         3.0100 (1.06)         12.2000 (1.0)          3.1600 (1.05)   
test_get_dict_benchmark       4,865.0920 (>1000.0)  25,787.3310 (>1000.0)  4,934.3300 (>1000.0)
-----------------------------------------------------------------------------------------------
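The median columns of the two tables can be compared directly. The snippet below only divides the reported medians (in microseconds); the variable names are made up for illustration.

```python
# Median times in microseconds, copied from the two tables above.
before_us = {
    "test_get_str_benchmark[0]": 3.7600,
    "test_get_str_benchmark[1]": 13.1100,
    "test_get_str_benchmark[2]": 791.0010,
    "test_get_dict_benchmark": 5615.2400,
}
after_us = {
    "test_get_str_benchmark[0]": 3.1600,
    "test_get_str_benchmark[1]": 3.0500,
    "test_get_str_benchmark[2]": 3.0200,
    "test_get_dict_benchmark": 4934.3300,
}

# Ratio > 1 means the benchmark got faster after the change.
speedup = {name: before_us[name] / after_us[name] for name in before_us}

for name, ratio in speedup.items():
    print(f"{name}: {ratio:.2f}x")
```

By this measure the longest string case gets roughly 260 times faster, while the ratio for the dict benchmark is only about 1.14.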

It's weird that the dict benchmark improved from the string cache, but idk...



@jgkamat jgkamat added the "component: performance" label (Issues with performance/benchmarking.) Jan 30, 2019
@jgkamat jgkamat self-assigned this Jan 30, 2019
A very expensive part of getting string config values is the string
validation. Thankfully, not many values seem to use the 'forbidden'
option.

The 'proper' fix for this would be to subclass String (and all other
immutable types), add a flag to set when fully validated, and use that
flag instead.

This is more of a band-aid until we can get a proper solution for the
config system.
@The-Compiler The-Compiler merged commit 2b85aae into qutebrowser:master Apr 18, 2019
@The-Compiler

Thanks!
