Speed up generation of (some) high cardinality data #288

Merged
cavokz merged 2 commits into elastic:main from the speed-up-high-cardinality-generation branch on May 7, 2024

@cavokz (Collaborator) commented May 6, 2024

No description provided.

@cavokz force-pushed the speed-up-high-cardinality-generation branch 2 times, most recently from 39e1b9b to 1a2aa5a (May 6, 2024 11:54)
@cavokz changed the title from "Seed up util.has_wildcards by regex" to "Speed up generation of (some) high cardinality data" (May 6, 2024)
@cavokz force-pushed the speed-up-high-cardinality-generation branch from 1a2aa5a to 52dbb6c (May 6, 2024 14:59)
cavokz added 2 commits May 7, 2024 14:26
Before:
         2104505117 function calls (2104365124 primitive calls) in 1001.439 seconds

   Ordered by: internal time
   List reduced from 117 to 23 due to restriction <0.2>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
211510511  240.397    0.000  521.635    0.000 solution_space.py:83(<genexpr>)
199990000  156.258    0.000  250.252    0.000 __init__.py:37(has_wildcards)
211488530  150.962    0.000  232.323    0.000 fnmatch.py:64(fnmatchcase)
211488530   76.622    0.000   76.622    0.000 {method 'match' of 're.Pattern' objects}
399980000   76.067    0.000   76.067    0.000 {method 'find' of 'str' objects}
   139999   71.592    0.001  321.844    0.002 solution_space.py:37(<setcomp>)
229879884   56.540    0.000   56.540    0.000 solution_space.py:82(<genexpr>)
    45247   52.977    0.001  631.150    0.014 {built-in method builtins.any}
422977060   48.915    0.000   48.915    0.000 {method 'lower' of 'str' objects}
   200000   22.535    0.000  996.415    0.005 __init__.py:204(__call__)
200851259   18.118    0.000   18.118    0.000 {built-in method builtins.isinstance}
   120000   11.643    0.000  973.160    0.008 type_keyword.py:94(<listcomp>)
   119999    2.380    0.000    2.380    0.000 {method 'copy' of 'set' objects}
   120000    1.841    0.000  327.690    0.003 solution_space.py:96(__sub__)
   120000    0.769    0.000  633.150    0.005 solution_space.py:216(__generate)
39998/19999    0.735    0.000    1.654    0.000 _parser.py:507(_parse)
   100000    0.611    0.000 1000.982    0.010 events_emitter.py:114(events_from_branch)
   300000    0.552    0.000    1.312    0.000 __init__.py:273(split_path)
   300000    0.542    0.000    2.027    0.000 __init__.py:53(emit_field)
   100000    0.467    0.000    0.467    0.000 {method 'isoformat' of 'datetime.datetime' objects}
   139999    0.464    0.000  322.386    0.002 solution_space.py:33(__init__)
   800000    0.397    0.000    0.635    0.000 __init__.py:274(<genexpr>)
   120000    0.384    0.000    0.576    0.000 solution_space.py:43(__copy__)

After:

          1031192907 function calls in 467.467 seconds

   Ordered by: internal time
   List reduced from 69 to 14 due to restriction <0.2>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
199990000  149.827    0.000  238.083    0.000 __init__.py:37(has_wildcards)
    23257   95.604    0.004  118.987    0.005 solution_space.py:77(__contains__)
399980000   71.344    0.000   71.344    0.000 {method 'find' of 'str' objects}
   139999   65.798    0.000  303.881    0.002 solution_space.py:37(<setcomp>)
219138599   23.375    0.000   23.375    0.000 {method 'lower' of 'str' objects}
   200000   22.846    0.000  462.636    0.002 __init__.py:204(__call__)
200130000   16.950    0.000   16.950    0.000 {built-in method builtins.isinstance}
   120000    8.238    0.000  439.090    0.004 type_keyword.py:94(<listcomp>)
   119999    2.250    0.000    2.250    0.000 {method 'copy' of 'set' objects}
   120000    1.766    0.000  309.394    0.003 solution_space.py:101(__sub__)
   120000    0.802    0.000  120.826    0.001 solution_space.py:221(__generate)
   100000    0.596    0.000  467.024    0.005 events_emitter.py:114(events_from_branch)
   300000    0.544    0.000    1.270    0.000 __init__.py:273(split_path)
   300000    0.525    0.000    1.971    0.000 __init__.py:53(emit_field)
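
The "Ordered by: internal time" and "List reduced from ... due to restriction <0.2>" headers above match the report format of Python's standard `pstats` module. The PR does not say how the profiles were collected, so the following is only a sketch of one way to produce a report of the same shape; `profile_report` and `workload` are hypothetical names and are not part of this repository.

```python
import cProfile
import io
import pstats


def profile_report(func, *args, restriction=0.2, **kwargs):
    """Run func under cProfile and return a pstats report sorted by internal time."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func(*args, **kwargs)
    finally:
        profiler.disable()

    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # "tottime" is reported as "internal time"; a float restriction keeps
    # that fraction of the sorted list (0.2 keeps the top 20% of entries).
    stats.sort_stats("tottime").print_stats(restriction)
    return out.getvalue()


def workload():
    # Hypothetical stand-in for the real event generation run.
    return sum(len(str(i).lower()) for i in range(10000))


report = profile_report(workload)
print(report)
```

Comparing two such reports taken with the same restriction, as the commit messages here do, makes it easy to see which entries disappear or shrink between runs.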

Before:
         1031192907 function calls in 467.467 seconds

   Ordered by: internal time
   List reduced from 69 to 14 due to restriction <0.2>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
199990000  149.827    0.000  238.083    0.000 __init__.py:37(has_wildcards)
    23257   95.604    0.004  118.987    0.005 solution_space.py:77(__contains__)
399980000   71.344    0.000   71.344    0.000 {method 'find' of 'str' objects}
   139999   65.798    0.000  303.881    0.002 solution_space.py:37(<setcomp>)
219138599   23.375    0.000   23.375    0.000 {method 'lower' of 'str' objects}
   200000   22.846    0.000  462.636    0.002 __init__.py:204(__call__)
200130000   16.950    0.000   16.950    0.000 {built-in method builtins.isinstance}
   120000    8.238    0.000  439.090    0.004 type_keyword.py:94(<listcomp>)
   119999    2.250    0.000    2.250    0.000 {method 'copy' of 'set' objects}
   120000    1.766    0.000  309.394    0.003 solution_space.py:101(__sub__)
   120000    0.802    0.000  120.826    0.001 solution_space.py:221(__generate)
   100000    0.596    0.000  467.024    0.005 events_emitter.py:114(events_from_branch)
   300000    0.544    0.000    1.270    0.000 __init__.py:273(split_path)
   300000    0.525    0.000    1.971    0.000 __init__.py:53(emit_field)

After:
         829635614 function calls in 402.590 seconds

   Ordered by: internal time
   List reduced from 69 to 14 due to restriction <0.2>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
199990000  116.556    0.000  176.679    0.000 __init__.py:39(has_wildcards)
    23293   92.694    0.004  115.693    0.005 solution_space.py:77(__contains__)
   139999   65.935    0.000  242.613    0.002 solution_space.py:37(<setcomp>)
199990000   43.505    0.000   43.505    0.000 {method 'search' of 're.Pattern' objects}
217571286   22.991    0.000   22.991    0.000 {method 'lower' of 'str' objects}
   200000   22.898    0.000  397.796    0.002 __init__.py:204(__call__)
200130000   16.653    0.000   16.653    0.000 {built-in method builtins.isinstance}
   120000    8.169    0.000  374.210    0.003 type_keyword.py:94(<listcomp>)
   119999    2.176    0.000    2.176    0.000 {method 'copy' of 'set' objects}
   120000    1.747    0.000  248.001    0.002 solution_space.py:101(__sub__)
   120000    0.812    0.000  117.431    0.001 solution_space.py:221(__generate)
   100000    0.574    0.000  402.157    0.004 events_emitter.py:114(events_from_branch)
   300000    0.530    0.000    1.295    0.000 __init__.py:274(split_path)
   300000    0.518    0.000    1.987    0.000 __init__.py:53(emit_field)
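
In this second pair of profiles, the 399980000 calls to `str.find` (two per `has_wildcards` call) are replaced by 199990000 calls to `re.Pattern.search` (one per call), and the total time drops from 467.467 to 402.590 seconds. The diff itself is not shown in this conversation, so the following is only a sketch of that kind of change. It assumes that "wildcards" means the glob characters `*` and `?`; both function names and the pattern are illustrative and are not taken from the repository.

```python
import re

# Assumption: a "wildcard" is one of the glob characters '*' or '?'.
# The pattern is compiled once at import time, not on every call.
_WILDCARDS_RE = re.compile(r"[*?]")


def has_wildcards_find(value):
    """Hypothetical 'before' version: scans the string once per wildcard character."""
    if isinstance(value, str):
        return value.find("*") != -1 or value.find("?") != -1
    return False


def has_wildcards_regex(value):
    """Hypothetical 'after' version: a single scan with a precompiled pattern."""
    if isinstance(value, str):
        return _WILDCARDS_RE.search(value) is not None
    return False


# Both versions should agree on every input.
samples = ["plain", "host-*", "user?", "", "a*b?c", 42, None]
results = [(has_wildcards_find(s), has_wildcards_regex(s)) for s in samples]
```

Whether the regex version wins in practice depends on the inputs, which is why the commit message backs the change with before and after measurements.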
@cavokz force-pushed the speed-up-high-cardinality-generation branch 2 times, most recently from f20bcac to 2cfcdfa (May 7, 2024 14:18)
@cavokz merged commit d0ad493 into elastic:main (May 7, 2024)
71 checks passed
@cavokz deleted the speed-up-high-cardinality-generation branch (May 7, 2024 14:44)