This repository has been archived by the owner on Nov 3, 2023. It is now read-only.

Extremely slow checks #9

Closed
JensRantil opened this issue Aug 7, 2012 · 9 comments

Comments

@JensRantil
Contributor

I've just written a unit test that checks my pep257 compliance using the API created in issue #7. However, the checks take extremely long to execute. Here's the output from profiling the test:

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   3/1    0.000    0.000   30.775   30.775 /Users/jens/Development/src/my_projects/rewind/src/nose-1.1.2-py2.7.egg/nose/suite.py:175(__call__)
   3/1    0.000    0.000   30.775   30.775 /Users/jens/Development/src/my_projects/rewind/src/nose-1.1.2-py2.7.egg/nose/suite.py:196(run)
     2    0.000    0.000   30.774   15.387 /Users/jens/Development/src/my_projects/rewind/src/nose-1.1.2-py2.7.egg/nose/case.py:44(__call__)
     2    0.000    0.000   30.774   15.387 /Users/jens/Development/src/my_projects/rewind/src/nose-1.1.2-py2.7.egg/nose/case.py:115(run)
     2    0.000    0.000   30.773   15.386 /Users/jens/Development/src/my_projects/rewind/src/nose-1.1.2-py2.7.egg/nose/case.py:142(runTest)
     2    0.000    0.000   30.773   15.386 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/unittest/case.py:375(__call__)
     2    0.000    0.000   30.773   15.386 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/unittest/case.py:286(run)
     1    0.000    0.000   30.136   30.136 /Users/jens/Development/src/my_projects/rewind/src/rewind/test/test_code.py:30(testPep257Conformance)
     1    0.000    0.000   30.134   30.134 /Users/jens/Development/src/my_projects/rewind/lib/python2.7/site-packages/pep257.py:320(check_files)
 336/6    0.004    0.000   30.133    5.022 /Users/jens/Development/src/my_projects/rewind/lib/python2.7/site-packages/pep257.py:114(<lambda>)
   118    0.011    0.000   30.133    0.255 /Users/jens/Development/src/my_projects/rewind/lib/python2.7/site-packages/pep257.py:304(check_source)
    96    0.001    0.000   23.562    0.245 /Users/jens/Development/src/my_projects/rewind/lib/python2.7/site-packages/pep257.py:223(parse_contexts)
  1768    0.055    0.000   17.256    0.010 /Users/jens/Development/src/my_projects/rewind/lib/python2.7/site-packages/pep257.py:208(parse_methods)
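
For readers who want to collect a comparable table, here is a minimal sketch using the standard library's cProfile and pstats modules. The report above does not say how the profile was gathered, so this is only one possible approach; slow_check and profile_call are hypothetical names, and slow_check merely stands in for a call such as pep257.check_files.

```python
import cProfile
import io
import pstats


def slow_check(n):
    """Hypothetical stand-in for a slow checker call."""
    return sum(i * i for i in range(n))


def profile_call(func, *args):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func(*args)
    finally:
        profiler.disable()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time, as in the table above, and keep the top 10 rows.
    stats.sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()


result, report = profile_call(slow_check, 100000)
print(report)
```

Sorting by cumulative time puts the callers that dominate the run at the top, which is how check_source and parse_contexts stand out in the listing above.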

My test is written like so:

def testPep257Conformance(self):
    """Test that we conform to PEP257."""
    pyfiles = self._get_all_pyfiles()
    errors = pep257.check_files(pyfiles)
    if errors:
        print("There were errors:")
        for error in errors:
            print(error)
    self.assertEqual(len(errors), 0)

The six Python modules being checked total 1957 lines of code.

@JensRantil
Contributor Author

Oh -- I am running Python 2.7.1.

@JensRantil
Contributor Author

This can be reproduced from the command line. Execution took 17 seconds with:

$ ../bin/pep257 rewind/*.py rewind/test/*.py

@JensRantil
Contributor Author

Oh, and one last thing, if you would like me to publish my current pep257 branch for you to reproduce this, I can do that.

@JensRantil
Contributor Author

Just looking at the code, it seems that parse_methods(...) might be called many times with the same arguments in parse_contexts(...), and it accounts for about 17 seconds of the execution time. Refactoring so that each source is parsed only once, with the results cached, could cut the time spent in that function considerably. This could possibly be done by implementing something similar to this or this.
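
The caching idea can be sketched as a small memoizing decorator. This is an illustration under stated assumptions, not the code that ended up in pep257: cached and find_method_lines are hypothetical names, and the "parser" is a toy that only looks for lines starting with "def".

```python
import functools


def cached(func):
    """Memoize func on its positional arguments (which must be hashable)."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        # Compute the result only the first time a given argument tuple is seen.
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    return wrapper


call_count = 0


@cached
def find_method_lines(source):
    """Hypothetical stand-in for an expensive parse; not pep257's real parser."""
    global call_count
    call_count += 1
    return [line.strip() for line in source.splitlines()
            if line.strip().startswith("def ")]


source = "class A:\n    def f(self):\n        pass\n    def g(self):\n        pass\n"
first = find_method_lines(source)
second = find_method_lines(source)  # served from the cache, no second parse
print(first, call_count)
```

Source strings are hashable, so they can serve directly as cache keys. Note that the cached list is returned by reference, so callers should not mutate it; on Python 3.2 and later, functools.lru_cache offers the same behavior out of the box.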

@JensRantil
Contributor Author

This is what the profiling result looks like after the changes that pull request #12 introduces:

 ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      1    0.002    0.002    6.025    6.025 pep257.py:67(<module>)
      1    0.003    0.003    5.996    5.996 pep257.py:366(main)
  190/6    0.001    0.000    5.991    0.999 pep257.py:130(<lambda>)
    118    0.008    0.000    5.991    0.051 pep257.py:323(check_source)
     96    0.000    0.000    4.988    0.052 pep257.py:242(parse_contexts)
    533    0.180    0.000    4.269    0.008 pep257.py:188(parse_top_level)
   1264    1.169    0.001    3.480    0.003 StringIO.py:168(readlines)
   1040    0.007    0.000    2.894    0.003 pep257.py:137(abs_pos)
     65    0.000    0.000    2.495    0.038 pep257.py:208(parse_classes)
1152457    1.685    0.000    2.222    0.000 StringIO.py:139(readline)
 331865    1.060    0.000    2.033    0.000 tokenize.py:264(generate_tokens)
 314711    0.080    0.000    2.022    0.000 {next}
     78    0.000    0.000    1.774    0.023 pep257.py:204(parse_functions)
     78    0.000    0.000    0.944    0.012 pep257.py:118(cached_func)
    135    0.003    0.000    0.944    0.007 pep257.py:226(parse_methods)
 309500    0.771    0.000    0.771    0.000 {built-in method match}
    112    0.002    0.000    0.707    0.006 pep257.py:278(__init__)
    224    0.060    0.000    0.703    0.003 pep257.py:144(rel_pos)
1152693    0.362    0.000    0.362    0.000 {method 'find' of 'str' objects}
   2108    0.018    0.000    0.200    0.000 pep257.py:172(parse_docstring)

@JensRantil
Contributor Author

Every file checked generates two calls to parse_functions(...), parse_classes(...) and parse_methods(...) through parse_contexts(...). By caching those results whenever possible, one would be able to gain 2.495/2 + 1.774/2 + 0.944/2 ≈ 2.61 seconds more in the example above.
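
The estimate can be recomputed from the cumtime column of the profile. The sketch below assumes, as the comment does, that exactly half of each function's cumulative time is redundant and that the three times do not overlap; both are simplifications, and the variable names are illustrative.

```python
# Cumulative times (seconds) copied from the profile after pull request #12.
cumtime = {
    "parse_classes": 2.495,
    "parse_functions": 1.774,
    "parse_methods": 0.944,
}

# Each function is called twice per file, so caching would save one of the
# two calls, i.e. roughly half of each cumulative time.
estimated_saving = sum(t / 2 for t in cumtime.values())
print("estimated saving: %.4f s" % estimated_saving)
```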

@JensRantil
Contributor Author

Every file checked generates two calls to parse_functions(...), parse_classes(...) and parse_methods(...) through parse_contexts(...). By caching those results whenever possible, one would be able to gain 2.495/2 + 1.774/2 + 0.944/2 ≈ 2.61 seconds more in the example above.

This fix was added in pull request #12.

@keleshev
Contributor

#12 is merged now. pep257 is probably still 10 times slower than it could be, but good enough for now.

If you come up with more speed improvements—you are welcome!

@JensRantil
Contributor Author

#12 is merged now. pep257 is probably still 10 times slower than it could be, but good enough for now.

I agree, but it works for now!

If you come up with more speed improvements—you are welcome!

Sure! But for now it doesn't hinder me in running my tests.
