Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize previous cog output handling #30

Merged
merged 1 commit into from
Mar 7, 2024

Conversation

gavriil-deshaw
Copy link
Contributor

@gavriil-deshaw gavriil-deshaw commented Mar 4, 2024

Using StringIO for input and output files in all cases

Storing the previous code present in the file as a list instead of a string

Fixes #29

@gavriil-deshaw
Copy link
Contributor Author

FWIW, all tests pass:

$ make test
tox -q
py37: SKIP ⚠ in 0 seconds
...........................................................................................................................................                                                                                   [100%]
139 passed in 0.36s
py38: OK ✔ in 1.53 seconds
py39: SKIP ⚠ in 0 seconds
...........................................................................................................................................                                                                                   [100%]
139 passed in 0.37s
py310: OK ✔ in 1.51 seconds
...........................................................................................................................................                                                                                   [100%]
139 passed in 0.36s
py311: OK ✔ in 1.47 seconds
py312: SKIP ⚠ in 0.12 seconds
Name                        Stmts   Miss Branch BrPart   Cover   Missing
------------------------------------------------------------------------
cogapp/__init__.py              1      0      0      0 100.00%
cogapp/__main__.py              3      3      0      0   0.00%   3-7
cogapp/cogapp.py              487      0    212      0 100.00%
cogapp/makefiles.py            22      0     16      0 100.00%
cogapp/test_cogapp.py         854      3     88      3  99.36%   1300->1302, 2102, 2240-2241
cogapp/test_makefiles.py       68      0     10      0 100.00%
cogapp/test_whiteutils.py      68      0      0      0 100.00%
cogapp/utils.py                37      0      8      0 100.00%
cogapp/whiteutils.py           43      0     34      0 100.00%
------------------------------------------------------------------------
TOTAL                        1583      6    368      3  99.54%
  py37: SKIP (0.00 seconds)
  py38: OK (1.53 seconds)
  py39: SKIP (0.00 seconds)
  py310: OK (1.51 seconds)
  py311: OK (1.47 seconds)
  py312: SKIP (0.12 seconds)
  coverage: OK (0.40 seconds)
  congratulations :) (5.50 seconds)

@nedbat
Copy link
Owner

nedbat commented Mar 6, 2024

Thanks for the pull request! I've never used on cog on such large files so haven't suffered the performance problem.

Looking at your changes, I'm not sure we need to deal with the input and output files differently. I think the speed problems are all due to how previous is managed (string concatenation instead of list appending). Would you mind trying just that change and see if the speed is still good?

@gavriil-deshaw
Copy link
Contributor Author

Hey Ned, thanks for the quick reply! You're right - the speed is the same only with the change to previous! I'll push an updated commit shortly.

Thanks,
Panayiotis

P.S.: Something weird is going on and I'm not sure why...
Here are some results for cog -xc:
Normal cog:

real    0m34.789s
user    0m14.104s
sys     0m20.461s

Cog with io.StringIO() as fIn:

real    0m9.625s
user    0m9.039s
sys     0m0.524s

Cog with io.StringIO() as fIn and previous as list:

real    0m0.172s
user    0m0.123s
sys     0m0.021s

Cog with previous as list:

real    0m0.136s
user    0m0.108s
sys     0m0.021s

I ran each one multiple times and got roughly the same results. It is not clear to me why if previous is a string then the handling of fIn matters, but if previous is a list then it doesn't. Just out of curiosity, can you think of any reasons?

Storing the previous cog output present in the file as a list instead of a string
@gavriil-deshaw gavriil-deshaw changed the title Optimize file and previous output handling Optimize previous cog output handling Mar 7, 2024
@nedbat
Copy link
Owner

nedbat commented Mar 7, 2024

Hmm, that is odd, I can't think why a stringIO for reading would have an effect at all, but I like the simplicity of the latest code, and it gets us the full benefit!

@nedbat nedbat merged commit ca514d7 into nedbat:main Mar 7, 2024
20 checks passed
@gavriil-deshaw
Copy link
Contributor Author

Exactly! Thanks for merging :))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Handling of old output slows down cog
2 participants