Keep load speed fast #9

Closed
CalebBell opened this issue May 23, 2020 · 0 comments

I am opening this issue to track the load (import) speed of chemicals. Since last weekend I had already forgotten how I was measuring it, so documenting the procedure seems like a good idea.

I put the following code in a file called load_one_library.py:

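# Import the heavy dependencies first so that their import time
# is not attributed to chemicals below.
import cProfile
import os
import numpy as np
from scipy import special
from scipy import interpolate
from scipy import optimize
import pandas as pd
import sys
import json
import io
import datetime
from time import time
import fluids.constants
import fluids.numerics
import fluids
import ht
# Snapshot of the modules loaded before chemicals is imported
original_modules = set(sys.modules.keys())

pr = cProfile.Profile()
t0 = time()
pr.enable()
import chemicals  # the import being measured
pr.disable()
print('Elapsed time: %f seconds' %(time() - t0))
# Save the profile so it can be inspected with snakeviz
pr.dump_stats('load_one_library.out')
after_modules = set(sys.modules.keys())
# Modules that were loaded only as a result of importing chemicals
print('Loaded libraries')
print(after_modules.difference(original_modules))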

Then I run that script with

python3 -OO load_one_library.py

You have to run it twice: the first run may recompile and cache the Python bytecode, so only the second and later runs give a representative time.
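The warm-up effect can also be checked without the profiler. The following is only a minimal sketch, not part of the workflow above: `time_import` is a hypothetical helper that starts a fresh interpreter for each run, so every timing includes interpreter startup and is best compared against a baseline module rather than read on its own.

```python
import subprocess
import sys
import time


def time_import(module, repeats=3, flags=("-OO",)):
    """Time `import module` in fresh interpreter processes, in seconds.

    Each timing includes interpreter startup, so compare against a
    baseline such as time_import("sys") rather than reading it alone.
    """
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(
            [sys.executable, *flags, "-c", f"import {module}"],
            check=True,
        )
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    # The first run may write bytecode caches; later runs reuse them.
    runs = time_import("json")
    print("first run: %.4f s, best later run: %.4f s" % (runs[0], min(runs[1:])))
```

Replacing "json" with "chemicals" would time the library itself, though unlike the script above this also counts the time to import its dependencies.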

Then I look at where the time is spent with

python3 -m snakeviz load_one_library.out
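If a browser is not available, the same profile file can be read with the standard-library pstats module. This is a sketch of an alternative to snakeviz, not what the workflow above uses; `top_cumulative` is a hypothetical helper name, and the demo at the bottom profiles a placeholder import so that it runs on its own.

```python
import cProfile
import io
import os
import pstats
import tempfile


def top_cumulative(stats_path, limit=10):
    """Return a text report of the `limit` entries with the highest cumulative time."""
    buffer = io.StringIO()
    stats = pstats.Stats(stats_path, stream=buffer)
    stats.strip_dirs().sort_stats("cumulative").print_stats(limit)
    return buffer.getvalue()


if __name__ == "__main__":
    # Stand-in profile so the sketch is self-contained; in practice, pass
    # the 'load_one_library.out' file written by the script above.
    profiler = cProfile.Profile()
    profiler.enable()
    import json  # placeholder import, only here to have something to profile
    profiler.disable()
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example_profile.out")
        profiler.dump_stats(path)
        print(top_cumulative(path))
```

Calling `top_cumulative('load_one_library.out')` after running the script should list the slowest entries by cumulative time in plain text.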

In that view I look for the module that takes the longest to load; currently that is elements.py.

[image attached to the original issue]

Let's leave this issue open indefinitely for now and I'll update it with timings periodically - maybe get some development docs going and move this there at some point.

One side note: the -OO flag makes Python compile the bytecode without docstrings, assert statements, and code that is conditional on __debug__. The time measured this way is the meaningful number I am targeting. I refuse to be interested in increasing load speed by having less documentation.
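To confirm which optimizations are active in a given run, the interpreter can be asked directly. This is a small sketch using only standard-library attributes (`sys.flags.optimize` and `__debug__`); `optimization_summary` is a hypothetical name introduced for illustration.

```python
import sys


def optimization_summary():
    """Describe which optimizations the running interpreter applies."""
    level = sys.flags.optimize  # 0 by default, 1 for -O, 2 for -OO
    return {
        "level": level,
        # __debug__ is False under -O and -OO, which is when asserts are removed
        "asserts_enabled": __debug__,
        # Under -OO this function's own docstring is stripped, so __doc__ is None
        "docstrings_kept": optimization_summary.__doc__ is not None,
    }


if __name__ == "__main__":
    print(optimization_summary())
```

Running the file with and without -OO shows the level, assert, and docstring entries change together.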

The -OO flag is typically used when building an actual application out of libraries, or on a server when processes are starting up and shutting down often. Because assert statements are removed in that mode, it is important to remember that they should not be used for control flow; they should be development-only checks.
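As an illustration of the difference, here is a minimal sketch; `mean_checked` is a made-up function and not part of chemicals. The input validation uses an explicit exception, which survives -O and -OO, while the assert is only a sanity check that is allowed to disappear.

```python
def mean_checked(values):
    """Average a sequence, validating input with an explicit exception."""
    if len(values) == 0:
        # An explicit raise still runs under -O and -OO.
        raise ValueError("values must not be empty")
    total = sum(values)
    # Development-only sanity check; removed entirely under -O and -OO.
    assert isinstance(total, (int, float)), "unexpected sum type"
    return total / len(values)


if __name__ == "__main__":
    print(mean_checked([1.0, 2.0, 3.0]))
```

Had the emptiness check been written as an assert, running with -OO would skip it and the function would fail later with a ZeroDivisionError instead.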

The script above outputs something like this:

Elapsed time: 0.005867 seconds
Loaded libraries
{'chemicals.solubility', 'chemicals.acentric', 'chemicals.dippr', 'chemicals.elements', 'chemicals.miscdata', 'chemicals', 'chemicals.dipole', 'chemicals.temperature', 'chemicals.critical', 'chemicals.utils', 'chemicals.refractivity', 'chemicals.exceptions', 'chemicals.vapor_pressure', 'chemicals.data_reader', 'chemicals.environment', 'chemicals.virial', 'chemicals.triple', 'chemicals.lennard_jones', 'chemicals.phase_change'}