In [1]:
from pathlib import Path

from astropy import table
from astroquery.jplhorizons import Horizons
import pandas as pd
from pympler.process import _ProcessMemoryInfoProc

from lhorizon import LHorizon
from lhorizon.tests.utilz import MockResponse

`lhorizon` is a strongly divergent fork of `astroquery.jplhorizons`.
When compared to `jplhorizons`, it offers:
* fuller coverage of many Horizons API areas
* a wide array of helper functions
* an optional targeting module that provides SPICE integration and bodycentric coordinate transformations
* a more consistent, modular, and extensible interface, and
* increased performance.

This notebook illustrates several performance differences between `jplhorizons` and `lhorizon`.

This notebook requires [extra files that are not distributed with this repository](https://drive.google.com/file/d/1o3R7EEZt06mbqhh8sAoH6TgFsIcvAIov/).
Please download and decompress this file, creating a directory named 'samples', and place that 'samples' directory in this
directory ('benchmark').
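
Before running the cells that follow, it may help to confirm that the 'samples' directory is in place. The snippet below is a minimal sketch of such a check; `find_cached_responses` is a hypothetical helper written for this illustration, not part of `lhorizon`, and it assumes only that the cached files have 'response' in their names, as the next cell does.

```python
from pathlib import Path


def find_cached_responses(samples_dir="samples"):
    """Return cached response files in samples_dir, sorted by name.

    Hypothetical helper for illustration; not part of lhorizon.
    """
    samples_path = Path(samples_dir)
    if not samples_path.is_dir():
        raise FileNotFoundError(
            f"'{samples_path}' not found; download and decompress the "
            "sample archive into this directory first"
        )
    # same filter the notebook uses: keep files with 'response' in the name
    return sorted(p for p in samples_path.iterdir() if "response" in p.name)
```

Calling `find_cached_responses()` from the 'benchmark' directory should return a non-empty list if the archive was unpacked correctly; an empty list suggests the files landed in a nested or differently named directory.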

In [2]:
# grab cached HTTP responses from JPL Horizons
def make_mock_response(cached_response_fn):
    def respond_mockingly(*_, **__):
        with open(cached_response_fn, 'rb') as mock_response_stream:
            mock_response_bytes = mock_response_stream.read()
        return MockResponse(content=mock_response_bytes)
    return respond_mockingly


cached_responses = {
    r.name.split("_")[-1]: r
    for r in Path('samples/').iterdir()
    if 'response' in r.name
}

In [3]:
# insert these responses into jplhorizons.Horizons and lhorizon.LHorizon objects
mocked_horizons, mocked_lhorizons = [], []
for response_ix in sorted(cached_responses.keys()):
    mock_response = make_mock_response(cached_responses[response_ix])
    horizon = Horizons()
    horizon.cache_location = None
    horizon.ephemerides_async = mock_response
    horizon.query_type = 'ephemerides'
    mocked_horizons.append(horizon)
    lhorizon = LHorizon()
    lhorizon.response = mock_response()
    mocked_lhorizons.append(lhorizon)

In [4]:
%%time
s_mem = _ProcessMemoryInfoProc().rss
lhorizon_dataframes = [
    lhorizon.dataframe() for lhorizon in mocked_lhorizons
]
full_lhorizon_table = pd.concat(lhorizon_dataframes)
e_mem = _ProcessMemoryInfoProc().rss
print(f"{round((e_mem - s_mem) / 1024 ** 2)} MB used")

318 MB used
CPU times: user 1.98 s, sys: 290 ms, total: 2.27 s
Wall time: 2.27 s


In [5]:
%%time
# compare parser speed and (crudely) memory usage of jplhorizons and lhorizon
s_mem = _ProcessMemoryInfoProc().rss
horizons_tables = [
    horizon.ephemerides() for horizon in mocked_horizons
]
full_horizon_table = table.vstack(horizons_tables)
e_mem = _ProcessMemoryInfoProc().rss
print(f"{round((e_mem - s_mem) / 1024 ** 2)} MB used")

611 MB used
CPU times: user 23 s, sys: 197 ms, total: 23.2 s
Wall time: 23.2 s
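
The RSS deltas printed by the two cells above are process-wide figures. As a complementary, hedged sketch, the standard library's `tracemalloc` can report the peak memory allocated through Python's allocators while a block runs. The `measure` context manager and the demo workload are hypothetical names introduced here for illustration. Allocations made by C extensions that bypass Python's allocator are not counted, and tracing slows execution, so wall times measured this way should not be compared with the `%%time` results.

```python
import time
import tracemalloc
from contextlib import contextmanager


@contextmanager
def measure(label):
    """Report wall time and peak traced allocations for the enclosed block."""
    stats = {}
    tracemalloc.start()
    start = time.perf_counter()
    try:
        yield stats
    finally:
        stats["seconds"] = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        stats["peak_mb"] = peak / 1024 ** 2
        print(
            f"{label}: {stats['seconds']:.2f} s, "
            f"peak {stats['peak_mb']:.1f} MB traced"
        )


# demo workload (an assumption for illustration): roughly 10 MB of bytes objects
with measure("demo") as stats:
    data = [bytes(1024) for _ in range(10_000)]
```

In this notebook, the same pattern could wrap the parsing step, for example `with measure("lhorizon"): [lhorizon.dataframe() for lhorizon in mocked_lhorizons]`.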


**notes**

* Generally, `lhorizon` performs the above operations about 10x as fast as `jplhorizons` while using about half as much memory (2.27 s vs. 23.2 s and 318 MB vs. 611 MB in the run above). However, these figures may vary widely depending on the environment. The memory comparison is also crude: astropy `Table` objects are difficult to introspect directly, and `_ProcessMemoryInfoProc().rss`, which returns the resident set size of the containing process, may be unreliable depending on how your particular environment allocates memory to Notebook processes. For a clearer look at memory usage, run the memory_profile* scripts in this directory (note that in some environments, the profiled functions may run quite slowly under the profiler; don't be in a hurry).
* `lhorizon` is _often_ faster than `jplhorizons` at fetching data from Horizons, and the bulk query helpers in `lhorizon.handlers` can also greatly expedite some queries. However, there is no consistent way to compare these steps because Horizons' response time varies so much on the server side, which is why we use cached/mocked responses in these examples.
* The `horizon.query_type` assignment in the setup cell disables `astroquery` default caching. `astroquery` caches many queries as .pickle files (by default in the user's home directory). Without this step, if this benchmark were run multiple times in the same environment, even across sessions, `jplhorizons` would load an `astropy.table.Table` object from a pickle file rather than querying and parsing Horizons' outputs. `lhorizon` does not, and does not plan to, offer this type of automated caching, due to concerns about cross-environment stability, data freshness, transparency of local storage usage, and so on. If you would like to repeatedly run identical queries against Horizons and do not wish to make intermediate data products, `jplhorizons` may be superior for your use case.
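
For readers who do want to keep intermediate data products, the last note can be illustrated with a minimal sketch of explicit, user-controlled caching. Everything here is an assumption made for illustration: `cached_fetch`, the `fetch` callable, and the `horizons_cache` directory are hypothetical names, not part of `lhorizon` or `astroquery`. The idea is simply to store the raw bytes of a response in a file named after a hash of the query parameters, so that the storage location and the lifetime of the data stay visible to the user.

```python
import hashlib
import json
from pathlib import Path


def cached_fetch(params, fetch, cache_dir="horizons_cache"):
    """Return fetch(params) as bytes, reusing an on-disk copy if one exists.

    params must be JSON-serializable; fetch is any callable that takes
    params and returns bytes (hypothetical, supplied by the caller).
    """
    # identical parameters always map to the same file name
    serialized = json.dumps(params, sort_keys=True).encode()
    key = hashlib.sha256(serialized).hexdigest()
    cache_path = Path(cache_dir) / f"{key}.bin"
    if cache_path.exists():
        return cache_path.read_bytes()
    payload = fetch(params)
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    cache_path.write_bytes(payload)
    return payload
```

Deleting the cache directory forces fresh queries, which keeps data freshness an explicit decision rather than a hidden default.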