Environment variables #7

hannorein · 2023-01-14T16:58:40Z

I suggest not using environment variables. I think this can be confusing to users who might not be familiar with the concept (one rarely comes across environment variables when using standard python packages).

Right now environment variables are only used in two locations to specify the path for data files. I suggest having an explicit init function that has to be called by the user before using assist. Something like

#include "assist.h"
assist_init("path/to/planet/ephem", NULL);
// or
assist_init("path/to/planet/ephem", "path/to/smallbody/ephem");

and similarly in python

import assist
assist.init("path/to/planet/ephem")
# or 
assist.init("path/to/planet/ephem", "path/to/smallbody/ephem")

If this function call results in an error for the user, then it's immediately clear what the issue is and how to fix it. It removes the stumbling block of learning what an environment variable is and how to set it.

Somewhat related to this, the metadata and the pointer to the data structures get stored in a static variable, which is effectively a global variable. If you only every read from that variable after it's initialized (which I think is the case), then the reading part is thread safe. However, the initialization is not. Imagine two threads both initializing this structure at the same time. It might work in practice, because it's a simple data structure, but there really should be a mutex around the init function if we want to claim it's thread safe (and to prevent we're accidentally loading in the data and allocating memory more than once).

The text was updated successfully, but these errors were encountered:

matthewholman · 2023-01-16T15:25:41Z

Should this be part of the existing assist_initialize() function or something separate?

hannorein · 2023-01-16T17:13:09Z

I'm not sure yet how to do this best...

matthewholman · 2023-01-23T17:19:11Z

I made a first attempt at this. There is now an assist_ephem_init("/path/to/planets/file", "path/to/asteroids/file") function. This has to be called before assist_integrate() is called.

There is probably a more graceful solution, but this is a start.

I need to add a corresponding python function.

matthewholman · 2023-01-23T17:36:09Z

Hanno, you'll need to add the paths to your ephemeris files for the unit tests to pass.

hannorein · 2023-01-23T18:54:02Z

Cool. Something like this is what I had in mind. I'm teaching today, but I'll think about how to make it more graceful/optional. I think it would make sense to look for the datafile in ASSIST_DIR/data by default.

This is much easier in python where you can just have an optional argument.

aryaakmal · 2023-01-23T20:05:35Z

Matt- it looks like your changes are specific to 'mholman' ... does the user need to modify the code ad hoc, or do you intend to make this more general? Update: Nevemind - looks like that is only in the unit tests (and the problem.c) .

aryaakmal · 2023-01-23T20:30:50Z

Would it make sense to look for the ephemeris files in the default location and only call assist_ephem_init if they are not found? Or modify the function in forces.c to look in the default location ...

matthewholman · 2023-01-23T20:37:13Z

The main ideas are (1) to avoid using environment variables, because users are not always familiar with those; and (2) to have the users explicitly call something like assist_ephem_init() with paths to the ephemeris files, so that any errors are very clear.

aryaakmal · 2023-01-23T20:43:40Z

So, by default assist looks for rebound installation in the same directory unless the REB_DIR environment variable is set. By the same token, cannot assist look for the files in assist/data by default before calling assist_ephem_init? The version with environment variables would look there, unless the environment variable were set to something else - so this would follow the same logic without requiring the user to set them, or to call assist_ephem_init, unless they choose to use some non-default location.

hannorein · 2023-01-23T21:56:57Z

I like the idea of @aryaakmal. We could have a char[] in assist_extras which is set NULL by default (look for files in default directory). If the user calls assist_ephem_init (or maybe assist_set_path) then it would set the char[] to something other than null and ASSIST would look there for the files.

hannorein · 2023-01-23T21:58:26Z

(with a graceful exit and message if the files are not in the default location)

aryaakmal · 2023-01-23T22:48:04Z

The same logic should apply to the wrapper. Currently a pip install of assist will not find the ephemeris files unless the environment variable is set - or unless the files are in the default location and the following is set

sys.path.append(r'/Users/mholman/assist')

matthewholman · 2023-01-23T23:44:35Z

I like the idea of storing the file names (or possibly pointers to the ephemeris objects) in assist_extras, but I am getting a little stuck on the order of things. With the current design, memory for a version of assist_extras is allocated when assist_integrate() calls assist_attach(), which then calls assist_initialize(). But that happens after assist_ephem_init().

matthewholman · 2023-01-24T15:25:12Z

I pushed another attempt at this, but it seems to have broken the benchmark unit test (and perhaps others).

hannorein · 2023-01-24T15:28:14Z

I agree, that order of function calls and memory management are a bit confusing. We struggled a lot with that with REBOUND and REBOUNDx too. One issue with having one function assist_integrate() is that everything is happening in one function call. So any new functionality needs to be a new argument which can quickly become overwhelming for someone who only uses a fraction of the functionality. If you look at some other C libraries, the workflow is typically something like:

struct object = smth_create();
object->setting = 0;
smth_do(object);
smth_free(object);

This is what I tried to replicate using the "plain interface". When it comes to memory management, the important question to always ask is: who owns the memory? The answer almost always helps with figuring when to allocate and free memory. For example, right now the space for the ephemeris data is allocated in assist_jpl_init. But it's never freed. That's not completely wrong, because the memory is allocated as a static variable. But I think it would make sense to add the data to the assist_extras struct. Then it's clear who it belongs to and it can be freed by assist_free.

matthewholman · 2023-01-24T15:47:11Z

This makes a lot of sense, but I think I am still missing something. Does this mean that there can only be one assist_extras struct?

hannorein · 2023-01-24T15:50:07Z

No quite the opposite. You could have multiple assist_extras in one simulation. For example, say, one which has GR turned on and one where it's off. Then you could integrate both of them and study the effect of GR on the trajectory.

matthewholman · 2023-01-24T15:53:14Z

That's what I hoped! But how do we share the memory allocated by assist_jpl_int and then free it when all the runs are finished?

hannorein · 2023-01-24T16:01:03Z

I'm not sure about the best approach! I can see two options:

Keep that memory static, but add a "reference counter" (also static) that gets increased when assist_extras is called and decreased when assist_free is called. When the last assist_free sees a reference counter of 0, it can free the static memory.
Just not bother and have the memory allocated multiple times so that each assist_extras struct owns their own memory can can free it when itself is freed.

I think 2) would be the cleanest approach. It also works on multiple threads without any extra work. Do you think performance would be an issue? It depends how short or long individual integrations typically are.

hannorein · 2023-01-24T16:54:42Z

Somewhat related: I'm trying to understand a bit better what assist_jpl_init does. It seems to read in names and values of constants in the ephemeris file. But they are never used. Instead, the constants that are hard coded in const.h are used. That file was generated with dev_tools/constants/main.c.

Wouldn't it make sense to just use the constants in the ephemeris file without this intermediate step?

matthewholman · 2023-01-24T16:56:29Z

Option 2 is definitely cleaner, but there is some overhead in the initialization step, as you just saw. Option 1 seems easy enough, but I don't know what would be entailed in making that thread safe.

aryaakmal · 2023-01-24T17:02:39Z

It may be vestigial. I think the .h file may have superseded the original jpl_init. Matt can you confirm?

…

On Tue, Jan 24, 2023 at 11:54 AM Hanno Rein ***@***.***> wrote: Somewhat related: I'm trying to understand a bit better what assist_jpl_init does. It seems to read in names and values of constants in the ephemeris file. But they are never used. Instead, the constants that are hard coded in const.h are used. That file was generated with dev_tools/constants/main.c. Wouldn't it make sense to just use the constants in the ephemeris file without this intermediate step? — Reply to this email directly, view it on GitHub <#7 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AOMJBQZADX5MN7ZSCI4ALUDWUACNZANCNFSM6AAAAAAT3LUMMQ> . You are receiving this because you were mentioned.Message ID: ***@***.***>

matthewholman · 2023-01-24T17:04:02Z

The const.h file gives us easy access to the various constants, but it be done other ways. I believe @aryaakmal is correct that the version in jpl_init is vestigial.

hannorein · 2023-01-24T17:47:07Z

(I think it would be more consistent to read the constants from the ephemeris file directly. If the file ever changes, one would need to manually recreate const.h. And given that the ephemeris file is constantly queried for all the actual ephemeris data, the ~20 constants that would be read only once at the beginning should not affect the performance. But I think that leads us away from the main topic.)

Regarding the bigger picture. How about the following:

We could create a new structure assist_ephem which represents the ephemeris data. The syntax could look something like:

struct assist_ephem* ephem = assist_ephem_alloc("path to file");
...
xyz = assist_ephem_get_position(ephem, PLANET_MARS, 12345.55);
...
assist_ephem_free(ephem);

After creating assist_ephem, it's read only. Nothing ever needs to be changed. Because of that, thread safety is easy. Functions that just query the position of a body at a given time such as assist_ephem_get_position above would only need access to assist_ephem. They don't need access to an assist_extras. Maybe users actually like the ability to simply query the ephemeris file for positions without doing an integration. Internally, the force routines would also call something like assist_ephem_get_position.

When one wants to do an integration then one creates an assist_extras structure. This would need two arguments, the REBOUND simulation to attach itself to, and an assist_ephem to query the ephemeris. So the entire syntax could look something like this:

struct reb_simulation* sim = reb_create_simulation()
struct assist_ephem* ephem = assist_ephem_alloc("path to file");
struct assist_extras* extras = assist_extras_init(sim, ephem);

assist_integrate(extras, ...);
// or
reb_integrate(sim, 1234.56);

...
rebound_simulation_free(sim);
assist_extras_free(extras);
assist_ephem_free(ephem);

One assist_ephem can be used by multiple assist_extras (because it's read only), so only one (slow) initialization is needed. The syntax is more verbose than before. But I think this would make sense logically. ASSIST is basically a glue code providing a bridge between REBOUND and the JPL ephemeris data. The JPL ephemeris functionality would be a feature of their own. And maybe attract a few more users who are only interested in accessing the ephemeris data and don't want to do an integration. Doing it this way would also get rid of all the static variables (which people in general don't like too much).

matthewholman · 2023-01-24T20:16:02Z

This is a nice solution, @hannorein. Thank you for sketching it out so clearly.

I agree that the ephemeris access routines will be useful by themselves. I've been using all_ephem() for various things; it's accessible from python. We can make sure the new version is too.

I think I can get started on this code, unless you or @aryaakmal would prefer to.

hannorein · 2023-01-25T02:11:53Z

Either way is fine. I should also have time tomorrow. One thing I’m not super happy with yet is the naming: init, create, alloc, attach all kind of do the same thing. I think I’m mostly to blame for this. It has never been consistent in Rebound …

matthewholman · 2023-01-25T13:48:24Z

unit_tests/benchmark has an issue where it appears that memory is being modified about being freed. The error says it's in rebound, but my bet is that it's something to do with the changes I made to assist_ephem_init(). Unfortunately, I haven't identified the problem yet. @hannorein, could you take a look?

hannorein · 2023-01-25T13:50:16Z

Will do.

I'm implementing the stuff we talked about above right now. Let me know if that interferes with what you're doing and I should stop.

matthewholman · 2023-01-25T13:53:28Z

No, it doesn't, but I will pause anyway.

hannorein mentioned this issue Jan 25, 2023

assist_ephem framework #30

Merged

matthewholman closed this as completed Jan 25, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Environment variables #7

Environment variables #7

hannorein commented Jan 14, 2023

matthewholman commented Jan 16, 2023

hannorein commented Jan 16, 2023

matthewholman commented Jan 23, 2023

matthewholman commented Jan 23, 2023

hannorein commented Jan 23, 2023

aryaakmal commented Jan 23, 2023 •

edited

Loading

aryaakmal commented Jan 23, 2023

matthewholman commented Jan 23, 2023

aryaakmal commented Jan 23, 2023 •

edited

Loading

hannorein commented Jan 23, 2023

hannorein commented Jan 23, 2023

aryaakmal commented Jan 23, 2023 •

edited

Loading

matthewholman commented Jan 23, 2023

matthewholman commented Jan 24, 2023

hannorein commented Jan 24, 2023 •

edited

Loading

matthewholman commented Jan 24, 2023

hannorein commented Jan 24, 2023

matthewholman commented Jan 24, 2023

hannorein commented Jan 24, 2023 •

edited

Loading

hannorein commented Jan 24, 2023

matthewholman commented Jan 24, 2023

aryaakmal commented Jan 24, 2023 via email

matthewholman commented Jan 24, 2023

hannorein commented Jan 24, 2023

matthewholman commented Jan 24, 2023

hannorein commented Jan 25, 2023

matthewholman commented Jan 25, 2023

hannorein commented Jan 25, 2023

matthewholman commented Jan 25, 2023

Environment variables #7

Environment variables #7

Comments

hannorein commented Jan 14, 2023

matthewholman commented Jan 16, 2023

hannorein commented Jan 16, 2023

matthewholman commented Jan 23, 2023

matthewholman commented Jan 23, 2023

hannorein commented Jan 23, 2023

aryaakmal commented Jan 23, 2023 • edited Loading

aryaakmal commented Jan 23, 2023

matthewholman commented Jan 23, 2023

aryaakmal commented Jan 23, 2023 • edited Loading

hannorein commented Jan 23, 2023

hannorein commented Jan 23, 2023

aryaakmal commented Jan 23, 2023 • edited Loading

matthewholman commented Jan 23, 2023

matthewholman commented Jan 24, 2023

hannorein commented Jan 24, 2023 • edited Loading

matthewholman commented Jan 24, 2023

hannorein commented Jan 24, 2023

matthewholman commented Jan 24, 2023

hannorein commented Jan 24, 2023 • edited Loading

hannorein commented Jan 24, 2023

matthewholman commented Jan 24, 2023

aryaakmal commented Jan 24, 2023 via email

matthewholman commented Jan 24, 2023

hannorein commented Jan 24, 2023

matthewholman commented Jan 24, 2023

hannorein commented Jan 25, 2023

matthewholman commented Jan 25, 2023

hannorein commented Jan 25, 2023

matthewholman commented Jan 25, 2023

aryaakmal commented Jan 23, 2023 •

edited

Loading

aryaakmal commented Jan 23, 2023 •

edited

Loading

aryaakmal commented Jan 23, 2023 •

edited

Loading

hannorein commented Jan 24, 2023 •

edited

Loading

hannorein commented Jan 24, 2023 •

edited

Loading