
Running example commands produces error for "halo.c" #104

Closed · ealione opened this issue Oct 7, 2022 · 27 comments

ealione commented Oct 7, 2022

I have built the project on a "Linux 4.15.0-193-generic #204-Ubuntu" machine, following the exact instructions outlined in the README, without any apparent errors. My goal is to run perf on it and see which piece of code is most active (the hot kernel).

I thought I should start by running a few of the examples, including

    mkplummer p10.dat 10

and the Orbit example.

I'm constantly getting this error:

    ~/nemo$ mkplummer p10.dat 10
    nemo Debug Info: [bodytrans_new: invoking cc +saving .o]
    Fatal error [mkplummer]: bodytrans(): could not compile expr=r2

teuben (Owner) commented Oct 7, 2022

If you run:

  mkplummer p10.dat 10 debug-9

you should see a lot of output, and one of the lines I get to see is this:

### nemo Debug Info: [loadobjDL.c:41]: loadobj: /home/teuben/NEMO/nemo/obj/bodytrans/btr_r2.so

Perhaps during the install the btr_r2.c did not compile properly? Can you check this in the directory

 $NEMO/src/nbody/cores/bodysub/

and try

     make btr_r2.so

I suspect this failed for some reason.

One reason could perhaps be that you have a somewhat old system; the 4.15.0 kernel seems old to me. Is that Ubuntu 14 or 16? It should still work, though.
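
For context on the mechanism described above: bodytrans compiles the requested expression (here r2) into a tiny C function with cc (the "invoking cc +saving .o" debug line) and then loads the resulting btr_r2.so at run time via loadobj, which is why a broken compile step surfaces as the fatal error reported earlier. The sketch below only illustrates what such a generated function computes; the real btr_r2.c is written against NEMO's bodytrans.h and its Body accessor macros, so the types and names here are assumptions, not the actual NEMO source.

    /* Illustrative sketch only, not the actual NEMO source: expr=r2 becomes a
     * small compiled function that returns x^2 + y^2 + z^2 for one body. */
    typedef double real;

    typedef struct {                /* hypothetical minimal body: position only */
        real pos[3];
    } Body;

    real btr_r2(Body *b, real t, int i)
    {
        (void)t; (void)i;           /* time and index are unused by this expression */
        return b->pos[0]*b->pos[0] + b->pos[1]*b->pos[1] + b->pos[2]*b->pos[2];
    }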

ealione (Author) commented Oct 8, 2022

Good news and bad news. There was no Makefile in the directory you indicated, so I decided to recompile NEMO. This time I'm getting some additional errors:

    ~/nemo$ source nemo_start.sh
    ~/nemo$ mkplummer p10.dat 10 debug-9
    Fatal error [mkplummer]: getdparam(mlow=debug-9) parsing error -12, assumed 0.999

    ~/nemo$ rotcurves name1=halo pars1=0,1,1 radii=0:8:0.1
    Fatal error [rotcurves]: get_potential: no potential halo.c found

    ~/nemo$ mkorbit - 1 0 0 0 1 0 potname=halo | orbint - - 10000 0.01 | orbplot -
    nemo Debug Info: Dvel=-1
    Fatal error [mkorbit]: get_potential: no potential halo.c found
    Fatal error [orbint]: error in reading input orbit
    nemo Debug Info: [bodytrans_new: invoking cc +saving .o]
    Fatal error [orbplot]: bodytrans(): could not compile expr=x

teuben (Owner) commented Oct 8, 2022 via email

teuben (Owner) commented Oct 8, 2022

btw, there is a Makefile in $NEMO/src/nbody/cores/bodysub/ for sure. I suspect something in that directory didn't install right. You should do a "make clean install" in that directory and carefully watch for any possible error messages. The procedure in that directory hasn't changed in 10+ years (legacy software!).

ealione (Author) commented Oct 10, 2022

Hi Teuben,
this is my exact version of Ubuntu:

    LSB Version:    core-9.20170808ubuntu1-noarch:printing-9.20170808ubuntu1-noarch:security-9.20170808ubuntu1-noarch
    Distributor ID: Ubuntu
    Description:    Ubuntu 18.04.6 LTS
    Release:        18.04
    Codename:       bionic

After running ./configure, I'm afraid this is all I see in the directory:

    ~$ cd nemo/src/nbody/cores/bodysub/
    ~/nemo/src/nbody/cores/bodysub$ ls
    BTclean    bti_1.c    bti_key.c  btr_0.c    btr_ar.c   btr_ax.c   btr_az.c   btr_dens.c
    btr_eps.c  btr_glat.c btr_i.c    btr_jx.c   btr_jz.c   btr_m.c    btr_mul.c  btr_r2.c
    btr_r.c    btr_v2.c   btr_vp.c   btr_vr.c   btr_vt.c   btr_vy.c   btr_x.c    btr_y.c
    btr_z.c    bti_0.c    bti_i.c    BTNAMES    btr_1.c    btr_aux.c  btr_ay.c   btr_dec.c
    btr_ekin.c btr_etot.c btr_glon.c btr_jtot.c btr_jy.c   btr_key.c  btr_mub.c  btr_phi.c
    btr_ra.c   btr_t.c    btr_v.c    btr_vr2.c  btr_vt2.c  btr_vx.c   btr_vz.c   btr_xsky.c
    btr_ysky.c Makefile

teuben (Owner) commented Oct 10, 2022 via email

ealione (Author) commented Oct 10, 2022

It seems that things are indeed not going as expected. I had a look at install.log.

I am executing the following steps:

pasted here to save some space: https://pastebin.com/eCeeR89z

And then, having a look at install.log (a fairly large file), I do indeed see a few errors.

stored here due to size: https://drive.google.com/file/d/1hQf4yUmontufjHFXkHyPfE37ffLMXu8P/view?usp=sharing

teuben (Owner) commented Oct 10, 2022 via email

ealione (Author) commented Oct 10, 2022

I am not sure what the reason for the issues is; maybe there is something different about my system. But it seems that, most probably, the other issues stem from this one:

  • I don't have ldso, because the command you suggested throws this error:

        ~/nemo/src/scripts$ make install
        Makefile:3: /makedefs: No such file or directory
        make: *** No rule to make target '/makedefs'.  Stop.

ealione (Author) commented Oct 11, 2022

Not entirely sure why, but running the provided installation script, as described on the project's readthedocs page, instead of performing the process manually, worked fine.

ealione closed this as completed Oct 11, 2022
teuben (Owner) commented Oct 11, 2022

There is also an abbreviated version of the install on the README.md page in the main repo. Which procedure did you follow the first time? Perhaps you forgot a step; it might be good to see if you can reproduce it. I do this so often that I know it works, but it's easy to forget the "source nemo_start.sh" line if you want to use NEMO in the shell after installing via the Makefile.

ealione (Author) commented Oct 11, 2022

What I tried executing was the following:

    ./configure --with-yapp=pgplot --without-csh
    make build check bench5
    source nemo_start.sh

btw, what would be a long-running example? As I said, I want to profile the application and have a look at which function is the most important (most frequently called). Most of the examples I had a chance to test were extremely fast, though.

teuben (Owner) commented Oct 11, 2022 via email

ealione (Author) commented Oct 11, 2022

Yup, I realized in the end that tcsh is a must. OK, thank you, I will have a look at the benchmark you suggested.

teuben (Owner) commented Oct 11, 2022 via email

ealione (Author) commented Oct 11, 2022

Here you can find the complete output from my latest build: https://pastebin.com/Aa1M3qXU

Or do you mean the output from my unsuccessful attempt?

teuben (Owner) commented Oct 11, 2022

That looks quite all right now.

teuben (Owner) commented Oct 13, 2022

Now that NEMO is running for you, what would you like to do with it? If you're looking for a challenge, I have some interesting issues to discuss.

ealione (Author) commented Oct 13, 2022

For the most part I am interested in seeing if I can offload parts of NEMO to a hardware accelerator, as a benchmark for a toy compiler I am building. I am more than interested to hear what you might have in mind, but I'm afraid my "astro"physics knowledge will disappoint.

teuben (Owner) commented Oct 13, 2022

The mysterious (what I think is a) compiler error is described in #98.

teuben (Owner) commented Oct 13, 2022

I've been playing with OpenMP but haven't had a lot of luck speeding things up significantly.

ealione (Author) commented Oct 14, 2022

So, the way I see it, the function void potential_double(ndim, pos, acc, pot, time) is the most costly one for operations like mkplummer with a large number of bodies.

Because of the time variable it needs to run serially in time for each body, so it can't be parallelized there. But there is nothing stopping it from being run in parallel across all bodies, right?

There seem to be a whole lot of potential_double functions, so I'm trying to see which one is used in this case, and where, in order to check whether there is any possibility of parallelizing the calls across bodies.
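
To make the per-body parallelism concrete, below is a minimal OpenMP sketch. It assumes the potential_double(ndim, pos, acc, pot, time) routine quoted above takes its arguments by pointer and keeps no hidden static state (i.e. it is safe to call from several threads at once); both assumptions would need to be verified against the actual potential implementation.

    /* Hypothetical OpenMP sketch: evaluate the potential independently for
     * every body at the same epoch.  Assumes potential_double() matches the
     * signature quoted above with pointer arguments, and that it is
     * reentrant; a potential that caches internal state would break here. */
    #include <omp.h>

    void potential_double(int *ndim, double *pos, double *acc,
                          double *pot, double *time);

    void eval_all_bodies(int nbody, int ndim,
                         double *pos,   /* nbody x ndim positions, row-major   */
                         double *acc,   /* nbody x ndim accelerations (output) */
                         double *pot,   /* nbody potential values (output)     */
                         double time)
    {
        #pragma omp parallel for
        for (int i = 0; i < nbody; i++) {
            int    nd = ndim;
            double t  = time;           /* same time for every body */
            potential_double(&nd, &pos[i*ndim], &acc[i*ndim], &pot[i], &t);
        }
    }

Whether such a loop pays off depends on how expensive a single potential evaluation is relative to the threading overhead, which may be part of why the OpenMP experiments mentioned above gave only modest speedups.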

teuben (Owner) commented Oct 14, 2022 via email

ealione (Author) commented Oct 14, 2022

Apologies, you are correct, it was obviously orbit; I tested a bunch of stuff.
I'll have a look at snapscale.

teuben (Owner) commented Oct 14, 2022 via email

ealione (Author) commented Oct 15, 2022

It doesn't seem that I am able to find "snapscale" in the codebase. Additionally, yes, I had a look at potential and it definitely is not parallelizable!

teuben (Owner) commented Oct 15, 2022 via email
