
CPU low load #55

Closed
IgorK7 opened this issue Mar 3, 2023 · 7 comments


IgorK7 commented Mar 3, 2023

Hi,

I am not sure whether it is a bug or a feature of this package.

I've noticed that the CPU cores are loaded to no more than about 6% at most. I provide all the cores available on the system, and it uses all of them, but the load is very low. This happens both on my local Intel Mac and on GCE (see the screenshot below).

Here is the config.json file (everything else is default). The data are random, with 10,000 observations and 2 predictors.

{ "experiment": {
"logdir": None
},
"task" : {
"task_type" : "regression",
"metric": "inv_nmse",
"metric_params": (0,),
"function_set" : ["add", "sub", "mul", "div", "exp", "log", "sqrt" ,"const"] #
} ,
"training" : { #"epsilon" : 0.05,
"n_cores_batch" : -1 # "epsilon" : 1,
},
"prior": {
"length": {
"min_": 3,
"max_": 15,
"on": True
}}}

[screenshot: per-core CPU load]


IgorK7 commented Mar 5, 2023

It seems like it is still loading only one core at a time, although I specified using all available cores. I would appreciate any advice here.
[screenshot: per-core CPU load]


brendenpetersen commented Mar 10, 2023

Your CPUs should definitely be pegged at 100% using this configuration.

As you know, setting n_cores_batch to -1 is shorthand for using all available cores. From core.py:

if n_cores_batch == -1:
    n_cores_batch = cpu_count()
if n_cores_batch > 1:
    pool = Pool(n_cores_batch, initializer=set_task, initargs=(self.config_task,))

I'm wondering if there could be some OS/machine-specific issue with cpu_count? Could you try the following in a Python shell?

import multiprocessing
print(multiprocessing.cpu_count())
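As an additional sanity check on Linux, os.sched_getaffinity(0) reports the cores the current process is actually allowed to use, which can differ from cpu_count() inside containers or under restricted CPU affinity:

import multiprocessing
import os

print(multiprocessing.cpu_count())     # all logical cores on the machine
print(len(os.sched_getaffinity(0)))    # cores available to this process (Linux only)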

Also, how are you running this config? If it's via command-line, is it the simple python -m dso.run config.json or do you have other command-line arguments in that command?

One other thing that might help narrow things down: does the problem persist when you remove the const token from function_set? On that note, do you know whether your Cython extension installed successfully? (Sometimes this fails, in which case DSO falls back on a pure Python version.)
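A quick way to check the Cython extension is to try importing it directly; this sketch assumes the compiled module is exposed as dso.cyfunc (the name used in the repo's setup) and that DSO otherwise falls back silently to the Python path:

import importlib

try:
    cyfunc = importlib.import_module("dso.cyfunc")
    print("Cython extension found:", cyfunc.__file__)
except ImportError:
    print("Cython extension not found; the pure Python fallback will be used")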


brendenpetersen commented Mar 11, 2023

@IgorK7 FYI we just updated the repo to a new release. It had a lot of refactoring, so apologies if it breaks any existing configs; you might have to shuffle some things around in your configs following the new template config_common.json.

Not sure if any of the edits will address the two issues you have raised.


IgorK7 commented Mar 11, 2023

Hi Brenden,

Thank you very much for getting back to me.

I am using your package in a research project where I investigate the properties of symbolic regression, and DSR in particular, in a setting with very high noise (true R² around 5%). All of the calculations therefore require a lot of computational capacity and time.

I can confirm that
print(multiprocessing.cpu_count())
reports the correct core count. The program loads all the cores, but only at a fraction of their capacity.
Yes, I tested it via the command line with
python -m dso.run config.json
and the effect is the same: the load per core is very low. Tracking the average load on the built-in datasets is not practical (unless looping), since they have no noise and the program simply runs too fast. I supply my own data of 10,000 rows and 2 explanatory variables with noise, and the average load per core is still low (around 5%); a sketch of how such data can be generated is below.
The inclusion/exclusion of const does not affect the load.
Cython is installed.
I run the algorithm on three different machines: an Intel Mac with 10 physical cores, two 48-core instances on Google Compute Engine, and Windows Subsystem for Linux (WSL) on an Intel PC with 8 physical cores. The results are the same: low load per core, although all cores are loaded.
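A minimal sketch of generating such a noisy dataset and pointing the config at it; the file name, ground-truth function, and noise level are illustrative, and it assumes the regression task's "dataset" option accepts a path to a headerless CSV whose last column is the target (the custom-dataset format described in the repo):

import numpy as np

# Illustrative only: 10,000 rows, 2 predictors, heavy noise (very low true R^2).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(10000, 2))
signal = X[:, 0] * X[:, 1]                                   # arbitrary ground truth
y = signal + rng.normal(scale=4.0 * signal.std(), size=10000)

# Headerless CSV, target in the last column.
np.savetxt("my_noisy_data.csv", np.column_stack([X, y]), delimiter=",")

The task section of the config would then include "dataset": "my_noisy_data.csv".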

By the way, running the program on WSL should be a faster alternative to Docker if one needs to run it on a Windows PC.

I am not sure how to make it run on the CPUs at full capacity.
Also, is there an option in the package setup to make it run on a GPU? I could not find one, but I am not a programmer.

Thank you very much!


IgorK7 commented Mar 11, 2023

I will also test the new version of the package. Thank you!


IgorK7 commented Mar 19, 2023

Regarding the GPU: I was able to make it utilize the GPU, but it still does not load it to full capacity. The behavior is the same whether I use a small dataset (10,000 by 3) or a large one (10,000,000 by 9).
I was able to install tensorflow==1.14 + CUDA 10.2 + cuDNN 8.6.0 on Ubuntu 18.04 with an Intel CPU and an NVIDIA Tesla T4 (resolving compatibility issues along the way, since tf==1.14 officially supports only CUDA 10.0).
Still, any advice for speeding up computations would be appreciated.

[screenshot: GPU utilization]


brendenpetersen commented Mar 25, 2023

GPU is not going to help. The GPU is used for the neural network (which is on the TensorFlow compute graph) but not for computing the MSE (which is done off the compute graph). Since the DSO LSTM is a very small network, the GPU just doesn't help. The CPU (e.g., computing the MSE on the dataset for the regression task) is going to be the computational bottleneck.
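One way to confirm that the CPU-side reward evaluation dominates is to profile a short run. A minimal sketch, assuming the DeepSymbolicOptimizer entry point shown in the repo's README and a config.json in the working directory (the output file name is arbitrary):

import cProfile
import pstats

from dso import DeepSymbolicOptimizer

model = DeepSymbolicOptimizer("config.json")

# Train under the profiler, then print the 20 most time-consuming calls.
profiler = cProfile.Profile()
profiler.enable()
model.train()
profiler.disable()
profiler.dump_stats("dso_profile.out")
pstats.Stats("dso_profile.out").sort_stats("cumulative").print_stats(20)

Note that time spent inside multiprocessing workers does not appear in the main-process profile, so this is most informative with n_cores_batch set to 1.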
