Skip to content

Conversation

@ekouts
Copy link
Contributor

@ekouts ekouts commented Aug 30, 2021

When we detect automatically the topology we used to set directly only the _info attribute of the processor of the partition. As a result the other attributes of the ProcessorType were not set and we would get for example self.current_partition.processor.num_cpus as None while self.current_partition.processor.info is a non empty dictionary. The other attributes were only set in the _init_: https://github.com/eth-cscs/reframe/blob/34ee3d0bc91f7c9f5e56db1c6b46a49e8e5ee623/reframe/core/systems.py#L38

Same for the devices, we should set self.current_partition._devices to a list of DeviceType objects and not dictionaries.

@ekouts ekouts added this to the ReFrame Sprint 21.08.2 milestone Aug 30, 2021
@ekouts ekouts requested review from rsarm and vkarak August 30, 2021 15:57
@ekouts ekouts self-assigned this Aug 30, 2021
@pep8speaks
Copy link

pep8speaks commented Aug 30, 2021

Hello @ekouts, Thank you for updating!

Cheers! There are no PEP8 issues in this Pull Request!Do see the ReFrame Coding Style Guide

Comment last updated at 2021-09-04 21:02:21 UTC

@codecov-commenter
Copy link

codecov-commenter commented Aug 30, 2021

Codecov Report

Merging #2168 (cac4606) into master (7d6b831) will increase coverage by 0.02%.
The diff coverage is 95.74%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2168      +/-   ##
==========================================
+ Coverage   86.29%   86.31%   +0.02%     
==========================================
  Files          53       53              
  Lines        9571     9557      -14     
==========================================
- Hits         8259     8249      -10     
+ Misses       1312     1308       -4     
Impacted Files Coverage Δ
reframe/frontend/autodetect.py 62.41% <83.33%> (+0.54%) ⬆️
reframe/core/systems.py 89.38% <96.87%> (+0.62%) ⬆️
reframe/utility/jsonext.py 86.04% <100.00%> (+1.23%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7d6b831...cac4606. Read the comment docs.

Copy link
Contributor

@vkarak vkarak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a serious one! Thanks for catching it @ekouts! I would suggest and alternative design altogether, though, for both the ProcessorType and the DeviceType. I would just make them simple wrappers of their underlying dictionary. All the properties should simply do return self._processor_info.get('propname', None) and both ProcessorType and DeviceType should provide an update(info) method that would update the underlying dictionary.

@ekouts
Copy link
Contributor Author

ekouts commented Aug 31, 2021

So just to make sure I get this right:

All the properties should simply do return self._processor_info.get('propname', None)

I will change the __getattr__ in order to try to get all the properties from the dictionary.

  1. Should we allow certain attribute names or leave the user ask for anything and just return None? It might be a small problem for typos if the user doesn't know he is just putting the wrong name.
  2. Should __getattr__ handle the attributes like num_cores_per_socket that are inferred (this one is self.num_cores // self._num_sockets) or they will stay as they are now?
  3. We also would not be getting the properties in the documentation, although the names are mostly self explanatory I guess.

both ProcessorType and DeviceType should provide an update(info) method that would update the underlying dictionary.

Should this update work like a dictionary update, so it will be "additive" or overwrite the old dictionary? I guess we want the second one, right?

@vkarak
Copy link
Contributor

vkarak commented Aug 31, 2021

@ekouts You raise good points:

I will change the getattr in order to try to get all the properties from the dictionary.

That's a good idea.

Should we allow certain attribute names or leave the user ask for anything and just return None? It might be a small problem for typos if the user doesn't know he is just putting the wrong name.

I would vote for returning None only for the known properties (those defined in the docs/json schema). For the rest, you should delegate to super().__getattr__() so as to get an AttributeError. To achieve this, the class must keep a list of the known properties.

Should getattr handle the attributes like num_cores_per_socket that are inferred (this one is self.num_cores // self._num_sockets) or they will stay as they are now?

I would keep them as properties that do self.known_property.

We also would not be getting the properties in the documentation, although the names are mostly self explanatory I guess.

We should point to the configuration description for the standard properties and document here only the derived ones.

Should this update work like a dictionary update, so it will be "additive" or overwrite the old dictionary? I guess we want the second one, right?

What if works in an additive way but the target dictionary (the one that comes from the conf file) takes precedence? If the inserted/autodetected value is already there, then we simply use the existing one. Something like a setdefault() for the whole dictionary.

@ekouts
Copy link
Contributor Author

ekouts commented Aug 31, 2021

What if works in an additive way but the target dictionary (the one that comes from the conf file) takes precedence? If the inserted/autodetected value is already there, then we simply use the existing one. Something like a setdefault() for the whole dictionary.

Now if the processor dictionary in the configuration is not an empty dictionary in the configuration we don't try the auto-detection at all. For me it's better (and obviously much easier) to keep this behavior since you might prefer to have little "reliable" information than ReFrame's guess. But if you think it's worth it I can try that.

@vkarak
Copy link
Contributor

vkarak commented Aug 31, 2021

Ok, let's keep it as is for the moment and bring it up for discussion.

@ekouts
Copy link
Contributor Author

ekouts commented Sep 3, 2021

@jenkins-cscs retry all

@vkarak
Copy link
Contributor

vkarak commented Sep 4, 2021

I changed a bit the implementation, because the __slots__, as we had them, didn't have an effect, since the ProcessorInfo was inheriting a __dict__ from the JSONSerializable. So I introduced the __slots__ to that one and then made the encoding/decoding to understand also objects that define __slots__. The problem I had afterwards, though, was that the ProcessorInfo could not be restored, since the logged attributes were not writeable when the object was being restored (by --restore-session). So eventually, I just used the __slots__ to store only the internal _info object, so that this will be the only attribute to be dumped and read from the JSON representation. I used another class attribute, _known_attrs to treat specially the known attributes that we used to have in __slots__.

@vkarak vkarak changed the title [bugfix] Fix bug in topology autodetection [bugfix] Fix incomplete processor info object when processor topology is auto-detected Sep 4, 2021
@vkarak vkarak merged commit 79bedf8 into reframe-hpc:master Sep 4, 2021
@ekouts ekouts deleted the bugfix/autodetect_processor_info branch October 1, 2021 06:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants