Fix ARM issues #1515

Merged
merged 7 commits into from
Dec 15, 2020

Conversation

georgebisbas
Contributor

@georgebisbas georgebisbas commented Nov 20, 2020

Supersedes 1510
Fix ARM issue, improve platform autodetection.

TODO:
Improve par-nested override for a global solution

Tested on Isambard's Marvell ThunderX2

@codecov

codecov bot commented Nov 20, 2020

Codecov Report

Merging #1515 (3eefea2) into master (78c8a6f) will decrease coverage by 0.67%.
The diff coverage is 62.62%.


@@            Coverage Diff             @@
##           master    #1515      +/-   ##
==========================================
- Coverage   87.17%   86.50%   -0.68%     
==========================================
  Files         203      206       +3     
  Lines       28950    29009      +59     
  Branches     3910     3914       +4     
==========================================
- Hits        25238    25094     -144     
- Misses       3274     3469     +195     
- Partials      438      446       +8     
Impacted Files Coverage Δ
devito/core/cpu.py 98.92% <ø> (-0.17%) ⬇️
devito/core/arm.py 30.95% <30.95%> (ø)
devito/archinfo.py 51.77% <50.00%> (-0.56%) ⬇️
devito/core/__init__.py 100.00% <100.00%> (ø)
devito/core/intel.py 100.00% <100.00%> (ø)
devito/core/power.py 100.00% <100.00%> (ø)
tests/test_gpu_openacc.py 37.34% <0.00%> (-62.66%) ⬇️
tests/test_gpu_openmp.py 73.48% <0.00%> (-25.97%) ⬇️
devito/core/gpu_openacc.py 45.06% <0.00%> (-25.93%) ⬇️
devito/compiler.py 48.56% <0.00%> (-6.71%) ⬇️
... and 6 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 78c8a6f...3eefea2.

@FabioLuporini
Contributor

FabioLuporini commented Nov 22, 2020

should the title be updated?

@georgebisbas georgebisbas changed the title from "[NOT FOR MERGE][TESTING ONLY] Fix arm issues" to "Fix arm issues" Nov 22, 2020
@georgebisbas georgebisbas force-pushed the fix-arm-issues-bisbas branch 2 times, most recently from c2894cf to 29967fc Compare November 22, 2020 21:45
@georgebisbas georgebisbas force-pushed the fix-arm-issues-bisbas branch 3 times, most recently from 0b7e78d to b106ca2 Compare November 23, 2020 17:30
- flags = [i for i in lines if i.startswith('flags')][0]
+ # ARM Thunder X2 is using 'Features' instead of 'flags'
+ flags = [i for i in lines if (i.startswith('Features')
+ or i.startswith('flags'))][0]
Contributor

nitpicking: the or i.startswith('flags') should be aligned with i.startswith('Features')

Contributor Author

But then flake8 complains (?)

Contributor

it will not

Contributor Author

ok
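
For readers following the 'flags' versus 'Features' discussion, here is a minimal sketch of the idea, not the code that was merged. The function name `parse_cpu_flags` and the sample strings are hypothetical; the only facts taken from the thread are that x86 labels the line 'flags' and the ThunderX2 labels it 'Features'. Passing a tuple to `str.startswith` also sidesteps the line-continuation alignment question.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the CPU feature flags found in /proc/cpuinfo-style text.

    Hypothetical helper: x86 kernels label the line 'flags', while the
    ARM ThunderX2 discussed in this PR labels it 'Features'.
    Returns None if neither label is present.
    """
    for line in cpuinfo_text.splitlines():
        # str.startswith accepts a tuple of prefixes, so no multi-line `or`
        if line.startswith(('flags', 'Features')):
            return line.split(':', 1)[1].strip().split()
    return None


# Illustrative sample inputs, not captured from real machines
x86_sample = "model name\t: Some x86 CPU\nflags\t\t: fpu vme sse sse2 avx2\n"
arm_sample = "processor\t: 0\nFeatures\t: fp asimd evtstrm aes\n"

print(parse_cpu_flags(x86_sample))
print(parse_cpu_flags(arm_sample))
```

Returning None when no line matches mirrors the behaviour the reviewers expect from the `get_cpu_xxx` helpers.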

@@ -64,12 +66,16 @@ def get_cpu_brand():

cpu_info['flags'] = get_cpu_flags()
cpu_info['brand'] = get_cpu_brand()
if cpu_info['brand'] is None:
cpu_info['brand'] = cpuinfo.get_cpu_info().get('raw_arch_string')
Contributor

Actually, this should be moved here in place of the return None right?

Contributor Author

As a new try-except pair I would say (?)

@georgebisbas georgebisbas changed the title from "Fix arm issues" to "Fix ARM issues" Nov 27, 2020
cpu_info['flags'] = ci.get('flags')
cpu_info['brand'] = ci.get('brand')
try:
if 'flags' not in cpu_info:
Contributor

is it true that the only possibility for flags not to be in cpu_info is that lines is empty? because if we enter the if lines: at line 34, then we go through lines 70-71, hence cpu_info['flags'] should be defined?

Contributor Author

This is the part where I do not know what happens in OSX, tests fail only there.

if 'flags' not in cpu_info:
# Fallback
ci = cpuinfo.get_cpu_info()
cpu_info['flags'] = ci.get('flags')
Contributor

so going back to my original comment. What I was actually suggesting was to move the content of the try-except in place of current line 48 -- instead of returning None, we return the cpuinfo.get_cpu_info().get('flags') ?

Contributor Author

again, I did that the previous days and was failing on OSX


# Detect number of logical cores
try:
if cpu_info['brand'] == 'aarch64':
# In some ARM processors psutils and lscpu fail to detect cores correctly
Contributor

this comment doesn't make much sense does it? you say that lscpu fails, but then you actually use it below?

Contributor Author

omg nice catch, I do not remember what I was thinking at that time.

# In some ARM processors psutils and lscpu fail to detect cores correctly
logical = psutil.cpu_count(logical=True)
physical = psutil.cpu_count(logical=False)
if physical != (lscpu()['Core(s) per socket'] * lscpu()['Socket(s)']):
Contributor

if not A:
   B = A

=>

B = A

Contributor Author

True. This is a leftover from using it to print a warning message that was later removed.
warning...core autodetection is not reliable....using lscpu....

try:
- physical = lscpu()['Core(s) per socket'] * lscpu()['Socket(s)']
+ physical = (lscpu()['Core(s) per socket'] * lscpu()['Socket(s)'])
Contributor

why adding parentheses?

Contributor Author

Fixed

@@ -107,15 +119,16 @@ def get_cpu_brand():
# Fallback 1: it should now be fine to use psutil
physical = psutil.cpu_count(logical=False)
if not physical:
- # Fallback 2: we might end up here on more exotic platforms such a Power8
- # Hopefully we can rely on `lscpu`
+ # Fallback 2: we might end up here
Contributor

can we avoid breaking this line into three?

also, typo: 'such a' -> 'such as'

should we now drop the "or due to erroneous autodetection" part?

Contributor Author

Yes

@@ -258,7 +271,8 @@ def lscpu():
if output:
lines = output.decode("utf-8").strip().split('\n')
mapper = {}
- for k, v in [tuple(i.split(':')) for i in lines]:
+ # Using split(':', 1) to avoid splitting lines with security issues
Contributor

"with security issues" . Can you be more precise? Nobody will understand it

# Using split(.....
# Example: ...

Contributor Author

What about:

        # Using split(':', 1) to avoid splitting lines where lscpu shows vulnerabilities
        # on some CPUs: https://askubuntu.com/questions/1248273/lscpu-vulnerabilities
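
To make the `split(':', 1)` point concrete, here is a small sketch. `parse_lscpu` is a hypothetical stand-in for the `lscpu()` helper under review, and the sample text is illustrative rather than captured from a real machine. It shows that a "Vulnerability" line whose value itself contains colons splits into more than two pieces unless `maxsplit=1` is used.

```python
def parse_lscpu(output):
    """Parse `lscpu`-style "key: value" text into a dict (hypothetical helper)."""
    mapper = {}
    for line in output.strip().split('\n'):
        if ':' not in line:
            continue
        # maxsplit=1 keeps any further colons inside the value
        key, value = line.split(':', 1)
        mapper[key.strip()] = value.strip()
    return mapper


# Illustrative sample text
sample = (
    "Architecture:                 aarch64\n"
    "Core(s) per socket:           32\n"
    "Socket(s):                    2\n"
    "Vulnerability Itlb multihit:  KVM: Mitigation: Split huge pages\n"
)

line = sample.strip().split('\n')[-1]
print(len(line.split(':')))    # more than two pieces, so `k, v = ...` would fail
print(parse_lscpu(sample)['Vulnerability Itlb multihit'])
```

A plain `tuple(i.split(':'))` on that last line cannot be unpacked into `k, v`, which is the failure the code comment is trying to describe.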

return flags.split(':')[1].strip().split()
except:
- return None
+ pass
Contributor

why this change? this will return None anyway, so it's clearer the other way

Contributor Author

Why will this return None?

Contributor Author

I think I get your point. To my understanding, if the last try-except pair in each get_cpu_xxx function returns None, then we are fine.
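
The `return None` versus `pass` question comes down to Python's implicit return value. The sketch below uses two hypothetical functions, and catches `IndexError` where the snippet under review uses a bare `except`. Both variants return None when no flags line is found; the explicit form only makes that visible to the reader.

```python
def get_flags_explicit(lines):
    """Variant that spells out the failure value."""
    try:
        flags = [i for i in lines if i.startswith('flags')][0]
        return flags.split(':')[1].strip().split()
    except IndexError:
        return None


def get_flags_implicit(lines):
    """Variant that swallows the error and falls off the end of the function."""
    try:
        flags = [i for i in lines if i.startswith('flags')][0]
        return flags.split(':')[1].strip().split()
    except IndexError:
        pass
    # No return statement is reached here, so Python returns None implicitly.


# Illustrative input with no 'flags' line
no_flags = ["model name : hypothetical cpu"]
print(get_flags_explicit(no_flags), get_flags_implicit(no_flags))
```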

cpu_info['flags'] = ci.get('flags')
cpu_info['brand'] = ci.get('brand')
try:
if cpu_info['flags'] is None:
Contributor

I don't see the point for this try-except. The following should suffice?

if not cpu_info.get('flags'):
    cpu_info['flags'] = cpuinfo.get_cpu_info().get('flags')

same for the other try-except below

Contributor Author

ok
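
A short sketch of why the `dict.get` form can replace the try-except. `fallback_cpu_info` is a hypothetical stand-in for `cpuinfo.get_cpu_info()`, used so the example runs without third-party packages. Because `dict.get` returns None for a missing key, one truthiness test covers a missing key, a None value, and an empty list.

```python
def fallback_cpu_info():
    """Hypothetical stand-in for a slower, more portable detection routine."""
    return {'flags': ['fp', 'asimd'], 'brand': 'hypothetical-cpu'}


def fill_missing(cpu_info, fallback=fallback_cpu_info):
    """Fill in 'flags' and 'brand' only where the first attempt found nothing."""
    for key in ('flags', 'brand'):
        # .get() never raises KeyError, so no try-except is needed
        if not cpu_info.get(key):
            cpu_info[key] = fallback().get(key)
    return cpu_info


print(fill_missing({}))
print(fill_missing({'flags': None, 'brand': 'already-detected'}))
```

Values that were already detected are left untouched, so the fallback is only consulted when needed.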

except KeyError:
cpu_info['brand'] = cpuinfo.get_cpu_info().get('brand')

# Detect number of physical and logical cores
Contributor

Drop this comment and replace with the comment below "Special case: in ... " (note that "In" -> "in" . Colons don't want capital letters afterwards)

Contributor Author

ok

# correctly so we use lscpu()
logical = psutil.cpu_count(logical=True)
physical = (lscpu()['Core(s) per socket'] * lscpu()['Socket(s)'])
cpu_info['logical'] = logical
Contributor

I still think that the issue with ARM is the correct detection of number of physical cores, not the logical cores. So I would:

  • drop the logical detection here
  • move this try-except after the "logical core detection" part, ie below line 110

Contributor Author

Yes, but getting the logical count here lets us return cpu_info early and avoid all the other lines after 110

Contributor

I was suggesting to move it right between lines 103 and 104, so you would still avoid "those other lines"

Contributor

if we do so, I think, it'd become a bit neater
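
As a rough illustration of the physical-core computation discussed in this thread, the sketch below multiplies the two `lscpu` topology fields and falls back to another count if they are unavailable. The function name `physical_cores`, its `other_count` parameter, and the sample values are assumptions. This simplifies the control flow under review and is not the merged implementation.

```python
def physical_cores(lscpu_info, other_count=None):
    """Prefer lscpu-style topology fields; fall back to `other_count`.

    `lscpu_info` is a dict such as a parsed `lscpu` output;
    `other_count` stands in for e.g. psutil.cpu_count(logical=False).
    """
    try:
        cores = int(lscpu_info['Core(s) per socket'])
        sockets = int(lscpu_info['Socket(s)'])
        return cores * sockets
    except (KeyError, ValueError, TypeError):
        # Fields missing or not numeric: use the alternative source
        return other_count


# Illustrative values for a dual-socket machine
print(physical_cores({'Core(s) per socket': '32', 'Socket(s)': '2'}))
print(physical_cores({}, other_count=8))
```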

@mloubout
Contributor

mloubout commented Dec 4, 2020

Considering the amount of try/except and if/else, would it be good to split archinfo into arch-specific cases and just check them one by one, like

for arch in [intel, arm, ....]
     if found(arch):
         do what's needed
         break
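
One way the per-architecture loop suggested above could look as runnable Python is sketched here. Every name in it (`detect_intel`, `detect_arm`, `DETECTORS`, `detect`) and the brand-string matching rules are hypothetical and chosen for illustration. Nothing in this PR implements it.

```python
def detect_intel(brand):
    """Hypothetical detector: returns platform info or None if not matched."""
    return {'arch': 'intel'} if 'intel' in brand.lower() else None


def detect_arm(brand):
    """Hypothetical detector for ARM brand strings."""
    lowered = brand.lower()
    return {'arch': 'arm'} if ('arm' in lowered or 'aarch64' in lowered) else None


# Detectors are tried in order; the first one that matches wins
DETECTORS = [detect_intel, detect_arm]


def detect(brand):
    for detector in DETECTORS:
        info = detector(brand)
        if info is not None:
            return info
    return {'arch': 'unknown'}


print(detect('Intel(R) Xeon(R)'))
print(detect('aarch64'))
print(detect('POWER8'))
```

Each detector owns its own special cases, so the shared function needs no nested try/except or if/else chains.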

try:
if 'arm' in cpu_info['brand']:
physical = (lscpu()['Core(s) per socket'] * lscpu()['Socket(s)'])
cpu_info['physical'] = physical
Contributor

can this be a single line or does it not fit the 90 chars limit?

Contributor

@FabioLuporini FabioLuporini left a comment

I actually think I'm quite happy with the current state of this PR

@georgebisbas georgebisbas force-pushed the fix-arm-issues-bisbas branch 2 times, most recently from 9bceb52 to 3b1ab1a Compare December 4, 2020 13:37
CustomOperator)
from devito.core.intel import (Intel64Operator, Intel64OpenMPOperator,
Intel64FSGOperator, Intel64FSGOpenMPOperator)
from devito.core.arm import (ArmOperator, ArmOpenMPOperator)
Contributor

no need for parentheses

Contributor Author

Fixed

from devito.core.intel import (Intel64Operator, Intel64OpenMPOperator,
Intel64FSGOperator, Intel64FSGOpenMPOperator)
from devito.core.arm import (ArmOperator, ArmOpenMPOperator)
from devito.core.power import (PowerOperator, PowerOpenMPOperator)
Contributor

same

Contributor Author

Fixed

__all__ = ['ArmOperator', 'ArmOpenMPOperator']


ArmOperator = CPU64Operator
Contributor

is this really needed?

Contributor Author

Hmm... not needed, but was it needed before?



ArmOperator = CPU64Operator
ArmOpenMPOperator = CPU64OpenMPOperator
Contributor

same

Contributor Author

ok

_specialize_iet = CPU64OpenMPOperator._specialize_iet


PowerOperator = CPU64Operator
Contributor

what is this?

Contributor Author

Dropped

PowerOperator = CPU64Operator
PowerOpenMPOperator = CPU64OpenMPOperator

ArmOperator = CPU64Operator
Contributor

same

Contributor Author

dropped

optimize_halospots, hoist_prodders, relax_incr_dimensions)
from devito.tools import timed_pass


Contributor

one extra blank line?

Contributor

@FabioLuporini FabioLuporini left a comment

OK now, thanks

Contributor

@FabioLuporini FabioLuporini left a comment

OK, good now, thanks. Merging

@FabioLuporini FabioLuporini merged commit a18d10a into master Dec 15, 2020
@FabioLuporini
Contributor

Merged.

@FabioLuporini FabioLuporini deleted the fix-arm-issues-bisbas branch December 15, 2020 08:15

Successfully merging this pull request may close these issues.

Platform autodetection fails on ARM
4 participants