Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NVIDIA Jetson Orin support #1135
base: master
Are you sure you want to change the base?
NVIDIA Jetson Orin support #1135
Changes from all commits
05cc093
04a3cd0
99979d4
ce8191c
67c4803
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the part that breaks verification currently. It would be relatively easy to update, but I'd like to understand better what we are really doing here. Why are we usually not setting
inner shareable
, and could setting it unconditionally have any adverse effect on older platforms?Is there any potential interaction with device memory?
The explanation in the commit message was useful. I think it would be good to have more of it as a comment in the code (maybe where INNER_SHARABLE is defined).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
None, ARM states in RPYFVQ (DDI 0487J.a) that the value doesn't matter for device memory, which it will just treat as Outer Shareable.
I think the current reasoning is that it's not really needed for UP systems. Having a more permissive shareability such as
inner shareable
might mean wasted cycles trying to be coherent with the other PEs when it's not really needed. However, I believe that even while running seL4 in UP mode, another agent in the system may still need to be coherent with the PE/node. I have not had a deeper look as to which part of the Orin SoC might be requiring this.I'll try and copy the explanations into the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Example: We've had some existing problems with the Orin's System Cache. It is not an architectural cache and is shared by the CPU/GPU, making it effectively a CPU L4 cache. It may be the case that this is not within the non-shareable domain.
I use this example because we added a cache line invalidate by VA on a range of memory, which seemed to half-fix some of the instabilities we were encountering. Setting the shareability to IS fully fixed this issue and made the fix redundant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, that makes sense. It would be good to see if there is any measurable performance impact then, even if we're not expecting any. I guess we could just kick off a sel4bench run and see what happens on the existing platforms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, I believe shareability and cacheability attribute values for user frame mapping is policy that needs to be delegated to user level.
It can also mean that the system has to fall back to uncached memory if it doesn't provide hardware cache-coherency. So increasing the shareability from none, to inner to outer can degrade performance if it's not needed.
Wouldn't this be an issue with inner vs outer cacheability attributes rather than shareability?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Being cache coherent with other cores should be the default behaviour for seL4, to avoid very hard to debug surprises. So I am strongly in favour of this change.
Those pages would be marked
cacheable
and this change doesn't change anything for them on SMP.These SoCs are made for coherency between cores, I don't think downgrading non-SMP performance to the same level as with SMP for some obscure corner case is a problem.
And the proper fix for that would be to extend the API and provide more fine grained cacheability and shareability attribute values to user space, like you propose.
It does sound like that, so I'm also a bit surprised this change fixed that. Maybe the GPU is in the same cluster?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this commit be split out into a separate PR and discussed separately. It's scope is more than the Jetson Orin support.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems Orin specific then? So the comment should clearly say this, or we could even have a platform-specific setting then for Orin besides the A78AE's default
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, that should be more clear, while the A78AE supports 48 bits, I haven't figured out a way to enabke this without causing a chain reaction of assertion failures within the kernel.
My understanding is that the ARM64 kernel only has 47 useable bits of physical memory unless hypervisor mode is enabled, if anyone knows the context to this please share :).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Considering the Jetson Orinmodules don't support more than 64GB, let alone 1TB, why not treat is as a Cortex-A78?
Or are you also adding lockstep support to seL4 later on?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is some discussion on 48 bit physical address space in #1175 and #1157, but the result is basically that the full 48 bit are currently not supported and would require further changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps a good idea to add a
compile_assert(CONFIG_MAX_NUM_NODES * GICR_PER_CORE_SIZE <= GICR_SIZE);
to reduce future frustration.