@shamisp
Contributor

shamisp commented May 24, 2016

XPMEM already defines the max address value XPMEM_MAXADDR_SIZE,
which has special handling on the kernel level. Therefore we should
avoid usage of VADER_MAX_ADDRESS, which actually causes failures
on some systems.

Signed-off-by: Pavel Shamis (Pasha) <pasharesearch@gmail.com>

@shamisp
Contributor Author

shamisp commented May 24, 2016

FYI @hjelmn

@hjelmn
Member

hjelmn commented May 24, 2016

@shamisp The problem is that XPMEM_MAXADDR_SIZE in the Cray version of xpmem is wrong. If xpmem attempts to attach to an address above the value hard-coded in vader, an error is returned. I plan to update my version of xpmem to expose the real maximum, not just -1. This was affecting the bound, not the attached region.

@shamisp
Contributor Author

shamisp commented May 24, 2016

On the other hand, VADER_MAX_ADDRESS does not work on all systems. Do you have any other idea how to fix this?

@hjelmn
Member

hjelmn commented May 24, 2016

Not sure. I fear we may have to define VADER_MAX_ADDRESS per architecture. Otherwise we may see attach failures. I wonder if SGI ever solved this problem... I will look at their xpmem source code.

@shamisp
Contributor Author

shamisp commented May 25, 2016

@hjelmn how did you come up with the number that you use now?

@hjelmn
Member

hjelmn commented May 25, 2016

I originally got the value from trial and error with the API. I confirmed it from the kernel source later.

@hjelmn
Member

hjelmn commented May 25, 2016

For reference the largest address that can be attached to is TASK_SIZE which is (1 << 47) - 4096 on linux x86_64. See http://lxr.free-electrons.com/source/arch/x86/include/asm/processor.h#L747. This line in xpmem enforces the max attach address: https://github.com/hjelmn/xpmem/blob/master/kernel/xpmem_attach.c#L372.

@hjelmn
Member

hjelmn commented May 25, 2016

I wonder if it will be fine to just use VADER_MAX_ADDRESS on x86. The problem is that Linux x86 reserves the last 4096 bytes of the user virtual address space. This causes an attach in the top 2MB of the stack to fail, because vader by default tries to attach to 2MB aligned/sized regions, which by definition include the reserved page. If the Linux kernel doesn't reserve the last page on other platforms, we could simply use (size_t) -1 on those platforms. There may still be problems if the user attempts to further increase the attach size, but that might be ok.
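
For readers following along, the arithmetic behind the failure can be sketched as below. This is only an illustration, not vader's actual code: the `SKETCH_*` macros, `struct attach_range`, and `sketch_attach_range` are hypothetical names, and the constants are the Linux x86_64 values quoted in this thread (TASK_SIZE of `(1 << 47) - 4096` and a 2MB attach alignment).

```c
#include <assert.h>
#include <stdint.h>

/* Values quoted in this thread for Linux x86_64; the names are hypothetical. */
#define SKETCH_TASK_SIZE    ((UINT64_C(1) << 47) - 4096) /* highest attachable bound */
#define SKETCH_ATTACH_ALIGN (UINT64_C(1) << 21)          /* 2MB attach granularity */

struct attach_range {
    uint64_t base;  /* inclusive start of the attached region */
    uint64_t bound; /* exclusive end of the attached region */
};

/* Expand [addr, addr + len) outward to 2MB boundaries, then clamp the upper
 * bound to max_address. Assumes addr + len + alignment does not overflow. */
static struct attach_range sketch_attach_range(uint64_t addr, uint64_t len,
                                               uint64_t max_address)
{
    struct attach_range r;

    assert(len > 0 && addr < max_address);

    r.base  = addr & ~(SKETCH_ATTACH_ALIGN - 1);
    r.bound = (addr + len + SKETCH_ATTACH_ALIGN - 1) & ~(SKETCH_ATTACH_ALIGN - 1);

    if (r.bound > max_address) {
        r.bound = max_address; /* keep the reserved last page out of the range */
    }

    return r;
}
```

With a buffer near the top of the stack, say `addr = 0x7ffffffde000` and `len = 4096`, the unclamped bound rounds up to `1 << 47`, which lies past TASK_SIZE and would include the reserved page. Clamping to `SKETCH_TASK_SIZE` keeps the bound at `0x7ffffffff000`, whereas clamping to an all-ones maximum leaves it unchanged.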

@hjelmn
Member

hjelmn commented May 25, 2016

Doing a spot check around different archs: ARM, PPC, MIPS, etc. all use a power of two for TASK_SIZE.

@shamisp
Contributor Author

shamisp commented May 25, 2016

@hjelmn - not a bad idea. In UCX we only register the memory that we need, so we don't have this issue.

@hjelmn
Member

hjelmn commented May 25, 2016

Yeah, this comes from trying to keep the rcache small so lookups remain fast.

@hjelmn
Member

hjelmn commented May 25, 2016

I can make a new commit to define VADER_MAX_ADDRESS as XPMEM_MAXADDR_SIZE on non-x86 systems if you want, or you can submit a new patch. If other platforms run into issues, we can always special-case those platforms.
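
A rough sketch of what that proposal might look like, for discussion only. The `#ifndef` stand-in for XPMEM_MAXADDR_SIZE is an assumption so the fragment compiles without `xpmem.h` (the thread describes the real value as effectively -1), `sketch_clamp_bound` is a hypothetical helper, and only `__x86_64__` is special-cased here; other x86 variants would need their own value.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in so this sketch builds without xpmem.h (assumption, see above). */
#ifndef XPMEM_MAXADDR_SIZE
#define XPMEM_MAXADDR_SIZE UINTPTR_MAX
#endif

#if defined(__x86_64__)
/* Linux x86_64: TASK_SIZE is (1 << 47) - 4096, the last page is reserved. */
#define VADER_MAX_ADDRESS ((uintptr_t) 0x7ffffffff000ul)
#else
/* Other platforms: trust the xpmem-provided maximum until one needs a special case. */
#define VADER_MAX_ADDRESS ((uintptr_t) XPMEM_MAXADDR_SIZE)
#endif

/* Hypothetical helper: limit the bound of an attach request to VADER_MAX_ADDRESS. */
static uintptr_t sketch_clamp_bound(uintptr_t base, uintptr_t bound)
{
    assert(base < bound);
    return bound > VADER_MAX_ADDRESS ? VADER_MAX_ADDRESS : bound;
}
```

Platforms that turn out to reserve part of the top of the address space could then get their own `#elif` branch.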

@shamisp
Contributor Author

shamisp commented May 25, 2016

No problem. I will update the patch.

@shamisp
Contributor Author

shamisp commented May 26, 2016

Based on the discussion, I'm closing this PR and will submit a new one.
