Skip to content

Fix awk locale and enable SSE#92

Merged
anttikantee merged 1 commit intorumpkernel:masterfrom
fltt:patches
Mar 17, 2017
Merged

Fix awk locale and enable SSE#92
anttikantee merged 1 commit intorumpkernel:masterfrom
fltt:patches

Conversation

@fltt
Copy link
Copy Markdown
Contributor

@fltt fltt commented Mar 13, 2017

This pull request fixes two unrelated issues.
The first is a problem with awk: if the current locale defines as the decimal dot something different from a dot (e.g., a comma), build-rr.sh fails to parse gcc and ld's version numbers.

The second issue is that at boot CR4's OSFXSR flag is left cleared, preventing the Rumprun unikernel from using the floating point routines contained in the compiler_rt library, when compiled for the 32 bit i386 architecture.
I've also enabled the OSXMMEXCPT flag (to keep the boot code in sync with the 64 bit version).

@anttikantee
Copy link
Copy Markdown
Member

Can you quickly remind me what will happen on CPUs which don't support that feature, and/or point out since when all CPUs do support that feature.

@fltt
Copy link
Copy Markdown
Contributor Author

fltt commented Mar 13, 2017

This came out when I tried to follow the instructions of the "Rumprun unikernel video series" part 1 -- when running the mpg123 unikernel on my 32 bit FreeBSD box I got an endless stream of "trap: #UD" on the screen.

Actually the problem is not in the CPU supporting that feature but in the OS supporting it. I don't know why, but to use the SSE instructions the OS must tell the CPU that it (the OS) has support for the FXSAVE and FXRSTOR instructions. From Intel's "Intel® 64 and IA-32 Architectures Software Developer's Manual -- Volume 3A: System Programming Guide, Part 1", section 2.5 "Control Registers":

OSFXSR
Operating System Support for FXSAVE and FXRSTOR instructions (bit 9 of CR4) -- When set, this flag: (1) indicates to software that the operating system supports the use of the FXSAVE and FXRSTOR instructions, (2) enables the FXSAVE and FXRSTOR instructions to save and restore the contents of the XMM and MXCSR registers along with the contents of the x87 FPU and MMX registers, and (3) enables the processor to execute SSE/SSE2/SSE3/SSSE3/SSE4 instructions, with the exception of the PAUSE, PREFETCHh, SFENCE, LFENCE, MFENCE, MOVNTI, CLFLUSH, CRC32, and POPCNT.
If this flag is clear, the FXSAVE and FXRSTOR instructions will save and restore the contents of the x87 FPU and MMX instructions, but they may not save and restore the contents of the XMM and MXCSR registers. Also, the processor will generate an invalid opcode exception (#UD) if it attempts to execute any SSE/SSE2/SSE3and instruction, with the exception of PAUSE, PREFETCHh, SFENCE, LFENCE, MFENCE, MOVNTI, CLFLUSH, CRC32, and POPCNT. The operating system or executive must explicitly set this flag.

@anttikantee
Copy link
Copy Markdown
Member

I understand that the OS needs to enable the instructions -- I did explicitly change that on x86_64 precisely for mpg123 (*).

However, while we can assume that all x86_64 CPUs support MMX/SSE, the question was what happens on 32bit x86 for CPUs which do not support those instructions. If it crashes on boot for all applications, that's a regression in the OS. If it crashes when an application tries to use instructions not available on the current CPU, that's an application problem.

*) eac6324#diff-13ce2f3233e9122a657f41ed772ccb22

@fltt
Copy link
Copy Markdown
Contributor Author

fltt commented Mar 14, 2017

What if I check for SSE support before setting those flags?

/* check if the cpuid instruction is available */
pushfl
popl    %eax
movl    %eax, %ecx
xorl    $0x200000, %eax
pushl   %eax
popfl
pushfl
popl    %eax
xorl    %ecx, %eax
je      no_sse

/* cpuid available, now test the sse feature flag */
movl    $1, %eax
cpuid
test    $0x2000000, %edx
jz      no_sse

/* enable sse */
movl    %cr4, %eax
orl     $0x600, %eax
movl    %eax, %cr4

no_sse:

@anttikantee
Copy link
Copy Markdown
Member

That's what a canonical OS does. However, do we need to do it for either of the reasons I mentioned earlier?

Also, note that x86_cpuid() already uses cpuid unconditionally, so you might as well skip the "is cpuid supported" part. According to wikipedia, cpuid is supported even in later 486's, and there are probably other assumptions in the 32bit code which prevent from running on pre-Pentium.

But, since the code checking for SSE is so short, we could just do that and not think about it.

Bonus points if you want to define some of the magic numbers in headers ;-)

@fltt
Copy link
Copy Markdown
Contributor Author

fltt commented Mar 14, 2017

I put the CPUID check on purpose, to tease you. :-)
Infact, if you were asking me if we should care about supporting CPUs older than the Pentium III (which was the first one to include the SSE instructions in 1999) I would have said -- is this proper english? -- that the chances to see the Rumprun kernel run on such relics is so near to zero that nobody could tell the difference.

However, for I don't known if all the models that followed the Pentium III do support the SSE instructions, I agree with you to add the SSE check (not the CPUID's) and stop thinking about it.
And consider giving those magic numbers proper names done.

@anttikantee
Copy link
Copy Markdown
Member

There might be some embedded x86 CPUs where the lack of SSE is relevant. Anyway, good plan.

@anttikantee
Copy link
Copy Markdown
Member

I cherry-picked the awk fix. Seems like a useful thing to have regardless of when you get around to revising the SSE patch.

@fltt
Copy link
Copy Markdown
Contributor Author

fltt commented Mar 16, 2017

I've put the CPUID and CRx values and flags in a new file: arch/x86/flags.h
Tell me if you prefer those defines to be put somewhere else.

@anttikantee
Copy link
Copy Markdown
Member

reg.h

Also, I'd name x86_64 long mode enable something else than IA32_ENABLE

Finally, comment is out of date: s/cpuid available, now //

Thanks for the 0x20 -> ' ' ;-)

@fltt
Copy link
Copy Markdown
Contributor Author

fltt commented Mar 17, 2017

It's not IA32_ENABLE but IA32E_ENABLE: IA32E is Intel's name for long mode.
I admit that some comment here and there would be handy. What if I add something like:

/* Extended Feature Enable Register MSR */
#define IA32_EFER 0xc0000080

/* Long Mode Enable */
#define IA32_EFER_IA32E_ENABLE_MASK 0x00000100

to the EFER defines and something similar to the other defines too?

OK for the other points.

@anttikantee
Copy link
Copy Markdown
Member

Intel didn't invent long mode ;-)

But, anyway, I did ask you to name the constants, so maybe it's easier if we just go with the names you chose.

Generally, you seem to know which way to hold an editor (hope it's vi!). If you have future plans for Rumprun, I could give you push access, so we can avoid more naming discussions...

@fltt
Copy link
Copy Markdown
Contributor Author

fltt commented Mar 17, 2017

I didn't say Intel invented it -- they just have their own nomenclature.

At the moment I don't have future plan. I think the unikernel (whether Rumprun or others) has some place in the future of the cloud computing and as a software engineer worth his salt I cannot ignore it.

(sorry, I'm mostly an Emacs user but I don't dislike vi)

@anttikantee
Copy link
Copy Markdown
Member

Ok, adjust your branch for the non-naming parts, and I'll merge it.

For libcompiler_rt makes use of SSE instructions in its floating point
routines, CR4's OSFXSR flag must be enabled or an #UD exception will
ensue.
This patch also enables the OSXMMEXCPT flag, just to be in sync with
the amd64 version of the boot code.
@fltt
Copy link
Copy Markdown
Contributor Author

fltt commented Mar 17, 2017

I changed my mind -- after all, I don't like Intel's nomenclature.
I've also shortened the names.

@anttikantee anttikantee merged commit c7f2f01 into rumpkernel:master Mar 17, 2017
@anttikantee
Copy link
Copy Markdown
Member

Ok. Send another pull req if you change your mind again ;-)

Thank you!

@fltt fltt deleted the patches branch March 17, 2017 22:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants