Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Aarch64 - enable large immediate offset #1250

Merged
merged 2 commits into from Aug 7, 2017

Conversation

swalk-cavium
Copy link
Contributor

Facebook is using ocaml version 4.03 in their hhvm project. Recently they hit a limit in the
ocaml code generation for ARM64. This blocked building hhvm on ARM64 hosts.

The error signature is given blow, but it hit a unique sequence of instructions. ocaml
was trying to generate an invalid immediate offset in a subtract instruction. This change
will use one of the temporary registers if the value is too large.

The testsuite was run before/after this change. No difference in the output was seen.

Since this is a limit problem, I'm not sure how to create a small test case.

hhvm/third-party/ocaml/build/bin/ocamlopt.opt -c -g -w A -w -3-4-6-29-35-44-48-50-52 -I parser -I fsnotify_linux -I watchman -I server -I ast -I hhbc -I hhi -I watchman_event_watcher -I typing -I libancillary -I dfind -I recorder -I deps -I heap -I s earch -I find -I decl -I format -I utils -I ide_rpc -I debug -I client -I hackfmt -I globals -I diff -I stubs -I naming -I monitor -I proc s -I socket -I options -I injection/default_injector -I utils/process -I utils/collections -I utils/errors -I utils/hh_json -I utils/lint -I utils/hg -I utils/disk -I hackfmt/error -I hackfmt/line_splitter -I hackfmt/debug -I third-party/avl -I third-party/inotify -I third-pa rty/core -I parser/coroutine -I parser/schema -o parser/full_fidelity_validated_syntax.cmx parser/full_fidelity_validated_syntax.ml
/tmp/camlasmc2926f.s: Assembler messages:
/tmp/camlasmc2926f.s:38621: Error: immediate out of range //<<---
File "parser/full_fidelity_validated_syntax.ml", line 1:
Error: Assembler error, input left in file /tmp/camlasmc2926f.s
Command exited with code 2.

38617 add x6, x6, #:lo12:camlFull_fidelity_validated_syntax__invalidate_list_with_6322
38618 str x6, [x7, #16]
38619 str x13, [x7, #24]
38620 str x11, [x7, #32]
38621 .L4580: sub x27, x27, #8728 //<<---
38622 cmp x27, x28
38623 add x0, x27, #8
38624 b.lo .L4581
38625 str x0, [sp, #96]
38626 movz x6, #17, lsl #16
parser/full_fidelity_validated_syntax.s

Copy link
Contributor

@xavierleroy xavierleroy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the suggested fix I think this code is correct and fixes the issue. I can think of a slightly better fix (in terms of code size) but it can wait for later.

` sub {emit_reg reg_alloc_ptr}, {emit_reg reg_alloc_ptr}, #{emit_int n}\n`;
` cmp {emit_reg reg_alloc_ptr}, {emit_reg reg_alloc_limit}\n`;
` add {emit_reg i.res.(0)}, {emit_reg reg_alloc_ptr}, #8\n`;
if n > 4096 then begin
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be n >= 4096

Copy link
Contributor Author

@swalk-cavium swalk-cavium Jul 19, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @xavierleroy,

A quick experiment shows that 4096 is encodable with the shifting form of the instruction.
The assembler automatically selects this.

cat t.s
	sub x0, x0, #4096
swalk@arm64:~$ cc -c t.s
swalk@arm64:~$ objdump -d t.o
0000000000000000 <.text>:
   0:	d1400400 	sub	x0, x0, #0x1, lsl #12
 cat t1.s
	sub x0, x0, #4097
swalk@arm64:~$ cc -c t1.s
t1.s: Assembler messages:
t1.s:1: Error: immediate out of range

Continuation of GPR#1250.
Generate a shorter instruction sequence in the "large size" case.
Less code duplication.
Also update the "instr_size" function accordingly.
@xavierleroy
Copy link
Contributor

I took the liberty to push the different fix I had in mind directly to the cavium_dev_071817a branch of swalk-cavium/ocaml. If you would like to re-test on your big code, let me know how it goes.

@xavierleroy xavierleroy dismissed their stale review July 28, 2017 09:10

Code changed since review request.

@swalk-cavium
Copy link
Contributor Author

@xavierleroy - Your change looks tighter. Unfortunately, hhvm is using the 4.03 version.
When I bring your change over it looks like there's some type problems.

1201 -- Looking for time
1202 File "arm64/emit.mlp", line 444, characters 19-24:
1203 Error: Unbound record field words
1204 Makefile:748: recipe for target 'asmcomp/emit.cmo' failed
1205 make[6]: *** [asmcomp/emit.cmo] Error 2
1206 -- Looking for time - found
1207 -- Looking for rdtscll

It looks like asmcomp/arm64/arch.ml has changed significantly between
versions.

@xavierleroy
Copy link
Contributor

Thanks for the feedback. The problem you see is a small change of intermediate representation (by @mshinwell) between 4.04 and 4.05 that affects the various asmcomp/*/emit.mlp files.

At any rate we will not fix 4.03 (it's just too old), so you can go on with 4.03 by appying your initial fix, knowing that the "instr_size" function is not quite right.

I tried to reproduce the issue and couldn't get ocamlopt to produce a minor heap allocation of 8728 bytes, as in your problem report. Indeed the compiler and runtime system are quite careful to limit minor heap allocations to 256 words (= 2048 bytes on a 64-bit machine). Very large minor heap allocations (greater than the size of the minor heap, for example) would spell major trouble. So, I'm afraid the real bug is elsewhere and we're just removing its symptoms, not curing it.

Could you please tell me how to reproduce the original issue? It doesn't matter if there is a lot of code to compile, as long as the source code is publically available. (No NDAs.)

@mshinwell
Copy link
Contributor

I confirmed (by hacking at the repository below until enough of it built) on x86-64 that an Ialloc instruction with an out of range argument is generated. I think it's because in this file:

https://github.com/facebook/hhvm/blob/master/hphp/hack/src/parser/full_fidelity_validated_syntax.ml

there is a very large set of mutually-recursive functions under a functor. The resulting closure of those functions cannot be allocated on the minor heap. I think we need to modify Cmmgen to ensure it goes through some function kind of like make_alloc_generic in these cases. I will prepare a patch.

@xavierleroy
Copy link
Contributor

Thanks @mshinwell for looking into this. I wasn't suspecting closure construction, but it is suspicious indeed.

@mshinwell
Copy link
Contributor

Please see #1271.

@swalk-cavium
Copy link
Contributor Author

Hello @xavierleroy, building hhvm per the instructions here
https://docs.hhvm.com/hhvm/installation/building-from-source
used to generate the issue. Part of the screen scrape from the build is shown above
in the initial description. To get it to fail now, you'd have to remove the patch I submitted.

As noted by @mshinwell above, Facebook is adding a new full fidelity parser written in ml.
I haven't seen any documentation about what they are doing. I think Facebook might want to
be aware of the issues described above.

xavierleroy added a commit that referenced this pull request Jul 31, 2017
Fixes for Ialloc instructions allocating more than Max_young_wosize words in the minor heap

Out-of-range Ialloc instructions cause various problems, see in particular GPR #1250.
xavierleroy pushed a commit that referenced this pull request Jul 31, 2017
Cherry-pick of GPR#1271 which was merged on trunk.

Fixes for Ialloc instructions allocating more than Max_young_wosize words in the minor heap

Out-of-range Ialloc instructions cause various problems, see in particular GPR #1250.
@mshinwell
Copy link
Contributor

This looks correct to me, I am going to merge.

@mshinwell mshinwell merged commit 566ed41 into ocaml:trunk Aug 7, 2017
xclerc pushed a commit to xclerc/ocaml that referenced this pull request Nov 9, 2017
Cherry-pick of GPR#1271 which was merged on trunk.

Fixes for Ialloc instructions allocating more than Max_young_wosize words in the minor heap

Out-of-range Ialloc instructions cause various problems, see in particular GPR ocaml#1250.
EduardoRFS pushed a commit to EduardoRFS/ocaml that referenced this pull request Aug 8, 2020
Cherry-pick of GPR#1271 which was merged on trunk.

Fixes for Ialloc instructions allocating more than Max_young_wosize words in the minor heap

Out-of-range Ialloc instructions cause various problems, see in particular GPR ocaml#1250.
EmileTrotignon pushed a commit to EmileTrotignon/ocaml that referenced this pull request Jan 12, 2024
Co-authored-by: Thibaut Mattio <thibaut.mattio@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants