Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Needless allocation for the last of multiple identical operations #10765

Closed
MoritzBrueckner opened this issue Jul 25, 2022 · 9 comments
Closed

Comments

@MoritzBrueckner
Copy link

MoritzBrueckner commented Jul 25, 2022

Hi,

the following example https://try.haxe.org/#5eFC48BC contains three identical lines of code

array[0] = tmp.add(tmp);

In the generated JS code, the last of those lines (independent of the number of identical lines!) gets an allocation where it isn't required. All but the last occurrences of this line get transpiled to

array.setFloat32(0,real + real);
array.setFloat32(4,imag + imag);

whereas the last one instead becomes

let value = new Complex(real + real,imag + imag);
array.setFloat32(0,value.real);
array.setFloat32(4,value.imag);

for some reason. It doesn't really make sense here to allocate memory just for a temporary variable, and it's slow. If you remove one original line or duplicate it once more, it is still only the last occurrence that results in an allocation.

If you force inlining of all functions in the example by making them extern, the last of the identical lines in Haxe produces the error

Test.hx:8: characters 14-26 : Forced inline constructor could not be inlined
Test.hx:8: characters 3-26 : ... Cancellation happened here

It's pretty frustrating because the original code (where the lines aren't completely identical, but very similar) is quite a performance bottleneck. Because the used ComplexArray does not contain Complex objects, I need to use new Complex() in the get() method if I want to have a complex number type instead of doing all the float inlining myself all the time.

Apart from that issue, I wish there was information in the error message that tells you why the inlining is not possible, so that one can try to find a workaround. Otherwise it's pretty useless in some situations.

Thanks :)


Full example code, in case the link to it gets invalid some day (click to expand)
class Test {
	static function main() {
		final array = new ComplexArray(10);

		final tmp = array[0];
		array[0] = tmp.add(tmp);
		array[0] = tmp.add(tmp);
		array[0] = tmp.add(tmp);
	}
}

@:forward
abstract ComplexArray(js.lib.DataView) {
	public inline function new(length: Int) {
		final buffer = new js.lib.ArrayBuffer(length * 2 * 4);
		this = new js.lib.DataView(buffer, 0, buffer.byteLength);
	}

	@:arrayAccess
	public static inline function get(impl: ComplexArray, index: Int): Complex {
		return new Complex(impl.getFloat32(index * 2 * 4), impl.getFloat32((index * 2 + 1) * 4));
	}

	@:arrayAccess
	public static inline function set(impl: ComplexArray, index: Int, value: Complex): Complex {
		impl.setFloat32(index * 2 * 4, value.real);
		impl.setFloat32((index * 2 + 1) * 4, value.imag);
		return value;
	}
}

class Complex {
	public var real: Float;
	public var imag: Float;

	public inline function new(real: Float, imag: Float) {
		this.real = real;
		this.imag = imag;
	}

	public inline function add(other: Complex): Complex {
		return new Complex(this.real + other.real, this.imag + other.imag);
	}
}

This produces (JS):

(function ($global) { "use strict";
class Test {
	static main() {
		let buffer = new ArrayBuffer(80);
		let array = new DataView(buffer,0,buffer.byteLength);
		let real = array.getFloat32(0);
		let imag = array.getFloat32(4);
		array.setFloat32(0,real + real);
		array.setFloat32(4,imag + imag);
		array.setFloat32(0,real + real);
		array.setFloat32(4,imag + imag);
		let value = new Complex(real + real,imag + imag);
		array.setFloat32(0,value.real);
		array.setFloat32(4,value.imag);
	}
}
class Complex {
	constructor(real,imag) {
		this.real = real;
		this.imag = imag;
	}
}
class haxe_iterators_ArrayIterator {
	constructor(array) {
		this.current = 0;
		this.array = array;
	}
	hasNext() {
		return this.current < this.array.length;
	}
	next() {
		return this.array[this.current++];
	}
}
{
}
Test.main();
})({});
@back2dos
Copy link
Member

Seems like somehow the last expression in a block is treated differently.

Two things work around this:

  1. adding a noop to the block: https://try.haxe.org/#14316955
  2. making set return nothing: https://try.haxe.org/#F26fCbeA

@R32
Copy link
Contributor

R32 commented Jul 25, 2022

I hope the Optimizer could be a little more perfect, until '-D analyzer-optimize' is enabled by default.

@Simn
Copy link
Member

Simn commented Jul 26, 2022

This is governed by the captured argument to analyze_aliases. It is initially set to false, but becomes true for non-final block elements.

I'm not sure if it's safe to always start off with this being true, but if I understand the logic correctly we can certainly do so for the body of a TFunction.

@Simn
Copy link
Member

Simn commented Jul 26, 2022

Closed by 3f13c47

@basro I'd appreciate if you'd take a quick look to confirm that this makes sense!

@basro
Copy link
Contributor

basro commented Jul 26, 2022

@Simn yes, I think this is correct.

The captured argument controls whether the expression being analyzed is allowed to result in a reference to an inlined object. (a bit of a misnomer, initially it was meant for things that end up 'captured' by a local variable but it can be used safely to just allow inline values that are never assigned but also never read)

@MoritzBrueckner
Copy link
Author

MoritzBrueckner commented Oct 31, 2023

Hi, it seems that this issue still happens in Haxe 4.3.2 when the code is in a loop: https://try.haxe.org/#91D89228.

public inline function copyFrom(other: ComplexArray) {
	// No allocation here
	abstract[0] = other[0];

	for (i in 0...this.byteLength >>> 3) {
		// Also no allocation
		abstract[0] = other[0];

		// ... but here there is
		abstract[0] = other[0];
	}

	// No allocation here
	abstract[0] = other[0];
}

JS Output:

let imag = arrayB.getFloat32(4);
arrayA.setFloat32(0,arrayB.getFloat32(0));
arrayA.setFloat32(4,imag);
let _g = 0;
let _g1 = arrayA.byteLength >>> 3;
while(_g < _g1) {
	++_g;
	let imag = arrayB.getFloat32(4);
	arrayA.setFloat32(0,arrayB.getFloat32(0));
	arrayA.setFloat32(4,imag);
	let value = new Complex(arrayB.getFloat32(0),arrayB.getFloat32(4));  // <--
	arrayA.setFloat32(0,value.real)
	arrayA.setFloat32(4,value.imag);
}
let imag1 = arrayB.getFloat32(4);
arrayA.setFloat32(0,arrayB.getFloat32(0));
arrayA.setFloat32(4,imag1);

@Simn Please let me know if I should open a new issue for this :)

Edit: Another observation: The same issue also happens at the end of if/else blocks when they're inside a loop: https://try.haxe.org/#8020BA9e. The noop workaround mentioned in #10765 (comment) needs to be included in the if/else blocks then in order to work.

@basro
Copy link
Contributor

basro commented Oct 31, 2023

The reason why inlining is being cancelled in this example is because the last statment of the while block is a reference to a potentially inlined object.

The constructor inliner logic doesn't have a special case for the body of while statements and the default behaviour of the inliner is to cancel the inlining if it sees a reference escape into expressions for which it doesn't have a special case.

@MoritzBrueckner if you want to work around this limitation you can just make the last expression in the while block be something else, for example:

	for (i in 0...this.byteLength >>> 3) {
		// Also no allocation
		abstract[0] = other[0];

		// ... but here there is
		abstract[0] = other[0];
		null; // This is a noop but it will make the constructor inliner know that the previous expression is not leaking the reference.
	}

@MoritzBrueckner
Copy link
Author

Thanks, just to be sure that I correctly understand the reason: it is because of the implicit "return" (or however you might call that in this case) of the last expression of {} blocks? So it wouldn't be possible to implement this specific optimization without potentially breaking valid code?

@basro
Copy link
Contributor

basro commented Oct 31, 2023

It's 100% possible to improve the constructor inliner so that these cases are handled without breaking any valid code.

To be more clear about why it's failing:

The constructor inliner doesn't understand while expressions and so when it sees

while(..condition...) referenceToPotentiallyInlinedObject;

It treats it in the same way as:

someNonInlinedFunction(referenceToPotentiallyInlinedObject);

Basically, any time it finds an expression in a situation that it doesn't have a special case for, it will cancel the inlining of any object the expression references.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants