Port iris.x all-to-all from Triton to Gluon. Prepare a detailed report comparing performance and generated assembly. For reference, check out the Triton source code (locate the installed package with pip show triton).