
PHP 8 roadmap #140

Closed
niftyhack opened this issue Dec 16, 2020 · 7 comments
Labels: question (Further information is requested)

Comments

@niftyhack

Hi,

parts of this project depend on https://github.com/RubixML/Tensor, which uses https://zephir-lang.com. Development of Zephir was put on hold and Zephir is not to be expected to be compatible with PHP 8, see https://blog.phalcon.io/post/the-future-of-phalcon.

What does the future of RubixML look like in terms of PHP 8?

niftyhack added the question label on Dec 16, 2020
@andrewdalpino (Member) commented Dec 16, 2020

Hi @niftyhack, PHP 8 support was added in 0.2.3. The Tensor extension is optional and, as you say, may or may not work with PHP 8. I haven't tested it myself, so it would be nice to hear from someone who has tried it. Its replacement, Tensor extension 2.0.0, is written in C rather than Zephir and will be PHP 8 compatible. I can only give you a general estimate as to when the new extension will be available for general use: between 3 and 12 months. It will bring features like CPU multiprocessing and cache-efficient matrix blocking algorithms. The current Zephir extension is about 2.5X faster than the PHP-land code; we are hoping to see a 10X to 100X improvement with the new extension.
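For context on the CPU multiprocessing item: the idea is to split a single operation's work across cores. Below is a minimal sketch of that approach for matmul using POSIX threads, assuming plain row-major double buffers; it is illustrative only, not the extension's actual code.

```c
#include <pthread.h>
#include <stddef.h>

/* Illustrative row-partitioned matmul: each thread computes a
   contiguous band of rows of C = A * B, where A is m x k, B is
   k x n, and C (m x n) has been zeroed by the caller. */
typedef struct {
    const double *a, *b;
    double *c;
    size_t k, n, row_start, row_end;
} job_t;

static void *matmul_rows(void *arg)
{
    job_t *j = arg;
    for (size_t i = j->row_start; i < j->row_end; ++i)
        for (size_t p = 0; p < j->k; ++p)
            for (size_t q = 0; q < j->n; ++q)
                j->c[i * j->n + q] += j->a[i * j->k + p] * j->b[p * j->n + q];
    return NULL;
}

void matmul_parallel(const double *a, const double *b, double *c,
                     size_t m, size_t k, size_t n, size_t nthreads)
{
    pthread_t tid[nthreads];
    job_t jobs[nthreads];
    size_t band = (m + nthreads - 1) / nthreads;  /* rows per thread */

    for (size_t t = 0; t < nthreads; ++t) {
        size_t start = t * band;
        size_t end = start + band < m ? start + band : m;
        jobs[t] = (job_t){ a, b, c, k, n, start, end };
        pthread_create(&tid[t], NULL, matmul_rows, &jobs[t]);
    }
    for (size_t t = 0; t < nthreads; ++t)
        pthread_join(&tid[t], NULL);
}
```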

The idea is to optimize our Tensor extension, or provide an interface to the C Array extension through Tensor, and then gradually add features that will require the extension (it would no longer be optional), such as LSTM and convolutional neural networks, which the Tensor API alone will not support.

https://github.com/RubixML/Tensor

https://github.com/phpsci/phpsci-carray

Great question! Let me know if you have any more, and feel free to join us in our Telegram channel:

https://t.me/RubixML

@niftyhack (Author)

Hi,

Awesome, thanks for the heads up. 👍

@andrewdalpino (Member)

Just a quick update: the numbers we're seeing as of today are a 35X speedup on a 500 x 500 x 2 matmul operation on an Intel CPU.

@marclaporte (Member)

FYI, we will test https://doc.tiki.org/Machine-Learning with PHP 8 and report any issues.

We will easily be able to compare against PHP 7 because Virtualmin supports easy switching of PHP versions, and PHP 8 was added a few months ago:
virtualmin/virtualmin-gpl@5db536f

@andrewdalpino (Member)

Another update! We've removed the Zephir code from Tensor as of version 2.0.6, and the extension now compiles directly from C source. Now that we're in C, we are beginning research into optimizations such as the ones mentioned above. This will also help ensure we have a path forward even if one of the upstream projects fails.

https://github.com/RubixML/Tensor/releases/tag/2.0.6

Tensor extension 3.0 research ticket

RubixML/Tensor#5

@andrewdalpino (Member) commented Feb 1, 2021

Just an update on Tensor: we were able to get a 6X speedup on the MNIST example project in the latest update to the extension, up from 2.5X in the previous version. Most of the speed came from migrating the Zephir code to C, which allows us to make additional optimizations. All the optimizations so far have spun out of the research in RubixML/Tensor#5.

https://github.com/RubixML/Tensor/releases/tag/2.1.1

Copied from our Telegram channel ...

Neural network training is about 6X faster (up from 2.5X) with the extension than with plain PHP, due to the optimization of matrix multiplication (matmul) in this module:

https://github.com/RubixML/Tensor/blob/23790aa20c844e50bc014954dbe5f0b5c000a4fb/ext/include/linear_algebra.c

I think if we can find a way to get the compiler to vectorize this code, then we can get 10X with no problem, but I haven't figured out how to get it to vectorize yet. The problem is that the data in the arrays are not scalars; they are zval structs. You'll notice on the following line that the zvals are being unpacked from structs into doubles or longs:

https://github.com/RubixML/Tensor/blob/23790aa20c844e50bc014954dbe5f0b5c000a4fb/ext/include/linear_algebra.c#L59

Since SIMD registers and operations require that all the data be the same type, this may be what is preventing the compiler from vectorizing this loop. I've tried using multiple accumulators, casting everything to doubles, and looping over a plain C array, but no success yet. Admittedly, I don't have much C programming experience.
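To make the vectorization problem concrete, here is a simplified illustration. The boxed_t struct below mimics a tagged value like zval but is not the real Zend definition. Compilers generally refuse to auto-vectorize the first loop because every element has to be type-checked and unpacked; once the data sits in a flat double array, as in the second loop, auto-vectorization becomes possible (for the sum reduction you typically also need -ffast-math or manual multiple accumulators, since SIMD reorders the additions).

```c
#include <stddef.h>

/* Simplified stand-in for a tagged PHP value; NOT the real zval. */
typedef struct {
    unsigned char type;                      /* 0 = long, 1 = double */
    union { long lval; double dval; } value;
} boxed_t;

/* Branchy per-element unpacking in the hot loop defeats
   auto-vectorization: the compiler cannot load a SIMD register
   full of homogeneous doubles straight from these structs. */
double dot_boxed(const boxed_t *a, const boxed_t *b, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double x = a[i].type ? a[i].value.dval : (double) a[i].value.lval;
        double y = b[i].type ? b[i].value.dval : (double) b[i].value.lval;
        sum += x * y;
    }
    return sum;
}

/* Unpack once into contiguous doubles beforehand, then run a clean
   loop the compiler can turn into SIMD (e.g. gcc -O3 -ffast-math). */
double dot_flat(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        sum += a[i] * b[i];
    return sum;
}
```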

There are opportunities for cache optimization as well. For example, we can block the matrices so that each sub-computation fits entirely into cache memory:

http://www.netlib.org/utk/papers/autoblock/node2.html
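A minimal sketch of that blocking (loop tiling) idea, assuming square row-major double matrices and a tile size tuned to the target cache; illustrative only, not the extension's actual code:

```c
#include <stddef.h>

#define TILE 64  /* tune so a few TILE x TILE tiles fit in L1/L2 cache */

/* Blocked C += A * B for n x n row-major matrices. Each tile of the
   computation touches only small sub-blocks of A, B, and C, so the
   operands stay resident in cache while they are reused. */
void matmul_blocked(const double *a, const double *b, double *c, size_t n)
{
    for (size_t ii = 0; ii < n; ii += TILE)
        for (size_t pp = 0; pp < n; pp += TILE)
            for (size_t qq = 0; qq < n; qq += TILE) {
                size_t imax = ii + TILE < n ? ii + TILE : n;
                size_t pmax = pp + TILE < n ? pp + TILE : n;
                size_t qmax = qq + TILE < n ? qq + TILE : n;
                for (size_t i = ii; i < imax; ++i)
                    for (size_t p = pp; p < pmax; ++p) {
                        double aip = a[i * n + p];
                        for (size_t q = qq; q < qmax; ++q)
                            c[i * n + q] += aip * b[p * n + q];
                    }
            }
}
```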

Together, the SIMD and cache optimizations synergize: the cache optimization helps keep the SIMD registers saturated with data. I'm hoping we can get to 20X on a single thread if we can figure both out.

The nice thing is that we're still using plain old PHP arrays (hash maps) under the hood, so the data can be passed to and from PHP-land without any overhead!
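To illustrate what that means in practice: an extension function can read values straight out of the HashTable that backs a PHP array using the standard Zend macros, so nothing needs to be copied or converted wholesale at the boundary. A rough sketch, where tensor_sum_sketch is a hypothetical name and not part of the actual extension:

```c
#include "php.h"

/* Hypothetical extension function that sums a PHP array of numbers.
   It iterates the zend HashTable backing the array in place; no
   intermediate buffer is allocated at the PHP/C boundary. */
PHP_FUNCTION(tensor_sum_sketch)
{
    zval *arr, *val;
    double sum = 0.0;

    ZEND_PARSE_PARAMETERS_START(1, 1)
        Z_PARAM_ARRAY(arr)
    ZEND_PARSE_PARAMETERS_END();

    ZEND_HASH_FOREACH_VAL(Z_ARRVAL_P(arr), val) {
        sum += zval_get_double(val);  /* unpack zval -> double */
    } ZEND_HASH_FOREACH_END();

    RETURN_DOUBLE(sum);
}
```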

There also seems to be activity on the Zephir repo toward making it PHP 8 compatible:

https://github.com/phalcon/zephir

@andrewdalpino (Member)

The entire Rubix suite is now PHP 8.0 compatible.
