
Added matrix functions, numeral tensors, better returns & removed unnecessary POC code #8

Merged
merged 3 commits into master on May 30, 2018

Conversation

anshuman23
Owner

@josevalim, I have added a lot of functionality in this PR. Here is a high-level overview of the new functionality, followed by code examples at the end:

  • Matrix functions: Since Elixir doesn't have something like NumPy in Python, we needed some matrix capabilities. I was actually writing most of this myself, but then saw some examples in the Blackberry Erlang OTP which I adapted into the Tensorflex code base after some changes.
  • Tensor capabilities: Numeral tensors are now fully supported. This makes use of the matrix functions above to achieve what is required.
  • Improved return statements: Wherever a function returns successfully, it now returns a tuple with an :ok atom; whenever the error is caused by bad input arguments, it now returns enif_make_badarg.
  • Removed code from POC days: A lot of unused code that I had written pre-GSoC results, as part of the POC, was still lying around. I have removed it now.

CODE EXAMPLES

  • Matrix functions:
    • Matrices are created using create_matrix, which takes the number of rows, the number of columns, and list(s) of matrix data as inputs
    • matrix_pos helps get the value stored in the matrix at a particular row and column
    • size_of_matrix returns the size of the matrix as a tuple {number of rows, number of columns}
    • matrix_to_lists returns the data of the matrix as a list of lists
iex(1)> m = Tensorflex.create_matrix(2,3,[[2.2,1.3,44.5],[5.5,6.1,3.333]])
#Reference<0.1012898165.3475636225.187946>

iex(2)> Tensorflex.matrix_pos(m,2,1)
5.5

iex(3)> Tensorflex.size_of_matrix m
{2, 3}

iex(4)> Tensorflex.matrix_to_lists m
[[2.2, 1.3, 44.5], [5.5, 6.1, 3.333]]
  • Numeral Tensors:
    • float_tensor handles numeral tensors. It has two variants: one that takes just 1 argument and another that takes 2 arguments
    • The 1-argument variant simply makes a tensor out of a single number
    • The 2-argument variant is actually more important and is used for multidimensional tensors
    • Here, the first argument holds the values and the second the dimensions of the tensor. Both of these are matrices
iex(1)> dims = Tensorflex.create_matrix(1,3,[[1,1,3]])
#Reference<0.3771206257.3662544900.104749>

iex(2)> vals = Tensorflex.create_matrix(1,3,[[245,202,9]])
#Reference<0.3771206257.3662544900.104769>

iex(3)> Tensorflex.float_tensor 123.12
{:ok, #Reference<0.3771206257.3662544897.110716>}

iex(4)> {:ok, ftensor} = Tensorflex.float_tensor(vals,dims)
{:ok, #Reference<0.3771206257.3662544897.111510>}

iex(5)> Tensorflex.tensor_datatype ftensor
{:ok, :tf_float}
  • Better return statements:
    • Added the :ok atom to every successful return in a function, including in all previously written functions
    • For incorrect input arguments, switched to returning enif_make_badarg
iex(1)> Tensorflex.string_tensor 123
** (ArgumentError) argument error
    (tensorflex) Tensorflex.string_tensor(123)

iex(2)> Tensorflex.string_tensor "123"
{:ok, #Reference<0.3771206257.3662544897.113000>}

@josevalim
Contributor

Hi @anshuman23!

Let's ping @versilov since he has recently worked on matrex (see the discussion).

If @versilov is already working on a matrex library based on NIFs, maybe we can unify the efforts?

@versilov how would the two NIF libraries interoperate? Is matrex keeping the matrix as a binary? If so, could the NIF side of tensorflex simply rely on this binary format?

It seems we are all working together on improving Elixir for data science, so we should probably be in touch. :)

@versilov
Contributor

Hello @josevalim, @anshuman23!

Yes, matrex definitely uses the same binary format. The difference is in size: Matrex uses 32-bit values (ints and floats), while in tensorflex I see 64-bit values. (BTW, will unsigned, used for the dimensions of the matrix, be the same size on all platforms? Maybe it would be better to be more specific there with uint64_t?)

I think 32-bit values will be sufficient for most tasks. Anyway, I am considering adding support for different value types and different shapes to Matrex.

@anshuman23, please give matrex a try in tensorflex and feel free to send feature requests to me — I'll try to implement them ASAP, because that's what matrex needs now to polish the API.

PS: I'm not familiar with TF, but is it OK to feed an array of doubles into this function with the TF_FLOAT flag?

TF_NewTensor(TF_FLOAT, dims, ndims, mx1.p->data, (size_alloc) * sizeof(double), tensor_deallocator, 0);

@anshuman23
Owner Author

Hi @josevalim and @versilov!

Sorry for the delayed reply, I did not have internet connectivity as I was on vacation and have just returned.

Yes, matrex definitely uses the same binary format. The difference is in size: Matrex uses 32-bit values (ints and floats), while in tensorflex I see 64-bit values. (BTW, will unsigned, used for the dimensions of the matrix, be the same size on all platforms? Maybe it would be better to be more specific there with uint64_t?)

I completely agree; I will try to switch to uint64_t in the next PR.

@anshuman23, please give matrex a try in tensorflex and feel free to send feature requests to me — I'll try to implement them ASAP, because that's what matrex needs now to polish the API.

I will play around with Matrex and try to integrate it into Tensorflex first thing. :)

PS: I'm not familiar with TF, but is it OK to feed an array of doubles into this function with the TF_FLOAT flag?

I did not understand your question. Right now mx1.p->data is actually an array of doubles itself. This is the internal TensorFlow function as seen in their C API code:

TF_CAPI_EXPORT extern TF_Tensor* TF_NewTensor(
    TF_DataType, const int64_t* dims, int num_dims, void* data, size_t len,
    void (*deallocator)(void* data, size_t len, void* arg),
    void* deallocator_arg);

@versilov
Contributor

versilov commented May 29, 2018

@anshuman23, hi!

Right now mx1.p->data is actually an array of doubles itself.

Yes, and you give TF_FLOAT as the first argument to TF_NewTensor. Maybe you should use TF_DOUBLE there? Or switch to an array of floats :)

@anshuman23
Owner Author

Yes that will be done!

@anshuman23 anshuman23 merged commit 782a3d4 into master May 30, 2018