
Added matrix functions, numeral tensors, better returns & removed unnecessary POC code #8

Merged
merged 3 commits into master on May 30, 2018

Conversation

anshuman23
Owner

@josevalim, I have added a lot of functionality in this PR. Here is a high-level overview of the new functionality, followed by code examples at the end:

  • Matrix functions: Since Elixir doesn't have something like NumPy in Python, we needed some matrix capabilities. I was actually writing most of this myself, but then saw some examples in the Blackberry Erlang OTP which I adapted into the Tensorflex code base after some changes.
  • Tensor capabilities: Numeral tensors are now fully supported. This makes use of the matrix functions above to achieve what is required.
  • Improved return statements: Wherever a function returns successfully, it now returns a tuple with an :ok atom; whenever the error is caused by bad input arguments, it now returns enif_make_badarg.
  • Removed code from POC days: A lot of unused code that I had written pre-GSoC results, as part of the POC, was still lying around. I have removed it now.

CODE EXAMPLES

  • Matrix functions:
    • Matrices are created using create_matrix, which takes the number of rows, the number of columns, and list(s) of matrix data as inputs
    • matrix_pos helps get the value stored in the matrix at a particular row and column
    • size_of_matrix returns the size of the matrix as a tuple {number of rows, number of columns}
    • matrix_to_lists returns the data of the matrix as a list of lists
iex(1)> m = Tensorflex.create_matrix(2,3,[[2.2,1.3,44.5],[5.5,6.1,3.333]])
#Reference<0.1012898165.3475636225.187946>

iex(2)> Tensorflex.matrix_pos(m,2,1)
5.5

iex(3)> Tensorflex.size_of_matrix m
{2, 3}

iex(4)> Tensorflex.matrix_to_lists m
[[2.2, 1.3, 44.5], [5.5, 6.1, 3.333]]
  • Numeral Tensors:
    • float_tensor handles numeral tensors. It has two variants: one that takes just 1 argument and another that takes 2 arguments
    • The 1-argument variant simply makes a tensor out of a single number
    • The 2-argument variant is actually more important and is used for multidimensional tensors
    • Here, the first argument holds the values and the second the dimensions of the tensor. Both of these are matrices
iex(1)> dims = Tensorflex.create_matrix(1,3,[[1,1,3]])
#Reference<0.3771206257.3662544900.104749>

iex(2)> vals = Tensorflex.create_matrix(1,3,[[245,202,9]])
#Reference<0.3771206257.3662544900.104769>

iex(3)> Tensorflex.float_tensor 123.12
{:ok, #Reference<0.3771206257.3662544897.110716>}

iex(4)> {:ok, ftensor} = Tensorflex.float_tensor(vals,dims)
{:ok, #Reference<0.3771206257.3662544897.111510>}

iex(5)> Tensorflex.tensor_datatype ftensor
{:ok, :tf_float}
  • Better return statements:
    • Added the :ok atom to every successful return in a function, including in all previously written functions
    • For incorrect input arguments, switched to returning enif_make_badarg
iex(1)> Tensorflex.string_tensor 123
** (ArgumentError) argument error
    (tensorflex) Tensorflex.string_tensor(123)

iex(2)> Tensorflex.string_tensor "123"
{:ok, #Reference<0.3771206257.3662544897.113000>}

@josevalim
Contributor

Hi @anshuman23!

Let's ping @versilov since he has recently worked on matrex (see the discussion).

If @versilov is already working on a matrex library based on NIFs, maybe we can unify the efforts?

@versilov how would the two NIF libraries interoperate? Is matrex keeping the matrix as a binary? If so, could the NIF side of tensorflex simply rely on this binary format?

It seems we are all working together on improving Elixir for data science, so we should probably be in touch. :)

@versilov
Contributor

Hello @josevalim, @anshuman23!

Yes, matrex definitely uses the same binary format. The difference is in size: Matrex uses 32-bit values (ints and floats), while in tensorflex I see 64-bit values. (BTW, will unsigned, used for the dimensions of the matrix, be the same size on all platforms? Maybe it would be better to be more specific there with uint64_t?)

I think 32-bit values will be sufficient for most tasks. Anyway, I am considering adding support for different value types and different shapes to Matrex.

@anshuman23, please give matrex a try in tensorflex and feel free to send feature requests to me — I'll try to implement them ASAP, because that's what matrex needs now to polish the API.

PS: I'm not familiar with TF, but is it OK to feed an array of doubles into this function with the TF_FLOAT flag?

TF_NewTensor(TF_FLOAT, dims, ndims, mx1.p->data, (size_alloc) * sizeof(double), tensor_deallocator, 0);

@anshuman23
Owner Author

Hi @josevalim and @versilov!

Sorry for the delayed reply, I did not have internet connectivity as I was on vacation and have just returned.

Yes, matrex definitely uses the same binary format. The difference is in size: Matrex uses 32-bit values (ints and floats), while in tensorflex I see 64-bit values. (BTW, will unsigned, used for the dimensions of the matrix, be the same size on all platforms? Maybe it would be better to be more specific there with uint64_t?)

I completely agree; I will try to switch to uint64_t in the next PR.

@anshuman23, please give matrex a try in tensorflex and feel free to send feature requests to me — I'll try to implement them ASAP, because that's what matrex needs now to polish the API.

I will play around with Matrex and try to integrate it into Tensorflex first thing. :)

PS: I'm not familiar with TF, but is it OK to feed an array of doubles into this function with the TF_FLOAT flag?

I did not understand your question. Right now mx1.p->data is actually an array of doubles itself. This is the internal TensorFlow function as seen in their C API code:

TF_CAPI_EXPORT extern TF_Tensor* TF_NewTensor(
    TF_DataType, const int64_t* dims, int num_dims, void* data, size_t len,
    void (*deallocator)(void* data, size_t len, void* arg),
    void* deallocator_arg);

@versilov
Contributor

versilov commented May 29, 2018

@anshuman23, hi!

Right now mx1.p->data is actually an array of doubles itself.

Yes, and you give TF_FLOAT as the first argument to TF_NewTensor. Maybe you should use TF_DOUBLE there? Or switch to an array of floats :)

@anshuman23
Owner Author

Yes that will be done!

@anshuman23 anshuman23 merged commit 782a3d4 into master May 30, 2018