Grow From Value to Array of Value #9
Replies: 3 comments 5 replies
-
Hi, I don't currently know how to solve the fork tracking. I'm planning to read a book on parallel computing with CUDA (in June) to get more solid knowledge about that. I think that creating tensors might solve this problem. Originally I wanted to build tensors on top of the Value class, because optimizing it for GPU usage was not in my plans. If we want to use the GPU, we should probably restructure everything (or maybe not). I think a tensor with ones in all its dimensions is practically a scalar, but it is still an N-dimensional tensor and should be treated as one. As for the dimensionality of a scalar, I think it is 0, so that is basically the difference between a scalar and a tensor with ones in all its dimensions. I'm learning about ML, CUDA, etc. right now, and I think building projects is a good way to do it. I started this project purely for self-educational purposes, so I didn't expect that someone would help me improve it. So I want to thank you for your interest and help; if you want any functionality (apart from running Transformers or RetNets), you can tell me and we can make a plan to achieve it :)
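To make the scalar-vs-tensor distinction above concrete (using NumPy only as an illustration, since the project's own Tensor class doesn't exist yet): a tensor of shape (1, 1, 1) still has rank 3, while a true scalar has rank 0, even though both hold exactly one number.

```python
import numpy as np

# A tensor with ones in all its dimensions: still rank 3 (N-dimensional).
t = np.ones((1, 1, 1))
print(t.ndim)   # 3
print(t.shape)  # (1, 1, 1)

# A true scalar: rank 0, empty shape.
s = np.array(5.0)
print(s.ndim)   # 0
print(s.shape)  # ()

# Both contain one element, but the indexing rules differ:
print(t[0, 0, 0])  # needs three indices
print(s.item())    # needs no indices at all
```

So "scalar" and "shape of all ones" are different types even though they carry the same amount of data, which matters for how broadcasting and indexing rules are defined.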
-
I will need help with some backpropagation. For example, I don't know offhand how to compute the backward pass for softmax or log-softmax. Additionally, I need assistance with some "formal" concepts. As a scientist, I assume you try to mathematically prove your work. As an algorithm designer, I know that none of these computations are exactly realizable on a finite machine, but we can do our best to approximate the real results as closely as possible.

As mentioned, I'm really happy to have found your code. It contains everything needed to begin constructing a C# autograd system that is GPU-compliant. For many years, I've tried to read Python code written by researchers, but they are not developers at all; it's nearly as hard as reverse engineering from assembly language.

Using C# with a DLL wrapper to call C++ is something I disagree with. If we need C++, we should use C++/CLI, not an intermediate assembly. That is the current strategy. I don't care about making it as "sexy" as possible, encouraging data scientists to work with this system by making it Python-like. However, I'm sure that a data scientist with some knowledge of C# can easily do what they do in Python without overcomplicating things.

The current lack of development skills in ML is directly responsible for monstrous memory overhead, latency, and other issues. And... guys... Out of Memory (OOM)? Seriously? Neither TensorFlow nor PyTorch can unload a memory block and do swapping like we've been doing since the 80s for RAM? Are these guys serious? To me, it's just a disaster!
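Since the softmax/log-softmax backward pass was mentioned as a sticking point, here is a minimal sketch of the math (in NumPy for brevity; function names like `softmax_backward` are my own, not from the project). For y = softmax(x), the Jacobian is J[i, j] = y[i] * (δ(i, j) - y[j]), and the vector-Jacobian product collapses to dx = y * (g - dot(g, y)); for log-softmax it is dx = g - softmax(x) * sum(g).

```python
import numpy as np

def softmax(x):
    # Subtract the max for numerical stability.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def softmax_backward(y, g):
    # y = softmax(x) saved from the forward pass, g = upstream gradient.
    # Jacobian J[i, j] = y[i] * (delta(i, j) - y[j]); the vector-Jacobian
    # product simplifies to a cheap elementwise expression:
    return y * (g - np.dot(g, y))

def log_softmax_backward(x, g):
    # For L[i] = x[i] - logsumexp(x): dL[i]/dx[j] = delta(i, j) - y[j],
    # so the backward pass is:
    return g - softmax(x) * g.sum()

# Sanity check against a central finite-difference approximation:
x = np.array([0.3, -1.2, 2.0])
g = np.array([1.0, 0.5, -0.7])
eps = 1e-6
num = np.array([
    (np.dot(g, softmax(x + eps * np.eye(3)[j])) -
     np.dot(g, softmax(x - eps * np.eye(3)[j]))) / (2 * eps)
    for j in range(3)
])
print(np.allclose(softmax_backward(softmax(x), g), num, atol=1e-5))  # True
```

The finite-difference check is the standard way to validate any hand-written backward pass, and it is worth building into the project's test suite for every new op.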
-
@Leonardo16AM It's a PoC of a very basic tensor with a shape, using the GPU to execute addition, subtraction, multiplication, and division. I'm not sure, but I think we can expect to have most of a model "kernelized", depending on our architecture.
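To sketch what such a PoC looks like conceptually (a CPU stand-in in Python, purely illustrative; the class name `FlatTensor` is hypothetical, not the project's actual code): a shaped tensor over one flat buffer, where each elementwise op is what a GPU kernel would run with one thread per index.

```python
class FlatTensor:
    """Minimal shaped tensor over flat storage, mimicking how a GPU
    kernel sees data: one flat buffer, conceptually one thread per
    element. Illustrative sketch only, not the project's Tensor class."""

    def __init__(self, shape, data):
        self.shape = tuple(shape)
        self.data = list(data)

    def _elementwise(self, other, op):
        # No broadcasting in this sketch: shapes must match exactly.
        assert self.shape == other.shape
        # On a GPU, each index i is handled by one thread in parallel;
        # here we simply loop sequentially.
        return FlatTensor(self.shape,
                          [op(a, b) for a, b in zip(self.data, other.data)])

    def __add__(self, o): return self._elementwise(o, lambda a, b: a + b)
    def __sub__(self, o): return self._elementwise(o, lambda a, b: a - b)
    def __mul__(self, o): return self._elementwise(o, lambda a, b: a * b)
    def __truediv__(self, o): return self._elementwise(o, lambda a, b: a / b)

a = FlatTensor((2, 2), [1.0, 2.0, 3.0, 4.0])
b = FlatTensor((2, 2), [4.0, 3.0, 2.0, 1.0])
print((a + b).data)  # [5.0, 5.0, 5.0, 5.0]
print((a * b).data)  # [4.0, 6.0, 6.0, 4.0]
```

The point of the flat layout is that all four ops reduce to the same index-parallel loop, which is exactly the shape of work that maps cleanly onto a GPU kernel.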
-
I've been trying to figure out the architectural requirements for weeks, and I haven't found anything like what I hoped for. For example, I think exposing an array-like object built on the Value type is too hard for me. In the end, many MB of data should share the same computation to gain efficiency through parallelism. But since Value is so "mutable", I haven't found a way to track forks. I know that NVIDIA and others need to track this in silicon (warp divergence), but I haven't yet found how they solved this computational issue. Currently I can't split a SIMD data block into individual Value objects to get an array.
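On how GPUs "track forks": the usual answer is predication (masking) rather than per-element control flow. Every lane executes both branches over the whole block, and a per-lane mask selects which result each element keeps. A NumPy sketch of the idea (illustrative only, not the project's code):

```python
import numpy as np

# SIMD/warp-style branching without per-element control flow:
# compute both branches for ALL lanes, then select per lane with a mask.
x = np.array([-2.0, 3.0, -1.0, 4.0])

mask = x > 0                  # the per-lane predicate (the "warp fork")
branch_true = x * 10.0        # every lane computes the 'then' branch
branch_false = -x             # every lane computes the 'else' branch
result = np.where(mask, branch_true, branch_false)
print(result)  # [ 2. 30.  1. 40.]
```

This is why a Value-per-element design fights the hardware: the GPU wants one operation applied to the whole block with masks, not independently forking scalar graphs.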
You spoke about a Tensor, and yes, I think it would be a very good thing. As you can see, if you have some code, I can probably write the necessary class structure. But like a GPT... I use context to do this kind of thing ^^'.
And, is it just me, or is a tensor whose shape has all dimensions equal to one a scalar? And does a scalar virtually have an infinite number of dimensions?
FYI, I have no formal education. But I'm one of the 80s geek kids, one of those who lost something in life but nothing in himself :D (except my mind...)