[stdlib] Add `Array` type #2805
base: nightly
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
@martinvuyk If possible, I would recommend not starting work on a new function/struct that's not present in Python and that has not yet been approved by the Mojo maintainers in the issues. While I'm sure that the new structs/functions contributors are writing have value, there is information internal to Modular that we don't have; it's possible that the Modular staff has already thought about some alternative, or has a plan for another API. Since we're in the dark here, let's proceed with caution. I know it's annoying, as it can delay your work, but it may also avoid some wasted work on PRs (well, hopefully not really wasted, since I'm sure you learned a lot while writing those PRs!). I'm currently waiting for the go-ahead from the maintainers to implement small buffer optimization in |
Thank you for the comment. I'm waiting for confirmation, but it's really fun to tinker in this language. The main difference between this type and SBO is that this is 100% on the stack. It tries to follow Python's Array as well, but only using SIMD. My hope is that this becomes the type that is used for high-performance IO, or the user-friendly interface to SIMD, since List lives on the heap and SBO will only get you so far when you want Arrays of different widths and want to do vectorized ops on them. If, for example, someone wants to do high-performance operations on strings that can be parallelized, like uppercasing, SBO will give you faster access but not really faster ops, whereas SIMD does. I've seen your PR and do think it is necessary and will have a big impact on List performance, but I don't think List should be the end-all be-all. The main reason people will come to Mojo is out-of-the-box support for the most innovative CPU and GPU ops; to get that, people will expect an Array to be there and to be performant. If we stay only with SBO and iterating over arrays in the classic way, every high-performance C++ lib will be faster than the Mojo stdlib. Mojo should be the place for easy-to-use SIMD ops, IMO. There are some things that I'm not even sure are possible, since I don't know the memory layout of SIMD and of DTypePointer's pointee, nor many other Mojo internals (I don't understand MLIR), so this PR is really just an experiment. I have no problem killing it if the Mojo team tells me to. Code is just code.
Closes #2804
Since the proposal to rename `InlineList` to `Array` fell through (#2773), I thought of adding a dynamic Array that follows most of Python's behavior but lets the developer decide whether, and how much, small buffer optimization to use for DType subtypes. The current implementation has basically become a user-friendly wrapper around SIMD.
The programmer decides how much "capacity" to allocate for the Array; under the hood this is rounded up to the next power of two to get the width of the underlying SIMD. Every method then has to take that difference into account (everything is parametrized).
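The rounding described above can be sketched in Python. This only illustrates the arithmetic and is not the PR's Mojo code; `next_power_of_two` and `padding` are hypothetical helper names introduced for the example.

```python
def next_power_of_two(capacity: int) -> int:
    """Return the smallest power of two that is >= capacity."""
    if capacity < 1:
        raise ValueError("capacity must be at least 1")
    # (capacity - 1).bit_length() is the exponent of the next power of two.
    return 1 << (capacity - 1).bit_length()


def padding(capacity: int) -> int:
    """Unused lanes between the requested capacity and the rounded width."""
    return next_power_of_two(capacity) - capacity


if __name__ == "__main__":
    for requested in (1, 3, 4, 5, 10):
        print(requested, next_power_of_two(requested), padding(requested))
```

A requested capacity of 5 would therefore sit in a width of 8, leaving 3 lanes that every method has to ignore.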
Every operation on the Array is done in a vectorized manner, except extending, appending, concatenating, etc. So using `array.__contains__` can be a lot faster, depending on the hardware it's running on. You then have calculations like the dot product, the cosine between arrays, applying a function to the array, and many other ease-of-use features that can be added in the future and that vectorize the ops wherever possible.
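As a rough model of what "vectorized" means for these operations, the Python sketch below treats a list as the SIMD lanes: compare or multiply every lane at once, then reduce. It is an illustration under assumptions, not the PR's implementation: the function names are hypothetical, padding lanes are assumed to hold zeros, and plain Python gains no speed from this shape.

```python
import math


def contains(lanes: list, length: int, value) -> bool:
    """Lane-wise compare, mask off padding lanes, then reduce with 'or'."""
    matches = [lane == value for lane in lanes]      # compare all lanes
    valid = [i < length for i in range(len(lanes))]  # True only for real items
    return any(m and v for m, v in zip(matches, valid))


def dot(a: list, b: list) -> float:
    """Lane-wise multiply then horizontal add; zero padding adds nothing."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a: list, b: list) -> float:
    """Cosine of the angle between two arrays of equal width."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


if __name__ == "__main__":
    lanes = [3, 1, 4, 0]  # three real items, one assumed zero padding lane
    print(contains(lanes, 3, 4))
    print(contains(lanes, 3, 0))
    print(dot([1, 2, 3, 0], [4, 5, 6, 0]))
```

The mask in `contains` is the reason the padding matters: without it, searching for 0 would wrongly match the unused lane.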
Arrays of different lengths can interact with each other in many methods where it makes sense (concatenation, appending values from other to self, etc.).
Another important aspect for the future is ease of conversion `List[T] <-> Array[T]`, taking SBO into account. It would also be awesome to add some benchmarks.
Examples: