Pretty Print Tensors #257
Conversation
burn-tensor/src/tensor/api/base.rs
Outdated
fn to_nested_vec(&self) -> Vec<Vec<B::IntElem>> {
    let data = self.to_data();
    let mut result = vec![vec![B::IntElem::default(); self.dims()[1]]; self.dims()[0]];
    for (i, val) in data.value.iter().enumerate() {
        let row = i / self.dims()[1];
        let col = i % self.dims()[1];
        result[row][col] = *val;
    }
    result
}
This could be made a separate standalone function that takes the multi-dimensional tensor and returns a formatted string. This way you don't need to worry about nested vectors.
The challenging part is converting the flat data into a nested array, but I am sure it's doable. Here is an example of how NumPy formats its output:
>>> np.ones([15, 15,15, 15])
array([[[[1., 1., 1., ..., 1., 1., 1.],
[1., 1., 1., ..., 1., 1., 1.],
[1., 1., 1., ..., 1., 1., 1.],
...,
[1., 1., 1., ..., 1., 1., 1.],
[1., 1., 1., ..., 1., 1., 1.],
[1., 1., 1., ..., 1., 1., 1.]],
[[1., 1., 1., ..., 1., 1., 1.],
[1., 1., 1., ..., 1., 1., 1.],
[1., 1., 1., ..., 1., 1., 1.],
...,
[1., 1., 1., ..., 1., 1., 1.],
[1., 1., 1., ..., 1., 1., 1.],
[1., 1., 1., ..., 1., 1., 1.]],
[[1., 1., 1., ..., 1., 1., 1.],
[1., 1., 1., ..., 1., 1., 1.],
[1., 1., 1., ..., 1., 1., 1.],
...,
[1., 1., 1., ..., 1., 1., 1.],
[1., 1., 1., ..., 1., 1., 1.],
[1., 1., 1., ..., 1., 1., 1.]],
...,
[[1., 1., 1., ..., 1., 1., 1.],
[1., 1., 1., ..., 1., 1., 1.],
My current strategy is to try and recursively format the tensor as a 2D vector for each dimension, indenting based on the dimensionality level. It's still a work in progress because I need to figure out how to properly slice up tensors.
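That recursive strategy can be sketched as a standalone function over the flat data buffer and the shape, along the lines suggested above. This is a minimal sketch, not the PR's actual implementation: it assumes row-major layout and skips NumPy-style truncation and indentation.

```rust
// Hypothetical standalone formatter: takes the flat data and the shape,
// and builds a nested, NumPy-like string recursively.
fn format_tensor(data: &[f32], shape: &[usize]) -> String {
    if shape.is_empty() {
        // 0-dimensional tensor: a single scalar.
        return data[0].to_string();
    }
    if shape.len() == 1 {
        // Innermost dimension: join the elements directly.
        let elems: Vec<String> = data.iter().map(|v| v.to_string()).collect();
        return format!("[{}]", elems.join(", "));
    }
    // Split the flat buffer into `shape[0]` equal chunks and recurse on
    // each chunk with the remaining dimensions.
    let chunk = data.len() / shape[0];
    let parts: Vec<String> = data
        .chunks(chunk)
        .map(|c| format_tensor(c, &shape[1..]))
        .collect();
    format!("[{}]", parts.join(", "))
}

fn main() {
    // A 2x3 tensor stored as a flat row-major buffer.
    let data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0];
    println!("{}", format_tensor(&data, &[2, 3]));
    // -> [[1, 2, 3], [4, 5, 6]]
}
```

Because each recursion level only slices the flat buffer, no intermediate nested `Vec`s are ever built.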
I think a recursive strategy is the only way to generalize it to any dimension, or at least the only strategy that I can think of 😅.
I came across this project serde-ndim that looks interesting. Maybe there is something you can learn from it: https://github.com/RReverser/serde-ndim/blob/main/src/ser.rs . What this project accomplishes is basically converting a one-dimensional data array into a multi-dimensional array. Pretty much what you're trying to accomplish.
@antimora Looks interesting, I'll give it a look. But your description seems almost the opposite of what this PR is for, i.e. we're going from a multi-dimensional array -> "1-dimensional" string representation and not the other way around.
> @antimora Looks interesting, I'll give it a look. But your description seems almost the opposite of what this PR is for, i.e. we're going from a multi-dimensional array -> "1-dimensional" string representation and not the other way around.

The tensor data is stored as a one-dimensional vector, and the shape contains the dimensions, so you know how to decompose it into rows and columns. The link I shared basically takes the serialized data (1D) and deserializes it into an n-dimensional array by recursion. By coincidence I came across this code and thought there might be something to learn. I haven't dived deep enough to see if it directly helps you, but in principle, by its description, it should. I hope it does not confuse you =)
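The index arithmetic behind that decomposition can be sketched as follows (illustrative only, not code from the PR or from serde-ndim): a row-major flat index is peeled apart one dimension at a time, from the innermost axis outward.

```rust
// Recover n-dimensional coordinates from a flat row-major index and a shape.
fn unravel(mut flat: usize, shape: &[usize]) -> Vec<usize> {
    let mut coords = vec![0; shape.len()];
    // Walk dimensions from last to first, peeling off one axis at a time.
    for (i, &dim) in shape.iter().enumerate().rev() {
        coords[i] = flat % dim;
        flat /= dim;
    }
    coords
}

fn main() {
    // Element 7 of a flat buffer with shape [2, 3, 2] lives at [1, 0, 1].
    assert_eq!(unravel(7, &[2, 3, 2]), vec![1, 0, 1]);
    println!("{:?}", unravel(7, &[2, 3, 2]));
    // -> [1, 0, 1]
}
```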
This is starting to look good! I think we should make the implementation generic over the kind, and with the other minor requested changes we could merge the PR pretty soon.
burn-tensor/src/tensor/api/base.rs
Outdated
pub fn size(&self) -> usize {
    self.dims().iter().fold(1, |acc, &dim| acc * dim)
}
There is a similar method num_elements in the Shape struct; I would use it instead here. But I like the fold implementation, so we may update the num_elements function to use fold instead of loops. size is a bit unclear as a function name, since size often returns the shape in PyTorch, so we may update the name to num_elements.
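For reference, the fold-based product being discussed is just the following (a free-function sketch; the real num_elements lives on burn's Shape struct):

```rust
// Number of elements in a tensor is the product of its dimensions.
// Folding from 1 also gives the right answer for a scalar (empty) shape.
fn num_elements(dims: &[usize]) -> usize {
    dims.iter().fold(1, |acc, &d| acc * d)
}

fn main() {
    assert_eq!(num_elements(&[2, 3, 4]), 24);
    assert_eq!(num_elements(&[]), 1); // scalar shape
}
```

Note that `dims.iter().product()` is the idiomatic shorthand for exactly this fold.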
Actually, I didn't mean to leave this in; I needed something like this for an earlier implementation I was trying out. We can remove it, rename it, or do whatever honestly 🤣
burn-tensor/src/tensor/api/base.rs
Outdated
@@ -212,6 +307,19 @@ pub trait BasicOps<B: Backend>: TensorKind<B> {
    rhs: Self::Primitive<D>,
) -> Tensor<B, D, Bool>;
fn equal_elem<const D: usize>(lhs: Self::Primitive<D>, rhs: Self::Elem) -> Tensor<B, D, Bool>;
fn elem_type() -> &'static str {
    if TypeId::of::<Self::Elem>() == TypeId::of::<f32>() {
@nathanielsimard Also here I assume I'm missing a bunch of other possibilities like f64 for float, or i8, i16, i128, etc. for ints. What else should I add?
You can directly use core::any::type_name::<Self::Elem>() and all types will be supported :). I would also rename the function to elem_type_name to be clearer.
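A minimal sketch of that suggestion (the function name elem_type_name follows the rename proposed above; the generic parameter stands in for Self::Elem):

```rust
use core::any::type_name;

// `type_name` yields the concrete type's name for any type,
// with no manual TypeId matching needed.
fn elem_type_name<E>() -> &'static str {
    type_name::<E>()
}

fn main() {
    println!("{}", elem_type_name::<f32>()); // -> f32
    println!("{}", elem_type_name::<i64>()); // -> i64
}
```

One caveat worth knowing: the exact string returned by `type_name` is not guaranteed to be stable across compiler versions, though for primitives like `f32` it is simply the type's name.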
Do you want the more specific type that comes from core::any::type_name::<Self::Elem>(), or should I group the possible returns and return int/float?
The more specific type is needed, but we could also add the tensor kind:
data: [...],
kind: Float, // We could add a function `name` in the trait TensorKind.
elem: f32
Sounds good, just threw kind in as well.
@nathanielsimard I think this is pretty much done unless you can think of anything else. I just merged main so the branch is all up to date as well.
@agelas I think the only missing part is fixing the CI (mostly no_std support by importing stuff from alloc) and removing the
@nathanielsimard Ok, should be good now, although locally I can't seem to build because the compiler is complaining about the
@agelas I'm curious, what is your local Rust version? Maybe I used a newly stabilized function, and I should update the minimum required Rust version in the readme. The CI is always using the latest stable Rust version.
@nathanielsimard Oh, I'm on 1.65.0, that might explain it.
@agelas You can run clippy and cargo fmt with the latest Rust to fix most problems: cargo clippy --fix
cargo fmt --all
@nathanielsimard clippy and fmt didn't seem to change anything. I added a few more imports from
@nathanielsimard Ok, we're getting a little bit closer 😆. The test-burn-ndarray one is strange. I've noticed that sometimes the backend comes back as
I think the tests in ndarray will have ndarray as the backend and the ones in tch will have tch. But backends can be used as dev dependencies just to run the tests, so it might change. I would suggest updating the test to use the
Sorry, not quite sure what you mean by using the
No, I mean in the test, instead of hardcoding the name of the backend in the expected string, maybe use a
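The idea above can be sketched like this, assuming the expected string embeds the backend's type name; the helper name and the stand-in backend type are illustrative, not burn's actual API:

```rust
use core::any::type_name;

// Hypothetical helper: build the expected display header from the backend
// type itself, so the test no longer hardcodes "NdArray" or "Tch".
fn expected_header<B>() -> String {
    format!("backend: {}", type_name::<B>())
}

// Stand-in type for a real backend, just for the example.
struct NdArray;

fn main() {
    // `type_name` may include the module path, so match on prefix/suffix
    // rather than the full string.
    let header = expected_header::<NdArray>();
    assert!(header.starts_with("backend: "));
    assert!(header.ends_with("NdArray"));
}
```

This way the same doctest or unit test passes regardless of which backend the test harness compiles against.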
@nathanielsimard I'm just gonna remove the doctest lol.
@nathanielsimard Alright, this time I'm 99% sure it should pass.
@nathanielsimard I'm new to Rust so I figured I'd give #138 a shot. Let me know if this is somewhere in the ballpark of what you were thinking. I only implemented a pretty print for 2D Tensor structs composed of Ints (or at least that's what I think I did 🙃), and I can generalize to other types later.