
Can this idea be used for 3D voxel convolutional NN? #9

Closed
xuzhang5788 opened this issue Oct 14, 2020 · 9 comments

@xuzhang5788

xuzhang5788 commented Oct 14, 2020

It would be great if this could be applied to a 3D voxel convolutional NN. Any concerns? Thanks

@lucidrains
Owner

Sure! It's the same concept, just with an added dimension.

@xuzhang5788
Author

@lucidrains
Thank you for your reply. Do you have any plans to implement it in your library?
By the way, I don't really understand how to deal with channels; 2D images have 3 RGB channels. Thanks a lot

@lucidrains
Owner

Not really for this library, as I'd like to keep it image-specific. But I'd be happy to share the code snippet you need if you show me the shape of your input tensor. It won't amount to more than 10 lines before it goes into a standard transformer.

@xuzhang5788
Author

Thank you so much. My data shape is (31, 20, 20, 20). Voxel size is (20, 20, 20) with 31 channels.

@lucidrains
Owner
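(The original snippet in this comment was not preserved. The following is a minimal sketch of the patch-embedding step for a `(31, 20, 20, 20)` input, written in plain numpy so it is self-contained; the patch size of 2, the embedding dimension of 128, and the random projection weights are all illustrative assumptions, not the author's code.)

```python
import numpy as np

# Hypothetical sketch: turn a (channels=31, D=20, H=20, W=20) voxel volume
# into patch tokens for a transformer. Patch size 2 per axis gives
# (20/2)**3 = 1000 patches, each of dimension 31 * 2**3 = 248.
C, D, H, W = 31, 20, 20, 20
p = 2
vol = np.random.randn(C, D, H, W).astype(np.float32)

# (C, D/p, p, H/p, p, W/p, p) -> (num_patches, C * p**3)
x = vol.reshape(C, D // p, p, H // p, p, W // p, p)
x = x.transpose(1, 3, 5, 0, 2, 4, 6).reshape(-1, C * p**3)

# linear patch embedding into the transformer dimension (dim=128)
W_embed = np.random.randn(C * p**3, 128).astype(np.float32) / np.sqrt(C * p**3)
tokens = x @ W_embed            # shape (1000, 128)

# prepend a cls token (zeros here; learnable in a real model) -> seq len 1001
cls = np.zeros((1, 128), dtype=np.float32)
seq = np.concatenate([cls, tokens], axis=0)
print(seq.shape)                # (1001, 128)
```

In a PyTorch model this whole block collapses to a single `einops.rearrange` plus an `nn.Linear`, after which the sequence goes into a standard transformer.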

@xuzhang5788
Author

xuzhang5788 commented Oct 15, 2020

@lucidrains Thank you very much. Does the Linformer library need any changes for 3D? If I want 10 patches along each axis, is it okay to change
efficient_transformer = Linformer(
    dim=128,
    seq_len=49 + 1,    # 7x7 patches + 1 cls token
    depth=12,
    heads=8,
    k=64
)

into

efficient_transformer = Linformer(
    dim=128,
    seq_len=1000 + 1,  # 10x10x10 patches + 1 cls token
    depth=12,
    heads=8,
    k=64
)

@lucidrains
Owner

@xuzhang5788 you just have to make sure the sequence length is correct

yup, 10 patches per axis would be 10 ** 3 + 1 (for the cls token)

@lucidrains
Owner

for linformer, k is recommended to be around 256 at that length
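Putting the two recommendations together (correct sequence length, larger `k`), the configuration would look like the sketch below. This assumes the same `Linformer` constructor used earlier in the thread; it is a config fragment, not tested against a specific package version.

```python
from linformer import Linformer

# 10 patches per axis -> 10**3 = 1000 patch tokens, plus 1 cls token;
# k bumped from 64 to ~256 as recommended for this sequence length
efficient_transformer = Linformer(
    dim=128,
    seq_len=1000 + 1,
    depth=12,
    heads=8,
    k=256
)
```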

@xuzhang5788
Author

Thanks a lot
