Adding in Hyperspy/Pyxem Support #27
Yes, we would be happy to add support for hyperspy. I think adding a `to_hyperspy` method would be a good approach. Please do make a pull request.
One gotcha is that calibrations in abTEM are allowed to be non-linear.
Anything that can be considered a "measurement" would translate nicely to Hyperspy: projected potentials, simulated STEM images, simulated 4D-STEM data, simulated diffraction patterns. In this case you could also consider shaving off parts of the code that are not the core business of image simulation and leaving that up to the analysis packages: making line profiles, integrating within a disk, ... just an idea. I would also be in favor of having an in-memory conversion rather than going through files.
I will think about that. Although, some of the functions do have minor features relevant to simulations; e.g. the Fourier-space disc integration exactly matches the output of the corresponding simulated detector. abTEM is currently undergoing a fairly big transformation to make all the arrays dask arrays. It might be possible for hyperspy to grab and extend the task graph from abTEM to write results in supported (parallel) formats during a simulation. It might require some work on the hyperspy side, and I am not sure how hyperspy handles parallel writes.
This is actually great. If your output can be cast to a dask array, Hyperspy will happily work with it as the data of a "Lazy" signal. One just needs a light metadata + axis-information wrapper around it, and that's it. All hyperspy signals hold that axis information in an `axes_manager`.
Aha, so basically one could think of it like a non-linear axis, with just e.g. (x, y) scan coordinates given. Maybe the following PR could be relevant: hyperspy/hyperspy#2399
Any of this would slot into hyperspy quite easily! For writing data to disk, hyperspy currently uses the hdf5-based `.hspy` format. From my experience I tend not to use the compression feature when saving my data (it adds a fair bit of overhead in saving and loading for large data sets, >10 GB). There is probably a way to speed that up, but that is near the bottom of my list of things to do. My experience with lazy operations is that they are easy to implement but sometimes difficult to implement well. I do think that lazy 4-D STEM simulations would be incredibly useful, so if you need any help I would be willing to offer some.
In reference to allowing this:
I merged the pull request. I will keep the issue around, as we should revisit it in the new version with dask. Given a lazy abTEM measurement object I hope it will be possible to do:
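Presumably something along these lines (a pseudocode sketch; the method and file names here are assumptions, not a confirmed API):

```python
# measurement is a lazy abTEM measurement object; nothing has been computed yet
measurement.to_hyperspy().save("simulation.hspy")
```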
This should then trigger the simulation to start and save the result directly in the hyperspy format.
Thanks, that is very useful.
Indeed, my initial naive implementation of dask was 10 times slower than the thread-parallelized version (on my 8-core test system). I have been working hard on it, so now the dask version, using processes or threads, should be faster, even on small systems. I will certainly take that help. After some testing, I am thinking of switching to zarr. I agree that lazy 4D-STEM simulations are useful; in particular, if the diffraction patterns are calculated from a precalculated PRISM scattering matrix, the reduction can be done much faster than is required for real-time visualization. In this context, the scattering matrix could be thought of as a lossless compression of the data.
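The threads-versus-processes choice is just a scheduler argument in dask, so the comparison is easy to reproduce on any machine (array sizes below are arbitrary):

```python
import dask.array as da

# A chunked, lazy array: each chunk becomes one task in the graph.
x = da.random.random((2000, 2000), chunks=(500, 500))

# The same task graph can be executed by different schedulers without changing any code.
mean_threads = x.mean().compute(scheduler="threads")
mean_sync = x.mean().compute(scheduler="synchronous")  # single-threaded, handy for debugging
```

Timing these against each other (e.g. with `%timeit` in a notebook) shows how the scheduler choice interacts with chunk size.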
Yep! I don't see any reason that wouldn't work!
I'll take a look at it if you ping me! Thanks!
I haven't ever really used it. I'm more familiar with xarray.
That is a cool idea! Curious to see how it works out.
This is pretty exciting, finally a great simulation code that will integrate well and is built on the same software stack as the most popular analysis code! I've been using abTEM for high-throughput calculations on a small GPU cluster - I simulated 700 HAADF STEM images using full multi-slice of different 5x5x20 nm crystals and it only took a week and a half, so thanks a lot for this effort!
I'm not going to claim I'm a super expert at dask, but I'd also be happy to have a look at it. Have you looked into dask-cuda? It might allow you to scale to multiple GPUs on a single node, which in my opinion is currently the only selling point of PRISM.
Is that the case? I think in the end the final file format doesn't matter so much, but it would be convenient if metadata is saved alongside, including simulation "microscope" conditions. I've really only used hdf5, but I can also see why some people don't like it.
Still working on the dask implementation. In the meantime, you may have some input on this new issue (#30) I created, as it was somewhat prompted by the present issue.
I have used xarray; I believe zarr is quite different. xarray is more similar to hyperspy, and zarr is quite similar to hdf5.
Really cool, very happy to hear it being used.
Yes, please 👍. I will ping you.
I have, although I have only used it with a single GPU for now. I think parts of the code in the dask branch should scale to multiple GPUs already.
Yeah, I will try to make metadata a feature. It is slightly more difficult due to the modular nature of abTEM; there is not necessarily one simulation, so we have to be smart about propagating metadata through the objects.
You prompted me to look into the issue a little bit more. I've recently been running into some problems saving dask datasets into hdf5. There is apparently support using MPI with parallel-hdf5, but I think that zarr has a much better framework for those kinds of operations. I'm probably not telling you anything new... just catching up. I will open an issue in hyperspy about potentially loading/saving using zarr rather than hdf5. Overall it should be fairly easy to add in support for zarr loading and saving, as the file format is very similar to hdf5. From there, I think that for larger datasets the zarr format should be preferable to hdf5. I don't know if a complete switch from hdf5 to zarr would ever occur, but something different from the current implementation is clearly needed.
Just so that things stay up to date: I am in the process of adding in the ability to save hyperspy signals in the zarr format. Right now this seems to work excellently with dask.
Shall we close this, @jacobjma?
You can probably close this issue currently, but I've been meaning to create some examples converting abtem --> hyperspy. In particular there are a couple of cool things, like the lazy plotting/caching in hyperspy, which mean that you should be able to create a lazy simulation and then plot it lazily using hyperspy, which would be quite cool. That process would probably go something like this:
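Roughly like this (pseudocode; the object and attribute names are assumptions based on the discussion above, not a checked API):

```python
measurement = probe.scan(potential, detectors=detector)  # fully lazy simulation
s = measurement.to_hyperspy()                            # still lazy
s.navigator = cheap_overview_image                       # anything fast to compute
s.plot()                                                 # chunks are simulated on demand
```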
Then you can visualize your simulation and check certain parts of the image, all without significant computation occurring. These kinds of end-to-end lazy operations are super cool, but it would be good to have good examples of how to do some of these things.
This should work in the main branch; the code for converting abTEM to HyperSpy now exists here. I just had to fix a bug that the testing of the conversion turned up. You can create an abTEM focal-series ensemble, and the ensemble axes will become navigation axes when it is converted to Hyperspy.
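Presumably along these lines (pseudocode; exactly how abTEM expresses a defocus ensemble is assumed here, not checked against the API):

```python
probe = abtem.Probe(energy=80e3, semiangle_cutoff=20,
                    defocus=[0, 50, 100, 150])            # series -> ensemble axis
s = probe.scan(potential, detectors=detector).to_hyperspy()
# the defocus ensemble axis shows up as a navigation axis in HyperSpy
```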
That will produce a lazy signal. You can also apply further operations, which will give you a derived signal.
@jacobjma yea, I would imagine that converting a `Measurement` that way would work.
@jacobjma This is fantastic! I have to admit, when I originally added the `to_hyperspy` method I didn't expect it to work this well.

One thing I was wondering: is there a way to make a fast HRTEM simulation using a plane wave that matches the real-space sampling for a 4D-STEM simulation? I've been trying with the new version of abtem but can't seem to get it to work.

Just to give you an idea of what I am doing: in hyperspy you can set the `navigator` of a lazy signal yourself. For example:

```python
import ase
import abtem

probe = abtem.Probe(energy=80e3, semiangle_cutoff=20, defocus=100, sampling=0.01)  # just one defocus

atoms = ase.build.bulk("Au", cubic=True) * (2, 2, 1)
atoms.center(vacuum=2)

detector = abtem.PixelatedDetector()

# Setting max_batch smaller is better for responsive plotting (not so much for task generation).
measurement = probe.scan(atoms, detectors=detector, max_batch=8)

hs_signal = measurement.to_hyperspy()

# hs_signal.compute_navigator()  # computes the entire signal (not very helpful)

# bf_image is a precomputed bright-field image; much faster to compute for a basic
# representation (this can be anything, as long as it is fast).
hs_signal.navigator = bf_image

hs_signal.axes_manager.indices = (10, 10)  # jump to the middle of the plot
hs_signal.plot(vmax="99th")  # this is fast(er) now, as the navigator is already set!
```

Not sure how useful this actually is, but that probably depends on how fast your computer is. You still have to compute the current chunk before displaying it, which can be slightly laggy, and any time you move from one chunk to the next things are slow, as it has to compute the next chunk. I assume if you use a GPU you are probably going to be a little bit happier about this, and maybe if you use a PRISM algorithm it might be even better. Anyways, I should take another look at it.
Are you coming to M&M? We could discuss this in person.
I just realized that I never responded to this :) Yes I'll try to find you to talk to you after your talk today! |
Some of the developers over at pyxem were talking about adding in support for dynamical-type simulations / 4-D STEM (pyxem/diffsims#170). abTEM came up as a good alternative which we could reference and work alongside, rather than trying to support dynamical-type simulations ourselves.
I was wondering if you would be interested in adding support for hyperspy/pyxem to the package?
There are a couple of ways that this could be accomplished (non-exclusively):
- Adding a `to_hyperspy` method to the `Measurement` class which converts a `Measurement` object to some `BaseSignal` subclass. (This would require a pyxem/hyperspy dependency, which might be more of a consideration than just writing the files.)

I think that this would be an easy way to increase the visualization/analysis tools available to simulations without too much added overhead. I would be willing to create a small pull request to make these changes if you are interested in adding in this support.
@din14970 Based on the conversation we had earlier, you might have some additional thoughts.