obs example: SDSS spectra #154
Comments
Has been added to www.tng-project.org/data/obs/ |
It is very slow; not much happens after a while. What would be a more efficient approach? |
Let me give it a try, but I would already have the following suggestions:
Fixing this will require you to pick one of two routes:
|
Can you convert the code into a more efficient version, that downsamples the array to e.g. ~1000x1000 pixels (plenty) for the imshow? |
Here's an example to get you started:

import numpy as np
import matplotlib.pyplot as plt
import scida

path = '/virgotng/mpia/obs/SDSS/'
filename = 'sdss-dr17-spectra.hdf5'
ds = scida.load(path + filename, units=False)
classes = {0: 'GALAXY', 1: 'STAR', 2: 'QSO'}

def average_bins(array, bin_size=1000):
    """Average bins in the first axis by a given bin_size."""
    a, b = array.shape
    remainder = a % bin_size
    # If there is a remainder, pad the array
    # (note: zero padding biases the mean of the last bin low)
    if remainder != 0:
        pad_size = bin_size - remainder
        padded_array = np.pad(array, ((0, pad_size), (0, 0)), mode='constant', constant_values=0)
    else:
        padded_array = array
    # Now, reshape and compute the mean along the new axis
    reshaped_array = padded_array.reshape(-1, bin_size, b)
    averaged_array = reshaped_array.mean(axis=1)
    return averaged_array

ims = []
for cl, label in classes.items():
    # class and z are small arrays, so we just load them into memory as numpy arrays...
    w = np.where(np.array(ds['class']) == cl)[0]
    z = np.array(ds['z'])[w]
    inds = np.argsort(z)  # ... and argsort is not properly supported by dask anyway
    flux = ds['flux'].rechunk((-1, 100))  # important to rechunk in first dimension where we use indices
    im2d = flux[w[inds], :]
    # reduce size by averaging 10000 spectra each
    bin_size = 10000
    im2d = average_bins(im2d, bin_size)
    im2d = im2d.compute()
    ims.append(im2d)

fig, axes = plt.subplots(ncols=3, nrows=1, figsize=(14, 8))
for i, (ax, im2d) in enumerate(zip(axes, ims)):
    im = ax.imshow(im2d, origin="lower", aspect="auto")
    ax.set_xlabel("spectral direction")
    if i == 0:
        ax.set_ylabel("redshift direction")
    else:
        ax.set_yticks([])
# the colorbar reflects the color scale of the last panel only
fig.colorbar(im, ax=axes, label="flux")
plt.savefig("sdss-dr17-spectra.png", dpi=150)

See the code comments for details of the relevant changes. This ran in ~1 min in a distributed environment on a vera/p.large node with 8 cores. |
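One caveat with the binning above: padding with zeros pulls the mean of the final (partial) bin toward zero. A minimal sketch of an alternative, assuming plain NumPy arrays, pads with NaN and uses `np.nanmean` so padded rows are ignored. The names `average_bins_nan` and `bin_size_for` are hypothetical helpers, not part of scida, and whether this carries over unchanged to the dask arrays used above has not been checked here.

```python
import numpy as np

def average_bins_nan(array, bin_size=1000):
    """Average rows in groups of bin_size, ignoring padded rows in the last bin."""
    n_rows, n_cols = array.shape
    pad_size = (-n_rows) % bin_size  # rows needed to reach a multiple of bin_size
    # Pad with NaN instead of 0 so the padding can be excluded from the mean
    padded = np.pad(array.astype(float), ((0, pad_size), (0, 0)),
                    mode="constant", constant_values=np.nan)
    reshaped = padded.reshape(-1, bin_size, n_cols)
    return np.nanmean(reshaped, axis=1)

def bin_size_for(n_rows, target_rows=1000):
    """Smallest bin size that yields at most target_rows binned rows."""
    return max(1, -(-n_rows // target_rows))  # ceiling division

# Toy data: 2500 rows of ones, so every bin mean should be exactly 1
data = np.ones((2500, 4))
binned = average_bins_nan(data, bin_size=1000)
print(binned.shape)  # (3, 4)
print(binned[-1])    # [1. 1. 1. 1.]
```

With zero padding, the last row in this toy case would come out as 0.5 instead of 1. Note that `np.nanmean` also skips NaN values already present in the data, which may or may not be what you want for the flux array.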
There is a new obs data file available:
/virgotng/mpia/obs/SDSS/sdss-dr17-spectra.hdf5
As with the GAIA file, the `description` attributes contain unit metadata.
Can you make a code snippet to use scida to create a 3-panel plot, each panel a 2D image of `spec` (color is flux), for `z` (ordered) vs `wave`? The three panels separate by `class` (0 = galaxy, 1 = star, 2 = qso).
Can start an "obs cookbook".
Nice addition to the docs gallery.