Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

switch dc2 to multiband coadded #903

Merged
merged 2 commits into from
Jul 28, 2023
Merged

switch dc2 to multiband coadded #903

merged 2 commits into from
Jul 28, 2023

Conversation

XinyueLi1012
Copy link
Collaborator

  1. use coadded image instead of calibrated image
  2. use multiband (6 bands for dc2)

@XinyueLi1012 XinyueLi1012 linked an issue Jul 28, 2023 that may be closed by this pull request
@codecov
Copy link

codecov bot commented Jul 28, 2023

Codecov Report

Merging #903 (1167ce2) into master (417127e) will decrease coverage by 0.01%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #903      +/-   ##
==========================================
- Coverage   95.94%   95.93%   -0.01%     
==========================================
  Files          25       25              
  Lines        2812     2830      +18     
==========================================
+ Hits         2698     2715      +17     
- Misses        114      115       +1     
Flag Coverage Δ
unittests 95.93% <100.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed Coverage Δ
bliss/surveys/dc2.py 100.00% <100.00%> (ø)

... and 1 file with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

split_lim = self.image_lim[0] // self.n_split
image = torch.from_numpy(image)
split1_image = torch.stack(torch.split(image, split_lim, dim=1))
split2_image = torch.stack(torch.split(split1_image, split_lim, dim=3))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent! I didn't think of using torch.split here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is data loading much faster now?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now it take 5 mins to load 1 track data. The speed sometime slow down when loading the second track's data (may because of memory or something), but yeah the data loading is faster than using for loop.

@@ -157,13 +159,13 @@ def setup(self, stage="fit"):
self.test = self.dc2_data[train_len + val_len :]

def train_dataloader(self):
return DataLoader(self.data, batch_size=1)
return DataLoader(self.data, shuffle=True, batch_size=128)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to store batch_size in the config file, like how it's done in CachedSimulatedDataset:

self.data[self.slices[1]], batch_size=self.batch_size, num_workers=self.num_workers


def val_dataloader(self):
return DataLoader(self.valid, batch_size=1)
return DataLoader(self.valid, shuffle=True, batch_size=128)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to shuffle validation and test data. (No benefit, and it's slower.)

@@ -176,12 +178,17 @@ class Dc2FullCatalog(FullCatalog):

@classmethod
def from_file(cls, cat_path, wcs, height, width, band):
data = pd.read_pickle(cat_path + band + ".pkl")
data = pd.read_pickle(cat_path)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to give this variable a more descriptive name than data

image_list = []
bg_list = []
wcs = None
for b in range(6):
Copy link
Contributor

@jeff-regier jeff-regier Jul 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better not to hard code 6. Instead, make this function a method and change it to self.n_bands, like in the other classes that inherit from Survey.

Copy link
Contributor

@jeff-regier jeff-regier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Please merge after making the minor changes above.

@jeff-regier jeff-regier merged commit ac7121a into master Jul 28, 2023
3 checks passed
@jeff-regier jeff-regier deleted the dc2-coadded branch July 28, 2023 21:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

switch to training with DC2 multiband coadds
2 participants