Partial inputs - Pakistan flood tutorial #154

Merged (10 commits) on Feb 22, 2024

@lillythomas lillythomas commented Feb 9, 2024

This is a tutorial following the style of #149 to show how to generate embeddings from partial inputs and find a signal for a major monsoon flood in Padidan, Pakistan (August 2022) with PCA and t-SNE.

Modifications (aside from AOI and time):

  • Use of SWIR band group
  • Use of the correct mean and std for the nir band (not nir08)
  • Addition of t-SNE computation
  • Modification to the datamodule so that the non-normalized dates are added to the batch dictionary (for plotting with the embeddings when two or more components are involved, i.e. when date can't be one of the axes)
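
As a rough illustration of the PCA step mentioned above, here is a minimal sketch that projects embeddings onto their first two principal components using NumPy's SVD. It is not the notebook's code: the `pca_project` helper, the 16-dimensional synthetic embeddings, and the two-cluster "before/during flood" setup are all assumptions made for the example. t-SNE is not shown; it would normally come from a library such as scikit-learn.

```python
import numpy as np


def pca_project(embeddings: np.ndarray, n_components: int = 2) -> np.ndarray:
    """Project row-wise embeddings onto their top principal components."""
    # Center each feature so the components describe variance, not the mean.
    centered = embeddings - embeddings.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


# Hypothetical stand-in data: 10 "pre-flood" and 10 "flood" embeddings,
# drawn from two shifted clusters (real embeddings would come from the model).
rng = np.random.default_rng(0)
before = rng.normal(0.0, 0.1, size=(10, 16))
during = rng.normal(1.0, 0.1, size=(10, 16))
embeddings = np.vstack([before, during])

projected = pca_project(embeddings, n_components=2)
print(projected.shape)
```

With data like this, the two groups land on opposite sides of the first component, which is the kind of separation the tutorial looks for around the flood date.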

Comment on lines 63 to 71
  date = chip.tags()["date"]  # YYYY-MM-DD
- year, month, day = self.normalize_timestamp(date)
+ (
+     year,
+     month,
+     day,
+     year_non_norm,
+     month_non_norm,
+     day_non_norm,
+ ) = self.normalize_timestamp(date)

@weiji14 weiji14 Feb 15, 2024

We are already returning the non-normalized date as a YYYY-MM-DD string, no? Instead of using ts.append(batch["timestep_non_norm"]) in your partial-inputs-flood-tutorial.ipynb notebook, could you use ts.append(batch["date"])?

Contributor Author

Fixed this. It worked, thanks!
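
A minimal sketch of the approach suggested in this thread: collect the YYYY-MM-DD strings from each batch's "date" entry and parse them for plotting. The batches below are hypothetical dictionaries standing in for what the datamodule yields, and the day-offset calculation is only one assumed way to turn the dates into a colour scale.

```python
from datetime import date
from typing import List

# Hypothetical batches; in the notebook these would come from the dataloader,
# where each batch carries a list of YYYY-MM-DD strings under "date".
batches = [
    {"date": ["2022-08-01", "2022-08-16"]},
    {"date": ["2022-08-31"]},
]

ts: List[List[str]] = []
for batch in batches:
    ts.append(batch["date"])  # one list of date strings per batch

# Flatten the per-batch lists and parse the ISO strings into date objects.
dates = [date.fromisoformat(d) for batch_dates in ts for d in batch_dates]

# Days since the first acquisition, e.g. for colouring points in a scatter plot.
days_since_start = [(d - dates[0]).days for d in dates]
print(days_since_start)
```

Because the strings are already non-normalized, no extra fields need to be returned from normalize_timestamp for this purpose.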

@@ -175,7 +184,7 @@ def train_dataloader(self):
             self.trn_ds,
             batch_size=self.batch_size,
             num_workers=self.num_workers,
-            shuffle=True,
+            shuffle=False,
Contributor

We should be shuffling the mini-batches in the train_dataloader. Is there a reason to set it to False here?

Suggested change
- shuffle=False,
+ shuffle=True,

Contributor Author

I rolled back changes to the datamodule in this PR, so it is the same as in main. I think this should be examined in a separate PR since no training is occurring for the purpose of this one.
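
For readers unfamiliar with the flag under discussion, here is a pure-Python sketch of what a shuffle option does when batching. It is a simplified stand-in, not torch.utils.data.DataLoader or the datamodule's code; `iter_batches`, its `seed` parameter, and the chip names are hypothetical.

```python
import random
from typing import Iterator, List, Optional, Sequence


def iter_batches(
    samples: Sequence[str],
    batch_size: int,
    shuffle: bool,
    seed: Optional[int] = None,
) -> Iterator[List[str]]:
    """Yield mini-batches, optionally visiting the samples in random order."""
    indices = list(range(len(samples)))
    if shuffle:
        # Training typically reshuffles every epoch so batches differ each time.
        random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        yield [samples[i] for i in indices[start : start + batch_size]]


chips = [f"chip_{i}" for i in range(6)]

# shuffle=False keeps dataset order, which is deterministic.
ordered = list(iter_batches(chips, batch_size=2, shuffle=False))

# shuffle=True visits every sample exactly once, in a permuted order.
shuffled = list(iter_batches(chips, batch_size=2, shuffle=True, seed=0))
print(ordered)
print(shuffled)
```

Either setting covers the same samples; only the order changes, which matters for training but not for a one-off pass that generates embeddings.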

@lillythomas lillythomas merged commit 34a655a into main Feb 22, 2024
4 checks passed
@lillythomas lillythomas deleted the flood_tutorial_pca_tsne branch February 22, 2024 18:48