Issue in OOD data distribution when Grouper is set to "regions" for FMoW #138

Closed
saraalemadi opened this issue Nov 9, 2022 · 3 comments

@saraalemadi

Hi,

I am trying to change the group-by field from "year" to "region". I have followed the instructions in the README and am currently using the following command:
python3 wilds/examples/run_expt.py --dataset fmow --algorithm ERM --groupby_fields region --root_dir wilds_fmow/

However, the data is not being separated into distinct regions for ID and OOD. That is, all regions appear in both the ID and the OOD splits. Here is a screenshot of the output:
[Image: Screenshot 2022-11-09 at 15 37 47]

Is this a bug in the code, or am I missing something?

Thanks
Sara A. Al-Emadi

@kohpangwei
Collaborator

Hi Sara, this is expected: as described in our paper, the training, val, and test data come from (different time ranges in) the same regions.

@saraalemadi
Author

Hi @kohpangwei,

Thanks for your reply. However, I would like to clarify that when running the experiment without changing the grouper (mentioned in https://github.com/p-lambda/wilds#domain-information using CombinatorialGrouper), I get the following output, which shows a clear separation between ID and OOD defined by year. Here is an example:

[Image: Screenshot 2022-11-10 at 14 41 42]

But when I do specify the CombinatorialGrouper in terms of "region" rather than "year", I get a mixed output as shown in my previous comment. Therefore, shouldn't the CombinatorialGrouper split the data into distinct groups (ID and OOD) based on the specified group?

Would appreciate your clarification.

Thanks
Sara A. Al-Emadi

@kohpangwei
Collaborator

Hi Sara,

This is expected behavior. The data splits are unaffected by the grouper. The grouper just groups the (fixed) data splits according to the specified group-by feature.
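
To illustrate the distinction, here is a minimal sketch in plain Python. It does not use the WILDS API; the records, the `group_counts` helper, and the years and region names are all made up for illustration. Each toy record carries a fixed split label, and the group-by field only changes how the records inside each split are counted, not which split they belong to.

```python
from collections import Counter

# Toy metadata (hypothetical values, not real FMoW records).
# Each record's split is assigned up front and never changes.
RECORDS = [
    {"split": "train", "year": 2010, "region": "Asia"},
    {"split": "train", "year": 2011, "region": "Europe"},
    {"split": "train", "year": 2012, "region": "Africa"},
    {"split": "test", "year": 2016, "region": "Asia"},
    {"split": "test", "year": 2017, "region": "Europe"},
    {"split": "test", "year": 2017, "region": "Africa"},
]


def group_counts(records, split, field):
    """Count the records of one fixed split per value of the group-by field."""
    return Counter(r[field] for r in records if r["split"] == split)


# Switching the group-by field relabels the groups inside each split,
# but the membership of the splits themselves stays the same.
for field in ("year", "region"):
    train_groups = group_counts(RECORDS, "train", field)
    test_groups = group_counts(RECORDS, "test", field)
    shared = sorted(set(train_groups) & set(test_groups))
    print(f"groupby={field}: train={dict(train_groups)} "
          f"test={dict(test_groups)} shared={shared}")
```

In this toy data, grouping by year gives groups that do not overlap between train and test, because the splits were built from different time ranges. Grouping by region gives the same groups in both splits, which matches the mixed output reported above.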
