Retrieve databundle downloads bundles without checking file sizes #866

Closed
2 tasks
Emre-Yorat89 opened this issue Sep 16, 2023 · 5 comments
Labels
bug Something isn't working

Comments

@Emre-Yorat89
Contributor

Checklist

  • I am using the current main branch or the latest release. Please indicate.
  • I am running on an up-to-date pypsa-earth environment. Update via conda env update -f envs/environment.yaml.

Describe the Bug

Hello,
When the retrieve databundle script runs, it downloads the first cutout bundle that matches. However, there may be another, smaller bundle that also satisfies the country requirement. A size check across the matched bundles would therefore avoid an unnecessary burden on computer memory.

@Emre-Yorat89 Emre-Yorat89 added the bug Something isn't working label Sep 16, 2023
@ekatef
Member

ekatef commented Sep 16, 2023

Hello @Emre-Yorat89!
Thank you for adding the issue. My feeling is that addressing it would be very helpful if a particular country or region is of interest, which is a frequent use case.

As we have discussed, the current settings do not allow downloading cutouts focused on specific regions. A databundle is currently selected to cover as many countries as possible from the provided countries list:

The selected bundles shall adhere to the following criteria:
- The bundles' tutorial parameter shall match the tutorial argument
- The bundles' category shall match the category of data to download
- When multiple bundles are identified for the same set of users,
the bundles matching more countries are first selected and more bundles
are added until all countries are matched or no more bundles are available
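
The criteria above can be sketched roughly as follows. This is a minimal illustration, not the actual pypsa-earth function: the name `pick_bundles_greedy`, the layout of `bundles` (bundle name mapped to the countries it covers), and the example bundle names are assumptions made for this sketch, and the tutorial and category filters are left out.

```python
def pick_bundles_greedy(bundles, countries):
    """Pick bundles that match the most requested countries first.

    `bundles` maps a bundle name to the list of countries it covers
    (hypothetical layout, not the real config structure).
    """
    remaining = set(countries)
    # Number of requested countries matched by each bundle.
    n_matched = {name: len(remaining & set(covered)) for name, covered in bundles.items()}
    selected = []
    # Bundles matching more countries come first; ties keep their input order.
    for name in sorted(n_matched, key=n_matched.get, reverse=True):
        if not remaining:
            break
        newly_covered = remaining & set(bundles[name])
        if newly_covered:
            selected.append(name)
            remaining -= newly_covered
    return selected


# Hypothetical bundles: both cutouts cover NG and BJ, one is larger.
example_bundles = {
    "cutout_africa": ["NG", "BJ", "KE"],
    "cutout_westafrica": ["NG", "BJ"],
    "cutout_asia": ["KZ"],
}
print(pick_bundles_greedy(example_bundles, ["NG", "BJ"]))
```

With this example data the larger `cutout_africa` is chosen simply because it is listed first among the tied candidates, which is the behaviour the issue describes.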

You suggest that the selected data bundle should be as small as possible, which sounds like a nice further development of the existing approach.

@Emre-Yorat89
Contributor Author

I think the issue can be solved by reordering the cutout bundles in the config file in ascending order of size, since @yerbol-akhmetov mentioned in the weekly developer meeting that the script downloads the first bundle that matches the countries or areas of interest.

@ekatef
Member

ekatef commented Sep 19, 2023

I think the issue can be solved by reordering the cutout bundles in the config file in ascending order of size, since @yerbol-akhmetov mentioned in the weekly developer meeting that the script downloads the first bundle that matches the countries or areas of interest.

Hey @Emre-Yorat89, I agree that it may be quite an elegant and straightforward solution. @davide-f, what is your feeling about that?

@davide-f
Member

Hello!
I completely agree on the point.
To address this issue, we need to revise this function:

def get_best_bundles_by_category(

Currently, when multiple cutouts contain all countries, a random one is picked.
The function works by creating the dictionary dict_n_matched, which maps every cutout (keys) to the number of matched countries (values). We sort it and pick as many bundles as needed until all countries are covered.
To address the proposed issue, we should revise it and create a dataframe with two columns: the first being the number of matched countries per cutout (the current dictionary) and the second being the total number of countries per cutout.

Then, with pandas, we can sort the dataframe by the first column and then by the second, and pick as many bundles as needed until all countries are matched.
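
A minimal sketch of that two-column sort, under stated assumptions: this is not the code merged in the linked fix, the function name `pick_bundles_smallest_first`, the column names `n_matched` and `n_total`, and the example bundles are hypothetical, and the sort directions (most matches first, then fewest total countries as a proxy for size) are my reading of the proposal.

```python
import pandas as pd


def pick_bundles_smallest_first(bundles, countries):
    """Prefer bundles with more matches, breaking ties by fewer total countries.

    `bundles` maps a bundle name to the list of countries it covers
    (hypothetical layout, not the real config structure).
    """
    remaining = set(countries)
    rows = [
        {
            "bundle": name,
            "n_matched": len(remaining & set(covered)),  # requested countries in the bundle
            "n_total": len(set(covered)),  # all countries in the bundle (size proxy)
        }
        for name, covered in bundles.items()
    ]
    df = pd.DataFrame(rows, columns=["bundle", "n_matched", "n_total"])
    # Most matches first; among equal matches, the smaller bundle first.
    df = df.sort_values(["n_matched", "n_total"], ascending=[False, True])

    selected = []
    for name in df["bundle"]:
        if not remaining:
            break
        newly_covered = remaining & set(bundles[name])
        if newly_covered:
            selected.append(name)
            remaining -= newly_covered
    return selected


# Hypothetical bundles: both cutouts cover NG and BJ, one is larger.
example_bundles = {
    "cutout_africa": ["NG", "BJ", "KE"],
    "cutout_westafrica": ["NG", "BJ"],
    "cutout_asia": ["KZ"],
}
print(pick_bundles_smallest_first(example_bundles, ["NG", "BJ"]))
```

Here the smaller `cutout_westafrica` wins the tie for `["NG", "BJ"]`. Counting countries is only a proxy for download size; a real size field per bundle, if one were available, could replace `n_total` as the second sort key.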

@davide-f davide-f moved this to Todo in pypsa-earth Oct 5, 2023
@davide-f davide-f moved this from Todo to Done in pypsa-earth Nov 30, 2023
@davide-f
Member

davide-f commented Dec 2, 2023

This issue is now solved thanks to #911
