
make t(link) (temporal aggregation of new process copies) flexible #5

Open · muelleram opened this issue Dec 14, 2023 · 3 comments
Labels: enhancement (New feature or request), not urgent (Unessential improvement that can be postponed)

@muelleram (Collaborator)

Currently, t(link) (temporal aggregation of new process copies) is set to 1 year. It would be great to let users define the level of aggregation themselves, possibly within a reasonable temporal range. E.g. seconds or minutes are probably not suitable, as they would create a large number of new processes in the database and do not fit common LCA problems.

muelleram added a commit that referenced this issue Jan 23, 2024
…s of new processes (works for years and months)

Instead of grouping everything hard-coded by year, we can now choose the temporal resolution to group by (temporal_grouping: str = 'year'). I have also made sure that when creating the datapackages the ids are accessed correctly: the entire new date, e.g. 20240123, should be appended to the original ids. However, this does not work yet because it results in integers that are too large, so currently only YYYYMM is appended.
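The grouping by a chosen temporal resolution could look roughly like the pandas sketch below. The function name, the column names (`date`, `amount`), and the frequency mapping are illustrative assumptions, not the actual medusa API:

```python
import pandas as pd

# Hypothetical mapping from the temporal_grouping string to a pandas
# frequency alias; extendable, e.g. "day": "D".
FREQ = {"year": "YS", "month": "MS"}

def group_new_processes(
    timeline_df: pd.DataFrame, temporal_grouping: str = "year"
) -> pd.DataFrame:
    """Sum exchange amounts of new process copies per chosen time bin."""
    return (
        timeline_df
        .groupby(pd.Grouper(key="date", freq=FREQ[temporal_grouping]))["amount"]
        .sum()
        .reset_index()
    )
```

Note that this snaps each group to the bin start (e.g. 2024-01-01 for the "year" bin), which is the behaviour discussed further down in this thread.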
@muelleram (Collaborator, issue author)

Full temporal flexibility is possible in the way the interpolation between timestamps is implemented, using timedelta in seconds. However, it's not possible to go to a finer resolution than months at the moment, because the new ids still need to be unique: if we extend an existing id, e.g. 245, with the timestamp of the temporal copy, e.g. 20240123 for 23.01.2024, we get a new id of 24520240123, which is too large to be stored in a 32-bit C long (see the error screenshot below). We need to append the same time resolution as our temporal grouping so that the ids remain unique.

Maybe we need to change to an int64 id. Or do you know of a better workaround?

[screenshot: integer overflow error]
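A quick check of the overflow described above, using the numbers from the comment (numpy is only used here to read the integer limits):

```python
import numpy as np

original_id = 245
date_code = 20240123  # YYYYMMDD of the temporal copy, 23.01.2024

# Appending the date as decimal digits onto the original id:
combined_id = int(f"{original_id}{date_code}")

print(combined_id)                            # 24520240123
print(combined_id > np.iinfo(np.int32).max)   # True: int32 max is 2147483647
print(combined_id <= np.iinfo(np.int64).max)  # True: int64 holds it easily
```

So any id/date concatenation at day resolution already needs a 64-bit integer, which matches the int64 suggestion above.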

@muelleram (Collaborator, issue author)

Another issue with the current implementation is that the aggregated process copies get assigned to the first timestamp within the aggregated time period in the timeline_df.
E.g. for temporal resolution = year:
P1 at 2024-12-31 and P2 at 2024-08-31 are grouped into one 2024 row, and this row is reassigned to date = 2024-01-01. Correct would be some kind of (weighted) temporal average for the date that takes the timing of P1 and P2 into account, but I wasn't able to achieve this with groupby and datetime/timedelta.

see code medusa_tools, l-115 ff
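One possible way to get an amount-weighted average date per bin, sketched with pandas; the column names `date` and `amount` and the function itself are placeholders, not the code in medusa_tools. The trick is to convert the datetimes to seconds since epoch, weight those, and convert the mean back:

```python
import pandas as pd

def weighted_group_date(timeline_df: pd.DataFrame, freq: str = "YS") -> pd.DataFrame:
    """Aggregate a timeline into `freq` bins, assigning each bin the
    amount-weighted average date of its members instead of the bin start."""
    df = timeline_df.copy()
    # Work in whole seconds since epoch so the float weighting stays precise
    # enough (nanoseconds would overflow float64's 53-bit mantissa).
    df["secs"] = df["date"].astype("int64") // 10**9
    df["w_secs"] = df["secs"] * df["amount"]
    agg = df.groupby(pd.Grouper(key="date", freq=freq)).agg(
        amount=("amount", "sum"), w_secs=("w_secs", "sum")
    )
    agg = agg[agg["amount"] != 0]  # drop empty bins the Grouper may create
    agg["date"] = pd.to_datetime(
        (agg["w_secs"] / agg["amount"]).round().astype("int64"), unit="s"
    )
    return agg[["date", "amount"]].reset_index(drop=True)
```

For the P1/P2 example above (equal amounts at 2024-12-31 and 2024-08-31), the yearly bin would get the midpoint date 2024-10-31 instead of 2024-01-01.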

@jakobsarthur (Collaborator)

Ok, I checked with Chris, and currently this is indeed hardcoded to be an int32. It might be possible to change this, but I suppose we could do a quick mapping as well, which might be faster.

What do we want to be able to save: year-month-day-hour? I guess the season as well, as discussed before, but that is already contained in the month.
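The "quick mapping" idea could look like the sketch below: instead of concatenating digits, hand out fresh sequential ids that stay within int32 range and keep a lookup back to the (original id, timestamp) pair. The class and its names are hypothetical, not the actual medusa/bw_processing API:

```python
import itertools

class TemporalIdMapper:
    """Map (original_id, timestamp_code) pairs to fresh sequential ids
    that fit in an int32, instead of concatenating decimal digits."""

    def __init__(self, start: int = 1_000_000):
        # Start above existing database ids (assumed here) to avoid clashes.
        self._counter = itertools.count(start)
        self._forward: dict[tuple[int, int], int] = {}
        self._reverse: dict[int, tuple[int, int]] = {}

    def new_id(self, original_id: int, timestamp_code: int) -> int:
        """Return a stable small id for this (process, time) combination."""
        key = (original_id, timestamp_code)
        if key not in self._forward:
            nid = next(self._counter)
            self._forward[key] = nid
            self._reverse[nid] = key
        return self._forward[key]

    def resolve(self, new_id: int) -> tuple[int, int]:
        """Recover the original id and timestamp code from a mapped id."""
        return self._reverse[new_id]
```

This keeps the datapackage ids small regardless of how fine the temporal resolution gets (year-month-day-hour would work just as well as year).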

@muelleram muelleram added the not urgent Unessential improvement that can be postponed label Jun 10, 2024