[Proposal] Usability: Change `Event.time` => `Event.datetime` #16

Miking98 · 2024-03-21T06:13:21Z

Suggestion to change Event.time => Event.datetime.

Event.time is actually a datetime, so we should call it what it is to avoid confusion.
The name time is ambiguous: it could mean a datetime (e.g. 3/2/2024 @ 11:49pm) or just the time of day (e.g. 11:49pm).
Currently, we have Event.time and Measurement.datetime_value. Both are of type datetime, so they should be named consistently.

Other suggestion: instead of just Event.datetime, can we put Event.start_datetime and Event.end_datetime into the actual schema for Event (with end_datetime Optional), rather than relegating end to some arbitrary field in the metadata?

A lot of labeling functions will depend on having an end_datetime, so it seems a bit confusing to hide this in metadata.
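To make the proposal concrete, here is a minimal sketch of what an Event with explicit start and end fields could look like. It is an illustration only: the actual MEDS schema is not shown in this thread, and the `code` and `metadata` fields, the `duration_seconds` helper, and the example values are assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict, Optional


@dataclass
class Event:
    # Hypothetical shape for illustration; not the actual MEDS schema.
    start_datetime: datetime
    end_datetime: Optional[datetime] = None  # None for point-in-time events
    code: str = ""  # assumed field, not taken from the thread
    metadata: Dict[str, Any] = field(default_factory=dict)


def duration_seconds(event: Event) -> Optional[float]:
    """Return the event's length in seconds, or None for a point event."""
    if event.end_datetime is None:
        return None
    return (event.end_datetime - event.start_datetime).total_seconds()


# An interval event (e.g. a visit) and a point event (e.g. a lab result).
visit = Event(
    start_datetime=datetime(2024, 3, 2, 23, 49),
    end_datetime=datetime(2024, 3, 3, 1, 49),
    code="VISIT",
)
lab = Event(start_datetime=datetime(2024, 3, 2, 23, 55), code="LAB")
```

With this shape, a labeling function can read `end_datetime` directly instead of probing the metadata for an end field.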

EthanSteinberg · 2024-03-26T20:36:56Z

I'd be down for a rename. I don't think it's necessary (and it would make writing code against the MEDS standard take more time), but if people think it's confusing, Event.datetime is fine.

jason-fries · 2024-03-26T22:43:27Z

We should decide on the default representational assumption of Event -- is it an interval or a point? There are tradeoffs of course

1. Events are intervals, requiring Event.start_datetime and Event.end_datetime. When an event is a point, Event.end_datetime is null.
2. Events are points, requiring only Event.start_datetime. When an event is an interval, we materialize it as 2 separate point events.

Option 1 places more burden on the ETL side, since assigning end_datetime may require making assumptions. Option 2 seems more aligned with modern methods of linearizing/tokenizing data into discrete sequences.

I guess I favor option 2, since we're leaning into representations to feed into a model vs. representations that are more amenable to operations like writing labeling functions or other queries over the data (where interval reasoning might be super useful).
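As a rough sketch of option 2, an interval can be materialized as two point events and then sorted into a single sequence. The tuple layout, the `materialize_points` helper, and the "/START" and "/END" code suffixes are invented for this example and are not part of MEDS.

```python
from datetime import datetime
from typing import List, Optional, Tuple

PointEvent = Tuple[datetime, str]  # (start_datetime, code)


def materialize_points(
    code: str, start: datetime, end: Optional[datetime]
) -> List[PointEvent]:
    """Option 2: emit one point event, or two when an end is known."""
    if end is None:
        return [(start, code)]
    # The "/START" and "/END" suffixes are an invented convention.
    return [(start, code + "/START"), (end, code + "/END")]


events = materialize_points(
    "VISIT", datetime(2024, 3, 2, 23, 49), datetime(2024, 3, 3, 1, 49)
) + materialize_points("LAB", datetime(2024, 3, 2, 23, 55), None)

# Linearize: sort by time and keep only the codes.
sequence = [code for _, code in sorted(events)]
```

Here the lab result lands between the visit's start and end markers, which is the kind of discrete sequence a tokenizer would consume.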

Miking98 · 2024-03-26T23:55:58Z

A challenge with option 2 is how you map between the start and end of a particular event if they're now discrete objects separated by (potentially) 100s of other events (maybe even with the same code). I.e., you'd need to keep some "event UUID" in the metadata of both the start and end objects of an event (versus if they're part of the same object, then there's no ambiguity). That means the user would be responsible for generating and tracking UUIDs across events during the ETL, and for adjusting them any time they added or removed events.
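The bookkeeping this implies might look roughly like the sketch below. The dict layout and the `event_uuid` and `role` metadata keys are hypothetical, chosen only to show the pairing problem when two intervals share a code and overlap.

```python
import uuid
from datetime import datetime
from typing import Any, Dict, List, Optional, Tuple

Event = Dict[str, Any]


def split_interval(code: str, start: datetime, end: datetime) -> List[Event]:
    """Emit start and end point events that share one generated UUID."""
    event_id = str(uuid.uuid4())
    return [
        {"time": start, "code": code,
         "metadata": {"event_uuid": event_id, "role": "start"}},
        {"time": end, "code": code,
         "metadata": {"event_uuid": event_id, "role": "end"}},
    ]


def pair_intervals(
    events: List[Event],
) -> Dict[str, Tuple[Optional[datetime], Optional[datetime]]]:
    """Rebuild (start, end) pairs by matching on the shared event_uuid."""
    spans: Dict[str, Dict[str, datetime]] = {}
    for event in events:
        meta = event["metadata"]
        if "event_uuid" not in meta:
            continue  # plain point event, nothing to pair
        spans.setdefault(meta["event_uuid"], {})[meta["role"]] = event["time"]
    return {k: (v.get("start"), v.get("end")) for k, v in spans.items()}


# Two overlapping prescriptions with the same code, interleaved by time.
timeline = sorted(
    split_interval("RX", datetime(2024, 1, 1), datetime(2024, 1, 10))
    + split_interval("RX", datetime(2024, 1, 5), datetime(2024, 1, 7)),
    key=lambda e: e["time"],
)
pairs = pair_intervals(timeline)
```

Without the shared identifier, the interleaved "RX" events could not be matched unambiguously from code and time alone.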

EthanSteinberg · 2024-03-26T23:58:36Z

Having separate start and end events would cause issues. I think the best approach is the current one, where we have optional end attributes in the metadata.
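For comparison, the current approach could be read along these lines. The thread does not name the metadata field that holds the end, so the `"end"` key, the dict layout, and both helpers are assumptions made for illustration.

```python
from datetime import datetime
from typing import Any, Dict, Optional

Event = Dict[str, Any]

END_KEY = "end"  # hypothetical metadata key; the thread does not name it


def event_end(event: Event) -> Optional[datetime]:
    """Return the optional end stored in metadata, or None if absent."""
    return event.get("metadata", {}).get(END_KEY)


def covers(event: Event, when: datetime) -> bool:
    """True if `when` falls inside the event (or equals a point event's time)."""
    end = event_end(event)
    if end is None:
        return event["time"] == when
    return event["time"] <= when <= end


visit = {
    "time": datetime(2024, 3, 2, 23, 49),
    "code": "VISIT",
    "metadata": {END_KEY: datetime(2024, 3, 3, 1, 49)},
}
lab = {"time": datetime(2024, 3, 2, 23, 55), "code": "LAB", "metadata": {}}
```

Code that only needs the start reads `event["time"]` and never touches the metadata; only interval-aware code calls something like `event_end`.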

jason-fries · 2024-03-27T01:22:33Z

I think it depends on how we envision MEDS data being used. Do we actually need to track specific events with a UUID or is it enough to provide an indicator token in the sequence that some end event has occurred? I don't know. It's worth considering what scenarios actually require interval events, but I concede it introduces some complexity at unknown benefit.

Since "tokenization" (i.e., the process of transforming partially ordered clinical events into a sequence) is defined higher in the MEDS transformation stack, I'm fine with using an interval event. However, I sort of hate having to always check metadata to see if an end event is defined, so I'd prefer explicit Event.start_datetime and Event.end_datetime. I don't have a sense for how much this blows up the schema size though.

EthanSteinberg · 2024-03-27T17:41:33Z

> However, I sort of hate having to always check metadata to see if an end event is defined

You don't need to do that. For 99% of use cases, you only need the start. End events are very special and are application-specific. The two main examples I am aware of (visits and prescriptions) have very different semantics and should generally not be handled by the same code anyway.

Miking98 changed the title ~~time and datetime_value are confusing~~ Having both time and datetime_value is confusing Mar 21, 2024

Miking98 changed the title ~~Having both time and datetime_value is confusing~~ Feature: Change Event.time => Event.datetime Mar 26, 2024

Miking98 changed the title ~~Feature: Change Event.time => Event.datetime~~ Usability: Change Event.time => Event.datetime Mar 26, 2024

EthanSteinberg changed the title ~~Usability: Change Event.time => Event.datetime~~ [Proposal] Usability: Change Event.time => Event.datetime Apr 19, 2024

EthanSteinberg added and removed the enhancement label Apr 23, 2024

EthanSteinberg closed this as completed Aug 14, 2024
