-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add index on timed_beliefs for faster search #167
add index on timed_beliefs for faster search #167
Conversation
…he same fields so searching is sped up Signed-off-by: Nicolas Höning <nicolas@seita.nl>
Now one test is failing, example below, seemingly because it enters a belief which violates the unique constraint. I'll take a look later.
Also a note from looking at results: we have 9867 warnings, some of which are This seems useful: |
Signed-off-by: Nicolas Höning <nicolas@seita.nl>
… as well Signed-off-by: Nicolas Höning <nicolas@seita.nl>
…red by primary key Signed-off-by: Nicolas Höning <nicolas@seita.nl>
Signed-off-by: Nicolas Höning <nicolas@seita.nl>
timely_beliefs/beliefs/classes.py
Outdated
"event_start", | ||
"source_id", | ||
"sensor_id", | ||
"belief_horizon", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I understood it, the order matters here, and it should lead to some kind of a funnel from most unique values to fewest unique values? If that is the case, I'd expect the following ordering from most unique to fewest unique values in a typical database:
- event start: data covers a large period and slowly but steadily grows over time
- sensor: grows with the size of the system being serviced, but let's say with less than one (hourly) sensor per hour
- belief horizon: not many unique values with respect to the previous two, and likely a quite constant number
- source: may grow with the number of API users, but once you have the sensor, there are usually only a couple of sources
Just my two cents on the matter.
That said, I don't really understand why/how the order would matter and why the database doesn't take care of figuring out the best order of such things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I read somewhere that range columns (like event_start) are a good place to lead with, as well.
To be honest, I will simply add the query that Mike suggested in the end, but not because I see a performance difference. After #166, the performance for sensors in our dataset with little data is so fast, I suspect indexes are not visible. And for the sensor with > 50% of the data, Postgres seems to ignore the index by default.
Not listing belief_horizon in the index (but including it as column) makes sense, as we are using min()
on it.
This PR is an improvement over the status quo, but we might revisit indexing when we have much more data.
…ing min() on it) Signed-off-by: Nicolas Höning <nicolas@seita.nl>
Signed-off-by: Nicolas Höning <nicolas@seita.nl>
This will speed up
search_session()
, as that query and its subquery are looking for the same fields.Note that I removed the usage of
has_inherited_table()
, which blocked the existingUniqueConstraint
from being applied. Let me know if there was a strong argument for using it that I did not find. This function tests if one of the classes the model inherits from has a table assorted with it. I believe you might intend this as a protection of some sort? In FlexMeasures, we inherit fromdb.Model
and fromtb.TimedBeliefDBMixin
, both of which have no table specified.As to the unique constraint - it was never applied due to ,
has_inherited_table()
returningFalse
. If we apply it, we don't allow beliefs for the same event (and from the same source...) with different probabilities (confirmed in one test failing and telling us that, as well). So I decided we don't need this constraint. The combined PK is the same but with the probability in it, so it seems to me we are fine.