Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor forecast database access #46

Merged
merged 5 commits into from
Jul 19, 2021
Merged

Refactor forecast database access #46

merged 5 commits into from
Jul 19, 2021

Conversation

odow
Copy link
Collaborator

@odow odow commented Jul 6, 2021

Part of #40
Closes #49

This was referenced Jul 6, 2021
@odow odow changed the title Remove timezone_string and use FixedOffset everywhere Refactor forecast database access Jul 9, 2021
@odow odow requested a review from Matthew-Boyd July 9, 2021 05:59
@odow
Copy link
Collaborator Author

odow commented Jul 9, 2021

So this refactors things into a refresh_forecast_in_db function that must be called from the mediator to update the database.

Then the rest of the code just grabs the latest copy. I'm not sure about the logic of moving things out of forecasts.py. Things are pretty well self-contained, and splitting some into mediator and some into solar_plot might not be a win.

Copy link
Collaborator

@Matthew-Boyd Matthew-Boyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I fully understand what forecasts are being showed on the plots. I see that a horizon of forecasts ('forecasts_for') are created at each 'forecasts_made', but can 'forecasts_made' not be treated the same as the timestamps column in the other db tables? I.e., is 'forecasts_made' the x-axis, but only before the present time, and then 'forecasts_for' at the latest 'forecasts_made' the x-axis after the present time? I ask this in case handling the forecasts plot is more nuanced than usual.

Regarding the logic of what handles what, I don't think it's good to have the Bokeh server importing from Django apps (e.g., "from mediation import forecasts"). This may be causing some of the speed issues (and other weird errors) we're having as the Bokeh server is unnecessarily duplicating some of the Django backend calls in the process. So it would be ideal if we can reduce that. Therefore, I'd like the Bokeh server process (e.g., in file /solar_plot/main.py) to just be querying from the database.

Also, it's good to follow the single-responsibility principle, and forecasts.py should just be providing forecasts. If it needed it's own database for some reason, and it wasn't sharing it with other classes, than that's its own business and concern. But the mediator is the one handling all the different sub-model interaction and main database input. Having submodels do some of that too is confusing and likely error prone.

Specifically, the function to add forecasts data to the database should be in mediator.py (Forecasts.py should not know anything about mediator or the database). The function forecaster.getForecast() should be pulled out of get_weather_df() in mediator and put after that function call. Mediator should then handle the forecasts output, including validating it and then adding it to the database. This should also serve to simplify the circuitous code flow paths through _updateDatabase, _updateLatestForecast, latestForecast and their calls in the Bokeh plot file. I'm happy to collaborate on these changes, add to a pull request, etc.

Lastly, it looks like we're still localizing the datetimes in mediator.py to be local to the plant (fixed-offset). I was picturing most all datetimes now having the UTC timezone. Am I misunderstanding the latest changes?

@odow odow requested a review from Matthew-Boyd July 12, 2021 21:15
@odow
Copy link
Collaborator Author

odow commented Jul 12, 2021

How's this? All the database stuff is now in mediator. And the bokeh server only pulls from the database. No more calls to forecasts.py. The get_weather_df call is fast now because it no longer updates the database.

Copy link
Collaborator

@Matthew-Boyd Matthew-Boyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. Thanks for following the Lore diagram literally. One thing though is that we could keep the returned forecasts data in memory instead of retrieving it from the database almost immediately after saving it. We'd eliminate one retrieval, but I don't think this is a priority to do now.

It looks like the forecasts returned from get_raw_data(), and saved in the database, are in 1 hr. timesteps? For the other database tables we're saving in the timestep specified by the 'time_steps_per_hour' parameter. To do this with the forecasts would mean sub-sampling the real 1 hr. forecasts (if I have that right), which would then also give more resolution in the plot (unless we filter the query to get the 'real' data again.) I think it's good as is, and probably something you've already thought about, but let's discuss if you might want to go to finer timesteps in the database.

@odow odow merged commit 74c64fa into develop Jul 19, 2021
@odow odow deleted the od/fixed-offset branch July 19, 2021 05:22
@odow
Copy link
Collaborator Author

odow commented Jul 19, 2021

I'll take a look at the things you mention in a separate PR. The 1hour reason is that's the resolution the forecasts originally come in.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Remove database methods from forecasts file
2 participants