New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SG: don't fail when unable to OCR #1281
Conversation
Solar production is a very small part of Singapore electricity, so when it can't be retrieved, just log warning and move on. Add the OCR'd text to log message to possibly aid debugging. Currently OCR is stuck on parsing the datetime "2018-03-31 11:44" which it reads as "2018-03-31 1 1 :44". Maybe the regexp could be improved to handle that, but in meanwhile, log better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Useful improvements!
@@ -1,11 +1,12 @@ | |||
#!/usr/bin/env python3 | |||
|
|||
from collections import defaultdict | |||
import logging |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor comment, since this only used once maybe it should be from logging import getLogger
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, this is a style thing (there is no perf difference). My very personal reason for liking logging.getLogger()
is because I don't have to scroll up to see if getLogger
was imported and from which module, or maybe a function somewhere within this file. To give an extreme example, if we were to do from re import search
, it's impossible for someone diving straight into the code to guess what search
is without reading all imports.
I don't think we have particular consistency in the codebase on this right now. At the same time I myself did the from collections import defaultdict
right above :D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Current codebase: import logging
9 times including SG (and in all cases it is only used for getLogger
), from logging import getLogger
7 times. So if I change it, it'll be a tie. Do we have a style guide? :D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I thought there was a difference, never mind then it doesn't matter. If there was an EM style guide it would simply read "Chaos is good". 😈
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Carbon is bad, chaos is good"
black_white = gray.point(threshold_filter, '1') | ||
|
||
text = image_to_string(black_white, lang='eng') | ||
|
||
pattern = r'Est. PV Output: (.*)MWac' | ||
val = re.search(pattern, text, re.MULTILINE).group(1) | ||
try: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much better handling, we shouldn't throw away data just because solar is not working.
parsers/SG.py
Outdated
@@ -216,12 +219,16 @@ def fetch_production(zone_key='SG', session=None, target_datetime=None, logger=N | |||
'hydro': 0 | |||
}) | |||
|
|||
source = 'emcsg.com' | |||
if generation_by_type['solar']: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if it's too clever. When there is no solar production, the data that solar is 0 also comes from EMA...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe both should be included by default?
I am on mobile next couple of days, please merge if you're happy with it :) |
Solar production is a very small part of Singapore electricity, so when it can't be retrieved, just log warning and move on.
Add the OCR'd text to log message to possibly aid debugging.
Currently the image is stuck on data from ~12 hours ago. The parser has code to discard too old data, but unfortunately the time 11:44 confuses terract, which reads it as
2018-03-31 1 1 :44
. Maybe the regexp could be improved to handle that, but in meanwhile, fall through and log better.