
Haveibeenpwned sensor platform #3618

Merged
merged 8 commits into dev on Oct 8, 2016

Conversation

joyrider3774
Contributor

Description:
The haveibeenpwned sensor platform creates sensors that check for breached email accounts on haveibeenpwned.

It will list every specified email address as a sensor showing the number of breaches on that email account, as well as breach metadata (site title and the date the breach data was added).

During hass startup it will request breach data for all specified emails, with 5 seconds between each request. After hass has started this delay is increased to 15 minutes, so it will check one email address per 15 minutes. This is done to prevent abuse on the one hand, and on the other hand the breach data almost never changes, so there is no need to check frequently for new breach data. The author of the service (Troy Hunt) had no problem with this protection scheme (I emailed him before creating this PR).
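The two-phase polling scheme described above can be sketched as follows (constant and function names are illustrative, not taken from the PR's code):

```python
from itertools import cycle

STARTUP_DELAY = 5        # seconds between requests while hass is starting up
NORMAL_DELAY = 15 * 60   # one email per 15 minutes once startup is done

def next_check_delay(startup_done):
    """Delay to wait before the next breach lookup."""
    return NORMAL_DELAY if startup_done else STARTUP_DELAY

def check_order(emails):
    """Emails are checked one at a time, round-robin."""
    return cycle(emails)
```

Each update handles a single email and waits `next_check_delay()` before the next one, so n addresses take roughly 5·n seconds to populate at startup and are then refreshed one per 15 minutes.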

Related issue (if applicable): fixes #

Pull request in home-assistant.io with documentation (if applicable): home-assistant/home-assistant.io#1023

Example entry for configuration.yaml (if applicable):

# Example configuration.yaml entry
sensor:
  platform: haveibeenpwned
  email: 
    - your_email1@domain.com
    - your_email2@domain.com

Checklist:

If user-exposed functionality or configuration variables are added/changed:

If the code communicates with devices, web services, or third-party tools:

  • [N/A] Local tests with tox run successfully. Your PR cannot be merged unless tests pass
    can't run tox on windows
  • [N/A] New dependencies have been added to the REQUIREMENTS variable (example).
    no new dependencies
  • [N/A] New dependencies are only imported inside functions that use them (example).
    no new dependencies
  • [N/A] New dependencies have been added to requirements_all.txt by running
    script/gen_requirements_all.py.
    no new dependencies
  • New files were added to .coveragerc.

If the code does not interact with devices:

  • Local tests with tox run successfully. Your PR cannot be merged unless tests pass
  • Tests have been added to verify that the new code works.

@michaelkuty
Contributor

Nice, but five seconds as the default is really needless, and in combination with the request timeout you can warm up your device if the network connection is lost.

@joyrider3774
Contributor Author

joyrider3774 commented Sep 30, 2016

You mean if a user starts home assistant when there is no network present? And the problem being that it would keep retrying requests at a 5 second interval until the network is back? Do you suggest I revert the 5 seconds back to the default scan interval from hass?

The problem is that the more email addresses users specify in their configuration, the longer it takes to get the initial data, as each update only handles one email before moving to the next (when no error is thrown). If it looped through all emails in a single update it would stack the timeouts instead. So for example, 5 emails specified = 5×30 seconds before it had gotten the initial data for all email addresses; that's why I lowered it to 5. I'm not certain it's really a problem if users have to wait longer for the initial data when I revert back to the default scan interval.

_LOGGER.info("Checking for breaches for email %s", self.email)

req = requests.get(url, headers={"User-agent": USER_AGENT},
                   verify=False, allow_redirects=True, timeout=5)
Member

verify=False ⚠️ don't disable SSL verification.

Contributor Author

done
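A sketch of the fix: simply dropping `verify=False` restores requests' default certificate verification. The helper below mirrors the keyword arguments from the snippet above minus `verify` (the `USER_AGENT` value is a placeholder, not the PR's actual string):

```python
USER_AGENT = "HomeAssistantHaveIBeenPwnedSensor"  # placeholder value

def breach_request_kwargs():
    """Keyword arguments for requests.get(); `verify` is intentionally
    absent, so the library's default SSL certificate verification
    stays enabled."""
    return {
        "headers": {"User-agent": USER_AGENT},
        "allow_redirects": True,
        "timeout": 5,
    }

# Usage: requests.get(url, **breach_request_kwargs())
```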

def update(self):
    """Get the latest data for current email from REST service."""
    try:
        self.data = None
Member

This is such a weird approach. Just keep a dictionary with the latest fetched version per email.

Contributor Author

I'm using a dictionary now.
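A minimal sketch of the dictionary approach the reviewer suggested (class and method names are illustrative, not the PR's exact code):

```python
class HaveIBeenPwnedData:
    """Keep the latest fetched breach list per email in a dict."""

    def __init__(self, emails):
        self.emails = list(emails)
        self.data = {}  # email -> list of breach records

    def store(self, email, breaches):
        """Replace the cached result for one email."""
        self.data[email] = breaches

    def breaches_for(self, email):
        """A missing key just means this email has not been fetched yet."""
        return self.data.get(email, [])
```

Each sensor reads only its own entry, so a failed fetch for one address never clobbers the data already held for the others.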

emails = config.get(CONF_EMAIL)

data = HaveIBeenPwnedData(emails)
data.update = Throttle(timedelta(seconds=SCAN_INTERVAL-1))(data.update)
Member

Why is this? Better to just request the info for each email address at startup and throttle the update as normal

Contributor Author

The REST service blocks consecutive calls; it needs a delay between them. If I requested all email data in a loop at startup it would fail, so I wanted a 5 second delay between requests at startup to make sure the requests don't fail. In the beginning I set SCAN_INTERVAL to 5, but if I also set the throttle to 5 seconds it was not updating every 5 seconds but every 10, because the throttle can block the call if scan_interval and throttle are set to the same value. This did not happen if I set the throttle 1 second lower than the scan_interval.

Further in the code I raise the scan interval to 15 minutes and set the throttle to scan_interval - 5 seconds for the same reason.

Basically I wanted to achieve this: at startup, do the requests per email with 5 seconds between each request; as soon as each email has gotten its data, increase the delay to 15 minutes. I also raised scan_interval because update does not need to be called so often.

At first I tried another approach using the sleep command, but this blocks everything: hass will pause during startup if I use sleep, so I did not want to use it. I then tried to figure out a way to use just the update function as normal but make sure only one email is requested in each call to update.

Not sure what the best approach would have been in this case?
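The scan-interval/throttle interplay can be demonstrated with a minimal stand-in for Home Assistant's `Throttle` (this is not the real implementation, just a sketch of the mechanism): if the poller fires exactly every SCAN_INTERVAL seconds and the throttle window is also SCAN_INTERVAL, timing jitter can make the throttle swallow alternate calls, which is why the window is set one second shorter.

```python
import time
from datetime import timedelta

def throttle(interval):
    """Simplified stand-in for homeassistant.util.Throttle: calls that
    arrive before `interval` has elapsed since the last real call
    return None instead of running."""
    def decorator(func):
        last_call = [None]  # monotonic timestamp of the last real call
        def wrapper(*args, **kwargs):
            now = time.monotonic()
            if last_call[0] is not None and \
                    now - last_call[0] < interval.total_seconds():
                return None  # throttled away
            last_call[0] = now
            return func(*args, **kwargs)
        return wrapper
    return decorator

SCAN_INTERVAL = 5  # seconds, startup phase

@throttle(timedelta(seconds=SCAN_INTERVAL - 1))
def update():
    return "fetched"
```

Calling `update()` twice in quick succession shows the effect: the first call runs, the second is suppressed until the window expires.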

Contributor Author

I'm using track_point_in_time now, but I still had many problems getting it to work; I tried a couple of systems which all had some kind of problem. The current system works, but I'm not sure if it's elegant.

Basically I have one throttle value of 15 minutes for normal updates and one for forced updates, and I created no-throttle update functions for the data object as well as the sensor. During hass startup the no-throttle update function is called for each created sensor (based on an email), and each sensor keeps rescheduling this routine until the data object has data for its email; then that sensor stops scheduling a new check 5 seconds in the future. The data object also only cycles to the next email when the service returns data. This makes sure no time is lost at startup updating each email with 5 second delays. This seems to work fine for me when checking the times at which the data is fetched.
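The retry loop described above can be sketched with an injected scheduler (in the real component the scheduler would be `homeassistant.helpers.event.track_point_in_time`; here it is passed in as a parameter so the sketch stays self-contained, and the function name is illustrative):

```python
from datetime import datetime, timedelta

RETRY_DELAY = timedelta(seconds=5)

def force_update(data, email, schedule_at, now):
    """If this sensor's email has no entry in `data` yet, schedule another
    forced check RETRY_DELAY in the future; stop once the dict is filled.

    `schedule_at(callback, when)` stands in for track_point_in_time and
    `data` is the shared email -> breaches dictionary."""
    if email in data:
        return False  # data arrived, stop the retry loop
    when = now + RETRY_DELAY
    schedule_at(lambda fired_at: force_update(data, email,
                                              schedule_at, fired_at),
                when)
    return True
```

Because each sensor reschedules only itself, and stops as soon as its entry appears, startup converges after roughly one retry window per email rather than blocking hass with sleeps.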

…nitial startup of hass the request happens with 5 seconds delays and after startup with 15 minutes delays. Scan_interval is increased also to not call update as often
- use track_point_in_time system to make sure data updates initially at 5 seconds between each call until all sensor's email have a result in the dict.
@balloob balloob merged commit 8f8bba4 into home-assistant:dev Oct 8, 2016
@home-assistant home-assistant locked and limited conversation to collaborators Mar 17, 2017