New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[apps][onelogin] Adding new integration app for OneLogin Events #355
Conversation
|
|
||
def _sleep_seconds(self): | ||
"""Return the number of seconds this polling function should sleep for | ||
between requests to avoid failed requests. OneLogin tokens allows for 5000 requests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to confirm: 5000/hour = ~83/min = ~1.3/sec - there's no way we'll accidentally exceed this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spoke offline, Javier will intentionally hit the limit locally so he can appropriately catch the exception and throw a useful error if it occurs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to do some testing in the actual API to see if we can just generate a new token once we exceed the maximum number of requests per hour (5000) per token. Although I doubt they would allow such a simple workaround to their rate limit. In any case I can add the code to have a safe sleep of 2 seconds per request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed that additional testing should be done to best determine how this should be handled.
app_integrations/apps/onelogin.py
Outdated
every hour, so returning 0 for now. | ||
|
||
Returns: | ||
int: Number of seconds that this function shoud sleep for between requests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should
def __init__(self): | ||
self._app = None | ||
|
||
# Remove all abstractmethods so we can instantiate DuoApp for testing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OneLoginApp
Since @ryandeivert is out, @austinbyers and @jacknagz can you take a look? |
Just added a new rule,
Schemas also validated:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First round of comments - if you want to chat about any of these in more detail let me know! Also, great start! This will be great to have 💯
app_integrations/apps/onelogin.py
Outdated
Returns: | ||
str: Bearer token to be used to call the OneLogin resource APIs | ||
""" | ||
authorization = 'client_id: %s, client_secret: %s' % (client_id, client_secret) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We tend to use str.format(...)
instead of the older percent formatting. I'd suggest changing this to:
'client_id: {}, client_secret: {}'.format(client_id, client_secret)
Same for the bearer...
formatting below.
app_integrations/apps/onelogin.py
Outdated
headers_token = {'Authorization': authorization, | ||
'Content-Type': 'application/json'} | ||
|
||
response = requests.post(token_url, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I anticipate potentially needing post
abilities in other future apps as well. Could we move this functionality to the AppIntegration
base class in app_base.py
with a method name of something like _make_post_request
and change the _make_request
method to _make_get_request
? By doing this, the new method could just return False or the JSON from the response (similar to how _make_request
works now).
If this is too much right now, this can happen at a later date - just a thought!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should I also refactor the duo application to use the new _make_get_requests
in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes please! :)
app_integrations/apps/onelogin.py
Outdated
def _gather_logs(self): | ||
"""Gather the authentication log events.""" | ||
|
||
if not self._auth_headers: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
++ nice!
app_integrations/apps/onelogin.py
Outdated
""" | ||
# OneLogin API expects the ISO 8601 format: YYYY-MM-DDTHH:MM:SSZ | ||
formatted_date = datetime.utcnow().strftime('%Y-%m-%dT%H:%M:%SZ') | ||
params = {'since': formatted_date} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this is going to actually only get events since the current time. This will likely result in events for only a few seconds or less (the time between setting this date and actually querying their API for the events).
You'll want to use the self._last_timestamp
for this param, but we'll have to reason about this more since you need a formatted time and not a unix/integer timestamp
app_integrations/apps/onelogin.py
Outdated
|
||
events = self._get_onelogin_paginated_events(params) | ||
|
||
while self._more_to_poll and self._next_page_url: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually think we should avoid doing the pagination directly within the subclass and just let the while
within the base classes gather handle this. We can talk more about the requirements for this.
The main thought process here is that this while loop could loop for a good amount of time, and the subclassed apps don't have a concept of when the lambda function is getting close to timing out. By letting the looping happen on the base class, where there is some checks in place to look for nearing the time out, we can operate with a little more safety.
app_integrations/apps/onelogin.py
Outdated
return events | ||
|
||
def required_auth_info(self): | ||
return { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we spoke about offline, please add the localization (us/eu) to these so they can be user-provided.
app_integrations/apps/onelogin.py
Outdated
return { | ||
'client_secret': | ||
{ | ||
'description': ('the client secret for the OneLogin API. This ' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work with these!
|
||
def _sleep_seconds(self): | ||
"""Return the number of seconds this polling function should sleep for | ||
between requests to avoid failed requests. OneLogin tokens allows for 5000 requests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed that additional testing should be done to best determine how this should be handled.
app_integrations/apps/onelogin.py
Outdated
Returns: | ||
The same as the method _get_onelogin_events() | ||
""" | ||
response = requests.get(self._next_page_url, headers=self._auth_headers, params=params) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the base class' _make_request
be used in this instance, since it will do the response check and automatically return the json response value.
app_integrations/apps/onelogin.py
Outdated
|
||
class OneLoginApp(AppIntegration): | ||
"""OneLogin StreamAlert App""" | ||
_ONELOGIN_EVENTS_URL = 'https://api.us.onelogin.com/api/1/events' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we spoke about offline, switch to using a url that can be formatted with a region/localization (us/eu) for this link and the token link.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey Javier - overall great revisions! Some additional comments
app_integrations/apps/app_base.py
Outdated
bool or dict: False if the was an error performing the request, | ||
or a dictionary loaded from the json response | ||
""" | ||
LOGGER.debug('Making request for service \'%s\' on poll #%d', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add post
to this logger message somewhere, ie: Making post request for...
app_integrations/apps/app_base.py
Outdated
@@ -222,6 +222,24 @@ def _make_request(self, full_url, headers, params): | |||
|
|||
return response.json() | |||
|
|||
def _make_post_request(self, full_url, headers, json): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we choose a different name from json
for the variable, since that conflicts with the json
package?
app_integrations/apps/duo.py
Outdated
@@ -115,7 +115,7 @@ def _get_duo_logs(self, hostname, full_url): | |||
return False | |||
|
|||
# Make the request to the api, resulting in a bool or dict | |||
response = self._make_request(full_url, headers=headers, params=params) | |||
response = self._make_get_request(full_url, headers=headers, params=params) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this!
app_integrations/apps/onelogin.py
Outdated
{ | ||
'description': ('the region for the OneLogin API. This should be a ' | ||
'string of 2 letters, lowercase or uppercase'), | ||
'format': re.compile(r'^[a-zA-Z]{2}$') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we restrict this to eu
or us
since those are the only regions OneLogin supports?
app_integrations/apps/onelogin.py
Outdated
# Return the list of logs to the caller so they can be send to the batcher | ||
return events | ||
if not response: | ||
events = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we just return False
here instead? the line below (151) could result in a KeyError
From CI:
^ You need to update your tests as @ryandeivert merged his changes. You'll need to specify Full details are here: https://streamalert.io/rule-testing.html |
@ryandeivert I have added a significant update to the PR, that It will probably require another round of code review. It does include the update to the tests format that @mime-frame pointed out above, the refactoring of get/post requests to return a tuple and handling the API rate limit sleep, more info here. Thanks guys! |
app_integrations/apps/onelogin.py
Outdated
params = {'since': formatted_date} | ||
request_url = self._events_endpoint() | ||
|
||
LOGGER.debug('Events to retrieve for \'%s\': %s', self.type(), self._more_to_poll) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hey @javuto - I added a similar logging message to this in loop within the base class:
streamalert/app_integrations/apps/app_base.py
Line 310 in f5e866d
LOGGER.debug('More logs to poll for \'%s\': %s', self.type(), self._more_to_poll) |
You can probably safely remove this, or change it to add more context on what's happening at this point in the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per our offline chat, having this logging message here it is redundant with the added one in the base class. I will remove it.
app_integrations/apps/onelogin.py
Outdated
LOGGER.debug('Events to retrieve for \'%s\': %s', self.type(), self._more_to_poll) | ||
result, response = self._make_get_request(request_url, self._auth_headers, params) | ||
|
||
if not result: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'll may want to make sure response
is valid here as well, since it could be None? Below you may also want to change your lookups to reponse.get('status')
, etc in case the response is not in a format we expect (just for safety).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe do this:
if not (result or response):
return False
elif not result and response:
<do your logic here>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^^ This might be unnecessary - I'll let you decide :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this approach can add some unnecessary complexity to the code. I like checking first result
, then checking response
. It makes the code easy to read and to follow the flow.
app_integrations/apps/onelogin.py
Outdated
self._more_to_poll = (self._next_page_url is not None) | ||
|
||
# Adjust the last seen event | ||
self._last_timestamp = response['data'][-1]['created_at'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add a safety check here to make sure the data
list actually has logs:
logs = response['data']
if not logs:
return False
self._last_timestamp = logs[-1]['created_at']
# Return the list of logs to the caller so they can be send to the batcher
return logs
app_integrations/apps/onelogin.py
Outdated
|
||
# Set pagination link, if there is any | ||
self._next_page_url = response['pagination']['next_link'] | ||
self._more_to_poll = (self._next_page_url is not None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tip (not a necessary change): you can use bool()
to perform this logic:
self._more_to_poll = bool(self._next_page_url)
Returns: | ||
int: Number of seconds that this function should sleep for between requests | ||
""" | ||
return self._rate_limit_sleep |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
++ Nice work on this logic!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@javuto I added a few comments that might need addressed, but I'm giving you the 🚀
app_integrations/apps/app_base.py
Outdated
@@ -204,23 +204,35 @@ def _check_http_response(self, response): | |||
|
|||
return success | |||
|
|||
def _make_request(self, full_url, headers, params): | |||
"""Method for returning the json loaded response for this request | |||
def _make_get_request(self, full_url, headers, params): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's make the params
arg be an optional that defaults to None:
def _make_get_request(self, full_url, headers, params=None): ...
And then update your call here so it doesn't have to pass None
explicitly
app_integrations/apps/onelogin.py
Outdated
|
||
# Adjust the last seen event | ||
self._last_timestamp = response['data'][-1]['created_at'] | ||
events = response.get('data') | ||
self._last_timestamp = events[-1]['created_at'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good changes here, except you aren't checking to see if the events
list is empty (and events[-1]
could cause an index out of range error if no events are returned). You can also probably drop the get(
above and just use events['data']
since they key will exist even if it is an empty list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I'm worried about is this case:
{
"status": {
"error": false,
"code": 200,
"type": "success",
"message": "Success"
},
"pagination": {
"before_cursor": null,
"after_cursor": "xWNjb3VudF9pZDo6OjUzNDEzLS0jI2lkOjo6OTA0MjU3NTQ2",
"previous_link": null,
"next_link": null
},
"data": []
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, I will handle this case so it will be safe if it appears
@@ -266,7 +266,7 @@ def _determine_last_time(self): | |||
current_time = time.mktime(time.gmtime()) | |||
LOGGER.debug('Current timestamp: %s seconds', current_time) | |||
|
|||
self.last_timestamp = current_time - interval_time | |||
self.last_timestamp = int(current_time - interval_time) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this fix
app_integrations/apps/onelogin.py
Outdated
@@ -167,7 +184,7 @@ def _get_onelogin_events(self): | |||
# Check the type to understand the format stored | |||
if isinstance(self._last_timestamp, int): | |||
# OneLogin API expects the ISO 8601 format: YYYY-MM-DDTHH:MM:SSZ | |||
formatted_date = datetime.fromtimestamp( | |||
formatted_date = datetime.utcfromtimestamp( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
++
@@ -96,6 +96,12 @@ def get_auth_info(cls, app_type): | |||
'integration_key': 'DI1234567890ABCDEF12', | |||
'secret_key': 'abcdefghijklmnopqrstuvwxyz1234567890ABCD' | |||
} | |||
elif app_type in {'onelogin'}: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👏 👏 for simplification
to @ryandeivert
cc @mime-frame
resolves: #347
Change
Adding a new integration application for OneLogin. It will allow the collection of OneLogin events and process them using StreamAlert. Also the schema for the OneLogin events format is added here.
It handles the generation of tokens, using API
client_secret/client_id
pair and pagination of requests to the API, when the number of events returned (usingsince
) is over the limit (50). More information here: https://developers.onelogin.com/api-docs/1/events/get-eventsAlso added new rule to alert whenever a user is assuming a different role within OneLogin.
Testing
Unit tests are added for the class: