[apps][onelogin] Adding new integration app for OneLogin Events #355

Merged
javuto merged 15 commits into master from javuto-streamalert-onelogin-app on Oct 17, 2017

Contributor

@javuto javuto commented Oct 6, 2017

to @ryandeivert
cc @mime-frame
resolves: #347

Change

Adds a new integration application for OneLogin, which allows StreamAlert to collect and process OneLogin events. The schema for the OneLogin events format is also added here.
The app generates tokens from the API client_id/client_secret pair and paginates requests to the API when the number of events returned (using since) exceeds the limit of 50. More information here: https://developers.onelogin.com/api-docs/1/events/get-events

Also added a new rule that alerts whenever a user assumes a different role within OneLogin.

Testing

Unit tests are added for the class:

$ ./tests/scripts/unit_tests.sh
...
OneLoginApp - Gather Events Entry Point ... ok
OneLoginApp - Generate Headers, ... ok
OneLoginApp - Get OneLogin Events, Bad Response ... ok
OneLoginApp - Get OneLogin Events, No Headers ... ok
OneLoginApp - Get Paginated Events ... ok
OneLoginApp - Get OneLogin Paginated Events, Bad Response ... ok
OneLoginApp - Required Auth Info ... ok
OneLoginApp - Sleep Seconds ... ok
OneLoginApp - Verify Events Endpoint ... ok
OneLoginApp - Verify Token Endpoint ... ok
OneLoginApp - Verify Events Type ... ok
...

@ghost

ghost commented Oct 6, 2017

42.56s$ ./tests/scripts/pylint.sh
************* Module app_integrations.apps.onelogin
W: 22, 0: Unused app imported from app_integrations.apps.app_base (unused-import)


def _sleep_seconds(self):
"""Return the number of seconds this polling function should sleep for
between requests to avoid failed requests. OneLogin tokens allows for 5000 requests

Just to confirm: 5000/hour = ~83/min = ~1.3/sec - there's no way we'll accidentally exceed this?

Spoke offline, Javier will intentionally hit the limit locally so he can appropriately catch the exception and throw a useful error if it occurs

Contributor Author

@javuto javuto Oct 6, 2017

I need to do some testing in the actual API to see if we can just generate a new token once we exceed the maximum number of requests per hour (5000) per token. Although I doubt they would allow such a simple workaround to their rate limit. In any case I can add the code to have a safe sleep of 2 seconds per request.

Contributor

Agreed that additional testing should be done to best determine how this should be handled.
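
For reference, the arithmetic behind the limit discussed in this thread can be sketched roughly as follows. Only the 5000 requests/hour figure and the proposed 2-second sleep come from the comments above; the helper names are hypothetical and are not part of this PR.

```python
# Hypothetical helpers for reasoning about the per-token rate limit.
# Only the 5000/hour limit and the 2 second sleep come from the thread.
SECONDS_PER_HOUR = 3600
REQUESTS_PER_HOUR_LIMIT = 5000


def min_sleep_seconds(limit_per_hour=REQUESTS_PER_HOUR_LIMIT):
    """Smallest fixed delay between requests that keeps one token under the limit."""
    return SECONDS_PER_HOUR / float(limit_per_hour)


def max_requests_per_hour(sleep_seconds):
    """Upper bound on requests made in one hour with this delay between them."""
    return int(SECONDS_PER_HOUR // sleep_seconds)
```

Under these assumptions, any fixed sleep of at least 0.72 seconds stays below the limit, so the proposed 2-second sleep (at most 1800 requests per hour) leaves a wide margin.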

every hour, so returning 0 for now.

Returns:
int: Number of seconds that this function shoud sleep for between requests

should

def __init__(self):
self._app = None

# Remove all abstractmethods so we can instantiate DuoApp for testing

OneLoginApp

@ghost

ghost commented Oct 6, 2017

Since @ryandeivert is out, @austinbyers and @jacknagz can you take a look?

@coveralls

Coverage decreased (-0.7%) to 94.038% when pulling f78b963 on javuto-streamalert-onelogin-app into 82fd1a4 on master.

@javuto
Contributor Author

javuto commented Oct 6, 2017

Just added a new rule, onelogin_events_assumed_role and tested it:

$ python manage.py lambda test --processor rule --rules onelogin_events_assumed_role
StreamAlertCLI [INFO]: Issues? Report here: https://github.com/airbnb/streamalert/issues

onelogin_events_assumed_role
       [Pass]  [trigger=1]                   rule      (stream_alert_app): OneLogin generated event when a user assumed a different role, it should alert
       [Pass]  [trigger=0]                   rule      (stream_alert_app): OneLogin generated event when a user do any other action, it should not alert



StreamAlertCLI [INFO]: (2/2) Successful Tests
StreamAlertCLI [INFO]: Completed

Schemas also validated:

$ python manage.py validate-schemas
...
onelogin_events_assumed_role
       [Pass]  [log='onelogin:events']       validation  (stream_alert_app): OneLogin generated event when a user assumed a different role, it should alert
       [Pass]  [log='onelogin:events']       validation  (stream_alert_app): OneLogin generated event when a user do any other action, it should not alert
...
StreamAlertCLI [INFO]: (27/27) Successful Tests
StreamAlertCLI [INFO]: Completed

@coveralls

Coverage decreased (-0.6%) to 94.188% when pulling 623154e on javuto-streamalert-onelogin-app into 82fd1a4 on master.

Contributor

@ryandeivert ryandeivert left a comment

First round of comments - if you want to chat about any of these in more detail let me know! Also, great start! This will be great to have 💯

Returns:
str: Bearer token to be used to call the OneLogin resource APIs
"""
authorization = 'client_id: %s, client_secret: %s' % (client_id, client_secret)
Contributor

We tend to use str.format(...) instead of the older percent formatting. I'd suggest changing this to:

'client_id: {}, client_secret: {}'.format(client_id, client_secret)

Same for the bearer... formatting below.
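
A minimal sketch of the suggested change, using the header layout quoted in the snippet above. The function name `token_request_headers` is hypothetical; the PR builds these headers inside the app class.

```python
def token_request_headers(client_id, client_secret):
    """Build the token request headers using str.format() instead of % formatting."""
    authorization = 'client_id: {}, client_secret: {}'.format(client_id, client_secret)
    return {'Authorization': authorization,
            'Content-Type': 'application/json'}
```

Both formatting styles produce the same string here, so this is a style change only.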

headers_token = {'Authorization': authorization,
'Content-Type': 'application/json'}

response = requests.post(token_url,
Contributor

I anticipate potentially needing post abilities in other future apps as well. Could we move this functionality to the AppIntegration base class in app_base.py with a method name of something like _make_post_request and change the _make_request method to _make_get_request? By doing this, the new method could just return False or the JSON from the response (similar to how _make_request works now).

If this is too much right now, this can happen at a later date - just a thought!

Contributor Author

Should I also refactor the duo application to use the new _make_get_requests in this PR?

Contributor

Yes please! :)
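
One way the proposed split into GET and POST helpers might look is sketched below. This is an illustration under assumptions, not the code in app_base.py: the class name, the injected `http` object, the `data` argument name, and the "200 means success" check are all hypothetical. The `http` object only needs `get()` and `post()` with the same keyword arguments as the `requests` module.

```python
class AppIntegrationSketch(object):
    """Illustrative stand-in for the AppIntegration base class (not the real one)."""

    def __init__(self, http):
        # `http` is any object exposing get()/post() like the `requests` module
        self._http = http

    @staticmethod
    def _check_http_response(response):
        """Assumption for this sketch: only a 200 status counts as success."""
        return response is not None and response.status_code == 200

    def _make_get_request(self, full_url, headers, params):
        """Return False on failure, or the dictionary loaded from the JSON response."""
        response = self._http.get(full_url, headers=headers, params=params)
        if not self._check_http_response(response):
            return False
        return response.json()

    def _make_post_request(self, full_url, headers, data):
        """Same contract as the GET helper, but sends `data` as a JSON body."""
        response = self._http.post(full_url, headers=headers, json=data)
        if not self._check_http_response(response):
            return False
        return response.json()
```

With this shape, subclasses such as the Duo and OneLogin apps would call the helpers and only need to handle a `False` result or a dictionary.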

def _gather_logs(self):
"""Gather the authentication log events."""

if not self._auth_headers:
Contributor

++ nice!

"""
# OneLogin API expects the ISO 8601 format: YYYY-MM-DDTHH:MM:SSZ
formatted_date = datetime.utcnow().strftime('%Y-%m-%dT%H:%M:%SZ')
params = {'since': formatted_date}
Contributor

So this is going to actually only get events since the current time. This will likely result in events for only a few seconds or less (the time between setting this date and actually querying their API for the events).

You'll want to use the self._last_timestamp for this param, but we'll have to reason about this more since you need a formatted time and not a unix/integer timestamp
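
A possible conversion from an integer timestamp to the format quoted in the snippet above is sketched here. It assumes `self._last_timestamp` holds seconds since the Unix epoch in UTC; the function name `format_since` is hypothetical, and the sketch uses Python 3's `timezone.utc`.

```python
from datetime import datetime, timezone


def format_since(last_timestamp):
    """Convert an integer Unix timestamp (UTC) to the YYYY-MM-DDTHH:MM:SSZ form."""
    moment = datetime.fromtimestamp(int(last_timestamp), tz=timezone.utc)
    return moment.strftime('%Y-%m-%dT%H:%M:%SZ')
```

The result could then be passed as the `since` parameter, for example `params = {'since': format_since(last_timestamp)}`. Converting explicitly in UTC matters because a local-time conversion would shift the window by the host's UTC offset.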


events = self._get_onelogin_paginated_events(params)

while self._more_to_poll and self._next_page_url:
Contributor

I actually think we should avoid doing the pagination directly within the subclass and just let the while within the base classes gather handle this. We can talk more about the requirements for this.

The main thought process here is that this while loop could run for a good amount of time, and the subclassed apps don't have a concept of when the lambda function is getting close to timing out. By letting the looping happen in the base class, where there are some checks in place to detect when the timeout is near, we can operate with a little more safety.
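
The idea could be sketched roughly as below. This is not the actual base class: the class name, the `remaining_ms` callable, and the 10-second buffer are assumptions made for illustration. The subclass fetches one page per call and the base class decides whether there is enough time left to fetch another.

```python
class PollingBaseSketch(object):
    """Illustrative base class that owns the polling loop and the timeout check."""

    def __init__(self, remaining_ms, buffer_ms=10000):
        # remaining_ms: zero-argument callable returning the milliseconds left
        self._remaining_ms = remaining_ms
        self._buffer_ms = buffer_ms
        self._more_to_poll = False

    def _gather_logs(self):
        """Subclasses fetch ONE page here and update self._more_to_poll."""
        raise NotImplementedError

    def gather(self):
        """Poll page by page until done, or until the time left is too short."""
        collected = []
        while True:
            logs = self._gather_logs()
            if not logs:
                break
            collected.extend(logs)
            if not self._more_to_poll:
                break
            if self._remaining_ms() <= self._buffer_ms:
                # Stop early; the next invocation can resume from the saved state
                break
        return collected
```

In a Lambda handler, the context object's `get_remaining_time_in_millis` method could be passed as `remaining_ms`.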

return events

def required_auth_info(self):
return {
Contributor

As we spoke about offline, please add the localization (us/eu) to these so they can be user-provided.

return {
'client_secret':
{
'description': ('the client secret for the OneLogin API. This '
Contributor

Nice work with these!


Returns:
The same as the method _get_onelogin_events()
"""
response = requests.get(self._next_page_url, headers=self._auth_headers, params=params)
Contributor

Can the base class' _make_request be used in this instance, since it will do the response check and automatically return the json response value?


class OneLoginApp(AppIntegration):
"""OneLogin StreamAlert App"""
_ONELOGIN_EVENTS_URL = 'https://api.us.onelogin.com/api/1/events'
Contributor

As we spoke about offline, switch to using a url that can be formatted with a region/localization (us/eu) for this link and the token link.
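
A small sketch of a region-formatted URL, derived from the hard-coded US URL in the snippet above. The template constant and the `events_endpoint` function shown here are illustrative and may differ from what the PR ends up using.

```python
# Template derived from the hard-coded US URL quoted above (illustrative only)
ONELOGIN_EVENTS_URL = 'https://api.{}.onelogin.com/api/1/events'


def events_endpoint(region):
    """Return the events URL for a user-provided region such as 'us' or 'eu'."""
    return ONELOGIN_EVENTS_URL.format(region.lower())
```

The same pattern would apply to the token URL, with the region supplied through the app's auth info.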

@coveralls

Coverage decreased (-0.6%) to 94.191% when pulling 52cf350 on javuto-streamalert-onelogin-app into 82fd1a4 on master.

@coveralls

Coverage decreased (-0.6%) to 94.097% when pulling 167b59e on javuto-streamalert-onelogin-app into 82fd1a4 on master.

Contributor

@ryandeivert ryandeivert left a comment

Hey Javier - overall great revisions! Some additional comments

bool or dict: False if the was an error performing the request,
or a dictionary loaded from the json response
"""
LOGGER.debug('Making request for service \'%s\' on poll #%d',
Contributor

Can you add post to this logger message somewhere, i.e. Making post request for...

@@ -222,6 +222,24 @@ def _make_request(self, full_url, headers, params):

return response.json()

def _make_post_request(self, full_url, headers, json):
Contributor

Can we choose a different name from json for the variable, since that conflicts with the json package?

@@ -115,7 +115,7 @@ def _get_duo_logs(self, hostname, full_url):
return False

# Make the request to the api, resulting in a bool or dict
- response = self._make_request(full_url, headers=headers, params=params)
+ response = self._make_get_request(full_url, headers=headers, params=params)
Contributor

Thanks for this!

{
'description': ('the region for the OneLogin API. This should be a '
'string of 2 letters, lowercase or uppercase'),
'format': re.compile(r'^[a-zA-Z]{2}$')
Contributor

can we restrict this to eu or us since those are the only regions OneLogin supports?
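
One possible tighter pattern is sketched below. It keeps the original behaviour of accepting either case; the constant and function names are hypothetical.

```python
import re

# Accept only the two regions named in this review, in either case
REGION_FORMAT = re.compile(r'^(us|eu)$', re.IGNORECASE)


def is_valid_region(value):
    """Return True when the value is 'us' or 'eu', ignoring case."""
    return bool(REGION_FORMAT.match(value))
```

If the value is later interpolated into a URL, it may be worth lowercasing it first so the hostname is consistent.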

# Return the list of logs to the caller so they can be send to the batcher
return events
if not response:
events = False
Contributor

can we just return False here instead? the line below (151) could result in a KeyError

@coveralls

Coverage increased (+0.1%) to 94.845% when pulling c02c975 on javuto-streamalert-onelogin-app into 82fd1a4 on master.

@ghost

ghost commented Oct 12, 2017

From CI:

StreamAlertCLI [ERROR]: (2/27) Failures
StreamAlertCLI [ERROR]: (1/2) Detected old format for test event in file 'onelogin_events_assumed_role.json'. Please visit https://streamalert.io/rule-testing.html for information on the new format and update your test events accordingly.
StreamAlertCLI [ERROR]: (2/2) Detected old format for test event in file 'onelogin_events_assumed_role.json'. Please visit https://streamalert.io/rule-testing.html for information on the new format and update your test events accordingly.

^ You need to update your tests as @ryandeivert merged his changes. You'll need to specify logs and trigger_rules, here's an example: https://github.com/airbnb/streamalert/pull/372/files#diff-0e3d1b3c864f17c2c1e876b811c33590

Full details are here: https://streamalert.io/rule-testing.html

@javuto
Contributor Author

javuto commented Oct 16, 2017

@ryandeivert I have added a significant update to the PR that will probably require another round of code review. It includes the update to the test format that @mime-frame pointed out above, the refactoring of get/post requests to return a tuple, and handling of the API rate limit sleep, more info here.

Thanks guys!

@coveralls

Coverage decreased (-0.05%) to 94.62% when pulling f5e866d on javuto-streamalert-onelogin-app into 3a22d5c on master.

params = {'since': formatted_date}
request_url = self._events_endpoint()

LOGGER.debug('Events to retrieve for \'%s\': %s', self.type(), self._more_to_poll)
Contributor

hey @javuto - I added a logging message similar to this in the loop within the base class:

LOGGER.debug('More logs to poll for \'%s\': %s', self.type(), self._more_to_poll)

You can probably safely remove this, or change it to add more context on what's happening at this point in the code.

Contributor Author

Per our offline chat, this logging message is redundant with the one added in the base class. I will remove it.

LOGGER.debug('Events to retrieve for \'%s\': %s', self.type(), self._more_to_poll)
result, response = self._make_get_request(request_url, self._auth_headers, params)

if not result:
Contributor

You may want to make sure response is valid here as well, since it could be None? Below you may also want to change your lookups to response.get('status'), etc. in case the response is not in a format we expect (just for safety).

Contributor

Maybe do this:

if not (result or response):
   return False
elif not result and response:
   <do your logic here>

Contributor

^^ This might be unnecessary - I'll let you decide :)

Contributor Author

I think this approach adds some unnecessary complexity to the code. I prefer checking result first, then checking response. It keeps the code easy to read and the flow easy to follow.

self._more_to_poll = (self._next_page_url is not None)

# Adjust the last seen event
self._last_timestamp = response['data'][-1]['created_at']
Contributor

Let's add a safety check here to make sure the data list actually has logs:

logs = response['data']
if not logs:
    return False

self._last_timestamp = logs[-1]['created_at']

# Return the list of logs to the caller so they can be send to the batcher
return logs


# Set pagination link, if there is any
self._next_page_url = response['pagination']['next_link']
self._more_to_poll = (self._next_page_url is not None)
Contributor

Tip (not a necessary change): you can use bool() to perform this logic:

self._more_to_poll = bool(self._next_page_url)

Returns:
int: Number of seconds that this function should sleep for between requests
"""
return self._rate_limit_sleep
Contributor

++ Nice work on this logic!

Contributor

@ryandeivert ryandeivert left a comment

@javuto I added a few comments that might need to be addressed, but I'm giving you the 🚀

@@ -204,23 +204,35 @@ def _check_http_response(self, response):

return success

def _make_request(self, full_url, headers, params):
"""Method for returning the json loaded response for this request
def _make_get_request(self, full_url, headers, params):
Contributor

Let's make the params arg be an optional that defaults to None:

def _make_get_request(self, full_url, headers, params=None): ...

And then update your call here so it doesn't have to pass None explicitly

@coveralls

Coverage increased (+0.04%) to 94.711% when pulling d7585ad on javuto-streamalert-onelogin-app into 3a22d5c on master.


# Adjust the last seen event
- self._last_timestamp = response['data'][-1]['created_at']
+ events = response.get('data')
+ self._last_timestamp = events[-1]['created_at']
Contributor

Good changes here, except you aren't checking to see if the events list is empty (events[-1] could cause an index out of range error if no events are returned). You can also probably drop the get( above and just use response['data'] since the key will exist even if it is an empty list.

Contributor

What I'm worried about is this case:

{
    "status": {
        "error": false,
        "code": 200,
        "type": "success",
        "message": "Success"
    },
    "pagination": {
        "before_cursor": null,
        "after_cursor": "xWNjb3VudF9pZDo6OjUzNDEzLS0jI2lkOjo6OTA0MjU3NTQ2",
        "previous_link": null,
        "next_link": null
    },
    "data": []
}

Contributor Author

I see, I will handle this case so it will be safe if it appears
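
A defensive way to handle the empty-list case is sketched below. The function name and the returned tuple are hypothetical; the field names (`data`, `created_at`, `pagination`, `next_link`) are taken from the response and snippets quoted in this review.

```python
def process_events_response(response):
    """Return (events, last_timestamp, next_page_url), or False if nothing is usable."""
    if not response:
        return False

    events = response.get('data')
    if not events:
        # Covers both a missing 'data' key and the empty list shown above
        return False

    # 'pagination' may be absent or null, so fall back to an empty dict
    pagination = response.get('pagination') or {}
    next_page_url = pagination.get('next_link')

    return events, events[-1]['created_at'], next_page_url
```

Returning early on an empty list means `events[-1]` is only evaluated when at least one event exists, so the last seen timestamp is left unchanged when a poll returns nothing.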

@@ -266,7 +266,7 @@ def _determine_last_time(self):
current_time = time.mktime(time.gmtime())
LOGGER.debug('Current timestamp: %s seconds', current_time)

- self.last_timestamp = current_time - interval_time
+ self.last_timestamp = int(current_time - interval_time)
Contributor

Thanks for this fix

@@ -167,7 +184,7 @@ def _get_onelogin_events(self):
# Check the type to understand the format stored
if isinstance(self._last_timestamp, int):
# OneLogin API expects the ISO 8601 format: YYYY-MM-DDTHH:MM:SSZ
- formatted_date = datetime.fromtimestamp(
+ formatted_date = datetime.utcfromtimestamp(
Contributor

++

@@ -96,6 +96,12 @@ def get_auth_info(cls, app_type):
'integration_key': 'DI1234567890ABCDEF12',
'secret_key': 'abcdefghijklmnopqrstuvwxyz1234567890ABCD'
}
elif app_type in {'onelogin'}:
Contributor

👏 👏 for simplification

@coveralls

Coverage increased (+0.04%) to 94.714% when pulling 7bfe013 on javuto-streamalert-onelogin-app into 3a22d5c on master.

@coveralls

Coverage decreased (-0.06%) to 94.616% when pulling ba14c63 on javuto-streamalert-onelogin-app into 80a974b on master.

@javuto javuto merged commit 24b04d5 into master Oct 17, 2017
@javuto javuto deleted the javuto-streamalert-onelogin-app branch October 17, 2017 17:19
@ryandeivert ryandeivert changed the title from "[streamalert][app][onelogin] Adding new integration app for OneLogin" to "[apps][onelogin] Adding new integration app for OneLogin" Oct 18, 2017
@ryandeivert ryandeivert changed the title from "[apps][onelogin] Adding new integration app for OneLogin" to "[apps][onelogin] Adding new integration app for OneLogin Events" Oct 18, 2017
@javuto javuto added this to the 1.6.0 milestone Nov 9, 2017
Successfully merging this pull request may close these issues.

App: Create StreamAlert App for OneLogin Events API
3 participants