Skip to content

Conversation

@saurabhbangad
Copy link
Contributor

This script retrieves all audit logs across an OCI Tenancy for a defined timespan.

This script relies on OCI config as per the format defined at
https://docs.us-phoenix-1.oraclecloud.com/Content/API/Concepts/sdkconfig.htm

This script will work at a tenancy level only.

The config file should have the following contents
User OCID
RSA private key in PEM format
Fingerprint
Tenancy OCID

This script retrieves all audit logs across an OCI Tenancy for a defined timespan.

This script relies on OCI config as per the format defined at
https://docs.us-phoenix-1.oraclecloud.com/Content/API/Concepts/sdkconfig.htm

This script will work at a tenancy level only.

The config file should have the following contents
User OCID
RSA private key in PEM format
Fingerprint
Tenancy OCID
@jodoglevy
Copy link
Contributor

Thanks for the contribution, we'll review and get back to you

Copy link
Contributor

@nathan-vu nathan-vu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry if took me a while to take a look at this. I've made some comments against the PR.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Copyright (c) 2016, 2018, Oracle and/or its affiliates. All rights reserved.
'''
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Favour block comments (one # per line) as per PEP8: https://www.python.org/dev/peps/pep-0008/#block-comments rather than the triple quote syntax

This script retrieves all audit logs across an OCI Tenancy
for a defined timespan.

This script relies on OCI config as per the format defined at
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel that lines 8-17, apart from line 11, are superfluous and we don't need to include in every script what the prerequisites for using the SDK are as those are documented elsewhere.


import oci

__author__ = "saurabh.bangad@oracle.com"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We generally do not do individual author attributions in our example scripts


configfile = "~/.oci/config"

config = oci.config.from_file(configfile) # Path to relevant config file
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're going to use the default configuration location, just do oci.config.from_file() and let the SDK work it out

compartments = []
compartments.append(tenancy_id) # Tenancy is the first compartment
regions = [] # This array will be used to store the list of available regions
auditevents = []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use snake casing for variable names. For example this should be audit_events

for the region defined in "config['region']".
'''
listofauditevents = []
for i in range(len(compartments)):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to your other loop, range and len seem unnecessary here - you could just do for c in compartments


'''Here's the main code that retrives audits across the tenancy'''

# Retrieve the list of all the OCI regions.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block comment, and the one on line 123-124, don't seem to add much value since the intent is clear from the function call what you're doing

compartments = get_compartments()

# For a region defined by config['region'] get the logs for each compartment
for i in range(len(regions)):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for r in regions


# Reintialize with a different region value
audit = oci.audit.audit_client.AuditClient(config)
identity = oci.identity.IdentityClient(config)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You never use the IdentityClient which you create here

config['region'] = regions[i]

# Reintialize with a different region value
audit = oci.audit.audit_client.AuditClient(config)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This re-doing of the configuration and then creating new clients per region is not necessary. Since you're not running this across multiple threads, you can just change the client in-place by calling set_region() on it

@saurabhbangad
Copy link
Contributor Author

Thanks @nathan-vu I will update the script with accommodating your feedback.

Accommodated the changes suggested by @nathan-vu

The following is a summary of all the changes:
-file name from RetrieveAuditEvents.py to retrieve_audit_events.py
-instead of hard coding to start_time and end_time, they receive values on execution
-oci pagination is being used instead of custom definition
-removed the len(range(list)) for 
-get_regions() and get_compartments() now accept identity as their parameter
-region is updated with set_region()
-changes to variables name; they use snake_casing
-edited comments
compartments.append(tenancy_id) # Tenancy is the first compartment.
compartments = get_compartments(identity)

audit = oci.audit.audit_client.AuditClient(config)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This re-initialization seems unnecessary - is there any reason it is here?


# ListEvents expects timestamps into RFC3339 format.
# Converting dates into RFC3339 e.g. 2018-01-15T00:00:00Z
end_time = str(today.year) + '-' + str(today.month).zfill(2) \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unnecessary - the list_events call accepts these as Python datetime.datetime. We shouldn't be doing any string manipulation here and should instead rely on the SDK to do this properly for us.

Additionally, you should not use datetime.date here - datetime.datetime has the right precision you (and the API) needs

compartment_ocids = []
list_compartments_response = oci.pagination.list_call_get_all_results(
identity.list_compartments,
compartment_id=config['tenancy']).data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks odd because you're relying on some global which will exist at runtime.

Instead, I think this should be:

def get_compartments(identity, compartment_id):
    compartment_ocids = []
    list_compartments_response = oci.pagination.list_call_get_all_results(
        identity.list_compartments,
        compartment_id=compartment_id
    )
    ....

And then you'd just call it as get_compartments(identity, tenancy_id)

return compartment_ocids


def get_audit_events(audit, compartment_ocids, start_time, end_time, r):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of the r parameter here? If it's not needed them remove it, but if it is needed then give it a more descriptive name

+ str(today_minus_one_month.month).zfill(2) + '-' \
+ str(today_minus_one_month.day) + 'T00:00:00Z'

regions = [] # This array will be used to store the list of available regions.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This initialization is unnecessary

regions = get_regions(identity)

# This array will be used to store the list of compartments in the tenancy.
compartments = []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These three lines have a potential problem (depending on whether listing all the compartments also returns the tenancy - I don't remember if it does or doesn't). The issues is that if it doesn't return the tenancy then line 89 (compartments.append(tenancy_id)) is useless because the array is overwritten by what happens on line 90.

This should either be:

compartments = get_compartments(identity)
compartments.append(config[tenancy])

If it doesn't return the tenancy, or just compartments = get_compartments(identity, tenancy_id) (i.e. one line instead of 3) if it does return the tenancy



def get_regions(identity):
'''To retrieve the list of all available regions.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to do this kind of comment, then our style in the SDK would be:

"""
To retrieve the list of all available regions
"""

# Intialize with a region value.
audit.base_client.set_region(r)
# To separate results by region use print here.
audit_events = get_audit_events(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is just for demo/example then this is probably OK. In real life, this would probably be very time-consuming because a customer has to sit there until everything is done (as get_audit_events is eagerly loading all results) for each region.

If a customer really did want to slurp all the audit entries, there'd be better ways to do that. But as an example of how to use the API, probably fine

The following were the changes made in the latest version:
-start_time and end_time do not do any str operations and rely on datetime.datetime
-logs are retrieved only for the timespan of last 5 days instead of a month; this would be a better experience considering the time retrieve results
-changes to get_compartments; it will accept tenancy_id as its parameter and it will also be the added to the list even before ListCompartments api call is made
-unnecessary initializations were removed
-change of comment style for the description of functions
-parameter 'r' was removed from get_regions() which was originally put to demonstrate the region being used

Changes not made:
-No changes were made to the retrieval method i.e. it is still eagerly loading all the results
# For example, to filter write events make use of
# the following condition
# list_events_response.data.request_action != 'GET'
list_of_audit_events.append(list_events_response)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be extend and not append, otherwise you'll end up with an array of arrays



import oci
from datetime import date
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused import

import oci
from datetime import date
import datetime
from dateutil.relativedelta import relativedelta
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused import

# Events can be filtered here based on some constraints.
# For example, to filter write events make use of
# the following condition
# list_events_response.data.request_action != 'GET'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is incorrect as list_events_response.data is an array so so the statement would fail at runtime.

Additionally, if you want to tell customers about filtering, it's not going to be the most efficient if they eager load the entire result set and then filter - it would be better if they use one of the methods in oci.pagination which returns a generator and results can be filtered as they are available/requested rather than all at once at the end.

If you wish to demonstrate this scenario, I would recommend having a separate method which shows customers how to do this properly rather than just as a comment in the code.

Copy link
Contributor

@nathan-vu nathan-vu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall fine - some comment and documentation changes to make.

I also have a reasonable suspicion this won't pass linting checks (though most issues will likely be formatting related rather than functional). But someone on this end can fix up any issues from that.

return list_of_audit_events


'''Here's the code that retrives audits across the tenancy'''
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary comment - please remove

'''
list_of_audit_events = []
for c in compartment_ocids:
# To separate results by compartments use print here.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment no longer makes sense - remove


def get_audit_events(audit, compartment_ocids, start_time, end_time):
'''
Get events iteratively for each compartment defined in 'compartments' array
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment needs to be updated - there isn't a compartments array and config['region'] is not used here. Per my comment in a previous review, we should emphasise here that this eagerly loads all audit records in the time range and that customers should be aware of the performance implications of that if they have a lot of audit records.

Customers can more lazily load results by using the generator methods in oci.pagination.

Updated the following sections' comments:
-get_audit_events
@nathan-vu nathan-vu merged commit 7062f9e into oracle:master Feb 20, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants