-
Notifications
You must be signed in to change notification settings - Fork 276
Added resource_name column to all the scanners, added schema update handles in scanner dao #1864
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding this, it's awesome to have a way to make the schema changes. Just a comment on where the update is run.
if callable(schema_update) and subclass.__tablename__ in tables: | ||
LOGGER.info('Updating table %s', subclass.__tablename__) | ||
# schema_update will require the Table object. | ||
schema_update(tables.get(subclass.__tablename__)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed, it would be awesome if the running of the schema update can be moved outside of the forseti application, so that the db update operation doesn't have to run everytime the forseti application starts up, and users can have the chance to backup their database before their db schema is actually updated. This will also require a documentation change to our upgrade instruction as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great.... where is the change in the startup script, that will call the db_migrator.py?
table (Table): The table object of this class. | ||
""" | ||
col = Column('resource_name', String(256), default='') | ||
create_column(table, col) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if this is called again when the resource_name
column already exists? Should there be error handling or logging?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added error handling in db_migrator.py.
@@ -0,0 +1,62 @@ | |||
import sys |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about calling this db_migrator.py
for a proper noun?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed to db_migrator.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great overall, some small comments.
|
||
echo "Attempting to upgrade database." | ||
python $USER_HOME/forseti-security/install/gcp/upgrade_tools/db_migrator.py | ||
sleep 5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why sleep 5
.... can it take more than 5 seconds?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The migrator script should block until the upgrade is finished, removed the sleep statement.
@@ -193,6 +193,11 @@ def GenerateConfig(context): | |||
echo "Starting services." | |||
systemctl start cloudsqlproxy | |||
sleep 5 | |||
|
|||
echo "Attempting to upgrade database." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would be clear that this updating the schema
, and add if necessary
at the end.
echo "Attempting to update database schema. if necessary."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
LOGGER.info('Failed to update db schema, table=%s', | ||
subclass.__tablename__) | ||
except Exception as e: | ||
LOGGER.error(e) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about using LOGGER.exception() here?
LOGGER.exception('Unexpected error happened when attempting to update database schema.')
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
try: | ||
schema_update(tables.get(subclass.__tablename__)) | ||
except OperationalError: | ||
LOGGER.info('Failed to update db schema, table=%s', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be LOGGER.error(), rather than info()
If this is OperationalError
, can we be more specific in the log message what might have happened?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is expected when the table already has that column/column already deleted, I think info makes more sense so user won't need to worry about seeing the message at error level. Added log message in the update_schema method so we know what is failing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no log message in the update_schema()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is an info log in the update_schema method - LOGGER.info('Attempting to create column: %s', col_name)
LOGGER.info('Failed to update db schema, table=%s', | ||
subclass.__tablename__) | ||
except Exception as e: | ||
LOGGER.error(e) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we also add the table name here? Or is that already in the stacktrace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
@@ -183,6 +188,16 @@ def __repr__(self): | |||
return string.format( | |||
self.violation_type, self.resource_type, self.rule_name) | |||
|
|||
@staticmethod | |||
def schema_update(table): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Method name should start with verb first. So, update_schema()
, which is also consistent with migrate_schema()
. Or is there a reason to call this schema_update()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed to update_schema
pool_recycle=3600) | ||
|
||
# Upgrade Scanner tables. | ||
migrate_schema(sql_engine, scanner_dao.BASE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we do this? Which will make it easier to add more tables in the future?
declaritive_bases = [scanner_dao.BASE, inventory_dao.BASE]
for base in declaritive_bases:
migrate_schema(sql_engine, base)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
… into add_resource_name_col
Hi @ahoying , can you take a look at this PR again? This would be the final version if no new bugs were found. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't review the content of every resource_name field in the scanners to ensure that was the best value for the field, but the code LGTM.
@@ -193,6 +193,10 @@ def GenerateConfig(context): | |||
echo "Starting services." | |||
systemctl start cloudsqlproxy | |||
sleep 5 | |||
|
|||
echo "Attempting to update database schema. if necessary." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Super nit, use a comma after schema, not period: "schema, if necessary"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing that out, that should definitely be a comma, updated!
… into add_resource_name_col
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of quick comments.
Args: | ||
table (Table): The table object of this class. | ||
""" | ||
col_name = 'resource_name' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
qq, in the future, as we have more columns to add, how would they be handled? This time, we want to add resource_name
, next time we want to add xyz
. It's hard to see how xyz
would fit here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You will add the column as an attribute to the class and add the column creation logic in this method,
i.e.
class Violation(BASE):
xyz = Column(...)
def update_schema(table):
col = Column('resource_name', String(256), default='')
col.create(table, populate_default=True)
col = Column('xyz', String(256), default='')
col.create(table, populate_default=True)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, that's what I thought. I think you should make col_name
into a list of tuples [(name, type, default), ....]
And then just loop?
Or else just hardcode the name into the Column(), instead of having a variable col_name
to track it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to hard code the column name
try: | ||
schema_update(tables.get(subclass.__tablename__)) | ||
except OperationalError: | ||
LOGGER.info('Failed to update db schema, table=%s', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no log message in the update_schema()
""" | ||
col = Column('resource_name', String(256), default='') | ||
LOGGER.info('Attempting to create column: %s', col.name) | ||
col.create(table, populate_default=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the pattern is to add the column as you mentioned, then should we surround try/except around each col.create()
? Otherwise, when adding xyz
in the future, would the execution reach xyz
if resource_name
already exists?
col = Column('resource_name', String(256), default='')
col.create(table, populate_default=True)
col = Column('xyz', String(256), default='')
col.create(table, populate_default=True)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For support purposes, will we ever need to know which columns are updated in which forseti versions?
Returns: | ||
dict: A mapping of Action: Column. | ||
""" | ||
columnMapping = {'CREATE': Column('resource_name', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if we want to add more columns in the future? Shouldn't we use a list?:
columnMapping = {'CREATE', [column1, column2, ..]}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
Returns: | ||
dict: A mapping of Action: Column. | ||
""" | ||
columnMapping = {'CREATE': Column('resource_name', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why camel casing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
CREATE = 'CREATE' | ||
|
||
|
||
def create_col(table, col): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we avoid abbreviation here?
def create_column(table, column):
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
|
||
Args: | ||
table (Table): The table object. | ||
col (Column): The column object.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you be more specific on the Column
and Table
object's type here? Other than just 'Column` and Tables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
col.create(table, populate_default=True) | ||
|
||
|
||
def drop_col(table, col): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def drop_column(table, column):
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated.
# The format of tables is: {table_name: Table object}. | ||
tables = base.metadata.tables | ||
|
||
schema_update_method_name = 'update_schema' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Be consistent with the update_schema()
method, so update_schema_method_name
would be better.
update_schema_method_name = 'update_schema'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated.
schema_update_method_name = 'update_schema' | ||
|
||
for subclass in base_subclasses: | ||
schema_update = getattr(subclass, schema_update_method_name, None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better as:
update_schema = getattr(subclass, update_schema_method_name, None)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated.
# schema_update will require the Table object. | ||
try: | ||
table = tables.get(subclass.__tablename__) | ||
columnMapping = schema_update() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, why camel casing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
@@ -183,6 +184,18 @@ def __repr__(self): | |||
return string.format( | |||
self.violation_type, self.resource_type, self.rule_name) | |||
|
|||
@staticmethod | |||
def update_schema(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method does not do any update. It would be better to be called as get_update_actions()
or get_migrate_actions()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
for columnAction, column in columnMapping.iteritems(): | ||
columnAction = columnAction.upper() | ||
if columnAction in ColumnActionMap: | ||
ColumnActionMap.get(columnAction)(table, column) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should there be a try/except
around this line, so that if a column exists, then it can move to the next column?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
… into add_resource_name_col
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few more nits here.
Returns: | ||
dict: A mapping of Action: Column. | ||
""" | ||
column_mapping = {'CREATE': [Column('resource_name', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we move the list out, so that it's easier to read?
columns_to_create = [
Column('resource_name', String(256), default=''),
Column('abc', String(256), default=''),
Column('xyz', String(256), default=''),
]
column_mapping = {
'CREATE': columns_to_create,
DELETE: columns_to_delete,
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
Returns: | ||
dict: A mapping of Action: Column. | ||
""" | ||
column_mapping = {'CREATE': [Column('resource_name', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of column_mapping
, it would document better if the naming is schema_update_mapping
or schema_action_mapping
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
schema_update_actions_method = 'get_schema_update_actions' | ||
|
||
for subclass in base_subclasses: | ||
update_actions = getattr(subclass, schema_update_actions_method, None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line here needs to be consistent with the naming comment above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
… into add_resource_name_col
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, just one more nit.
LOGGER.info('Updating table %s', subclass.__tablename__) | ||
# schema_update will require the Table object. | ||
table = tables.get(subclass.__tablename__) | ||
schema_update_mapping = get_schema_update_actions() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one more nit: this line is a bit inconsistent. We should make it read more smoothly as:
schema_update_actions = get_schema_update_actions()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
Codecov Report
@@ Coverage Diff @@
## dev #1864 +/- ##
==========================================
- Coverage 89.27% 89.25% -0.03%
==========================================
Files 162 162
Lines 12177 12182 +5
==========================================
+ Hits 10871 10873 +2
- Misses 1306 1309 +3
|
Codecov Report
@@ Coverage Diff @@
## dev #1864 +/- ##
==========================================
- Coverage 89.33% 89.31% -0.03%
==========================================
Files 162 162
Lines 12212 12217 +5
==========================================
+ Hits 10910 10912 +2
- Misses 1302 1305 +3
|
1 similar comment
Codecov Report
@@ Coverage Diff @@
## dev #1864 +/- ##
==========================================
- Coverage 89.33% 89.31% -0.03%
==========================================
Files 162 162
Lines 12212 12217 +5
==========================================
+ Hits 10910 10912 +2
- Misses 1302 1305 +3
|
Resolves #1828 , #109