Added resource_name column to all the scanners, added schema update handles in scanner dao #1864

joecheuk · 2018-08-07T23:52:04Z

blueandgold

Thanks for adding this, it's awesome to have a way to make the schema changes. Just a comment on where the update is run.

blueandgold · 2018-08-24T22:15:15Z

google/cloud/forseti/services/scanner/dao.py

+        if callable(schema_update) and subclass.__tablename__ in tables:
+            LOGGER.info('Updating table %s', subclass.__tablename__)
+            # schema_update will require the Table object.
+            schema_update(tables.get(subclass.__tablename__))


As discussed, it would be awesome if the schema update could be moved outside of the Forseti application, so that the db update operation doesn't have to run every time the Forseti application starts up, and users have the chance to back up their database before the db schema is actually updated. This will also require a documentation change to our upgrade instructions.

blueandgold

This is great.... where is the change in the startup script, that will call the db_migrator.py?

blueandgold · 2018-08-27T15:42:05Z

google/cloud/forseti/services/scanner/dao.py

+            table (Table): The table object of this class.
+        """
+        col = Column('resource_name', String(256), default='')
+        create_column(table, col)


What happens if this is called again when the resource_name column already exists? Should there be error handling or logging?

Added error handling in db_migrator.py.
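For illustration, one way to make a column addition safe to run twice is to check for the column before creating it. This is a minimal sketch that uses the standard-library sqlite3 module instead of the SQLAlchemy setup in this PR; add_column_if_missing and the violations table are hypothetical names, not code from db_migrator.py.

```python
import sqlite3


def add_column_if_missing(conn, table, column, ddl_type="TEXT", default="''"):
    """Add `column` to `table` unless it already exists.

    Returns True if the column was added, False if it was already present.
    Table and column names are interpolated directly, so they must come
    from trusted code, never from user input.
    """
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    conn.execute(
        f"ALTER TABLE {table} ADD COLUMN {column} {ddl_type} DEFAULT {default}")
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE violations (id INTEGER PRIMARY KEY)")

first = add_column_if_missing(conn, "violations", "resource_name")
second = add_column_if_missing(conn, "violations", "resource_name")
columns = [row[1] for row in conn.execute("PRAGMA table_info(violations)")]
```

The second call is a no-op, so the helper can be run on every upgrade without raising. Catching the database error, as the PR does, is the alternative when checking first is not practical.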

blueandgold · 2018-08-27T17:47:01Z

install/gcp/upgrade_tools/db_migrate.py

@@ -0,0 +1,62 @@
+import sys


How about calling this db_migrator.py for a proper noun?

Renamed to db_migrator.py

codecov · 2018-08-28T20:16:38Z

Codecov Report

Merging #1864 into dev will decrease coverage by 0.01%.
The diff coverage is 50%.

@@            Coverage Diff             @@
##              dev    #1864      +/-   ##
==========================================
- Coverage   89.25%   89.24%   -0.02%     
==========================================
  Files         161      161              
  Lines       12004    12008       +4     
==========================================
+ Hits        10714    10716       +2     
- Misses       1290     1292       +2

Impacted Files	Coverage Δ
...oud/forseti/scanner/audit/bigquery_rules_engine.py	`95.83% <ø> (ø)`	⬆️
...forseti/scanner/audit/enabled_apis_rules_engine.py	`88.49% <ø> (ø)`	⬆️
...r/audit/instance_network_interface_rules_engine.py	`92.64% <ø> (ø)`	⬆️
...d/forseti/scanner/audit/ke_version_rules_engine.py	`88.88% <ø> (ø)`	⬆️
...loud/forseti/scanner/audit/buckets_rules_engine.py	`81.92% <ø> (ø)`	⬆️
...seti/scanner/audit/forwarding_rule_rules_engine.py	`92% <ø> (ø)`	⬆️
...oud/forseti/scanner/audit/cloudsql_rules_engine.py	`79.26% <ø> (ø)`	⬆️
...oud/forseti/scanner/audit/log_sink_rules_engine.py	`93.7% <ø> (ø)`	⬆️
...ner/scanners/instance_network_interface_scanner.py	`0% <ø> (ø)`	⬆️
...ud/forseti/scanner/audit/blacklist_rules_engine.py	`93.18% <ø> (ø)`	⬆️
... and 6 more

blueandgold

Looks great overall, some small comments.

blueandgold · 2018-08-28T20:30:43Z

deployment-templates/compute-engine/server/forseti-instance-server.py

+
+echo "Attempting to upgrade database."
+python $USER_HOME/forseti-security/install/gcp/upgrade_tools/db_migrator.py
+sleep 5


Why sleep 5.... can it take more than 5 seconds?

The migrator script should block until the upgrade is finished, removed the sleep statement.

blueandgold · 2018-08-28T20:32:28Z

deployment-templates/compute-engine/server/forseti-instance-server.py

@@ -193,6 +193,11 @@ def GenerateConfig(context):
 echo "Starting services."
 systemctl start cloudsqlproxy
 sleep 5
+
+echo "Attempting to upgrade database."


I would be clear that this is updating the schema, and add "if necessary" at the end.

echo "Attempting to update database schema. if necessary."

blueandgold · 2018-08-28T20:35:58Z

install/gcp/upgrade_tools/db_migrator.py

+                LOGGER.info('Failed to update db schema, table=%s',
+                            subclass.__tablename__)
+            except Exception as e:
+                LOGGER.error(e)


How about using LOGGER.exception() here?

LOGGER.exception('Unexpected error happened when attempting to update database schema.')
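To show the difference the suggestion makes, here is a small sketch using only the standard logging module. The logger name and the in-memory stream are assumptions for the demo; the point is that LOGGER.error(e) records only the message, while LOGGER.exception() also records the traceback.

```python
import io
import logging

# Capture log output in memory so both calls can be compared.
stream = io.StringIO()
LOGGER = logging.getLogger("db_migrator_sketch")  # hypothetical logger name
LOGGER.setLevel(logging.INFO)
LOGGER.propagate = False
LOGGER.addHandler(logging.StreamHandler(stream))

try:
    raise ValueError("boom")
except Exception as e:
    # Logs only str(e); the stack trace is lost.
    LOGGER.error(e)
    error_output = stream.getvalue()

    # Reset the buffer before the second call.
    stream.seek(0)
    stream.truncate(0)

    # Logs at ERROR level and appends the active exception's traceback.
    LOGGER.exception(
        'Unexpected error happened when attempting to update database schema.')
    exception_output = stream.getvalue()
```

LOGGER.exception() is meant to be called from inside an except block, which is where the call sits in the diff above.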

blueandgold · 2018-08-28T20:37:27Z

install/gcp/upgrade_tools/db_migrator.py

+            try:
+                schema_update(tables.get(subclass.__tablename__))
+            except OperationalError:
+                LOGGER.info('Failed to update db schema, table=%s',


This should be LOGGER.error(), rather than info()

If this is OperationalError, can we be more specific in the log message what might have happened?

This is expected when the table already has that column, or the column has already been deleted. I think info makes more sense, so users won't need to worry about seeing the message at error level. I added a log message in the update_schema method so we know what is failing.

There is no log message in the update_schema()

There is an info log in the update_schema method - LOGGER.info('Attempting to create column: %s', col_name)

blueandgold · 2018-08-28T20:38:13Z

install/gcp/upgrade_tools/db_migrator.py

+                LOGGER.info('Failed to update db schema, table=%s',
+                            subclass.__tablename__)
+            except Exception as e:
+                LOGGER.error(e)


Should we also add the table name here? Or is that already in the stacktrace?

blueandgold · 2018-08-28T20:42:33Z

google/cloud/forseti/services/scanner/dao.py

@@ -183,6 +188,16 @@ def __repr__(self):
        return string.format(
            self.violation_type, self.resource_type, self.rule_name)

+    @staticmethod
+    def schema_update(table):


Method name should start with verb first. So, update_schema(), which is also consistent with migrate_schema(). Or is there a reason to call this schema_update()?

Renamed to update_schema

blueandgold · 2018-08-28T20:44:33Z

install/gcp/upgrade_tools/db_migrator.py

+                                           pool_recycle=3600)
+
+    # Upgrade Scanner tables.
+    migrate_schema(sql_engine, scanner_dao.BASE)


Can we do this? Which will make it easier to add more tables in the future?

declarative_bases = [scanner_dao.BASE, inventory_dao.BASE]
for base in declarative_bases:
    migrate_schema(sql_engine, base)
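The loop over bases can be sketched in plain Python. Everything here is a stand-in: ScannerBase and InventoryBase play the role of the declarative bases, migrate_schema is simplified to take only a base (the real one also takes sql_engine), and the use of __subclasses__() to find the table classes is an assumption about how base_subclasses is built.

```python
class ScannerBase:
    """Hypothetical stand-in for scanner_dao.BASE."""


class InventoryBase:
    """Hypothetical stand-in for inventory_dao.BASE."""


class Violation(ScannerBase):
    __tablename__ = "violations"

    @staticmethod
    def update_schema(table_name):
        return "updated " + table_name


class ScannerIndex(ScannerBase):
    # No update_schema method, so the migrator should skip this class.
    __tablename__ = "scanner_index"


class Inventory(InventoryBase):
    __tablename__ = "inventory"

    @staticmethod
    def update_schema(table_name):
        return "updated " + table_name


def migrate_schema(base):
    """Call update_schema on every direct subclass of `base` that defines it."""
    results = []
    for subclass in base.__subclasses__():
        update_schema = getattr(subclass, "update_schema", None)
        if callable(update_schema):
            results.append(update_schema(subclass.__tablename__))
    return results


migrated = []
for base in [ScannerBase, InventoryBase]:
    migrated.extend(migrate_schema(base))
```

Adding another set of tables then only means appending one more base to the list.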

joecheuk · 2018-08-29T17:12:30Z

Hi @ahoying, can you take a look at this PR again? This will be the final version if no new bugs are found.

ahoying

I didn't review the content of every resource_name field in the scanners to ensure that was the best value for the field, but the code LGTM.

ahoying · 2018-08-29T17:18:20Z

deployment-templates/compute-engine/server/forseti-instance-server.py

@@ -193,6 +193,10 @@ def GenerateConfig(context):
 echo "Starting services."
 systemctl start cloudsqlproxy
 sleep 5
+
+echo "Attempting to update database schema. if necessary."


Super nit, use a comma after schema, not period: "schema, if necessary"

Thanks for pointing that out, that should definitely be a comma, updated!

blueandgold

A couple of quick comments.

blueandgold · 2018-08-30T20:49:53Z

google/cloud/forseti/services/scanner/dao.py

+        Args:
+            table (Table): The table object of this class.
+        """
+        col_name = 'resource_name'


qq, in the future, as we have more columns to add, how would they be handled? This time, we want to add resource_name, next time we want to add xyz. It's hard to see how xyz would fit here.

You will add the column as an attribute to the class and add the column creation logic in this method.

i.e.
class Violation(BASE):
    xyz = Column(...)

    def update_schema(table):
        col = Column('resource_name', String(256), default='')
        col.create(table, populate_default=True)
        col = Column('xyz', String(256), default='')
        col.create(table, populate_default=True)

Right, that's what I thought. I think you should make col_name into a list of tuples [(name, type, default), ....]

And then just loop?

Or else just hardcode the name into the Column(), instead of having a variable col_name to track it.

Updated to hard code the column name

blueandgold · 2018-08-30T20:51:20Z

install/gcp/upgrade_tools/db_migrator.py

+            try:
+                schema_update(tables.get(subclass.__tablename__))
+            except OperationalError:
+                LOGGER.info('Failed to update db schema, table=%s',


There is no log message in the update_schema()

blueandgold · 2018-08-30T22:14:10Z

google/cloud/forseti/services/scanner/dao.py

+        """
+        col = Column('resource_name', String(256), default='')
+        LOGGER.info('Attempting to create column: %s', col.name)
+        col.create(table, populate_default=True)


If the pattern is to add the columns as you mentioned, should we wrap each col.create() in its own try/except? Otherwise, when adding xyz in the future, would execution ever reach xyz if resource_name already exists?

col = Column('resource_name', String(256), default='')
col.create(table, populate_default=True)
col = Column('xyz', String(256), default='')
col.create(table, populate_default=True)
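The concern can be made concrete with a short sketch. It uses the standard-library sqlite3 module rather than the SQLAlchemy column objects in this PR, and create_columns is a hypothetical helper. Each column gets its own try/except, so an existing resource_name does not stop xyz from being created.

```python
import sqlite3


def create_columns(conn, table, columns):
    """Try to add each (name, ddl_type) column independently.

    Returns a (created, skipped) pair of column-name lists.
    """
    created, skipped = [], []
    for name, ddl_type in columns:
        try:
            conn.execute(f"ALTER TABLE {table} ADD COLUMN {name} {ddl_type}")
            created.append(name)
        except sqlite3.OperationalError:
            # SQLite raises OperationalError when the column already exists;
            # record it and continue with the next column.
            skipped.append(name)
    return created, skipped


conn = sqlite3.connect(":memory:")
# resource_name already exists, as it would after a previous upgrade.
conn.execute(
    "CREATE TABLE violations (id INTEGER PRIMARY KEY, resource_name TEXT)")

created, skipped = create_columns(
    conn, "violations", [("resource_name", "TEXT"), ("xyz", "TEXT")])
```

With a single try/except around both statements, the first failure would skip everything after it, which is the situation the comment describes.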

blueandgold

For support purposes, will we ever need to know which columns are updated in which forseti versions?

blueandgold · 2018-09-04T21:38:57Z

google/cloud/forseti/services/scanner/dao.py

+        Returns:
+            dict: A mapping of Action: Column.
+        """
+        columnMapping = {'CREATE': Column('resource_name',


What if we want to add more columns in the future? Shouldn't we use a list?:

columnMapping = {'CREATE': [column1, column2, ...]}

blueandgold · 2018-09-04T21:39:47Z

google/cloud/forseti/services/scanner/dao.py

+        Returns:
+            dict: A mapping of Action: Column.
+        """
+        columnMapping = {'CREATE': Column('resource_name',


Why camel casing?

blueandgold · 2018-09-04T21:41:34Z

install/gcp/upgrade_tools/db_migrator.py

+    CREATE = 'CREATE'
+
+
+def create_col(table, col):


Can we avoid abbreviation here?

def create_column(table, column):

blueandgold · 2018-09-04T21:42:39Z

install/gcp/upgrade_tools/db_migrator.py

+
+    Args:
+        table (Table): The table object.
+        col (Column): The column object."""


Can you be more specific about the types of the Column and Table objects here, rather than just 'Column' and 'Table'?

blueandgold · 2018-09-04T21:42:55Z

install/gcp/upgrade_tools/db_migrator.py

+    col.create(table, populate_default=True)
+
+
+def drop_col(table, col):


def drop_column(table, column):

blueandgold · 2018-09-04T21:43:49Z

install/gcp/upgrade_tools/db_migrator.py

+    # The format of tables is: {table_name: Table object}.
+    tables = base.metadata.tables
+
+    schema_update_method_name = 'update_schema'


Be consistent with the update_schema() method, so update_schema_method_name would be better.

update_schema_method_name = 'update_schema'

blueandgold · 2018-09-04T21:45:12Z

install/gcp/upgrade_tools/db_migrator.py

+    schema_update_method_name = 'update_schema'
+
+    for subclass in base_subclasses:
+        schema_update = getattr(subclass, schema_update_method_name, None)


Better as:

update_schema = getattr(subclass, update_schema_method_name, None)

blueandgold · 2018-09-04T21:46:24Z

install/gcp/upgrade_tools/db_migrator.py

+            # schema_update will require the Table object.
+            try:
+                table = tables.get(subclass.__tablename__)
+                columnMapping = schema_update()


Again, why camel casing?

blueandgold · 2018-09-04T21:47:33Z

google/cloud/forseti/services/scanner/dao.py

@@ -183,6 +184,18 @@ def __repr__(self):
        return string.format(
            self.violation_type, self.resource_type, self.rule_name)

+    @staticmethod
+    def update_schema():


This method does not do any update. It would be better named get_update_actions() or get_migrate_actions().

blueandgold · 2018-09-04T21:51:01Z

install/gcp/upgrade_tools/db_migrator.py

+                for columnAction, column in columnMapping.iteritems():
+                    columnAction = columnAction.upper()
+                    if columnAction in ColumnActionMap:
+                        ColumnActionMap.get(columnAction)(table, column)


Should there be a try/except around this line, so that if a column exists, then it can move to the next column?

blueandgold

A few more nits here.

blueandgold · 2018-09-12T23:28:44Z

google/cloud/forseti/services/scanner/dao.py

+        Returns:
+            dict: A mapping of Action: Column.
+        """
+        column_mapping = {'CREATE': [Column('resource_name',


Can we move the list out, so that it's easier to read?

columns_to_create = [
    Column('resource_name', String(256), default=''),
    Column('abc', String(256), default=''),
    Column('xyz', String(256), default=''),
]
column_mapping = {
    'CREATE': columns_to_create,
    DELETE: columns_to_delete,
}
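A mapping of action to column list pairs naturally with a dispatch table on the migrator side. The sketch below is plain Python with hypothetical names throughout: a table is modelled as a list of column names, and COLUMN_ACTION_MAP, the 'DROP' key, and apply_schema_update_actions are illustrative rather than taken from db_migrator.py.

```python
def create_column(table, column):
    """Add `column` to the table (modelled here as a list of names)."""
    table.append(column)


def drop_column(table, column):
    """Remove `column` from the table."""
    table.remove(column)


# Maps an action name to the function that performs it.
COLUMN_ACTION_MAP = {
    'CREATE': create_column,
    'DROP': drop_column,
}


def apply_schema_update_actions(table, schema_update_actions):
    """Apply every known action to each of its columns; ignore unknown actions."""
    for action, columns in schema_update_actions.items():
        handler = COLUMN_ACTION_MAP.get(action.upper())
        if handler is None:
            continue
        for column in columns:
            handler(table, column)


table = ['id', 'old_column']
apply_schema_update_actions(table, {
    'create': ['resource_name', 'xyz'],
    'DROP': ['old_column'],
    'RENAME': ['ignored'],  # no handler registered, so this is skipped
})
```

Keeping the column lists in named variables, as suggested, leaves the mapping itself short enough to read at a glance.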

blueandgold · 2018-09-12T23:30:32Z

google/cloud/forseti/services/scanner/dao.py

+        Returns:
+            dict: A mapping of Action: Column.
+        """
+        column_mapping = {'CREATE': [Column('resource_name',


Instead of column_mapping, it would document better if the naming is schema_update_mapping or schema_action_mapping

blueandgold · 2018-09-12T23:41:36Z

install/gcp/upgrade_tools/db_migrator.py

+    schema_update_actions_method = 'get_schema_update_actions'
+
+    for subclass in base_subclasses:
+        update_actions = getattr(subclass, schema_update_actions_method, None)


This line here needs to be consistent with the naming comment above.

blueandgold

Looks great, just one more nit.

blueandgold · 2018-09-18T18:45:50Z

install/gcp/upgrade_tools/db_migrator.py

+        LOGGER.info('Updating table %s', subclass.__tablename__)
+        # schema_update will require the Table object.
+        table = tables.get(subclass.__tablename__)
+        schema_update_mapping = get_schema_update_actions()


Just one more nit: this line is a bit inconsistent. We should make it read more smoothly as:

schema_update_actions = get_schema_update_actions()

codecov · 2018-09-18T19:09:14Z

Codecov Report

Merging #1864 into dev will decrease coverage by 0.02%.
The diff coverage is 40%.

@@            Coverage Diff             @@
##              dev    #1864      +/-   ##
==========================================
- Coverage   89.27%   89.25%   -0.03%     
==========================================
  Files         162      162              
  Lines       12177    12182       +5     
==========================================
+ Hits        10871    10873       +2     
- Misses       1306     1309       +3

Impacted Files	Coverage Δ
...oud/forseti/scanner/audit/bigquery_rules_engine.py	`92.85% <ø> (ø)`	⬆️
...forseti/scanner/audit/enabled_apis_rules_engine.py	`88.49% <ø> (ø)`	⬆️
...r/audit/instance_network_interface_rules_engine.py	`92.64% <ø> (ø)`	⬆️
...d/forseti/scanner/audit/ke_version_rules_engine.py	`88.88% <ø> (ø)`	⬆️
...loud/forseti/scanner/audit/buckets_rules_engine.py	`81.92% <ø> (ø)`	⬆️
...seti/scanner/audit/forwarding_rule_rules_engine.py	`92% <ø> (ø)`	⬆️
...oud/forseti/scanner/audit/cloudsql_rules_engine.py	`79.26% <ø> (ø)`	⬆️
...oud/forseti/scanner/audit/log_sink_rules_engine.py	`93.7% <ø> (ø)`	⬆️
...ner/scanners/instance_network_interface_scanner.py	`0% <ø> (ø)`	⬆️
...ud/forseti/scanner/audit/blacklist_rules_engine.py	`93.18% <ø> (ø)`	⬆️
... and 6 more

codecov · 2018-09-18T21:57:23Z

Codecov Report

Merging #1864 into dev will decrease coverage by 0.02%.
The diff coverage is 40%.

@@            Coverage Diff             @@
##              dev    #1864      +/-   ##
==========================================
- Coverage   89.33%   89.31%   -0.03%     
==========================================
  Files         162      162              
  Lines       12212    12217       +5     
==========================================
+ Hits        10910    10912       +2     
- Misses       1302     1305       +3

Impacted Files	Coverage Δ
...oud/forseti/scanner/audit/bigquery_rules_engine.py	`92.85% <ø> (ø)`	⬆️
...forseti/scanner/audit/enabled_apis_rules_engine.py	`88.49% <ø> (ø)`	⬆️
...r/audit/instance_network_interface_rules_engine.py	`92.64% <ø> (ø)`	⬆️
...d/forseti/scanner/audit/ke_version_rules_engine.py	`88.88% <ø> (ø)`	⬆️
...loud/forseti/scanner/audit/buckets_rules_engine.py	`81.92% <ø> (ø)`	⬆️
...seti/scanner/audit/forwarding_rule_rules_engine.py	`92% <ø> (ø)`	⬆️
...oud/forseti/scanner/audit/cloudsql_rules_engine.py	`79.26% <ø> (ø)`	⬆️
...oud/forseti/scanner/audit/log_sink_rules_engine.py	`93.7% <ø> (ø)`	⬆️
...ner/scanners/instance_network_interface_scanner.py	`0% <ø> (ø)`	⬆️
...ud/forseti/scanner/audit/blacklist_rules_engine.py	`93.18% <ø> (ø)`	⬆️
... and 6 more

joecheuk added 2 commits August 7, 2018 16:48

Added resource_name column in all of the scanners.

6a5e046

removed unused imports

0be06d2

joecheuk requested review from ahoying and blueandgold August 7, 2018 23:52

googlebot added the cla: yes label Aug 7, 2018

forseti-security-waffle-bot added the status: in-progress label Aug 7, 2018

joecheuk added 3 commits August 7, 2018 16:53

re-raise the exception after logging the error message.

c4f51ef

Updated comments

e53bd6f

Updated comments

c586b92

blueandgold reviewed Aug 24, 2018

joecheuk added 2 commits August 24, 2018 15:38

merged dev into local

ad0339b

Updates

995cc25

blueandgold reviewed Aug 27, 2018

joecheuk added 7 commits August 27, 2018 13:53

updates

30627b0

updates

2ea2756

Merge branch 'dev' of github.com:GoogleCloudPlatform/forseti-security…

b14a22e

… into add_resource_name_col

updates

9a72a6b

Merge branch 'dev' into add_resource_name_col

78d9ff6

updates

2dca9f9

flake8 updates

cbd56fe

blueandgold reviewed Aug 28, 2018

joecheuk added 2 commits August 28, 2018 15:48

Addressed PR comments

aa72427

Merge branch 'dev' of github.com:GoogleCloudPlatform/forseti-security…

2c1f13e

… into add_resource_name_col

ahoying approved these changes Aug 29, 2018

joecheuk added 2 commits August 29, 2018 10:44

Merge branch 'dev' of github.com:GoogleCloudPlatform/forseti-security…

83ce241

… into add_resource_name_col

Addressed PR comment

2404b45

blueandgold reviewed Aug 30, 2018

updates

5f168b2

blueandgold reviewed Aug 30, 2018

joecheuk added 2 commits August 30, 2018 15:44

updates

835c21f

updates

8a8432c

blueandgold reviewed Sep 4, 2018

joecheuk added 2 commits September 4, 2018 16:05

Merge branch 'dev' of github.com:GoogleCloudPlatform/forseti-security…

72619b1

… into add_resource_name_col

updates

12d123e

blueandgold reviewed Sep 12, 2018

joecheuk added 3 commits September 18, 2018 08:36

Merge branch 'dev' of github.com:GoogleCloudPlatform/forseti-security…

1678cfe

… into add_resource_name_col

Addressed PR comments

5df2409

pylint updates

03c4c19

blueandgold approved these changes Sep 18, 2018

Addressed PR comments.

0a693e1

Merge branch 'dev' into add_resource_name_col

19e14ae

joecheuk merged commit 2592fbd into dev Sep 18, 2018

joecheuk deleted the add_resource_name_col branch September 18, 2018 22:01

forseti-security-waffle-bot removed the status: in-progress label Sep 18, 2018

umairidris mentioned this pull request Oct 16, 2018

Feature Request - Parse out project name from the full_name in the findings #2092

Closed
