This repository has been archived by the owner on Jun 5, 2023. It is now read-only.

Add Cloud Profiler to Forseti server #3113

Merged — 6 commits merged into inventory-optimization on Aug 21, 2019

Conversation

@hshin-g (Contributor) commented Aug 20, 2019

Addresses #3104.

Below is the output on Cloud Profiler:
[Screenshot: Cloud Profiler output, 2019-08-20 4:06 PM]

@blueandgold (Contributor) left a comment:

Thanks for adding this and the flag to control it! Small comments.

(Three inline review comments on google/cloud/forseti/services/server.py — outdated, resolved.)
@blueandgold (Contributor) left a comment:

Thanks for the update! Added one comment for Andrew, in case he prefers to make the profiling package optional so it's easier for him to port into Google's environment. If yes, there will be additional tweaks involved. Otherwise, looks good to me!!

setup.py (outdated diff):
@@ -51,6 +51,8 @@
'sendgrid==5.6.0',
'simple-crypt==4.1.7',
'unicodecsv==0.14.1',
# Profiler related.
'google-cloud-profiler==1.0.8',

Apologies, I just recalled that @ahoying mentioned that instrumentation packages should be optional, so that it's easier for him to port into Google's environment. @ahoying, please confirm?

If so, the optional dependency can be specified like this in setup.py:
[screenshot: the optional-dependency snippet in setup.py]

Then, where you use the import in server.py, you can do this try/except:
[screenshot: the try/except import in server.py]

Then when we want to use it, we can do this:
pip3 install .[profiler]

I took the screenshots from this PR #2317, where this optionality is implemented, so we should do the same here with the profiler.
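The pattern described above (an optional extra in setup.py plus a guarded import in server.py) can be sketched as follows. The extra name `profiler` matches the `pip3 install .[profiler]` command quoted in the comment and the pinned version comes from the diff above; the `maybe_start_profiler` helper and the rest of the scaffolding are illustrative, not the PR's literal diff:

```python
# setup.py (sketch): declare the profiler as an optional extra.
# extras_require maps an extra's name to the packages it pulls in.
extras_require = {
    'profiler': ['google-cloud-profiler==1.0.8'],
}

# server.py (sketch): guard the import so the server still starts
# when the optional package is not installed.
try:
    import googlecloudprofiler  # import name of google-cloud-profiler
    PROFILER_AVAILABLE = True
except ImportError:
    PROFILER_AVAILABLE = False


def maybe_start_profiler(service_name):
    """Start the profiler only if the optional package is installed."""
    if not PROFILER_AVAILABLE:
        return False
    googlecloudprofiler.start(service=service_name)
    return True
```

With this in place, a plain `pip3 install .` skips the profiler entirely, while `pip3 install .[profiler]` pulls it in and the guarded import picks it up at startup.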

@ocervell (Contributor):

Note that the google-cloud-profiler package should also be added to requirements.txt if you want it to be installed in the VM when the startup script runs. Otherwise, you'll have to add pip3 install .[profiler] to all the server install scripts (Terraform, Deployment Manager, Docker, GCP Install Script).

The . notation for relative install also does not work with the current Docker install (pip hangs, see this issue), which is why it's preferable to have it in requirements.txt along with build and test libraries.

You should also update the GCP install scripts to enable the Profiler API, and grant the correct role cloudprofiler.agent to the server / client service accounts.

Once this is ready, please let the CFT team know so that we can also update the Terraform accordingly.

@hshin-g hshin-g merged commit 28af78d into inventory-optimization Aug 21, 2019
@blueandgold (Contributor):

LGTM, thanks for making the optional package.

@blueandgold (Contributor):

@hshin-g Can you please respond to @ocervell above? Let him know what our plan is for this build/release. Thanks.

@hshin-g (Contributor, Author) commented Aug 22, 2019

@ocervell Thanks for your comments! On keeping the package optional: if we add it to the requirements.txt file, it would be installed along with everything else in that file on install (so no longer optional), which is why I have removed the google-cloud-profiler package from requirements.txt.

After taking out the package, you will need to run the following commands on the server VM in order to install the package and get Forseti to use it:

$ sudo su ubuntu
$ pip3 install google-cloud-profiler
$ sudo systemctl restart forseti

For this PR, we have decided to manually enable roles/permissions, so any changes made to Terraform will not be needed until a later time.

@hshin-g hshin-g deleted the hshin-inventory-optimization-3104 branch August 23, 2019 23:28
@ocervell
Copy link
Contributor

@hshin-g LGTM

@dekuhn dekuhn added this to the v2.22.0 milestone Sep 16, 2019
hshin-g added a commit that referenced this pull request Oct 10, 2019
* Incremented version to 2.18.0.

* Added missing comma in the iam roles list.

* Speed up import of cloudasset data into temporary sql table.

* Strip unused data before writing to the database
* Switch from SQLAlchemy ORM to Core style bulk CAI inserts
* Ensure the maximum number of rows are written on each bulk insert

This is one of a series of changes to reduce the time to complete
a forseti inventory snapshot for large organizations.

This change reduces the time to import data into a cloudsql database
from a test system by about 50%, from about 312 seconds to about 144
seconds for around 350 megabytes of raw data.
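The actual change uses SQLAlchemy Core inserts against Cloud SQL; the underlying batching idea — issue one multi-row statement per batch instead of one ORM flush per object — can be sketched with the stdlib sqlite3 module (the `assets` table and the batch size here are invented for illustration):

```python
import sqlite3


def bulk_insert(conn, rows, batch_size=500):
    """Insert rows in large batches rather than one object at a time;
    executemany issues a single multi-row statement per batch."""
    cur = conn.cursor()
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        cur.executemany(
            'INSERT INTO assets (name, data) VALUES (?, ?)', batch)
    conn.commit()


conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE assets (name TEXT, data TEXT)')
bulk_insert(conn, [('asset-%d' % i, '{}') for i in range(1200)])
```

Batching amortizes per-statement overhead and keeps transactions short, which is where most of the import-time win comes from.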

* Ignore unactionable ResourceWarning messages in tests.

* Update CAI temporary table read methods to use SQLAlchemy Core.

* Remove per thread sessions from CAI crawler implementation
* Remove ORM overhead when reading assets from CAI temporary table

This is part of a series of changes to reduce the run time of the
forseti inventory crawler for large organizations.

* Move the CAI temporary table to a local sqlite database.

* This reduces network load by 50% by reducing round trips to the
cloudsql server, removing 700MB of network traffic when 350MB of test
assets is imported.
* Total time to import data into the sqlite database is further reduced
from 150 seconds to 30 seconds for 350MB of test assets.
* Temporary file is cleaned up at the end of inventory, reducing storage
requirements on server.

* Further optimize sqlite database interface.

* Set pragmas to speed up writes and reads. Can write 350MB in 11
seconds on test system (down from 330 seconds for version 2.18 baseline).

* Optimize per thread reading, can complete over 100 queries per second
on 10 threads with an 8 CPU virtual machine. Queried 5000 project
cloudsql instances in 45 seconds on test VM.
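The commit message does not list the exact pragmas; the following are typical write-speed pragmas for a throwaway staging database, shown as an assumption rather than the PR's actual settings:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
# Trade durability for speed: this table is a temporary staging area,
# so losing it on a crash is acceptable.
conn.execute('PRAGMA journal_mode = MEMORY')  # no rollback journal on disk
conn.execute('PRAGMA synchronous = OFF')      # skip fsync on writes
conn.execute('PRAGMA temp_store = MEMORY')    # keep temp structures in RAM
conn.execute('PRAGMA cache_size = -64000')    # ~64 MB page cache
```

For a file that only lives for the duration of one inventory run, relaxing durability this way is safe: a crash simply means rerunning the import.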

* Fix db_migrator and lint issue.

* Pass thread count consistently across constructors.

* Update dependent google python libraries to latest versions. (#3021)

* Stream CloudAsset data straight to sqlite database.

* Change cloudasset implementation to stream data from GCS to the local
sqlite database using OS pipes instead of downloading the full file to a
temporary location and then reading it from there.

This change reduces storage requirements on Forseti Server and allows
data to be processed only once instead of writing and rereading each row
of data from the CloudAsset export.
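The real implementation streams the CloudAsset export from GCS; the OS-pipe mechanism itself can be sketched with the stdlib, with `produce_lines` standing in for the GCS download:

```python
import os
import sqlite3
import threading


def stream_to_sqlite(produce_lines):
    """Stream newline-delimited records through an OS pipe into sqlite
    without materializing the full export in a temporary file."""
    read_fd, write_fd = os.pipe()

    def writer():
        # Stands in for the GCS download writing into the pipe.
        with os.fdopen(write_fd, 'w') as w:
            for line in produce_lines():
                w.write(line + '\n')

    t = threading.Thread(target=writer)
    t.start()
    conn = sqlite3.connect(':memory:')
    conn.execute('CREATE TABLE dump (row TEXT)')
    with os.fdopen(read_fd) as r:
        # Each row is read from the pipe and inserted exactly once.
        conn.executemany('INSERT INTO dump (row) VALUES (?)',
                         ((line.rstrip('\n'),) for line in r))
    t.join()
    conn.commit()
    return conn


conn = stream_to_sqlite(lambda: ('record-%d' % i for i in range(100)))
```

Because the reader consumes rows as the writer produces them, peak disk usage is bounded by the pipe buffer rather than the full export size.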

* Fix flake.

* Increase max asset name size to 2048 and add additional comments.

* Remove extra imports in test.

* Move methods to read from the inventory table into DataAccess class.

This cleans up the base Storage class to focus on efficient writing of
data into the Inventory table.

* Refactored inventory to remove global write lock on database.

* Use SQLAlchemy Core for writing inventory rows to Cloud SQL.
* Removed all updates to inventory data, each row is only written once.
* Added full_name to the inventory table for each resource.
* Created a memory cache for written resources, detects and skips duplicate full names. This introduces a small global lock on updating the cache, but is less than a millisecond on average.
* Ensured warning messages written to the inventory index table always contain the full context.
* Fix external project scanner to use new DAO for inventory data.
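The duplicate-detection cache described above can be sketched as follows; the class and method names are invented for illustration, but the structure — a shared set guarded by a lock held only for the membership check — matches the description:

```python
import threading


class ResourceWriter:
    """Skip duplicate resources by full_name; the only shared state is
    a set guarded by a short-lived lock, so writer threads rarely block."""

    def __init__(self):
        self._seen = set()
        self._lock = threading.Lock()
        self.written = []

    def write(self, full_name, row):
        with self._lock:            # held only for the membership check
            if full_name in self._seen:
                return False        # duplicate full name, skip the insert
            self._seen.add(full_name)
        # The actual (slow) insert happens outside the lock; list.append
        # is atomic under CPython's GIL, standing in for the DB write.
        self.written.append(row)
        return True


writer = ResourceWriter()
writer.write('organization/1/project/p1', {'id': 'p1'})
writer.write('organization/1/project/p1', {'id': 'p1'})  # duplicate, skipped
```

Keeping the insert outside the critical section is what keeps the lock's cost under a millisecond on average, as the bullet above notes.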

* Fix flakes.

* Fix docstring.

* Address comments.

* Update Forseti version (#3047)

* + Add _iter_bigquery_tables func
+ Add bigquery.googleapis.com/Table

* + Add mock data

* Revert "Merge branch 'inventory-optimization-3035' into inventory-optimization"

This reverts commit babc2ef, reversing
changes made to 52d0a0e.

* Move inventory warning messages to a child table of inventory_index.

This change allows warnings to be written in parallel from multiple
threads, and removes the need to update the existing inventory_index row
for each new warning.

Warnings are loaded alongside the inventory index through the DAO.

* Add a warnings_count virtual column to InventoryIndex.

The warnings_count field is more efficient when code just needs to check
if warnings exist.

Added additional comments to inventory tables to make the current usage
of various columns more explicit.
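The PR implements warnings_count as a virtual column on the SQLAlchemy model; the effect can be sketched in plain SQL — a derived count lets callers check whether warnings exist without loading the warning rows (the schema below is a simplified stand-in for the real tables):

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.executescript('''
CREATE TABLE inventory_index (id INTEGER PRIMARY KEY);
CREATE TABLE inventory_warnings (
    id INTEGER PRIMARY KEY,
    inventory_index_id INTEGER REFERENCES inventory_index (id),
    message TEXT);
INSERT INTO inventory_index (id) VALUES (1), (2);
INSERT INTO inventory_warnings (inventory_index_id, message)
    VALUES (1, 'w1'), (1, 'w2');
''')

# warnings_count as a derived value: code that only needs to know
# whether warnings exist never loads the warning rows themselves.
rows = conn.execute('''
    SELECT i.id,
           (SELECT COUNT(*) FROM inventory_warnings w
             WHERE w.inventory_index_id = i.id) AS warnings_count
      FROM inventory_index i
     ORDER BY i.id
''').fetchall()
```

Index 1 reports two warnings and index 2 reports zero, without either query touching the warning message text.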

* Add full resource name field to output warning messages.

* Add Bigquery table data (#3057)

* + Add Bigquery table data

* + Update test for bigquery table

* Reformatted code.

* Ensure errors during inventory run are handled correctly.

* Make sure errors are recorded in inventory index table.
* Make sure errors are handled by the service without raising grpc
exception.
* Add a unit test to verify functionality.

* Allow inventory module to ingest GCS dump files directly. (#3065)

* Added dump file path variable in config file.

* Removed unnecessary changes.

* Removed unused variables in docstring.

* Addressed PR comments.

* updates

* updates.

* Removed unused import.

* [CAI] Add compute.googleapis.com/RegionDisk from CAI to Forseti Inventory (#3073)

* + Add try except block for CAI export

* pylint fix

* Added locking before incrementing object count. (#3080)

* Updated to use the engine execute method on rollback / commit. (#3105)

* Updated to use the execute method on rollback / commit.

* Updates

* Added additional checks for FAILURE status on commit.

* flake8 format updates.

* Added expunge to inventory index object.

* updates.

* updates.

* Added extra log statement.

* Add Cloud Profiler to Forseti server (#3113)

* Update python base image

* Update python base image

* explicit package version, pylint fixes

* pylint fixes, nits

* Add cloud profiler to optional packages, add try-except for cloud profiler import

* Remove cloud profiler from requirements.txt

* Removed all cai fallback. (#3116)

* Removed all cai fallback.

* updates.

* Added command to remove tmp files prefixed with forseti-cai. (#3129)

* Added command to remove tmp files prefixed with forseti-cai.

* updates.

* updates completed_at_datetime with a utc timestamp. (#3130)

* Commit instead of flush while building the data model.

* Commit once every 500k resources flushed.

* Fix Flaky Firewall Test (#3118)

* Fix flaky test

* fix replay test

* add comment

* add print

* add print

* add print

* add print

* remove print

(cherry picked from commit cde7b83)

* Fix Flaky Replay Test (#3119)

* test

* test

* fake request.uri

* fix in test

* revert replay.py

* add comment

(cherry picked from commit 8d3d162)

* Updated to commit once every 100k rows flushed. (#3154)

* commit per scanner run.

* Commit per scanner run (#3158)

* Commit per resource type, use add() instead of add_all when inserting to the session.

* Check for role uniqueness properly.

* Updated to use the .get() method when retrieving values in a dict in method _convert_role().

* Added comprehensive debug messages.

* Updated to use resource count instead of flush count when logging.

* Added collation to name column in Role table to make sure the column is case sensitive.

* Updated binding table to reference to role name column with the correct type.

* + Add Service Usage API

* pylint fixes

* Added paging for config validator scanner. (#3191)

* Updated config validator scanner to page the input & output.

* Updated max page size to 3.5 MB.

* Updated the mechanism to estimate dictionary size.

* Added debug log statement.

* Updated to audit once every 50 MB.

* Add unittests

* Nit changes

* Updated scanner_iter method to use query.slice() instead of yield_per and increased the CV audit size to 100 MB per audit. (#3244)

* Use query.slice() instead of yield_per to avoid losing connection to mysql server.

* updates

* Updated to audit every 100 MB.

* Updated block size to 4096.
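The slice-based paging these commits describe can be sketched against stdlib sqlite3 using LIMIT/OFFSET; the real code pages a SQLAlchemy query with query.slice(), and the table and column names here are invented for illustration:

```python
import sqlite3


def scanner_iter(conn, page_size=1000):
    """Page through results in fixed-size slices (the query.slice() idea)
    instead of one long-running cursor, so each page is a short query
    and the connection never sits idle long enough to be dropped."""
    offset = 0
    while True:
        page = conn.execute(
            'SELECT name FROM resources ORDER BY id LIMIT ? OFFSET ?',
            (page_size, offset)).fetchall()
        if not page:
            return
        for row in page:
            yield row[0]
        offset += page_size


conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE resources (id INTEGER PRIMARY KEY, name TEXT)')
conn.executemany('INSERT INTO resources (name) VALUES (?)',
                 [('r%d' % i,) for i in range(2500)])
names = list(scanner_iter(conn, page_size=1000))
```

Each page is its own short round trip, which is why this avoids the dropped-connection problem that a single streamed yield_per cursor can hit against MySQL.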

* Merge changes from RC (#3253)

* commit per scanner run.

* Commit per resource type, use add() instead of add_all when inserting to the session.

* Check for role uniqueness properly.

* Updated to use the .get() method when retrieving values in a dict in method _convert_role().

* Added comprehensive debug messages.

* Updated to use resource count instead of flush count when logging.

* Added collation to name column in Role table to make sure the column is case sensitive.

* Updated binding table to reference to role name column with the correct type.

* + Add Service Usage API

* pylint fixes

* Add unittests

* Nit changes

* Updated iter() method in storage.py to use slice() instead of yield_p… (#3255)

* Updated iter() method in storage.py to use slice() instead of yield_per().

* updates.

* Lowercased generator in docstring.

* Removed page_query usage in storage.py (#3271)

* Update sizes of columns (#3267)

* Update sizes of columns

* Update sizes of columns

* Updated to use yield_per for all the scanners except CV. (#3274)

* Fix bigtable CAI resources (#3297)

* Fix bigtable CAI resources

* lint fixes

* Fix CAI Disk resource (#3298)
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.