This repository has been archived by the owner on Jun 5, 2023. It is now read-only.

Add Cloud Profiler to Forseti server #3113

Merged — 6 commits merged into inventory-optimization on Aug 21, 2019

Conversation

@hshin-g (Contributor) commented Aug 20, 2019

Addresses #3104.

Below is the output on Cloud Profiler:
[Screenshot: Cloud Profiler output, 2019-08-20 4:06 PM]

@blueandgold (Contributor) left a comment:

Thanks for adding this and the flag to control it! Small comments.

(Three inline review comments on google/cloud/forseti/services/server.py — outdated, resolved.)
@blueandgold (Contributor) left a comment:

Thanks for the update! Added one comment for Andrew, in case he prefers to make the profiling package optional so it's easier for him to port into Google's environment. If yes, there will be additional tweaks involved. Otherwise, looks good to me!!

setup.py (outdated diff):
@@ -51,6 +51,8 @@
'sendgrid==5.6.0',
'simple-crypt==4.1.7',
'unicodecsv==0.14.1',
# Profiler related.
'google-cloud-profiler==1.0.8',

Apologies, I just recalled that @ahoying mentioned that instrumentation packages should be optional, so that it's easier for him to port into Google's environment. @ahoying, please confirm?

If so, the optional dependency can be specified like this in setup.py:
[screenshot: the optional-dependency snippet in setup.py]

Then, where you use the import in server.py, you can do this try/except:
[screenshot: the try/except import in server.py]

Then when we want to use it, we can do this:
pip3 install .[profiler]

I took the screenshots from this PR #2317, where this optionality is implemented, so we should do the same here with the profiler.
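The pattern described above (an optional extra in setup.py plus a guarded import in server.py) can be sketched as follows. The extra name `profiler` matches the `pip3 install .[profiler]` command quoted in the comment and the pinned version comes from the diff above; the `maybe_start_profiler` helper and the rest of the scaffolding are illustrative, not the PR's literal diff:

```python
# setup.py (sketch): declare the profiler as an optional extra.
# extras_require maps an extra's name to the packages it pulls in.
extras_require = {
    'profiler': ['google-cloud-profiler==1.0.8'],
}

# server.py (sketch): guard the import so the server still starts
# when the optional package is not installed.
try:
    import googlecloudprofiler  # import name of google-cloud-profiler
    PROFILER_AVAILABLE = True
except ImportError:
    PROFILER_AVAILABLE = False


def maybe_start_profiler(service_name):
    """Start the profiler only if the optional package is installed."""
    if not PROFILER_AVAILABLE:
        return False
    googlecloudprofiler.start(service=service_name)
    return True
```

With this in place, a plain `pip3 install .` skips the profiler entirely, while `pip3 install .[profiler]` pulls it in and the guarded import picks it up at startup.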

@ocervell (Contributor):

Note that the google-cloud-profiler package should also be added to requirements.txt if you want it to be installed in the VM when the startup script runs. Otherwise, you'll have to add pip3 install .[profiler] to all the server install scripts (Terraform, Deployment Manager, Docker, GCP Install Script).

The . notation for relative install also does not work with the current Docker install (pip hangs, see this issue), which is why it's preferable to have it in requirements.txt along with build and test libraries.

You should also update the GCP install scripts to enable the Profiler API, and grant the correct role cloudprofiler.agent to the server / client service accounts.

Once this is ready, please let the CFT team know so that we can also update the Terraform accordingly.

@hshin-g hshin-g merged commit 28af78d into inventory-optimization Aug 21, 2019
@blueandgold (Contributor):

LGTM, thanks for making the optional package.

@blueandgold (Contributor):

@hshin-g Can you please respond to @ocervell above? Let him know what our plan is for this build/release. Thanks.

@hshin-g (Contributor, Author) commented Aug 22, 2019

@ocervell Thanks for your comments! On keeping the package optional: if we add it to the requirements.txt file, it would be installed along with everything else in that file on install (so no longer optional), which is why I have removed the google-cloud-profiler package from requirements.txt.

After taking out the package, you will need to run the following commands on the server VM in order to install the package and get Forseti to use it:

$ sudo su ubuntu
$ pip3 install google-cloud-profiler
$ sudo systemctl restart forseti

For this PR, we have decided to manually enable roles/permissions, so any changes made to Terraform will not be needed until a later time.

@hshin-g hshin-g deleted the hshin-inventory-optimization-3104 branch August 23, 2019 23:28
@ocervell
Copy link
Contributor

@hshin-g LGTM

@dekuhn dekuhn added this to the v2.22.0 milestone Sep 16, 2019
hshin-g added a commit that referenced this pull request Oct 10, 2019
* Incremented version to 2.18.0.

* Added missing comma in the iam roles list.

* Speed up import of cloudasset data into temporary sql table.

* Strip unused data before writing to the database
* Switch from SQLAlchemy ORM to Core style bulk CAI inserts
* Ensure the maximum number of rows are written on each bulk insert

This is one of a series of changes to reduce the time to complete
a forseti inventory snapshot for large organizations.

This change reduces the time to import data into a cloudsql database
from a test system by about 50%, from about 312 seconds to about 144
seconds for around 350 megabytes of raw data.
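The actual change uses SQLAlchemy Core inserts against Cloud SQL; the underlying batching idea — issue one multi-row statement per batch instead of one ORM flush per object — can be sketched with the stdlib sqlite3 module (the `assets` table and the batch size here are invented for illustration):

```python
import sqlite3


def bulk_insert(conn, rows, batch_size=500):
    """Insert rows in large batches rather than one object at a time;
    executemany issues a single multi-row statement per batch."""
    cur = conn.cursor()
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        cur.executemany(
            'INSERT INTO assets (name, data) VALUES (?, ?)', batch)
    conn.commit()


conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE assets (name TEXT, data TEXT)')
bulk_insert(conn, [('asset-%d' % i, '{}') for i in range(1200)])
```

Batching amortizes per-statement overhead and keeps transactions short, which is where most of the import-time win comes from.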

* Ignore unactionable ResourceWarning messages in tests.

* Update CAI temporary table read methods to use SQLAlchemy Core.

* Remove per thread sessions from CAI crawler implementation
* Remove ORM overhead when reading assets from CAI temporary table

This is part of a series of changes to reduce the run time of the
forseti inventory crawler for large organizations.

* Move the CAI temporary table to a local sqlite database.

* This reduces network load by 50% by reducing round trips to the
cloudsql server, removing 700MB of network traffic when 350MB of test
assets is imported.
* Total time to import data into the sqlite database is further reduced
from 150 seconds to 30 seconds for 350MB of test assets.
* Temporary file is cleaned up at the end of inventory, reducing storage
requirements on server.

* Further optimize sqlite database interface.

* Set pragmas to speed up writes and reads. Can write 350MB in 11
seconds on test system (down from 330 seconds for version 2.18 baseline).

* Optimize per thread reading, can complete over 100 queries per second
on 10 threads with an 8 CPU virtual machine. Queried 5000 project
cloudsql instances in 45 seconds on test VM.
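The commit message does not list the exact pragmas; the following are typical write-speed pragmas for a throwaway staging database, shown as an assumption rather than the PR's actual settings:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
# Trade durability for speed: this table is a temporary staging area,
# so losing it on a crash is acceptable.
conn.execute('PRAGMA journal_mode = MEMORY')  # no rollback journal on disk
conn.execute('PRAGMA synchronous = OFF')      # skip fsync on writes
conn.execute('PRAGMA temp_store = MEMORY')    # keep temp structures in RAM
conn.execute('PRAGMA cache_size = -64000')    # ~64 MB page cache
```

For a file that only lives for the duration of one inventory run, relaxing durability this way is safe: a crash simply means rerunning the import.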

* Fix db_migrator and lint issue.

* Pass thread count consistently across constructors.

* Update dependent google python libraries to latest versions. (#3021)

* Stream CloudAsset data straight to sqlite database.

* Change cloudasset implementation to stream data from GCS to the local
sqlite database using OS pipes instead of downloading the full file to a
temporary location and then reading it from there.

This change reduces storage requirements on Forseti Server and allows
data to be processed only once instead of writing and rereading each row
of data from the CloudAsset export.
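The real implementation streams the CloudAsset export from GCS; the OS-pipe mechanism itself can be sketched with the stdlib, with `produce_lines` standing in for the GCS download:

```python
import os
import sqlite3
import threading


def stream_to_sqlite(produce_lines):
    """Stream newline-delimited records through an OS pipe into sqlite
    without materializing the full export in a temporary file."""
    read_fd, write_fd = os.pipe()

    def writer():
        # Stands in for the GCS download writing into the pipe.
        with os.fdopen(write_fd, 'w') as w:
            for line in produce_lines():
                w.write(line + '\n')

    t = threading.Thread(target=writer)
    t.start()
    conn = sqlite3.connect(':memory:')
    conn.execute('CREATE TABLE dump (row TEXT)')
    with os.fdopen(read_fd) as r:
        # Each row is read from the pipe and inserted exactly once.
        conn.executemany('INSERT INTO dump (row) VALUES (?)',
                         ((line.rstrip('\n'),) for line in r))
    t.join()
    conn.commit()
    return conn


conn = stream_to_sqlite(lambda: ('record-%d' % i for i in range(100)))
```

Because the reader consumes rows as the writer produces them, peak disk usage is bounded by the pipe buffer rather than the full export size.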

* Fix flake.

* Increase max asset name size to 2048 and add additional comments.

* Remove extra imports in test.

* Move methods to read from the inventory table into DataAccess class.

This cleans up the base Storage class to focus on efficient writing of
data into the Inventory table.

* Refactored inventory to remove global write lock on database.

* Use SQLAlchemy Core for writing inventory rows to Cloud SQL.
* Removed all updates to inventory data, each row is only written once.
* Added full_name to the inventory table for each resource.
* Created a memory cache for written resources, detects and skips duplicate full names. This introduces a small global lock on updating the cache, but is less than a millisecond on average.
* Ensured warning messages written to the inventory index table always contain the full context.
* Fix external project scanner to use new DAO for inventory data.
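The duplicate-detection cache described above can be sketched as follows; the class and method names are invented for illustration, but the structure — a shared set guarded by a lock held only for the membership check — matches the description:

```python
import threading


class ResourceWriter:
    """Skip duplicate resources by full_name; the only shared state is
    a set guarded by a short-lived lock, so writer threads rarely block."""

    def __init__(self):
        self._seen = set()
        self._lock = threading.Lock()
        self.written = []

    def write(self, full_name, row):
        with self._lock:            # held only for the membership check
            if full_name in self._seen:
                return False        # duplicate full name, skip the insert
            self._seen.add(full_name)
        # The actual (slow) insert happens outside the lock; list.append
        # is atomic under CPython's GIL, standing in for the DB write.
        self.written.append(row)
        return True


writer = ResourceWriter()
writer.write('organization/1/project/p1', {'id': 'p1'})
writer.write('organization/1/project/p1', {'id': 'p1'})  # duplicate, skipped
```

Keeping the insert outside the critical section is what keeps the lock's cost under a millisecond on average, as the bullet above notes.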

* Fix flakes.

* Fix docstring.

* Address comments.

* Update Forseti version (#3047)

* + Add _iter_bigquery_tables func
+ Add bigquery.googleapis.com/Table

* + Add mock data

* Revert "Merge branch 'inventory-optimization-3035' into inventory-optimization"

This reverts commit babc2ef, reversing
changes made to 52d0a0e.

* Move inventory warning messages to a child table of inventory_index.

This change allows warnings to be written in parallel from multiple
threads, and removes the need to update the existing inventory_index row
for each new warning.

Warnings are loaded alongside the inventory index through the DAO.

* Add a warnings_count virtual column to InventoryIndex.

The warnings_count field is more efficient when code just needs to check
if warnings exist.

Added additional comments to inventory tables to make the current usage
of various columns more explicit.
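The PR implements warnings_count as a virtual column on the SQLAlchemy model; the effect can be sketched in plain SQL — a derived count lets callers check whether warnings exist without loading the warning rows (the schema below is a simplified stand-in for the real tables):

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.executescript('''
CREATE TABLE inventory_index (id INTEGER PRIMARY KEY);
CREATE TABLE inventory_warnings (
    id INTEGER PRIMARY KEY,
    inventory_index_id INTEGER REFERENCES inventory_index (id),
    message TEXT);
INSERT INTO inventory_index (id) VALUES (1), (2);
INSERT INTO inventory_warnings (inventory_index_id, message)
    VALUES (1, 'w1'), (1, 'w2');
''')

# warnings_count as a derived value: code that only needs to know
# whether warnings exist never loads the warning rows themselves.
rows = conn.execute('''
    SELECT i.id,
           (SELECT COUNT(*) FROM inventory_warnings w
             WHERE w.inventory_index_id = i.id) AS warnings_count
      FROM inventory_index i
     ORDER BY i.id
''').fetchall()
```

Index 1 reports two warnings and index 2 reports zero, without either query touching the warning message text.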

* Add full resource name field to output warning messages.

* Add Bigquery table data (#3057)

* + Add Bigquery table data

* + Update test for bigquery table

* Reformatted code.

* Ensure errors during inventory run are handled correctly.

* Make sure errors are recorded in inventory index table.
* Make sure errors are handled by the service without raising grpc
exception.
* Add a unit test to verify functionality.

* Allow inventory module to ingest GCS dump files directly. (#3065)

* Added dump file path variable in config file.

* Removed unnecessary changes.

* Removed unused variables in docstring.

* Addressed PR comments.

* updates

* updates.

* Removed unused import.

* [CAI] Add compute.googleapis.com/RegionDisk from CAI to Forseti Inventory (#3073)

* + Add try except block for CAI export

* pylint fix

* Added locking before incrementing object count. (#3080)

* Updated to use the engine execute method on rollback / commit. (#3105)

* Updated to use the execute method on rollback / commit.

* Updates

* Added additional checks for FAILURE status on commit.

* flake8 format updates.

* Added expunge to inventory index object.

* updates.

* updates.

* Added extra log statement.

* Add Cloud Profiler to Forseti server (#3113)

* Update python base image

* Update python base image

* explicit package version, pylint fixes

* pylint fixes, nits

* Add cloud profiler to optional packages, add try-except for cloud profiler import

* Remove cloud profiler from requirements.txt

* Removed all cai fallback. (#3116)

* Removed all cai fallback.

* updates.

* Added command to remove tmp files prefixed with forseti-cai. (#3129)

* Added command to remove tmp files prefixed with forseti-cai.

* updates.

* updates completed_at_datetime with a utc timestamp. (#3130)

* Commit instead of flush while building the data model.

* Commit once every 500k resources flushed.

* Fix Flaky Firewall Test (#3118)

* Fix flaky test

* fix replay test

* add comment

* add print

* add print

* add print

* add print

* remove print

(cherry picked from commit cde7b83)

* Fix Flaky Replay Test (#3119)

* test

* test

* fake request.uri

* fix in test

* revert replay.py

* add comment

(cherry picked from commit 8d3d162)

* Updated to commit once every 100k rows flushed. (#3154)

* commit per scanner run.

* Commit per scanner run (#3158)

* Commit per resource type, use add() instead of add_all when inserting to the session.

* Check for role uniqueness properly.

* Updated to use the .get() method when retrieving values in a dict in method _convert_role().

* Added comprehensive debug messages.

* Updated to use resource count instead of flush count when logging.

* Added collation to name column in Role table to make sure the column is case sensitive.

* Updated binding table to reference to role name column with the correct type.

* + Add Service Usage API

* pylint fixes

* Added paging for config validator scanner. (#3191)

* Updated config validator scanner to page the input & output.

* Updated max page size to 3.5 MB.

* Updated the mechanism to estimate dictionary size.

* Added debug log statement.

* Updated to audit once every 50 MB.

* Add unittests

* Nit changes

* Updated scanner_iter method to use query.slice() instead of yield_per and increased the CV audit size to 100 MB per audit. (#3244)

* Use query.slice() instead of yield_per to avoid losing connection to mysql server.

* updates

* Updated to audit every 100 MB.

* Updated block size to 4096.
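The slice-based paging these commits describe can be sketched against stdlib sqlite3 using LIMIT/OFFSET; the real code pages a SQLAlchemy query with query.slice(), and the table and column names here are invented for illustration:

```python
import sqlite3


def scanner_iter(conn, page_size=1000):
    """Page through results in fixed-size slices (the query.slice() idea)
    instead of one long-running cursor, so each page is a short query
    and the connection never sits idle long enough to be dropped."""
    offset = 0
    while True:
        page = conn.execute(
            'SELECT name FROM resources ORDER BY id LIMIT ? OFFSET ?',
            (page_size, offset)).fetchall()
        if not page:
            return
        for row in page:
            yield row[0]
        offset += page_size


conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE resources (id INTEGER PRIMARY KEY, name TEXT)')
conn.executemany('INSERT INTO resources (name) VALUES (?)',
                 [('r%d' % i,) for i in range(2500)])
names = list(scanner_iter(conn, page_size=1000))
```

Each page is its own short round trip, which is why this avoids the dropped-connection problem that a single streamed yield_per cursor can hit against MySQL.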

* Merge changes from RC (#3253)

* commit per scanner run.

* Commit per resource type, use add() instead of add_all when inserting to the session.

* Check for role uniqueness properly.

* Updated to use the .get() method when retrieving values in a dict in method _convert_role().

* Added comprehensive debug messages.

* Updated to use resource count instead of flush count when logging.

* Added collation to name column in Role table to make sure the column is case sensitive.

* Updated binding table to reference to role name column with the correct type.

* + Add Service Usage API

* pylint fixes

* Add unittests

* Nit changes

* Updated iter() method in storage.py to use slice() instead of yield_p… (#3255)

* Updated iter() method in storage.py to use slice() instead of yield_per().

* updates.

* Lowercased generator in docstring.

* Removed page_query usage in storage.py (#3271)

* Update sizes of columns (#3267)

* Update sizes of columns

* Update sizes of columns

* Updated to use yield_per for all the scanners except CV. (#3274)

* Fix bigtable CAI resources (#3297)

* Fix bigtable CAI resources

* lint fixes

* Fix CAI Disk resource (#3298)
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.