Aardvark is a multi-account AWS IAM Access Advisor API (and caching layer).


Install

Ensure that you have Python 3.6 or later. Python 2 is no longer supported.

git clone
cd aardvark
python3 -m venv env
. env/bin/activate
python develop

Known Dependencies

  • libpq-dev

Configure Aardvark

The Aardvark config wizard will guide you through the setup.

% aardvark config

Aardvark can use SWAG to look up accounts.
Do you use SWAG to track accounts? [yN]: no
ROLENAME: Aardvark
DATABASE [sqlite:////home/github/aardvark/aardvark.db]:
# Threads [5]:

>> Writing to

The wizard prompts for:

  • Whether to use SWAG to enumerate your AWS accounts. (Optional, but useful when you have many accounts.)
  • The name of the IAM Role to assume into in each account.
  • The Database connection string. (Defaults to sqlite in the current working directory. Use RDS Postgres for production.)

Create the DB tables

aardvark create_db

IAM Permissions:

Aardvark needs an IAM Role in each account that will be queried. Additionally, Aardvark needs to be launched with a role or user which can sts:AssumeRole into the different account roles.


AardvarkInstanceProfile:

  • Only create one.
  • Needs the ability to call sts:AssumeRole into all of the AardvarkRoles.

AardvarkRole:

  • Must exist in every account to be monitored.
  • Must have a trust policy allowing AardvarkInstanceProfile.
  • Has these permissions:

So if you are monitoring n accounts, you will always need n+1 roles. (n AardvarkRoles and 1 AardvarkInstanceProfile).
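
The trust relationship between the two roles can be sketched in Python. This is a minimal sketch that assumes the standard IAM trust-policy document format. The account ID and role ARN are placeholders, and `build_trust_policy` is a hypothetical helper, not part of Aardvark.

```python
import json


def build_trust_policy(principal_arn):
    """Return a trust policy that lets principal_arn assume the role it is attached to."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


# Placeholder ARN for the role behind AardvarkInstanceProfile (assumption).
INSTANCE_ROLE_ARN = "arn:aws:iam::000000000000:role/AardvarkInstanceProfile"

if __name__ == "__main__":
    # Print the document so it can be pasted into each AardvarkRole's trust policy.
    print(json.dumps(build_trust_policy(INSTANCE_ROLE_ARN), indent=2))
```

The same document would be attached to the AardvarkRole in each of the n monitored accounts, with only the principal changing if you run Aardvark from a different role or user.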

Note: When running Aardvark locally, you don't need the AardvarkInstanceProfile. Instead, attach a policy that allows "sts:AssumeRole" to the user whose credentials you use with the AWS CLI. That same user must also be named in the trust policy of each AardvarkRole so that it is allowed to assume the role.

Gather Access Advisor Data

You'll likely want to refresh the Access Advisor data regularly. We recommend running the update command about once a day. Cron works great for this.

Without SWAG:

If you don't have SWAG, you can pass comma-separated account numbers:

aardvark update -a 123456789012,210987654321

With SWAG:

Aardvark can use SWAG to look up accounts, so you can run against all with:

aardvark update

or by account name/tag with:

aardvark update -a dev,test,prod
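
If you would rather drive the daily refresh from a Python job than from a raw cron line, the commands above can be wrapped as follows. This is only a sketch: `build_update_command` and `run_update` are hypothetical helpers, and the only Aardvark behaviour assumed is the `aardvark update [-a ...]` invocation shown above.

```python
import subprocess


def build_update_command(accounts=None):
    """Build the argv list for `aardvark update`, optionally limited to some accounts."""
    command = ["aardvark", "update"]
    if accounts:
        # Accounts (numbers, or SWAG names/tags) are passed as one comma-separated value.
        command += ["-a", ",".join(accounts)]
    return command


def run_update(accounts=None):
    """Run the update and raise CalledProcessError if the command exits non-zero."""
    subprocess.run(build_update_command(accounts), check=True)
```

Calling `run_update(["123456789012", "210987654321"])` runs the same command as the "Without SWAG" example.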


Start the API

aardvark start_api -b

In production, you'll likely want to have something like supervisor starting the API for you.

Use the API

Swagger is available for the API at <Aardvark_Host>/apidocs/#!.

Aardvark responds to GET and POST requests. All results are paginated; pagination can be controlled by passing the count and/or page arguments. Here are a few example queries:

curl localhost:5000/api/1/advisors
curl 'localhost:5000/api/1/advisors?phrase=SecurityMonkey'
curl 'localhost:5000/api/1/advisors?arn=arn:aws:iam::000000000000:role/SecurityMonkey&arn=arn:aws:iam::111111111111:role/SecurityMonkey'
curl 'localhost:5000/api/1/advisors?regex=^.*Monkey$'
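
The same queries can be issued from Python. The sketch below uses only the standard library and the parameters mentioned above (phrase, regex, arn, page, count). `build_advisors_url` and `fetch_advisors` are hypothetical helper names, and the shape of the JSON response is not assumed here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_advisors_url(base_url, phrase=None, regex=None, arns=(), page=None, count=None):
    """Build an /api/1/advisors URL with URL-encoded query parameters."""
    params = []
    if phrase is not None:
        params.append(("phrase", phrase))
    if regex is not None:
        params.append(("regex", regex))
    # The arn parameter may be repeated, once per ARN.
    params.extend(("arn", arn) for arn in arns)
    if page is not None:
        params.append(("page", page))
    if count is not None:
        params.append(("count", count))
    url = f"{base_url}/api/1/advisors"
    query = urlencode(params)
    return f"{url}?{query}" if query else url


def fetch_advisors(base_url, **kwargs):
    """GET one page of results and return the decoded JSON body."""
    with urlopen(build_advisors_url(base_url, **kwargs)) as response:
        return json.load(response)
```

For example, `fetch_advisors("http://localhost:5000", phrase="SecurityMonkey", page=2, count=50)` requests the second page of 50 results.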


Docker

Aardvark can also be deployed with Docker and Docker Compose. The Aardvark services are built on a shared container. You will need Docker and Docker Compose installed for this to work.

To configure the containers for your set of accounts create a .env file in the root of this directory. Define the environment variables within this file. This example uses AWS Access Keys. We recommend using instance roles in production.

AWS_ACCESS_KEY_ID=<your access key>
AWS_SECRET_ACCESS_KEY=<your secret key>

| Name | Service | Description |
|------|---------|-------------|
| AARDVARK_ROLE | collector | The name of the role for Aardvark to assume so that it can collect the data. |
| AARDVARK_ACCOUNTS | collector | Optional if using SWAG, otherwise required. Set this to a list of SWAG account name tags or a list of AWS account numbers from which to collect Access Advisor records. |
| AWS_ARN_PARTITION | collector | Required if not using an AWS Commercial region. For example, aws-us-gov. By default, this is aws. |
| AWS_DEFAULT_REGION | collector | Required if not running on an EC2 instance with an appropriate Instance Profile. Set this to the AWS region to use. |
| AWS_ACCESS_KEY_ID | collector | Required if not running on an EC2 instance with an appropriate Instance Profile. Set these to the credentials of an AWS IAM user with permission to sts:AssumeRole to the Aardvark audit role. |
| AWS_SECRET_ACCESS_KEY | collector | Required if not running on an EC2 instance with an appropriate Instance Profile. Set these to the credentials of an AWS IAM user with permission to sts:AssumeRole to the Aardvark audit role. |
| AARDVARK_DATABASE_URI | collector and apiserver | Specify a custom database URI supported by SQLAlchemy. By default, this will use the AARDVARK_DATA_DIR value to create a SQLite database. Example: sqlite:///$AARDVARK_DATA_DIR/aardvark.db |
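
The fallback described for AARDVARK_DATABASE_URI can be illustrated with a short Python sketch. It is not the containers' actual startup code: `resolve_database_uri` is a hypothetical helper, and raising an error when neither variable is set is an assumption made for the example.

```python
import os


def resolve_database_uri(env=None):
    """Return the explicit database URI, or a SQLite URI under AARDVARK_DATA_DIR."""
    env = os.environ if env is None else env
    explicit = env.get("AARDVARK_DATABASE_URI")
    if explicit:
        return explicit
    data_dir = env.get("AARDVARK_DATA_DIR")
    if not data_dir:
        # Assumption: with neither variable set there is nothing sensible to return.
        raise ValueError("set AARDVARK_DATABASE_URI or AARDVARK_DATA_DIR")
    return f"sqlite:///{data_dir}/aardvark.db"
```

With an absolute data directory such as /data, the result has four slashes (sqlite:////data/aardvark.db), which is the usual SQLAlchemy form for an absolute SQLite path.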

Once this file is created, build the containers and start the services. Aardvark consists of three services:

  • Init - The init container creates the database within the storage volume.
  • API Server - This is the HTTP web server that serves the data. By default, it listens on http://localhost:5000/apidocs/#!.
  • Collector - This is a daemon that fetches and caches the data in the local SQL database. It should be run periodically.

# build the containers
docker-compose build

# start up the containers
docker-compose up

Finally, to clean up the environment:

# bring down the containers
docker-compose down

# remove the containers
docker-compose rm



Aardvark will launch the number of threads specified in the configuration. Each of these threads will retrieve Access Advisor data for an account and then persist the data.
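
The threading model described above can be sketched with the standard library. This is an illustration of the pattern, not Aardvark's actual updater: `update_accounts`, `fetch`, and `persist` are hypothetical names, and the default of 5 threads mirrors the config wizard default.

```python
import queue
import threading


def update_accounts(accounts, fetch, persist, num_threads=5):
    """Process each account on a worker thread: fetch its data, then persist it."""
    work = queue.Queue()
    for account in accounts:
        work.put(account)

    def worker():
        # Each worker keeps taking accounts until the queue is empty.
        while True:
            try:
                account = work.get_nowait()
            except queue.Empty:
                return
            persist(account, fetch(account))

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
```

Because several workers may call `persist` at the same time, a real implementation would need a thread-safe persistence layer or a lock around shared state.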


The regex query is only supported in Postgres (natively) and SQLite (via some magic courtesy of Xion in the sqla_regex file).


We recommend enabling TLS for any service. Instructions for setting up TLS are out of scope for this document.


Signals

New in v0.3.1

Aardvark uses Blinker for signals in its update process. These signals can be used for things like emitting metrics, additional logging, or taking more actions on accounts. You can use them by writing a script that defines your handlers and calls aardvark.manage.main(). For example, create a file called with the following contents:

import logging

from aardvark.manage import main
from aardvark.updater import AccountToUpdate

logger = logging.getLogger('aardvark_signals')

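# Blinker's connect() can be used as a decorator to register a handler.
@AccountToUpdate.on_ready.connect
def handle_on_ready(sender):
    logger.info(f"got on_ready from {sender}")


@AccountToUpdate.on_complete.connect
def handle_on_complete(sender):
    logger.info(f"got on_complete from {sender}")


if __name__ == "__main__":
    # Hand control to Aardvark's normal command-line entry point.
    main()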

This file can now be invoked in the same way as

python update -a cool_account

The log output will be similar to the following:

INFO: getting bucket swag-bucket
INFO: Thread #1 updating account 123456789012 with all arns
INFO: got on_ready from <aardvark.updater.AccountToUpdate object at 0x10c379b50>
INFO: got on_complete from <aardvark.updater.AccountToUpdate object at 0x10c379b50>
INFO: Thread #1 persisting data for account 123456789012
INFO: Thread #1 FINISHED persisting data for account 123456789012

Available signals

| Class | Signals |
|-------|---------|
| manage.UpdateAccountThread | on_ready, on_complete, on_failure |
| updater.AccountToUpdate | on_ready, on_complete, on_error, on_failure |