EMR cost utility.

This utility is greatly inspired by and some of its forks; in particular, the EMRPriceMeta class was adapted from that earlier work.

Authentication relies on the local AWS context, such as the credentials and configuration set up through the official AWS CLI.


% pip install git+
% # or
% python -m pip install git+

Note: you may need to switch to pip3 or python3 depending on your OS. You can also use pipenv (or other similar tools) in place of the above pip example.


You can get help by running:

% emr-cost --help

CLI Examples

To get the costs of EMR clusters that ran ('ClusterStates': ['TERMINATED']) in the current month so far, and pretty-print the output to the shell/console:

% emr-cost
  2%|█▎                                                          | 18/795 [01:48<1:05:46,  5.08s/it]

# once the run is done, the output is printed here and looks like
 {'ClusterId': 'total',
  'CreationDateTime': None,
  'EbsBlockDevices': None,
  'EndDateTime': None,
  'InstanceGroupType': None,
  'InstanceType': None,
  'Market': None,
  'Name': None,
  'cost_ebs': 93,
  'cost_ec2': 321,
  'cost_emr': 96}]

To get the costs of EMR clusters that ran in the month of July 2021, and write the output to a CSV file (2021-07.csv):

% emr-cost --month 2021-07-01 --output 2021-07.csv
 11%|██████▍                                                   | 134/1206 [06:35<1:06:47,  3.74s/it]
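How the tool turns the --month value into a date range is not shown here. As a rough sketch, assuming the value is an ISO date and the whole calendar month containing it is wanted, a half-open window could be derived as follows (month_window is a hypothetical helper, not part of emr-cost):

```python
from datetime import date, datetime, timedelta
from typing import Tuple


def month_window(month: str) -> Tuple[datetime, datetime]:
    """Return (start, end) datetimes for the calendar month containing `month`.

    `month` is an ISO date string such as "2021-07-01". The window is
    half-open: start is inclusive, end (first day of next month) is exclusive.
    Hypothetical helper for illustration only.
    """
    first = date.fromisoformat(month).replace(day=1)
    # Adding 31 days to the 1st always lands in the next month;
    # snapping back to day 1 gives the start of that next month.
    next_first = (first + timedelta(days=31)).replace(day=1)
    start = datetime(first.year, first.month, first.day)
    end = datetime(next_first.year, next_first.month, next_first.day)
    return start, end


start, end = month_window("2021-07-01")
print(start.isoformat(), end.isoformat())
```

A half-open window avoids having to know how many days the month has, and December rolls over into January of the following year without special handling.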

To get the cost of a specific EMR cluster by ClusterId:

% emr-cost --cluster_id <REDACTED>
100%|█████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  1.65it/s]
[{'ClusterId': '<REDACTED>',
  'CreationDateTime': datetime.datetime(2021, 8, 11, 10, 1, 33, 557000, tzinfo=tzlocal()),
  'EbsBlockDevices': 500,
  'EndDateTime': datetime.datetime(2021, 8, 11, 15, 29, 0, 448000, tzinfo=tzlocal()),
  'InstanceGroupType': 'CORE',
  'InstanceType': 'r4.xlarge',
  'Market': 'ON_DEMAND',
  'Name': '<REDACTED>',
  'cost_ebs': 0.3789909529320988,
  'cost_ec2': 1.4516869461111113,
  'cost_emr': 0.3656504713888889},
 {'ClusterId': 'total',
  'CreationDateTime': None,
  'EbsBlockDevices': None,
  'EndDateTime': None,
  'InstanceGroupType': None,
  'InstanceType': None,
  'Market': None,
  'Name': None,
  'cost_ebs': 2.273945717592593,
  'cost_ec2': 8.349928675000001,
  'cost_emr': 2.1557005402777776}]
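Each row reports EBS, EC2, and EMR costs separately. If a single figure per row is wanted, the three fields can be added up after the fact. The sketch below is only an illustration of that post-processing, using the cost values from the output above; combined_cost is a hypothetical helper, not something emr-cost provides.

```python
# Cost fields copied from the example output above; other fields are omitted.
rows = [
    {
        "ClusterId": "<REDACTED>",
        "cost_ebs": 0.3789909529320988,
        "cost_ec2": 1.4516869461111113,
        "cost_emr": 0.3656504713888889,
    },
    {
        "ClusterId": "total",
        "cost_ebs": 2.273945717592593,
        "cost_ec2": 8.349928675000001,
        "cost_emr": 2.1557005402777776,
    },
]

COST_KEYS = ("cost_ebs", "cost_ec2", "cost_emr")


def combined_cost(row: dict) -> float:
    """Sum the three cost components of one output row (hypothetical helper)."""
    return sum(row[key] for key in COST_KEYS)


for row in rows:
    print(f"{row['ClusterId']}: {combined_cost(row):.2f}")
```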

To get the costs of a batch of clusters listed in the text file third.txt (one ClusterId per line), and write the output to a JSON file (third.json):

% emr-cost --batch third.txt --output third.json
 16%|██████████▎                                                     | 5/31 [00:07<00:22,  1.18it/s]
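The batch file is plain text with one ClusterId per line. A minimal sketch of reading such a file is shown below; read_cluster_ids is a hypothetical helper, the IDs are made up, and skipping blank lines is an assumption rather than documented emr-cost behavior.

```python
import os
import tempfile
from pathlib import Path
from typing import List


def read_cluster_ids(path: str) -> List[str]:
    """Read one ClusterId per line, ignoring blank lines and surrounding whitespace.

    Hypothetical helper for illustration; not part of emr-cost.
    """
    ids = []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if line:
            ids.append(line)
    return ids


# Write a small example batch file with made-up IDs, then read it back.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write("j-EXAMPLE1\n\n  j-EXAMPLE2  \nj-EXAMPLE3\n")
    batch_path = handle.name

print(read_cluster_ids(batch_path))
os.remove(batch_path)
```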


WIP. See the library and its CLI application for now.