# Cloud Carbon Coefficients

This notebook calculates the coefficients used in [Cloud Carbon Footprint](https://www.cloudcarbonfootprint.org/), an application that estimates the energy (kilowatt hours) and carbon emissions (metric tons CO2e) of public cloud provider utilization.

#### Imports

In [1]:
%pip install -r requirements.txt

import csv
import numpy as np
import pandas as pd

Note: you may need to restart the kernel to use updated packages.


## Processor types

Each vendor groups its processors into types, e.g. Intel Broadwell CPUs. Cloud providers publish the CPU type for each of their instance types, but not the precise CPU model. As such, we calculate the average wattage by processor type.

In [2]:
# Loads a CSV file and returns the first column of each row as a list
def load_append_data(file_name):
    with open(f'data/{file_name}', 'r') as csvfile:
        reader = csv.reader(csvfile)

        data = []
        for row in reader:
            data.append(row[0])
        
        return data

cpus_intel_sandybridge = load_append_data('intel-sandybridge.csv')
cpus_intel_ivybridge = load_append_data('intel-ivybridge.csv')
cpus_intel_haswell = load_append_data('intel-haswell.csv')
cpus_intel_broadwell = load_append_data('intel-broadwell.csv')
cpus_intel_skylake = load_append_data('intel-skylake.csv')
cpus_intel_cascadelake = load_append_data('intel-cascadelake.csv')
cpus_intel_coffeelake = load_append_data('intel-coffeelake.csv')

## Processor lists

Now that we know which processors belong to which type, we can group all the tested servers by CPU type to calculate the average idle watts and average watts at 100% utilization (both divided by each server's total thread count), and the average GB/chip.
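As a minimal sketch of the arithmetic behind those averages, the snippet below uses plain Python and made-up rows rather than the SPECpower file. The field names and the `mean_ratio` helper are hypothetical. It takes the mean of per-server ratios (not a ratio of totals) and skips rows with missing measurements, which mirrors how pandas' `mean()` skips `NaN` values by default.

```python
from statistics import mean

# Illustrative rows only, not taken from the SPECpower results file.
# None marks a missing measurement.
sample_servers = [
    {"idle_watts": 40.0, "full_watts": 100.0, "threads": 8, "memory_gb": 16, "chips": 1},
    {"idle_watts": 60.0, "full_watts": 240.0, "threads": 40, "memory_gb": 64, "chips": 2},
    {"idle_watts": None, "full_watts": None, "threads": 40, "memory_gb": 32, "chips": 2},
]


def mean_ratio(rows, numerator, denominator):
    """Average numerator/denominator per row, skipping rows with a missing value."""
    ratios = [
        row[numerator] / row[denominator]
        for row in rows
        if row[numerator] is not None and row[denominator] is not None
    ]
    return mean(ratios)


coefficients = {
    "idle_watts": mean_ratio(sample_servers, "idle_watts", "threads"),
    "100% watts": mean_ratio(sample_servers, "full_watts", "threads"),
    "GB/Chip": mean_ratio(sample_servers, "memory_gb", "chips"),
}
print(coefficients)
```

Because each server contributes one ratio, a small single-chip machine carries the same weight as a large multi-node system.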

In [3]:
# Load all servers from SPECpower results CSV
servers = pd.read_csv('data/SPECpower-full-results.csv', na_values=['NC'])

#### Regex match

The regex for each CPU name is anchored to the end of the line with `$` because some chips have a version suffix, so a plain substring match is not enough: `Intel E3-1230` is a Sandy Bridge chip, but `Intel E3-1230 v3` is Haswell.
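The difference between an anchored match and a substring match can be sketched with the standard `re` module. This is an illustration only: the one-item name list is hypothetical, `re.search` stands in for the pandas `str.contains` call used in the notebook, and `re.escape` is an extra precaution that the notebook's own pattern does not apply.

```python
import re

# Hypothetical list of chip names for one processor type.
sandybridge_names = ["E3-1230"]

# Word boundary, then the chip name, then end of string.
pattern = re.compile(
    "|".join(rf"\b{re.escape(name)}$" for name in sandybridge_names)
)

descriptions = ["Intel E3-1230", "Intel E3-1230 v3"]

# Anchored regex: only the exact Sandy Bridge chip matches.
anchored = [d for d in descriptions if pattern.search(d)]

# Plain substring test: the Haswell "v3" chip is wrongly included.
substring = [d for d in descriptions if any(n in d for n in sandybridge_names)]

print(anchored)
print(substring)
```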

#### Clean data

The SPECpower results often append extra, unnecessary info to the `CPU Description` column. For example, `Intel Xeon E5-2470 (Intel Turbo Boost Technology up to 3.10 GHz)`. This extra info needs to be stripped, e.g. to `Intel Xeon E5-2470`, otherwise the regex match will not work.
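The notebook does not show how the source data was cleaned. One possible approach, sketched here with a hypothetical `strip_cpu_suffix` helper, is to drop a trailing parenthesised note with a regular expression. It assumes the extra info always appears as a single, non-nested pair of parentheses at the end of the description.

```python
import re


def strip_cpu_suffix(description):
    """Remove one trailing parenthesised note and any surrounding whitespace."""
    return re.sub(r"\s*\([^)]*\)\s*$", "", description).strip()


raw = "Intel Xeon E5-2470 (Intel Turbo Boost Technology up to 3.10 GHz)"
print(strip_cpu_suffix(raw))

# Descriptions without a trailing note pass through unchanged.
print(strip_cpu_suffix("Intel Xeon E3-1280 v2"))
```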

The check below will error if the data is not clean.

In [4]:
import sys

# Match case-insensitively so that both 'GHz' and 'Ghz' are caught
if len(servers[servers['CPU Description'].str.contains('GHz', case=False)]) > 0:
    print('Data not clean')
    sys.exit(1)

### Intel: Sandy Bridge

In [5]:
# Construct regex to match the chip name exactly to the end of the line
# (See notes above on regex and clean data)
cpus_re = [rf'(\b{string}$)' for string in cpus_intel_sandybridge]
servers_intel_sandybridge = servers[servers['CPU Description'].str.contains('|'.join(cpus_re))]

intel_sandybridge = {}
intel_sandybridge['idle_watts'] = (servers_intel_sandybridge['avg. watts @ active idle'].astype(float) / servers_intel_sandybridge['Total Threads']).mean()
intel_sandybridge['100% watts'] = (servers_intel_sandybridge['avg. watts @ 100%'].astype(float) / servers_intel_sandybridge['Total Threads']).mean()
intel_sandybridge['GB/Chip'] = (servers_intel_sandybridge['Total Memory (GB)'] / servers_intel_sandybridge['Chips']).mean()
intel_sandybridge

{'idle_watts': 2.1694411458333334,
 '100% watts': 8.575357663690477,
 'GB/Chip': 16.480916030534353}

### Intel: Ivy Bridge

In [6]:
# Construct regex to match the chip name exactly to the end of the line
# (See notes above on regex and clean data)
cpus_re = [rf'(\b{string}$)' for string in cpus_intel_ivybridge]
servers_intel_ivybridge = servers[servers['CPU Description'].str.contains('|'.join(cpus_re))]

intel_ivybridge = {}
intel_ivybridge['idle_watts'] = (servers_intel_ivybridge['avg. watts @ active idle'].astype(float) / servers_intel_ivybridge['Total Threads']).mean()
intel_ivybridge['100% watts'] = (servers_intel_ivybridge['avg. watts @ 100%'].astype(float) / servers_intel_ivybridge['Total Threads']).mean()
intel_ivybridge['GB/Chip'] = (servers_intel_ivybridge['Total Memory (GB)'] / servers_intel_ivybridge['Chips']).mean()
servers_intel_ivybridge

Unnamed: 0,Hardware Vendor Test Sponsor,System Enclosure (if applicable),Date Submitted,Nodes,JVM Vendor,CPU Description,MHz,Chips,Cores,Total Threads,...,avg. watts @ 100%,avg. watts @ active idle,Result (Overall ssj_ops/watt),Sandy Bridge,Ivy Bridge,Haswell,Broadwell,Skylake,Cascade Lake,Coffee Lake
427,"Hitachi, Ltd.",HA8000/RS110 (DL2),"Mar 6, 2013",1.0,IBM Corporation,Intel Xeon E3-1280 v2,3600,1,4,8,...,94.0,45.8,3744.0,False,True,False,False,False,False,False
428,"Hitachi, Ltd.",HA8000/SS10 (DL2),"Mar 20, 2013",1.0,IBM Corporation,Intel Xeon E3-1220 v2,3100,1,4,4,...,69.7,36.6,3963.0,False,True,False,False,False,False,False
429,"Hitachi, Ltd.",HA8000/TS10 (DL2),"Mar 20, 2013",1.0,IBM Corporation,Intel Xeon E3-1280 v2,3600,1,4,8,...,85.2,37.2,4342.0,False,True,False,False,False,False,False
445,Fujitsu,PRIMERGY RX300 S8 (Intel Xeon E5-2660 v2),"Oct 2, 2013",1.0,IBM Corporation,Intel Xeon E5-2660 v2,2200,2,20,40,...,,,,False,True,False,False,False,False,False
446,Fujitsu,PRIMERGY RX350 S8 (Intel Xeon E5-2660 v2),"Oct 2, 2013",1.0,IBM Corporation,Intel Xeon E5-2660 v2,2200,2,20,40,...,,,,False,True,False,False,False,False,False
447,Fujitsu,PRIMERGY TX300 S8 (Intel Xeon E5-2660 v2),"Oct 2, 2013",1.0,IBM Corporation,Intel Xeon E5-2660 v2,2200,2,20,40,...,,,,False,True,False,False,False,False,False
448,Fujitsu,PRIMERGY RX200 S8 (Intel Xeon E5-2660 v2),"Oct 2, 2013",1.0,IBM Corporation,Intel Xeon E5-2660 v2,2200,2,20,40,...,,,,False,True,False,False,False,False,False
449,IBM Corporation,IBM NeXtScale nx360 M4 IBM NeXtScale n1200 Enc...,"Oct 2, 2013",12.0,IBM Corporation,Intel Xeon E5-2660 v2,2200,24,240,480,...,,,,False,True,False,False,False,False,False
450,IBM Corporation,IBM System x3650 M4,"Oct 2, 2013",1.0,IBM Corporation,Intel Xeon E5-2660 v2,2200,2,20,40,...,,,,False,True,False,False,False,False,False
451,IBM Corporation,IBM System x3500 M4,"Oct 2, 2013",1.0,IBM Corporation,Intel Xeon E5-2660 v2,2200,2,20,40,...,,,,False,True,False,False,False,False,False


### Intel: Haswell

In [7]:
# Construct regex to match the chip name exactly to the end of the line
# (See notes above on regex and clean data)
cpus_re = [rf'(\b{string}$)' for string in cpus_intel_haswell]
servers_intel_haswell = servers[servers['CPU Description'].str.contains('|'.join(cpus_re))]

intel_haswell = {}
intel_haswell['idle_watts'] = (servers_intel_haswell['avg. watts @ active idle'].astype(float) / servers_intel_haswell['Total Threads']).mean()
intel_haswell['100% watts'] = (servers_intel_haswell['avg. watts @ 100%'].astype(float) / servers_intel_haswell['Total Threads']).mean()
intel_haswell['GB/Chip'] = (servers_intel_haswell['Total Memory (GB)'] / servers_intel_haswell['Chips']).mean()
servers_intel_haswell

Unnamed: 0,Hardware Vendor Test Sponsor,System Enclosure (if applicable),Date Submitted,Nodes,JVM Vendor,CPU Description,MHz,Chips,Cores,Total Threads,...,avg. watts @ 100%,avg. watts @ active idle,Result (Overall ssj_ops/watt),Sandy Bridge,Ivy Bridge,Haswell,Broadwell,Skylake,Cascade Lake,Coffee Lake
444,Fujitsu,PRIMERGY TX140 S2 (Intel Xeon E3-1265Lv3),"Oct 2, 2013",1.0,IBM Corporation,Intel Xeon E3-1265L v3,2500,1,4,8,...,58.5,19.0,6797.0,False,False,True,False,False,False,False
458,Fujitsu,PRIMERGY RX100 S8 (Intel Xeon E3-1265Lv3),"Nov 13, 2013",1.0,IBM Corporation,Intel Xeon E3-1265L v3,2500,1,4,8,...,69.8,17.7,6137.0,False,False,True,False,False,False,False
470,Fujitsu,FUJITSU Server PRIMERGY RX1330 M1,"Aug 22, 2014",1.0,IBM Corporation,Intel Xeon E3-1275L v3,2500,1,4,8,...,,,,False,False,True,False,False,False,False
472,Fujitsu,FUJITSU Server PRIMERGY RX2540 M1,"Oct 21, 2014",1.0,IBM Corporation,Intel Xeon E5-2699 v3,2300,2,36,72,...,271.0,39.0,10654.0,False,False,True,False,False,False,False
473,"Huawei Technologies Co., Ltd",XH628 V3 Huawei FusionServer X6800,"Oct 21, 2014",3.0,IBM Corporation,Intel Xeon E5-2699 v3,2300,6,108,216,...,,,,False,False,True,False,False,False,False
475,Inspur Corporation,NF5280M4,"Nov 12, 2014",1.0,IBM Corporation,Intel Xeon E5-2699 v3,2300,2,36,72,...,,,,False,False,True,False,False,False,False
477,Fujitsu,FUJITSU Server PRIMERGY CX2550 M1 PRIMERGY CX4...,"Dec 10, 2014",4.0,IBM Corporation,Intel Xeon E5-2699 v3,2300,8,144,288,...,1152.0,171.0,9971.0,False,False,True,False,False,False,False
478,Quanta Computer Inc.,QuantaGrid D51B-2U,"Dec 10, 2014",1.0,IBM Corporation,Intel Xeon E5-2699 v3,2300,2,36,72,...,272.0,49.6,10238.0,False,False,True,False,False,False,False
480,Fujitsu,FUJITSU Server PRIMERGY TX1320 M1,"Feb 5, 2015",1.0,IBM Corporation,Intel Xeon E3-1275L v3,2700,1,4,8,...,60.0,13.3,7535.0,False,False,True,False,False,False,False
481,Fujitsu,FUJITSU Server PRIMERGY RX2530 M1,"Feb 5, 2015",1.0,IBM Corporation,Intel Xeon E5-2699 v3,2300,2,36,72,...,289.0,47.3,9811.0,False,False,True,False,False,False,False


### Intel: Broadwell

In [8]:
# Construct regex to match the chip name exactly to the end of the line
# (See notes above on regex and clean data)
cpus_re = [rf'(\b{string}$)' for string in cpus_intel_broadwell]
servers_intel_broadwell = servers[servers['CPU Description'].str.contains('|'.join(cpus_re))]

intel_broadwell = {}
intel_broadwell['idle_watts'] = (servers_intel_broadwell['avg. watts @ active idle'].astype(float) / servers_intel_broadwell['Total Threads']).mean()
intel_broadwell['100% watts'] = (servers_intel_broadwell['avg. watts @ 100%'].astype(float) / servers_intel_broadwell['Total Threads']).mean()
intel_broadwell['GB/Chip'] = (servers_intel_broadwell['Total Memory (GB)'] / servers_intel_broadwell['Chips']).mean()
intel_broadwell

{'idle_watts': 0.5256774475524476,
 '100% watts': 3.048841783216784,
 'GB/Chip': 46.76923076923077}

### Intel: Skylake

In [9]:
# Construct regex to match the chip name exactly to the end of the line
# (See notes above on regex and clean data)
cpus_re = [rf'(\b{string}$)' for string in cpus_intel_skylake]
servers_intel_skylake = servers[servers['CPU Description'].str.contains('|'.join(cpus_re))]

intel_skylake = {}
intel_skylake['idle_watts'] = (servers_intel_skylake['avg. watts @ active idle'].astype(float) / servers_intel_skylake['Total Threads']).mean()
intel_skylake['100% watts'] = (servers_intel_skylake['avg. watts @ 100%'].astype(float) / servers_intel_skylake['Total Threads']).mean()
intel_skylake['GB/Chip'] = (servers_intel_skylake['Total Memory (GB)'] / servers_intel_skylake['Chips']).mean()
intel_skylake

{'idle_watts': 0.6522727665317579,
 '100% watts': 4.255505654535773,
 'GB/Chip': 81.32432432432432}

### Intel: Cascade Lake

In [10]:
# Construct regex to match the chip name exactly to the end of the line
# (See notes above on regex and clean data)
cpus_re = [rf'(\b{string}$)' for string in cpus_intel_cascadelake]
servers_intel_cascadelake = servers[servers['CPU Description'].str.contains('|'.join(cpus_re))]

intel_cascadelake = {}
intel_cascadelake['idle_watts'] = (servers_intel_cascadelake['avg. watts @ active idle'].astype(float) / servers_intel_cascadelake['Total Threads']).mean()
intel_cascadelake['100% watts'] = (servers_intel_cascadelake['avg. watts @ 100%'].astype(float) / servers_intel_cascadelake['Total Threads']).mean()
intel_cascadelake['GB/Chip'] = (servers_intel_cascadelake['Total Memory (GB)'] / servers_intel_cascadelake['Chips']).mean()
intel_cascadelake

{'idle_watts': 0.6389493581523519,
 '100% watts': 3.9673047343937564,
 'GB/Chip': 98.11764705882354}

### Intel: Coffee Lake

In [11]:
# Construct regex to match the chip name exactly to the end of the line
# (See notes above on regex and clean data)
cpus_re = [rf'(\b{string}$)' for string in cpus_intel_coffeelake]
servers_intel_coffeelake = servers[servers['CPU Description'].str.contains('|'.join(cpus_re))]

intel_coffeelake = {}
intel_coffeelake['idle_watts'] = (servers_intel_coffeelake['avg. watts @ active idle'].astype(float) / servers_intel_coffeelake['Total Threads']).mean()
intel_coffeelake['100% watts'] = (servers_intel_coffeelake['avg. watts @ 100%'].astype(float) / servers_intel_coffeelake['Total Threads']).mean()
intel_coffeelake['GB/Chip'] = (servers_intel_coffeelake['Total Memory (GB)'] / servers_intel_coffeelake['Chips']).mean()
intel_coffeelake

{'idle_watts': 1.138425925925926,
 '100% watts': 5.421759259259258,
 'GB/Chip': 19.555555555555557}
