# Data science: CHIP Dataset Analysis

This notebook analyzes a dataset of 2185 CPUs and 2668 GPUs from https://chip-dataset.vercel.app/.

# Index

<ol>
  <li>Import libraries</li>
</ol>

## Import libraries

In this section we import the libraries used throughout the notebook.

In [1]:
import numpy as np # efficient operations on NumPy arrays
import pandas as pd # operations on dataframes
import math as mt # mathematical functions
import matplotlib.pyplot as plt # plotting

### Read data
We first read the .csv file, then display the first rows of the dataset to see how it is structured, using the pandas `head` function.

In [16]:
chips = pd.read_csv('data/chip_dataset.csv', delimiter=',', index_col=0)
chips.head()

Unnamed: 0,Product,Type,Release Date,Process Size (nm),TDP (W),Die Size (mm^2),Transistors (million),Freq (MHz),Foundry,Vendor,FP16 GFLOPS,FP32 GFLOPS,FP64 GFLOPS
0,AMD Athlon 64 3500+,CPU,2007-02-20,65.0,45.0,77.0,122.0,2200.0,Unknown,AMD,,,
1,AMD Athlon 200GE,CPU,2018-09-06,14.0,35.0,192.0,4800.0,3200.0,Unknown,AMD,,,
2,Intel Core i5-1145G7,CPU,2020-09-02,10.0,28.0,,,2600.0,Intel,Intel,,,
3,Intel Xeon E5-2603 v2,CPU,2013-09-01,22.0,80.0,160.0,1400.0,1800.0,Intel,Intel,,,
4,AMD Phenom II X4 980 BE,CPU,2011-05-03,45.0,125.0,258.0,758.0,3700.0,Unknown,AMD,,,
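Besides `head`, the column types are worth checking right after loading. The sketch below is a minimal illustration, not part of the original notebook: it reads three of the rows shown above from an in-memory string (the name `csv_text` is made up for the example) and asks `read_csv` to parse `Release Date` as a date. It assumes the dates are ISO-formatted, as they are in the rows displayed; the full file may need extra handling.

```python
import io
import pandas as pd

# Stand-in for data/chip_dataset.csv: three rows copied from the output above.
# The first (unnamed) column is the row index, as in the real file.
csv_text = """,Product,Type,Release Date,Process Size (nm),TDP (W),Die Size (mm^2),Transistors (million),Freq (MHz),Foundry,Vendor,FP16 GFLOPS,FP32 GFLOPS,FP64 GFLOPS
0,AMD Athlon 64 3500+,CPU,2007-02-20,65.0,45.0,77.0,122.0,2200.0,Unknown,AMD,,,
1,AMD Athlon 200GE,CPU,2018-09-06,14.0,35.0,192.0,4800.0,3200.0,Unknown,AMD,,,
2,Intel Core i5-1145G7,CPU,2020-09-02,10.0,28.0,,,2600.0,Intel,Intel,,,
"""

# index_col=0 uses the first column as the index;
# parse_dates converts the release date strings to datetime values.
sample = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=["Release Date"])

print(sample.dtypes)  # one dtype per column
print(sample.shape)   # (rows, columns)
```

With the date column parsed, expressions such as `sample["Release Date"].dt.year` become available for grouping chips by release year.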


<br />

## Data analysis

From this point onward, we run some analyses on the dataset.

In [22]:
print('Total number of chips in this dataset is:', chips.Product.count())

Total number of chips in this dataset is: 4854
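The total can also be broken down by chip type and compared with the CPU and GPU figures quoted at the top of the notebook. The following is a small sketch on a hand-built sample (the dataframe `sample` is a stand-in made from rows shown in this notebook); on the real data the same call would be `chips['Type'].value_counts()`.

```python
import pandas as pd

# Stand-in sample: product names and types copied from rows shown in this notebook.
sample = pd.DataFrame({
    "Product": [
        "AMD Athlon 64 3500+",
        "Intel Core i5-1145G7",
        "Intel GMA 950",
        "NVIDIA GeForce GT 320M",
    ],
    "Type": ["CPU", "CPU", "GPU", "GPU"],
})

# Number of rows per chip type.
type_counts = sample["Type"].value_counts()
print(type_counts)

# The per-type counts should add up to the total number of rows.
print(type_counts.sum() == len(sample))
```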


In [33]:
# True for each row whose Product name already appeared in an earlier row
index = chips.duplicated(subset=['Product'])
chips[index]

Unnamed: 0,Product,Type,Release Date,Process Size (nm),TDP (W),Die Size (mm^2),Transistors (million),Freq (MHz),Foundry,Vendor,FP16 GFLOPS,FP32 GFLOPS,FP64 GFLOPS
24,AMD Athlon 64 3500+,CPU,2001-01-01,90.0,67.0,115.0,105.0,2200.0,Unknown,AMD,,,
77,AMD Athlon 64 3500+,CPU,2005-05-31,90.0,67.0,156.0,154.0,2200.0,Unknown,AMD,,,
110,AMD Athlon 64 3500+,CPU,2007-02-20,90.0,62.0,230.0,227.0,2200.0,Unknown,AMD,,,
122,AMD Athlon 64 3000+,CPU,2005-04-14,130.0,62.0,,105.0,1800.0,Unknown,AMD,,,
146,Intel Pentium 4 506,CPU,2005-06-01,90.0,84.0,109.0,125.0,2660.0,Intel,Intel,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...
4834,ATI Mobility Radeon 9200,GPU,2003-03-01,150.0,,81.0,36.0,200.0,UMC,ATI,,,
4838,NVIDIA GeForce 6800 GS,GPU,2005-12-08,130.0,,287.0,222.0,350.0,TSMC,NVIDIA,,,
4850,Intel GMA 950,GPU,2005-06-01,90.0,7.0,,,250.0,Intel,Intel,,,
4851,NVIDIA GeForce GT 320M,GPU,2010-03-03,40.0,23.0,100.0,486.0,500.0,TSMC,NVIDIA,,52.8,
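By default `duplicated` flags only the second and later occurrences of a name, so the first row of each group is missing from the listing above. Passing `keep=False` flags every member of a group, which makes it easier to compare the rows side by side. The sketch below uses a stand-in sample built from rows shown in this notebook; it also shows that rows sharing a product name are not necessarily identical in the other columns.

```python
import pandas as pd

# Stand-in sample: values copied from rows 0, 1, 24 and 77 shown above.
sample = pd.DataFrame(
    {
        "Product": [
            "AMD Athlon 64 3500+",
            "AMD Athlon 200GE",
            "AMD Athlon 64 3500+",
            "AMD Athlon 64 3500+",
        ],
        "Release Date": ["2007-02-20", "2018-09-06", "2001-01-01", "2005-05-31"],
        "Process Size (nm)": [65.0, 14.0, 90.0, 90.0],
    },
    index=[0, 1, 24, 77],
)

# Default keep="first": the first occurrence of each name is not flagged.
later_copies = sample.duplicated(subset=["Product"])

# keep=False: every row whose name occurs more than once is flagged.
all_copies = sample.duplicated(subset=["Product"], keep=False)

# No subset: a row is flagged only if all of its columns repeat an earlier row.
full_row_dupes = sample.duplicated()

print(sample[all_copies])
print(later_copies.sum(), all_copies.sum(), full_row_dupes.sum())  # prints: 2 3 0
```

In this sample the three "AMD Athlon 64 3500+" rows differ in release date and process size, so a shared name looks more like several variants of one product than a repeated record. Whether that holds across the full dataset would need to be checked before dropping anything.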


In [28]:
index = (chips.Foundry == 'Unknown')
chips[index]

Unnamed: 0,Product,Type,Release Date,Process Size (nm),TDP (W),Die Size (mm^2),Transistors (million),Freq (MHz),Foundry,Vendor,FP16 GFLOPS,FP32 GFLOPS,FP64 GFLOPS
0,AMD Athlon 64 3500+,CPU,2007-02-20,65.0,45.0,77.0,122.0,2200.0,Unknown,AMD,,,
1,AMD Athlon 200GE,CPU,2018-09-06,14.0,35.0,192.0,4800.0,3200.0,Unknown,AMD,,,
4,AMD Phenom II X4 980 BE,CPU,2011-05-03,45.0,125.0,258.0,758.0,3700.0,Unknown,AMD,,,
6,AMD Phenom X4 9750 (125W),CPU,2008-03-27,65.0,125.0,285.0,450.0,2400.0,Unknown,AMD,,,
9,AMD Athlon 64 X2 4200+,CPU,2006-05-23,90.0,89.0,156.0,154.0,2200.0,Unknown,AMD,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...
4739,ATI Radeon HD 4290 IGP,GPU,2010-03-01,55.0,,67.0,181.0,500.0,Unknown,ATI,,40.0,
4763,ATI Radeon Xpress 2100 IGP,GPU,2008-03-04,65.0,,85.0,180.0,500.0,Unknown,ATI,,40.0,
4806,NVIDIA GeForce 320M Mac Edition,GPU,2010-04-01,40.0,23.0,100.0,486.0,450.0,Unknown,NVIDIA,,91.2,
4816,NVIDIA GeForce 9400M,GPU,2008-10-15,65.0,12.0,144.0,314.0,580.0,Unknown,NVIDIA,,44.8,


In [20]:
chips.isnull().sum()

Product                     0
Type                        0
Release Date                0
Process Size (nm)           9
TDP (W)                   626
Die Size (mm^2)           715
Transistors (million)     711
Freq (MHz)                  0
Foundry                     0
Vendor                      0
FP16 GFLOPS              4318
FP32 GFLOPS              2906
FP64 GFLOPS              3548
dtype: int64
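Raw counts of missing values are easier to interpret as percentages, and split by chip type. In the rows displayed earlier, the CPU entries have no FLOPS values at all, so part of the missing data may simply reflect the CPU/GPU split rather than gaps in the records. The sketch below illustrates the idea on a stand-in sample built from rows shown in this notebook; it is an illustration, not a result for the full dataset.

```python
import numpy as np
import pandas as pd

# Stand-in sample: values copied from rows 0, 2, 4834 and 4851 shown above.
sample = pd.DataFrame({
    "Type": ["CPU", "CPU", "GPU", "GPU"],
    "TDP (W)": [45.0, 28.0, np.nan, 23.0],
    "FP32 GFLOPS": [np.nan, np.nan, np.nan, 52.8],
})

# Percentage of missing values per column (mean of a boolean mask = fraction of True).
missing_pct = sample.isnull().mean() * 100
print(missing_pct)

# Same percentage, computed separately for CPUs and GPUs.
missing_by_type = sample.drop(columns="Type").isnull().groupby(sample["Type"]).mean() * 100
print(missing_by_type)
```

On the real dataframe, replacing `sample` with `chips` gives the same two views for all columns.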