# 3. Electrical & Electronic Equipment: key description, market introduction and average weight change

This dataset contains the United Nations University (UNU) descriptions of electrical and electronic equipment (EEE) types, the year each type was introduced to the market, and its average weight by year. (Source: https://github.com/Statistics-Netherlands/ewaste)

Relevant variables include:

1. **UNU_Key**: key identifying an EEE item type, as defined by the United Nations University (UNU).
2. **Description**: description of the EEE item type associated with the UNU_Key.
3. **IntroductionYear**: year in which the EEE item type was introduced to the market.
4. **AverageWeight**: average weight of a given EEE item in a given year.
5. **Year**: year to which the average weight of a given EEE item applies.

A new data frame is created from the raw data. The relevant variables listed above are analyzed and cleaned where necessary, and the columns are renamed to avoid conflicts with other data frames. The final data frame is then saved to a CSV file.
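
As a rough illustration of that workflow, the sketch below combines three tiny stand-in tables and serializes the result. The join strategy (left joins on `eee_key`) and the name `eee` are assumptions made for this example, not taken from the notebook; the values for key 306 are copied from outputs shown later in this notebook.

```python
import pandas as pd

# Tiny stand-ins for the three cleaned tables (values for key 306 only).
description = pd.DataFrame({
    "eee_key": ["306"],
    "eee_description": ["Mobile Phones (incl. smartphones, pagers)"],
})
intro = pd.DataFrame({"eee_key": ["306"], "eee_intro_year": ["1985"]})
weight = pd.DataFrame({
    "eee_key": ["306", "306"],
    "eee_avg_weight": [0.09, 0.09],
    "year": ["2009", "2010"],
})

# Assumption: the tables are combined with left joins on eee_key,
# keeping one row per (eee_key, year) pair from the weight table.
eee = (
    weight.merge(intro, on="eee_key", how="left")
          .merge(description, on="eee_key", how="left")
)

# With no path argument, to_csv returns the CSV text;
# pass a file path instead to write the file to disk.
csv_text = eee.to_csv(index=False)
print(csv_text)
```

Keeping `eee_key` as a string in all three tables matters here, since a merge on columns with mismatched dtypes (string versus integer) would not line up as intended.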

### Data loading and cleaning

In [1]:
import pandas as pd
import numpy as np

#### 1. UNU key description 

In [2]:
# Import raw data

description = pd.read_csv("../data/raw/3_htbl_Key_Description.csv")

In [3]:
# Rename columns for easier handling

description = description.rename(columns={"UNU_Key": "eee_key", "Description": "eee_description"})

In [4]:
description.dtypes

eee_key            object
eee_description    object
dtype: object

In [5]:
# Check for missing values

description.isnull().sum()

eee_key            0
eee_description    0
dtype: int64

In [6]:
# Drop rows whose eee_key contains alphabetic characters

description = description[~description["eee_key"].str.contains("[A-Za-z]")]
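
To make the effect of this filter concrete, here is a minimal sketch on a toy table. The keys and labels are invented for illustration and are not claimed to match the raw file.

```python
import pandas as pd

# Toy stand-in for the description table; all values are hypothetical.
toy = pd.DataFrame({
    "eee_key": ["305", "306", "A1", "UNU"],
    "eee_description": ["Item one", "Item two", "Letter-coded row", "Label row"],
})

# True where the key contains at least one ASCII letter.
has_letters = toy["eee_key"].str.contains("[A-Za-z]", regex=True)

# Keep only rows whose key has no letters.
numeric_only = toy[~has_letters].reset_index(drop=True)
print(numeric_only["eee_key"].tolist())
```

The mask can be negated directly because the column has no missing values, as checked earlier; with missing keys, passing `na=False` to `str.contains` would be one way to keep the mask boolean.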

In [7]:
# Locate entries with "phone" in eee_description 

description.loc[description["eee_description"].str.contains("phone")]

Unnamed: 0,eee_key,eee_description
23,305,Telecommunication equipment (e.g. (cordless) p...
24,306,"Mobile Phones (incl. smartphones, pagers)"
28,401,"Small Consumer Electronics (e.g. headphones, r..."


In [8]:
# We can see that mobile phones are associated with eee_key = 306

#### 2. EEE introduction year in the market 

In [9]:
# Import raw data

intro = pd.read_csv("../data/raw/4_htbl_Key_IntroductionYear.csv")

In [10]:
# Rename columns for easier handling

intro = intro.rename(columns={"UNU_Key": "eee_key", "IntroductionYear": "eee_intro_year"})

In [11]:
intro.dtypes

eee_key           int64
eee_intro_year    int64
dtype: object

In [12]:
# Convert eee_key and eee_intro_year to strings

intro["eee_key"] = intro["eee_key"].astype(str)
intro["eee_intro_year"] = intro["eee_intro_year"].astype(str)

In [13]:
intro.dtypes

eee_key           object
eee_intro_year    object
dtype: object

In [14]:
# Check for missing values

intro.isnull().sum()

eee_key           0
eee_intro_year    0
dtype: int64

In [15]:
intro

Unnamed: 0,eee_key,eee_intro_year
0,1,1960
1,2,1990
2,101,1960
3,102,1970
4,103,1960
5,104,1960
6,105,1978
7,106,1970
8,108,1960
9,109,1974


In [16]:
# There are entries for 54 eee_keys

In [17]:
# Locate the entry with eee_key = 306 (mobile phones); exact match rather than substring match

intro.loc[intro["eee_key"] == "306"]

Unnamed: 0,eee_key,eee_intro_year
24,306,1985
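
Because `eee_key` is stored as a string, a key can be selected either by exact equality or by substring search, and the two can disagree. The sketch below shows the difference on invented keys; "1306" and "3061" are hypothetical and are not claimed to exist in the real table.

```python
import pandas as pd

# Invented keys for illustration only.
toy = pd.DataFrame({
    "eee_key": ["306", "1306", "3061", "305"],
    "eee_intro_year": ["1985", "1990", "1995", "1980"],
})

# Substring search: matches any key that contains "306" anywhere.
substring_hits = toy.loc[toy["eee_key"].str.contains("306")]

# Exact comparison: matches only the key "306".
exact_hits = toy.loc[toy["eee_key"] == "306"]

print(len(substring_hits), len(exact_hits))
```

For the table shown here both approaches return the same single row, but exact comparison stays correct if longer keys containing "306" are ever added.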


In [18]:
# We can see that mobile phones were introduced into the market in 1985.
# It would be interesting to analyze the time series for mobile phones.

#### 3. EEE average weight over the years

In [19]:
# Import raw data

weight = pd.read_csv("../data/raw/5_htbl_Key_Weight.csv")

In [20]:
# Rename columns for easier handling and drop the unused Country column

weight = weight.rename(columns={"UNU_Key": "eee_key",
                              "AverageWeight": "eee_avg_weight", 
                              "Year": "year"}).drop(["Country"], axis=1)

In [21]:
weight.dtypes

eee_key             int64
eee_avg_weight    float64
year                int64
dtype: object

In [22]:
# Convert eee_key and year to strings

weight["eee_key"] = weight["eee_key"].astype(str)
weight["year"] = weight["year"].astype(str)

In [23]:
weight.dtypes

eee_key            object
eee_avg_weight    float64
year               object
dtype: object

In [24]:
# Check for missing values

weight.isnull().sum()

eee_key           0
eee_avg_weight    0
year              0
dtype: int64

In [25]:
weight["eee_key"].nunique()

54

In [26]:
weight["year"].unique()

array(['2014', '2009', '2010', '2006', '2007', '2008', '1993', '1994',
       '1995', '1996', '1997', '1998', '1999', '2000', '2001', '2002',
       '2003', '2004', '2005', '2011', '2012', '2013', '2015', '2016',
       '2017', '2018', '2019', '2020', '2021', '1992', '1991', '1990',
       '1989', '1988', '1987', '1986', '1985', '1984', '1983', '1982',
       '1981', '1980'], dtype=object)

In [27]:
# There are 54 unique eee_key values and 42 distinct years (1980 to 2021)
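
Since the years are stored as strings and appear in arbitrary order, one way to confirm the covered range and spot gaps is to convert them to integers and compare against a full range. This is a minimal sketch on a hypothetical excerpt in which 2012 is deliberately left out.

```python
import pandas as pd

# Hypothetical excerpt: string years in arbitrary order, with 2012 missing on purpose.
years = pd.Series(["2014", "2009", "2010", "2011", "2013", "2009"])

# Unique years as sorted plain Python integers.
as_int = sorted(int(y) for y in years.unique())

# Every year between the first and last observed year.
expected = list(range(as_int[0], as_int[-1] + 1))

# Years inside the range that never appear in the data.
missing = sorted(set(expected) - set(as_int))

print(as_int[0], as_int[-1], missing)
```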

In [28]:
# Locate entries with eee_key = 306 (mobile phones); exact match rather than substring match

weight.loc[weight["eee_key"] == "306"]

Unnamed: 0,eee_key,eee_avg_weight,year
22,306,0.09,2009
72,306,0.09,2010
353,306,0.12,1993
354,306,0.12,1994
355,306,0.12,1995
356,306,0.12,1996
357,306,0.11,1997
358,306,0.11,1998
359,306,0.11,1999
360,306,0.11,2000
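
The rows above are not in chronological order, so a time series view needs sorting first. The sketch below uses a few rows copied from this output, converts `year` to integers for sorting, and indexes the weights by year. The variable names `phones` and `series` are introduced only for this example.

```python
import pandas as pd

# A few rows copied from the output above (eee_key 306).
phones = pd.DataFrame({
    "eee_key": ["306"] * 5,
    "eee_avg_weight": [0.09, 0.09, 0.12, 0.11, 0.11],
    "year": ["2009", "2010", "1993", "1997", "2000"],
})

# Convert year to int so that sorting is numeric, then index the weights by year.
series = (
    phones.assign(year=phones["year"].astype(int))
          .sort_values("year")
          .set_index("year")["eee_avg_weight"]
)
print(series)
```

From here, something like `series.plot()` could be used to visualize the trend, assuming matplotlib is installed.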


#### 4. 