# 2. e-waste data

This dataset reports the amount of Waste Electrical & Electronic Equipment (WEEE) generated (source: https://github.com/Statistics-Netherlands/ewaste). WEEE is referred to as ***e-waste*** below for simplicity.

Relevant variables include:

1. **Country**: three-letter code of each country
2. **UNU_Key**: key identifying the type of electronic item, as defined by the United Nations University (UNU)
3. **Year**: year for the e-waste entry
4. **WEEE_t**: amount of e-waste generated in the country, in tonnes 
5. **WEEE_pieces**: amount of e-waste generated in the country, in number of pieces
6. **Inhabitants**: total population of a given country in a given year
7. **kpi**: amount of e-waste per inhabitant, in kg per inhabitant
8. **ppi**: amount of e-waste per inhabitant, in pieces per inhabitant
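
The two per-inhabitant indicators appear to be derived from the other columns: in the first rows printed by `ewaste.head()` later in this notebook, kpi matches WEEE_t × 1000 / Inhabitants and ppi matches WEEE_pieces / Inhabitants. The following is a minimal sketch of that consistency check on the first Austrian row, assuming those relations hold for the whole file; the helper name `per_inhabitant` is hypothetical and not part of the original notebook.

```python
import pandas as pd


def per_inhabitant(df: pd.DataFrame) -> pd.DataFrame:
    """Recompute kpi and ppi from the raw columns (hypothetical helper)."""
    out = pd.DataFrame(index=df.index)
    # tonnes -> kg, then divide by population
    out["kpi"] = df["WEEE_t"] * 1000 / df["Inhabitants"]
    # pieces divided by population
    out["ppi"] = df["WEEE_pieces"] / df["Inhabitants"]
    return out


# First row of the data as printed by ewaste.head() (AUT, key 1, 1980)
sample = pd.DataFrame(
    {"WEEE_t": [13.658216], "WEEE_pieces": [443], "Inhabitants": [7540000.0]}
)
check = per_inhabitant(sample)
```

Comparing `check` against the stored `kpi` and `ppi` columns (for example with `numpy.isclose`) would flag rows where the indicators and the raw amounts disagree.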


A new data frame is created from the raw data. The relevant variables listed above are inspected and cleaned where necessary, and the columns are renamed to avoid conflicts with other data frames. The final data frame is then saved to a csv file.

### Data loading and cleaning

In [1]:
import pandas as pd
import numpy as np

#### 1. e-waste in Europe 
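Detailed by country and UNU key. Population-based indicators are available from 1980 to 2021; the raw file also contains e-waste rows for 2022 to 2028.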

In [2]:
# Import raw data

ewaste = pd.read_csv("../data/raw/2_tbl_WEEE.csv")

In [3]:
# Transform column names into lower case for easier handling

ewaste.columns = ewaste.columns.str.lower()

In [4]:
# Rename the key, WEEE, kpi and ppi columns to avoid conflicts with the POM data

ewaste = ewaste.rename(columns={"unu_key": "eee_key",
                                "weee_t": "ewaste_t",
                                "weee_pieces": "ewaste_pieces",
                                "kpi": "ewaste_kpi", 
                                "ppi": "ewaste_ppi"})

In [5]:
# Drop the stratum column, which is not used in the analysis

ewaste = ewaste.drop(["stratum"], axis=1)

In [6]:
ewaste.dtypes

country           object
eee_key            int64
year               int64
ewaste_t         float64
ewaste_pieces      int64
inhabitants      float64
ewaste_kpi       float64
ewaste_ppi       float64
dtype: object

In [7]:
# Transform eee_key (formerly unu_key) and year to string objects

ewaste["eee_key"] = ewaste["eee_key"].astype(str)
ewaste["year"] = ewaste["year"].astype(str)

In [8]:
ewaste.dtypes

country           object
eee_key           object
year              object
ewaste_t         float64
ewaste_pieces      int64
inhabitants      float64
ewaste_kpi       float64
ewaste_ppi       float64
dtype: object
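
One practical effect of this cast is that the key and year columns are treated as labels rather than quantities, so they drop out of numeric summaries and aggregations. A minimal sketch on a toy frame (the values are illustrative, not taken from the raw file), using a single `astype` call with a column-to-type mapping as an alternative to the two assignments above:

```python
import pandas as pd

# Toy frame with the same column names as the cleaned data
df = pd.DataFrame(
    {"eee_key": [1, 2], "year": [1980, 1981], "ewaste_t": [13.7, 54.9]}
)

# Cast both identifier columns in one call
df = df.astype({"eee_key": str, "year": str})

# Only genuinely numeric columns remain in numeric selections
numeric_cols = df.select_dtypes(include="number").columns.tolist()
```

Note that string years sort lexicographically, which matches chronological order only as long as all years have the same number of digits, as they do here.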

In [9]:
# Check for missing values

ewaste.isnull().sum()

country              0
eee_key              0
year                 0
ewaste_t             0
ewaste_pieces        0
inhabitants      10584
ewaste_kpi       10584
ewaste_ppi       10584
dtype: int64

In [10]:
# There are 10584 missing values each in inhabitants, ewaste_kpi and ewaste_ppi

In [11]:
ewaste[ewaste["inhabitants"].isnull()]["year"].value_counts()

2026    1512
2022    1512
2028    1512
2027    1512
2024    1512
2023    1512
2025    1512
Name: year, dtype: int64

In [12]:
# All missing values fall in the years 2022 to 2028, for which population
# estimates are not available
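
The notebook keeps these rows as they are. If a later analysis needs only rows with population figures, one possible approach is sketched below on a toy frame; the variable names `missing_by_year`, `first_missing_year` and `with_population` are hypothetical, and `year` is assumed to be a string column as in the cleaned data.

```python
import numpy as np
import pandas as pd

# Toy frame mimicking the pattern above: population missing from 2022 on
toy = pd.DataFrame(
    {
        "year": ["2020", "2021", "2022", "2023"],
        "inhabitants": [1.0e6, 1.0e6, np.nan, np.nan],
    }
)

# Number of rows without population, per year
missing_by_year = toy["inhabitants"].isna().groupby(toy["year"]).sum()

# Earliest year with a missing population value
first_missing_year = toy.loc[toy["inhabitants"].isna(), "year"].min()

# Subset restricted to rows where per-inhabitant indicators can exist
with_population = toy.dropna(subset=["inhabitants"])
```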

In [13]:
ewaste["country"].nunique()

28

In [14]:
# There are 28 countries: the EU-28 member states (including the United Kingdom)

In [15]:
# Count the rows for each country to verify that coverage is uniform

ewaste["country"].value_counts()

CZE    2646
IRL    2646
ITA    2646
LTU    2646
FRA    2646
SWE    2646
EST    2646
NLD    2646
HRV    2646
BGR    2646
SVN    2646
PRT    2646
MLT    2646
DEU    2646
BEL    2646
LVA    2646
HUN    2646
ROU    2646
CYP    2646
SVK    2646
AUT    2646
GBR    2646
DNK    2646
FIN    2646
LUX    2646
POL    2646
ESP    2646
GRC    2646
Name: country, dtype: int64
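
Equal row counts suggest a balanced panel: each country should have exactly one row per combination of key and year. The counts above are consistent with 54 keys × 49 years (1980 to 2028) = 2646 rows per country, although that factorisation is inferred from the outputs rather than stated in the source. A minimal sketch of an explicit check on a toy frame, where `is_balanced` and `has_duplicates` are hypothetical names:

```python
import itertools

import pandas as pd

# Toy balanced panel: every country has every (key, year) combination
countries, keys, years = ["AUT", "BEL"], ["1", "2", "3"], ["1980", "1981"]
toy = pd.DataFrame(
    list(itertools.product(countries, keys, years)),
    columns=["country", "eee_key", "year"],
)

# Rows per country should equal (number of keys) x (number of years)
rows_per_country = toy["country"].value_counts()
expected = toy["eee_key"].nunique() * toy["year"].nunique()
is_balanced = bool((rows_per_country == expected).all())

# No (country, key, year) combination should appear twice
has_duplicates = bool(
    toy.duplicated(subset=["country", "eee_key", "year"]).any()
)
```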

In [16]:
ewaste.head()

  country eee_key  year    ewaste_t  ewaste_pieces  inhabitants  ewaste_kpi  ewaste_ppi
0     AUT       1  1980   13.658216            443    7540000.0    0.001811    0.000059
1     AUT       1  1981   54.909780           1780    7556000.0    0.007267    0.000236
2     AUT       1  1982  123.893263           4017    7565000.0    0.016377    0.000531
3     AUT       1  1983  219.884724           7127    7543000.0    0.029151    0.000945
4     AUT       1  1984  341.413262          11067    7544000.0    0.045256    0.001467


In [17]:
# Save the data frame to a csv file

ewaste.to_csv("../Data/clean_data/2_ewaste_generation.csv", index=False)
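
A csv file does not store column types, so the string casts applied to `eee_key` and `year` are lost on disk: `pd.read_csv` infers integers again unless told otherwise. A minimal sketch of the round trip, using an in-memory buffer in place of the real file:

```python
import io

import pandas as pd

# One-row stand-in for the cleaned data frame
df = pd.DataFrame({"eee_key": ["1"], "year": ["1980"], "ewaste_t": [13.658216]})

# Write to an in-memory buffer instead of a file on disk
buffer = io.StringIO()
df.to_csv(buffer, index=False)

# Default read: key and year are inferred as integers
buffer.seek(0)
default_read = pd.read_csv(buffer)

# Explicit dtypes restore the string columns
buffer.seek(0)
typed_read = pd.read_csv(buffer, dtype={"eee_key": str, "year": str})
```

Any notebook that loads the saved file can pass the same `dtype` mapping to get the identifier columns back as strings.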

#### 2. Extra data: e-waste worldwide from 2000 to 2021

Total values only. Obtained from Kees Balde @ United Nations University.

In [None]:
# Import raw data

ewaste_world = pd.read_csv("../data/raw/3_extra_ewaste_world.csv")

In [None]:
ewaste_world.head()

In [None]:
ewaste_world.isnull().sum()

In [None]:
ewaste_world.dtypes

In [None]:
ewaste_world["year"] = ewaste_world["year"].astype(str)

In [None]:
ewaste_world.nunique()

In [None]:
# Save the data frame to a csv file

ewaste_world.to_csv("../Data/clean_data/3_ewaste_world_generation.csv", index=False)