# THE WORLD BANK WASTE MANAGEMENT CENSUS

Analysis of the available global waste management data.

Data Source: [https://datacatalog.worldbank.org/search/dataset/0039597]

The World Bank data inventory from 2019.

## Definitions

- **iso3c** - Three-letter abbreviation of the country
- **country_name** - Name of the country
- **gdp** - GDP of the country in Billions for the year 2019
- **composition_food_organic_waste_percent** - Percentage of food and organic waste in the waste generated per year
- **composition_glass_percent** - Percentage of glass waste
- **composition_metal_percent** - Percentage of metal waste
- **composition_other_percent** - Percentage of all other waste
- **composition_paper_cardboard_percent** - Percentage of paper and cardboard waste
- **composition_plastic_percent** - Percentage of plastic waste
- **other_information_national_agency_to_enforce_solid_waste_laws_and_regulations** - Whether a national agency exists in the country to enforce solid waste laws and regulations
- **other_information_national_law_governing_solid_waste_management_in_the_country** - Whether a national law governing solid waste management exists in the country
- **other_information_summary_of_key_solid_waste_information_made_available_to_the_public** - Whether a summary of key solid waste information is made available to the public
- **population_number_of_people** - Total population of the country
- **e_waste_tons_year** - Amount of e-waste generated per year, in tons
- **total_waste_generated** - Total amount of waste generated by the country per year


### Install the Altair library before running the notebook, using the command
pip install altair vega_datasets

# Exploring the Dataset

In [1]:
import pandas as pd
from pathlib import Path

world_df = pd.read_csv('countrydata.csv')
world_df.shape

(217, 17)

In [2]:
world_df.columns

Index(['iso3c', 'region_id', 'country_name', 'income_id', 'gdp',
       'composition_food_organic_waste_percent', 'composition_glass_percent',
       'composition_metal_percent', 'composition_other_percent',
       'composition_paper_cardboard_percent', 'composition_plastic_percent',
       'other_information_national_agency_to_enforce_solid_waste_laws_and_regulations',
       'other_information_national_law_governing_solid_waste_management_in_the_country',
       'other_information_summary_of_key_solid_waste_information_made_available_to_the_public',
       'population_number_of_people', 'e_waste_tons_year',
       'total_waste_generated'],
      dtype='object')

In [3]:
world_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 217 entries, 0 to 216
Data columns (total 17 columns):
 #   Column                                                                                 Non-Null Count  Dtype  
---  ------                                                                                 --------------  -----  
 0   iso3c                                                                                  217 non-null    object 
 1   region_id                                                                              217 non-null    object 
 2   country_name                                                                           216 non-null    object 
 3   income_id                                                                              217 non-null    object 
 4   gdp                                                                                    216 non-null    float64
 5   composition_food_organic_waste_percent                                        

#### Checking for Null Values 

In [4]:
world_df.isnull().sum().sort_values(ascending=False)

other_information_summary_of_key_solid_waste_information_made_available_to_the_public    144
other_information_national_agency_to_enforce_solid_waste_laws_and_regulations             53
composition_metal_percent                                                                 47
composition_glass_percent                                                                 46
composition_other_percent                                                                 42
composition_plastic_percent                                                               42
composition_food_organic_waste_percent                                                    41
composition_paper_cardboard_percent                                                       41
e_waste_tons_year                                                                         35
other_information_national_law_governing_solid_waste_management_in_the_country            23
total_waste_generated                                                 
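
The raw null counts are easier to compare as a share of the 217 rows. The sketch below hard-codes three of the counts printed above; the shortened dictionary keys and variable names are illustrative and not part of the notebook. On the full frame, `world_df.isnull().mean() * 100` should give the same kind of figure for every column.

```python
# Null counts copied from the output above; 217 is the row count from world_df.shape.
# The keys are shortened, illustrative versions of the real column names.
total_rows = 217
null_counts = {
    "summary_made_available_to_the_public": 144,
    "national_agency_to_enforce_laws": 53,
    "national_law_governing_solid_waste": 23,
}

# Share of missing values per column, as a percentage rounded to one decimal
null_share = {col: round(n / total_rows * 100, 1) for col, n in null_counts.items()}

for col, pct in null_share.items():
    print(f"{col}: {pct}% missing")
```

Roughly two thirds of the rows have no value for the public-information column, which is worth keeping in mind when those nulls are later treated as "No".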

In [5]:
world_df


Unnamed: 0,iso3c,region_id,country_name,income_id,gdp,composition_food_organic_waste_percent,composition_glass_percent,composition_metal_percent,composition_other_percent,composition_paper_cardboard_percent,composition_plastic_percent,other_information_national_agency_to_enforce_solid_waste_laws_and_regulations,other_information_national_law_governing_solid_waste_management_in_the_country,other_information_summary_of_key_solid_waste_information_made_available_to_the_public,population_number_of_people,e_waste_tons_year,total_waste_generated
0,ABW,LCN,Aruba,HIC,35563.312500,,,,,,,Yes,Yes,Yes,103187,,8.813202e+04
1,AFG,SAS,Afghanistan,LIC,2057.062256,,,,,,,Yes,Yes,,34656032,20000.0,5.628525e+06
2,AGO,SSF,Angola,LMC,8036.690430,51.800000,6.700000,4.400000,11.500000,11.900000,13.500000,,Yes,,25096150,92000.0,4.213644e+06
3,ALB,ECS,Albania,UMC,13724.058590,51.400000,4.500000,4.800000,15.210000,9.900000,9.600000,Yes,Yes,No,2854191,20000.0,1.087447e+06
4,AND,ECS,Andorra,HIC,43711.800780,31.200000,8.200000,2.600000,11.600000,35.100000,11.300000,Yes,Yes,,82431,,4.300000e+04
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
212,XKX,ECS,Kosovo,LMC,9723.561523,42.000000,6.000000,6.000000,20.000000,8.000000,11.000000,Yes,Yes,,1801800,23.0,3.190000e+05
213,YEM,MEA,"Yemen, Rep.",LIC,8269.671875,65.000000,1.000000,6.000000,6.000000,7.000000,10.000000,Yes,Yes,,27584212,42000.0,4.836820e+06
214,ZAF,SSF,South Africa,UMC,12666.607420,16.381655,5.200216,16.910461,45.020646,9.396918,7.090104,Yes,Yes,,51729344,321000.0,1.845723e+07
215,ZMB,SSF,Zambia,LMC,3201.289307,,,,,,,,Yes,,14264756,15000.0,2.608268e+06


## Exploring the Columns

The columns of interest for this analysis include:

- iso3c
- region_id
- country_name
- income_id
- gdp
- composition_food_organic_waste_percent
- composition_plastic_percent
- other_information_national_law_governing_solid_waste_management_in_the_country
- other_information_summary_of_key_solid_waste_information_made_available_to_the_public
- population_number_of_people
- e_waste_tons_year
- total_waste_generated


The following columns will be removed from the clean data:
- other_information_national_agency_to_enforce_solid_waste_laws_and_regulations
- composition_glass_percent
- composition_metal_percent
- composition_other_percent
- composition_paper_cardboard_percent

#### Checking for percentage of food organic waste

In [6]:
world_df['composition_food_organic_waste_percent'].value_counts()

43.600000    6
30.000000    6
57.000000    5
46.000000    5
45.000000    3
            ..
35.880000    1
87.600000    1
36.700000    1
49.000000    1
16.381655    1
Name: composition_food_organic_waste_percent, Length: 135, dtype: int64

#### Counting the countries in each income group: high income (HIC), upper-middle income (UMC), lower-middle income (LMC) and low income (LIC)

In [7]:
world_df['income_id'].value_counts()

HIC    81
UMC    56
LMC    47
LIC    33
Name: income_id, dtype: int64

In [8]:
bins = [10000000, 20000000, 30000000, 40000000, 50000000, 600000000000000000]
labels = ['<10M', '<20M', '<30M', '<40M', '<50M']
world_df['population_buckets'] = pd.cut(world_df['population_number_of_people'], bins=bins, labels=labels, right=False)
world_df['population_buckets'].value_counts().apply(lambda x: format(int(x), 'n'))

<10M    27
<50M    25
<20M    14
<30M    10
<40M     8
Name: population_buckets, dtype: object
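
With `right=False`, the first bin above is [10M, 20M), so the label `<10M` actually covers countries with 10 to 20 million people, and countries below 10 million fall outside every bin and receive `NaN`. A minimal sketch of a binning that covers the full range follows. The bin edges, the labels, and the choice of sample rows (Aruba, Australia, Argentina, China, with populations as shown in this notebook) are illustrative assumptions, not the notebook's method.

```python
import pandas as pd

# Sample populations from rows shown in this notebook:
# Aruba, Australia, Argentina, China
population = pd.Series([103187, 23789338, 42981516, 1400050048])

# Edges start at 0 and end at infinity so every country lands in a bucket
bins = [0, 10_000_000, 20_000_000, 30_000_000, 40_000_000, 50_000_000, float("inf")]
# Hypothetical labels that name the actual range of each left-closed bin
labels = ["<10M", "10M-20M", "20M-30M", "30M-40M", "40M-50M", "50M+"]

buckets = pd.cut(population, bins=bins, labels=labels, right=False)
print(buckets.tolist())
```

Because the lowest edge is 0, small countries such as Aruba get a bucket instead of a missing value.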

In [9]:
world_df[world_df['total_waste_generated'] < 100000]

Unnamed: 0,iso3c,region_id,country_name,income_id,gdp,composition_food_organic_waste_percent,composition_glass_percent,composition_metal_percent,composition_other_percent,composition_paper_cardboard_percent,composition_plastic_percent,other_information_national_agency_to_enforce_solid_waste_laws_and_regulations,other_information_national_law_governing_solid_waste_management_in_the_country,other_information_summary_of_key_solid_waste_information_made_available_to_the_public,population_number_of_people,e_waste_tons_year,total_waste_generated,population_buckets
0,ABW,LCN,Aruba,HIC,35563.3125,,,,,,,Yes,Yes,Yes,103187,,88132.0167,
4,AND,ECS,Andorra,HIC,43711.80078,31.2,8.2,2.6,11.6,35.1,11.3,Yes,Yes,,82431,,43000.0,
8,ASM,EAS,American Samoa,UMC,11113.44238,19.7,3.4,7.9,25.6,26.4,12.8,Yes,,,55599,,18989.49,
9,ATG,LCN,Antigua and Barbuda,HIC,17965.50195,46.0,7.0,7.0,12.0,15.0,13.0,Yes,Yes,No,96777,1100.0,30585.0,
24,BMU,NAC,Bermuda,HIC,80982.36719,17.0,9.0,6.0,26.0,29.0,13.0,Yes,Yes,,64798,,82000.0,
42,COM,SSF,Comoros,LIC,2959.540039,50.0,2.0,4.0,22.0,7.0,5.0,,Yes,,777424,600.0,91013.0,
46,CUW,LCN,Curacao,HIC,27503.78906,,,,,,,Yes,Yes,,153822,,24703.8132,
47,CYM,LCN,Cayman Islands,HIC,66207.44531,10.9,3.5,6.2,11.4,31.1,11.0,Yes,Yes,,59172,,60000.0,
52,DMA,LCN,Dominica,UMC,11708.63965,45.0,8.0,5.0,14.0,12.0,16.0,Yes,Yes,,72400,,13176.0,
65,FRO,ECS,Faeroe Islands,HIC,44402.87891,,,,,,,,Yes,,48842,,61000.0,


In [10]:
world_df[world_df['total_waste_generated'] > 10000000]

Unnamed: 0,iso3c,region_id,country_name,income_id,gdp,composition_food_organic_waste_percent,composition_glass_percent,composition_metal_percent,composition_other_percent,composition_paper_cardboard_percent,composition_plastic_percent,other_information_national_agency_to_enforce_solid_waste_laws_and_regulations,other_information_national_law_governing_solid_waste_management_in_the_country,other_information_summary_of_key_solid_waste_information_made_available_to_the_public,population_number_of_people,e_waste_tons_year,total_waste_generated,population_buckets
6,ARG,LCN,Argentina,HIC,23550.09961,38.74,3.16,1.84,15.36,13.96,14.61,,Yes,,42981516,291700.0,17910550.0,<40M
10,AUS,EAS,Australia,HIC,47784.17969,48.44,3.81,19.38,3.46,17.3,7.61,Yes,Yes,Yes,23789338,574000.0,13345000.0,<20M
17,BGD,SAS,Bangladesh,LMC,3195.737061,80.58,0.44,0.5,8.59,3.04,4.67,Yes,Yes,,155727056,142000.0,14778500.0,<50M
26,BRA,LCN,Brazil,UMC,14596.24609,51.4,2.4,2.9,16.7,13.1,13.5,Yes,Yes,,208494896,1411900.0,79069580.0,<50M
32,CAN,NAC,Canada,HIC,47672.07813,24.0,6.0,13.0,8.0,47.0,3.0,Yes,Yes,Yes,35544564,724000.0,25103030.0,<30M
36,CHN,EAS,China,UMC,16092.30078,61.2,2.1,1.1,13.1,9.6,9.8,Yes,Yes,,1400050048,7211000.0,395081400.0,<50M
39,COD,SSF,"Congo, Dem. Rep.",LIC,1055.572998,,,,,,,,Yes,,78736152,,14385230.0,<50M
41,COL,LCN,Colombia,UMC,12523.00684,59.58,2.35,1.1,15.74,8.4,12.83,,No,,46406648,252200.0,12150120.0,<40M
50,DEU,ECS,Germany,HIC,53784.78125,30.0,10.0,1.4,17.7,24.0,13.0,Yes,Yes,Yes,83132800,2465085.0,50627880.0,<50M
55,DZA,MEA,Algeria,UMC,11826.16504,54.4,1.2,2.8,0.8,9.8,16.9,Yes,Yes,Yes,40606052,252000.0,12378740.0,<40M


In [11]:
world_df[world_df['e_waste_tons_year'].isna()]


Unnamed: 0,iso3c,region_id,country_name,income_id,gdp,composition_food_organic_waste_percent,composition_glass_percent,composition_metal_percent,composition_other_percent,composition_paper_cardboard_percent,composition_plastic_percent,other_information_national_agency_to_enforce_solid_waste_laws_and_regulations,other_information_national_law_governing_solid_waste_management_in_the_country,other_information_summary_of_key_solid_waste_information_made_available_to_the_public,population_number_of_people,e_waste_tons_year,total_waste_generated,population_buckets
0,ABW,LCN,Aruba,HIC,35563.3125,,,,,,,Yes,Yes,Yes,103187,,88132.02,
4,AND,ECS,Andorra,HIC,43711.80078,31.2,8.2,2.6,11.6,35.1,11.3,Yes,Yes,,82431,,43000.0,
8,ASM,EAS,American Samoa,UMC,11113.44238,19.7,3.4,7.9,25.6,26.4,12.8,Yes,,,55599,,18989.49,
24,BMU,NAC,Bermuda,HIC,80982.36719,17.0,9.0,6.0,26.0,29.0,13.0,Yes,Yes,,64798,,82000.0,
34,CHI,ECS,Channel Islands,HIC,46672.59375,,,,,,,Yes,Yes,,164541,,178933.0,
39,COD,SSF,"Congo, Dem. Rep.",LIC,1055.572998,,,,,,,,Yes,,78736152,,14385230.0,<50M
40,COG,SSF,"Congo, Rep.",LMC,4899.57959,,,,,,,No,Yes,,2648507,,451200.0,
45,CUB,LCN,Cuba,UMC,12984.71094,68.9,4.6,1.6,3.0,12.3,9.6,No,No,Yes,11303687,,2692692.0,<10M
46,CUW,LCN,Curacao,HIC,27503.78906,,,,,,,Yes,Yes,,153822,,24703.81,
47,CYM,LCN,Cayman Islands,HIC,66207.44531,10.9,3.5,6.2,11.4,31.1,11.0,Yes,Yes,,59172,,60000.0,


In [12]:
world_df['total_waste_generated'].describe().apply(lambda x: format(int(x), 'f'))

count          215.000000
mean       9638989.000000
std       36002157.000000
min           3989.000000
25%         213879.000000
50%        1787400.000000
75%        5282711.000000
max      395081376.000000
Name: total_waste_generated, dtype: object

In [13]:
world_df['gdp'].describe().apply(lambda x: format(int(x), 'f'))

count       216.000000
mean      22645.000000
std       22663.000000
min         822.000000
25%        4799.000000
50%       13465.000000
75%       35911.000000
max      117335.000000
Name: gdp, dtype: object

# Cleaning

#### Clean the data by listing the columns required for the analysis in columns_to_keep and reassigning the selection to world_df

In [14]:
columns_to_keep = ['iso3c','region_id','country_name','income_id','gdp','composition_food_organic_waste_percent',
                   'composition_plastic_percent',
                   'other_information_national_law_governing_solid_waste_management_in_the_country',
                   'other_information_summary_of_key_solid_waste_information_made_available_to_the_public',
                   'population_number_of_people','e_waste_tons_year','total_waste_generated'
]

# Take an explicit copy so later column assignments do not act on a slice of the original frame
world_df = world_df[columns_to_keep].copy()

##### Replacing NA and "None" values with "No"

In [15]:
public_info_col = 'other_information_summary_of_key_solid_waste_information_made_available_to_the_public'
# Treat both the literal string "None" and missing values as "No"
world_df[public_info_col] = world_df[public_info_col].replace("None", "No").fillna("No")


##### Dropping the rows with NA values in columns "composition_food_organic_waste_percent", "composition_plastic_percent", "e_waste_tons_year"

In [16]:
world_df = world_df.dropna(subset=["composition_food_organic_waste_percent", "composition_plastic_percent", "e_waste_tons_year"])


##### Replacing income_id codes with understandable income labels

In [17]:
# Map the World Bank income codes to readable labels in a single pass
income_labels = {
    "HIC": 'High-Income-Country',
    "LIC": 'Low-Income-Country',
    "LMC": 'LowToMedium-Income-Country',
    "UMC": 'Medium-Income-Country',
}
world_df['income_id'] = world_df['income_id'].replace(income_labels)

# Normalize lowercase "yes" so it is counted together with "Yes"
law_col = 'other_information_national_law_governing_solid_waste_management_in_the_country'
world_df[law_col] = world_df[law_col].replace("yes", 'Yes')

# Analyzing and Plotting

We analyze the total amount of waste generated by countries, grouped by income level.

In [18]:
analyze_df = world_df.groupby(['income_id']).agg({'total_waste_generated':"sum"}).reset_index()
analyze_df["total_waste_generated"] = analyze_df["total_waste_generated"].astype("int64")
analyze_df.rename(columns={"income_id": "income_slab"},inplace=True)

In [19]:
import altair as alt

alt.renderers.enable("mimetype")
bars = alt.Chart(analyze_df).mark_bar().encode(
    x="income_slab",
    y="total_waste_generated",
    color=alt.Color('income_slab:N', scale=alt.Scale(scheme='category10'))
).properties(
    width=600,
    height=300
)

labels = bars.mark_text(
    align='center',
    baseline='bottom',
    dy=-5  # adjust offset to avoid overlapping with bars
).encode(
    text='total_waste_generated'
)

(bars + labels).properties(
    width=600,
    height=300
)




The donut chart below shows whether a summary of key solid waste information is made available to the public in each country. Hover over the chart to see the data labels and the percentage shares.

In [20]:
world_df['other_information_summary_of_key_solid_waste_information_made_available_to_the_public'].value_counts()

No     104
Yes     48
Name: other_information_summary_of_key_solid_waste_information_made_available_to_the_public, dtype: int64

In [21]:
# Aggregate data
analyze_df2 = world_df.groupby("other_information_summary_of_key_solid_waste_information_made_available_to_the_public")["country_name"].count().reset_index()
analyze_df2.rename(columns={"country_name": "count"}, inplace=True)

# Calculate total count
total_count = analyze_df2["count"].sum()

# Calculate percentage share
analyze_df2["share"] = analyze_df2["count"]/total_count * 100

# Create donut chart with legend
donut_chart = alt.Chart(analyze_df2, title='Donut Chart').mark_arc().encode(
    theta='count',
    color=alt.Color('other_information_summary_of_key_solid_waste_information_made_available_to_the_public', legend=alt.Legend(title='Data Available To Public?', orient='right')),
    tooltip=['other_information_summary_of_key_solid_waste_information_made_available_to_the_public', 'count', 'share']
).transform_calculate(
    end_angle='2 * PI * datum.count / {}'.format(total_count)
).transform_calculate(
    start_angle='datum.end_angle - PI * datum.count / {}'.format(total_count)
).transform_filter(
    alt.datum.count > 0
).properties(
    width=300,
    height=300
).configure_arc(
    innerRadius=70,
    strokeWidth=2
)

donut_chart
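
As a quick check of the `share` column used in the tooltip, the percentages can be reproduced from the two counts printed by `value_counts()` above. This is only a sketch with the counts hard-coded; `counts` and `share` here are standalone illustrative names.

```python
import pandas as pd

# Counts copied from the value_counts() output above
counts = pd.Series({"No": 104, "Yes": 48}, name="count")

# Percentage share of each answer, the same calculation as the "share" column
share = (counts / counts.sum() * 100).round(2)
print(share)
```

The "No" group includes every country whose value was missing in the raw data, since those nulls were replaced with "No" during cleaning.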




Next, we look at the share of countries that have a national law governing solid waste management. Hover over the chart to see the data labels.

In [22]:
world_df['other_information_national_law_governing_solid_waste_management_in_the_country'].value_counts()

Yes    132
No      14
Name: other_information_national_law_governing_solid_waste_management_in_the_country, dtype: int64

In [23]:
analyze_df3 = world_df.groupby("other_information_national_law_governing_solid_waste_management_in_the_country")["country_name"].count().reset_index()
analyze_df3.rename(columns={"country_name": "count"}, inplace=True)

In [24]:
chart = alt.Chart(analyze_df3, title='Pie Chart').mark_arc().encode(
    theta='count:Q',
    color=alt.Color('other_information_national_law_governing_solid_waste_management_in_the_country:N', scale=alt.Scale(range=['#6baed6', '#fdbf6f'])),
    tooltip=['other_information_national_law_governing_solid_waste_management_in_the_country:N', 'count:Q'],  # Data labels to be displayed on hover
    text=alt.Text('count:Q', format='.2s'),  # Data labels to be displayed on the chart
).properties(
    width=400,
    height=300
)
chart



Breaking down by GDP

In [25]:
world_df["gdp"].describe()

count       152.000000
mean      24379.931221
std       23138.064409
min         839.778503
25%        7311.291138
50%       16120.015135
75%       37380.665040
max      117335.585900
Name: gdp, dtype: float64

Assigning each country to a GDP bracket

In [26]:
def add_gdp_bracket(gdp: float) -> str:
    # Thresholds are in the same units as the gdp column
    if gdp < 10000:
        return "low_gdp"
    elif gdp < 15000:
        return "medium_gdp"
    else:
        return "high_gdp"

world_df["gdp_bracket"] = world_df["gdp"].apply(add_gdp_bracket)

world_df[["gdp","gdp_bracket"]]

Unnamed: 0,gdp,gdp_bracket
2,8036.690430,low_gdp
3,13724.058590,medium_gdp
5,67119.132810,high_gdp
6,23550.099610,high_gdp
7,11019.838870,medium_gdp
...,...,...
211,6210.983398,low_gdp
212,9723.561523,low_gdp
213,8269.671875,low_gdp
214,12666.607420,medium_gdp


In [27]:
world_df['gdp_bracket'].value_counts()

high_gdp      79
low_gdp       48
medium_gdp    25
Name: gdp_bracket, dtype: int64
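
The same bracketing can also be expressed with `pd.cut`, which keeps the thresholds and labels side by side. The sketch below uses five `gdp` values from the table above. It is an alternative shown for illustration, not the approach the notebook takes, and the standalone names `gdp`, `bins` and `labels` are assumptions.

```python
import pandas as pd

# Sample gdp values taken from the table above
gdp = pd.Series([8036.690430, 13724.058590, 67119.132810, 23550.099610, 11019.838870])

# Left-closed bins reproduce the conditions gdp < 10000 and gdp < 15000
bins = [float("-inf"), 10000, 15000, float("inf")]
labels = ["low_gdp", "medium_gdp", "high_gdp"]

gdp_bracket = pd.cut(gdp, bins=bins, labels=labels, right=False)
print(gdp_bracket.tolist())
```

One behavioral difference to be aware of: `pd.cut` leaves a missing `gdp` as missing, whereas the `if`/`elif`/`else` function returns `"high_gdp"` for `NaN` because both comparisons evaluate to `False`.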

Counting how many of these countries fall into the low, medium, and high GDP brackets

In [28]:
analyze_df4 = world_df.groupby("gdp_bracket")["country_name"].count().reset_index()
analyze_df4.rename(columns={"country_name": "count"}, inplace=True)

In [29]:
alt.Chart(analyze_df4).mark_bar().encode(
    x="gdp_bracket",
    y="count",
    color=alt.Color('gdp_bracket:N', scale=alt.Scale(scheme='category10'))

).properties(
    width=300,
    height=300
)



In [30]:
world_df["ewaste_perc"]=(world_df['e_waste_tons_year'] * 100)/world_df['total_waste_generated']
world_df[["ewaste_perc", "e_waste_tons_year", "total_waste_generated"]]

Unnamed: 0,ewaste_perc,e_waste_tons_year,total_waste_generated
2,2.183384,92000.0,4.213644e+06
3,1.839171,20000.0,1.087447e+06
5,2.385325,134000.0,5.617682e+06
6,1.628649,291700.0,1.791055e+07
7,2.840909,14000.0,4.928000e+05
...,...,...,...
211,1.824878,500.0,2.739909e+04
212,0.007210,23.0,3.190000e+05
213,0.868339,42000.0,4.836820e+06
214,1.739156,321000.0,1.845723e+07


In [31]:
world_df["waste_per_person_in_lbs"] = (world_df['total_waste_generated']/world_df['population_number_of_people'])*2204
world_df[["country_name","waste_per_person_in_lbs"]]

Unnamed: 0,country_name,waste_per_person_in_lbs
2,Angola,370.051600
3,Albania,839.723984
5,United Arab Emirates,1267.216046
6,Argentina,918.414609
7,Armenia,373.726421
...,...,...
211,Samoa,321.784000
212,Kosovo,390.207570
213,"Yemen, Rep.",386.465681
214,South Africa,786.395809


In [32]:
top_10 = world_df.nlargest(10, "waste_per_person_in_lbs")[["country_name", "waste_per_person_in_lbs"]]

# Build the bar chart
chart = alt.Chart(top_10).mark_bar().encode(
    x='country_name:N',
    y='waste_per_person_in_lbs:Q',
    color=alt.Color('country_name:N', scale=alt.Scale(scheme='category10')),
    text=alt.Text('waste_per_person_in_lbs:Q', format='.1f')

).properties(
    title='Top Ten Countries by Waste per Person',
    width=600,
    height=400
)

chart



# THE PROJECT ENDS HERE             

In [33]:
world_df["waste_per_person_in_lbs"].describe()


count     152.000000
mean      826.439620
std       479.253377
min       110.966909
25%       462.771157
50%       764.141701
75%      1113.413702
max      2646.789906
Name: waste_per_person_in_lbs, dtype: float64

In [34]:
def add_wastepp_bracket(waste_per_person_in_lbs: float) -> str:
    # Thresholds are in pounds per person per year
    if waste_per_person_in_lbs < 500:
        return "low_waste_per_person"
    elif waste_per_person_in_lbs < 1000:
        return "medium_waste_per_person"
    else:
        return "high_waste_per_person"

world_df["wastepp_bracket"] = world_df["waste_per_person_in_lbs"].apply(add_wastepp_bracket)

world_df[["waste_per_person_in_lbs","wastepp_bracket"]]

Unnamed: 0,waste_per_person_in_lbs,wastepp_bracket
2,370.051600,low_waste_per_person
3,839.723984,medium_waste_per_person
5,1267.216046,high_waste_per_person
6,918.414609,medium_waste_per_person
7,373.726421,low_waste_per_person
...,...,...
211,321.784000,low_waste_per_person
212,390.207570,low_waste_per_person
213,386.465681,low_waste_per_person
214,786.395809,medium_waste_per_person


In [35]:
world_df['wastepp_bracket'].value_counts()

medium_waste_per_person    64
high_waste_per_person      45
low_waste_per_person       43
Name: wastepp_bracket, dtype: int64

In [36]:
world_df[['country_name','wastepp_bracket']].value_counts()

country_name  wastepp_bracket        
Albania       medium_waste_per_person    1
Morocco       low_waste_per_person       1
Netherlands   high_waste_per_person      1
New Zealand   high_waste_per_person      1
Niger         low_waste_per_person       1
                                        ..
Grenada       medium_waste_per_person    1
Guatemala     low_waste_per_person       1
Guinea        low_waste_per_person       1
Guyana        medium_waste_per_person    1
Zimbabwe      low_waste_per_person       1
Length: 151, dtype: int64

