## Read the data

The following code loads the data you will use for this PC:

In [28]:
# read web table into pandas DF
import pandas as pd

co2Link='https://docs.google.com/spreadsheets/d/e/2PACX-1vRXjfxeKHQBBCh_oHP-D6RIHEX4eduLjydHb6ZGsU4jo0IK0KKQSoYX_X1FGssC_9hnqCgjKN0K4AVf/pub?gid=775098192&single=true&output=csv'
carbon=pd.read_csv(co2Link)


In [29]:
# here it is:
carbon

| | name | metric tonnes of CO2 | date_of_information | ranking | region |
|---|---|---|---|---|---|
| 0 | China | 1.219600e+10 | 2023 | 1 | East and Southeast Asia |
| 1 | United States | 4.795000e+09 | 2023 | 2 | North America |
| 2 | India | 2.821000e+09 | 2023 | 3 | South Asia |
| 3 | Russia | 1.844000e+09 | 2023 | 4 | Central Asia |
| 4 | Japan | 9.602300e+08 | 2023 | 5 | East and Southeast Asia |
| ... | ... | ... | ... | ... | ... |
| 211 | Falkland Islands (Islas Malvinas) | 3.600000e+01 | 2023 | 212 | South America |
| 212 | Montserrat | 2.400000e+01 | 2023 | 213 | Central America and the Caribbean |
| 213 | Antarctica | 1.500000e+01 | 2023 | 214 | Antarctica |
| 214 | Saint Helena, Ascension, and Tristan da Cunha | 1.200000e+01 | 2023 | 215 | Africa |


As you can see, one column name contains spaces, which should be replaced with underscores:

In [30]:
# list the column names
carbon.columns

Index(['name', 'metric tonnes of CO2', 'date_of_information', 'ranking',
       'region'],
      dtype='object')

In [31]:
# rename the column, replacing spaces with underscores
carbon.rename(columns={'metric tonnes of CO2':'metric_tonnes_of_CO2'},inplace=True)

# see
carbon.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 216 entries, 0 to 215
Data columns (total 5 columns):
 #   Column                Non-Null Count  Dtype  
---  ------                --------------  -----  
 0   name                  216 non-null    object 
 1   metric_tonnes_of_CO2  216 non-null    float64
 2   date_of_information   216 non-null    int64  
 3   ranking               216 non-null    int64  
 4   region                216 non-null    object 
dtypes: float64(1), int64(2), object(2)
memory usage: 8.6+ KB
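
When several column names contain spaces, renaming them one by one gets tedious. As a minimal sketch, assuming a toy DataFrame with made-up column names (`toy` and its columns are illustrative, not the real data), all names can be cleaned in one step through the string accessor on the column index:

```python
import pandas as pd

# Toy frame with hypothetical column names, two of them containing spaces.
toy = pd.DataFrame({
    "name": ["A", "B"],
    "metric tonnes of CO2": [10.0, 20.0],
    "date of information": [2023, 2023],
})

# Replace every space in every column name with an underscore.
# regex=False treats " " as a literal character.
toy.columns = toy.columns.str.replace(" ", "_", regex=False)

print(list(toy.columns))
```

This touches every column at once, so it is only appropriate when underscores are wanted everywhere.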


Always make a copy before modifying the data, so the original stays intact:

In [32]:
carbon_copy=carbon.copy()
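
The reason for calling `.copy()` instead of a plain assignment can be seen in a small sketch. The frame and variable names below (`original`, `alias`, `independent`) are illustrative assumptions, not part of the exercise data:

```python
import pandas as pd

original = pd.DataFrame({"name": ["A", "B"], "value": [1.0, 2.0]})

alias = original                # same object under a second name
independent = original.copy()   # separate object with its own data

independent["extra"] = 0        # only the copy gains this column
alias["flag"] = True            # the original gains this column too

print(list(original.columns))      # includes 'flag', not 'extra'
print(list(independent.columns))   # includes 'extra', not 'flag'
```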

## Questions

Complete the tasks requested using **carbon_copy**:

1. Keep all the columns but _Ranking_:
    * Tip: use [drop](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.drop.html), [loc](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.loc.html), and [iloc](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.iloc.html) for the same purpose (three ways to accomplish the task).

2. Find the country with the minimum CO2 emission in the world.

3. Find the country with the minimum CO2 emission per Region.

4. Create a new column holding the square root of the original CO2 column.

5. Compute the average of CO2. Then, create a new column holding
the original CO2 minus that average.

Once you have changed **carbon_copy**, save it as a file. The cells below work through each task and end with the save:

In [33]:
# 1. Keep all the columns but Ranking:
carbon_copy = carbon_copy.loc[:, carbon_copy.columns != 'ranking']
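
The tip for task 1 mentions three ways to reach the same result. The following sketch shows `drop`, `loc`, and `iloc` side by side on a toy frame (the frame and its values are made up for illustration) and checks that the three results match:

```python
import pandas as pd

# Toy frame mirroring the column layout of the exercise data.
toy = pd.DataFrame({
    "name": ["A", "B"],
    "metric_tonnes_of_CO2": [10.0, 20.0],
    "ranking": [2, 1],
    "region": ["X", "Y"],
})

# Way 1: drop the column by name.
via_drop = toy.drop(columns=["ranking"])

# Way 2: boolean mask over the column labels with loc.
via_loc = toy.loc[:, toy.columns != "ranking"]

# Way 3: integer positions of the columns to keep with iloc.
keep_positions = [i for i, col in enumerate(toy.columns) if col != "ranking"]
via_iloc = toy.iloc[:, keep_positions]

print(via_drop.equals(via_loc) and via_loc.equals(via_iloc))
```

`drop` is usually the most readable when the column name is known; `iloc` is useful when only the position is known.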

In [34]:
# 2. Find the country with the minimum CO2 emission in the world
min_co2_country = carbon_copy.loc[carbon_copy['metric_tonnes_of_CO2'].idxmin(), 'name']
print(f"Country with minimum CO2 emission: {min_co2_country}")

Country with minimum CO2 emission: Niue
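
`idxmin` returns the row label of the smallest value, which `loc` then uses to look up the name. As a hedged sketch on made-up data, the same answer can also be reached with `nsmallest` or `sort_values`:

```python
import pandas as pd

toy = pd.DataFrame({
    "name": ["A", "B", "C"],
    "metric_tonnes_of_CO2": [30.0, 9.0, 12.0],
})

# idxmin gives the index label of the minimum; loc fetches the name there.
row_label = toy["metric_tonnes_of_CO2"].idxmin()
via_idxmin = toy.loc[row_label, "name"]

# nsmallest returns the rows with the n smallest values.
via_nsmallest = toy.nsmallest(1, "metric_tonnes_of_CO2")["name"].iloc[0]

# Sorting ascending and taking the first row also works.
via_sort = toy.sort_values("metric_tonnes_of_CO2")["name"].iloc[0]

print(via_idxmin, via_nsmallest, via_sort)
```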


In [35]:
# 3. Find the country with the minimum CO2 emission per Region
min_co2_per_region = carbon_copy.loc[carbon_copy.groupby('region')['metric_tonnes_of_CO2'].idxmin()]
print("\nCountry with minimum CO2 emission per region:")
print(min_co2_per_region[['region', 'name', 'metric_tonnes_of_CO2']])


Country with minimum CO2 emission per region:
                                region  \
214                             Africa   
213                         Antarctica   
215              Australia and Oceania   
212  Central America and the Caribbean   
121                       Central Asia   
184            East and Southeast Asia   
175                             Europe   
141                        Middle East   
210                      North America   
211                      South America   
177                         South Asia   

                                              name  metric_tonnes_of_CO2  
214  Saint Helena, Ascension, and Tristan da Cunha                  12.0  
213                                     Antarctica                  15.0  
215                                           Niue                   9.0  
212                                     Montserrat                  24.0  
121                                        Armenia             7144000.0 
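
The per-region step combines `groupby` with `idxmin`: the grouped `idxmin` yields one row label per region, and `loc` selects those rows. A small sketch on invented data (names, regions, and values are assumptions for illustration) makes the intermediate result visible:

```python
import pandas as pd

toy = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "region": ["X", "X", "Y", "Y"],
    "metric_tonnes_of_CO2": [5.0, 3.0, 8.0, 1.0],
})

# One index label per region: the row holding that region's minimum.
labels = toy.groupby("region")["metric_tonnes_of_CO2"].idxmin()
print(labels)

# Use those labels to pull the full rows.
min_rows = toy.loc[labels]
print(min_rows[["region", "name", "metric_tonnes_of_CO2"]])
```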

In [36]:
# 4. Create a new column with the square root of the original CO2 column
carbon_copy['sqrt_CO2'] = carbon_copy['metric_tonnes_of_CO2'] ** 0.5
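
The square root can be computed in several equivalent ways. This sketch, on a toy column with made-up values, compares an element-wise `apply`, the vectorized power operator, and `numpy.sqrt`; the vectorized forms avoid calling a Python function once per row:

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({"metric_tonnes_of_CO2": [4.0, 9.0, 16.0]})

# Element-wise: calls the lambda once for each value.
via_apply = toy["metric_tonnes_of_CO2"].apply(lambda x: x ** 0.5)

# Vectorized: operates on the whole column at once.
via_power = toy["metric_tonnes_of_CO2"] ** 0.5
via_numpy = np.sqrt(toy["metric_tonnes_of_CO2"])

print(via_apply.equals(via_power) and via_power.equals(via_numpy))
```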

In [37]:
# 5. Compute the average of CO2 and create a new column with the difference
avg_co2 = carbon_copy['metric_tonnes_of_CO2'].mean()
carbon_copy['CO2_minus_avg'] = carbon_copy['metric_tonnes_of_CO2'] - avg_co2
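
Subtracting the mean centers the column, so the new values should sum to roughly zero. The following sketch checks this on a toy column (values are invented for illustration):

```python
import pandas as pd

toy = pd.DataFrame({"metric_tonnes_of_CO2": [2.0, 4.0, 9.0]})

avg = toy["metric_tonnes_of_CO2"].mean()
toy["CO2_minus_avg"] = toy["metric_tonnes_of_CO2"] - avg

# Deviations from the mean cancel out, up to floating-point error.
print(avg)
print(toy["CO2_minus_avg"].sum())
```

Positive values mark rows above the average and negative values mark rows below it.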

In [38]:
carbon_copy.to_csv("carbon_copy.csv", index=False)
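
The `index=False` argument keeps the row index out of the saved file. As a sketch, using an in-memory buffer in place of a real file and a made-up frame, reading the data back shows the difference: without `index=False`, the index comes back as an extra unnamed column.

```python
import io
import pandas as pd

toy = pd.DataFrame({"name": ["A", "B"], "value": [1.5, 2.5]})

# Save without the index, then read it back.
buffer = io.StringIO()
toy.to_csv(buffer, index=False)
buffer.seek(0)
reloaded = pd.read_csv(buffer)

# Save with the index (the default), then read it back.
with_index = io.StringIO()
toy.to_csv(with_index)
with_index.seek(0)
reloaded_with_index = pd.read_csv(with_index)

print(list(reloaded.columns))
print(list(reloaded_with_index.columns))
```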