# Data Understanding

## Summary

This notebook explores the raw Global Forest Watch dataset to establish its structure and quality before any data preparation. The primary objectives are to:

- **Load and inspect** the raw Excel file structure and available sheets
- **Examine data quality** including missing values, data types, and inconsistencies
- **Understand relationships** between different country-level datasets
- **Identify patterns** that will inform the data preparation phase
- **Document findings** about data structure, column meanings, and potential issues

## Key Activities

1. **Data Loading**: Loads the Excel workbook and identifies all available sheets
2. **Sheet Exploration**: Examines each country-level sheet (Tree Cover Loss, Primary Loss, Drivers, Carbon Data)
3. **Data Quality Assessment**: Checks for missing values, duplicates, and data type issues
4. **Structure Analysis**: Understands column formats, value ranges, and data distributions
5. **Relationship Mapping**: Identifies common keys and relationships between datasets

## Output

This notebook produces insights and documentation that guide the data preparation process in Notebook 2. No processed data files are created at this stage.

---


## Part 1: Data Exploration - Raw Data Sheets

### Goal: Understand the structure, quality, and characteristics of the raw data in each Excel sheet

We will explore each country-level sheet separately to understand:
- Data structure and format
- Column names and types
- Data quality issues
- Relationships between sheets
- Patterns that will inform data preparation
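
The same structure and quality checks recur for every sheet, so they can be gathered into one helper. The following is a minimal sketch, not code from this notebook: `summarize_sheet` and the `demo` frame are hypothetical names introduced for illustration, and the key columns are assumed to be called `country` and `threshold`.

```python
import pandas as pd

def summarize_sheet(df: pd.DataFrame, key_cols=("country", "threshold")) -> dict:
    """Collect the basic structure and quality facts checked for each sheet."""
    present_keys = [c for c in key_cols if c in df.columns]
    return {
        "n_rows": len(df),
        "n_cols": df.shape[1],
        "missing_cells": int(df.isnull().sum().sum()),
        "duplicate_rows": int(df.duplicated().sum()),
        "unique_per_key": {c: int(df[c].nunique()) for c in present_keys},
    }

# Tiny hypothetical frame, used only to show the shape of the result
demo = pd.DataFrame({
    "country": ["A", "A", "B"],
    "threshold": [0, 10, 0],
    "tc_loss_ha_2001": [1, 2, None],
})
summary = summarize_sheet(demo)
print(summary)
```

Calling the helper on each parsed sheet would give comparable summaries side by side, at the cost of hiding the detailed printouts.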

## Step 1: Loading the Dataset

In [3]:
import pandas as pd
from warnings import filterwarnings
filterwarnings('ignore')

RAW_PATH = "../data/raw/global_forest_watch_raw_data.xlsx"

print("="*80)
print("LOADING RAW DATA FROM EXCEL FILE")
print("="*80)

excel_file = pd.ExcelFile(RAW_PATH)
print(f"Excel file loaded: {RAW_PATH}")
print(f"\nAvailable sheets: {len(excel_file.sheet_names)}")
print("\nSheet names:")
for i, sheet in enumerate(excel_file.sheet_names, 1):
    print(f"  {i}. {sheet}")

country_sheets = [s for s in excel_file.sheet_names if s.startswith('Country')]
print(f"\nCountry-level sheets to explore: {len(country_sheets)}")
for sheet in country_sheets:
    print(f"  - {sheet}")

LOADING RAW DATA FROM EXCEL FILE
Excel file loaded: ../data/raw/global_forest_watch_raw_data.xlsx

Available sheets: 9

Sheet names:
  1. Read_Me
  2. Country tree cover loss
  3. Country primary loss
  4. Country drivers
  5. Country carbon data
  6. Subnational 1 tree cover loss
  7. Subnational 1 primary loss
  8. Subnational 1 drivers
  9. Subnational 1 carbon data

Country-level sheets to explore: 4
  - Country tree cover loss
  - Country primary loss
  - Country drivers
  - Country carbon data
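
The listing suggests that the country-level and subnational sheets cover the same four topics. A minimal sketch of how that could be verified from the sheet names alone follows; `group_sheets` and `LEVELS` are hypothetical helpers, and the names are copied from the output above rather than read from the workbook.

```python
# Sheet names as printed above
sheet_names = [
    "Read_Me",
    "Country tree cover loss",
    "Country primary loss",
    "Country drivers",
    "Country carbon data",
    "Subnational 1 tree cover loss",
    "Subnational 1 primary loss",
    "Subnational 1 drivers",
    "Subnational 1 carbon data",
]

LEVELS = ("Country", "Subnational 1")  # assumed level prefixes

def group_sheets(names, levels=LEVELS):
    """Map each level prefix to the list of topics found under it."""
    grouped = {level: [] for level in levels}
    for name in names:
        for level in levels:
            if name.startswith(level + " "):
                grouped[level].append(name[len(level) + 1:])
                break  # sheets without a known prefix (e.g. Read_Me) are skipped
    return grouped

grouped = group_sheets(sheet_names)
same_topics = grouped["Country"] == grouped["Subnational 1"]
print(grouped)
print(f"Same topics at both levels: {same_topics}")
```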


## Step 2: Exploring Country Tree Cover Loss Data


In [4]:
df_tree_cover_loss = excel_file.parse("Country tree cover loss")

print("="*80)
print("COUNTRY TREE COVER LOSS - RAW DATA")
print("="*80)
print(f"Shape: {df_tree_cover_loss.shape[0]:,} rows × {df_tree_cover_loss.shape[1]} columns")
print(f"\nColumn names ({len(df_tree_cover_loss.columns)}):")
for i, col in enumerate(df_tree_cover_loss.columns, 1):
    print(f"  {i:2d}. {col}")

print("\n" + "="*80)
print("First 5 rows:")
print("="*80)
display(df_tree_cover_loss.head())

print("\n" + "="*80)
print("Data Types:")
print("="*80)
print(df_tree_cover_loss.dtypes)

print("\n" + "="*80)
print("Basic Statistics:")
print("="*80)
display(df_tree_cover_loss.describe())


COUNTRY TREE COVER LOSS - RAW DATA
Shape: 1,328 rows × 30 columns

Column names (30):
   1. country
   2. threshold
   3. area_ha
   4. extent_2000_ha
   5. extent_2010_ha
   6. gain_2000-2012_ha
   7. tc_loss_ha_2001
   8. tc_loss_ha_2002
   9. tc_loss_ha_2003
  10. tc_loss_ha_2004
  11. tc_loss_ha_2005
  12. tc_loss_ha_2006
  13. tc_loss_ha_2007
  14. tc_loss_ha_2008
  15. tc_loss_ha_2009
  16. tc_loss_ha_2010
  17. tc_loss_ha_2011
  18. tc_loss_ha_2012
  19. tc_loss_ha_2013
  20. tc_loss_ha_2014
  21. tc_loss_ha_2015
  22. tc_loss_ha_2016
  23. tc_loss_ha_2017
  24. tc_loss_ha_2018
  25. tc_loss_ha_2019
  26. tc_loss_ha_2020
  27. tc_loss_ha_2021
  28. tc_loss_ha_2022
  29. tc_loss_ha_2023
  30. tc_loss_ha_2024

First 5 rows:


Unnamed: 0,country,threshold,area_ha,extent_2000_ha,extent_2010_ha,gain_2000-2012_ha,tc_loss_ha_2001,tc_loss_ha_2002,tc_loss_ha_2003,tc_loss_ha_2004,...,tc_loss_ha_2015,tc_loss_ha_2016,tc_loss_ha_2017,tc_loss_ha_2018,tc_loss_ha_2019,tc_loss_ha_2020,tc_loss_ha_2021,tc_loss_ha_2022,tc_loss_ha_2023,tc_loss_ha_2024
0,Afghanistan,0,64383655,64383655,64383655,10738,103,214,267,226,...,0,0,0,31,25,46,47,16,133,223
1,Afghanistan,10,64383655,432070,126231,10738,92,190,254,207,...,0,0,0,28,19,40,37,9,32,32
2,Afghanistan,15,64383655,302629,106852,10738,91,186,248,205,...,0,0,0,28,19,39,32,7,23,17
3,Afghanistan,20,64383655,284330,105718,10738,89,181,245,203,...,0,0,0,28,18,39,32,7,22,16
4,Afghanistan,25,64383655,254843,72384,10738,89,180,244,202,...,0,0,0,27,18,38,27,6,21,14



Data Types:
country              object
threshold             int64
area_ha               int64
extent_2000_ha        int64
extent_2010_ha        int64
gain_2000-2012_ha     int64
tc_loss_ha_2001       int64
tc_loss_ha_2002       int64
tc_loss_ha_2003       int64
tc_loss_ha_2004       int64
tc_loss_ha_2005       int64
tc_loss_ha_2006       int64
tc_loss_ha_2007       int64
tc_loss_ha_2008       int64
tc_loss_ha_2009       int64
tc_loss_ha_2010       int64
tc_loss_ha_2011       int64
tc_loss_ha_2012       int64
tc_loss_ha_2013       int64
tc_loss_ha_2014       int64
tc_loss_ha_2015       int64
tc_loss_ha_2016       int64
tc_loss_ha_2017       int64
tc_loss_ha_2018       int64
tc_loss_ha_2019       int64
tc_loss_ha_2020       int64
tc_loss_ha_2021       int64
tc_loss_ha_2022       int64
tc_loss_ha_2023       int64
tc_loss_ha_2024       int64
dtype: object

Basic Statistics:


Unnamed: 0,threshold,area_ha,extent_2000_ha,extent_2010_ha,gain_2000-2012_ha,tc_loss_ha_2001,tc_loss_ha_2002,tc_loss_ha_2003,tc_loss_ha_2004,tc_loss_ha_2005,...,tc_loss_ha_2015,tc_loss_ha_2016,tc_loss_ha_2017,tc_loss_ha_2018,tc_loss_ha_2019,tc_loss_ha_2020,tc_loss_ha_2021,tc_loss_ha_2022,tc_loss_ha_2023,tc_loss_ha_2024
count,1328.0,1328.0,1328.0,1328.0,1328.0,1328.0,1328.0,1328.0,1328.0,1328.0,...,1328.0,1328.0,1328.0,1328.0,1328.0,1328.0,1328.0,1328.0,1328.0,1328.0
mean,28.125,78148050.0,30380200.0,29943470.0,786730.8,79065.76,96442.43,84788.51,116282.2,106803.6,...,115358.0,175681.5,174534.6,146977.3,143857.2,155330.5,149812.1,135673.1,168680.9,177610.7
std,22.499791,201575200.0,105670400.0,104738200.0,3417373.0,312454.8,411666.9,376891.8,494717.7,423154.5,...,390674.0,645335.3,607919.8,545900.9,458810.6,579976.9,613902.0,490937.7,750234.1,692486.9
min,0.0,2094.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
25%,13.75,5117777.0,548025.5,541270.0,13832.0,514.5,393.25,312.0,532.75,548.25,...,323.0,618.75,727.25,582.5,537.75,657.5,527.5,386.0,828.75,673.75
50%,22.5,20225870.0,3622986.0,3499126.0,94359.0,6940.5,5172.0,3940.5,6147.5,7207.5,...,7497.5,14212.5,16314.0,11565.0,11082.0,11954.5,9935.0,9190.5,13963.0,10951.0
75%,35.0,62019970.0,18319770.0,18198860.0,388240.0,31545.25,32422.25,28489.25,38602.75,40362.5,...,51400.25,106330.2,97937.25,76692.5,76894.25,80744.0,70723.75,65272.0,76778.0,81784.0
max,75.0,1689455000.0,1689455000.0,1689455000.0,37220540.0,2933201.0,3715945.0,3489258.0,4133606.0,3675951.0,...,2925679.0,6407238.0,6143920.0,6621833.0,4847068.0,8217252.0,8559449.0,5126743.0,10176020.0,7421840.0
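
The `threshold` statistics (mean 28.125, quartiles 13.75, 22.5, and 35, maximum 75) are consistent with eight equally frequent values (0, 10, 15, 20, 25, 30, 50, 75), and 1,328 rows divided by 8 gives 166. This is an inference from the summary table, not something the notebook confirms directly. The sketch below shows how the one-row-per-country-and-threshold structure could be checked, using a trimmed copy of the five preview rows; the choice of threshold 10 for the filter is arbitrary. Because the preview values fall as the threshold rises, the thresholds look cumulative, so summing across them would likely double-count.

```python
import pandas as pd

# Trimmed copy of the five preview rows shown above
sample = pd.DataFrame({
    "country": ["Afghanistan"] * 5,
    "threshold": [0, 10, 15, 20, 25],
    "extent_2000_ha": [64383655, 432070, 302629, 284330, 254843],
    "tc_loss_ha_2001": [103, 92, 91, 89, 89],
})

# Each (country, threshold) pair should appear exactly once
pair_counts = sample.groupby(["country", "threshold"]).size()
has_duplicate_pairs = bool((pair_counts > 1).any())

# Number of distinct thresholds reported per country
thresholds_per_country = sample.groupby("country")["threshold"].nunique()

# Summing over all thresholds mixes overlapping categories
naive_total = int(sample["tc_loss_ha_2001"].sum())

# Filtering to a single threshold first avoids that (10 is an arbitrary example)
single_total = int(sample.loc[sample["threshold"] == 10, "tc_loss_ha_2001"].sum())

print(has_duplicate_pairs, thresholds_per_country.to_dict(), naive_total, single_total)
```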


## Step 3: Checking the Quality of Country Tree Cover Loss Data

In [5]:
print("="*80)
print("MISSING VALUES ANALYSIS:")
print("="*80)
missing = df_tree_cover_loss.isnull().sum()
missing_pct = (missing / len(df_tree_cover_loss)) * 100
missing_df = pd.DataFrame({
    'Column': missing.index,
    'Missing Count': missing.values,
    'Missing Percentage': missing_pct.values
}).sort_values('Missing Count', ascending=False)

missing_df = missing_df[missing_df['Missing Count'] > 0]
if len(missing_df) > 0:
    display(missing_df)
else:
    print("No missing values found!")

print("\n" + "="*80)
print("UNIQUE VALUES:")
print("="*80)
if 'country' in df_tree_cover_loss.columns:
    print(f"Number of countries: {df_tree_cover_loss['country'].nunique()}")
    print(f"Countries: {sorted(df_tree_cover_loss['country'].unique())}")

if 'threshold' in df_tree_cover_loss.columns:
    print(f"\nThreshold values: {sorted(df_tree_cover_loss['threshold'].unique())}")

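# Select only the annual loss columns. A substring test on '200', '201', or '202'
# would also match extent_2000_ha, extent_2010_ha, and gain_2000-2012_ha.
year_cols = [col for col in df_tree_cover_loss.columns if col.startswith('tc_loss_ha_')]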
print(f"\nYear columns found: {len(year_cols)}")
if year_cols:
    print(f"Year range: {year_cols[0]} to {year_cols[-1]}")


MISSING VALUES ANALYSIS:
No missing values found!

UNIQUE VALUES:
Number of countries: 166
Countries: ['Afghanistan', 'Albania', 'Algeria', 'Angola', 'Argentina', 'Australia', 'Austria', 'Azerbaijan', 'Bangladesh', 'Belarus', 'Belgium', 'Belize', 'Benin', 'Bhutan', 'Bolivia', 'Bosnia and Herzegovina', 'Botswana', 'Brazil', 'Brunei', 'Bulgaria', 'Burkina Faso', 'Burundi', 'Cambodia', 'Cameroon', 'Canada', 'Central African Republic', 'Chad', 'Chile', 'China', 'Colombia', 'Costa Rica', 'Croatia', 'Cuba', 'Czechia', "Côte d'Ivoire", 'Democratic Republic of the Congo', 'Denmark', 'Djibouti', 'Dominican Republic', 'Ecuador', 'Egypt', 'El Salvador', 'Equatorial Guinea', 'Eritrea', 'Estonia', 'Ethiopia', 'Faroe Islands', 'Fiji', 'Finland', 'France', 'French Guiana', 'Gabon', 'Gambia', 'Georgia', 'Germany', 'Ghana', 'Greece', 'Guadeloupe', 'Guatemala', 'Guinea', 'Guinea-Bissau', 'Guyana', 'Haiti', 'Honduras', 'Hungary', 'Iceland', 'India', 'Indonesia', 'Iran', 'Iraq', 'Ireland', 'Italy', 'Japan
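
Picking out the annual columns by name deserves some care, because a substring test such as `'200' in col` also matches `extent_2000_ha`, `extent_2010_ha`, and `gain_2000-2012_ha`. A minimal sketch of a stricter approach follows, assuming the annual columns always follow the `tc_loss_ha_<year>` pattern seen above; `YEAR_COL` is a hypothetical name.

```python
import re

# Column names as listed for the tree cover loss sheet
columns = [
    "country", "threshold", "area_ha",
    "extent_2000_ha", "extent_2010_ha", "gain_2000-2012_ha",
] + [f"tc_loss_ha_{year}" for year in range(2001, 2025)]

# Anchored pattern: the whole name must be tc_loss_ha_ followed by four digits
YEAR_COL = re.compile(r"^tc_loss_ha_(\d{4})$")

# Loose substring test, shown for comparison
substring_matches = [c for c in columns if "200" in c or "201" in c or "202" in c]

# Strict match that also extracts the year as an integer
loss_years = sorted(int(m.group(1)) for c in columns if (m := YEAR_COL.match(c)))

# A gap-free series has every year between the first and the last
is_contiguous = loss_years == list(range(loss_years[0], loss_years[-1] + 1))

print(len(substring_matches), len(loss_years), loss_years[0], loss_years[-1], is_contiguous)
```

Extracting the year as an integer also makes it straightforward to confirm that no year is missing from the series.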

## Step 4: Exploring Country Primary Loss Data and Checking Data Quality


In [6]:
df_primary_loss = excel_file.parse("Country primary loss")

print("="*80)
print("COUNTRY PRIMARY LOSS - RAW DATA")
print("="*80)
print(f"Shape: {df_primary_loss.shape[0]:,} rows × {df_primary_loss.shape[1]} columns")
print(f"\nColumn names ({len(df_primary_loss.columns)}):")
for i, col in enumerate(df_primary_loss.columns, 1):
    print(f"  {i:2d}. {col}")

print("\n" + "="*80)
print("First 5 rows:")
print("="*80)
display(df_primary_loss.head())

print("\n" + "="*80)
print("Data Types:")
print("="*80)
print(df_primary_loss.dtypes)

print("\n" + "="*80)
print("Basic Statistics:")
print("="*80)
display(df_primary_loss.describe())

print("\n" + "="*80)
print("MISSING VALUES ANALYSIS:")
print("="*80)
missing = df_primary_loss.isnull().sum()
missing_pct = (missing / len(df_primary_loss)) * 100
missing_df = pd.DataFrame({
    'Column': missing.index,
    'Missing Count': missing.values,
    'Missing Percentage': missing_pct.values
}).sort_values('Missing Count', ascending=False)

missing_df = missing_df[missing_df['Missing Count'] > 0]
if len(missing_df) > 0:
    display(missing_df)
else:
    print("No missing values found!")

print("\n" + "="*80)
print("UNIQUE VALUES:")
print("="*80)
if 'country' in df_primary_loss.columns:
    print(f"Number of countries: {df_primary_loss['country'].nunique()}")
    print(f"Countries: {sorted(df_primary_loss['country'].unique())}")

if 'threshold' in df_primary_loss.columns:
    print(f"\nThreshold values: {sorted(df_primary_loss['threshold'].unique())}")

# Select the annual loss columns by prefix, not by a year substring
year_cols = [col for col in df_primary_loss.columns if col.startswith('tc_loss_ha_')]
print(f"\nYear columns found: {len(year_cols)}")
if year_cols:
    print(f"Year range: {year_cols[0]} to {year_cols[-1]}")


COUNTRY PRIMARY LOSS - RAW DATA
Shape: 76 rows × 26 columns

Column names (26):
   1. country
   2. threshold
   3. area__ha
   4. tc_loss_ha_2002
   5. tc_loss_ha_2003
   6. tc_loss_ha_2004
   7. tc_loss_ha_2005
   8. tc_loss_ha_2006
   9. tc_loss_ha_2007
  10. tc_loss_ha_2008
  11. tc_loss_ha_2009
  12. tc_loss_ha_2010
  13. tc_loss_ha_2011
  14. tc_loss_ha_2012
  15. tc_loss_ha_2013
  16. tc_loss_ha_2014
  17. tc_loss_ha_2015
  18. tc_loss_ha_2016
  19. tc_loss_ha_2017
  20. tc_loss_ha_2018
  21. tc_loss_ha_2019
  22. tc_loss_ha_2020
  23. tc_loss_ha_2021
  24. tc_loss_ha_2022
  25. tc_loss_ha_2023
  26. tc_loss_ha_2024

First 5 rows:


Unnamed: 0,country,threshold,area__ha,tc_loss_ha_2002,tc_loss_ha_2003,tc_loss_ha_2004,tc_loss_ha_2005,tc_loss_ha_2006,tc_loss_ha_2007,tc_loss_ha_2008,...,tc_loss_ha_2015,tc_loss_ha_2016,tc_loss_ha_2017,tc_loss_ha_2018,tc_loss_ha_2019,tc_loss_ha_2020,tc_loss_ha_2021,tc_loss_ha_2022,tc_loss_ha_2023,tc_loss_ha_2024
0,Angola,30,2458061,3499,2963,2354,3110,1400,8060,2699,...,8998,12040,11166,13507,9995,8895,24326,15576,17627,13660
1,Argentina,30,4418724,9318,14459,28090,31429,24095,18687,47067,...,10547,15247,17202,9496,8983,20847,11921,21388,11473,12103
2,Australia,30,13977,0,0,0,0,25,0,0,...,5,0,0,0,5,0,0,0,0,0
3,Bangladesh,30,101114,619,266,347,306,677,369,240,...,205,345,414,358,387,459,308,307,743,467
4,Belize,30,1165487,5570,2993,2108,3206,1899,4140,3632,...,6606,11511,6616,4781,8772,16087,4560,4033,11667,21137



Data Types:
country            object
threshold           int64
area__ha            int64
tc_loss_ha_2002     int64
tc_loss_ha_2003     int64
tc_loss_ha_2004     int64
tc_loss_ha_2005     int64
tc_loss_ha_2006     int64
tc_loss_ha_2007     int64
tc_loss_ha_2008     int64
tc_loss_ha_2009     int64
tc_loss_ha_2010     int64
tc_loss_ha_2011     int64
tc_loss_ha_2012     int64
tc_loss_ha_2013     int64
tc_loss_ha_2014     int64
tc_loss_ha_2015     int64
tc_loss_ha_2016     int64
tc_loss_ha_2017     int64
tc_loss_ha_2018     int64
tc_loss_ha_2019     int64
tc_loss_ha_2020     int64
tc_loss_ha_2021     int64
tc_loss_ha_2022     int64
tc_loss_ha_2023     int64
tc_loss_ha_2024     int64
dtype: object

Basic Statistics:


Unnamed: 0,threshold,area__ha,tc_loss_ha_2002,tc_loss_ha_2003,tc_loss_ha_2004,tc_loss_ha_2005,tc_loss_ha_2006,tc_loss_ha_2007,tc_loss_ha_2008,tc_loss_ha_2009,...,tc_loss_ha_2015,tc_loss_ha_2016,tc_loss_ha_2017,tc_loss_ha_2018,tc_loss_ha_2019,tc_loss_ha_2020,tc_loss_ha_2021,tc_loss_ha_2022,tc_loss_ha_2023,tc_loss_ha_2024
count,76.0,76.0,76.0,76.0,76.0,76.0,76.0,76.0,76.0,76.0,...,76.0,76.0,76.0,76.0,76.0,76.0,76.0,76.0,76.0,76.0
mean,30.0,13497960.0,35009.86,32682.53,44629.41,43705.5,36953.67,38147.87,35632.92,36765.184211,...,38528.013158,80509.46,65324.87,47926.33,49310.18,55317.2,49265.11,54079.01,49111.96,88496.16
std,0.0,43001720.0,188356.5,181678.3,236723.2,215722.9,170600.6,145239.2,135475.6,116080.074035,...,126231.483498,343014.0,253695.7,168091.2,170268.9,205885.2,188559.1,214962.4,154958.8,367170.4
min,30.0,1653.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
25%,30.0,227151.2,132.5,122.5,203.5,217.0,245.5,184.5,222.75,445.25,...,253.75,501.5,514.5,341.75,324.75,306.5,301.25,292.75,274.0,366.0
50%,30.0,1833100.0,2029.5,1582.5,2277.0,2320.0,2528.5,3063.5,3677.5,3729.5,...,3392.5,4904.0,7123.0,4061.5,5047.5,4094.0,4050.0,3952.0,4538.5,4598.0
75%,30.0,7487047.0,10132.5,10143.75,11972.75,12021.0,14184.5,18432.0,15289.25,19996.25,...,16938.25,45902.25,31105.25,27098.5,30709.75,34334.0,24449.5,23500.0,24598.25,40371.75
max,30.0,343261000.0,1621738.0,1570540.0,2016350.0,1824217.0,1415536.0,1149515.0,1075087.0,700115.0,...,828839.0,2830943.0,2134474.0,1347176.0,1361053.0,1703491.0,1546964.0,1772214.0,1136250.0,2823646.0



MISSING VALUES ANALYSIS:
No missing values found!

UNIQUE VALUES:
Number of countries: 76
Countries: ['Angola', 'Argentina', 'Australia', 'Bangladesh', 'Belize', 'Benin', 'Bhutan', 'Bolivia', 'Brazil', 'Brunei', 'Burundi', 'Cambodia', 'Cameroon', 'Central African Republic', 'China', 'Colombia', 'Costa Rica', 'Cuba', "Côte d'Ivoire", 'Democratic Republic of the Congo', 'Dominican Republic', 'Ecuador', 'El Salvador', 'Equatorial Guinea', 'Ethiopia', 'Fiji', 'French Guiana', 'Gabon', 'Ghana', 'Guadeloupe', 'Guatemala', 'Guinea', 'Guinea-Bissau', 'Guyana', 'Haiti', 'Honduras', 'India', 'Indonesia', 'Kenya', 'Laos', 'Liberia', 'Madagascar', 'Malawi', 'Malaysia', 'Martinique', 'Mozambique', 'Myanmar', 'México', 'Nepal', 'Nicaragua', 'Nigeria', 'Panama', 'Papua New Guinea', 'Paraguay', 'Peru', 'Philippines', 'Republic of the Congo', 'Rwanda', 'Senegal', 'Sierra Leone', 'Solomon Islands', 'South Africa', 'South Sudan', 'Sri Lanka', 'Suriname', 'Tanzania', 'Thailand', 'Togo', 'Uganda', 'United
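
Primary loss is reported for 76 countries at a single threshold of 30 (the `threshold` column has zero standard deviation), against 166 countries in the tree cover loss sheet. The sketch below shows one way to list which countries appear in only one sheet. The two frames are small hypothetical subsets built from country names visible in the outputs above, and the comparison assumes identical spelling across sheets, which may not hold for accented names such as 'México'.

```python
import pandas as pd

# Hypothetical subsets; the real frames would come from the parsed sheets
tree_cover = pd.DataFrame({
    "country": ["Afghanistan", "Albania", "Angola", "Argentina"],
})
primary = pd.DataFrame({
    "country": ["Angola", "Argentina"],
})

# An outer merge with indicator=True labels each country by where it was found
coverage = (
    tree_cover[["country"]].drop_duplicates()
    .merge(primary[["country"]].drop_duplicates(),
           on="country", how="outer", indicator=True)
)

only_tree_cover = sorted(coverage.loc[coverage["_merge"] == "left_only", "country"])
only_primary = sorted(coverage.loc[coverage["_merge"] == "right_only", "country"])
in_both = sorted(coverage.loc[coverage["_merge"] == "both", "country"])

print("Only in tree cover loss:", only_tree_cover)
print("Only in primary loss:", only_primary)
print("In both:", in_both)
```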

## Step 5: Exploring Country Drivers Data and Checking Data Quality

In [7]:
df_drivers = excel_file.parse("Country drivers")

print("="*80)
print("COUNTRY DRIVERS - RAW DATA")
print("="*80)
print(f"Shape: {df_drivers.shape[0]:,} rows × {df_drivers.shape[1]} columns")
print(f"\nColumn names ({len(df_drivers.columns)}):")
for i, col in enumerate(df_drivers.columns, 1):
    print(f"  {i:2d}. {col}")

print("\n" + "="*80)
print("First 5 rows:")
print("="*80)
display(df_drivers.head())

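# This check was written for a wide layout (one column per year). The drivers sheet
# is in long format, so every non-key column is listed here, including 'year'.
driver_type_cols = [c for c in df_drivers.columns if c not in ['country', 'threshold'] and not any(str(year) in c for year in range(2001, 2025))]
print(f"\nDriver type columns: {driver_type_cols}")

# No column name contains a year in the long layout, so this count is 0.
year_cols = [col for col in df_drivers.columns if any(str(year) in col for year in range(2001, 2025))]
print(f"Year-based columns: {len(year_cols)}")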

print("\n" + "="*80)
print("Missing Values:")
print("="*80)
missing = df_drivers.isnull().sum()
if missing.sum() > 0:
    display(pd.DataFrame({'Missing Count': missing[missing > 0]}))
else:
    print("No missing values!")

COUNTRY DRIVERS - RAW DATA
Shape: 21,897 rows × 5 columns

Column names (5):
   1. country
   2. threshold
   3. driver
   4. year
   5. tc_loss_ha

First 5 rows:


|   | country | threshold | driver | year | tc_loss_ha |
|---|---|---|---|---|---|
| 0 | Afghanistan | 30 | Hard commodities | 2014 | 0.0 |
| 1 | Afghanistan | 30 | Logging | 2001 | 3.0 |
| 2 | Afghanistan | 30 | Logging | 2002 | 64.0 |
| 3 | Afghanistan | 30 | Logging | 2003 | 73.0 |
| 4 | Afghanistan | 30 | Logging | 2004 | 143.0 |



Driver type columns: ['driver', 'year', 'tc_loss_ha']
Year-based columns: 0

Missing Values:
No missing values!
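
Unlike the previous two sheets, the drivers sheet is in long format: one row per country, threshold, driver, and year, with the loss in `tc_loss_ha`. A minimal sketch of summarising that layout follows, using only the five preview rows shown above. The variable names are illustrative, and pivoting to one column per year is just one possible way to line this sheet up with the wide sheets.

```python
import pandas as pd

# The five preview rows from the drivers sheet
drivers = pd.DataFrame({
    "country": ["Afghanistan"] * 5,
    "threshold": [30] * 5,
    "driver": ["Hard commodities", "Logging", "Logging", "Logging", "Logging"],
    "year": [2014, 2001, 2002, 2003, 2004],
    "tc_loss_ha": [0.0, 3.0, 64.0, 73.0, 143.0],
})

# In long format the categories and the year range live in the rows
driver_names = sorted(drivers["driver"].unique())
year_range = (int(drivers["year"].min()), int(drivers["year"].max()))

# Total loss attributed to each driver
loss_by_driver = drivers.groupby("driver")["tc_loss_ha"].sum()

# Reshape to one column per year; combinations absent from the data become NaN
wide = drivers.pivot_table(
    index=["country", "threshold", "driver"],
    columns="year",
    values="tc_loss_ha",
    aggfunc="sum",
)

print(driver_names, year_range)
print(loss_by_driver)
print(wide)
```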


## Step 6: Exploring Country Carbon Data and Checking Data Quality


In [8]:
df_carbon = excel_file.parse("Country carbon data")

print("="*80)
print("COUNTRY CARBON DATA - RAW DATA")
print("="*80)
print(f"Shape: {df_carbon.shape[0]:,} rows × {df_carbon.shape[1]} columns")
print(f"\nColumn names ({len(df_carbon.columns)}):")
for i, col in enumerate(df_carbon.columns, 1):
    print(f"  {i:2d}. {col}")

print("\n" + "="*80)
print("First 5 rows:")
print("="*80)
display(df_carbon.head())

print("\n" + "="*80)
print("Missing Values:")
print("="*80)
missing = df_carbon.isnull().sum()
if missing.sum() > 0:
    display(pd.DataFrame({'Missing Count': missing[missing > 0]}))
else:
    print("✅ No missing values!")

carbon_metric_cols = [c for c in df_carbon.columns if 'carbon' in c.lower() or 'emission' in c.lower()]
print(f"\nCarbon-related columns: {carbon_metric_cols}")


COUNTRY CARBON DATA - RAW DATA
Shape: 498 rows × 32 columns

Column names (32):
   1. country
   2. umd_tree_cover_density_2000__threshold
   3. umd_tree_cover_extent_2000__ha
   4. gfw_aboveground_carbon_stocks_2000__Mg_C
   5. avg_gfw_aboveground_carbon_stocks_2000__Mg_C_ha-1
   6. gfw_forest_carbon_gross_emissions__Mg_CO2e_yr-1
   7. gfw_forest_carbon_gross_removals__Mg_CO2_yr-1
   8. gfw_forest_carbon_net_flux__Mg_CO2e_yr-1
   9. gfw_forest_carbon_gross_emissions_2001__Mg_CO2e
  10. gfw_forest_carbon_gross_emissions_2002__Mg_CO2e
  11. gfw_forest_carbon_gross_emissions_2003__Mg_CO2e
  12. gfw_forest_carbon_gross_emissions_2004__Mg_CO2e
  13. gfw_forest_carbon_gross_emissions_2005__Mg_CO2e
  14. gfw_forest_carbon_gross_emissions_2006__Mg_CO2e
  15. gfw_forest_carbon_gross_emissions_2007__Mg_CO2e
  16. gfw_forest_carbon_gross_emissions_2008__Mg_CO2e
  17. gfw_forest_carbon_gross_emissions_2009__Mg_CO2e
  18. gfw_forest_carbon_gross_emissions_2010__Mg_CO2e
  19. gfw_forest_carbon_gros

Unnamed: 0,country,umd_tree_cover_density_2000__threshold,umd_tree_cover_extent_2000__ha,gfw_aboveground_carbon_stocks_2000__Mg_C,avg_gfw_aboveground_carbon_stocks_2000__Mg_C_ha-1,gfw_forest_carbon_gross_emissions__Mg_CO2e_yr-1,gfw_forest_carbon_gross_removals__Mg_CO2_yr-1,gfw_forest_carbon_net_flux__Mg_CO2e_yr-1,gfw_forest_carbon_gross_emissions_2001__Mg_CO2e,gfw_forest_carbon_gross_emissions_2002__Mg_CO2e,...,gfw_forest_carbon_gross_emissions_2015__Mg_CO2e,gfw_forest_carbon_gross_emissions_2016__Mg_CO2e,gfw_forest_carbon_gross_emissions_2017__Mg_CO2e,gfw_forest_carbon_gross_emissions_2018__Mg_CO2e,gfw_forest_carbon_gross_emissions_2019__Mg_CO2e,gfw_forest_carbon_gross_emissions_2020__Mg_CO2e,gfw_forest_carbon_gross_emissions_2021__Mg_CO2e,gfw_forest_carbon_gross_emissions_2022__Mg_CO2e,gfw_forest_carbon_gross_emissions_2023__Mg_CO2e,gfw_forest_carbon_gross_emissions_2024__Mg_CO2e
0,Afghanistan,30,205771,12409398,123,15339,376800,-361461,27986.0,41762.0,...,0.0,0.0,0.0,4893.0,3708.0,11409.0,6772.0,1913.0,3435.0,2636.0
1,Afghanistan,50,148417,9765465,134,12657,275855,-263199,25603.0,32691.0,...,0.0,0.0,0.0,3920.0,3343.0,10321.0,6045.0,1664.0,2530.0,2106.0
2,Afghanistan,75,75480,5571655,150,6147,151074,-144926,15780.0,15308.0,...,0.0,0.0,0.0,1962.0,1743.0,6451.0,2477.0,668.0,1857.0,1512.0
3,Albania,30,648459,40958831,238,721806,5103589,-4381783,1417747.0,348556.0,...,120041.0,334094.0,448993.0,724335.0,429556.0,427420.0,506228.0,649874.0,948758.0,308121.0
4,Albania,50,534671,37239867,263,682919,4294627,-3611709,1358272.0,338279.0,...,113553.0,304691.0,403366.0,669011.0,404887.0,391385.0,449937.0,591504.0,895138.0,275104.0



Missing Values:
✅ No missing values!

Carbon-related columns: ['gfw_aboveground_carbon_stocks_2000__Mg_C', 'avg_gfw_aboveground_carbon_stocks_2000__Mg_C_ha-1', 'gfw_forest_carbon_gross_emissions__Mg_CO2e_yr-1', 'gfw_forest_carbon_gross_removals__Mg_CO2_yr-1', 'gfw_forest_carbon_net_flux__Mg_CO2e_yr-1', 'gfw_forest_carbon_gross_emissions_2001__Mg_CO2e', 'gfw_forest_carbon_gross_emissions_2002__Mg_CO2e', 'gfw_forest_carbon_gross_emissions_2003__Mg_CO2e', 'gfw_forest_carbon_gross_emissions_2004__Mg_CO2e', 'gfw_forest_carbon_gross_emissions_2005__Mg_CO2e', 'gfw_forest_carbon_gross_emissions_2006__Mg_CO2e', 'gfw_forest_carbon_gross_emissions_2007__Mg_CO2e', 'gfw_forest_carbon_gross_emissions_2008__Mg_CO2e', 'gfw_forest_carbon_gross_emissions_2009__Mg_CO2e', 'gfw_forest_carbon_gross_emissions_2010__Mg_CO2e', 'gfw_forest_carbon_gross_emissions_2011__Mg_CO2e', 'gfw_forest_carbon_gross_emissions_2012__Mg_CO2e', 'gfw_forest_carbon_gross_emissions_2013__Mg_CO2e', 'gfw_forest_carbon_gross_emissio
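
The four country-level sheets do not name their keys consistently: the carbon sheet uses `umd_tree_cover_density_2000__threshold` where the others use `threshold`, and the primary loss sheet has `area__ha` where the tree cover loss sheet has `area_ha`. The sketch below shows one way to find the keys shared by all sheets and the non-key columns that would clash in a merge. The rename in `KEY_RENAMES` is an assumption based on the preview values (30, 50, 75), not something the source confirms. `area__ha` is left as-is because its preview values suggest it may not measure the same quantity as `area_ha`.

```python
# Column names copied from the outputs above (annual columns cut to one example)
sheet_columns = {
    "tree_cover_loss": ["country", "threshold", "area_ha", "extent_2000_ha", "tc_loss_ha_2002"],
    "primary_loss": ["country", "threshold", "area__ha", "tc_loss_ha_2002"],
    "drivers": ["country", "threshold", "driver", "year", "tc_loss_ha"],
    "carbon": ["country", "umd_tree_cover_density_2000__threshold",
               "umd_tree_cover_extent_2000__ha"],
}

# Assumed equivalence between differently named key columns
KEY_RENAMES = {"umd_tree_cover_density_2000__threshold": "threshold"}

def harmonize(columns):
    """Apply the assumed renames to a list of column names."""
    return [KEY_RENAMES.get(c, c) for c in columns]

harmonized = {name: harmonize(cols) for name, cols in sheet_columns.items()}

# Columns present in every sheet are candidate join keys
shared_keys = sorted(set.intersection(*(set(cols) for cols in harmonized.values())))

# Non-key columns with the same name in two sheets would collide when merged
clashes = sorted(
    (set(harmonized["tree_cover_loss"]) & set(harmonized["primary_loss"]))
    - set(shared_keys)
)

print("Shared keys:", shared_keys)
print("Clashing columns:", clashes)
```

Since both loss sheets use `tc_loss_ha_<year>` names, any merge between them would need suffixes or a rename to keep total loss and primary loss apart.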