
# **Project Description**

One of the most important environmental issues is the degradation of the Amazon Rainforest, particularly in Brazil. Approximately 60% of the Amazon is located in Brazil, and the rainforest has one of the highest levels of biodiversity on Earth, hosting 10% to 20% of all known species.

Researchers have identified the following human activities as causes of rainforest loss:

*   Forest fires
*   Illegal logging and mining
*   Slash and burn land clearing for agricultural purposes

Additionally, rising global atmospheric CO2 levels affect water and weather cycles, which can lead to desertification through higher temperatures and reduced rainfall in the Amazon Rainforest.


In [1]:
#Geographic area - add map
#Legal Amazon states: Acre (AC), Amapa (AP), Amazonas (AM), Maranhao (MA), Mato Grosso (MT), Para (PA), Rondonia (RO), Roraima (RR), Tocantins (TO)

#Latitudes: [-9.070003236, -0.039598369, -3.289580873, -5.809995505, -15.65001504, -1.190019105, -11.64002724, -1.816231505, -6.319576804]
#Longitudes: [-68.66997929, -51.17998743, -60.6199797, -46.14998438, -56.14002059, -47.17999903, -61.20999536, -61.12767481, -47.41998438]

The primary goal of this project is to explore datasets to assess which human activity leads to the most significant amount of rainforest loss in the Amazonian states of Brazil, which include Acre (AC), Amapa (AP), Amazonas (AM), Maranhao (MA), Mato Grosso (MT), Para (PA), Rondonia (RO), Roraima (RR), and Tocantins (TO).
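The state list and the coordinates recorded in the first cell can be paired into a single lookup table for later mapping. The following is a minimal sketch: it assumes the latitude and longitude lists are in the same order as the state list, and the name `LEGAL_AMAZON_STATES` is illustrative rather than part of the original notebook.

```python
# Legal Amazon states paired with the coordinates listed in the first cell.
# Assumption: the coordinate lists follow the same order as the state list.
names = ["Acre", "Amapa", "Amazonas", "Maranhao", "Mato Grosso",
         "Para", "Rondonia", "Roraima", "Tocantins"]
codes = ["AC", "AP", "AM", "MA", "MT", "PA", "RO", "RR", "TO"]
latitudes = [-9.070003236, -0.039598369, -3.289580873, -5.809995505,
             -15.65001504, -1.190019105, -11.64002724, -1.816231505,
             -6.319576804]
longitudes = [-68.66997929, -51.17998743, -60.6199797, -46.14998438,
              -56.14002059, -47.17999903, -61.20999536, -61.12767481,
              -47.41998438]

# The lists must have equal length before they can be zipped into rows.
assert len(names) == len(codes) == len(latitudes) == len(longitudes)

# Hypothetical lookup table keyed by state abbreviation.
LEGAL_AMAZON_STATES = {
    code: {"name": name, "lat": lat, "lon": lon}
    for code, name, lat, lon in zip(codes, names, latitudes, longitudes)
}

print(LEGAL_AMAZON_STATES["AM"])
```

A dictionary shaped like this can be turned into a table with `pd.DataFrame.from_dict(LEGAL_AMAZON_STATES, orient="index")` once pandas is imported.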

# **Overview**


It is important to track deforestation data because scientists have determined that the health of the Amazon Rainforest has implications for the health of the planet. The Amazon Rainforest captures approximately 5% of total global CO2 emissions.

Drastically elevated levels of atmospheric greenhouse gases such as CO2 cause global warming, which can lead to rising temperatures, more natural disasters, disrupted food production due to changing crop cycles, melting glaciers, rising sea levels, and respiratory health complications in populations exposed to high pollution rates. In addition, plant responses to excess CO2 can potentially affect cooling and rain cycles in tropical rainforests and result in an increase in drought and forest fires across the Amazon. This finding comes from a plant stomata study and is controversial because it depends on computer climate models, which are generally not reliable for water cycles.

*   Intergenerational equity
*   Ability of future generations to survive and thrive
*   Resilience and adaptation to the effects of anthropogenically induced climate change

Tracking and predicting deforestation can help guide policy makers to take action to maintain the health of the Amazon Rainforest.

The goal of this project is to demonstrate how data sets can be refined to answer questions about anthropogenic causes of deforestation in the Legal Amazon area of Brazil. Once a workable model is created, it can be adapted to suit the analytical and visualization needs of different stakeholders.


# **Related Work**

* Find scholarly articles on anthropogenic causes of deforestation in the Legal Amazon area of Brazil
* Reforestation projects

# **Questions About Anthropogenic Causes of Deforestation in Legal Amazon area of Brazil**

* **How are fires affecting the Legal Amazon area of Brazil?**
* **Why is species loss an important indicator of deforestation?**
* **Are the exports of specialty lumber from the Legal Amazon area affecting the resilience of the rainforest and its ability to regenerate?**
* **Why is beef and soy production in the Legal Amazon area a negative indicator of deforestation?**
* **What are potential strategies for reforestation?**
* **Is it possible for economic growth to continue while maintaining the integrity of the Amazon Rainforest?**

# **How data was collected**

Data was collected from open data sets containing official statistics published by Brazilian government organizations, where possible. Other sources of data include greenhouse gas measurements published by NASA's Earth Observing System Data and Information System (EOSDIS), as well as statistics published by international NGOs.

The relevant spreadsheets were downloaded as Excel files. The data sets were uploaded to Google Colab for cleaning, exploratory data analysis, statistical modeling, and visualization using the Python programming language along with packages such as pandas, NumPy, scikit-learn, Matplotlib, and GeoPandas.

# **Install and Import Relevant Packages**





In [2]:
# GDAL command-line tools and Python bindings, needed by many geospatial Python libraries
# (python-gdal has no installation candidate on this Colab image, so only python3-gdal is requested)
!apt install gdal-bin python3-gdal
# Install rtree - GeoPandas requirement
!apt install python3-rtree
# Install GeoPandas
!pip install geopandas
# Install descartes - GeoPandas requirement
!pip install descartes
# Install Folium for geographic data visualization
!pip install folium
# Install Plotly Express
!pip install plotly_express

In [3]:
!pip install mplcursors


In [4]:
# import libraries

# render plots inline in the notebook
%matplotlib inline

import pandas as pd
import numpy as np
import scipy.stats as stats
from scipy.stats import norm
import statsmodels.api as sm
import seaborn as sns
from math import ceil
from datascience import *
from sklearn import metrics
import geopandas as gpd
from shapely.geometry import Point
import matplotlib.pyplot as plt
import matplotlib as mpl
import folium
import plotly_express as px
import plotly.graph_objects as go

# 'seaborn-whitegrid' was renamed in Matplotlib 3.6; use the versioned style name
plt.style.use('seaborn-v0_8-whitegrid')


In [5]:
# mount google drive to read the data set
from google.colab import drive
drive.mount('/content/gdrive')

Mounted at /content/gdrive


In [6]:
# closes all plots if there is too much memory being used
%matplotlib inline
plt.close('all')

# **Code Snippets Temp Section**

In [7]:
# drop NaN values

#df.dropna(subset=['col1', 'col2'])

In [8]:
# filter data to include only cities in the United States

#new_df = df.loc[df['col'] == 'filter criteria']

In [9]:
# drop NaN from 'City' column

#large_cities_USA.dropna(subset=['City'], inplace=True)
#large_cities_USA

In [10]:
# remaining NaN values are replaced by 0 to facilitate numerical analysis

#large_cities_USA = large_cities_USA.fillna(0)

In [11]:
# rename columns to shorten column names

#total_emissions = large_cities_USA.rename(columns={'Direct emissions (metric tonnes CO2e) for Total generation of grid-supplied energy':'Grid Emissions (tons CO2e)', 'Direct emissions (metric tonnes CO2e) for Total emissions (excluding generation of grid-supplied energy)':'Other Sources (tons CO2e)'}, inplace=False)

In [12]:
# drop specific rows

#df = df.drop([row number,row number,...])

In [13]:
# generate sum of columns

#df['sum col'] = df['sum col 1'] + df['sum col 2']

In [14]:
#sector_lat = sector_data.groupby(['Sector City', 'Latitude', 'Longitude'], as_index=False).sum()
#sector_lat

In [15]:
#fig = px.scatter_geo(sector_lat,lat='Latitude',lon='Longitude', hover_name='Sector City', size='Sector CO2 Emissions')
#fig.update_geos(
    #visible=False, resolution=110, scope="usa",
    #showcountries=True, countrycolor="Black",
    #showsubunits=True, subunitcolor="Black")
#fig.update_layout(title = 'CO2 Emissions by City in the United States', title_x=0.5)
#fig.show()

In [16]:
#generate map showing CO2 emissions by sector

#fig = px.scatter_geo(sector_data,lat='Latitude',lon='Longitude', color='Sector', hover_name='Sector City', size='Sector CO2 Emissions')
#fig.update_geos(
    #visible=False, resolution=110, scope="usa",
    #showcountries=True, countrycolor="Black",
    #showsubunits=True, subunitcolor="Black")
#fig.update_layout(title = 'CO2 Emissions by Sector in the United States', title_x=0.5)
#fig.show()

# **Import and Clean Data**

The goal during this phase is to refine the data to include only relevant statistics to be analyzed. Data sets consist of information about deforestation, species loss, population, lumber exports, and agriculture in the Legal Amazon region of Brazil.

In [17]:
# load Brazil dataset
brazil_data = pd.read_excel("/content/gdrive/MyDrive/Intro to Data Sci Final Project Spring 2023/data/Brazil_Amazon_Data.xlsx")
brazil_data.head()

FileNotFoundError: [Errno 2] No such file or directory: '/content/gdrive/MyDrive/Intro to Data Sci Final Project Spring 2023/data/Brazil_Amazon_Data.xlsx'
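A `FileNotFoundError` like the one above often means that one folder in the Drive path is misspelled or is not visible to the mounted account. One way to narrow it down is to walk the path from the root and report the first component that does not exist. This is only a sketch: the helper name `first_missing_component` is hypothetical, and it uses nothing beyond the standard library.

```python
from pathlib import Path


def first_missing_component(path_str):
    """Return the shortest prefix of path_str that does not exist.

    Returns None when the full path exists.
    """
    path = Path(path_str)
    # Path.parents runs from the nearest parent up to the root,
    # so reverse it to check the path from the root downward.
    prefixes = list(path.parents)[::-1] + [path]
    for candidate in prefixes:
        if not candidate.exists():
            return candidate
    return None


# Path copied from the failing cell above.
data_path = ("/content/gdrive/MyDrive/Intro to Data Sci Final Project "
             "Spring 2023/data/Brazil_Amazon_Data.xlsx")
print(first_missing_component(data_path))
```

If the function prints a folder rather than the file itself, the problem lies in the folder structure and not in the spreadsheet name.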

In [None]:
#load Brazil dataset again (same file as above; the expanded dataset is loaded in the Exploratory Data Analysis section)
brazil_data = pd.read_excel("/content/gdrive/MyDrive/Intro to Data Sci Final Project Spring 2023/data/Brazil_Amazon_Data.xlsx")
brazil_data.head()

In [None]:
#Rename columns to avoid errors caused by spaces and symbols in column names
brazil_data = brazil_data.rename(columns={'Forest Loss (km^2)':'ForestLosskm2','CO2 Emissions (t)':'CO2EmissionsTon',
'Soybean Production (t)':'SoybeanProductionTon','Atmospheric CO2 (ppm)':'AtmosphericCO2ppm',
'Industrial Tropical Roundwood (m^3)':'IndustrialTropicalRoundwoodm3',
'Iron Ore (thousand metric ton)':'IronOreProdmetton1000', 'Forest Fires':'ForestFires'})
brazil_data.head()
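Column names need to be valid Python identifiers before they can be used in formula strings such as `'Deforestation ~ ForestFires'`. The renaming above is done by hand; as an alternative, the sketch below derives identifier-safe names automatically. The function name `to_identifier` is hypothetical, and the names it produces differ from the hand-picked ones used in this notebook.

```python
import re


def to_identifier(column_name):
    """Turn a spreadsheet header into a valid Python identifier."""
    # Replace every run of characters that are not letters, digits or
    # underscores with a single underscore, then trim the ends.
    cleaned = re.sub(r"\W+", "_", column_name).strip("_")
    # Identifiers cannot start with a digit.
    if cleaned and cleaned[0].isdigit():
        cleaned = "_" + cleaned
    return cleaned


headers = ["Forest Loss (km^2)", "CO2 Emissions (t)", "Forest Fires"]
renamed = {header: to_identifier(header) for header in headers}
print(renamed)
```

The resulting dictionary has the same shape as the mapping passed to `rename(columns=...)` above.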

In [None]:
brazil_data = brazil_data.astype(int)
print(brazil_data.dtypes)

# **Hypothesis**

**Null hypothesis:**
There is no relationship between human activity and deforestation in the Brazilian Amazon Rainforest.

**Alternative hypothesis:**
There is a relationship between human activity and deforestation in the Brazilian Amazon Rainforest.

# **Exploratory Data Analysis**

**Another dataset that contains Tropical Wood exports, Iron Ore production, and Forest Fires in addition to previous data was explored to test the relationship of other factors to Brazil Amazon Rainforest loss. This data was only available for years 2000-2021, so it was split from the initial dataset.**

In [None]:
brazil_data_exp = pd.read_excel('/content/gdrive/MyDrive/Intro to Data Sci Final Project Spring 2023/data/Brazil_Amazon_Data_Expanded.xlsx')
brazil_data_exp.head()

In [None]:
brazil_data_exp = brazil_data_exp.rename(columns={'Forest Loss (km^2)':'Deforestation','CO2 Emissions (t)':'CO2e',
'Soybean Production (t)':'Soy','Atmospheric CO2 (ppm)':'AtmCO2ppm',
'Industrial Tropical Roundwood (m^3)':'Tropicalwood',
'Iron Ore (thousand metric ton)':'Iron', 'Forest Fires':'ForestFires'})
brazil_data_exp.head()

In [None]:
#descriptive statistics for expanded dataset
brazil_data_exp.describe()

## **Central Tendency**

In [None]:
#calculate the mean Brazil Amazon Rainforest loss
brazil_data_exp['Deforestation'].mean()

In [None]:
#calculate the median Brazil Amazon Rainforest loss
brazil_data_exp['Deforestation'].median()

## **Variability of Brazil Amazon Rainforest Loss**

In [None]:
# using max() and min() method to get the range of Brazil Amazon Rainforest loss
brazil_data_exp['Deforestation'].max()-brazil_data_exp['Deforestation'].min()

In [None]:
# kde plot to show distribution
brazil_data_exp['Deforestation'].plot.kde()

In [None]:
# histogram plot to show distribution
brazil_data_exp.hist(alpha=0.5, figsize=(10, 10))
plt.tight_layout()
plt.show()

## **Correlation**

**The table below displays the correlation between each of the variables.**

In [None]:
brazil_data_exp.corr(method='pearson')
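For reference, the Pearson coefficient reported in each cell of the correlation table is the covariance of two columns divided by the product of their standard deviations. The sketch below spells that out in plain Python. The function name `pearson_r` is illustrative and the two short lists are made-up numbers, not values from the dataset.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up illustration: y rises with x, but not perfectly.
fires = [1.0, 2.0, 3.0, 4.0, 5.0]
loss = [2.1, 3.9, 6.2, 7.8, 10.1]
print(pearson_r(fires, loss))
```

Values near 1 or -1 indicate a strong linear relationship, and values near 0 indicate a weak one.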

## **Simple Linear Regression**

This simple linear regression model (OLS) explores the relationship between deforestation and forest fires.

In [None]:
model = sm.OLS.from_formula('Deforestation ~ ForestFires', data = brazil_data_exp)
result = model.fit()
result.summary()
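With a single predictor, the OLS coefficients have a closed form: the slope is the covariance of x and y divided by the variance of x, and the intercept makes the line pass through the two means. The following sketch shows that calculation in plain Python. It is an illustration of the technique under made-up numbers, not a replacement for the statsmodels summary, and the function name `fit_simple_ols` is hypothetical.

```python
def fit_simple_ols(xs, ys):
    """Return (slope, intercept, r_squared) for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared: share of the variance in y explained by the fitted line.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot


# Made-up illustration values, not taken from the dataset.
fires = [10.0, 20.0, 30.0, 40.0, 50.0]
loss = [120.0, 190.0, 330.0, 390.0, 520.0]
slope, intercept, r_squared = fit_simple_ols(fires, loss)
print(slope, intercept, r_squared)
```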

## **Multiple Linear Regression**

The multiple linear regression model best describes how each of the independent variables relates to the dependent variable, which is the annual amount of Brazil Amazon Rainforest loss in km^2. The unit for CO2 emissions is tons, the unit for tropical wood is m^3 and the unit for iron ore production is 1000 metric tons.

In [None]:
brazil_data_exp.head()

In [None]:
#run multiple linear regression
model = sm.OLS.from_formula('Deforestation ~ CO2e + Population + Cattle + Soy + AtmCO2ppm + Tropicalwood + Iron + ForestFires', data = brazil_data_exp)
result = model.fit()
result.summary()

The linear model is:

Deforestation = 3.7\*CO2e + 1.24e-05\*Cattle - 5.93\*Soy + 349.42\*AtmCO2ppm + 6.62\*ForestFires - 6.60e+04

The R-squared value for the multiple regression model is 0.716, compared with 0.359 for the simple linear regression model. The simple linear regression model only explored the relationship between deforestation and forest fires, whereas the multiple regression model explored the relationships between the dependent variable (deforestation) and all of the independent variables in the dataset.

The multiple linear model explains more of the variance in the dependent variable because it considers more of the potential factors for deforestation. From the result summary, soybean production and forest fires both have p-values that are less than 0.05, so we can reject the null hypothesis that there is no relationship between human activity and deforestation.
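R-squared never decreases when predictors are added, so the two values are easier to compare after adjusting for the number of predictors. The sketch below applies the standard adjusted R-squared formula to the two figures quoted above. The sample size of 22 is an assumption (one row per year for 2000-2021), and the function name is illustrative. The statsmodels summary also reports this quantity as "Adj. R-squared".

```python
def adjusted_r_squared(r_squared, n_observations, n_predictors):
    """Penalize R-squared for the number of predictors in the model."""
    penalty = (n_observations - 1) / (n_observations - n_predictors - 1)
    return 1 - (1 - r_squared) * penalty


# R-squared values quoted in the text; n_observations = 22 is an assumption
# (one row per year for 2000-2021).
simple = adjusted_r_squared(0.359, n_observations=22, n_predictors=1)
multiple = adjusted_r_squared(0.716, n_observations=22, n_predictors=8)
print(round(simple, 3), round(multiple, 3))
```

Under that assumption the multiple model still explains more variance after the adjustment, although the gap is narrower than the raw R-squared values suggest.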

### **Machine Learning with Linear Regression**

Here, the linear regression model is trained and tested with scikit-learn.


In [None]:
#split data into x-array, y-array

x = brazil_data_exp[['CO2e', 'Population', 'Cattle', 'Soy', 'AtmCO2ppm', 'Tropicalwood', 'Iron', 'ForestFires']]
y = brazil_data_exp['Deforestation']

Next, we split our data set into training data and test data.
scikit-learn makes it easy to divide a data set this way. To do this, we need to import the function train_test_split from the model_selection module of scikit-learn.

In [None]:
#import the train_test_split function from scikit-learn

from sklearn.model_selection import train_test_split

In [None]:
#split dataset into training and test data
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3)
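Because the split above is random, each run of the notebook can place different years in the test set, and the fitted coefficients change with it. Fixing a seed makes the split repeatable. In scikit-learn this is controlled by the `random_state` argument of `train_test_split`. The plain-Python sketch below illustrates the idea. The function name `split_indices` and the row count of 22 are assumptions made for the example.

```python
import math
import random


def split_indices(n_rows, test_size=0.3, seed=0):
    """Shuffle row indices with a fixed seed and split them into train/test."""
    rng = random.Random(seed)  # private generator, so the split is repeatable
    indices = list(range(n_rows))
    rng.shuffle(indices)
    n_test = math.ceil(test_size * n_rows)
    test = sorted(indices[:n_test])
    train = sorted(indices[n_test:])
    return train, test


# Assumption: 22 annual rows (2000-2021) in the expanded dataset.
train_rows, test_rows = split_indices(22, test_size=0.3, seed=42)
print(len(train_rows), len(test_rows))
```

With 22 rows and a 30% test share, only 7 observations end up in the test set, so test metrics from a single split should be read with caution.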

In [None]:
#import LinearRegression estimator from scikit-learn and assign to MLmodel variable
from sklearn.linear_model import LinearRegression
MLmodel = LinearRegression()

In [None]:
MLmodel.fit(x_train, y_train)
pd.DataFrame(MLmodel.coef_, x.columns, columns = ['Coeff'])

From the table above, if all other variables are held constant, a 1-unit increase in forest fires is associated with an 8.66-unit increase in deforestation. The exact coefficient changes between runs because the train/test split is random.

Now, let's make predictions from our model. We simply call the predict method on the MLmodel we created earlier. The predict() method takes an x-array parameter and generates the predicted y values.

In [None]:
predictions = MLmodel.predict(x_test)

In [None]:
plt.scatter(y_test, predictions)

As we can see, our predicted values (y-axis) are reasonably close to the actual values for the observations in the test set (x-axis). Points lying exactly on a diagonal line in this scatterplot would indicate that our model perfectly predicted the y-array values.

**Residuals Plot**

In [None]:
plt.hist(y_test - predictions)

The residuals from our machine learning model appear to be roughly normally distributed, which is a good sign, although the test set is small.

Root mean squared error (RMSE) is the square root of mean squared error (MSE). scikit-learn provides the MSE calculation, and NumPy's sqrt function converts it to RMSE. (Recent scikit-learn releases also include a dedicated RMSE function.)

In [None]:
from sklearn import metrics

In [None]:
# only the last expression in a cell is displayed, so print both metrics
mse = metrics.mean_squared_error(y_test, predictions)
print('MSE is {}'.format(mse))
print('RMSE is {}'.format(np.sqrt(mse)))


In [None]:
#calculate r2 metric
r2 = metrics.r2_score(y_test, predictions)

print('R2 score is {}'.format(r2))
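Both metrics follow directly from the residuals. RMSE is the square root of the mean squared residual, and R2 compares the squared residuals with the squared deviations of the actual values from their mean. The sketch below computes both from their definitions on made-up numbers. The function names are illustrative and are not scikit-learn's.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean squared error of the predictions."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Made-up illustration values, not taken from the dataset.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.0, 5.0, 8.0, 9.0]
print(rmse(actual, predicted), r_squared(actual, predicted))
```

RMSE is expressed in the units of the target (km^2 of forest loss in this notebook), while R2 is unitless.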

## **Experiments**

In [None]:
#Create dataset of deforestation by brazil state
#Map of deforestation hotspots
#enter data for latitude and longitude of brazil states, columns have to be the same length
#Latitude = [lat1, lat2]
#Longitude = [long1,long2]
#brazil_states['Latitude'] = Latitude
#brazil_states['Longitude'] = Longitude

# map of CO2 emissions by city population

#fig = px.scatter_geo(total_emissions_citiesUSA,lat='Latitude',lon='Longitude', color='Population', hover_name='City', size='Total_Emissions')
#fig.update_geos(
    #visible=False, resolution=110, scope="usa",
    #showcountries=True, countrycolor="Black",
    #showsubunits=True, subunitcolor="Black")
#fig.update_layout(title = "Deforestation in Brazil's Amazon Rainforest by State", title_x=0.5)
#fig.show()

## **Applied Statistical Methods**



*   **Estimation**
*   **Confidence Interval**
*   **Linear Regression**
    *   *multiple regression*
    *   *machine learning*



In [None]:
brazil_data.describe()

In [None]:
# defining the variables

x = brazil_data['AtmosphericCO2ppm'].tolist()
y = brazil_data['ForestLosskm2'].tolist()

# adding the constant term

x = sm.add_constant(x)

# performing the regression
# and fitting the model

result = sm.OLS(y, x).fit()

# printing the summary table

print(result.summary())

In [None]:
#MLmodel.fit(x_train, y_train)
#pd.DataFrame(MLmodel.coef_, x.columns, columns = ['Coeff'])

In [None]:
#model3 = sm.OLS.from_formula('COUNT ~ WEATHERSIT + TEMP + ATEMP + WINDSPEED', bikeshare)
#result3 = model3.fit()
#result3.summary()

## **Decision Tree**

In [None]:
y_data = brazil_data['ForestLosskm2']
x_data = brazil_data.drop(['ForestLosskm2', 'Year'], axis = 1)

In [None]:
from sklearn.model_selection import train_test_split
x_training_data, x_test_data, y_training_data, y_test_data = train_test_split(x_data, y_data, test_size = 0.60)

In [None]:
from sklearn import tree
classifier = tree.DecisionTreeClassifier()
dt_model = classifier.fit(x_training_data, y_training_data)

In [None]:
classifier.score(x_test_data,y_test_data)
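`ForestLosskm2` is a continuous quantity, so `DecisionTreeClassifier` treats every distinct value as its own class and `score` reports exact-match accuracy. A regression tree (scikit-learn provides one as `DecisionTreeRegressor`) instead chooses splits that minimize squared error around the mean of each branch. The sketch below shows that idea for a single split on one feature. It is a simplified illustration on made-up numbers, and `best_split` is a hypothetical helper, not the algorithm as implemented in scikit-learn.

```python
def sum_squared_error(values):
    """Squared deviations of the values from their own mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(xs, ys):
    """Find the threshold on one feature that minimizes squared error.

    Returns (threshold, error, left_mean, right_mean), or None when the
    feature has fewer than two distinct values.
    """
    best = None
    candidates = sorted(set(xs))
    # Try a threshold halfway between each pair of neighbouring values.
    for low, high in zip(candidates, candidates[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        error = sum_squared_error(left) + sum_squared_error(right)
        if best is None or error < best[1]:
            best = (threshold, error,
                    sum(left) / len(left), sum(right) / len(right))
    return best


# Made-up illustration values, not taken from the dataset.
fires = [10, 12, 14, 40, 42, 45]
loss = [5.0, 6.0, 5.5, 12.0, 13.0, 12.5]
threshold, error, left_mean, right_mean = best_split(fires, loss)
print(threshold, left_mean, right_mean)
```

Each branch predicts the mean of its training values, so the output is a number on the same scale as the target instead of a class label.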

In [None]:
target = list(brazil_data['ForestLosskm2'].unique())
feature_names=list(x_data.columns)

In [None]:
#textual plot
from sklearn.tree import export_text
r = export_text(dt_model, feature_names=feature_names)
print(r)

In [None]:
## tree plot
import graphviz
tree_data = tree.export_graphviz(dt_model, feature_names=feature_names, filled=True,
                                out_file=None
                                )
graph = graphviz.Source(tree_data)

graph

**Comparison of different classification models**

In [None]:
#import all classifiers
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split, KFold, cross_val_score, GridSearchCV
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier, BaggingClassifier, ExtraTreesClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from xgboost import XGBClassifier, plot_importance

In [None]:
# assign dependent variable data and independent variable data
npX = np.array(x_data).copy()
npy = np.array(y_data).copy()

In [None]:
#initiate the classifiers
clf_rf = RandomForestClassifier()
clf_et = ExtraTreesClassifier()
clf_bc = BaggingClassifier()
clf_ada = AdaBoostClassifier()
clf_dt = DecisionTreeClassifier()
clf_xg = XGBClassifier()
clf_lr = LogisticRegression()
clf_svm = SVC()

In [None]:
# 5-fold cross validation of the list of classifiers and calculate mean accuracy.
Classifiers = ['RandomForest','ExtraTrees','Bagging','AdaBoost','DecisionTree','XGBoost','LogisticRegression','SVM']
scores = []
models = [clf_rf, clf_et, clf_bc, clf_ada, clf_dt, clf_xg, clf_lr, clf_svm]
for model in models:
    score = cross_val_score(model, npX, npy, scoring = 'accuracy', cv = 5, n_jobs = -1).mean()
    scores.append(score)

In [None]:
maccuracy = pd.DataFrame(scores, index = Classifiers, columns = ['score']).sort_values(by = 'score',
             ascending = False)
maccuracy

**Accuracy score of decision tree model**
      
      The accuracy of the decision tree model is (...)

**Most important variable for predicting forest loss**

      (...) appears at the top of the decision tree, so it appears to be the most important for predicting forest loss in the Amazon Rainforest in Brazil.

**Most accurate classification model**

      The (...) model is the best one with a high score of (...)

## **Estimation**

In [None]:
#total_emissions

In [None]:
#ratios = total_emissions[['Grid Emissions (tons CO2e)', 'Total Emissions (tons CO2e)']]
#ratios['ratio'] = total_emissions['Grid Emissions (tons CO2e)']/total_emissions['Total Emissions (tons CO2e)']
#ratios

In [None]:
#ratios = ratios.drop([379,413,681])

In [None]:
#ratios['ratio'].hist()

In [None]:
#ratios_median = ratios["ratio"].median()
#ratios_median

The median ratio of (grid emissions) was approximately (0.10), which means that (grid emissions generally represent 10% of total emissions.)

In [None]:
#our_sample = ratios.sample(500, replace=True)
#our_sample['ratio'].hist()

In [None]:
#our_sample["ratio"].median()

(A random sample of 500 data points also predicted that grid emissions represent 10% of total emissions.)

In [None]:
#resample_1 = our_sample.sample(500, replace=True)
#resample_1['ratio'].hist()

In [None]:
#resample_1["ratio"].median()

The second set of random samples generated (an even lower approximate median of .07), which indicates that (a larger sample would predict grid emissions to be 7% of total emissions.)

In [None]:
#resample_2 = our_sample.sample(500, replace=True)
#resample_2["ratio"].median()

In [None]:
#def one_bootstrap_median():
  #resampled_table = our_sample.sample(500, replace=True)
  #return resampled_table["ratio"].median()

In [None]:
#one_bootstrap_median()

In [None]:
#num_repetitions = 5000
#bstrap_medians = make_array()
#for i in np.arange(num_repetitions):
    #bstrap_medians = np.append(bstrap_medians, one_bootstrap_median())

In [None]:
#left = percentile(2.5, bstrap_medians)
#left.round(4)

In [None]:
#right = percentile(97.5, bstrap_medians)
#right.round(4)

Initial median was (0.1024). 95% confidence interval is from (0.0664 to 0.1384). (Grid emissions are likely to be within the range of 6.64% to 13.84% of total emissions.)
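The bootstrap steps in this section (resample with replacement, record the median, repeat, then take the 2.5th and 97.5th percentiles) can be collected into one function. The sketch below uses only the standard library. The function name `bootstrap_median_ci`, the seed, and the sample values are all made up for illustration, and the percentile lookup is a simple approximation by index.

```python
import random
import statistics


def bootstrap_median_ci(sample, num_repetitions=5000, confidence=0.95, seed=0):
    """Approximate percentile confidence interval for the median."""
    rng = random.Random(seed)  # fixed seed so the interval is repeatable
    medians = []
    for _ in range(num_repetitions):
        # Resample with replacement, same size as the original sample.
        resample = rng.choices(sample, k=len(sample))
        medians.append(statistics.median(resample))
    medians.sort()
    tail = (1 - confidence) / 2
    lower_index = round(tail * num_repetitions)
    upper_index = round((1 - tail) * num_repetitions) - 1
    return medians[lower_index], medians[upper_index]


# Made-up ratios for illustration only.
ratios = [0.04, 0.07, 0.08, 0.10, 0.10, 0.11, 0.13, 0.15, 0.19, 0.25]
left, right = bootstrap_median_ci(ratios)
print(left, right)
```

The same function applies to any numeric column, for example annual forest loss, once the placeholder analysis in this section is replaced with the project's data.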

## **Empirical histogram of bootstrapped medians with confidence interval**

In [None]:
#resampled_medians = Table().with_column('Bootstrap Sample Median', bstrap_medians)
#median_bins=np.arange(0.410, 0.450, 0.002)
#resampled_medians.hist(bins = median_bins)

# Plotting parameters;
#parameter_green = '#32CD32'
#plt.ylim(-10, 200)
#plt.scatter(ratios_median, 0, color=parameter_green, s=40, zorder=2)

In [None]:
#resampled_medians.hist(bins = median_bins)

# Plotting parameters
#plt.ylim(-10, 200)
#plt.plot([left, right], [0, 0], color='yellow', lw=5, zorder=1)
#plt.scatter(ratios_median, 0, color=parameter_green, s=40, zorder=2);

# **Visualization**

- gini
- heat map
- geo map

# **Final Analysis and Conclusion**

- Relationship between deforestation and each of the variables
- Which variable/s showed the strongest correlation
- How is this useful?
- What questions developed from the findings?
- Possible future expansion of study

Land clearing and resource extraction are difficult to avoid because they drive economic growth in a country where much of the population has very little income to support themselves and their families. Rather than attempting to stop these activities altogether, the government needs to pass legislation and enforce accountability on those responsible for illegal mining and logging. Resources must be harvested sustainably so that the Amazon Rainforest can become more resilient and regenerate its natural resources.

# **Data Sources and References**

In [None]:
#Data Sources:
#Deforestation in Brazil Legal Amazon by State 1987-2022
#http://terrabrasilis.dpi.inpe.br/app/dashboard/deforestation/biomes/legal_amazon/rates

#Bovine herds in Brazil Legal Amazon 1974-2021
#https://sidra.ibge.gov.br/tabela/3939

#Most exported products Brazil 2021
#https://www.statista.com/statistics/1191541/products-exported-from-brazil/

#Timber Trade Portal
#https://www.timbertradeportal.com/en/brazil/11/timber-sector

#Statista Brazil Amazon Dossier
#https://www.statista.com/topics/6866/amazon-rainforest-in-brazil/#dossier-chapter2

#ACS Publication - Beef & Deforestation
#https://pubs.acs.org/doi/10.1021/es103240z#

#Our World in Data Brazil CO2
#https://ourworldindata.org/co2/country/brazil

#Our World in Data Soy
#https://ourworldindata.org/soy

#OEC Soy Data Brazil
#https://oec.world/en/profile/bilateral-product/soybeans/reporter/bra

#Iron Ore Production
#https://unctad.org/system/files/official-document/cn1ironore17.en.pdf
#https://www.usgs.gov/centers/national-minerals-information-center/iron-ore-statistics-and-information

#Fires
#https://www.statista.com/statistics/1044328/number-wildfires-legal-amazon/

#Mapping


#List of States in Brazil with Latitude and Longitude
#https://www.distancelatlong.com/country/brazil/




**References**

In Depth Analysis from European Parliament: Brazil and the Amazon Rainforest: Deforestation, Biodiversity and Cooperation with the EU and International Forums
https://www.europarl.europa.eu/RegData/etudes/IDAN/2020/648792/IPOL_IDA(2020)648792_EN.pdf

Global Carbon Project
https://www.globalcarbonproject.org/about/index.htm

NASA EOSDIS
https://earthobservatory.nasa.gov/blogs/earthmatters/category/climate/

Brazilian Institute of Geography and Statistics (IBGE)
https://sidra.ibge.gov.br

Amazon Conservation Association
https://www.amazonconservation.org/the-challenge/threats/

Rainforest Foundation US
https://rainforestfoundation.org/our-work/where-we-work/brazil/
