## **Project Title: A Visual Exploration of the Relationship Between Trade Tariffs and Germany's Economic Growth from 1988 to 2021**

> Author: Abdullah Akintobi  
> Date: September 20, 2025


### **Table of Contents**

<ul>
    <li><a href="#phase-1-project-scoping--data-preparation">Phase 1: Project Scoping & Data Preparation</a>
        <ul>
            <li><a href="#11-introduction">1.1 Introduction</a></li>
            <li><a href="#12-import-dependencies">1.2 Import Dependencies</a></li>
            <li><a href="#13-data-loading-and-initial-observations">1.3 Data Loading and Initial Observations</a></li>
            <li><a href="#14-data-preprocessing">1.4 Data Preprocessing</a></li>
        </ul>
    </li>
    <li><a href="#phase-2-exploratory-data-analysis-eda--visualization">Phase 2: Exploratory Data Analysis (EDA) & Visualization</a>
        <ul>
            <li><a href="#21-descriptive-statistics">2.1 Descriptive Statistics</a></li>
            <li><a href="#22-advanced-data-analysis">2.2 Advanced Data Analysis</a>
                <ul>
                    <li><a href="#221-univariate-analysis">2.2.1 Univariate Analysis</a></li>
                    <li><a href="#222-bivariate-analysis">2.2.2 Bivariate Analysis</a></li>
                    <li><a href="#223-multivariate-analysis">2.2.3 Multivariate Analysis</a></li>
                </ul>
            </li>
        </ul>
    </li>
    <li><a href="#phase-3-research-questions-analysis--conclusion">Phase 3: Research Questions Analysis & Conclusion</a>
        <ul>
            <li><a href="#31-research-questions-analysis">3.1 Research Questions Analysis</a>
                <ul>
                    <li><a href="#311-temporal-analysis-1988-2021">3.1.1 Temporal Analysis (1988-2021)</a></li>
                    <li><a href="#312-correlation-analysis">3.1.2 Correlation Analysis</a></li>
                    <li><a href="#313-trade-volume-and-policy-analysis">3.1.3 Trade Volume and Policy Analysis</a></li>
                </ul>
            </li>
            <li><a href="#32-insights--conclusion">3.2 Insights & Conclusion</a></li>
        </ul>
    </li>
</ul>

## **Phase 1: Project Scoping & Data Preparation** <a id='phase-1-project-scoping--data-preparation'></a>

In this initial phase, we set up the requirements for the analysis, define the project's purpose, and make sure the data is clean and ready to use.

### **1.1 Introduction** <a id='11-introduction'></a>

This project conducts an exploratory data analysis of Germany's trade and economic performance from 1988 to 2021 to evaluate the influence of tariff policy changes on key economic indicators.

The dataset used for this analysis is the "World Export and Import Dataset (1988-2021)", sourced from Kaggle. Compiled from reputable sources such as the World Bank and the World Trade Organization, it provides annual information on Germany's trade volumes, economic growth, and key tariff metrics for each year from 1988 to 2021. Click [here](https://www.kaggle.com/datasets/muhammadtalhaawan/world-export-and-import-dataset) to access the dataset directly on Kaggle. Note that the copy used here was downloaded from Kaggle on September 2, 2025, in case the dataset has since been updated by its author.

The central problem is to assess how Germany's tariff rate fluctuations correlate with its economic growth and trade volumes between 1988 and 2021.

The analysis will address this problem by answering the following specific research questions:

- **Correlation with Economic Growth:** What is the statistical correlation between changes in Germany's tariff rates (both `AHS Simple Average (%)` and `MFN Simple Average (%)`) and its `Country Growth (%)`?

- **Impact on Trade Volume:** How do changes in tariff policies correspond with fluctuations in Germany's total trade volumes, as measured by `Export (US$ Thousand)` and `Import (US$ Thousand)`?

- **Historical Analysis:** Can distinct periods be identified where a shift in trade policy appears to have a measurable impact on Germany's economic trajectory?

The insights gained will provide descriptive evidence on the relationship between a nation's trade policy and its economic health, with Germany serving as the case study.

### **1.2 Import Dependencies** <a id='12-import-dependencies'></a>

In [1]:
# Import necessary libraries
import numpy as np
import pandas as pd
import plotly.express as px
from scipy import stats

# Set display options
pd.set_option("display.max_columns", None)
pd.set_option("display.width", None)
pd.set_option("display.float_format", "{:,.2f}".format)

### **1.3 Data Loading and Initial Observations** <a id='13-data-loading-and-initial-observations'></a>

In this section, we will:

- Load and preview the raw dataset.

- Select the features to be used for the analysis.

In [2]:
# Load and preview the raw dataset
df_raw = pd.read_csv("../data/raw/34_years_world_export_import_dataset.csv")
df_raw.head()

Unnamed: 0,Partner Name,Year,Export (US$ Thousand),Import (US$ Thousand),Export Product Share (%),Import Product Share (%),Revealed comparative advantage,World Growth (%),Country Growth (%),AHS Simple Average (%),AHS Weighted Average (%),AHS Total Tariff Lines,AHS Dutiable Tariff Lines Share (%),AHS Duty Free Tariff Lines Share (%),AHS Specific Tariff Lines Share (%),AHS AVE Tariff Lines Share (%),AHS MaxRate (%),AHS MinRate (%),AHS SpecificDuty Imports (US$ Thousand),AHS Dutiable Imports (US$ Thousand),AHS Duty Free Imports (US$ Thousand),MFN Simple Average (%),MFN Weighted Average (%),MFN Total Tariff Lines,MFN Dutiable Tariff Lines Share (%),MFN Duty Free Tariff Lines Share (%),MFN Specific Tariff Lines Share (%),MFN AVE Tariff Lines Share (%),MFN MaxRate (%),MFN MinRate (%),MFN SpecificDuty Imports (US$ Thousand),MFN Dutiable Imports (US$ Thousand),MFN Duty Free Imports (US$ Thousand)
0,Aruba,1988,3498.1,328.49,100.0,100,,,,2.8,2.92,155.0,18.06,60.0,20.0,1.94,50.0,0.0,1867.0,2346.37,781.65,13.59,8.46,1152.0,63.54,22.74,70.32,31.61,352.69,0.0,2186.0,3128.02,0.0
1,Afghanistan,1988,213030.4,54459.52,100.0,100,,,,0.88,1.83,548.0,8.76,82.66,8.03,0.55,35.0,0.0,30863.03,70204.13,23987.37,17.68,12.43,4142.0,69.41,15.64,72.45,40.51,2029.66,0.0,78436.91,94191.5,0.0
2,Angola,1988,375527.89,370702.76,100.0,100,,,,2.02,3.89,633.0,25.43,69.19,5.37,0.0,40.0,0.0,723819.51,754183.84,167297.68,12.7,6.14,5438.0,76.0,16.27,41.55,24.8,451.15,0.0,727741.99,921481.52,0.0
3,Anguila,1988,366.98,4.0,100.0,100,,,,3.71,1.09,33.0,6.06,72.73,21.21,0.0,35.0,0.0,60.0,65.0,518.0,16.63,14.75,322.0,66.15,22.05,78.79,36.36,100.0,0.0,94.0,583.0,0.0
4,Albania,1988,30103.56,47709.3,100.0,100,,,,1.84,2.38,744.0,20.83,60.48,17.61,1.08,25.0,0.0,18806.15,62294.53,38901.42,19.2,9.68,5684.0,66.87,19.19,57.93,48.52,3000.0,0.0,37904.09,101195.95,0.0


In [3]:
# Check the shape of the dataset
df_raw.shape

(8096, 33)

In [4]:
# Get unique values in Partner Name column and sort them alphabetically
unique_partners = sorted(df_raw["Partner Name"].unique())

# Print partners that start with 'Ger'
for partner in unique_partners:
    if partner.startswith("Ger"):
        print(partner)

German Democratic Republic
Germany


In [5]:
# Check the years for 'German Democratic Republic'
df_raw[df_raw["Partner Name"] == "German Democratic Republic"]["Year"].unique()

array([1988, 1989, 1990])

**Observations:**

The dataset contains records for both the **[German Democratic Republic](https://en.wikipedia.org/wiki/East_Germany)** (from 1988 to 1990) and **Germany** (from 1988 to 2021). We will combine the records for both countries from 1988 to 1990 into a single unified record for Germany. This is crucial for our analysis because it accurately reflects the political and economic reality of **Germany's reunification in 1990**. Merging these records allows for a **continuous time-series analysis** over the entire 1988-2021 period, preventing a misleading split in the data and providing a more historically representative view of the country's economic trends.

We also observed that the dataset contains 8,096 rows and 33 columns. For the purpose of this project, we will focus on a subset of the data that is essential for our analysis.

Below are the seven key features we will use:

  - **`Partner Name`**: This column will be used to filter the data specifically for 'Germany'.

  - **`Year`**: This time-series variable is essential for plotting data over the 1988-2021 period.

  - **`Country Growth (%)`**: The primary variable for measuring Germany's economic growth.

  - **`AHS Simple Average (%)`**: One of two key tariff variables, representing the effectively applied tariff rate (labelled AHS in World Bank WITS data).

  - **`MFN Simple Average (%)`**: The second key tariff variable, representing the "Most Favored Nation" tariff rate.

  - **`Export (US$ Thousand)`**: Used to analyze the volume of trade over time.

  - **`Import (US$ Thousand)`**: Also used to analyze the volume of trade over time.

You can click [here](../data/meta/data_dictionary.csv) to see the full description of each column in the dataset. In the next section, we will preprocess the data by performing feature selection, removing unwanted records, and merging the records for Germany and the German Democratic Republic.

### **1.4 Data Preprocessing** <a id='14-data-preprocessing'></a>

In this section, we will:

- Drop the columns that are not necessary for our analysis.

- Remove the records where Partner Name is not German Democratic Republic or Germany.

- Merge the German Democratic Republic and Germany records as Germany from the years 1988 to 1990.

In [6]:
# Define the columns (features) required for the analysis
FEATURES_TO_KEEP = [
    "Partner Name",
    "Year",
    "Country Growth (%)",
    "AHS Simple Average (%)",
    "MFN Simple Average (%)",
    "Export (US$ Thousand)",
    "Import (US$ Thousand)",
]

# Drop columns that are not necessary for the analysis
df_filtered = df_raw[FEATURES_TO_KEEP]

# Define the countries of interest
countries_of_interest = ["Germany", "German Democratic Republic"]

# Filter the DataFrame to include only records for the countries of interest
df_germany_gdr = df_filtered[
    df_filtered["Partner Name"].isin(countries_of_interest)
].copy()

# Preview the cleaned and filtered DataFrame
df_germany_gdr.head(10)

| | Partner Name | Year | Country Growth (%) | AHS Simple Average (%) | MFN Simple Average (%) | Export (US\$ Thousand) | Import (US\$ Thousand) |
|---|---|---|---|---|---|---|---|
| 47 | German Democratic Republic | 1988 | NaN | 7.12 | 14.39 | 1457269.16 | 2119791.68 |
| 48 | Germany | 1988 | NaN | 10.87 | 14.06 | 34936844.6 | 41497301.75 |
| 254 | German Democratic Republic | 1989 | NaN | 9.08 | 23.22 | 2219953.81 | 2777570.21 |
| 255 | Germany | 1989 | 3.78 | 14.66 | 23.62 | 56006031.96 | 74625428.25 |
| 464 | German Democratic Republic | 1990 | NaN | 38.61 | 20.13 | 2292162.27 | 2934573.27 |
| 465 | Germany | 1990 | 12.71 | 14.32 | 21.52 | 71575355.58 | 94023242.66 |
| 677 | Germany | 1991 | 6.56 | 14.67 | 15.62 | 103376006.33 | 121689855.99 |
| 892 | Germany | 1992 | 2.46 | 14.58 | 15.84 | 166404685.94 | 190486407.63 |
| 1124 | Germany | 1993 | -8.4 | 12.3 | 16.1 | 190361676.92 | 209670232.14 |
| 1358 | Germany | 1994 | 5.54 | 15.87 | 18.65 | 315745347.69 | 348752497.1 |


In [7]:
# Define the years for merging
YEARS_TO_MERGE = [1988, 1989, 1990]

# Split data: merge years vs keep others
df_merge = df_germany_gdr[df_germany_gdr["Year"].isin(YEARS_TO_MERGE)].copy()
df_remaining = df_germany_gdr[~df_germany_gdr["Year"].isin(YEARS_TO_MERGE)].copy()

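# Aggregate numeric data for merge years
# (note: sum() adds every numeric column, so the tariff percentages
# of the two partners are added together as well)
df_aggregated = df_merge.groupby("Year", as_index=False).sum(numeric_only=True)

# sum() returns 0 when every value in a group is NaN,
# so restore the missing 1988 growth value to NaN
mask_1988 = (df_aggregated["Year"] == 1988) & (df_aggregated["Country Growth (%)"] == 0)
df_aggregated.loc[mask_1988, "Country Growth (%)"] = np.nan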

# Assign unified partner name and keep required features
df_aggregated["Partner Name"] = "Germany"
df_aggregated = df_aggregated[FEATURES_TO_KEEP]

# Combine merged and remaining data
df_final = (
    pd.concat([df_aggregated, df_remaining])
    .sort_values(by="Year")
    .reset_index(drop=True)
)

# Preview final dataset
df_final.head()

| | Partner Name | Year | Country Growth (%) | AHS Simple Average (%) | MFN Simple Average (%) | Export (US\$ Thousand) | Import (US\$ Thousand) |
|---|---|---|---|---|---|---|---|
| 0 | Germany | 1988 | NaN | 17.99 | 28.45 | 36394113.76 | 43617093.43 |
| 1 | Germany | 1989 | 3.78 | 23.74 | 46.84 | 58225985.77 | 77402998.46 |
| 2 | Germany | 1990 | 12.71 | 52.93 | 41.65 | 73867517.85 | 96957815.93 |
| 3 | Germany | 1991 | 6.56 | 14.67 | 15.62 | 103376006.33 | 121689855.99 |
| 4 | Germany | 1992 | 2.46 | 14.58 | 15.84 | 166404685.94 | 190486407.63 |
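
Note that `sum(numeric_only=True)` adds up every numeric column, so the merged 1988-1990 rows hold the sum of the two partners' tariff percentages (for example, 7.12 + 10.87 = 17.99 for the 1988 AHS rate) rather than a combined rate. A minimal sketch of one possible alternative follows, assuming that trade volumes should be summed while percentage columns are averaged. The simple mean and the `AGG_RULES` name are illustrative assumptions, not the method used in the rest of this notebook. The six input rows are copied from the filtered preview earlier in this section.

```python
import numpy as np
import pandas as pd

# The six 1988-1990 rows shown in the filtered preview
rows = pd.DataFrame(
    {
        "Partner Name": ["German Democratic Republic", "Germany"] * 3,
        "Year": [1988, 1988, 1989, 1989, 1990, 1990],
        "Country Growth (%)": [np.nan, np.nan, np.nan, 3.78, np.nan, 12.71],
        "AHS Simple Average (%)": [7.12, 10.87, 9.08, 14.66, 38.61, 14.32],
        "MFN Simple Average (%)": [14.39, 14.06, 23.22, 23.62, 20.13, 21.52],
        "Export (US$ Thousand)": [
            1457269.16, 34936844.6, 2219953.81,
            56006031.96, 2292162.27, 71575355.58,
        ],
        "Import (US$ Thousand)": [
            2119791.68, 41497301.75, 2777570.21,
            74625428.25, 2934573.27, 94023242.66,
        ],
    }
)

# Assumption: average the percentage columns, sum the volume columns
AGG_RULES = {
    "Country Growth (%)": "mean",
    "AHS Simple Average (%)": "mean",
    "MFN Simple Average (%)": "mean",
    "Export (US$ Thousand)": "sum",
    "Import (US$ Thousand)": "sum",
}
merged = rows.groupby("Year", as_index=False).agg(AGG_RULES)

# min_count=1 makes sum() return NaN (not 0) when a group has no valid values
growth_sum = rows.groupby("Year")["Country Growth (%)"].sum(min_count=1)

print(merged)
print(growth_sum)
```

With this variant the all-NaN 1988 growth value stays NaN without a manual fix, because `mean` skips missing values; `sum(min_count=1)` gives the same behaviour for summed columns. The statistics later in this notebook are based on the summed values, so they would change if this variant were adopted.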


In [8]:
# Check data information
df_final.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 34 entries, 0 to 33
Data columns (total 7 columns):
 #   Column                  Non-Null Count  Dtype  
---  ------                  --------------  -----  
 0   Partner Name            34 non-null     object 
 1   Year                    34 non-null     int64  
 2   Country Growth (%)      33 non-null     float64
 3   AHS Simple Average (%)  34 non-null     float64
 4   MFN Simple Average (%)  34 non-null     float64
 5   Export (US$ Thousand)   34 non-null     float64
 6   Import (US$ Thousand)   34 non-null     float64
dtypes: float64(5), int64(1), object(1)
memory usage: 2.0+ KB


In [9]:
# Check for duplicates in the processed data
print(f"Number of duplicate records: {df_final.duplicated().sum()}")

Number of duplicate records: 0


In [10]:
# Count the number of null (missing) values in each column
df_final.isnull().sum()

Partner Name              0
Year                      0
Country Growth (%)        1
AHS Simple Average (%)    0
MFN Simple Average (%)    0
Export (US$ Thousand)     0
Import (US$ Thousand)     0
dtype: int64

In [11]:
# Get growth rates for 1989 and 1990
growth_1989 = df_final.loc[df_final["Year"] == 1989, "Country Growth (%)"].values[0]
growth_1990 = df_final.loc[df_final["Year"] == 1990, "Country Growth (%)"].values[0]

# Estimate 1988 as a weighted average of the next two years: 75% for 1989, 25% for 1990
weighted_growth_1988 = round(0.75 * growth_1989 + 0.25 * growth_1990, 2)

# Fill the missing value in 1988
df_final.loc[
    (df_final["Year"] == 1988) & (df_final["Country Growth (%)"].isna()),
    "Country Growth (%)",
] = weighted_growth_1988

# Convert Export and Import columns from thousands of US$ to US$
df_final["Export (US$)"] = df_final["Export (US$ Thousand)"] * 1000
df_final["Import (US$)"] = df_final["Import (US$ Thousand)"] * 1000

In [12]:
# Drop the specified columns as they are no longer needed
df_final.drop(
    columns=["Partner Name", "Export (US$ Thousand)", "Import (US$ Thousand)"],
    inplace=True,
)
df_final.head()

| | Year | Country Growth (%) | AHS Simple Average (%) | MFN Simple Average (%) | Export (US\$) | Import (US\$) |
|---|---|---|---|---|---|---|
| 0 | 1988 | 6.01 | 17.99 | 28.45 | 36394113760.0 | 43617093430.0 |
| 1 | 1989 | 3.78 | 23.74 | 46.84 | 58225985770.0 | 77402998460.0 |
| 2 | 1990 | 12.71 | 52.93 | 41.65 | 73867517850.0 | 96957815930.0 |
| 3 | 1991 | 6.56 | 14.67 | 15.62 | 103376006330.0 | 121689855990.0 |
| 4 | 1992 | 2.46 | 14.58 | 15.84 | 166404685940.0 | 190486407630.0 |


In [13]:
# Save the processed data
df_final.to_csv("../data/processed/germany_trade_data.csv", index=False)

**Observations**

After merging the records for the **German Democratic Republic** and **Germany** for the years 1988, 1989, and 1990, we observed that the `Country Growth (%)` record for the year 1988 is missing in the merged data. This occurred because the `Country Growth (%)` was unavailable for both the German Democratic Republic and Germany in 1988, resulting in a single null value in the `Country Growth (%)` column of our new DataFrame. We also confirmed that the new DataFrame contains no duplicate records and that all column datatypes are correct.

To address the missing value, we did not remove the 1988 record, because dropping a point from a time series introduces a gap and can bias the analysis. Instead, we fill it with a **weighted average** of the two following years (strictly an extrapolation rather than an interpolation, since 1988 lies outside the 1989-1990 range). The 1989 growth rate receives more weight because it is chronologically closer to 1988, and the 1990 growth rate receives less because it was unusually high due to the economic effects of reunification. Specifically, we apply a 75%-25% weighting to the 1989 and 1990 growth rates, which gives an estimate of 6.01% for 1988. This keeps the series continuous while limiting the risk of overstating 1988's growth because of the 1990 anomaly.
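
Written out, with $g_t$ denoting `Country Growth (%)` in year $t$ (a symbol introduced here only for illustration), the estimate is:

$$
\hat{g}_{1988} = 0.75\,g_{1989} + 0.25\,g_{1990} = 0.75 \times 3.78 + 0.25 \times 12.71 = 2.835 + 3.1775 \approx 6.01
$$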

We also converted `Export (US$ Thousand)` and `Import (US$ Thousand)` to `Export (US$)` and `Import (US$)` by multiplying them by 1,000, so that results are expressed directly in US dollars. We then dropped the original thousand-based columns together with the `Partner Name` column, which is redundant now that the dataset covers only Germany and the column holds a single unique value. Removing it simplifies the dataset without affecting the analysis.

In the next section, we will load our [processed data](../data/processed/germany_trade_data.csv) and begin our exploratory data analysis and visualization.

## **Phase 2: Exploratory Data Analysis (EDA) & Visualization** <a id='phase-2-exploratory-data-analysis-eda--visualization'></a>

This is the core of the project. We'll use statistical and visual techniques to uncover patterns and relationships within the data.

### **2.1 Descriptive Statistics** <a id='21-descriptive-statistics'></a>

In [14]:
# Load and preview the processed data
ger_trade = pd.read_csv("../data/processed/germany_trade_data.csv")
ger_trade.tail()

| | Year | Country Growth (%) | AHS Simple Average (%) | MFN Simple Average (%) | Export (US\$) | Import (US\$) |
|---|---|---|---|---|---|---|
| 29 | 2017 | 2.73 | 5.66 | 8.36 | 1063050623690.0 | 1342854801130.0 |
| 30 | 2018 | 7.45 | 6.1 | 8.74 | 1182722787690.0 | 1442346350390.0 |
| 31 | 2019 | -2.06 | 7.06 | 9.19 | 1138191259210.0 | 1367050118960.0 |
| 32 | 2020 | -2.73 | 6.12 | 8.3 | 1092044662420.0 | 1291639475470.0 |
| 33 | 2021 | 10.2 | 5.76 | 8.52 | 1353626272540.0 | 1538830199200.0 |


In [15]:
# Calculate measures of central tendency
central_tendency = pd.DataFrame(
    {
        "Mean": ger_trade.mean(numeric_only=True),
        "Median": ger_trade.median(numeric_only=True),
        "Mode": ger_trade.mode(numeric_only=True).iloc[0],
    }
)

print("Measures of Central Tendency:")
central_tendency

Measures of Central Tendency:


| | Mean | Median | Mode |
|---|---|---|---|
| Year | 2004.5 | 2004.5 | 1988.0 |
| Country Growth (%) | 2.93 | 2.85 | 2.85 |
| AHS Simple Average (%) | 10.95 | 8.68 | 7.24 |
| MFN Simple Average (%) | 13.33 | 9.7 | 8.3 |
| Export (US\$) | 685184509489.41 | 702868423590.0 | 36394113760.0 |
| Import (US\$) | 837879149312.65 | 894765282100.0 | 43617093430.0 |


In [16]:
# Calculate measures of dispersion
dispersion = pd.DataFrame(
    {
        "Variance": ger_trade.var(numeric_only=True),
        "Std Dev": ger_trade.std(numeric_only=True),
        "Minimum": ger_trade.min(numeric_only=True),
        "Maximum": ger_trade.max(numeric_only=True),
        "Range": ger_trade.max(numeric_only=True) - ger_trade.min(numeric_only=True),
        "IQR": ger_trade.quantile(0.75, numeric_only=True)
        - ger_trade.quantile(0.25, numeric_only=True),
    }
)

print("Measures of Dispersion:")
dispersion.T

Measures of Dispersion:


| | Year | Country Growth (%) | AHS Simple Average (%) | MFN Simple Average (%) | Export (US\$) | Import (US\$) |
|---|---|---|---|---|---|---|
| Variance | 99.17 | 33.08 | 70.93 | 78.19 | 1.522852520429829e+23 | 2.389952455260996e+23 |
| Std Dev | 9.96 | 5.75 | 8.42 | 8.84 | 390237430345.92 | 488871399783.32 |
| Minimum | 1988.0 | -11.75 | 5.66 | 8.3 | 36394113760.0 | 43617093430.0 |
| Maximum | 2021.0 | 12.71 | 52.93 | 46.84 | 1353626272540.0 | 1538830199200.0 |
| Range | 33.0 | 24.46 | 47.27 | 38.54 | 1317232158780.0 | 1495213105770.0 |
| IQR | 16.5 | 7.5 | 4.19 | 4.26 | 631709721112.5 | 828307563655.0 |


In [17]:
# Calculate skewness and kurtosis
distribution_stats = pd.DataFrame(
    {
        "Skewness": ger_trade.skew(numeric_only=True),
        "Kurtosis": ger_trade.kurtosis(numeric_only=True),
    }
)

print("Distribution Statistics:")
distribution_stats

Distribution Statistics:


| | Skewness | Kurtosis |
|---|---|---|
| Year | 0.0 | -1.2 |
| Country Growth (%) | -0.54 | 0.05 |
| AHS Simple Average (%) | 4.06 | 19.4 |
| MFN Simple Average (%) | 2.9 | 8.4 |
| Export (US\$) | -0.17 | -1.3 |
| Import (US\$) | -0.2 | -1.49 |


In [18]:
# Perform Shapiro-Wilk test for normality and store results in a DataFrame
# ("Yes" means normality is not rejected at the 5% level; it does not prove normality)
results = []
for column in ger_trade.select_dtypes(include=[np.number]).columns:
    statistic, p_value = stats.shapiro(ger_trade[column])
    results.append(
        {
            "Column": column,
            "Statistic": round(statistic, 4),
            "P-value": round(p_value, 4),
            "Normal Distribution": "Yes" if p_value > 0.05 else "No",
        }
    )

# Convert to DataFrame
shapiro_df = pd.DataFrame(results)
shapiro_df

| | Column | Statistic | P-value | Normal Distribution |
|---|---|---|---|---|
| 0 | Year | 0.96 | 0.2 | Yes |
| 1 | Country Growth (%) | 0.97 | 0.52 | Yes |
| 2 | AHS Simple Average (%) | 0.54 | 0.0 | No |
| 3 | MFN Simple Average (%) | 0.57 | 0.0 | No |
| 4 | Export (US\$) | 0.92 | 0.02 | No |
| 5 | Import (US\$) | 0.9 | 0.0 | No |
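
Because the Shapiro-Wilk test flags both tariff columns as non-normal and their skewness is high, a rank-based coefficient can serve as a cross-check on Pearson's r, which is sensitive to a few extreme years. The sketch below is an optional addition rather than part of the original analysis. `compare_correlations` is a hypothetical helper, and the ten rows are only those visible in the previews above (1988-1992 and 2017-2021), so its numbers are not full-sample results.

```python
import pandas as pd
from scipy import stats


def compare_correlations(df, x, y):
    """Return Pearson and Spearman coefficients with their p-values."""
    pearson_r, pearson_p = stats.pearsonr(df[x], df[y])
    spearman_r, spearman_p = stats.spearmanr(df[x], df[y])
    return pd.DataFrame(
        {
            "Coefficient": [pearson_r, spearman_r],
            "P-value": [pearson_p, spearman_p],
        },
        index=["Pearson", "Spearman"],
    )


# Illustrative subset: only the ten rows shown in the head/tail previews
sample = pd.DataFrame(
    {
        "AHS Simple Average (%)": [
            17.99, 23.74, 52.93, 14.67, 14.58, 5.66, 6.1, 7.06, 6.12, 5.76,
        ],
        "Country Growth (%)": [
            6.01, 3.78, 12.71, 6.56, 2.46, 2.73, 7.45, -2.06, -2.73, 10.2,
        ],
    }
)

comparison = compare_correlations(
    sample, "AHS Simple Average (%)", "Country Growth (%)"
)
print(comparison)
```

On the full data the same helper could be called as `compare_correlations(ger_trade, "AHS Simple Average (%)", "Country Growth (%)")`.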


In [19]:
# Calculate correlation matrix
correlation_matrix = ger_trade.corr()

# Create correlation heatmap
fig = px.imshow(
    correlation_matrix,
    title="Correlation Matrix Heatmap",
    color_continuous_scale="RdBu",
    aspect="auto",
)
# fig.update_layout(width=1000, height=800)
fig.show()

In [20]:
# Print detailed correlation values
print("Correlation Matrix:")
correlation_matrix

Correlation Matrix:


| | Year | Country Growth (%) | AHS Simple Average (%) | MFN Simple Average (%) | Export (US\$) | Import (US\$) |
|---|---|---|---|---|---|---|
| Year | 1.0 | -0.16 | -0.62 | -0.68 | 0.97 | 0.96 |
| Country Growth (%) | -0.16 | 1.0 | 0.33 | 0.23 | -0.07 | -0.07 |
| AHS Simple Average (%) | -0.62 | 0.33 | 1.0 | 0.86 | -0.64 | -0.63 |
| MFN Simple Average (%) | -0.68 | 0.23 | 0.86 | 1.0 | -0.69 | -0.69 |
| Export (US\$) | 0.97 | -0.07 | -0.64 | -0.69 | 1.0 | 1.0 |
| Import (US\$) | 0.96 | -0.07 | -0.63 | -0.69 | 1.0 | 1.0 |


In [21]:
# Calculate covariance matrix
covariance_matrix = ger_trade.cov()

# Create covariance heatmap
fig = px.imshow(
    covariance_matrix,
    title="Covariance Matrix Heatmap",
    color_continuous_scale="RdBu",
    aspect="auto",
)
# fig.update_layout(width=800, height=800)
fig.show()

In [22]:
# Print detailed covariance values
print("Covariance Matrix:")
covariance_matrix

Covariance Matrix:


| | Year | Country Growth (%) | AHS Simple Average (%) | MFN Simple Average (%) | Export (US\$) | Import (US\$) |
|---|---|---|---|---|---|---|
| Year | 99.17 | -9.2 | -52.4 | -59.47 | 3756590509254.55 | 4687073779588.03 |
| Country Growth (%) | -9.2 | 33.08 | 15.98 | 11.8 | -146642982413.14 | -183567107406.18 |
| AHS Simple Average (%) | -52.4 | 15.98 | 70.93 | 63.89 | -2109883026996.51 | -2611170496994.75 |
| MFN Simple Average (%) | -59.47 | 11.8 | 63.89 | 78.19 | -2392951683757.15 | -2961358867357.72 |
| Export (US\$) | 3756590509254.55 | -146642982413.14 | -2109883026996.51 | -2392951683757.15 | 1.522852520429829e+23 | 1.9015655413812234e+23 |
| Import (US\$) | 4687073779588.03 | -183567107406.18 | -2611170496994.75 | -2961358867357.72 | 1.9015655413812234e+23 | 2.389952455260995e+23 |


**Key Insights from Descriptive Statistics**

* **Tariff Rates:**

  The average AHS tariff rate was **10.95%** and the Most Favored Nation (MFN) rate averaged **13.33%**. Their standard deviations are similar (8.42 and 8.84 percentage points), and both distributions are strongly right-skewed (skewness of 4.06 and 2.90), so a few high values pull the means well above the medians (8.68% and 9.70%).


* **Economic Growth:**

  Germany’s economy grew at an average rate of **2.93%**, but with significant fluctuations—from a high of **12.71%** to a low of **-11.75%**—highlighting periods of both strong expansion and severe contraction.


* **Trade Volumes:**

  Exports averaged **\$685.18 billion** and imports **\$837.88 billion** per year, with both peaking above **\$1 trillion** (about \$1.35 trillion and \$1.54 trillion in 2021), indicating substantial long-term growth in trade activity between 1988 and 2021.
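
Standard deviations can be hard to compare when the means differ. One common way to put dispersion on a relative scale is the coefficient of variation (standard deviation divided by mean). The sketch below applies it to the means and standard deviations reported in the tables above; the helper name `coefficient_of_variation` is illustrative.

```python
def coefficient_of_variation(std_dev, mean):
    """Relative dispersion: standard deviation as a fraction of the mean."""
    return std_dev / mean


# (mean, standard deviation) taken from the descriptive statistics above
reported = {
    "AHS Simple Average (%)": (10.95, 8.42),
    "MFN Simple Average (%)": (13.33, 8.84),
}

cv = {
    name: coefficient_of_variation(sd, mean)
    for name, (mean, sd) in reported.items()
}

for name, value in cv.items():
    print(f"{name}: CV = {value:.2f}")
```

On this measure the AHS rate is relatively more variable than the MFN rate (about 0.77 against 0.66), even though its standard deviation is slightly smaller.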


### **2.2 Advanced Data Analysis** <a id='22-advanced-data-analysis'></a>

Here, we will perform univariate, bivariate, and multivariate analysis on the data using various visualizations.

#### **2.2.1 Univariate Analysis** <a id='221-univariate-analysis'></a>

Let's examine each variable individually through various visualizations to understand their distributions and patterns over time.

In [23]:
# Create time series plots for each variable
def create_time_series(data, y_col, title):
    fig = px.line(data, x="Year", y=y_col, title=title)
    fig.update_layout(showlegend=False)
    return fig


# Economic Growth Time Series
fig_growth = create_time_series(
    ger_trade,
    "Country Growth (%)",
    "Germany Economic Growth Rate Over Time (1988-2021)",
)
fig_growth.show()

# Distribution of Economic Growth
fig_growth_dist = px.histogram(
    ger_trade,
    x="Country Growth (%)",
    title="Distribution of Economic Growth Rates",
    marginal="box",
)
fig_growth_dist.show()

In [24]:
# Tariff Rates Analysis
# Time series for tariff rates
fig_tariffs = px.line(
    ger_trade,
    x="Year",
    y=["AHS Simple Average (%)", "MFN Simple Average (%)"],
    title="Tariff Rates Over Time (1988-2021)",
)
fig_tariffs.show()

# Distribution of tariff rates
fig_tariffs_dist = px.histogram(
    ger_trade,
    x=["AHS Simple Average (%)", "MFN Simple Average (%)"],
    title="Distribution of Tariff Rates",
    marginal="box",
)
fig_tariffs_dist.show()

In [25]:
# Trade Volumes Analysis
# Log10 columns are added for the distribution plot because of the large values
ger_trade["Log Export"] = np.log10(ger_trade["Export (US$)"])
ger_trade["Log Import"] = np.log10(ger_trade["Import (US$)"])

# Time series for trade volumes (raw values in US$)
fig_trade = px.line(
    ger_trade,
    x="Year",
    y=["Export (US$)", "Import (US$)"],
    title="Trade Volumes Over Time (US$)",
)
fig_trade.show()

# Distribution of trade volumes
fig_trade_dist = px.histogram(
    ger_trade,
    x=["Log Export", "Log Import"],
    title="Distribution of Trade Volumes (Log Scale)",
    marginal="box",
)
fig_trade_dist.show()

#### **2.2.2 Bivariate Analysis** <a id='222-bivariate-analysis'></a>

Let's examine relationships between pairs of variables to understand their interactions and correlations.

In [26]:
# Scatter plot: Growth vs Tariffs
fig_growth_tariffs = px.scatter(
    ger_trade,
    x="AHS Simple Average (%)",
    y="Country Growth (%)",
    title="Economic Growth vs AHS Tariff Rates",
    trendline="ols",
)
fig_growth_tariffs.show()

# Calculate correlation
correlation = ger_trade["Country Growth (%)"].corr(ger_trade["AHS Simple Average (%)"])
print(f"Correlation between Growth and AHS Tariff: {correlation:.3f}")

Correlation between Growth and AHS Tariff: 0.330


In [27]:
# Scatter plot: Trade Volumes
fig_trade_volumes = px.scatter(
    ger_trade,
    x="Export (US$)",
    y="Import (US$)",
    title="Exports vs Imports",
    trendline="ols",
)
fig_trade_volumes.show()

# Calculate correlation
correlation = ger_trade["Export (US$)"].corr(ger_trade["Import (US$)"])
print(f"Correlation between Exports and Imports: {correlation:.3f}")

Correlation between Exports and Imports: 0.997


#### **2.2.3 Multivariate Analysis** <a id='223-multivariate-analysis'></a>

Now let's examine relationships between multiple variables simultaneously.

In [28]:
# Create parallel coordinates plot
fig_parallel = px.parallel_coordinates(
    ger_trade,
    dimensions=[
        "Year",
        "Country Growth (%)",
        "AHS Simple Average (%)",
        "MFN Simple Average (%)",
        "Export (US$)",
        "Import (US$)",
    ],
    title="Parallel Coordinates Plot of All Variables",
)
fig_parallel.show()

# Create 3D scatter plot
fig_3d = px.scatter_3d(
    ger_trade,
    x="Country Growth (%)",
    y="AHS Simple Average (%)",
    z="Log Export",
    title="3D Relationship: Growth, Tariffs, and Exports",
    color="Year",
    labels={
        "Country Growth (%)": "Growth",
        "AHS Simple Average (%)": "Tariffs",
        "Log Export": "Exports",
    },
)
fig_3d.show()

**Key Insights from Advanced Data Analysis**

* **Tariffs and Economic Growth:**
  Correlation analysis shows a **weak positive relationship** between tariffs and economic growth:

  * AHS vs. Growth: **0.330**
  * MFN vs. Growth: **0.232**

  Higher tariffs were therefore only **weakly associated with stronger growth**, and correlations of this size account for little of the variation in growth (r² of roughly 0.11 and 0.05). The visualizations show that high tariffs correspond to earlier years with more volatile growth, while lower tariffs correspond to more recent periods with higher export volumes.

* **Trade Volume Relationships:**

  * **Exports and imports** are **almost perfectly correlated** (**0.997**), suggesting they move in near lockstep.
  * **Economic growth** and **trade volumes** have a **near-zero negative correlation** (**-0.065**), indicating **no meaningful relationship** between them.

* **Tariffs and Trade Volumes:**
  Tariffs are **moderately to strongly negatively correlated** with both exports and imports:

  * AHS vs. Exports: **-0.642**
  * MFN vs. Exports: **-0.693**
  * AHS vs. Imports: **-0.634**
  * MFN vs. Imports: **-0.685**
  
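  This suggests **higher tariffs are consistently associated with lower trade volumes**. Because tariffs fall and trade rises steadily with `Year` (correlations of -0.62 to -0.68 and 0.96 to 0.97 respectively), part of this association may reflect a shared time trend rather than a direct effect. The parallel coordinates plot shows the same inverse pattern: as tariffs trend downwards, both exports and imports tend to increase.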
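
Since tariffs and trade volumes both trend strongly with `Year`, one way to probe how much of their correlation is a shared trend is to correlate the residuals left after removing a linear time trend from each series. The sketch below assumes that a linear trend is an adequate description, which may not hold for these data. The function name `detrended_correlation` and the synthetic series are illustrative only, and no result for the German data is implied.

```python
import numpy as np


def detrended_correlation(years, x, y):
    """Pearson correlation of x and y after removing a linear time trend from each."""
    t = np.asarray(years, dtype=float)
    t = t - t.mean()  # center the time axis for a well-conditioned fit
    residuals = []
    for series in (x, y):
        values = np.asarray(series, dtype=float)
        slope, intercept = np.polyfit(t, values, deg=1)
        residuals.append(values - (slope * t + intercept))
    return float(np.corrcoef(residuals[0], residuals[1])[0, 1])


# Synthetic example: opposite time trends plus independent noise
rng = np.random.default_rng(0)
years = np.arange(1988, 2022)
falling = 20 - 0.4 * (years - 1988) + rng.normal(0, 1, years.size)
rising = 50 + 30 * (years - 1988) + rng.normal(0, 40, years.size)

raw_r = float(np.corrcoef(falling, rising)[0, 1])
detrended_r = detrended_correlation(years, falling, rising)
print(f"Raw correlation: {raw_r:.2f}")
print(f"Detrended correlation: {detrended_r:.2f}")
```

In the synthetic case the raw correlation is strongly negative purely because of the opposing trends. Applied to `ger_trade`, a detrended value close to the raw one would suggest the association is more than a shared trend.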

## **Phase 3: Research Questions Analysis & Conclusion** <a id='phase-3-research-questions-analysis--conclusion'></a>

In this final phase, we'll synthesize our findings, answer our research questions, and provide a thoughtful conclusion for our project.


### **3.1 Research Questions Analysis** <a id='31-research-questions-analysis'></a>

In this section, we'll address each research question using our comprehensive analysis results.

#### **3.1.1 Temporal Analysis (1988-2021)** <a id='311-temporal-analysis-1988-2021'></a>

Let's examine how Germany's economic growth and tariff rates have evolved over the study period.

In [29]:
# Create a combined time series plot for growth and tariffs
fig = px.line(
    ger_trade,
    x="Year",
    y=["Country Growth (%)", "AHS Simple Average (%)", "MFN Simple Average (%)"],
    title="Germany's Economic Growth and Tariff Rates (1988-2021)",
)
fig.update_layout(yaxis_title="Percentage (%)")
fig.show()

In [30]:
# Calculate period averages
periods = {
    "Early Period (1988-1995)": (1988, 1995),
    "Middle Period (1996-2008)": (1996, 2008),
    "Recent Period (2009-2021)": (2009, 2021),
}

print("Period Averages:")

# Create a list to store summary stats for each period
summary_data = []

# Loop through periods and calculate metrics
for period, (start, end) in periods.items():
    mask = (ger_trade["Year"] >= start) & (ger_trade["Year"] <= end)
    period_data = ger_trade[mask]

    summary_data.append(
        {
            "Period": period,
            "Average Growth Rate (%)": period_data["Country Growth (%)"].mean(),
            "Average AHS Tariff (%)": period_data["AHS Simple Average (%)"].mean(),
            "Average MFN Tariff (%)": period_data["MFN Simple Average (%)"].mean(),
        }
    )

# Convert list of dicts to a DataFrame
summary_df = pd.DataFrame(summary_data)
summary_df

Period Averages:


| | Period | Average Growth Rate (%) | Average AHS Tariff (%) | Average MFN Tariff (%) |
|---|---|---|---|---|
| 0 | Early Period (1988-1995) | 4.87 | 20.64 | 24.7 |
| 1 | Middle Period (1996-2008) | 3.78 | 9.43 | 10.95 |
| 2 | Recent Period (2009-2021) | 0.88 | 6.51 | 8.7 |


#### **3.1.2 Correlation Analysis** <a id='312-correlation-analysis'></a>

Let's examine the relationship between economic growth and tariff rates through visual and statistical analysis.

In [31]:
# Create scatter plots with trend lines for both tariff types
fig_ahs = px.scatter(
    ger_trade,
    x="AHS Simple Average (%)",
    y="Country Growth (%)",
    title="Growth vs AHS Tariff Rates",
    trendline="ols",
)
fig_ahs.show()

fig_mfn = px.scatter(
    ger_trade,
    x="MFN Simple Average (%)",
    y="Country Growth (%)",
    title="Growth vs MFN Tariff Rates",
    trendline="ols",
)
fig_mfn.show()

# Calculate and display correlations
correlations = {
    "AHS Tariff vs Growth": ger_trade["Country Growth (%)"].corr(
        ger_trade["AHS Simple Average (%)"]
    ),
    "MFN Tariff vs Growth": ger_trade["Country Growth (%)"].corr(
        ger_trade["MFN Simple Average (%)"]
    ),
}

print("\nCorrelation Analysis:")
for relationship, corr in correlations.items():
    print(f"{relationship}: {corr:.3f}")


Correlation Analysis:
AHS Tariff vs Growth: 0.330
MFN Tariff vs Growth: 0.232


### **3.1.3 Trade Volume and Policy Analysis** <a id='313-trade-volume-and-policy-analysis'></a>

Let's analyze how trade volumes relate to tariff rates and economic growth.

In [32]:
# Create trade volume trends
fig_trade = px.line(
    ger_trade,
    x="Year",
    y=["Export (US$)", "Import (US$)"],
    title="Trade Volumes Over Time (US$)",
)
fig_trade.show()

# Calculate correlations between trade volumes and other variables
print("\nTrade Volume Correlations:")
variables = ["Country Growth (%)", "AHS Simple Average (%)", "MFN Simple Average (%)"]
for var in variables:
    corr_export = ger_trade["Export (US$)"].corr(ger_trade[var])
    corr_import = ger_trade["Import (US$)"].corr(ger_trade[var])
    print(f"\n{var}")
    print(f"Correlation with Exports: {corr_export:.3f}")
    print(f"Correlation with Imports: {corr_import:.3f}")

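# Calculate the trade balance (exports minus imports) for each year
ger_trade["Trade Balance"] = ger_trade["Export (US$)"] - ger_trade["Import (US$)"]
print("\nTrade Balance Summary:")
print(f"Average Trade Balance: {ger_trade['Trade Balance'].mean():,.0f} US$")
# Relative change from the first row to the last row (assumes rows are sorted by Year).
# Note: this ratio is hard to interpret if the starting balance is negative.
print(
    f"Trade Balance Growth: {(ger_trade['Trade Balance'].iloc[-1] / ger_trade['Trade Balance'].iloc[0] - 1) * 100:.1f}%"
)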


Trade Volume Correlations:

Country Growth (%)
Correlation with Exports: -0.065
Correlation with Imports: -0.065

AHS Simple Average (%)
Correlation with Exports: -0.642
Correlation with Imports: -0.634

MFN Simple Average (%)
Correlation with Exports: -0.693
Correlation with Imports: -0.685

Trade Balance Summary:
Average Trade Balance: -152,694,639,823 US$
Trade Balance Growth: 2464.1%
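
The growth figure above comes from dividing the last balance by the first, which can be hard to read when the starting balance is negative: a deficit that widens also yields a large positive percentage. The sketch below illustrates the issue with made-up balances (not values from `ger_trade`), chosen only to reproduce the same ratio, and a hypothetical helper `signed_pct_change` that divides by the absolute base instead.

```python
def naive_pct_change(first: float, last: float) -> float:
    """Ratio-based change, as computed in the notebook cell above."""
    return (last / first - 1) * 100


def signed_pct_change(first: float, last: float) -> float:
    """Change relative to the absolute base, so the sign follows the direction.

    Hypothetical helper for illustration; raises if the base is zero.
    """
    if first == 0:
        raise ValueError("percentage change is undefined for a zero base")
    return (last - first) / abs(first) * 100


# Illustrative balances only (billions of US$), not taken from ger_trade:
# a deficit that widens from -10 to -256.41
first_balance, last_balance = -10.0, -256.41

naive = naive_pct_change(first_balance, last_balance)
signed = signed_pct_change(first_balance, last_balance)

print(f"Ratio-based change: {naive:.1f}%")
print(f"Signed change:      {signed:.1f}%")
```

With these illustrative inputs, the ratio-based figure is positive even though the deficit widened, while the signed version is negative. Checking the signs of the first and last balances in `ger_trade` would show which reading applies to the 2464.1% reported above.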


**Research Question Answers**

* **Temporal Analysis (Historical Periods):**

  Period averages reveal a **clear liberalization trend** over time, with tariffs falling sharply across the three periods:

  * **1988–1995:** AHS = **20.64%**, MFN = **24.70%**, Avg. Growth = **4.87%**
  * **1996–2008:** AHS = **9.43%**, MFN = **10.95%**, Avg. Growth = **3.78%**
  * **2009–2021:** AHS = **6.51%**, MFN = **8.70%**, Avg. Growth = **0.88%**
  
  As tariffs declined, **economic growth also slowed**, so lower tariffs did not coincide with faster growth; factors beyond trade policy likely influenced growth dynamics.

* **Tariffs and Economic Growth:**

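  A **weak positive correlation** between tariffs and growth (AHS = **0.330**, MFN = **0.232**) means that years with higher tariffs tended, if anything, to show slightly higher growth. **Reducing tariffs alone did not guarantee stronger economic growth**, highlighting the role of broader macroeconomic conditions.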

* **Tariffs and Trade Volumes:**

  **Negative correlations** between tariffs and both exports and imports (ranging from **-0.634** to **-0.693**) show that **lower tariffs were consistently associated with higher trade volumes**.
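
One way to gauge how much weight these correlations can bear is an approximate 95% confidence interval via the Fisher z-transformation. The sketch below uses only the coefficients printed earlier and assumes `n = 34` annual observations (one per year from 1988 to 2021, with no gaps), which the notebook output does not confirm; `fisher_ci` is a hypothetical helper name. It also treats the yearly observations as independent, which time-series data may violate.

```python
import math
from statistics import NormalDist


def fisher_ci(r: float, n: int, level: float = 0.95) -> tuple[float, float]:
    """Approximate confidence interval for a Pearson correlation (Fisher z)."""
    z = math.atanh(r)  # map r onto an approximately normal scale
    se = 1 / math.sqrt(n - 3)  # standard error on the z scale
    crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    # Map the interval bounds back to the correlation scale
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


N_YEARS = 34  # assumption: one observation per year, 1988-2021

# Coefficients copied from the outputs above
reported = {
    "AHS Tariff vs Growth": 0.330,
    "MFN Tariff vs Growth": 0.232,
    "MFN Tariff vs Exports": -0.693,
}

intervals = {name: fisher_ci(r, N_YEARS) for name, r in reported.items()}
for name, (low, high) in intervals.items():
    print(f"{name}: r = {reported[name]:+.3f}, 95% CI = [{low:+.3f}, {high:+.3f}]")
```

Under that assumption, the intervals for both tariff-growth correlations include zero (narrowly so for AHS), whereas the interval for MFN tariffs versus exports stays well below zero. This is consistent with reading the growth relationship as weak and the trade-volume relationship as more robust.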

### **3.2 Insights & Conclusion** <a id='32-insights--conclusion'></a>

#### **Summary of Key Findings**

* **Trade Liberalization Trend:**
  Between 1988 and 2021, Germany **significantly reduced trade barriers**: the period-average AHS tariff fell from **20.64%** (1988–1995) to **6.51%** (2009–2021), and the period-average MFN tariff from **24.70%** to **8.70%**.

* **Tariffs and Trade Volumes:**
  As tariffs decreased, both **exports and imports expanded**, reflected in the moderately strong negative correlations between tariffs and trade volumes (between **-0.634** and **-0.693**; e.g., MFN vs. Exports = **-0.693**).

* **Tariffs and Economic Growth:**
  The **weak positive correlations** (AHS = **0.330**, MFN = **0.232**) indicate that **lower tariffs alone were insufficient to explain economic growth trends**.


#### **Implications of the Results**

The findings suggest that **trade liberalization coincided with clear trade expansion**, whereas its **direct link to GDP growth** was weak. Economic growth likely relied on a combination of factors, including **domestic economic policies, technological advancements, global demand conditions, and investment dynamics**.


#### **Future Research Directions**

1. **Multivariate Regression Analysis:**
   To account for additional macroeconomic variables influencing growth and trade.
2. **Time-Series Causality Tests:**
   To explore the directionality of relationships (e.g., using Granger causality).
3. **Cross-Country Comparative Studies:**
   To distinguish **country-specific effects** from **broader global patterns**.