In [2]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [6]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.linear_model import LinearRegression
from sklearn.feature_selection import SelectKBest, f_regression

In [7]:
merged_df = pd.read_csv('/content/drive/MyDrive/Dataset/merged_df.csv')

## Linear Regression Analysis

To further quantify the relationship between Political Stability and Foreign Direct Investment, we will use **Linear Regression**.

**Why Linear Regression?**

*   **Predicting a Continuous Outcome:** Linear regression is used when you want to predict a continuous outcome variable (in this case, FDI) based on one or more predictor variables.
*   **Examining Linear Relationships:** It models the linear relationship between the independent variable(s) and the dependent variable. Our scatter plot and correlation analysis suggested a potential linear component to the relationship between Political Stability and FDI.
*   **Quantifying the Relationship:** Linear regression provides coefficients that quantify the strength and direction of the relationship between the predictor and the outcome, allowing us to understand how much FDI is expected to change for a one-unit increase in political stability.
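*   **Statistical Inference:** Linear regression supports statistical inference through p-values and confidence intervals for the coefficients, which help determine whether the relationship is statistically significant. Note that scikit-learn's `LinearRegression`, used in this notebook, does not report these quantities itself.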
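
To make the inference point concrete, here is a minimal sketch of how a slope, its p-value, and a 95% confidence interval could be obtained for a single predictor. It uses `scipy.stats.linregress`, which is not part of the original notebook, and synthetic data in place of `merged_df`; the variable names `stability` and `fdi` are illustrative assumptions.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in data (NOT the notebook's dataset): a weak linear signal plus noise.
rng = np.random.default_rng(0)
stability = rng.normal(0.0, 1.0, size=200)
fdi = 5.0 + 2.0 * stability + rng.normal(0.0, 10.0, size=200)

# Ordinary least squares fit for a single predictor, including inference statistics.
result = stats.linregress(stability, fdi)

# 95% confidence interval for the slope from the t-distribution (n - 2 degrees of freedom).
t_crit = stats.t.ppf(0.975, df=len(stability) - 2)
ci_low = result.slope - t_crit * result.stderr
ci_high = result.slope + t_crit * result.stderr

print(f"slope={result.slope:.3f}, p-value={result.pvalue:.4f}")
print(f"95% CI for slope: [{ci_low:.3f}, {ci_high:.3f}]")
```

On the real data, the same call would take the `Political_Stability` and `FDI` columns as its two arguments, after dropping any missing values.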

In [8]:
# Define the feature matrix (X) and the target vector (y)
X = merged_df[['Political_Stability']]   # Independent variable(s)
y = merged_df['FDI']                     # Dependent variable

In [9]:
# Train-Test Split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [10]:
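# Feature Selection
# Note: with a single feature and k='all', SelectKBest keeps every column,
# so this step passes the data through unchanged.
selector = SelectKBest(score_func=f_regression, k='all')
X_train_selected = selector.fit_transform(X_train, y_train)
X_test_selected = selector.transform(X_test)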
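
The effect of `k='all'` in the feature selection step can be checked directly. The sketch below uses synthetic single-feature data (an assumption, not the notebook's dataset) and names ending in `_demo` that are purely illustrative. It shows that the selector returns the input unchanged while still exposing the univariate F-statistic and its p-value after fitting.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression

# Synthetic single-feature data (an assumption for illustration only).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 1))
y_demo = 3.0 * X_demo[:, 0] + rng.normal(size=100)

selector_demo = SelectKBest(score_func=f_regression, k="all")
X_demo_selected = selector_demo.fit_transform(X_demo, y_demo)

# With k='all', every column is kept, so the output equals the input.
print("Unchanged by selection:", np.array_equal(X_demo_selected, X_demo))

# The univariate F-statistic and its p-value are available after fitting.
print("F-statistic:", selector_demo.scores_)
print("p-value:", selector_demo.pvalues_)
```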

In [11]:
# Linear Regression Model
lr = LinearRegression()
lr.fit(X_train_selected, y_train)

# Predictions
y_pred = lr.predict(X_test_selected)

# Evaluation
r2 = r2_score(y_test, y_pred)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))

print("Linear Regression Results:")
print(f"R² Score: {r2:.3f}")
print(f"RMSE: {rmse:.3f}")
print(f"Intercept: {lr.intercept_:.3f}")
print(f"Coefficient: {lr.coef_[0]:.3f}")

Linear Regression Results:
R² Score: 0.008
RMSE: 38.978
Intercept: 8.860
Coefficient: 7.483


In [13]:
# Grid Search: compare fitting with and without an intercept using 5-fold cross-validated R²
param_grid = {'fit_intercept': [True, False]}
grid = GridSearchCV(LinearRegression(), param_grid, cv=5, scoring='r2')
grid.fit(X_train_selected, y_train)

print("\nBest Parameters from GridSearchCV:", grid.best_params_)
print("Best Cross-Validated R²:", grid.best_score_)


Best Parameters from GridSearchCV: {'fit_intercept': True}
Best Cross-Validated R²: 0.0103468881780449


## Model Performance Metrics

The values below are copied from the linear regression and GridSearchCV outputs above:

**Linear Regression Results:**

*   R² Score: 0.008
*   RMSE: 38.978
*   Intercept: 8.860
*   Coefficient: 7.483

**GridSearchCV Results:**

*   Best Cross-Validated R²: 0.0103468881780449

The linear regression results indicate a very weak linear relationship between Political Stability and FDI. The test-set R² of 0.008 means that only about 0.8% of the variance in FDI is explained by Political Stability in this model. The RMSE of 38.978 is in the same units as FDI, so a typical prediction error is roughly 39 percentage points of GDP; because RMSE squares the errors, it weights large misses more heavily than a simple average of absolute errors would. The coefficient of 7.483 implies that a one-unit increase in Political Stability is associated with a predicted increase in FDI of about 7.48 percentage points, and the intercept of 8.860 is the predicted FDI when Political Stability is zero.
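
For reference, the two metrics discussed above follow from $R^2 = 1 - SS_{res}/SS_{tot}$ and $RMSE = \sqrt{SS_{res}/n}$. The short sketch below computes them by hand with NumPy on a few made-up values (not taken from the notebook's data) and also reports the mean absolute error, to show that RMSE is never smaller than the plain average error.

```python
import numpy as np

# Small made-up example values (not taken from the notebook's data).
y_true = np.array([3.0, -1.0, 2.0, 7.0])
y_hat = np.array([2.5, 0.0, 2.0, 8.0])

ss_res = np.sum((y_true - y_hat) ** 2)          # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares around the mean

r2_manual = 1.0 - ss_res / ss_tot               # share of variance explained
rmse_manual = np.sqrt(ss_res / len(y_true))     # root mean squared error
mae_manual = np.mean(np.abs(y_true - y_hat))    # mean absolute error, for comparison

print(f"R2={r2_manual:.3f}, RMSE={rmse_manual:.3f}, MAE={mae_manual:.3f}")
```

These hand-computed values should agree with `r2_score` and `np.sqrt(mean_squared_error(...))` as used in the evaluation cell.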

The best cross-validated R² from GridSearchCV, 0.0103, is consistent with the test-set R² and confirms that the model generalizes poorly to new data. In essence, although earlier tests indicated a statistically significant association, this simple linear model is not practically useful for predicting FDI from political stability alone.
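
One way to put a near-zero cross-validated R² in context is to compare it with a baseline that always predicts the training mean. The sketch below does this on synthetic data with a deliberately weak signal; the data, the `_weak` variable names, and the use of `DummyRegressor` are assumptions for illustration and are not part of the original notebook.

```python
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic data with a deliberately weak signal (NOT the notebook's dataset).
rng = np.random.default_rng(42)
X_weak = rng.normal(size=(300, 1))
y_weak = 0.5 * X_weak[:, 0] + rng.normal(scale=5.0, size=300)

# 5-fold cross-validated R² for the linear model and for a mean-only baseline.
lr_scores = cross_val_score(LinearRegression(), X_weak, y_weak, cv=5, scoring="r2")
baseline_scores = cross_val_score(
    DummyRegressor(strategy="mean"), X_weak, y_weak, cv=5, scoring="r2"
)

print(f"Linear model mean CV R2: {lr_scores.mean():.4f}")
print(f"Mean-only baseline CV R2: {baseline_scores.mean():.4f}")
```

A mean-only baseline cannot score above zero on held-out folds, so a model whose cross-validated R² sits close to zero is doing little better than predicting the average.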

## Business Interpretation of Regression Results

The linear regression analysis reveals a weak positive relationship between Political Stability and Foreign Direct Investment (FDI); earlier tests found this association to be statistically significant.

*   **For Businesses:** While the analysis shows that higher political stability is associated with a tendency for increased FDI (indicated by the positive coefficient and statistical significance), the very low R-squared value means that political stability alone explains only a tiny fraction of where foreign investment goes.
*   **Practical Implication:** This suggests that while political stability is a favorable factor for businesses considering foreign investment, it is far from the only or most important factor. Investors likely consider a wide range of other economic, market, and regulatory factors much more heavily when making investment decisions.

In summary, political stability is a positive signal, but businesses should conduct a comprehensive assessment of various factors when evaluating potential international investment opportunities, as stability by itself is not a strong predictor of high FDI.