# Investment Indicator

This project aims to determine the potential economic longevity of various nations by analyzing a range of metrics: financial openness index, income profits and capital gains taxes as a percentage of revenue, agricultural land by square kilometer, property rights index, economic freedom overall index, population growth as a percentage, political globalization, people practicing Judaism as a percentage of the population, Christians as a percent of the total population, human flight and brain drain, public services, cost of living, and government debt as a percentage of GDP.

The financial openness index measures a country's openness to international trade and investment, where high financial openness can boost foreign investment, economic growth, and innovation but also increase vulnerability to global financial crises. The metric of income, profits, and capital gains taxes as a percentage of revenue reflects the tax burden on businesses and individuals. While high taxes might discourage investment and economic activity, low taxes could stimulate growth but risk inadequate public services and infrastructure. Additionally, low taxes might indicate a country's profitability from other revenue sources, such as the UAE's oil wealth, suggesting greater economic sustainability.

Agricultural land by square kilometer indicates the extent of land used for agriculture, crucial for food security and export potential. While more agricultural land can support a larger agricultural sector, excessive reliance on agriculture may signal underdevelopment in other economic sectors. The property rights index, reflecting the protection of private property, is essential for encouraging investment and economic stability, as secure ownership and transactions are foundational to economic growth.

The economic freedom overall index measures the ease of doing business in a country, where higher economic freedom typically correlates with greater economic growth and prosperity due to fewer business restrictions. Population growth as a percentage affects labor market dynamics and economic demand; high population growth can provide a youthful workforce and expand markets but might also strain resources and infrastructure if not managed effectively.

Political globalization indicates the degree of a country's political integration into the global community. High political globalization can enhance diplomatic influence and access to international aid, though it can also expose a country to external political pressure. On the other hand, it can make a country less likely to be preyed upon: countries with limited political globalization, like many tax havens, have historically been pressured into changing their policies. The percentages of the population practicing Judaism and Christianity provide insight into cultural diversity and minority rights, which influence social cohesion, cultural identity, and historical economic patterns.

Human flight and brain drain measure the emigration of skilled individuals: high levels of brain drain can deplete the talent pool and undermine economic growth, while low levels indicate a country's ability to retain its skilled workforce. Public services reflect the quality and accessibility of essential services like healthcare, education, and infrastructure, which support economic stability and quality of life and thereby attract investment and residents.

The cost of living indicates how expensive it is to live in a country. High living costs can reduce disposable income and deter immigration, while lower costs can attract workers and businesses seeking reduced operating expenses. Lastly, government debt as a percentage of GDP reflects a country's fiscal health and sustainability. High debt levels can limit economic flexibility and increase vulnerability to economic shocks, whereas low debt levels suggest more robust economic management.

By analyzing these indicators, this project seeks to provide a comprehensive understanding of the economic strengths and weaknesses of different nations, offering informed predictions about their future economic longevity. Taken together, the indicators provide a holistic view of a country's economic potential and challenges, laying the groundwork for strategic economic planning and decision-making.

#### Imports

In [2]:
import numpy as np
import pandas as pd
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
import matplotlib.pyplot as plt

### The extra read_csv arguments are needed because pandas would otherwise interpret NA (the ContinentCode for North America) as not applicable/NaN
df = pd.read_csv("../data/investmentIndicators.csv", na_values=[''], keep_default_na=False)
# print(df['ContinentCode'].unique())
# print(df['ContinentCode'].isnull().sum())
# print(df.info())
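The effect of those `read_csv` arguments can be checked on a tiny in-memory CSV. This is a minimal sketch: the sample rows and the names `csv_text`, `default_df`, and `kept_df` are made up for illustration and are not part of the project data.

```python
import io

import pandas as pd

# Hypothetical two-row sample; "NA" is the continent code for North America
csv_text = "Country,ContinentCode,Value\nCanada,NA,1.5\nFrance,EU,\n"

# Default parsing treats the string "NA" as a missing value
default_df = pd.read_csv(io.StringIO(csv_text))

# Only empty fields count as missing, so "NA" survives as a string
kept_df = pd.read_csv(io.StringIO(csv_text), na_values=[''], keep_default_na=False)

print(default_df['ContinentCode'].isna().sum())  # one missing continent code
print(kept_df['ContinentCode'].tolist())         # "NA" kept as text
```

With the second call, the empty `Value` field for France is still read as NaN, so genuinely missing numbers remain detectable.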

## Globals
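- When calculating final values, every indicator should be equally weighted. The overview lists 13 indicators, which would give 1/13 ≈ 0.0769 per category; the `indicators` list below contains 12 entries (the Judaism percentage is not included), so the computed value is 1/12 ≈ 0.0833.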

In [3]:
indicators = [
    'Financial openness index',
    'Government debt as percent of GDP',
    'Income profits and capital gains taxes: percent of revenue',
    'Agricultural land sq. km.',
    'Property rights index (0-100)',
    'Economic freedom overall index (0-100)',
    'Population growth percent',
    'Political globalization index (0-100)',
    'Christians as percent of the total population',
    'Human flight and brain drain index 0 (low) - 10 (high)',
    'Public services index 0 (high) - 10 (low)',
    'Cost of living index world average = 100'
    ] 

totalIndicators = len(indicators)
indicatorValuePerCategory = 1 / totalIndicators
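One way an equal per-category weight could be applied is as a weighted sum of already-normalized scores. This is only a sketch under that assumption: the toy scores and the names `toy_scores`, `weight`, and `weighted_score` are hypothetical, and the later cells in this notebook sum the normalized values directly instead.

```python
import pandas as pd

# Hypothetical normalized scores (0-100) for two countries and three indicators
toy_scores = pd.DataFrame(
    {'a': [50.0, 100.0], 'b': [0.0, 100.0], 'c': [100.0, 100.0]},
    index=['X', 'Y'],
)

# Equal weight per indicator, analogous to indicatorValuePerCategory
weight = 1 / toy_scores.shape[1]

# Weighted sum per country; with equal weights this is the row mean
weighted_score = (toy_scores * weight).sum(axis=1)
print(weighted_score)
```

Because every weight is identical, the weighted score is the plain sum divided by the number of indicators, so it ranks countries in the same order as an unweighted sum.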

### Region Processing
- Nested dictionary: the first key:value level would be countries, each with a sub-dictionary referencing averaged values for each category over all years combined, including country-to-region and country-to-indicator-value mappings. The indicator value is a market rating:
  - 1/4 Stone Market: unlikely to see much growth
  - 2/4 Cold Market: possible growth, but likely with significant downsides
  - 3/4 Warm Market: likely to see growth, minimal downsides
  - 4/4 Hot Market: major growth likely with current trajectory
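The four-tier rating above can be sketched with `pd.qcut`, which splits a set of scores into equal-sized quantile bins. The scores and the names `toy_megascores` and `tiers` below are invented for illustration.

```python
import pandas as pd

# Hypothetical overall scores for eight countries
toy_megascores = pd.Series(
    [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0],
    index=list('ABCDEFGH'),
)

labels = ["Stone Market", "Cold Market", "Warm Market", "Hot Market"]

# Four quantile bins: the lowest quarter is labelled "Stone Market", the highest "Hot Market"
tiers = pd.qcut(toy_megascores, 4, labels=labels)
print(tiers)
```

Note that the tiers are relative: a country is "Hot" only in comparison with the other countries in the data set, not against an absolute threshold.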

In [4]:
regions = {}
###Region key to value pipeline
###Remove all non relevant continent column data
regionKeptColumns = ['ContinentCode'] + indicators
###Preserve unremoved data as new data frame
regionFilteredDf = df[regionKeptColumns]
###Group new data frame by continent code
regionGroupedDf = regionFilteredDf.groupby('ContinentCode')
# print(regionGroupedDf.first())

for region, column in regionGroupedDf:
    # Dictionary pairing each indicator with this region's median value (named regionMode, but it stores medians)
    regionMode = {}
    for indicator in indicators:
        # Calculate median for each region per indicator
        medianValue = column[indicator].median()
        # If there are no values for the entire region in the given category, the stored value will be None
        if pd.isna(medianValue):
            medianValue = None
        regionMode[indicator] = medianValue
    regions[region] = regionMode

print(regions)

{'AF': {'Financial openness index': -1.242, 'Government debt as percent of GDP': 49.394999999999996, 'Income profits and capital gains taxes: percent of revenue': 26.759999999999998, 'Agricultural land sq. km.': 121020.0, 'Property rights index (0-100)': 35.0, 'Economic freedom overall index (0-100)': 55.0, 'Population growth percent': 2.41, 'Political globalization index (0-100)': 62.39, 'Christians as percent of the total population': 45.4, 'Human flight and brain drain index 0 (low) - 10 (high)': 7.0, 'Public services index 0 (high) - 10 (low)': 8.2, 'Cost of living index world average = 100': 58.34}, 'AS': {'Financial openness index': -0.042, 'Government debt as percent of GDP': 42.1, 'Income profits and capital gains taxes: percent of revenue': 23.24, 'Agricultural land sq. km.': 48449.0, 'Property rights index (0-100)': 48.0, 'Economic freedom overall index (0-100)': 61.0, 'Population growth percent': 1.26, 'Political globalization index (0-100)': 66.235, 'Christians as percent o
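The per-region medians can also be computed without an explicit loop. The following is a minimal sketch on toy data, not the notebook's own method; `toy`, `toy_indicators`, and `regions_sketch` are hypothetical names.

```python
import pandas as pd

# Hypothetical data: two regions, two indicators, some values missing
toy = pd.DataFrame({
    'ContinentCode': ['AF', 'AF', 'AF', 'EU', 'EU'],
    'Debt': [10.0, 30.0, 50.0, 20.0, None],
    'Growth': [1.0, None, 3.0, None, None],
})
toy_indicators = ['Debt', 'Growth']

# One median per region and indicator; missing values are skipped
region_medians = toy.groupby('ContinentCode')[toy_indicators].median()

# Convert to a nested dict, storing None where a region has no data at all
regions_sketch = {
    region: {ind: (None if pd.isna(val) else float(val)) for ind, val in row.items()}
    for region, row in region_medians.iterrows()
}
print(regions_sketch)
```

The result has the same shape as the `regions` dictionary: region code at the first level, indicator name at the second.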

### Country Processing

In [5]:
# Create the countries dictionary to store country name, code, and megascore
countries = {}
### Country key to value pipeline.
### Remove all non relevant country column data.
countryKeptColumn = ['Country', 'Code', 'ContinentCode', 'Year'] + indicators
### Preserve unremoved data as new data frame (copied so the .loc assignment below does not trigger SettingWithCopyWarning).
countryFilteredDf = df[countryKeptColumn].copy()

### Loop through each continent and its corresponding median values (stored under the name regionModes).
for continent, regionModes in regions.items():
    for indicator, modeValue in regionModes.items():
        if modeValue is not None:
            ### Replace NaN values with the region median from the regions dictionary.
            ### Use loc to find rows where ContinentCode matches the current continent and the indicator value is NaN, then replace these NaN values directly with the median.
            countryFilteredDf.loc[
                (countryFilteredDf['ContinentCode'] == continent) & (
                    countryFilteredDf[indicator].isna()), indicator
            ] = modeValue

countryGroupedDf = countryFilteredDf.groupby(['Country', 'Year'])
# print(countryGroupedDf.first())
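The same gap-filling idea can be expressed with `groupby(...).transform('median')`, which broadcasts each region's median back onto its rows. This is a sketch on toy data under the assumption that the region median is the intended fill value; `toy` and `filled` are hypothetical names.

```python
import pandas as pd

# Hypothetical data: AF has one gap, EU has no values at all
toy = pd.DataFrame({
    'ContinentCode': ['AF', 'AF', 'AF', 'EU', 'EU'],
    'Debt': [10.0, None, 50.0, None, None],
})

filled = toy.copy()

# Each row receives the median of its own region, then only the gaps are filled
region_median = filled.groupby('ContinentCode')['Debt'].transform('median')
filled['Debt'] = filled['Debt'].fillna(region_median)
print(filled)
```

Rows in a region with no data for an indicator stay NaN, which matches the loop's behavior of skipping indicators whose stored value is `None`.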

### Data Normalization

In [6]:
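# Define the bounds for normalization as (value scoring 0, value scoring 100); where lower raw values are better
# (debt, taxes, brain drain, public services, cost of living) the bounds are listed high-to-low to invert the scale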
normalizationRanges = {
    'Financial openness index': (-1.931, 2.999),
    'Government debt as percent of GDP': (238.7, 2.06), 
    'Income profits and capital gains taxes: percent of revenue': (64.47, 2.69), 
    'Agricultural land sq. km.': (3, 5206950),
    'Property rights index (0-100)': (0, 100),
    'Economic freedom overall index (0-100)': (0, 100),
    'Population growth percent': (-14.26, 3.71),
    'Political globalization index (0-100)': (0, 100),
    'Christians as percent of the total population': (0, 100),
    'Human flight and brain drain index 0 (low) - 10 (high)': (10, 0),
    'Public services index 0 (high) - 10 (low)': (10, 0),
    'Cost of living index world average = 100': (225.86, 27.37)
}

# Group by 'Country' to calculate the average for each indicator
averagedCountryValues = countryFilteredDf.groupby('Country')[indicators].mean()

# Normalize each indicator using the provided ranges
normalizedAverages = pd.DataFrame(index=averagedCountryValues.index)

for indicator in indicators:
    minValue, maxValue = normalizationRanges[indicator]
    # Min-max normalize to a 0-100 scale (reversed bounds invert the scale)
    normalizedAverages[indicator] = 100 * \
        (averagedCountryValues[indicator] - minValue) / (maxValue - minValue)

# Print normalized averages for verification
print("Normalized Averages:\n", normalizedAverages)

# Calculate megascore by summing the normalized values across all indicators
normalizedAverages['Megascore'] = normalizedAverages.sum(axis=1)

# Print megascore for verification
print("Megascores:\n", normalizedAverages['Megascore'])

# Assign each country to a Megascore quartile
megascore_percentiles = pd.qcut(normalizedAverages['Megascore'], 4, labels=[
    "Stone Market", "Cold Market", "Warm Market", "Hot Market"])

# Create the countries dictionary to store country name, code, megascore, and market declaration
countries = {}

for country in normalizedAverages.index:
    country_code = countryFilteredDf[countryFilteredDf['Country']
                                     == country]['Code'].iloc[0]
    megascore = normalizedAverages.loc[country, 'Megascore']
    market_declaration = megascore_percentiles.loc[country]
    countries[country] = {'Code': country_code,
                          'Megascore': megascore, 'Market Declaration': market_declaration}

# Print the countries dictionary to check the values
print("Countries Dictionary:\n", countries)

Normalized Averages:
              Financial openness index  Government debt as percent of GDP  \
Country                                                                    
Afghanistan                 38.316430                          94.906468   
Albania                     46.482421                          72.059528   
Algeria                     13.975659                          87.259832   
Andorra                     85.801217                          82.779750   
Angola                       3.493915                          70.316022   
...                               ...                                ...   
Venezuela                    9.579108                          68.381473   
Vietnam                     36.430020                          78.431373   
Yemen                       73.930020                          72.461334   
Zambia                      67.844828                          70.514635   
Zimbabwe                    26.707235                          73.
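To see how the reversed bounds in `normalizationRanges` invert an indicator, the min-max formula can be applied to a few hand-picked values. This is a small sketch; the helper `normalize` and the sample inputs are illustrative and not part of the notebook.

```python
def normalize(value, lower_bound, upper_bound):
    """Map value onto 0-100, where lower_bound scores 0 and upper_bound scores 100."""
    return 100 * (value - lower_bound) / (upper_bound - lower_bound)


# Government debt uses reversed bounds, so less debt means a higher score
debt_bounds = (238.7, 2.06)
worst_debt = normalize(238.7, *debt_bounds)   # highest debt in range scores 0
best_debt = normalize(2.06, *debt_bounds)     # lowest debt in range scores 100
mid_debt = normalize(120.38, *debt_bounds)    # midpoint of the range scores about 50

# Property rights uses natural bounds, so the raw value is the score
property_score = normalize(35.0, 0, 100)

print(worst_debt, best_debt, mid_debt, property_score)
```

Raw values that fall outside the listed bounds would produce scores below 0 or above 100, since the formula does not clip.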

In [7]:
# Create a DataFrame for Plotly
plotly_df = pd.DataFrame(countries).T.reset_index()
plotly_df.columns = ['Country', 'Code', 'Megascore', 'Market Declaration']

# Define the color mapping for market declarations
color_map = {
    "Stone Market": "gray",
    "Cold Market": "blue",
    "Warm Market": "orange",
    "Hot Market": "red"
}

# Create the choropleth map using Plotly
fig = px.choropleth(
    plotly_df,
    locations="Code",
    color="Market Declaration",
    hover_name="Country",
    hover_data=["Megascore"],
    color_discrete_map=color_map,
    projection="natural earth"
)

# Update layout for better visualization
fig.update_layout(
    title="Market Declarations by Country",
    geo=dict(
        showframe=True,
        showcoastlines=True,
        projection_type='orthographic'
    )
)

# Show the plot
fig.show()
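One detail worth checking when building the plotting frame: transposing a DataFrame built from a dict of mixed-type records leaves every column with `object` dtype. A possible alternative, sketched here with made-up records (`toy_countries`, `transposed`, and `tidy` are hypothetical names), is `DataFrame.from_dict` with `orient='index'`, which keeps the score column numeric.

```python
import pandas as pd

# Hypothetical records shaped like the countries dictionary
toy_countries = {
    'Sweden': {'Code': 'SWE', 'Megascore': 909.0, 'Market Declaration': 'Hot Market'},
    'Angola': {'Code': 'AGO', 'Megascore': 512.5, 'Market Declaration': 'Stone Market'},
}

# Transposing mixes strings and floats in each column, so Megascore becomes object
transposed = pd.DataFrame(toy_countries).T.reset_index()
transposed.columns = ['Country', 'Code', 'Megascore', 'Market Declaration']

# Building row-wise keeps one dtype per column, so Megascore stays float
tidy = (
    pd.DataFrame.from_dict(toy_countries, orient='index')
    .rename_axis('Country')
    .reset_index()
)

print(transposed['Megascore'].dtype, tidy['Megascore'].dtype)
```

A numeric `Megascore` column makes later sorting, rounding, or continuous color scales behave as expected.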

In [8]:
# Sort countries by their Megascore in descending order
sorted_countries = normalizedAverages['Megascore'].sort_values(
    ascending=False).index

# Display each country's normalized indicator scores (highest first) along with its Megascore and overall rank
for rank, country in enumerate(sorted_countries, start=1):
    data = normalizedAverages.loc[country, indicators]
    megascore = normalizedAverages.loc[country, 'Megascore']
    print(f'Rank {rank} for {country} (Megascore: {megascore}):')
    ranked_data = data.sort_values(ascending=False)
    for indicator, score in ranked_data.items():
        print(f'  {indicator}: {score}')
    print("\n")

Rank 1 for Sweden (Megascore: 909.0488054910326):
  Political globalization index (0-100): 93.92291666666667
  Property rights index (0-100): 91.5
  Human flight and brain drain index 0 (low) - 10 (high): 86.5
  Public services index 0 (high) - 10 (low): 86.16666666666667
  Financial openness index: 85.80121703853958
  Population growth percent: 83.99647560749396
  Government debt as percent of GDP: 83.84395424836602
  Christians as percent of the total population: 82.77500000000002
  Income profits and capital gains taxes: percent of revenue: 78.60418689975181
  Economic freedom overall index (0-100): 75.08333333333333
  Cost of living index world average = 100: 60.30950341746856
  Agricultural land sq. km.: 0.5455516127460742


Rank 2 for Switzerland (Megascore: 894.787732197271):
  Political globalization index (0-100): 93.29791666666667
  Property rights index (0-100): 89.25
  Financial openness index: 85.80121703853958
  Public services index 0 (high) - 10 (low): 85.50000000000001