<a href="https://colab.research.google.com/github/Ogombo-collins/Analysis-of-Data-Centers-in-Africa-and-Middle-East-as-at-November-2025/blob/main/Analyzing_Data_Centers_Distribution_in_Africa_and_Middle_East.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Analytical Insights: Data Center Distribution in Africa and the Middle East

## Overview
This project analyzes the distribution of data centers across Africa and the Middle East to highlight regional disparities in digital infrastructure.

By augmenting the main dataset of country-level data center counts with population figures, the analysis reveals structural imbalances, regional leaders, and underserved markets shaping the digital economy.

## Objectives
- Compare data center distribution between Africa and the Middle East
- Normalize infrastructure access using data centers per million people
- Identify countries and regions outperforming or underperforming relative to size
- Highlight structural digital infrastructure gaps within Africa

## Data Sources

The key data sources used:

1.  Main dataset: Regional data center distribution focused on Africa and the Middle East (2025). Obtained from [Visual Capitalist on All World's Data Centers](https://www.visualcapitalist.com/visualizing-all-of-the-worlds-data-centers-in-2025/)


2.  Population estimates from Worldometer, used to augment the main dataset. Link: [Population by Country](https://www.worldometers.info/world-population/population-by-country/)



In [37]:
import pandas as pd
import numpy as np

# Complete dataset
data = [
    {"Country": "Israel", "Country_Code": "IL", "Region": "Africa & Middle East", "Data_Centers": 61},
    {"Country": "UAE", "Country_Code": "AE", "Region": "Africa & Middle East", "Data_Centers": 58},
    {"Country": "South Africa", "Country_Code": "ZA", "Region": "Africa & Middle East", "Data_Centers": 56},
    {"Country": "Saudi Arabia", "Country_Code": "SA", "Region": "Africa & Middle East", "Data_Centers": 51},
    {"Country": "Nigeria", "Country_Code": "NG", "Region": "Africa & Middle East", "Data_Centers": 21},
    {"Country": "Kenya", "Country_Code": "KE", "Region": "Africa & Middle East", "Data_Centers": 20},
    {"Country": "Iran", "Country_Code": "IR", "Region": "Africa & Middle East", "Data_Centers": 20},
    {"Country": "Oman", "Country_Code": "OM", "Region": "Africa & Middle East", "Data_Centers": 15},
    {"Country": "Egypt", "Country_Code": "EG", "Region": "Africa & Middle East", "Data_Centers": 13},
    {"Country": "Morocco", "Country_Code": "MA", "Region": "Africa & Middle East", "Data_Centers": 12},
    {"Country": "Tanzania", "Country_Code": "TZ", "Region": "Africa & Middle East", "Data_Centers": 11},
    {"Country": "Qatar", "Country_Code": "QA", "Region": "Africa & Middle East", "Data_Centers": 11},
    {"Country": "Mauritius", "Country_Code": "MU", "Region": "Africa & Middle East", "Data_Centers": 10},
    {"Country": "Ghana", "Country_Code": "GH", "Region": "Africa & Middle East", "Data_Centers": 8},
    {"Country": "Bahrain", "Country_Code": "BH", "Region": "Africa & Middle East", "Data_Centers": 8},
    {"Country": "Angola", "Country_Code": "AO", "Region": "Africa & Middle East", "Data_Centers": 8},
    {"Country": "Jordan", "Country_Code": "JO", "Region": "Africa & Middle East", "Data_Centers": 8},
    {"Country": "Senegal", "Country_Code": "SN", "Region": "Africa & Middle East", "Data_Centers": 7},
    {"Country": "Ivory Coast", "Country_Code": "CI", "Region": "Africa & Middle East", "Data_Centers": 6},
    {"Country": "Mozambique", "Country_Code": "MZ", "Region": "Africa & Middle East", "Data_Centers": 6},
    {"Country": "Libya", "Country_Code": "LY", "Region": "Africa & Middle East", "Data_Centers": 6},
    {"Country": "Algeria", "Country_Code": "DZ", "Region": "Africa & Middle East", "Data_Centers": 6},
    {"Country": "Reunion", "Country_Code": "RE", "Region": "Africa & Middle East", "Data_Centers": 5},
    {"Country": "Kuwait", "Country_Code": "KW", "Region": "Africa & Middle East", "Data_Centers": 5},
    {"Country": "Ethiopia", "Country_Code": "ET", "Region": "Africa & Middle East", "Data_Centers": 5},
    {"Country": "Botswana", "Country_Code": "BW", "Region": "Africa & Middle East", "Data_Centers": 5},
    {"Country": "DRC", "Country_Code": "CD", "Region": "Africa & Middle East", "Data_Centers": 4},
    {"Country": "Djibouti", "Country_Code": "DJ", "Region": "Africa & Middle East", "Data_Centers": 4},
    {"Country": "Tunisia", "Country_Code": "TN", "Region": "Africa & Middle East", "Data_Centers": 4},
    {"Country": "Uganda", "Country_Code": "UG", "Region": "Africa & Middle East", "Data_Centers": 4},
    {"Country": "Rwanda", "Country_Code": "RW", "Region": "Africa & Middle East", "Data_Centers": 3},
    {"Country": "Zambia", "Country_Code": "ZM", "Region": "Africa & Middle East", "Data_Centers": 3},
    {"Country": "Madagascar", "Country_Code": "MG", "Region": "Africa & Middle East", "Data_Centers": 3},
    {"Country": "Zimbabwe", "Country_Code": "ZW", "Region": "Africa & Middle East", "Data_Centers": 3},
    {"Country": "Lebanon", "Country_Code": "LB", "Region": "Africa & Middle East", "Data_Centers": 2},
    {"Country": "Namibia", "Country_Code": "NA", "Region": "Africa & Middle East", "Data_Centers": 2},
    {"Country": "Cameroon", "Country_Code": "CM", "Region": "Africa & Middle East", "Data_Centers": 2},
    {"Country": "Togo", "Country_Code": "TG", "Region": "Africa & Middle East", "Data_Centers": 2},
    {"Country": "Lesotho", "Country_Code": "LS", "Region": "Africa & Middle East", "Data_Centers": 2},
    {"Country": "Malawi", "Country_Code": "MW", "Region": "Africa & Middle East", "Data_Centers": 1},
    {"Country": "Republic of the Congo", "Country_Code": "CG", "Region": "Africa & Middle East", "Data_Centers": 1},
    {"Country": "Iraq", "Country_Code": "IQ", "Region": "Africa & Middle East", "Data_Centers": 1},
    {"Country": "Burkina Faso", "Country_Code": "BF", "Region": "Africa & Middle East", "Data_Centers": 1},
    {"Country": "Guinea", "Country_Code": "GN", "Region": "Africa & Middle East", "Data_Centers": 1},
    {"Country": "Palestine", "Country_Code": "PS", "Region": "Africa & Middle East", "Data_Centers": 1},
    {"Country": "Gabon", "Country_Code": "GA", "Region": "Africa & Middle East", "Data_Centers": 1},
    {"Country": "Mali", "Country_Code": "ML", "Region": "Africa & Middle East", "Data_Centers": 1},
    {"Country": "Mayotte", "Country_Code": "YT", "Region": "Africa & Middle East", "Data_Centers": 1},
    {"Country": "Equatorial Guinea", "Country_Code": "GQ", "Region": "Africa & Middle East", "Data_Centers": 1},
    {"Country": "Eswatini", "Country_Code": "SZ", "Region": "Africa & Middle East", "Data_Centers": 1},
    {"Country": "Sudan", "Country_Code": "SD", "Region": "Africa & Middle East", "Data_Centers": 1},
    {"Country": "Seychelles", "Country_Code": "SC", "Region": "Africa & Middle East", "Data_Centers": 1},
    {"Country": "Somalia", "Country_Code": "SO", "Region": "Africa & Middle East", "Data_Centers": 1},
]

df = pd.DataFrame(data)

# Classify countries
middle_east = ['Israel', 'UAE', 'Saudi Arabia', 'Iran', 'Oman', 'Qatar', 'Bahrain',
               'Jordan', 'Kuwait', 'Lebanon', 'Iraq', 'Palestine']
df['Sub_Region'] = df['Country'].apply(lambda x: 'Middle East' if x in middle_east else 'Africa')

# Adding population data (approximate, in millions)
population_data = {
    # Africa
    'Nigeria': 237.5, 'Ethiopia': 135.5, 'Egypt': 118.4, 'DRC': 112.8,
    'Tanzania': 70.5, 'South Africa': 64.7, 'Kenya': 57.5, 'Uganda': 51.4,
    'Sudan': 51.7, 'Algeria': 47.4, 'Angola': 39.0, 'Morocco': 38.4,
    'Mozambique': 35.6, 'Ghana': 35.1, 'Madagascar': 32.7, 'Ivory Coast': 32.7,
    'Cameroon': 29.9, 'Niger': 27.9, 'Mali': 25.2, 'Burkina Faso': 24.1,
    'Malawi': 22.2, 'Zambia': 21.9, 'Somalia': 19.7, 'Senegal': 18.9,
    'Zimbabwe': 17.0, 'Guinea': 15.1, 'Rwanda': 14.6, 'Tunisia': 12.3,
    'Togo': 9.7, 'Libya': 7.5, 'Republic of the Congo': 6.5,
    'Namibia': 3.1, 'Botswana': 2.6, 'Gabon': 2.6, 'Lesotho': 2.4,
    'Equatorial Guinea': 1.9, 'Mauritius': 1.3, 'Eswatini': 1.3,
    'Djibouti': 1.2, 'Reunion': 0.88, 'Mayotte': 0.34, 'Seychelles': 0.13,

    # Middle East
    'Iran': 92.4, 'Iraq': 47.0, 'Saudi Arabia': 34.6, 'Jordan': 11.4,
    'UAE': 11.1, 'Israel': 9.5, 'Palestine': 5.6, 'Lebanon': 5.4,
    'Oman': 5.5, 'Kuwait': 5.1, 'Qatar': 3.1, 'Bahrain': 1.6
}
df['Population_M'] = df['Country'].map(population_data)
df['DC_per_Million'] = (df['Data_Centers'] / df['Population_M']).round(2)

df.head(15).sort_values('Data_Centers', ascending=False)





        Country  Country_Code                Region  Data_Centers   Sub_Region  Population_M  DC_per_Million
0        Israel            IL  Africa & Middle East            61  Middle East           9.5            6.42
1           UAE            AE  Africa & Middle East            58  Middle East          11.1            5.23
2  South Africa            ZA  Africa & Middle East            56       Africa          64.7            0.87
3  Saudi Arabia            SA  Africa & Middle East            51  Middle East          34.6            1.47
4       Nigeria            NG  Africa & Middle East            21       Africa         237.5            0.09
5         Kenya            KE  Africa & Middle East            20       Africa          57.5            0.35
6          Iran            IR  Africa & Middle East            20  Middle East          92.4            0.22
7          Oman            OM  Africa & Middle East            15  Middle East           5.5            2.73
8         Egypt            EG  Africa & Middle East            13       Africa         118.4            0.11
9       Morocco            MA  Africa & Middle East            12       Africa          38.4            0.31


In [38]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 53 entries, 0 to 52
Data columns (total 7 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   Country         53 non-null     object 
 1   Country_Code    53 non-null     object 
 2   Region          53 non-null     object 
 3   Data_Centers    53 non-null     int64  
 4   Sub_Region      53 non-null     object 
 5   Population_M    53 non-null     float64
 6   DC_per_Million  53 non-null     float64
dtypes: float64(2), int64(1), object(4)
memory usage: 3.0+ KB


In [39]:
# number of countries in sub_region == 'Africa'
print(df[df['Sub_Region'] == 'Africa']['Country'].count())

41


In [40]:
# number of countries in sub_region == 'Middle East'
print(df[df['Sub_Region'] == 'Middle East']['Country'].count())

12


A total of 53 countries are under review: 41 in Africa and 12 in the Middle East.
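The two counts above can also be produced in a single pass, together with a check that every country received a population figure. The snippet below is a minimal sketch in plain Python on a four-country subset of the data; `rows`, `middle_east_subset`, and `population_subset` are illustrative names and not part of the notebook.

```python
from collections import Counter

# Four-country subset of the notebook's data, for illustration only
rows = [
    {"Country": "Israel", "Data_Centers": 61},
    {"Country": "UAE", "Data_Centers": 58},
    {"Country": "South Africa", "Data_Centers": 56},
    {"Country": "Nigeria", "Data_Centers": 21},
]
middle_east_subset = {"Israel", "UAE"}
population_subset = {"Israel": 9.5, "UAE": 11.1, "South Africa": 64.7, "Nigeria": 237.5}

# Count countries per sub-region in one pass
sub_region_counts = Counter(
    "Middle East" if row["Country"] in middle_east_subset else "Africa"
    for row in rows
)

# Countries without a population figure would yield NaN ratios later on
missing_population = [
    row["Country"] for row in rows if row["Country"] not in population_subset
]

print(dict(sub_region_counts))
print(missing_population)
```

On the full DataFrame, `df['Sub_Region'].value_counts()` and `df['Population_M'].isna().sum()` should give the equivalent counts and missing-value check.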

In [41]:
# Top 10 overall
print(df.nlargest(10, 'Data_Centers')[['Country', 'Population_M','Data_Centers','DC_per_Million']])

        Country  Population_M  Data_Centers  DC_per_Million
0        Israel           9.5            61            6.42
1           UAE          11.1            58            5.23
2  South Africa          64.7            56            0.87
3  Saudi Arabia          34.6            51            1.47
4       Nigeria         237.5            21            0.09
5         Kenya          57.5            20            0.35
6          Iran          92.4            20            0.22
7          Oman           5.5            15            2.73
8         Egypt         118.4            13            0.11
9       Morocco          38.4            12            0.31


In [42]:
#top 10 by sub_region == 'Africa'
africa_10 = (df[df['Sub_Region'] == 'Africa'].nlargest(10, 'Data_Centers')[['Country', 'Population_M','Data_Centers','DC_per_Million']])
print(africa_10)

         Country  Population_M  Data_Centers  DC_per_Million
2   South Africa          64.7            56            0.87
4        Nigeria         237.5            21            0.09
5          Kenya          57.5            20            0.35
8          Egypt         118.4            13            0.11
9        Morocco          38.4            12            0.31
10      Tanzania          70.5            11            0.16
12     Mauritius           1.3            10            7.69
13         Ghana          35.1             8            0.23
15        Angola          39.0             8            0.21
17       Senegal          18.9             7            0.37


In [43]:
#sum of data centers for top 10 in Africa
print(africa_10['Data_Centers'].sum())

166


In [44]:
#top 10 by sub_region == 'Middle East'
middleeast_10 = (df[df['Sub_Region'] == 'Middle East'].nlargest(10, 'Data_Centers')[['Country', 'Data_Centers', 'Population_M','DC_per_Million']])
print(middleeast_10)

         Country  Data_Centers  Population_M  DC_per_Million
0         Israel            61           9.5            6.42
1            UAE            58          11.1            5.23
3   Saudi Arabia            51          34.6            1.47
6           Iran            20          92.4            0.22
7           Oman            15           5.5            2.73
11         Qatar            11           3.1            3.55
14       Bahrain             8           1.6            5.00
16        Jordan             8          11.4            0.70
23        Kuwait             5           5.1            0.98
34       Lebanon             2           5.4            0.37


In [45]:
#sum of data centers for top 4 middle east countries
# head(4) picks the first four rows by position; middleeast_10 is already sorted by Data_Centers
middleeast_10.head(4)['Data_Centers'].sum()

np.int64(190)

## Significant Insights from Data Centers Distribution

### Is the Middle East Dominating Africa's Digital Infrastructure?

In [46]:

# QUESTION 1
print("\n" + "🔍 QUESTION 1: Is the Middle East Dominating Africa's Digital Infrastructure?")
print("-" * 90)

africa_total = df[df['Sub_Region'] == 'Africa']['Data_Centers'].sum()
me_total = df[df['Sub_Region'] == 'Middle East']['Data_Centers'].sum()
total = df['Data_Centers'].sum()

print(f"\nMiddle East Data Centers: {me_total} ({me_total/total*100:.1f}%)")
print(f"African Data Centers: {africa_total} ({africa_total/total*100:.1f}%)")
print(f"Total Data Centers: {total}")
#africa_total
print(f"Total Data Centers in Africa:  {africa_total:.1f}")
#me_total
print(f"Total Data Centers in Middle East: {me_total:.1f}")

print(f"\nTop 5 Overall:")
print(df.nlargest(5, 'Data_Centers')[['Country', 'Sub_Region', 'Data_Centers']].to_string(index=False))





🔍 QUESTION 1: Is the Middle East Dominating Africa's Digital Infrastructure?
------------------------------------------------------------------------------------------

Middle East Data Centers: 241 (48.8%)
African Data Centers: 253 (51.2%)
Total Data Centers: 494
Total Data Centers in Africa:  253.0
Total Data Centers in Middle East: 241.0

Top 5 Overall:
     Country  Sub_Region  Data_Centers
      Israel Middle East            61
         UAE Middle East            58
South Africa      Africa            56
Saudi Arabia Middle East            51
     Nigeria      Africa            21


**THE STORY:**

Africa is being dramatically outpaced by Middle Eastern nations in digital infrastructure once the number of countries and their populations are taken into account. Key insights:

- Total number of data centers for all nations categorised under the Africa and Middle East region: 494
- Total data centers in Africa (41 countries): 253
- Total data centers in the Middle East (12 countries): 241

Top 5 overall by number of data centers:

1. Israel                       61
2. United Arab Emirates         58
3. South Africa                 56
4. Saudi Arabia                 51
5. Nigeria                      21

**Contrast between Middle East and African Data Center Space**

The Middle East controls **48.8%** of the region's data centers. Just four countries (Israel, UAE, Saudi Arabia, and Iran) host 24 more facilities than Africa's top 10 countries combined: the Middle East's top 4 total 190 data centers, whilst Africa's top 10 total 166.

Africa controls **51.2%** of the region's data centers, spread across 41 countries.

This isn't just about numbers. It's about who controls the digital future. As AI, cloud computing, and digital services become the backbone of modern economies, Africa risks becoming digitally dependent on nations with more data centers.
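The headline figures in this comparison are simple sums and shares, and can be re-derived from the counts in the tables above. The following is a small sketch in plain Python; the dictionary names are illustrative and the values are copied from the notebook's output.

```python
# Data center counts copied from the tables above
middle_east_top4 = {"Israel": 61, "UAE": 58, "Saudi Arabia": 51, "Iran": 20}
africa_top10 = {
    "South Africa": 56, "Nigeria": 21, "Kenya": 20, "Egypt": 13, "Morocco": 12,
    "Tanzania": 11, "Mauritius": 10, "Ghana": 8, "Angola": 8, "Senegal": 7,
}
me_total, africa_total = 241, 253  # sub-region totals reported above

me_top4_sum = sum(middle_east_top4.values())
africa_top10_sum = sum(africa_top10.values())
gap = me_top4_sum - africa_top10_sum

# Share of the combined region's data centers held by the Middle East
me_share = me_total / (me_total + africa_total) * 100

print(me_top4_sum, africa_top10_sum, gap)
print(f"{me_share:.1f}%")
```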

### Which African Countries Are Punching Above Their Weight?

In [47]:
# QUESTION 2
print("\n" + "=" * 90)
print("🔍 QUESTION 2: Which African Countries Are Punching Above Their Weight?")
print("-" * 90)

africa_df = df[df['Sub_Region'] == 'Africa'].copy()
africa_df = africa_df.dropna(subset=['Population_M'])
print("\nData Centers Per Million People (African Countries):")
print(africa_df.nlargest(15, 'DC_per_Million')[['Country', 'Population_M', 'Data_Centers', 'DC_per_Million']].to_string(index=False))




🔍 QUESTION 2: Which African Countries Are Punching Above Their Weight?
------------------------------------------------------------------------------------------

Data Centers Per Million People (African Countries):
          Country  Population_M  Data_Centers  DC_per_Million
        Mauritius          1.30            10            7.69
       Seychelles          0.13             1            7.69
          Reunion          0.88             5            5.68
         Djibouti          1.20             4            3.33
          Mayotte          0.34             1            2.94
         Botswana          2.60             5            1.92
     South Africa         64.70            56            0.87
          Lesotho          2.40             2            0.83
            Libya          7.50             6            0.80
         Eswatini          1.30             1            0.77
          Namibia          3.10             2            0.65
Equatorial Guinea          1.90     

In [48]:
# Average of DC_per_Million
avg_dc_per_million = africa_df['DC_per_Million'].mean()

print(f"Average data centers per million for the African continent (41 countries) is: {avg_dc_per_million:.2f}")

Average data centers per million for the African continent (41 countries) is: 0.93


**THE STORY:**

Size isn't everything. While the average density across the 41 African countries in this dataset is **0.93** data centers per million people, Mauritius, with a population of 1.3 million, emerges as Africa's data center champion with **7.69 data centers** per million people, more than 8 times the continental average. (Seychelles matches that ratio, but with a single data center for 0.13 million people.)

Mauritius has transformed itself into a digital hub, leveraging strategic location, political stability, and smart policy to attract infrastructure investment.

In contrast, populous giants like South Africa and Kenya, with populations exceeding 50 million each, have fewer than one data center per million people.

**The key takeaway**: National wealth and population alone don't guarantee digital infrastructure. Vision, regulation, and investment climate matter more.
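The 0.93 average is an unweighted mean of per-country ratios, so small countries with high ratios pull it up. A pooled figure (total data centers divided by total population) tells a different story. The sketch below contrasts the two on a four-country subset taken from the tables above; the subset and variable names are illustrative only.

```python
# (data centers, population in millions) from the tables above
subset = {
    "Mauritius": (10, 1.3),
    "South Africa": (56, 64.7),
    "Kenya": (20, 57.5),
    "Nigeria": (21, 237.5),
}

# Unweighted mean: every country counts equally, regardless of size
ratios = [dc / pop for dc, pop in subset.values()]
mean_of_ratios = sum(ratios) / len(ratios)

# Pooled ratio: total data centers over total population
total_dc = sum(dc for dc, _ in subset.values())
total_pop = sum(pop for _, pop in subset.values())
pooled_ratio = total_dc / total_pop

print(f"Mean of country ratios: {mean_of_ratios:.2f}")
print(f"Pooled ratio:           {pooled_ratio:.2f}")
```

For this subset the unweighted mean is several times larger than the pooled ratio, because Mauritius dominates the mean while contributing little population. Which figure is appropriate depends on whether the question is about the typical country or the typical person.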

### Are Africa's Most Populous Nations Being Left Behind?

In [49]:
# QUESTION 3
print("\n" + "=" * 90)
print("🔍 QUESTION 3: Are Africa's Most Populous Nations Being Left Behind?")
print("-" * 90)

big_populations = africa_df.nlargest(10, 'Population_M')[['Country', 'Population_M', 'Data_Centers', 'DC_per_Million']]
print("\nAfrica's 10 Most Populous Countries:")
print(big_populations.to_string(index=False))

avg_dc_big = big_populations['DC_per_Million'].mean()
avg_dc_continent = africa_df['DC_per_Million'].mean()

print(f"\nAverage DC per million (Top 10 by population): {avg_dc_big:.2f}")
print(f"Average DC per million (All African countries): {avg_dc_continent:.2f}")




🔍 QUESTION 3: Are Africa's Most Populous Nations Being Left Behind?
------------------------------------------------------------------------------------------

Africa's 10 Most Populous Countries:
     Country  Population_M  Data_Centers  DC_per_Million
     Nigeria         237.5            21            0.09
    Ethiopia         135.5             5            0.04
       Egypt         118.4            13            0.11
         DRC         112.8             4            0.04
    Tanzania          70.5            11            0.16
South Africa          64.7            56            0.87
       Kenya          57.5            20            0.35
       Sudan          51.7             1            0.02
      Uganda          51.4             4            0.08
     Algeria          47.4             6            0.13

Average DC per million (Top 10 by population): 0.19
Average DC per million (All African countries): 0.93


**THE STORY:**

Data centers are scarce in Africa's most populous nations.

Africa's population giants are woefully under-served.

1. Nigeria, with 237.5 million people, has 21 data centers, or 0.09 data centers per million people.

2. Ethiopia, with 135.5 million people, has 5 data centers, or 0.04 data centers per million people.

3. South Africa, with 64.7 million people, has the highest number of data centers (**56 data centers**) and also the highest density in this group, at 0.87 data centers per million people.

4. Egypt, with 118.4 million people, has 13 data centers, or 0.11 data centers per million people.

This creates a dependency cycle: without an adequate number of local data centers, these nations face slower internet speeds, higher costs for cloud services, and barriers to building local tech ecosystems.

For 500+ million Africans living in the continent's largest economies, much of their digital future is being hosted elsewhere, often in Europe or North America.

The implications for digital sovereignty are profound. When a Nigerian startup scales up, it likely stores data abroad. When an Egyptian business goes digital, its customer information crosses borders. This isn't just inconvenient; it's a structural disadvantage in the digital economy.
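One way to put a number on "under-served" is to ask how many data centers each populous country would need to match South Africa's density. This is an illustrative benchmark chosen here, not a target from the source data; the figures are copied from the table above and the variable names are hypothetical.

```python
import math

# (population in millions, current data centers) from the table above
countries = {
    "Nigeria": (237.5, 21),
    "Ethiopia": (135.5, 5),
    "Egypt": (118.4, 13),
}

# Benchmark: South Africa's density, about 0.87 data centers per million people
south_africa_pop_m, south_africa_dc = 64.7, 56
benchmark_density = south_africa_dc / south_africa_pop_m

# Additional data centers needed to reach the benchmark density
shortfall = {}
for name, (pop_m, dc) in countries.items():
    needed = math.ceil(benchmark_density * pop_m)
    shortfall[name] = needed - dc

for name, gap in shortfall.items():
    print(f"{name}: {gap} more data centers to match South Africa's density")
```

The count of facilities ignores their size and capacity, so the result is best read as an order-of-magnitude indication rather than a plan.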

### The 'One Data Center Club': 12 African Nations on Life Support?

In [58]:
# QUESTION 4
print("\n" + "=" * 90)
print("🔍 QUESTION 4: The 'One Data Center Club': 12 African Nations on Life Support?")
print("-" * 90)

one_dc_countries = africa_df[africa_df['Data_Centers'] == 1].sort_values('Population_M', ascending=False)
one_dc_pop = one_dc_countries['Population_M'].sum()
total_africa_pop = africa_df['Population_M'].sum()

print(f"\n{len(one_dc_countries)} African countries have just ONE data center each:")
print(one_dc_countries[['Country', 'Population_M', 'Data_Centers','DC_per_Million']].to_string(index=False))
print(f"\nCombined population: {one_dc_pop:.1f} million ({one_dc_pop/total_africa_pop*100:.1f}% of Africa)")



🔍 QUESTION 4: The 'One Data Center Club': 12 African Nations on Life Support?
------------------------------------------------------------------------------------------

12 African countries have just ONE data center each:
              Country  Population_M  Data_Centers  DC_per_Million
                Sudan         51.70             1            0.02
                 Mali         25.20             1            0.04
         Burkina Faso         24.10             1            0.04
               Malawi         22.20             1            0.05
              Somalia         19.70             1            0.05
               Guinea         15.10             1            0.07
Republic of the Congo          6.50             1            0.15
                Gabon          2.60             1            0.38
    Equatorial Guinea          1.90             1            0.53
             Eswatini          1.30             1            0.77
              Mayotte          0.34           

In [55]:
one_dc_countries.info()

<class 'pandas.core.frame.DataFrame'>
Index: 12 entries, 50 to 51
Data columns (total 7 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   Country         12 non-null     object 
 1   Country_Code    12 non-null     object 
 2   Region          12 non-null     object 
 3   Data_Centers    12 non-null     int64  
 4   Sub_Region      12 non-null     object 
 5   Population_M    12 non-null     float64
 6   DC_per_Million  12 non-null     float64
dtypes: float64(2), int64(1), object(4)
memory usage: 768.0+ bytes


In [57]:
one_dc_pop.sum()

np.float64(170.77)

**THE STORY:**

Twelve African nations, home to about 171 million people, are hanging by a digital thread.

Each has exactly ONE data center. One facility failure, one power outage, one political decision could severely impact entire national digital economies.

This includes countries like Sudan (52M people), Mali (25M), and Burkina Faso (24M). Imagine if the United States had just one data center for the entire country: the vulnerability is staggering. Yet this is reality for nearly a third of the African countries in this dataset.

**For context**: the tiny nation of Israel has 61 data centers serving 9.5 million people. These 12 African countries combined have 12 data centers serving about 171 million.
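The combined population of the one-data-center countries, and the contrast with Israel, can be checked directly from the population figures used in this notebook. The sketch below uses plain Python; `one_dc_population` is an illustrative name.

```python
# Populations (millions) of African countries with exactly one data center
one_dc_population = {
    "Sudan": 51.7, "Mali": 25.2, "Burkina Faso": 24.1, "Malawi": 22.2,
    "Somalia": 19.7, "Guinea": 15.1, "Republic of the Congo": 6.5,
    "Gabon": 2.6, "Equatorial Guinea": 1.9, "Eswatini": 1.3,
    "Mayotte": 0.34, "Seychelles": 0.13,
}

club_size = len(one_dc_population)              # one data center per country
combined_pop = sum(one_dc_population.values())  # millions of people
combined_density = club_size / combined_pop     # data centers per million

# Israel, for contrast: 61 data centers for 9.5 million people
israel_density = 61 / 9.5
density_ratio = israel_density / combined_density

print(f"{club_size} countries, {combined_pop:.2f} million people")
print(f"{combined_density:.2f} vs {israel_density:.2f} data centers per million")
```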

### East vs West vs North Africa: Which Region Leads?

In [51]:

# QUESTION 5
print("\n" + "=" * 90)
print("🔍 QUESTION 5: East vs West vs North Africa: Which Region Leads?")
print("-" * 90)

# Regional classification

east_africa = [
    'Kenya', 'Tanzania', 'Uganda', 'Ethiopia', 'Rwanda', 'Djibouti',
    'Somalia', 'Eritrea', 'Burundi', 'South Sudan', 'Sudan',
]

west_africa = [
    'Nigeria', 'Ghana', 'Ivory Coast', 'Senegal', 'Mali', 'Burkina Faso',
    'Guinea', 'Togo', 'Benin', 'Niger', 'Gambia', 'Liberia',
    'Sierra Leone', 'Cabo Verde', 'Guinea-Bissau', 'Mauritania'
]

north_africa = [
    'Egypt', 'Morocco', 'Algeria', 'Tunisia', 'Libya', 'Sahrawi Arab Democratic Republic'
]

# Central African countries
central_africa = [
    'Cameroon', 'Gabon', 'Equatorial Guinea', 'Republic of the Congo',
    'DRC', 'Central African Republic', 'Chad',
    'Sao Tome and Principe'
]

southern_africa = [
    'South Africa', 'Angola', 'Botswana', 'Namibia', 'Lesotho', 'Eswatini',
    'Zambia', 'Zimbabwe', 'Malawi', 'Mozambique','Madagascar','Mauritius','Comoros',
    'Seychelles','Reunion', 'Mayotte'
]

def classify_african_region(country):
    if country in east_africa:
        return 'East Africa'
    elif country in west_africa:
        return 'West Africa'
    elif country in north_africa:
        return 'North Africa'
    elif country in southern_africa:
        return 'Southern Africa'
    elif country in central_africa:
        return 'Central Africa'
    else:
        return 'Other'


africa_df['African_Region'] = africa_df['Country'].apply(classify_african_region)
regional_stats = africa_df.groupby('African_Region').agg({
    'Data_Centers': 'sum',
    'Population_M': 'sum',
    'Country': 'count'
}).round(1)
regional_stats.columns = ['Total_DC', 'Total_Pop_M', 'Num_Countries']
regional_stats['DC_per_Million'] = (regional_stats['Total_DC'] / regional_stats['Total_Pop_M']).round(2)
regional_stats = regional_stats.sort_values('Total_DC', ascending=False)

print("\nRegional Breakdown:")
print(regional_stats.to_string())




🔍 QUESTION 5: East vs West vs North Africa: Which Region Leads?
------------------------------------------------------------------------------------------

Regional Breakdown:
                 Total_DC  Total_Pop_M  Num_Countries  DC_per_Million
African_Region                                                       
Southern Africa       107        245.2             15            0.44
East Africa            49        402.1              8            0.12
West Africa            47        398.3              8            0.12
North Africa           41        224.0              5            0.18
Central Africa          9        153.7              5            0.06


In [52]:
africa_df.info()

# save africa_df as csv file
africa_df.to_csv('africa_only_datacenter_analysis.csv', index=False)

<class 'pandas.core.frame.DataFrame'>
Index: 41 entries, 2 to 52
Data columns (total 8 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   Country         41 non-null     object 
 1   Country_Code    41 non-null     object 
 2   Region          41 non-null     object 
 3   Data_Centers    41 non-null     int64  
 4   Sub_Region      41 non-null     object 
 5   Population_M    41 non-null     float64
 6   DC_per_Million  41 non-null     float64
 7   African_Region  41 non-null     object 
dtypes: float64(2), int64(1), object(5)
memory usage: 2.9+ KB


**THE STORY:**

Southern Africa is quietly winning the data infrastructure race. With a population well below that of East or West Africa, Southern Africa (led by South Africa's 56 data centers) has built the densest digital infrastructure on the continent. The region:

- Has **107** data centers serving 245.2 million people
- Number of data centers serving 1 million people: 0.44
- Number of countries in the Southern African region with data centers: 15

East Africa (home to Kenya, once dubbed "Silicon Savannah" for its mobile money innovation) trails behind, with Kenya and Tanzania leading the charge. The region:

- Has **49** data centers serving 402.1 million people
- Number of data centers serving 1 million people: 0.12
- Number of countries in the East African region with data centers: 8

West Africa, home to the continent's largest economy (Nigeria) and nearly 400 million people, remains dramatically underserved relative to its population, with 47 data centers and 0.12 per million people.

North Africa's story is one of missed opportunity. With proximity to Europe, relatively better connectivity, and significant populations, countries like Egypt and Morocco should be regional powerhouses, yet they lag behind tiny island nations in per-capita infrastructure.

The regional disparity suggests that Africa's digital divide isn't just between Africa and the world; it also runs within Africa itself.
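The regional breakdown depends on every country name matching one of the region lists exactly. A lookup dictionary makes that assumption easy to audit. The following is a minimal sketch with a shortened, illustrative set of lists; `region_lists`, `country_to_region`, and the sample names are assumptions made for the example.

```python
# Shortened region lists, for illustration only
region_lists = {
    "East Africa": ["Kenya", "Tanzania", "Uganda"],
    "West Africa": ["Nigeria", "Ghana", "Senegal"],
    "Southern Africa": ["South Africa", "Botswana", "Mauritius"],
}

# Invert the lists into a country -> region lookup
country_to_region = {
    country: region
    for region, members in region_lists.items()
    for country in members
}

def classify_african_region(country):
    """Return the region for a country, or 'Other' if it is not listed."""
    return country_to_region.get(country, "Other")

# A spelling that is absent from the lists falls through to 'Other'
sample = ["Kenya", "Nigeria", "Mauritius", "Cape Verde"]
labels = {country: classify_african_region(country) for country in sample}
unclassified = [country for country, region in labels.items() if region == "Other"]

print(labels)
print(unclassified)
```

Listing the `unclassified` names before aggregating is a cheap way to catch a spelling mismatch that would otherwise shift a country's data centers out of its region's total.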

### Kenya vs Nigeria: Why Is the Digital David Beating Goliath?

In [53]:
# QUESTION 6
print("\n" + "=" * 90)
print("🔍 QUESTION 6: Kenya vs Nigeria: Why Is the Digital David Beating Goliath?")
print("-" * 90)

kenya = df[df['Country'] == 'Kenya'].iloc[0]
nigeria = df[df['Country'] == 'Nigeria'].iloc[0]

print(f"\nNIGERIA: {nigeria['Population_M']:.0f}M people | {nigeria['Data_Centers']} data centers | {nigeria['DC_per_Million']:.2f} per million")
print(f"KENYA: {kenya['Population_M']:.0f}M people | {kenya['Data_Centers']} data centers | {kenya['DC_per_Million']:.2f} per million")
print(f"\nKenya has {kenya['DC_per_Million']/nigeria['DC_per_Million']:.1f}x better data center coverage than Nigeria")




🔍 QUESTION 6: Kenya vs Nigeria: Why Is the Digital David Beating Goliath?
------------------------------------------------------------------------------------------

NIGERIA: 238M people | 21 data centers | 0.09 per million
KENYA: 58M people | 20 data centers | 0.35 per million

Kenya has 3.9x better data center coverage than Nigeria


**THE STORY:**

Here's the shocker that should worry Nigerian policymakers: Kenya, with just 58 million people, has nearly as many data centers (20) as Nigeria with 238 million (21). Per capita, Kenya's data center coverage is **3.9 times higher** than that of Africa's largest economy.

This isn't an accident. It's a policy failure. While Nigeria wrestled with inconsistent regulations and infrastructure challenges, Kenya deliberately positioned itself as East Africa's digital hub.

The result? Multiple undersea cable landings, favorable policies for tech companies, and data centers that serve not just Kenya but the entire region.

For Nigeria, this represents a massive opportunity cost. As Africa's largest economy and tech market, Nigeria should be the data center capital of the continent. Instead, many of its startups host services in Europe or the USA, or depend on infrastructure in neighboring countries.

The key takeaway: economic size doesn't automatically translate to digital leadership; intentional policy does.
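The 3.9x figure is a ratio of two densities that were each rounded to two decimals before dividing. The sketch below recomputes it from the raw counts and populations to show how much the rounding matters; the variable names are illustrative.

```python
# Figures from the comparison above (population in millions)
kenya_dc, kenya_pop_m = 20, 57.5
nigeria_dc, nigeria_pop_m = 21, 237.5

# Ratio computed from unrounded densities
ratio_exact = (kenya_dc / kenya_pop_m) / (nigeria_dc / nigeria_pop_m)

# Ratio computed from densities already rounded to two decimals
ratio_rounded = round(kenya_dc / kenya_pop_m, 2) / round(nigeria_dc / nigeria_pop_m, 2)

print(f"{ratio_exact:.2f} vs {ratio_rounded:.2f}")
```

Both versions round to 3.9 here (about 3.93 versus 3.89), so the headline holds, but dividing rounded values of this size can drift further for countries with very low densities.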





# Conclusion: Africa's Digital Infrastructure Crisis

The data reveals an uncomfortable truth: Africa is falling behind in the race to build digital infrastructure. With just 253 data centers across the 41 African countries in this dataset and over 1.4 billion people, the continent has roughly 0.18 data centers per million people overall, against 6.42 in Israel alone.

The winners (Mauritius, South Africa, and Kenya) show that smart policy, political stability, and strategic investment can overcome disadvantages.

The losers, particularly Africa's population giants, demonstrate that size and natural resources mean nothing without vision and execution.

As the world enters the AI age, data centers aren't just a convenience; they're a matter of sovereignty.

Without them, Africa will remain a digital consumer, not a digital creator. The window to change this is narrowing.

In [54]:
# Save analysis
df.to_csv('africa_me_datacenter_analysis.csv', index=False)
print("\n✓ Full analysis dataset saved to 'africa_me_datacenter_analysis.csv'")


✓ Full analysis dataset saved to 'africa_me_datacenter_analysis.csv'
