<a href="https://colab.research.google.com/github/AbiemwenseMaureenOshobugie/Tackling-the-Health-Crises-in-Africa/blob/main/Tackling_the_Health_Crises_in_Africa.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


# Tackling the Health Crises in Africa
People dying from lack of medical resources

<div align="center" style="width: 950px; font-size: 80%; text-align: center; margin: 0 auto">
<img src="https://media.istockphoto.com/id/1218638664/photo/woman-donates-her-blood-in-a-street-point-in-kampala-uganda.jpg?s=1024x1024&w=is&k=20&c=F0gUOdUhXkk1DcE3U82e_dbriaVy03aC4waFgg6VP0g="
     alt="Woman donating blood at a street collection point in Kampala, Uganda"
     style="padding-bottom: 0.5em"
     width=950px/>



<a name="51"></a>

#### The Process of this Notebook

[The Problem Statement](#a)


[Install and Import Necessary Libraries](#1)

[Load Data](#2)

[Data Engineering](#3)

[Visualizations Codes](#4.0)

> [Codes for Plot1 & Plot2](#4)
> [Codes for Plot3 & Plot4](#5)
> [Codes for Plot5 & Plot6](#6)
> [Codes for Plot7 & Plot8](#7)
> [Codes for Plot9 & Plot10](#8)
> [Codes for Plot11 & Plot12](#9)
> [Codes for Plot13 & Plot14](#10)
> [Codes for Plot14_1](#14.1)
> [Codes for Plot15 & Plot16](#11)
> [Codes for Plot17](#12)
> [Codes for Plot18](#13)
> [Codes for Plot19 & Plot20](#14)
> [Codes for Plot21 & Plot22](#15)

[Visualizations & Insights](#16)

[Africa as a continent](#17)

[Results](#18)

[Dataset Limitations](#19)

[Recommendations](#20)

[Conclusion](#21)

<a name="a"></a>

## Problem Statement
The lack of access to adequate medical resources and facilities has led to a significant number of deaths. Some of these deaths could have been avoided by timely access to a medical professional or closer proximity to a hospital.

Health systems across Africa are underfunded and understaffed. About half of African citizens (52%, or about 615 million people) have access to the health care they need. The quality of health services across the continent is generally poor, and the family planning needs of half the continent's women and girls are unmet.

### Let's focus on Africa in this dataset

You are required to provide solutions to the health challenges, especially in Africa. Let your creativity shine through, and remember that Africa is looking to you for a solution.

There are six datasets for this project: four are Comma Separated Values (CSV) files and two are Excel (xlsx) files.


-  annual number of deaths by cause - csv

-  number of deaths by age group - csv

-  medical doctors per 10000 population - xlsx

-  ISO 3166 country and continent code - csv

-  World population - csv

-  current health expenditure (% of GDP) - xlsx




a)   Include the limitations with the dataset and how you think your analysis can be further enriched


b)   Be ready to discuss your insights and share what you learned working on the datasets

**All the best!**

<a name="1"></a>


## Install and Import the Necessary Libraries




In [None]:
!pip -q install pandas_bokeh

In [None]:
# suppress warning messages to keep the notebook output readable
import warnings
warnings.filterwarnings('ignore')

# data visualisation and manipulation
import pandas as pd
import numpy as np
np.set_printoptions(precision=3, suppress=True)
import pandas_bokeh
import functools as ft

# Embedding plots in Colab Notebook
pandas_bokeh.output_notebook()
from matplotlib import pyplot as plt
plt.rcParams["figure.figsize"] = [10, 7]

<a name="2"></a>

### Download and Load the Data in Dataframes

With the libraries in place, we can load the data for analysis. `pandas.read_csv()` and `pandas.read_excel()` create dataframes directly from the hosted files.


In [None]:
# Annual Number of Death by Cause data 
disease_df = pd.read_csv(
    "https://raw.githubusercontent.com/AbiemwenseMaureenOshobugie/Tackling-the-Health-Crises-in-Africa/main/1.%20annual-number-of-deaths-by-cause.csv",
    parse_dates=["Year"],
)
# ISO 3166 Country and Continent data
iso_df = pd.read_csv(
    "https://raw.githubusercontent.com/AbiemwenseMaureenOshobugie/Tackling-the-Health-Crises-in-Africa/main/4.%20ISO%203166_country-and-continent-codes-list-csv.csv"
)
# World population data 
pop_df = pd.read_csv(
    "https://raw.githubusercontent.com/AbiemwenseMaureenOshobugie/Tackling-the-Health-Crises-in-Africa/main/5.%20World%20Population.csv",
    parse_dates=["Year"],
)
# Number of Death by Age Group data  
age_df = pd.read_csv(
    "https://raw.githubusercontent.com/AbiemwenseMaureenOshobugie/Tackling-the-Health-Crises-in-Africa/main/2.%20number-of-deaths-by-age-group.csv",
    parse_dates=["Year"],
)
# Medical Doctors per 10000 population data 
doc_df = pd.read_excel(
    "https://github.com/AbiemwenseMaureenOshobugie/Tackling-the-Health-Crises-in-Africa/blob/main/3.%20Medical%20Doctors%20Per%2010000%20population.xlsx?raw=true"
)
# Current health Expenditure (% of GDP) data
exp_df = pd.read_excel(
    "https://github.com/AbiemwenseMaureenOshobugie/Tackling-the-Health-Crises-in-Africa/blob/main/6.%20Current%20health%20expenditure%20(%25%20of%20GDP).xlsx?raw=true"
)


<a name="3"></a>


### Data Engineering
The main task here is analysis, so I am diving right in. There is one dimension table (the ISO 3166 Country and Continent data); the others are fact tables. As a result, we can merge all the tables through the dimension table. I will do the data engineering along the way.

**Merge all dataframes** - I merge the dataframes and carry out the analysis at the same time, so the results of the analysis will not be visible until later on.

Impatient? Use the [links above](#51) to jump to any part of the notebook.



Still here? Let's hit the start button.

**Firstly**, I will merge the Annual Number of Death by Cause and ISO 3166 Country and Continent data to give the `disease_iso` df

*   Annual Number of Death by Cause data - Gives the number of deaths and their causes across different countries from 1990 to 2019

*   ISO 3166 Country and Continent data - Lists sovereign states and dependent territories by continent




In [None]:
# Merge disease and iso data
disease_iso = pd.merge(
    disease_df,
    iso_df.drop(
        ["Two_Letter_Country_Code", "Country_Number"],
        axis=1,
    ),
    how="left",
    left_on=["Code"],
    right_on=["Three_Letter_Country_Code"],
)
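
A merge through a dimension table like the one above can be sanity-checked by asking pandas to validate the join and to flag rows that found no match. The sketch below is only an illustration: `facts` and `dim` are made-up stand-ins for `disease_df` and `iso_df`, and the numbers are invented.

```python
import pandas as pd

# Hypothetical stand-ins for the fact table (deaths) and the dimension table (ISO codes)
facts = pd.DataFrame(
    {"Code": ["NGA", "KEN", "XXX"], "Malaria": [100.0, 50.0, 5.0]}
)
dim = pd.DataFrame(
    {
        "Three_Letter_Country_Code": ["NGA", "KEN"],
        "Continent_Name": ["Africa", "Africa"],
    }
)

merged = pd.merge(
    facts,
    dim,
    how="left",
    left_on="Code",
    right_on="Three_Letter_Country_Code",
    validate="many_to_one",  # raises MergeError if a code appears twice in dim
    indicator=True,  # adds a _merge column: "both" or "left_only"
)

# Rows whose code was not found in the dimension table
unmatched = merged[merged["_merge"] == "left_only"]
print(unmatched[["Code", "Malaria"]])
```

If the validation raises on the real data, the dimension table has repeated codes and the merge would silently multiply rows.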


Here, I want to remove some duplicate columns from the `disease_iso` df to give a new df called `disease_iso_no_dup` df.

In [None]:
# drop duplicate columns
disease_iso_no_dup = disease_iso.drop(
    ["Entity", "Code", "Three_Letter_Country_Code", "Continent_Code"],
    axis=1,
)


Again, I will merge the ISO 3166 Country and Continent and Number of Deaths by Age Group data to give the `age_iso` df. While at it, I will drop the duplicate columns.

*   Number of Deaths by Age Group data - Gives the number of deaths by age group across different countries from 1990 to 2019




In [None]:
# Merge the age-group and iso data
age_iso = pd.merge(
    age_df,
    iso_df.drop(
        ["Two_Letter_Country_Code", "Country_Number", "Continent_Code"],
        axis=1,
    ),
    how="left",
    left_on=["Code"],
    right_on=["Three_Letter_Country_Code"],
)

**Secondly**, I will merge the three dataframes mentioned so far: Annual Number of Death by Cause, ISO 3166 Country and Continent, and Number of Death by Age Group, to form the `disease_age_iso` df, dropping duplicate columns in the process.
 

In [None]:
# 2) Merge disease, age-group and iso data
disease_age_iso = pd.merge(
    disease_df,
    age_iso.drop(
        ["Entity", "Code"],
        axis=1,
    ),
    how="left",
    left_on=["Code", "Year"],
    right_on=["Three_Letter_Country_Code", "Year"],
)


**Thirdly**, I want to merge a fourth dataframe: World population data. This will give me the `disease_age_iso_pop` df. While loading this data, the `parse_dates` formatting did not work, so I have to convert the `Year` column with `pd.to_datetime` and `errors="coerce"` here.

*   World population data - Gives data from the UN Population Division, provides consistent and comparable estimates (and projections) within and across countries over the last century.
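
To see what `errors="coerce"` does, here is a minimal sketch on invented year labels. I am assuming that the population file contains years pandas cannot turn into a timestamp (the `"-10000"` below is a made-up example of such a value); the explicit `format="%Y"` is my addition for the sketch, not something the notebook uses.

```python
import pandas as pd

# Made-up year labels; "-10000" mimics a year that cannot become a pandas Timestamp
years = pd.Series(["1990", "2019", "-10000"])

# Values that do not match the format become NaT instead of raising an error
parsed = pd.to_datetime(years, format="%Y", errors="coerce")
print(parsed)

# Count how many values were turned into NaT by the coercion
n_failed = int(parsed.isna().sum())
print(n_failed)
```

Counting the `NaT` values right after the conversion shows how many rows will later fail to match in a merge on `Year`.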





In [None]:
# transform the Year column of the population data to datetime
pop_df["Year"] = pd.to_datetime(pop_df["Year"], errors="coerce")

# Merge disease, age-group, iso and population data
disease_age_iso_pop = pd.merge(
    disease_age_iso,
    pop_df,
    how="left",
    left_on=["Entity", "Code", "Year"],
    right_on=["Entity", "Code", "Year"],
)


Before merging the fifth dataframe: Medical Doctors per 10000 population data, I want to drop the first two rows. The first row is made up of NaN values and the second row is the supposed column headers. As a result, I will rename the columns I am most likely to use and leave out the others. In this data, I have to change the data type of the `Year` column.
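
The clean-up described above can be sketched on a tiny made-up frame. The column labels imitate the `Unnamed: N` headers of the spreadsheet, but the rows and values are invented for illustration.

```python
import numpy as np
import pandas as pd

# Made-up frame imitating a spreadsheet whose real headers sit in the second row
raw = pd.DataFrame(
    {
        "Unnamed: 5": [np.nan, "Location", "Nigeria", "Kenya"],
        "Unnamed: 6": [np.nan, "Period", 2016, 2016],
        "Unnamed: 8": [np.nan, "Value", "3.8", "1.6"],
    }
)

# Drop the all-NaN row and the embedded header row, then give the columns usable names
clean = raw.drop(labels=[0, 1], axis=0).rename(
    columns={
        "Unnamed: 5": "Entity",
        "Unnamed: 6": "Year",
        "Unnamed: 8": "Doctor_Value",
    }
)
clean = clean.reset_index(drop=True)
print(clean)
```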


In [None]:
# drop unnecessary rows and rename some columns
doc_df = doc_df.drop(labels=[0, 1], axis=0)

doc_df.rename(
    columns={
        "Unnamed: 8": "Doctor_Value",
        "Unnamed: 3": "Continent_Name",
        "Unnamed: 1": "Indicator",
        "Unnamed: 5": "Entity",
        "Unnamed: 6": "Year"
    },
    inplace=True,
)  # renames the columns

# change the year to a datetime datatype 
doc_df["Year"] = pd.to_datetime(doc_df["Year"], errors="coerce")


**Fourthly**, I will now merge the Medical Doctors per 10000 population data into the dataframe to give me the `disease_age_iso_pop_doc` df. I do not need the columns that I did not rename, so I will drop them here.

*   Medical Doctors per 10000 population data - Gives the number of medical doctors per 10,000 population by country



In [None]:
# Merge disease, age-group, iso, population and medical-doctors data
disease_age_iso_pop_doc = pd.merge(
    disease_age_iso_pop,
    doc_df.drop(
        [
            "Applied filters:\nLocation type is Country\nLANGUAGE_CODE is en\nIndicatorCode is HWF_0001, HWF_0002, HWF_0003, HWF_0004, or HWF_0005",
            "Unnamed: 2",
            "Continent_Name",
            "Unnamed: 4",
            "Unnamed: 7",
        ],
        axis=1,
    ),  # drop unneeded columns
    how="left",
    left_on=["Entity", "Year"],
    right_on=["Entity", "Year"],
)
# further drop duplicate columns
disease_age_iso_pop_doc = disease_age_iso_pop_doc.drop(
    ["Country_Name", "Three_Letter_Country_Code"], axis=1
)

Indeed, I saved the best for last. The final dataframe to prepare, as far as the data engineering goes, is the Current health Expenditure (% of GDP) data. Its last two columns, `Unnamed: 23` and `Unnamed: 24`, contain only NaN values, so I will drop them first. I will then promote the row at index 3 (the fourth row) to column headers and remove that row. Finally, I will drop the rows that contain NaN values.

Most importantly, I have to apply the `pandas.melt()` function to `unpivot` the dataframe from a wide to long format. The emerging dataframe from this is called the `exp_melt`.
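
As a quick illustration of the wide-to-long reshaping, here is a minimal sketch on an invented two-country table. The column names follow the ones used in this notebook, while the values are made up.

```python
import pandas as pd

# Made-up wide table: one row per country, one column per year
wide = pd.DataFrame(
    {
        "Country Name": ["Nigeria", "Kenya"],
        "Country Code": ["NGA", "KEN"],
        2018: [3.1, 4.5],
        2019: [3.0, 4.6],
    }
)

# Unpivot: every (country, year) pair becomes its own row
long = wide.melt(
    id_vars=["Country Name", "Country Code"],
    var_name="Year",
    value_name="Current health expenditure (% of GDP)",
)
long["Year"] = long["Year"].astype(int)
print(long)
```

The long format has one value column, which makes it straightforward to merge on country code and year.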



In [None]:
exp_df = exp_df.drop(
    ["Unnamed: 23", "Unnamed: 24"], axis=1
)  # these columns are all NaN values

exp_df.columns = exp_df.iloc[3]  # uses the row at index 3 (the fourth row) as column headers
exp_df = exp_df.drop(3)  # drops the row that was promoted to headers
exp_df = exp_df.dropna(axis=0)  # drops every row that contains a NaN value
exp_df = exp_df.reset_index(drop=True)  # resets the index
exp_df = exp_df.drop("Indicator Name", axis=1)  # drops the indicator label column, which is not a year

# transform columns to row and rows to column (like from a wide to long dataframe)
exp_melt = exp_df.melt(
    id_vars=["Country Name", "Country Code"],
    var_name="Year",
    value_name="Current health expenditure (% of GDP)",
)

exp_melt["year"] = exp_melt["Year"].astype(
    int
)  # converts the year column from object to int.


On this note, I want to create a common column in the `disease_age_iso_pop_doc` dataframe so that I can easily merge `exp_melt` into it. I will extract only the year from the date column and, of course, drop this helper column as soon as the dataframes have been merged.

*Trying to convert the year column in the `exp_melt` df was taking me to where I don't know. lol*

In [None]:
# create a year only columns from these dataframes for easy merging
disease_age_iso_pop_doc["year"] = pd.DatetimeIndex(disease_age_iso_pop_doc["Year"]).year
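
For reference, the year extraction used above has an equivalent spelling through the `.dt` accessor. A minimal sketch on two invented dates:

```python
import pandas as pd

dates = pd.Series(pd.to_datetime(["1990-01-01", "2019-01-01"]))

# Two equivalent ways to pull the calendar year out of a datetime column
year_via_index = pd.DatetimeIndex(dates).year
year_via_accessor = dates.dt.year

print(list(year_via_index), year_via_accessor.tolist())
```

Either form gives integer years, which can then be matched against the integer `year` column of `exp_melt`.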


**Finally**, I will merge the Current health Expenditure (% of GDP) data, which is now largely prepared, into the dataframe to give me the `disease_age_iso_pop_doc_exp` df. The name of the final dataframe comes from the six datasets, which are: `Annual Number of Death by Cause`, `Number of Death by Age Group`, `ISO 3166 Country and Continent`, `World population`, `Medical Doctors per 10000 population`, and `Current health Expenditure (% of GDP)`. Lest I forget,

*   Current health Expenditure (% of GDP) data - Gives the Countries expenditure on health as a percentage of GDP.


In [None]:
# Merge disease, age-group, iso, population, medical-doctors and expenditure data
disease_age_iso_pop_doc_exp = pd.merge(
    disease_age_iso_pop_doc,
    exp_melt.drop(
        ["Country Name", "Year"],
        axis=1,
    ),
    how="left",
    left_on=["Code", "year"],
    right_on=["Country Code", "year"],
)
# drop the year and Country Code columns from the exp_melt data
disease_age_iso_pop_doc_exp = disease_age_iso_pop_doc_exp.drop(
    ["year", "Country Code"], axis=1
)


Voilà! Here comes my final dataset, `disease_age_iso_pop_doc_exp`.

While merging the data, the analysis was being done simultaneously, as promised.

Below is the code used to analyze the various dataframes and derive insights.

Howbeit, there are many NaN values in this final data.
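
One way to quantify the missing values, and to spot rows repeated by the merges, is sketched below. The frame `df` is a made-up stand-in for the merged dataset; on the real data the same two calls would be applied to `disease_age_iso_pop_doc_exp`.

```python
import numpy as np
import pandas as pd

# Made-up frame standing in for the merged dataset
df = pd.DataFrame(
    {
        "Entity": ["Nigeria", "Kenya", "Ghana", "Mali"],
        "Malaria": [100.0, np.nan, 30.0, np.nan],
        "Doctor_Value": [np.nan, np.nan, np.nan, 1.3],
    }
)

# Fraction of missing values per column, largest first
missing_share = df.isna().mean().sort_values(ascending=False)
print(missing_share)

# Number of fully duplicated rows, a common side effect of one-to-many merges
n_duplicates = int(df.duplicated().sum())
print(n_duplicates)
```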

In [None]:
disease_age_iso_pop_doc_exp.head(3).T

| | 0 | 1 | 2 |
|---|---|---|---|
| Entity | Afghanistan | Afghanistan | Afghanistan |
| Code | AFG | AFG | AFG |
| Year | 2007-01-01 00:00:00 | 2007-01-01 00:00:00 | 2007-01-01 00:00:00 |
| Number of executions (Amnesty International) | 15 | 15 | 15 |
| Meningitis | 2933.0 | 2933.0 | 2933.0 |
| Alzheimer's disease and other dementias | 1402.0 | 1402.0 | 1402.0 |
| Parkinson's disease | 450.0 | 450.0 | 450.0 |
| Nutritional deficiencies | 2488.0 | 2488.0 | 2488.0 |
| Malaria | 393.0 | 393.0 | 393.0 |
| Drowning | 2127.0 | 2127.0 | 2127.0 |


<a name="4.0"></a>
### Visualizations Codes


<a name="4"></a>
Links to the visuals follow each block of code below.


#### Codes for Plot1 & Plot2

The first code gives a **bar graph** of the total deaths caused by diseases per Continent in the World

In [None]:
### plot1 - total death  by Continents in the World
continent_name = disease_iso_no_dup["Continent_Name"]
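disease_sum = disease_iso_no_dup.drop(["Continent_Name", "Year"], axis=1).sum(
    axis=1, numeric_only=True
)  # row totals over numeric columns only; text columns such as Country_Name are skipped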
death_sum_by_con = pd.concat([continent_name, disease_sum], axis=1)
death_sum_by_con.rename(
    columns={
        "Continent_Name": "continent_name",
        0: "death from diseases",
    },
    inplace=True,
)  # renames the columns
death_sum_by_con = death_sum_by_con.groupby(["continent_name"])["death from diseases"].sum()
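
The same continent totals can also be reached in one chain, which may be easier to read. This is only a sketch on invented numbers; `deaths` is a hypothetical stand-in for `disease_iso_no_dup`.

```python
import pandas as pd

# Made-up rows: one per country and year, with two cause-of-death columns
deaths = pd.DataFrame(
    {
        "Continent_Name": ["Africa", "Africa", "Europe"],
        "Year": pd.to_datetime(["2019-01-01"] * 3),
        "Malaria": [100.0, 50.0, 0.0],
        "Drowning": [10.0, 5.0, 2.0],
    }
)

# Sum every numeric cause column per continent, then add the causes together
total_by_continent = (
    deaths.drop(columns=["Year"])
    .groupby("Continent_Name")
    .sum(numeric_only=True)
    .sum(axis=1)
    .rename("death from diseases")
)
print(total_by_continent)
```

Summing per continent first and across causes second gives the same totals as summing each row first, because addition does not depend on the order.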


The second code gives a **bar graph** of the total deaths caused by diseases per Country in Africa.

In [None]:
### plot2 - total death by Countries in Africa
disease_iso_afr = disease_iso_no_dup[disease_iso_no_dup['Continent_Name']=='Africa']
country_name = disease_iso_afr["Country_Name"]
disease_sum_afr = disease_iso_afr.drop(
    ["Continent_Name", "Year", "Country_Name"], axis=1
).sum(axis=1, numeric_only=True)  # row totals over numeric columns only
death_sum_by_afr = pd.concat([country_name, disease_sum_afr], axis=1)
death_sum_by_afr.rename(
    columns={
        "Country_Name": "country_name",
        0: "death from diseases",
    },
    inplace=True,
)  # renames the columns
death_sum_by_afr = death_sum_by_afr.groupby(["country_name"])["death from diseases"].sum()


<a name="1&2b"></a>


[View visuals 1 & 2](#1&2)


<a name="5"></a>

#### Codes for Plot3 & Plot4

The third code gives a **horizontal bar graph** of the diseases and the total deaths caused by each of them in the World

In [None]:
### plot3 - Death causing Diseases in the World
world_death_causing_diseases = disease_df.drop(
    ["Entity", "Year", "Code"], axis=1
)
world_death_causing_diseases = world_death_causing_diseases.fillna(0)

The fourth code gives a **horizontal bar graph** of the diseases and the total deaths caused by each of them in Africa.

In [None]:
###  plot4 - Death causing Diseases in Africa
africa_death_causing_diseases = disease_iso_afr.drop(
    ["Continent_Name", "Year", "Country_Name"], axis=1
)

africa_death_causing_diseases = africa_death_causing_diseases.fillna(0)

<a name="3&4b"></a>


[View visuals 3 & 4](#3&4)


<a name="6"></a>


#### Codes for Plot5 & Plot6

The fifth code gives a **line graph** of the average death trend between 1990 and 2019 in the World

In [None]:
### plot5 - Death Trend over 30 years period in the World
year = disease_df["Year"]
average_death = world_death_causing_diseases.mean(
    axis=1, numeric_only=True
)  # averages the deaths across all numeric cause columns in each row
death_rate_over_time = pd.concat(
    [year, average_death], axis=1
)  # combines the year and average death from all causes in one df
death_rate_over_time.rename(
    columns={
        "Year": "year",
        0: "average_death",
    },
    inplace=True,
)  # renames the columns


In [None]:
death_rate_over_time["year"] = pd.DatetimeIndex(
    death_rate_over_time["year"]
).year  # takes only the year from the datetime column
death_rate_over_time = death_rate_over_time.groupby(
    by=["year"]
).mean()  # groups all data according to year and average up


The sixth code gives a **line graph** of the average death trend between 1990 and 2019 in Africa

In [None]:
### plot6 - Death Trend over 30 years period in Africa
year = disease_iso_afr["Year"]
average_death = africa_death_causing_diseases.mean(
    axis=1, numeric_only=True
)  # averages the deaths across all numeric cause columns in each row for Africa
death_rate_over_time_africa = pd.concat(
    [year, average_death], axis=1
)  # combines the year and average death from all causes in one df
death_rate_over_time_africa.rename(
    columns={
        "Year": "year",
        0: "average_death",
    },
    inplace=True,
)  # renames the columns


In [None]:
death_rate_over_time_africa["year"] = pd.DatetimeIndex(
    death_rate_over_time_africa["year"]
).year  # takes only the year from the datetime column
death_rate_over_time_africa = death_rate_over_time_africa.groupby(
    by=["year"]
).mean()  # groups all data according to year and average up

<a name="5&6b"></a>


[View visuals 5 & 6](#5&6)


<a name="7"></a>

#### Codes for Plot7 & Plot8

The seventh code plots a **pie chart** of recorded deaths by age group in the World


In [None]:
### plot7 - death by age group in the World
age_df_world = age_df.drop(["Entity", "Code", "Year"], axis=1)


The eighth code plots a **pie chart** of recorded deaths by age group in Africa


In [None]:
### plot8 - death by age group in Africa
age_df_africa = age_iso[age_iso.loc[:, "Continent_Name"] == "Africa"]
age_df_africa = age_df_africa.drop(
    [
        "Entity",
        "Code",
        "Year",
        "Continent_Name",
        "Country_Name",
        "Three_Letter_Country_Code",
    ],
    axis=1,
)


<a name="7&8b"></a>


[View visuals 7 & 8](#7&8)


<a name="8"></a>

#### Codes for Plot9 & Plot10


The ninth code plots a **line graph** of the average growth in population size between 1990 and 2019 in the World


In [None]:
### plot9- Average Population Growth over the 30 years in the world
years_p = disease_age_iso_pop["Year"]  # the full datetime Year column
population = disease_age_iso_pop["Population (historical estimates)"]
continent_name = disease_age_iso_pop["Continent_Name"]
populate_world = pd.concat(
    [years_p, population], axis=1
)  # combines the year and population in one df
populate_world["year"] = pd.DatetimeIndex(
    populate_world["Year"]
).year  # takes only the year from the datetime column

populate_world = populate_world.fillna(0) # fill NaN values with zero

populate_world_avg = populate_world.groupby(
    by=["year"]
).mean()  # groups all data according to year and average up


The tenth code plots a **line graph** of the average growth in population size between 1990 and 2019 in Africa


In [None]:
### plot10 - Average Population Growth over the 30 years in Africa
populate_africa_df = disease_age_iso_pop[
    disease_age_iso_pop.loc[:, "Continent_Name"] == "Africa"
]
years_a = populate_africa_df["Year"]
af_populate = populate_africa_df["Population (historical estimates)"]
country_name = populate_africa_df["Country_Name"]
populate_africa = pd.concat(
    [years_a, af_populate], axis=1
)  # combines the year and population in one df
populate_africa["Year"] = pd.DatetimeIndex(
    populate_africa["Year"]
).year  # takes only the year from the datetime column

populate_africa_avg = populate_africa.groupby(
    by=["Year"]
).mean()  # groups all data according to year and average up

<a name="9"></a>

#### Codes for Plot11 & Plot12

The eleventh code plots a **bar graph** that groups the total population from 1990 to 2019 according to the Continents in the World


In [None]:
### plot11- Total Population by Continents in the world
total_pop_continent = pd.concat(
    [continent_name, population], axis=1
)  # combines the continent_name and population in one df

total_pop_continent = total_pop_continent.dropna() # drop NaN rows from data

total_pop_continent_sum = total_pop_continent.groupby(["Continent_Name"])[
    "Population (historical estimates)"
].sum()


The twelfth code plots a **bar graph** that groups the total population from 1990 to 2019 by Country in Africa.


In [None]:
### plot12- Total Population by Countries in Africa
total_pop_africa_countries = pd.concat(
    [country_name, af_populate], axis=1
)  # combines the country_name and population in one df
total_pop_africa_countries_sum = total_pop_africa_countries.groupby(["Country_Name"])[
    "Population (historical estimates)"
].sum()


<a name="9&10b"></a>


[View visuals 9, 10, 11 & 12](#9&10)


<a name="10"></a>

#### Codes for Plot13 & Plot14

The thirteenth code plots a **pie chart** that groups the Medical Doctors in the World into 5 classes.


In [None]:
### plot13 - The Different Classes of Medical Doctors: World
doc_df = doc_df.drop(
    [
        "Applied filters:\nLocation type is Country\nLANGUAGE_CODE is en\nIndicatorCode is HWF_0001, HWF_0002, HWF_0003, HWF_0004, or HWF_0005",
        "Unnamed: 2",
        "Unnamed: 4",
        "Unnamed: 7",
    ],
    axis=1,
)

doc_df["Doctor_Value"] = doc_df["Doctor_Value"].str.replace(
    "\xa0", ""
)  # removes space (the space is in the form '\xa0') between digits
doc_df["Doctor_Value"] = doc_df["Doctor_Value"].astype(
    float
)  # converts the data type to float

fields_of_med_spec_world = doc_df["Indicator"].value_counts()
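
The separator clean-up above can be tried on a few invented values. The sketch uses `pd.to_numeric` with `errors="coerce"` instead of `astype(float)`, which is my substitution for illustration: it turns any leftover non-numeric text into NaN rather than raising.

```python
import pandas as pd

# Made-up values using a non-breaking space (\xa0) as the thousands separator
raw_values = pd.Series(["1\xa0234", "56", "7\xa0890.5"])

# Remove the separator, then convert; errors="coerce" turns leftovers into NaN
cleaned = pd.to_numeric(
    raw_values.str.replace("\xa0", "", regex=False), errors="coerce"
)
print(cleaned.tolist())
```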


The fourteenth code plots a **pie chart** that groups the Medical Doctors in Africa into 5 classes.


In [None]:
### plot14 - The Different Classes of Medical Doctors: Africa
doc_afr = doc_df[doc_df["Continent_Name"] == "Africa"]
fields_of_med_spec_afr = doc_afr["Indicator"].value_counts()


<a name="13&14b"></a>


[View visuals 13 & 14](#13&14)


In [None]:
Medical_doctors_in_number = doc_df[doc_df["Indicator"] == "Medical doctors (number)"]
Medical_doctors_per_10000 = doc_df[
    doc_df["Indicator"] == "Medical doctors (per 10,000)"
]
Medical_doctors_not_further_defined = doc_df[
    doc_df["Indicator"] == "Medical doctors not further defined (number)"
]
Generalist_medical_practitioners = doc_df[
    doc_df["Indicator"] == "Generalist medical practitioners (number)"
]
Specialist_medical_practitioners = doc_df[
    doc_df["Indicator"] == "Specialist medical practitioners (number)"
]


In [None]:
doctors_in_number_afr = Medical_doctors_in_number[
    Medical_doctors_in_number["Continent_Name"] == "Africa"
]
doctors_per_10000_afr = Medical_doctors_per_10000[
    Medical_doctors_per_10000["Continent_Name"] == "Africa"
]
doctors_undefined_afr = Medical_doctors_not_further_defined[
    Medical_doctors_not_further_defined["Continent_Name"] == "Africa"
]
doctors_generalist_afr = Generalist_medical_practitioners[
    Generalist_medical_practitioners["Continent_Name"] == "Africa"
]
doctors_specialist_afr = Specialist_medical_practitioners[
    Specialist_medical_practitioners["Continent_Name"] == "Africa"
]


<a name="14.1"></a>


#### Codes for Plot14_1

This sub-fourteenth code plots a **stacked bar chart** of the average value for each of the 5 classes of Medical Doctors in the World and in Africa.


In [None]:
# collect data for the 5 classes for year corresponding to 2016 from the whole dataframe
Medical_doctors_in_number_2016 = Medical_doctors_in_number[
    Medical_doctors_in_number.Year == "2016-01-01"
] # subset out data of Medical_doctors_in_number for 2016
Medical_doctors_per_10000_2016 = Medical_doctors_per_10000[
    Medical_doctors_per_10000.Year == "2016-01-01"
] # subset out data of Medical_doctors_per_10000 for 2016
Medical_doctors_not_further_defined_2016 = Medical_doctors_not_further_defined[
    Medical_doctors_not_further_defined.Year == "2016-01-01"
] # subset out data of Medical_doctors_not_further_defined for 2016
Generalist_medical_practitioners_2016 = Generalist_medical_practitioners[
    Generalist_medical_practitioners.Year == "2016-01-01"
] # subset out data of Generalist_medical_practitioners for 2016
Specialist_medical_practitioners_2016 = Specialist_medical_practitioners[
    Specialist_medical_practitioners.Year == "2016-01-01"
] # subset out data of Specialist_medical_practitioners for 2016


In [None]:
# collect data for the 5 classes for year corresponding to 2016 from Africa dataframe
doctors_in_number_afr_2016 = doctors_in_number_afr[
    doctors_in_number_afr.Year == "2016-01-01"
]
doctors_per_10000_afr_2016 = doctors_per_10000_afr[
    doctors_per_10000_afr.Year == "2016-01-01"
]
doctors_undefined_afr_2016 = doctors_undefined_afr[
    doctors_undefined_afr.Year == "2016-01-01"
]
doctors_generalist_afr_2016 = doctors_generalist_afr[
    doctors_generalist_afr.Year == "2016-01-01"
]
doctors_specialist_afr_2016 = doctors_specialist_afr[
    doctors_specialist_afr.Year == "2016-01-01"
]


In [None]:
### Plot14_1 - Average Doctors in the 5 Classes of Practice in 2016 (World vs Africa)
# create a dictionary from the mean values for each class in the World and Africa
doctor_class = {
    "Medical_doctors_in_number": [
        Medical_doctors_in_number_2016.Doctor_Value.mean(),
        doctors_in_number_afr_2016.Doctor_Value.mean(),
    ],
    "Medical_doctors_per_10000": [
        Medical_doctors_per_10000_2016.Doctor_Value.mean(),
        doctors_per_10000_afr_2016.Doctor_Value.mean(),
    ],
    "Medical_doctors_not_further_defined": [
        Medical_doctors_not_further_defined_2016.Doctor_Value.mean(),
        doctors_undefined_afr_2016.Doctor_Value.mean(),
    ],
    "Generalist_practitioners": [
        Generalist_medical_practitioners_2016.Doctor_Value.mean(),
        doctors_generalist_afr_2016.Doctor_Value.mean(),
    ],
    "Specialist_practitioners": [
        Specialist_medical_practitioners_2016.Doctor_Value.mean(),
        doctors_specialist_afr_2016.Doctor_Value.mean(),
    ],
}
# creating a Dataframe object from dictionary with custom indexing
world_africa_practitioners = pd.DataFrame(doctor_class, index=["World", "Africa"]).T
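
The five-way split followed by ten separate means could also be expressed with a single `groupby` on the indicator. This is a sketch of that alternative on invented rows (`doc` is a hypothetical stand-in for the cleaned `doc_df`), not the approach the cells above take.

```python
import pandas as pd

# Made-up long-format rows imitating doc_df after cleaning
doc = pd.DataFrame(
    {
        "Indicator": [
            "Medical doctors (number)",
            "Medical doctors (number)",
            "Medical doctors (per 10,000)",
            "Medical doctors (per 10,000)",
        ],
        "Continent_Name": ["Africa", "Europe", "Africa", "Europe"],
        "Year": pd.to_datetime(["2016-01-01"] * 4),
        "Doctor_Value": [1000.0, 5000.0, 2.0, 30.0],
    }
)

in_2016 = doc[doc["Year"] == "2016-01-01"]

# Mean per indicator for all countries and for Africa only, side by side
world_vs_africa = pd.DataFrame(
    {
        "World": in_2016.groupby("Indicator")["Doctor_Value"].mean(),
        "Africa": in_2016[in_2016["Continent_Name"] == "Africa"]
        .groupby("Indicator")["Doctor_Value"]
        .mean(),
    }
)
print(world_vs_africa)
```

The result has one row per class and one column per region, the same shape as `world_africa_practitioners`.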


<a name="14_1b"></a>


[View visuals 14_1](#14_1)


<a name="11"></a>

#### Codes for Plot15 & Plot16

First, average each class of the Medical Doctors data by year.




In [None]:
#  Doctors in Number in the World df
Doctors_in_number = Medical_doctors_in_number.groupby(
    by=["Year"]
).mean(numeric_only=True)  # averages the numeric columns within each year
Doctors_in_number.rename(
    columns={"Doctor_Value": "Doctors_in_number"},
    inplace=True,
)  # renames the columns


In [None]:
#  Doctors per 10000 in the World df
Doctors_per_10000 = Medical_doctors_per_10000.groupby(
    by=["Year"]
).mean(numeric_only=True)  # averages the numeric columns within each year
Doctors_per_10000.rename(
    columns={"Doctor_Value": "Doctors_per_10000"},
    inplace=True,
)  # renames the columns


In [None]:
# Doctors not Further Defined in the World df
Doctors_undefined = Medical_doctors_not_further_defined.groupby(
    by=["Year"]
).mean(numeric_only=True)  # averages the numeric columns within each year
Doctors_undefined.rename(
    columns={"Doctor_Value": "Doctors_undefined"},
    inplace=True,
)  # renames the column on the grouped result, not on the source dataframe


In [None]:
#  Generalist Doctors in the World df
Doctors_Generalist = Generalist_medical_practitioners.groupby(
    by=["Year"]
).mean(numeric_only=True)  # averages the numeric columns within each year
Doctors_Generalist.rename(
    columns={"Doctor_Value": "Doctors_Generalist"},
    inplace=True,
)  # renames the columns


In [None]:
# Specialist Doctors in the World df
Doctors_Specialist = Specialist_medical_practitioners.groupby(
    by=["Year"]
).mean(numeric_only=True)  # averages the numeric columns within each year
Doctors_Specialist.rename(
    columns={"Doctor_Value": "Doctors_Specialist"},
    inplace=True,
)  # renames the columns


In [None]:
# Doctors in Number in Africa df
doctors_in_number = doctors_in_number_afr.groupby(
    by=["Year"]
).mean(numeric_only=True)  # averages the numeric columns within each year
doctors_in_number.rename(
    columns={"Doctor_Value": "doctors_in_number"},
    inplace=True,
)  # renames the columns


In [None]:
# Doctors per 10000 in Africa df
doctors_per_10000 = doctors_per_10000_afr.groupby(
    by=["Year"]
).mean(numeric_only=True)  # averages the numeric columns within each year
doctors_per_10000.rename(
    columns={"Doctor_Value": "doctors_per_10000"},
    inplace=True,
)  # renames the columns


In [None]:
# Doctors not Further Defined in Africa df 
doctors_undefined = doctors_undefined_afr.groupby(
    by=["Year"]
).mean()  # groups all data according to year and average up
doctors_undefined.rename(
    columns={"Doctor_Value": "doctors_undefined"},
    inplace=True,
)  # renames the columns


In [None]:
# Generalist Doctors in Africa df
doctors_generalist = doctors_generalist_afr.groupby(
    by=["Year"]
).mean()  # groups all data according to year and average up
doctors_generalist.rename(
    columns={"Doctor_Value": "doctors_generalist"},
    inplace=True,
)  # renames the columns


In [None]:
# Specialist Doctors in Africa df
doctors_specialist = doctors_specialist_afr.groupby(
    by=["Year"]
).mean()  # groups all data according to year and average up
doctors_specialist.rename(
    columns={"Doctor_Value": "doctors_specialist"},
    inplace=True,
)  # renames the columns
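
The cells above all follow the same pattern: group by `Year`, average, then rename the value column. As a minimal sketch, that pattern could be wrapped in one helper. The name `yearly_mean` and the toy data are hypothetical and not part of the original notebook. Unlike the cells above, the helper averages only the value column, which is an assumption made here so that non-numeric columns do not interfere.

```python
import pandas as pd


def yearly_mean(df: pd.DataFrame, new_name: str, value_col: str = "Doctor_Value") -> pd.DataFrame:
    """Average `value_col` per Year and rename the result to `new_name`."""
    return (
        df.groupby("Year")[[value_col]]  # keep only the value column (assumption)
        .mean()
        .rename(columns={value_col: new_name})
    )


# Toy stand-in for one of the doctor tables (values are made up)
toy = pd.DataFrame(
    {
        "Year": [2015, 2015, 2016],
        "Country": ["A", "B", "A"],
        "Doctor_Value": [2.0, 4.0, 6.0],
    }
)
toy_yearly = yearly_mean(toy, "doctors_per_10000")
print(toy_yearly)
```

With such a helper, each of the ten tables would need a single call, for example `yearly_mean(doctors_per_10000_afr, "doctors_per_10000")`.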


The fifteenth code prepares the data for a **line graph** that shows changes in the classes of Medical Doctors over a 30-year period in the World.


In [None]:
### plot15 - Changes per Classes of Medical Doctors in 30years: World
dfs = [
    Doctors_in_number,
    Doctors_per_10000,
    Doctors_undefined,
    Doctors_Generalist,
    Doctors_Specialist,
]
med_doctor_line_world = ft.reduce(
    lambda left, right: pd.merge(left, right, on="Year"), dfs
) # merge values in a dataframe


The sixteenth code likewise prepares the data for a **line graph** that shows changes in the classes of Medical Doctors over a 20-year period in Africa.


In [None]:
### plot16 - Changes per Classes of Medical Doctors in 20years: Africa
dfs_afr = [
    doctors_in_number,
    doctors_per_10000,
    doctors_undefined,
    doctors_generalist,
    doctors_specialist,
]
med_doctor_line_afr = ft.reduce(
    lambda left, right: pd.merge(left, right, on="Year"), dfs_afr
) # merge values in a dataframe
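
The `ft.reduce` calls above chain `pd.merge` across a list of yearly tables. A minimal sketch of what that does, using made-up toy frames (the names `a`, `b`, `c` are hypothetical): because `pd.merge` defaults to an inner join, only the years present in every table survive the merge.

```python
import functools as ft

import pandas as pd

# Toy yearly tables indexed by Year (values are made up)
a = pd.DataFrame({"a": [1.0, 2.0, 3.0]}, index=pd.Index([2000, 2001, 2002], name="Year"))
b = pd.DataFrame({"b": [10.0, 20.0]}, index=pd.Index([2001, 2002], name="Year"))
c = pd.DataFrame({"c": [100.0, 200.0, 300.0]}, index=pd.Index([2001, 2002, 2003], name="Year"))

# Pairwise inner merges on the Year index level, folded over the list
merged = ft.reduce(lambda left, right: pd.merge(left, right, on="Year"), [a, b, c])

# An alternative that aligns on the index in one call
aligned = pd.concat([a, b, c], axis=1, join="inner")

print(merged)
```

This may be one reason the Africa line graph covers fewer years than the World one: a year missing from any single class drops out of the merged table.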

<a name="15&16b"></a>


[View visuals 15 & 16](#15&16)


<a name="12"></a>

#### Code for Plot17

The seventeenth code shows a **bar graph**  that groups health expenditure according to Continents.


In [None]:
### plot17- Average Expenditure per Continent
exp_iso = pd.merge(
    exp_melt,
    iso_df.drop(['Continent_Code','Two_Letter_Country_Code','Country_Name'], axis = 1),
    how="left",
    left_on=["Country Code"],
    right_on=["Three_Letter_Country_Code"],
)
exp_iso = exp_iso.dropna() # drop rows with NaN values. eg Continent_Name

continent_name = exp_iso["Continent_Name"]
expenditure = exp_iso["Current health expenditure (% of GDP)"]
expend_by_cont = pd.concat([continent_name, expenditure], axis=1)
expend_by_continent = expend_by_cont.groupby(["Continent_Name"])["Current health expenditure (% of GDP)"].mean()
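
Before dropping rows with `dropna()`, it can help to see which country codes found no continent in the left merge. The sketch below uses made-up stand-ins for `exp_melt` and `iso_df` (`exp_toy`, `iso_toy` and their values are hypothetical) and the `indicator=True` option of `pd.merge`.

```python
import pandas as pd

# Toy stand-ins for exp_melt and iso_df (values are made up)
exp_toy = pd.DataFrame(
    {
        "Country Code": ["NGA", "KEN", "WLD"],
        "Current health expenditure (% of GDP)": [3.4, 4.6, 9.8],
    }
)
iso_toy = pd.DataFrame(
    {
        "Three_Letter_Country_Code": ["NGA", "KEN"],
        "Continent_Name": ["Africa", "Africa"],
    }
)

checked = pd.merge(
    exp_toy,
    iso_toy,
    how="left",
    left_on="Country Code",
    right_on="Three_Letter_Country_Code",
    indicator=True,  # adds a "_merge" column: "both" or "left_only"
)

# Codes that matched no continent, here the aggregate row "WLD"
unmatched = checked.loc[checked["_merge"] == "left_only", "Country Code"].tolist()
print(unmatched)

# Drop only the rows without a continent, leaving other NaN values untouched
kept = checked.dropna(subset=["Continent_Name"]).drop(columns="_merge")
```

Passing `subset=` to `dropna` is a design choice: a bare `dropna()` also removes rows whose expenditure value is missing, which may or may not be intended.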

In [None]:
expend_by_cont_asia = expend_by_cont[expend_by_cont.Continent_Name == "Asia"]
expend_by_cont_europe = expend_by_cont[expend_by_cont.Continent_Name == "Europe"]
expend_by_cont_north = expend_by_cont[expend_by_cont.Continent_Name == "North America"]
expend_by_cont_africa = expend_by_cont[expend_by_cont.Continent_Name == "Africa"]
expend_by_cont_oceania = expend_by_cont[expend_by_cont.Continent_Name == "Oceania"]
expend_by_cont_South = expend_by_cont[expend_by_cont.Continent_Name == "South America"]

<a name="13"></a>

#### Code for Plot18

The eighteenth code shows a **stacked bar graph**  that shows the  maximum and minimum health expenditure according to Continents.


In [None]:
### plot18- Min and Max Percentage Of GDP Spent on Health for the Continents
# create a dictionary from the minimum and maximum expenditure for the continents
dt = {
    "Asia": [
        expend_by_cont_asia["Current health expenditure (% of GDP)"].min(),
        expend_by_cont_asia["Current health expenditure (% of GDP)"].max(),
    ],
    "Europe": [
        expend_by_cont_europe["Current health expenditure (% of GDP)"].min(),
        expend_by_cont_europe["Current health expenditure (% of GDP)"].max(),
    ],
    "Africa": [
        expend_by_cont_africa["Current health expenditure (% of GDP)"].min(),
        expend_by_cont_africa["Current health expenditure (% of GDP)"].max(),
    ],
    "Oceania": [
        expend_by_cont_oceania["Current health expenditure (% of GDP)"].min(),
        expend_by_cont_oceania["Current health expenditure (% of GDP)"].max(),
    ],
    "South America": [
        expend_by_cont_South["Current health expenditure (% of GDP)"].min(),
        expend_by_cont_South["Current health expenditure (% of GDP)"].max(),
    ],
    "North America": [
        expend_by_cont_north["Current health expenditure (% of GDP)"].min(),
        expend_by_cont_north["Current health expenditure (% of GDP)"].max(),
    ],
}
# creating a Dataframe object from dictionary with custom indexing
min_max_range = pd.DataFrame(dt, index=["min_exp (% of GDP)", "max_exp (% of GDP)"]).T
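
The same minimum and maximum table could also be built without filtering each continent by hand. This is only a sketch on made-up data (`toy` and `min_max_toy` are hypothetical names), using `groupby` with `agg`.

```python
import pandas as pd

col = "Current health expenditure (% of GDP)"

# Toy stand-in for expend_by_cont (values are made up)
toy = pd.DataFrame(
    {
        "Continent_Name": ["Africa", "Africa", "Asia", "Asia", "Europe"],
        col: [2.0, 6.5, 3.0, 9.0, 8.0],
    }
)

# One row per continent, one column per statistic
min_max_toy = (
    toy.groupby("Continent_Name")[col]
    .agg(["min", "max"])
    .rename(columns={"min": "min_exp (% of GDP)", "max": "max_exp (% of GDP)"})
)
print(min_max_toy)
```

The result has the same layout as `min_max_range`, with continents as the index and the two statistics as columns.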


<a name="17&18b"></a>


[View visuals 17 & 18](#17&18)


<a name="14"></a>

#### Code for Plot19 & Plot20

The nineteenth code outputs a **scatter plot**  that shows the relationship between health expenditure and total population in the Continents.

In [None]:
### plot19- Relationship Between Health Expenditure and Total Population
expend_populat = pd.merge(
    expend_by_continent * 3,  # scale expenditure 3x so it reflects in the size of the bubbles
    total_pop_continent_sum,
    how="left",
    left_on=["Continent_Name"],
    right_on=["Continent_Name"]
).reset_index()


The twentieth code outputs a **bubble plot**  that displays the relationship between death by diseases and population, and the size of the bubble is average health expenditure in the Continents.

In [None]:
### plot20 - The Effect of Health Expenditure on Total Population and Death from Diseases
expend_populate_death = pd.merge(
    expend_populat,
    death_sum_by_con,
    how="left",
    left_on=["Continent_Name"],
    right_on=["continent_name"]
)

<a name="19&20b"></a>


[View visuals 19 & 20](#19&20)


<a name="15"></a>

#### Code for Plot21 & Plot22

The twenty-first code outputs a **scatter plot**  that displays the relationship between Percentage of GDP spent on Health and Average Death in the World.

In [None]:
### plot21- Percentage of GDP spent on Health and Average Death in the world
exp_over_time = exp_melt.groupby(["year"])[
    "Current health expenditure (% of GDP)"
].mean()
expend_death_over_time_world = pd.concat(
    [death_rate_over_time, exp_over_time], axis=1, join="inner"
)


The twenty-second code outputs a **scatter plot**  that displays the relationship between Percentage of GDP spent on Health and Average Death by diseases in Africa.

In [None]:
### plot22- Percentage of GDP spent on Health and Average Death in Africa
exp_africa = exp_iso[exp_iso.Continent_Name == "Africa"]
exp_over_time_afr = exp_africa.groupby(["year"])[
    "Current health expenditure (% of GDP)"
].mean()
expend_death_over_time_afr = pd.concat(
    [death_rate_over_time_africa, exp_over_time_afr], axis=1, join="inner"
)


<a name="21&22b"></a>


[View visuals 21 & 22](#21&22)


<a name="16"></a>

## Visualizations & Insights

**How Many Continents are Captured in this Data?**

In [None]:
print(
    f"\nThere are {iso_df.Continent_Name.nunique()} Continents captured in the data which are: {iso_df.Continent_Name.unique()}\n"
)


There are 7 Continents captured in the data which are: ['Asia' 'Europe' 'Antarctica' 'Africa' 'Oceania' 'North America'
 'South America']



**How Many Countries Data are Included in this Analysis?**

In [None]:
print(f'\nThere are {iso_df.Country_Name.nunique()} Countries in total')


There are 254 Countries in total


**What are the Numbers of African Countries Alone?**

In [None]:
africa_countries = iso_df[iso_df.loc[:, "Continent_Name"] == "Africa"]
print(f"\nThere are {africa_countries.Country_Name.nunique()} Countries from Africa")


There are 58 Countries from Africa


**What is the Period covered by This data?**

In [None]:
print(
    f" \nThe period covered by the death data is {disease_df.Year.nunique()} years, which ranges from {pd.DatetimeIndex(disease_df.Year).year.min()} to {pd.DatetimeIndex(disease_df.Year).year.max()}"
)

 
The period covered by the death data is 30 years, which ranges from 1990 to 2019


**What is the Total death Caused by Diseases in the World?**



In [None]:
print(
    f"\nThe total death by diseases in the world within this period is {world_death_causing_diseases.sum(axis=1).sum()}"
)


The total death by diseases in the world within this period is 8601548690.0


**What is the total Death by Age Group in the World?**

In [None]:
print(
    f"\nThe total death by Age-group in the world within this period is {age_df_world.sum(axis=1).sum()}"
)


The total death by Age-group in the world within this period is 9027780193


**What is the total Death by Age Group in Africa?**

In [None]:
print(
    f"\nThe total death by Age-group in Africa within this period is {age_df_africa.sum(axis=1).sum()}, which is {round((age_df_africa.sum(axis=1).sum()/age_df_world.sum(axis=1).sum())*100,2)}% of the World's total."
)


The total death by Age-group in Africa within this period is 278690736, which is 3.09% of the World's total.


**What is the Average Population of the World and the African Continent?**

In [None]:
t_po_w = round(disease_age_iso_pop['Population (historical estimates)'].mean())
t_po_a = round(populate_africa_df['Population (historical estimates)'].mean())
print(f'\nAverage yearly population, as given in this data, of the World is {t_po_w}')
print(f'\nAverage yearly population, as given by the data, in Africa is {t_po_a}')


Average yearly population, as given in this data, of the World is 127819675

Average yearly population, as given by the data, in Africa is 17406523


[Results & Recommendations](#50)


<a name="1&2"></a>

**What is the Amount of Death for each Continent and Which Continent Rates the Highest in Death Count?**

In [None]:
plot1 = death_sum_by_con.plot_bokeh(
    kind="bar", 
    title="Total Death by Continents in the World",
    xlabel='Count',
    plot_data_points=False,
    show_figure =False,
)
plot2 = death_sum_by_afr.plot_bokeh(
    kind="bar", 
    title="Total Death by Countries in Africa",
    xlabel='Count',
    plot_data_points=False,
    show_figure =False,
    legend='top_left',
)
pandas_bokeh.plot_grid([[plot1,plot2]], plot_width=550, plot_height=400 )

<a name="4"></a>


[Back to code](#1&2b)


<a name="3&4"></a>

**What are the Major Causes of Death?**

**Are There differences in The Major causes of Death in the World vs African Continent?**

In [None]:
plot3 = (
    world_death_causing_diseases.mean()
    .sort_values()
    .plot_bokeh(
        kind="barh",
        title="Death Causing Diseases in the World",
        xlabel="Count",
        plot_data_points=False,
        show_figure=False,
        legend=False,
    )
)
plot4 = (
    africa_death_causing_diseases.mean()
    .sort_values()
    .plot_bokeh(
        kind="barh",
        title="Death Causing Diseases in Africa",
        xlabel="Count",
        plot_data_points=False,
        show_figure=False,
        legend=False,
    )
)
pandas_bokeh.plot_grid([[plot3, plot4]], plot_width=550, plot_height=450)


[Back to code](#3&4b)

<a name="5&6"></a>

**What is the Death Trend Over The 30 years Under Consideration?**

In [None]:
plot5 = death_rate_over_time.plot_bokeh(
    kind="line",
    title="Death Trend over 30 years period in the World",
    ylabel="Count",
    plot_data_points=False,
    show_figure=False,
    legend='top_left',
)
plot6 = death_rate_over_time_africa.plot_bokeh(
    kind="line",
    title="Death Trend over 30 years period in Africa",
    ylabel="Count",
    plot_data_points=False,
    show_figure=False,
)
pandas_bokeh.plot_grid([[plot5, plot6]], plot_width=550, plot_height=400)


[Back to code](#5&6b)

<a name="7&8"></a>
**Which Age Group Accounts for the highest Death Rate?**

In [None]:
plot7 = age_df_world.mean().plot_bokeh(
    kind="pie",
    title="Death by Age Group in the World",
    plot_data_points=False,
    show_figure=False,
)
plot8 = age_df_africa.mean().plot_bokeh(
    kind="pie",
    title="Death by Age Group in Africa",
    plot_data_points=False,
    show_figure=False,
)
pandas_bokeh.plot_grid([[plot7, plot8]], plot_width=550, plot_height=550)




[Back to code](#7&8b)

<a name="9&10"></a>

**How Has the Population Changed in the World and Africa over the Years under Consideration?**

**Which Continent Has the Largest Population in the World?**

**Which Countries Have the Largest and Smallest Populations in Africa?**

In [None]:
plot9 = populate_world_avg.plot_bokeh(
    kind="line",
    title="Average Population Growth over the 30 years in the world",
    ylabel="Count",
    plot_data_points=False,
    show_figure=False,
    legend='top_left',
)
plot10 = populate_africa_avg.plot_bokeh(
    kind="line",
    title="Average Population Growth over the 30 years in Africa",
    ylabel="Count",
    plot_data_points=False,
    show_figure=False,
    legend='top_left',
)
plot11 = total_pop_continent_sum.plot_bokeh(
    kind="bar",
    title="Total Population by Continents in the world",
    xlabel="Count",
    plot_data_points=False,
    show_figure=False,
)
plot12 = total_pop_africa_countries_sum.plot_bokeh(
    kind="bar",
    title="Total Population by Countries in Africa",
    xlabel="Count",
    plot_data_points=False,
    show_figure=False,
    legend='top_left',
)
pandas_bokeh.plot_grid(
    [[plot9, plot10], [plot11, plot12]], plot_width=550, plot_height=400
)


[Back to code](#9&10b)

<a name="13&14"></a>

**What are the Proportions of the Classes of Medical Doctors in Africa and the World?**

In [None]:
plot13 = fields_of_med_spec_world.plot_bokeh(
    kind="pie",
    title="The Proportion of the Classes of Medical Doctors: World",
    plot_data_points=False,
    show_figure=False,
)
plot14 = fields_of_med_spec_afr.plot_bokeh(
    kind="pie",
    title="The Proportion of the Classes of Medical Doctors: Africa",
    plot_data_points=False,
    show_figure=False,
)
pandas_bokeh.plot_grid([[plot13, plot14]], plot_width=550, plot_height=550
)





[Back to code](#13&14b)

<a name="14_1"></a>
**What are the average differences in the 5 classes of Medical Doctors in the World and Africa?**

In [None]:
plot14_1 = world_africa_practitioners.plot_bokeh(
    kind="bar",
    title="Average Doctors in the 5 Classes of Practice in 2016 (World vs Africa)",
    xlabel="Medical Doctors",
    plot_data_points=False,
    show_figure=False,
    legend='top_right',
)
pandas_bokeh.plot_grid([[plot14_1]], plot_width=1100, plot_height=400
)


[Back to code](#14_1b)

<a name="15&16"></a>

**How Has the Proportion of these Classes Changed over the Years?**



In [None]:
plot15 = med_doctor_line_world.plot_bokeh(
    kind="line",
    title="Changes in the Classes of Medical Doctors in 30 years: World",
    ylabel="Count",
    plot_data_points=False,
    show_figure=False,
    legend='top_center',
)
plot16 = med_doctor_line_afr.plot_bokeh(
    kind="line",
    title="Changes in the Classes of Medical Doctors in 20 years: Africa",
    ylabel="Count",
    plot_data_points=False,
    show_figure=False,
    legend='top_left',
)
pandas_bokeh.plot_grid([[plot15, plot16]], plot_width=550, plot_height=400)


[Back to code](#15&16b)

<a name="17&18"></a>

**What Percent of GDP is Spent on Health by each Continent?**

**What are the Minimum and Maximum Percentages that have been Spent on Health by these Continents?**

In [None]:
plot17 = expend_by_continent.plot_bokeh(
    kind="bar",
    title="Percentage of GDP Spent on Health for the Continent",
    xlabel="Count",
    plot_data_points=False,
    show_figure=False,
    legend='top_left'
)
plot18 = min_max_range.plot_bokeh(
    kind="bar",
    title="Min and Max Percentage Of GDP Spent on Health for the Continents",
    xlabel="Count",
    plot_data_points=False,
    show_figure=False,
)
pandas_bokeh.plot_grid(
    [[plot17, plot18]], plot_width=550, plot_height=400
)


[Back to code](#17&18b)


<a name="19&20"></a>
**What is the Relationship Between Population, Health Expenditure and Death?**

In [None]:
plot19 = expend_populat.plot_bokeh(
    kind="scatter",
    title="Relationship Between Health Expenditure and Total Population", 
    x="Population (historical estimates)", 
    y="Current health expenditure (% of GDP)",
    category="Continent_Name",
    show_figure=False
)
plot20 = expend_populate_death.plot_bokeh(
    kind="scatter",
    title="The Effect of Health Expenditure on Total Population and Death from Diseases", 
    x="Population (historical estimates)", 
    y="death from diseases",
    category="Continent_Name",
    size="Current health expenditure (% of GDP)",
    legend='top_left',
    show_figure=False
)
pandas_bokeh.plot_grid(
    [[plot19, plot20]], plot_width=550, plot_height=400
)


[Back to code](#19&20b)

<a name="21&22"></a>

**How does Health Expenditure affect Death?**

In [None]:
plot21 = expend_death_over_time_world.plot_bokeh(
    kind="scatter",
    title="Percentage of GDP spent on Health and Average Death in the World", 
    x="Current health expenditure (% of GDP)", 
    y="average_death",
    legend=False,
    show_figure=False
)
plot22 = expend_death_over_time_afr.plot_bokeh(
    kind="scatter",
    title="Percentage of GDP spent on Health and Average Death in Africa",
    x="Current health expenditure (% of GDP)", 
    y="average_death",
    legend=False,
    show_figure=False
)
pandas_bokeh.plot_grid(
    [[plot21,plot22]], plot_width=550, plot_height=400
)

<a name="50"></a>
[Back to code](#21&22b)

**It's at this point that I would love to drop my *mouse*.**

But, before I do that, here are the results, recommendations and conclusion.

Firstly,

<a name="17"></a>

### A Brief History of the African Continent
Africa is the world's second-largest and second-most populous continent, after Asia in both cases. At about 30.3 million km² (11.7 million square miles) including adjacent islands, it covers 6% of Earth's total surface area and 20% of its land area. With 1.4 billion people as of 2021, it accounts for about 18% of the world's human population [Wikipedia](https://en.wikipedia.org/wiki/Africa#:~:text=Africa%20is%20the%20world%27s%20second,20%25%20of%20its%20land%20area).

The continent has 40 percent of the world’s gold and up to 90 percent of its chromium and platinum. The largest reserves of cobalt, diamonds, platinum and uranium in the world are in Africa. It holds 65 percent of the world’s arable land and ten percent of the planet’s internal renewable fresh water sources [UNEP](https://www.unep.org/regions/africa/our-work-africa).

There are 54 countries in Africa today, according to the United Nations. The full list is shown in the table in the link, with current population and subregion ([based on the United Nations official statistics](https://www.worldometers.info/geography/how-many-countries-in-africa/)). Not included in this total of "countries" and listed separately are:

*   Dependencies (or dependent territories, dependent areas) or Areas of Special Sovereignty (autonomous territories).


<a name="18"></a>
## Results from this Analysis

All 7 Continents were captured in the data: Asia, Europe, Antarctica, Africa, Oceania, North America, and South America. There were `254` Countries in total, and `58` of them were African Countries. The period covered by the death data was 30 years, from 1990 to 2019. The total death by diseases in the World within this period was 8,601,548,690.

The total death by Age-group in the World within this period was 9,027,780,193. The total death by Age-group in Africa within this period was 278,690,736, which was 3.09% of the World's total. The average yearly population, as given in this data, was 127,819,675 for the World and 17,406,523 for Africa.



Africa rated second, after Asia, among the Continents with the highest record of death from the various diseases. In Africa, Nigeria topped the list.

Malaria, HIV/AIDS and Diarrheal diseases were the major causes of death in Africa: the Continent accounted for 67.81%, 50.51% and 32.31%, respectively, of the World's deaths from these diseases. Following these were Neonatal disorders (26.81%), Tuberculosis (24.31%), Lower Respiratory infections (23.84%) and Digestive disorders (11.48%).

Though Cardiovascular disease was at the top of the list in Africa, Africa only accounted for 6.15% of the World's deaths from this disease. The same went for Neoplasms and Chronic respiratory diseases, where Africa accounted for 4.37% and 5.0% respectively. Relative to the World, these diseases were not dominant in Africa.

Over the years, the record of average death was on the increase in the World. Africa had her fair share too until 2003, when it took a turn, and there was a decrease. Average death record increased drastically in the World in 2017.

Moving further to death by Age-group, Africa recorded its highest number of deaths in the age bracket `Under 5 years` and its lowest in the age bracket `5 - 14 years`. The latter was also the case in the World. In contrast, the highest death by Age-group in the World was recorded in the age bracket `70 years +`.

In the midst of all this, replenishing was taking place. Asia recorded the highest population increase within the period under study, while Africa, as usual, followed in second position. Within the African Continent, Nigeria recorded the highest population growth, along with Egypt, Ethiopia and Congo, to mention a few.

The average population of the World increased until 2017, the same time the highest death record started. At this point, there was a drastic fall from about 37 million (`3.757e+07`) to about 20 million (`2.028e+07`). This amounted to an approximately 46.23% decrease in the average population of the World. Meanwhile, Africa showed no sign of being affected, as her population kept growing.

How did the health system come into the picture?

The medical practitioners were sub-divided into `5` groups:

*   Medical Doctors (per 10,000)

*   Medical Doctors (numbers)

*   Generalist Medical Practitioners (numbers)

*   Specialist medical Practitioners (numbers)

*   Medical Doctors not Further Defined (numbers)

In the area of Medical Doctors per 10,000 population, Africa recorded an average of 14% of the World's figure. On average, there were `21` Medical Doctors per `10,000` population in the World but only `3` per `10,000` in Africa, a shortfall of 86.67%.

Let's take a closer look at the year 2016. Medical Doctors per 10,000 population were `21` in the World and approximately `4` in Africa, a difference of 80.95%. Medical Doctors (numbers) were `72,800` in the World and `8,972` in Africa, a difference of 87.68%.

Generalist Medical Practitioners (numbers) were `17,870` in the World and approximately `562` in Africa, a difference of 96.86%. Specialist Medical Practitioners (numbers) were `34,770` in the World and approximately `345` in Africa, a difference of about 99%. As for Medical Doctors not Further Defined (numbers), there were `71,550` in the World and `10,920` in Africa, a difference of 84.74%.
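
The 2016 percentage differences can be recomputed from the rounded counts quoted in the text. This is a minimal sketch: the helper name `pct_gap` is hypothetical, and because the inputs are the rounded figures given here, the results may differ slightly from values computed on the unrounded data.

```python
def pct_gap(world: float, africa: float) -> float:
    """Percentage by which the Africa figure falls short of the World figure."""
    return round((world - africa) / world * 100, 2)


# (World, Africa) averages for 2016, as quoted in the text
figures_2016 = {
    "Medical Doctors (per 10,000)": (21, 4),
    "Medical Doctors (numbers)": (72_800, 8_972),
    "Generalist Medical Practitioners (numbers)": (17_870, 562),
    "Specialist Medical Practitioners (numbers)": (34_770, 345),
    "Medical Doctors not Further Defined (numbers)": (71_550, 10_920),
}

gaps = {name: pct_gap(world, africa) for name, (world, africa) in figures_2016.items()}
for name, gap in gaps.items():
    print(f"{name}: {gap}%")
```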

The fraction of GDP spent on health ranged from 1.26% to 24.24%, with Oceania having the highest while Africa bore the lowest. The allocation to health expenditure varied significantly among the Continents in that period. Africa ranked `2nd` to last in terms of health expenditure allocation, ahead of only Asia.

May we now relate the current health expenditure to population (historic estimate)?

Asia performed worse than Africa, where the record showed `high` population with `low` health expenditure. Africa on the other hand, also showed `high` population with `low` current health expenditure. In all of this, the best performing Continent was Oceania, which recorded `low` population and `high` current health expenditure.

The relationship between population (historic estimate), death by cause and current health expenditure reiterated the same point that Africa was second to Asia from the rear with `moderate` population, death and health expenditure, when compared to Asia which had `high` population and death, with `low` health expenditure.

As the recorded average death increased, so did the current health expenditure in the World. In Africa, however, there seemed to be no established relationship between the recorded average death and current health expenditure.
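
One way to put a number on such a relationship is a correlation coefficient. The sketch below runs on made-up values, not the notebook's data. It assumes a table shaped like `expend_death_over_time_world`, with one row per year and the two columns used in the scatter plots.

```python
import pandas as pd

# Toy stand-in for a yearly table of deaths and expenditure (values are made up)
toy = pd.DataFrame(
    {
        "average_death": [100.0, 110.0, 125.0, 130.0],
        "Current health expenditure (% of GDP)": [5.0, 5.4, 6.1, 6.3],
    },
    index=pd.Index([2000, 2001, 2002, 2003], name="year"),
)

# Pearson correlation: near +1 or -1 suggests a linear relationship, near 0 suggests none
r = toy["average_death"].corr(toy["Current health expenditure (% of GDP)"])
print(round(r, 3))
```

A coefficient close to zero for the Africa table would be consistent with the observation above. Correlation alone does not show that expenditure causes a change in deaths.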

<a name="19"></a>

## Dataset Limitations

The datasets left several questions unanswered and had a number of gaps:

*   What caused the decrease in death in Africa from 2003 while the World recorded an increase?

*   What caused the increase in average death from 2017 onward?

*   What actual amounts do the given percentages of GDP represent?

*   Data for the Continent of Antarctica was omitted from the datasets.

*   Data about Medical Practitioners in Asia was not included.

*   The data only covered a period of 30 years, up to 2019.

*   No data on the state, number, and proximity of health facilities was available.

*   No data on the family planning needs of women and girls was available.

<a name="20"></a>

## Recommendations

African children `Under 5 years` are dying, presumably from Malaria and Diarrheal diseases, and the youths `15 - 49 years` from HIV/AIDS. Stemming from this, the Governments of the African nations need to step up their game and invest heavily in drugs and other medical resources necessary to fight Malaria, Diarrheal diseases and HIV/AIDS, in order to reclaim the lives of our children and youths from the claws of these deadly killers. After all, these were not major death-causing diseases in the World.

The percentage of GDP allocated to health expenditure in Africa should be reviewed to bring it on par with the World, in order to encourage Medical practitioners to keep practicing in Africa. This will go a long way toward increasing the average number of Medical Doctors per `10,000` population available in the Continent.



<a name="21"></a>

## Conclusion
Data on health facilities, their state and the population's access to them in Africa would have further supported the claim of a significant number of deaths. Deaths caused by diseases like Malaria, Diarrheal diseases, HIV/AIDS, Tuberculosis and the like could have been avoided if there were adequate medical resources (drugs and timely medical attention). On the other hand, the effect of health facility proximity on death could not be ascertained, as the data did not cover that aspect.

No doubt, there could be more optimized ways of writing the code in this notebook. I did all of this based on my level of knowledge at the time of writing this analysis.

# *THANK YOU!*