# BioMed Central (BMC) website. 
## Human Resources for Health 
"**Physician emigration from Nigeria and the associated factors: the implications to safeguarding the Nigeria health system**"
By Cosmas Kenan Onah, Benedict Ndubueze Azuogu, Casmir Ndubuisi Ochie, Christian Obasi Akpa, Kingsley Chijioke Okeke, Anthony Okoafor Okpunwa, Hassan Muhammad Bello & George Onyemaechi Ugwu on 20 December 2022

Available: *https://human-resources-health.biomedcentral.com/articles/10.1186/s12960-022-00788-z#MOESM1*

Access Date: 19 July 2024



In [1]:
import pandas as pd
from helper.aws_helper import S3Connection

In [2]:
# get dataset from website
migration_dataset = pd.read_csv("https://static-content.springer.com/esm/art%3A10.1186%2Fs12960-022-00788-z"
                                "/MediaObjects/12960_2022_788_MOESM1_ESM.csv")

In [4]:
migration_dataset.shape

(925, 24)

### 925 physicians took part in this survey; the dataset has 24 columns (the submission timestamp plus the survey questions)

In [5]:
# view the first 5 rows of the dataset
migration_dataset.head()

Unnamed: 0,Timestamp,1. Your age at your last birthday,2. Your sex,3. Marital status,"4. If your answer to question 3 above is ""Married"" what is your spouse's profession?","5. Your religion. (Please, specify the religion if your response is 'Other')","6. Highest education acquired. (Please, specify if your response is 'Other')","7. What is your country of acquiring first professional (MBBS or equivalent) degree? (Please, specify the country if your response is 'Other')",8. What type of institution did you attend to obtain MBBS?,9. What is your current professional cadre?,...,14. How would you describe the location of your practice?,15. How satisfied are you with your current professional practice in Nigeria?,16. Are you willing to continue your professional practice in Nigeria?,"17. If your answer to question 16 above is “Yes”, what is/are your motivations or concerns? (Select as many that apply; Please, specify the concerns if your responses include ""Other"").","18. If your answer to question 16 above is “Undecided”, what is/are your concerns regarding your future location of practice? (Select as many that apply; Please, specify your reasons if your responses include ""Other"")","19.\tIf your answer to question 16 above is ‘No’, which country(ies) are you planning to emigrate to? (Select as many that apply; Please, specify the country if your responses include ""Other"")","20. If your answer to question 16 above is ‘No’, when are you planning to emigrate from Nigeria? (Select as many that apply; Please, specify the time if your responses include ""Other"")","21. If your answer to question 16 above is “No”, what is/are reason(s) for your planned relocation of practice? (Select as many that apply; Please, specify your reasons if your responses include ""Other"")","22. If your answer to question 16 above is “No”, what is your planned length of stay outside Nigeria if you eventually relocate?","23. If your answer to question 16 above is “No”, how likely is it that you will come back to Nigeria for practice after spending some time abroad?"
0,2021/07/28 3:08:54 PM GMT+1,40,Female,Married,Non physician health worker,Christianity,Member WACP/NPMCN,Nigeria,Public University,Resident Doctor (in-training),...,Urban,Very unsatisfied,Yes,Desire to maintain family ties,,,,,,
1,2021/07/28 3:16:13 PM GMT+1,28,Male,Single,,Christianity,MBBS,Nigeria,Public University,Medical Officer,...,Urban,Very unsatisfied,No,,,Qatar;United Kingdom;United States of America,As soon as I get employment abroad,Poor remuneration;Irregular pay;Delayed pay;Ha...,Indefifintelly,Very unlikely
2,2021/07/28 3:18:55 PM GMT+1,32,Female,Married,Non physician health worker,Christianity,MBBS,Nigeria,Public University,Resident Doctor (in-training),...,Urban,Unsatisfied,Undecided,,Uncertainty about Government policies on healt...,,,,,
3,2021/07/28 3:30:36 PM GMT+1,29,Female,Married,Physician,Christianity,MBBS,Nigeria,Public University,Medical Officer,...,Semi-urban,Unsatisfied,No,,,United Kingdom,As soon as I get employment abroad,Poor remuneration;Irregular pay;Delayed pay;Ha...,Indefifintelly,Unlikely
4,2021/07/28 3:35:42 PM GMT+1,40,Male,Married,Physician,Christianity,MBBS,Nigeria,Public University,Fellow (post training),...,Urban,Very unsatisfied,No,,,United States of America,It depends on God's direction,To acquire further training/skills,Undecided,


In [42]:
# data types of columns in dataset
migration_dataset.dtypes

Timestamp                                                                                                                                                                                                                      object
1. Your age at your last birthday                                                                                                                                                                                              object
2. Your sex                                                                                                                                                                                                                    object
3. Marital status                                                                                                                                                                                                              object
4. If your answer to question 3 above is "Married" what is your spouse's profess

### All 24 columns have the `object` dtype

In [38]:
migration_dataset.isnull().sum()

Timestamp                                                                                                                                                                                                                        0
1. Your age at your last birthday                                                                                                                                                                                                0
2. Your sex                                                                                                                                                                                                                      0
3. Marital status                                                                                                                                                                                                                0
4. If your answer to question 3 above is "Married" what is your spouse's profession?        

##### There appear to be many "missing" values in these 8 columns:
- 4. If your answer to question 3 above is "Married" what is your spouse's profession
- 17. If your answer to question 16 above is “Yes”, what is/are your motivations or concerns? (Select as many that apply; Please, specify the concerns if your responses include "Other")
- 18. If your answer to question 16 above is “Undecided”, what is/are your concerns regarding your future location of practice?  (Select as many that apply; Please, specify your reasons if your responses include "Other")
- 19. If your answer to question 16 above is ‘No’, which country(ies) are you planning to emigrate to? (Select as many that apply; Please, specify the country if your responses include "Other")
- 20. If your answer to question 16 above is ‘No’, when are you planning to emigrate from Nigeria? (Select as many that apply; Please, specify the time if your responses include "Other")
- 21. If your answer to question 16 above is “No”, what is/are reason(s) for your planned relocation of practice?  (Select as many that apply; Please, specify your reasons if your responses include "Other")
- 22. If your answer to question 16 above is “No”, what is your planned length of stay outside Nigeria if you eventually relocate?
- 23. If your answer to question 16 above is “No”, how likely is it that you will come back to Nigeria for practice after spending some time abroad? 

##### We will go through each of these columns to assess why values are missing
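Because most of these questions are only asked of respondents who gave a particular earlier answer, a blank can mean either "not applicable" or "skipped". The sketch below tabulates the two cases side by side. It is an illustration under assumptions: `missingness_report`, the shortened column names, and the toy frame are all hypothetical and are not part of the survey data.

```python
import pandas as pd


def missingness_report(df, rules):
    """For each (gate_col, gate_value, target_col) rule, split blank cells into
    'not applicable' (respondent was not asked) and 'truly missing'."""
    rows = []
    for gate_col, gate_value, target_col in rules:
        blank = df[target_col].isnull()
        required = df[gate_col] == gate_value
        rows.append(
            {
                "column": target_col,
                "blank_total": int(blank.sum()),
                "not_applicable": int((blank & ~required).sum()),
                "truly_missing": int((blank & required).sum()),
            }
        )
    return pd.DataFrame(rows)


# Made-up data: "willing" plays the role of question 16
toy = pd.DataFrame(
    {
        "willing": ["Yes", "No", "Undecided", "No", "Yes"],
        "motivation": ["Family ties", None, None, None, None],
        "destination": [None, "United Kingdom", None, None, None],
    }
)
rules = [
    ("willing", "Yes", "motivation"),
    ("willing", "No", "destination"),
]
report = missingness_report(toy, rules)
print(report)
```

Only the `truly_missing` count calls for imputation; the `not_applicable` blanks follow from the questionnaire's skip logic.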

In [3]:
# count the married respondents whose spouse's profession is missing
married_empty_spouse_prof = migration_dataset[
    (migration_dataset["3. Marital status"] == "Married") &
    migration_dataset["4. If your answer to question 3 above is \"Married\" what is your spouse's profession?"].isnull()
    ].shape[0]
married_empty_spouse_prof

3

##### The above shows that only 3 values are truly missing, because only married respondents are required to answer this question. I will fill these missing values with the mode of this column
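The same "count, then fill with the mode" pattern recurs for every conditional question, so it can be wrapped in two small functions. This is a minimal sketch, not the notebook's own code: the helper names, the shortened column labels, and the toy frame are assumptions made for illustration.

```python
import pandas as pd


def truly_missing(df, gate_col, gate_value, target_col):
    """Boolean mask: the respondent was required to answer target_col but left it blank."""
    return (df[gate_col] == gate_value) & df[target_col].isnull()


def fill_truly_missing_with_mode(df, gate_col, gate_value, target_col):
    """Fill only the truly missing cells with the column mode; return how many were filled."""
    mask = truly_missing(df, gate_col, gate_value, target_col)
    if mask.any():
        # mode() ignores missing values by default, so it reflects actual answers only
        df.loc[mask, target_col] = df[target_col].mode()[0]
    return int(mask.sum())


# Made-up data: one married respondent skipped the spouse-profession question
toy = pd.DataFrame(
    {
        "marital": ["Married", "Single", "Married", "Married"],
        "spouse_profession": ["Physician", None, None, "Physician"],
    }
)

filled = fill_truly_missing_with_mode(toy, "marital", "Married", "spouse_profession")
print(filled)
print(toy)
```

The blank belonging to the single respondent is left untouched, since that person was never asked the question.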

In [4]:
# Select married respondents with no spouse profession recorded and fill those cells with the mode of the spouse-profession column
married_empty_sprof= (migration_dataset["3. Marital status"] == "Married") & migration_dataset["4. If your answer to question 3 above is \"Married\" what is your spouse's profession?"].isnull()

migration_dataset.loc[married_empty_sprof, "4. If your answer to question 3 above is \"Married\" what is your spouse's profession?"] = migration_dataset["4. If your answer to question 3 above is \"Married\" what is your spouse's profession?"].mode()[0]

In [5]:
motivations_empty= migration_dataset[(migration_dataset["16. Are you willing to continue your professional practice in Nigeria? "] == "Yes") & migration_dataset['17. If your answer to question 16 above is “Yes”, what is/are your motivations or concerns? (Select as many that apply; Please, specify the concerns if your responses include "Other").'].isnull()].shape[0]
motivations_empty

0

##### The above shows that no data is truly missing: every respondent who answered "Yes" to question 16 also answered this question
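These checks depend on matching the full question text exactly, including trailing spaces and, for question 19, an embedded tab. One way to make the lookups less fragile is to resolve a column from its question number. The sketch below assumes every label starts with its number followed by a period; `column_for_question` is a hypothetical helper and the demo frame uses made-up, shortened labels.

```python
import pandas as pd


def column_for_question(df: pd.DataFrame, number: int) -> str:
    """Return the single column whose label starts with '<number>.'."""
    prefix = f"{number}."
    matches = [col for col in df.columns if str(col).strip().startswith(prefix)]
    if len(matches) != 1:
        raise KeyError(
            f"expected exactly one column for question {number}, found {len(matches)}"
        )
    return matches[0]


# Made-up frame that mimics the awkward labels (trailing space, embedded tab)
demo = pd.DataFrame(
    {
        "1. Your age at your last birthday": ["40"],
        "16. Are you willing to continue your professional practice in Nigeria? ": ["No"],
        "19.\tIf your answer to question 16 above is 'No', which country(ies)?": ["Qatar"],
    }
)

q1 = column_for_question(demo, 1)
q16 = column_for_question(demo, 16)
q19 = column_for_question(demo, 19)
print(repr(q16))
print(repr(q19))
```

The returned string is the exact label, so it can be used directly in expressions such as `demo[q16] == "No"`.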

In [6]:
future_concerns = migration_dataset[(migration_dataset["16. Are you willing to continue your professional practice in Nigeria? "] == "Undecided") & migration_dataset['18. If your answer to question 16 above is “Undecided”, what is/are your concerns regarding your future location of practice?  (Select as many that apply; Please, specify your reasons if your responses include "Other") '].isnull()].shape[0]
future_concerns

3

##### The above shows that only 3 values are truly missing, because only undecided respondents are required to answer this question and only 3 of them have not done so. I will fill these missing values with the mode of this column

In [7]:
# Select undecided respondents with no concerns recorded and fill those cells with the mode of the concerns column
migration_dataset.loc[
    ((migration_dataset["16. Are you willing to continue your professional practice in Nigeria? "] == "Undecided") & migration_dataset['18. If your answer to question 16 above is “Undecided”, what is/are your concerns regarding your future location of practice?  (Select as many that apply; Please, specify your reasons if your responses include "Other") '].isnull()),'18. If your answer to question 16 above is “Undecided”, what is/are your concerns regarding your future location of practice?  (Select as many that apply; Please, specify your reasons if your responses include "Other") '] = \
migration_dataset['18. If your answer to question 16 above is “Undecided”, what is/are your concerns regarding your future location of practice?  (Select as many that apply; Please, specify your reasons if your responses include "Other") '].mode()[0]

In [8]:
emigrate_to_empty= migration_dataset[(migration_dataset["16. Are you willing to continue your professional practice in Nigeria? "] == "No") & migration_dataset['19.	If your answer to question 16 above is ‘No’, which country(ies) are you planning to emigrate to? (Select as many that apply; Please, specify the country if your responses include "Other")'].isnull()].shape[0]
emigrate_to_empty

1

##### The above shows that only 1 value is truly missing, because only respondents who answered "No" to question 16 are required to answer this question and only 1 of them has not done so. I will fill this missing value with the mode of this column

In [9]:
# Select respondents who answered 'No' to question 16 but did not state which country they plan to emigrate to, and fill those cells with the column mode
migration_dataset.loc[((migration_dataset["16. Are you willing to continue your professional practice in Nigeria? "] == "No") & migration_dataset['19.	If your answer to question 16 above is ‘No’, which country(ies) are you planning to emigrate to? (Select as many that apply; Please, specify the country if your responses include "Other")'].isnull()),'19.	If your answer to question 16 above is ‘No’, which country(ies) are you planning to emigrate to? (Select as many that apply; Please, specify the country if your responses include "Other")'] = migration_dataset['19.	If your answer to question 16 above is ‘No’, which country(ies) are you planning to emigrate to? (Select as many that apply; Please, specify the country if your responses include "Other")'].mode()[0]

In [10]:
emigrate_date_empty= migration_dataset[(migration_dataset["16. Are you willing to continue your professional practice in Nigeria? "] == "No") & migration_dataset['20. If your answer to question 16 above is ‘No’, when are you planning to emigrate from Nigeria? (Select as many that apply; Please, specify the time if your responses include "Other")    '].isnull()].shape[0]
emigrate_date_empty

10

##### The above shows that 10 values are truly missing, because only respondents who answered "No" to question 16 are required to answer this question and 10 of them have not done so. I will fill these missing values with the mode of this column

In [11]:
# Select respondents who answered 'No' to question 16 but did not state when they intend to emigrate, and fill those cells with the column mode
migration_dataset.loc[((migration_dataset["16. Are you willing to continue your professional practice in Nigeria? "] == "No") & migration_dataset['20. If your answer to question 16 above is ‘No’, when are you planning to emigrate from Nigeria? (Select as many that apply; Please, specify the time if your responses include "Other")    '].isnull()),'20. If your answer to question 16 above is ‘No’, when are you planning to emigrate from Nigeria? (Select as many that apply; Please, specify the time if your responses include "Other")    '] = migration_dataset['20. If your answer to question 16 above is ‘No’, when are you planning to emigrate from Nigeria? (Select as many that apply; Please, specify the time if your responses include "Other")    '].mode()[0]

In [12]:
reason_for_emigrate_empty= migration_dataset[(migration_dataset["16. Are you willing to continue your professional practice in Nigeria? "] == "No") & migration_dataset['21. If your answer to question 16 above is “No”, what is/are reason(s) for your planned relocation of practice?  (Select as many that apply; Please, specify your reasons if your responses include "Other") '].isnull()].shape[0]
reason_for_emigrate_empty

3

##### The above shows that 3 values are truly missing, because only respondents who answered "No" to question 16 are required to answer this question and 3 of them have not done so. I will fill these missing values with the mode of this column

In [13]:
# Select respondents who answered 'No' to question 16 but did not state the reason for their planned relocation, and fill those cells with the column mode
migration_dataset.loc[((migration_dataset["16. Are you willing to continue your professional practice in Nigeria? "] == "No") & migration_dataset['21. If your answer to question 16 above is “No”, what is/are reason(s) for your planned relocation of practice?  (Select as many that apply; Please, specify your reasons if your responses include "Other") '].isnull()),'21. If your answer to question 16 above is “No”, what is/are reason(s) for your planned relocation of practice?  (Select as many that apply; Please, specify your reasons if your responses include "Other") '] = migration_dataset['21. If your answer to question 16 above is “No”, what is/are reason(s) for your planned relocation of practice?  (Select as many that apply; Please, specify your reasons if your responses include "Other") '].mode()[0]

In [15]:
planned_length_of_time_empty= migration_dataset[(migration_dataset["16. Are you willing to continue your professional practice in Nigeria? "] == "No") & migration_dataset['22. If your answer to question 16 above is “No”, what is your planned length of stay outside Nigeria if you eventually relocate?  '].isnull()].shape[0]
planned_length_of_time_empty

4

##### The above shows that 4 values are truly missing, because only respondents who answered "No" to question 16 are required to answer this question and 4 of them have not done so. I will fill these missing values with the mode of this column

In [16]:
# Select respondents who answered 'No' to question 16 but did not state how long they plan to stay abroad, and fill those cells with the column mode
migration_dataset.loc[((migration_dataset["16. Are you willing to continue your professional practice in Nigeria? "] == "No") & migration_dataset['22. If your answer to question 16 above is “No”, what is your planned length of stay outside Nigeria if you eventually relocate?  '].isnull()),'22. If your answer to question 16 above is “No”, what is your planned length of stay outside Nigeria if you eventually relocate?  '] = migration_dataset['22. If your answer to question 16 above is “No”, what is your planned length of stay outside Nigeria if you eventually relocate?  '].mode()[0]

In [17]:
Return_empty= migration_dataset[(migration_dataset["16. Are you willing to continue your professional practice in Nigeria? "] == "No") & migration_dataset['23. If your answer to question 16 above is “No”, how likely is it that you will come back to Nigeria for practice after spending some time abroad?  '].isnull()].shape[0]
Return_empty

5

##### The above shows that 5 values are truly missing, because only respondents who answered "No" to question 16 are required to answer this question and 5 of them have not done so. I will fill these missing values with the mode of this column

In [18]:
# Select respondents who answered 'No' to question 16 but did not state how likely they are to return to Nigeria, and fill those cells with the column mode
migration_dataset.loc[((migration_dataset["16. Are you willing to continue your professional practice in Nigeria? "] == "No") & migration_dataset['23. If your answer to question 16 above is “No”, how likely is it that you will come back to Nigeria for practice after spending some time abroad?  '].isnull()),'23. If your answer to question 16 above is “No”, how likely is it that you will come back to Nigeria for practice after spending some time abroad?  '] = migration_dataset['23. If your answer to question 16 above is “No”, how likely is it that you will come back to Nigeria for practice after spending some time abroad?  '].mode()[0]

In [57]:
migration_dataset.shape

(925, 24)

In [62]:
migration_dataset.isnull().sum()

Timestamp                                                                                                                                                                                                                        0
1. Your age at your last birthday                                                                                                                                                                                                0
2. Your sex                                                                                                                                                                                                                      0
3. Marital status                                                                                                                                                                                                                0
4. If your answer to question 3 above is "Married" what is your spouse's profession?        

In [19]:
migration_dataset["5. Your religion. (Please, specify the religion if your response is 'Other')"].isnull().sum()

2

##### The above shows that there are 2 missing values in the "Your religion. ..." column. I will fill them with the mode of this column

In [20]:
migration_dataset.loc[(migration_dataset["5. Your religion. (Please, specify the religion if your response is 'Other')"].isnull()),"5. Your religion. (Please, specify the religion if your response is 'Other')"] = migration_dataset["5. Your religion. (Please, specify the religion if your response is 'Other')"].mode()[0]

In [21]:
migration_dataset["10. What is your area of specialization or interest to specialize. (Please, specify the specialty if your response is 'Other')"].isnull().sum()

2

##### The above shows that there are 2 missing values in the "What is your area of specialization or interest to specialize. ..." column. I will fill them with the mode of this column

In [22]:
migration_dataset.loc[(migration_dataset["10. What is your area of specialization or interest to specialize. (Please, specify the specialty if your response is 'Other')"].isnull()),"10. What is your area of specialization or interest to specialize. (Please, specify the specialty if your response is 'Other')"] = migration_dataset["10. What is your area of specialization or interest to specialize. (Please, specify the specialty if your response is 'Other')"].mode()[0]

In [23]:
migration_dataset["11. What is the duration of your professional (post MBBS) practice in years?"].isnull().sum()

1

##### The above shows that there is 1 missing value in the "What is the duration of your professional (post MBBS) practice in years?" column. I will fill it with the mode of this column

In [24]:
migration_dataset.loc[(migration_dataset["11. What is the duration of your professional (post MBBS) practice in years?"].isnull()),"11. What is the duration of your professional (post MBBS) practice in years?"] = migration_dataset["11. What is the duration of your professional (post MBBS) practice in years?"].mode()[0]

In [25]:
# function to extract the age (as an integer) from the free-text age column
def digit(df, col):
    age_arr = []
    for value in df[col]:
        age = []
        for c in str(value):
            if c.isdigit():
                # the data contains a superscript seven, which int() cannot parse
                if c == "⁷":
                    c = "7"
                age.append(c)
        # entries without any digit are recorded as 0 and handled in a later cell
        age_arr.append(int("".join(age)) if age else 0)
    return age_arr

# create a new column in the dataframe for the age converted to an integer
migration_dataset["Birthday (int)"] = digit(migration_dataset, "1. Your age at your last birthday")

In [26]:
migration_dataset.shape

(925, 25)

##### New column with age converted to integer now added to the dataframe

In [27]:
(migration_dataset["Birthday (int)"] == 0).value_counts()

Birthday (int)
False    923
True       2
Name: count, dtype: int64

##### The above shows that 2 ages are missing (recorded as 0); I will fill them with the mean of that column
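An alternative to the zero placeholder is to leave unparseable ages as missing values from the start, so that the mean used for filling is computed over valid ages only. The sketch below is one way to do that, assuming ages are written with ASCII digits possibly mixed with other text. Unlike `digit`, it keeps only the first run of digits and does not handle superscript characters. `extract_age` and the sample values are made up for illustration.

```python
import pandas as pd


def extract_age(series: pd.Series) -> pd.Series:
    """Pull the first run of ASCII digits out of each entry; entries without digits become NaN."""
    digits = series.astype(str).str.extract(r"([0-9]+)", expand=False)
    return pd.to_numeric(digits, errors="coerce")


# Made-up free-text ages
raw_age = pd.Series(["40", "28 years", "thirty", "32yrs", "Nil"])
age = extract_age(raw_age)

# mean() skips NaN by default, so only the parsed ages contribute
mean_age = age.mean()
age_filled = age.fillna(mean_age)
print(age_filled)
```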

In [28]:
# replace the placeholder zeros with NaN, then fill them with the mean of the remaining (valid) ages
valid_age = migration_dataset["Birthday (int)"].mask(migration_dataset["Birthday (int)"] == 0)
migration_dataset["Birthday (int)"] = valid_age.fillna(valid_age.mean())

In [29]:
average_age = migration_dataset["Birthday (int)"].mean()
f"{average_age:.0f}"

'38'

##### The mean age of the physicians who took part in the survey is **`38 years old`**

In [37]:
migration_dataset["15. How satisfied are you with your current professional practice in Nigeria?"].value_counts()

15. How satisfied are you with your current professional practice in Nigeria?
Unsatisfied         516
Very unsatisfied    290
Satisfied           109
Very satisfied       10
Name: count, dtype: int64

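##### The raw data contains a typo in this column: `Satified` instead of `Satisfied`; this needs to be corrected. Judging by the cell numbers, the counts above appear to come from a run made after the correction had already been applied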
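A quick way to spot label typos like this one is to compare the observed answers against the expected answer options. The sketch below uses a small made-up series; the `EXPECTED` set is taken from the four labels shown in the counts above, and the variable names are hypothetical.

```python
import pandas as pd

# The four answer options that appear in the value counts
EXPECTED = {"Very satisfied", "Satisfied", "Unsatisfied", "Very unsatisfied"}

# Made-up answers containing the misspelled label
answers = pd.Series(
    ["Unsatisfied", "Satified", "Very unsatisfied", "Satisfied", "Satified"]
)

# Any observed label that is not one of the expected options
unexpected = sorted(set(answers.dropna().unique()) - EXPECTED)
print(unexpected)

# Map the misspelled label onto the correct one
corrected = answers.replace({"Satified": "Satisfied"})
print(corrected.value_counts())
```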

In [36]:
migration_dataset.loc[(migration_dataset["15. How satisfied are you with your current professional practice in Nigeria?"] == "Satified"), "15. How satisfied are you with your current professional practice in Nigeria?"] = "Satisfied"

#### Write the cleaned-up dataframe to an S3 bucket

In [41]:
# instantiate the S3 connection helper
conn = S3Connection()

In [42]:
# write dataframe into s3 bucket
conn.write_df(migration_dataset, "medsafe-docs", "medsafe-docs/physicians_migration_additional_cleaned_data.csv")
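`S3Connection` comes from the author's own `helper.aws_helper` module, which is not shown here, so its internals are unknown. As a rough idea of what such a write could involve, the sketch below serializes a dataframe to CSV in memory and passes it to an S3 client's `put_object` call, which is a real boto3 client method. The function name `write_df_to_s3`, the stub client, and the bucket and key values are assumptions for illustration only.

```python
import io

import pandas as pd


def write_df_to_s3(client, df: pd.DataFrame, bucket: str, key: str) -> None:
    """Serialize df as CSV in memory and upload it with client.put_object."""
    buffer = io.StringIO()
    df.to_csv(buffer, index=False)
    client.put_object(Bucket=bucket, Key=key, Body=buffer.getvalue().encode("utf-8"))


class RecordingClient:
    """Stand-in for a boto3 S3 client so the sketch runs without AWS credentials."""

    def __init__(self):
        self.calls = []

    def put_object(self, **kwargs):
        self.calls.append(kwargs)


stub = RecordingClient()
write_df_to_s3(stub, pd.DataFrame({"a": [1, 2]}), "example-bucket", "cleaned/data.csv")
print(stub.calls[0]["Key"])

# With credentials configured, a real client could be passed instead:
# import boto3
# write_df_to_s3(boto3.client("s3"), migration_dataset, "example-bucket", "cleaned/data.csv")
```

Passing the client in as an argument keeps the function testable without network access.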