<a href="https://colab.research.google.com/github/TsamayaDesigns/codeDivision-data-with-python/blob/main/Data_Project_Beyond_Borders_Analysing_Migration_Patterns_in_South_Africa.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Project title
---
**Beyond Borders: Analysing Migration Patterns in South Africa**


## Author and Date
---
* **Author:**
  * Eugene Gerber
* **Date:**
  * 2024-04-24


## Objectives
---
Understanding migration patterns to and from South Africa is crucial for several reasons, including:
1.  Policy Making
2.  Economic Development
3.  Labour Market Planning
4.  Regional Comparison

I aim to investigate these migration patterns by attempting to answer the following four questions:
1.  What are the main countries of destination for migrants from South Africa, and how has the net migration rate changed over the years?
2.  Which industries are attracting migrants from South Africa, and what is the trend in their skill demand over recent years?
3.  What are the primary skill categories that migrants take from (or bring to) South Africa, and how have their net migration rates evolved annually?
4.  How do South Africa's migration patterns compare with those of neighbouring countries within the same World Bank region?

Each question explores a key aspect of migration dynamics, including destination countries, industries attracting migrants, skill categories, and a regional comparison.

---


## The dataset
_reference and link to the source of the data_
* description of the dataset
* description of the processing with code in the box below to show summary information about the data

---



In [5]:
import pandas as pd
pd.set_option('display.width', 270)

# Load the three sheets of the talent-migration Excel workbook from GitHub
def get_excel_data():
  url = "https://github.com/futureCodersSE/working-with-data/blob/main/Data%20sets/public_use-talent-migration.xlsx?raw=true"
  sheet_names = ["Country Migration", "Industry Migration", "Skill Migration"]
  # Passing a list of sheet names reads the workbook once and returns a dict of DataFrames
  sheets = pd.read_excel(url, sheet_name=sheet_names)

  return sheets["Country Migration"], sheets["Industry Migration"], sheets["Skill Migration"]

country_df, industry_df, skill_df = get_excel_data()

# Describe the dataset (number of records, statistics, columns, etc.)

display("\nCountry: \n", country_df, "\nIndustry: \n", industry_df, "\nSkill: \n", skill_df)

'\nCountry: \n'

Unnamed: 0,base_country_code,base_country_name,base_lat,base_long,base_country_wb_income,base_country_wb_region,target_country_code,target_country_name,target_lat,target_long,target_country_wb_income,target_country_wb_region,net_per_10K_2015,net_per_10K_2016,net_per_10K_2017,net_per_10K_2018,net_per_10K_2019
0,ae,United Arab Emirates,23.424076,53.847818,High Income,Middle East & North Africa,af,Afghanistan,33.939110,67.709953,Low Income,South Asia,0.19,0.16,0.11,-0.05,-0.02
1,ae,United Arab Emirates,23.424076,53.847818,High Income,Middle East & North Africa,dz,Algeria,28.033886,1.659626,Upper Middle Income,Middle East & North Africa,0.19,0.25,0.57,0.55,0.78
2,ae,United Arab Emirates,23.424076,53.847818,High Income,Middle East & North Africa,ao,Angola,-11.202692,17.873887,Lower Middle Income,Sub-Saharan Africa,-0.01,0.04,0.11,-0.02,-0.06
3,ae,United Arab Emirates,23.424076,53.847818,High Income,Middle East & North Africa,ar,Argentina,-38.416097,-63.616672,High Income,Latin America & Caribbean,0.16,0.18,0.04,0.01,0.23
4,ae,United Arab Emirates,23.424076,53.847818,High Income,Middle East & North Africa,am,Armenia,40.069099,45.038189,Upper Middle Income,Europe & Central Asia,0.10,0.05,0.03,-0.01,0.02
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4143,zw,Zimbabwe,-19.015438,29.154857,Low Income,Sub-Saharan Africa,za,South Africa,-30.559482,22.937506,Upper Middle Income,Sub-Saharan Africa,-2.98,-11.79,-9.10,-12.08,-20.76
4144,zw,Zimbabwe,-19.015438,29.154857,Low Income,Sub-Saharan Africa,ae,United Arab Emirates,23.424076,53.847818,High Income,Middle East & North Africa,-2.50,-2.49,-2.21,-1.68,-3.19
4145,zw,Zimbabwe,-19.015438,29.154857,Low Income,Sub-Saharan Africa,gb,United Kingdom,55.378051,-3.435973,High Income,Europe & Central Asia,3.91,4.66,0.74,-0.66,-1.97
4146,zw,Zimbabwe,-19.015438,29.154857,Low Income,Sub-Saharan Africa,us,United States,37.090240,-95.712891,High Income,North America,38.60,37.76,10.09,6.06,5.25


'\nIndustry: \n'

Unnamed: 0,country_code,country_name,wb_income,wb_region,isic_section_index,isic_section_name,industry_id,industry_name,net_per_10K_2015,net_per_10K_2016,net_per_10K_2017,net_per_10K_2018,net_per_10K_2019
0,ae,United Arab Emirates,High income,Middle East & North Africa,C,Manufacturing,1,Defense & Space,378.74,127.94,8.20,68.51,49.55
1,ae,United Arab Emirates,High income,Middle East & North Africa,J,Information and communication,3,Computer Hardware,100.97,358.14,112.98,149.57,182.22
2,ae,United Arab Emirates,High income,Middle East & North Africa,J,Information and communication,4,Computer Software,1079.36,848.15,596.48,409.18,407.41
3,ae,United Arab Emirates,High income,Middle East & North Africa,J,Information and communication,5,Computer Networking,401.46,447.39,163.99,236.69,188.07
4,ae,United Arab Emirates,High income,Middle East & North Africa,J,Information and communication,6,Internet,1840.33,1368.42,877.71,852.39,519.40
...,...,...,...,...,...,...,...,...,...,...,...,...,...
5290,zw,Zimbabwe,Low income,Sub-Saharan Africa,B,Mining and quarrying,56,Mining & Metals,257.36,187.70,-17.45,70.60,-18.30
5291,zw,Zimbabwe,Low income,Sub-Saharan Africa,P,Education,68,Higher Education,190.84,50.76,-68.74,-234.59,-304.36
5292,zw,Zimbabwe,Low income,Sub-Saharan Africa,O,Public administration and defence; compulsory ...,74,International Affairs,25.23,-46.12,214.29,311.03,-55.88
5293,zw,Zimbabwe,Low income,Sub-Saharan Africa,J,Information and communication,96,Information Technology & Services,46.65,35.93,-142.64,-108.16,-213.82


'\nSkill: \n'

Unnamed: 0,country_code,country_name,wb_income,wb_region,skill_group_id,skill_group_category,skill_group_name,net_per_10K_2015,net_per_10K_2016,net_per_10K_2017,net_per_10K_2018,net_per_10K_2019
0,af,Afghanistan,Low income,South Asia,2549,Tech Skills,Information Management,-791.59,-705.88,-550.04,-680.92,-1208.79
1,af,Afghanistan,Low income,South Asia,2608,Business Skills,Operational Efficiency,-1610.25,-933.55,-776.06,-532.22,-790.09
2,af,Afghanistan,Low income,South Asia,3806,Specialized Industry Skills,National Security,-1731.45,-769.68,-756.59,-600.44,-767.64
3,af,Afghanistan,Low income,South Asia,50321,Tech Skills,Software Testing,-957.50,-828.54,-964.73,-406.50,-739.51
4,af,Afghanistan,Low income,South Asia,1606,Specialized Industry Skills,Navy,-1510.71,-841.17,-842.32,-581.71,-718.64
...,...,...,...,...,...,...,...,...,...,...,...,...
17612,zw,Zimbabwe,Low income,Sub-Saharan Africa,12666,Specialized Industry Skills,Teaching,71.18,30.68,-18.85,-68.89,-93.70
17613,zw,Zimbabwe,Low income,Sub-Saharan Africa,1235,Specialized Industry Skills,Mining,8.97,-112.85,-35.87,-65.38,-93.46
17614,zw,Zimbabwe,Low income,Sub-Saharan Africa,43756,Specialized Industry Skills,Personal Coaching,-53.45,-59.70,-88.01,-55.90,-82.23
17615,zw,Zimbabwe,Low income,Sub-Saharan Africa,1724,Specialized Industry Skills,Public Health,15.25,-65.53,-57.22,-39.39,-32.14
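
The cell above displays the raw tables, but the summary information this section asks for (number of records, columns, statistics) could be collected with a small helper. The following is a minimal sketch, not the project's actual method: the name `summarise` is hypothetical, and the two-row `sample` frame is copied from the Country output above so the example runs on its own. In the notebook you would pass `country_df`, `industry_df`, or `skill_df` instead.

```python
import pandas as pd

def summarise(df):
    """Return basic summary information about a DataFrame (hypothetical helper)."""
    # The yearly measures all share the net_per_10K_ prefix
    year_cols = [c for c in df.columns if c.startswith("net_per_10K_")]
    return {
        "rows": len(df),
        "columns": list(df.columns),
        "nulls_per_column": df.isna().sum().to_dict(),
        "year_stats": df[year_cols].describe(),
    }

# Two rows copied from the Country output above, trimmed to a few columns
sample = pd.DataFrame({
    "base_country_name": ["Zimbabwe", "Zimbabwe"],
    "target_country_name": ["South Africa", "United Kingdom"],
    "net_per_10K_2015": [-2.98, 3.91],
    "net_per_10K_2019": [-20.76, -1.97],
})

summary = summarise(sample)
print(summary["rows"], "rows")
print(summary["year_stats"])
```

The null counts are a quick way to decide whether the cleaning step needs to drop or fill any rows.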


## Cleaning the data
_what will you do to get the data ready for analysis_
* sorting
* removing null data
* forming new data tables
* ...
---


In [None]:
## data cleaning code
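
One possible cleaning step, sketched under assumptions: keep the rows for a single base country, drop missing values, and reshape the `net_per_10K_<year>` columns into one row per year so that trends are easier to compute and plot. The helper name `to_long` is hypothetical, and the `sample` frame holds two rows copied from the Country output so the example is self-contained. In the notebook, the filter would use `country_df` and `"South Africa"`.

```python
import pandas as pd

def to_long(df, id_cols):
    """Reshape the wide net_per_10K_<year> columns into one row per year."""
    year_cols = [c for c in df.columns if c.startswith("net_per_10K_")]
    long_df = df.melt(id_vars=id_cols, value_vars=year_cols,
                      var_name="year", value_name="net_per_10K")
    # "net_per_10K_2018" -> 2018
    long_df["year"] = long_df["year"].str.replace("net_per_10K_", "", regex=False).astype(int)
    # Remove rows with no measurement, then sort for readability
    return (long_df.dropna(subset=["net_per_10K"])
                   .sort_values(id_cols + ["year"])
                   .reset_index(drop=True))

# Two rows copied from the Country output, trimmed to two years
sample = pd.DataFrame({
    "base_country_name": ["Zimbabwe", "Zimbabwe"],
    "target_country_name": ["South Africa", "United States"],
    "net_per_10K_2018": [-12.08, 6.06],
    "net_per_10K_2019": [-20.76, 5.25],
})

# Keep one base country, then reshape to long format
zw = sample[sample["base_country_name"] == "Zimbabwe"]
zw_long = to_long(zw, ["base_country_name", "target_country_name"])
print(zw_long)
```

The same helper should work on the Industry and Skill sheets, since they use the same year-column prefix; only `id_cols` changes.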

## Analysing the data
_what analysis are you doing and why_
* producing summary statistics
* printing calculated statistics
* data analysis calculations (e.g. regression, correlation)
* ...

---



In [None]:
## analysis code here
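
As a starting point for the first question, destinations can be ranked by their most recent net migration rate alongside the change since the first year. This is a sketch only: `rank_destinations` is a hypothetical helper, the four sample rows are copied from the Country output, and it assumes that a negative `net_per_10K` value means a net loss from the base country to the target country. In the notebook you would call it with `country_df` and `"South Africa"`.

```python
import pandas as pd

def rank_destinations(df, base_country, first=2015, last=2019):
    """Rank target countries for one base country by net migration in the last year."""
    first_col = f"net_per_10K_{first}"
    last_col = f"net_per_10K_{last}"
    rows = df[df["base_country_name"] == base_country].copy()
    # Change in the net rate between the first and last year
    rows["change"] = rows[last_col] - rows[first_col]
    # Most negative first (assumed to be the largest net outflow)
    return (rows.sort_values(last_col)
                [["target_country_name", last_col, "change"]]
                .reset_index(drop=True))

# Four rows copied from the Country output, trimmed to the first and last year
sample = pd.DataFrame({
    "base_country_name": ["Zimbabwe"] * 4,
    "target_country_name": ["South Africa", "United Arab Emirates",
                            "United Kingdom", "United States"],
    "net_per_10K_2015": [-2.98, -2.50, 3.91, 38.60],
    "net_per_10K_2019": [-20.76, -3.19, -1.97, 5.25],
})

ranked = rank_destinations(sample, "Zimbabwe")
print(ranked)
```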

## Visualising the data
_graphical or textual visualisation of the data_

---



In [None]:
## visualisation code here
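
A line chart with one line per destination is one way to show how the net migration rate changes from year to year. The sketch below assumes matplotlib is available (as it is by default in Colab); `plot_net_migration` is a hypothetical helper, and the two sample rows are copied from the Country output. In the notebook you would pass the South Africa rows of `country_df`.

```python
import matplotlib.pyplot as plt
import pandas as pd

def plot_net_migration(df, label_col, title):
    """Draw one line per row of df, with the year on the x-axis."""
    year_cols = [c for c in df.columns if c.startswith("net_per_10K_")]
    years = [int(c.rsplit("_", 1)[1]) for c in year_cols]
    fig, ax = plt.subplots(figsize=(8, 4))
    for _, row in df.iterrows():
        ax.plot(years, row[year_cols].astype(float), marker="o", label=row[label_col])
    # Reference line separating net gain from net loss
    ax.axhline(0, color="grey", linewidth=0.8)
    ax.set_xticks(years)
    ax.set_xlabel("Year")
    ax.set_ylabel("net_per_10K")
    ax.set_title(title)
    ax.legend()
    return ax

# Two rows copied from the Country output
sample = pd.DataFrame({
    "target_country_name": ["South Africa", "United Kingdom"],
    "net_per_10K_2015": [-2.98, 3.91],
    "net_per_10K_2016": [-11.79, 4.66],
    "net_per_10K_2017": [-9.10, 0.74],
    "net_per_10K_2018": [-12.08, -0.66],
    "net_per_10K_2019": [-20.76, -1.97],
})

ax = plot_net_migration(sample, "target_country_name", "Net migration from Zimbabwe")
```

In a notebook the figure is displayed automatically when the cell finishes; in a plain script, call `plt.show()` afterwards.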

# Summary
_what has this dataset told you_
_include what you have learnt from this project_