# PERSONAL REMITTANCES PAID DEFINITION

Personal remittances, paid (current US$)

Personal remittances comprise personal transfers and compensation of employees. Personal transfers consist of all current transfers in cash or in kind made or received by resident households to or from nonresident households. Personal transfers thus include all current transfers between resident and nonresident individuals. Compensation of employees refers to the income of border, seasonal, and other short-term workers who are employed in an economy where they are not resident and of residents employed by nonresident entities. Data are the sum of two items defined in the sixth edition of the IMF's Balance of Payments Manual: personal transfers and compensation of employees. 
Data are in current U.S. dollars.
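
As a minimal sketch of the definition above, the indicator is the sum of its two components. The figures below are hypothetical and only illustrate the arithmetic; they are not taken from the dataset.

```python
# Hypothetical component values for one economy, in current US$ (illustration only)
personal_transfers = 1_200_000_000
compensation_of_employees = 300_000_000

# Personal remittances, paid = personal transfers + compensation of employees
personal_remittances_paid = personal_transfers + compensation_of_employees

print(f"Personal remittances, paid: ${personal_remittances_paid / 1e9:.2f} billion")
```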

### THE OBJECTIVE OF THIS NOTEBOOK IS TO GET THE LIST OF THE TOP 10 COUNTRIES THAT PAY/SEND OUT PERSONAL REMITTANCE FUNDS



# Step 1 : Importing source file (CSV)


In [12]:
%matplotlib notebook
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt 
import io

In [13]:
csv_2 ="Personal_remittances_Paid.csv"

Data_12 = pd.read_csv(csv_2, encoding="ISO-8859-1")

Data_12.head()
# The file has 269 rows (country names) and many missing values

Unnamed: 0,CountryName,CountryCode,IndicatorName,IndicatorCode,1960,1961,1962,1963,1964,1965,...,2008,2009,2010,2011,2012,2013,2014,2015,2016,2017
0,Aruba,ABW,"Personal remittances, paid (current US$)",BM.TRF.PWKR.CD.DT,,,,,,,...,75642458.0,72960894.0,64916201.0,63910615.0,67821230.0,66652370.0,70222690.0,74978010.0,71743530.0,
1,Afghanistan,AFG,"Personal remittances, paid (current US$)",BM.TRF.PWKR.CD.DT,,,,,,,...,153330566.0,257128933.0,402473553.0,240078367.0,167729000.0,274617700.0,279652400.0,150409800.0,118216400.0,86359034.0
2,Angola,AGO,"Personal remittances, paid (current US$)",BM.TRF.PWKR.CD.DT,,,,,,,...,669453676.0,715962016.0,714458886.0,564261542.0,2051321000.0,2395966000.0,2746616000.0,1252909000.0,1176110000.0,961415276.0
3,Albania,ALB,"Personal remittances, paid (current US$)",BM.TRF.PWKR.CD.DT,,,,,,,...,279697508.0,229644909.0,209353148.0,150755566.0,215853300.0,190557700.0,178667100.0,153258400.0,147128100.0,106256732.0
4,Andorra,AND,"Personal remittances, paid (current US$)",BM.TRF.PWKR.CD.DT,,,,,,,...,,,,,,,,,,


# THE RESULT ABOVE LISTS ALL COUNTRIES AND PROVIDES DATA SINCE 1960

# Many periods and countries have NA values, and the list mixes individual countries with regions
# (multiple countries summed together).
# Therefore I filter the data to keep only the six most recent years, 2012 through 2017,
# and only the rows that have data for every one of those years.
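
The filtering described above can be sketched on a small frame. The three rows, the two year columns, and the new column names below are hypothetical stand-ins for the real CSV; the point is only to show `dropna` followed by a row-wise sum converted to billions.

```python
import pandas as pd

# Hypothetical three-row sample in the same wide layout as the source CSV
sample = pd.DataFrame({
    "CountryName": ["A", "B", "C"],
    "2012": [1.0e9, 2.0e9, None],
    "2013": [1.5e9, 2.5e9, 3.0e9],
})

years = ["2012", "2013"]

# Keep only rows that have data for every selected year
complete = sample[["CountryName"] + years].dropna(how="any").copy()

# Sum the year columns row by row and convert to billions of US$
complete["TOTAL (US$ bn)"] = complete[years].sum(axis=1) / 1e9
complete["YEARLY AVG (US$ bn)"] = complete["TOTAL (US$ bn)"] / len(years)

print(complete)
```

Summing with `sum(axis=1)` over a list of year columns is one way to avoid repeating the division for each year; it is an alternative to the explicit per-column sum, not a description of how the notebook itself computes the total.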

In [14]:
# Creating a new dataframe with only necessary columns

D12_NEW = Data_12[['CountryName', 'CountryCode', 'IndicatorName', '2012', '2013', '2014', '2015', '2016', '2017']].dropna(how = 'any')



D12_NEW['6YRS TOTAL'] = (D12_NEW['2012'] /1000000000 + D12_NEW['2013']/1000000000 + D12_NEW['2014']/1000000000 + D12_NEW['2015']/1000000000 + D12_NEW['2016']/1000000000 + D12_NEW['2017']/1000000000)


D12_NEW['YEARLY AVG'] = (D12_NEW['6YRS TOTAL']/6)

D12_NEW

ALL_PAID_2 = D12_NEW.loc[D12_NEW['CountryName'] =='World']



PR_WORLD_2 = ALL_PAID_2[['2012', '2013', '2014', '2015', '2016', '2017']]

PR_WORLD_2


Unnamed: 0,2012,2013,2014,2015,2016,2017
257,358668500000.0,392761100000.0,403011500000.0,394424200000.0,398920200000.0,412269000000.0


In [18]:
#TRANSPOSING COLUMNS TO ROWS TO PLOT A TOTAL REMITTANCE PAID CHART

FLIP_2 = PR_WORLD_2.T.reset_index()

FLIP_2.rename(columns = {'index': 'Year', 257:'PR Paid'}, inplace=True)


FLIP_2


Unnamed: 0,Year,PR Paid
0,2012,358668500000.0
1,2013,392761100000.0
2,2014,403011500000.0
3,2015,394424200000.0
4,2016,398920200000.0
5,2017,412269000000.0


In [20]:
#PLOTTING A CHART

fig1 = plt.figure()


plt.bar(x=np.arange(6),color=('lightblue'),height=FLIP_2['PR Paid'])

#Give it a title
plt.title("PERSONAL REMITTANCES PAID FOR 6 YEARS (ALL COUNTRIES)")

#Give the x axis some labels across the tick marks.
#Argument one is the position for each label
#Argument two is the label values and the final one is to rotate our labels
plt.xticks(np.arange(6), FLIP_2['Year'], rotation=45)

#Give the x and y axes a title
plt.xlabel("Year")
plt.ylabel("PR_Paid")

plt.savefig('PR_PAID_ALL.png')


In [21]:
Yr_Avg_PR_Paid = round(FLIP_2["PR Paid"].mean()/1000000000,2)

print(f" AN AVERAGE OF ${Yr_Avg_PR_Paid} BILLION (DOLLARS) IS PAID YEARLY ACROSS DIFFERENT COUNTRIES GLOBALLY")

 AN AVERAGE OF $393.34 BILLION (DOLLARS) IS PAID YEARLY ACROSS DIFFERENT COUNTRIES GLOBALLY


# PULLING THE LIST OF TOP 10 COUNTRIES PAYING PERSONAL REMITTANCES

In [22]:
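# Take the 55 largest rows rather than 10, because regional and income-group aggregates
# are mixed in with individual countries and have to be skipped
Top_10list12 = D12_NEW.nlargest(55, ['6YRS TOTAL']) 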

Top_10list12

Unnamed: 0,CountryName,CountryCode,IndicatorName,2012,2013,2014,2015,2016,2017,6YRS TOTAL,YEARLY AVG
257,World,WLD,"Personal remittances, paid (current US$)",358668500000.0,392761100000.0,403011500000.0,394424200000.0,398920200000.0,412269000000.0,2360.054417,393.342403
93,High income,HIC,"Personal remittances, paid (current US$)",277958400000.0,299128100000.0,312523400000.0,303604300000.0,313654600000.0,326881500000.0,1833.750297,305.625049
179,OECD members,OED,"Personal remittances, paid (current US$)",206636000000.0,218960800000.0,228336000000.0,218734400000.0,230157800000.0,246100400000.0,1348.925473,224.820912
196,Post-demographic dividend,PST,"Personal remittances, paid (current US$)",201450300000.0,213005400000.0,221862000000.0,211510300000.0,221378100000.0,234041800000.0,1303.248025,217.208004
63,Europe & Central Asia,ECS,"Personal remittances, paid (current US$)",164343800000.0,180390900000.0,180164600000.0,154924200000.0,154670800000.0,168936900000.0,1003.431296,167.238549
140,Late-demographic dividend,LTE,"Personal remittances, paid (current US$)",94101430000.0,105454900000.0,107032100000.0,106761200000.0,102762200000.0,105943900000.0,622.055637,103.67594
71,European Union,EUU,"Personal remittances, paid (current US$)",96504710000.0,103628400000.0,106383000000.0,98360200000.0,102735900000.0,110856500000.0,618.468739,103.078123
101,IDA & IBRD total,IBT,"Personal remittances, paid (current US$)",84519520000.0,97551890000.0,95143740000.0,95977300000.0,91834030000.0,94375210000.0,559.40169,93.233615
151,Middle East & North Africa,MEA,"Personal remittances, paid (current US$)",79516450000.0,91236650000.0,92738000000.0,92036250000.0,90559230000.0,87624450000.0,533.711035,88.951839
138,Low & middle income,LMY,"Personal remittances, paid (current US$)",80710100000.0,93633000000.0,90488060000.0,90819830000.0,85265610000.0,85387510000.0,526.30412,87.717353


In [23]:
Top_10list12['CountryName'].unique()

array(['World', 'High income', 'OECD members',
       'Post-demographic dividend', 'Europe & Central Asia',
       'Late-demographic dividend', 'European Union', 'IDA & IBRD total',
       'Middle East & North Africa', 'Low & middle income', 'IBRD only',
       'Middle income', 'Arab World', 'Euro area', 'North America',
       'Early-demographic dividend', 'Upper middle income',
       'United States', 'East Asia & Pacific',
       'Europe & Central Asia (IDA & IBRD countries)', 'Saudi Arabia',
       'Europe & Central Asia (excluding high income)',
       'Russian Federation', 'Switzerland',
       'East Asia & Pacific (IDA & IBRD countries)',
       'East Asia & Pacific (excluding high income)',
       'Lower middle income', 'Germany', 'Small states',
       'Other small states', 'Kuwait', 'France', 'Luxembourg', 'Qatar',
       'China', 'United Kingdom', 'Italy',
       'Fragile and conflict affected situations', 'Netherlands',
       'Korea, Rep.', 'Oman', 'IDA total', 'Sub-Sahara

# Manually selected the list of the TOP 10 countries, excluding all aggregates

In [24]:
 

Top10list = ( 'United States',  'Saudi Arabia', 'Russian Federation', 'Switzerland','Germany',  'Kuwait', 'France', 'Luxembourg', 'Qatar',
       'China')
#DevelopmentClass = ('Developing', 'Developing', 'Developing', 'Developing', 'Developed', 'Developing', 'Developing', 'Developed','Developing')  
#Region = ('South Asia', 'Asia', 'Europe', 'Central America', 'Europe', 'West Africa', 'North Africa', 'South Asia', 'South Asia', 'South Asia')
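
The manual selection above could also be done programmatically. The sketch below assumes a hand-maintained set of aggregate labels (`AGGREGATES` is a hypothetical name, and the set shown is deliberately incomplete); the totals are rounded values from the output above, in US$ billions.

```python
import pandas as pd

# A few rows from the ranked output above (6-year totals, US$ billions, rounded)
df = pd.DataFrame({
    "CountryName": ["World", "High income", "European Union",
                    "United States", "Saudi Arabia", "Switzerland"],
    "6YRS TOTAL": [2360.05, 1833.75, 618.47, 362.14, 214.15, 151.23],
})

# Assumed, hand-maintained set of aggregate labels to exclude (not exhaustive)
AGGREGATES = {"World", "High income", "European Union"}

# Drop the aggregates, then rank the remaining individual countries
countries_only = df[~df["CountryName"].isin(AGGREGATES)]
top_n = countries_only.nlargest(2, "6YRS TOTAL")

print(top_n)
```

With the full dataset, the same pattern would use `nlargest(10, ...)` and a complete list of the aggregate names that appear in the file.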

# Creating a Dataframe for the top 10 countries

In [25]:
TOP10_PAID_1 = D12_NEW[D12_NEW.CountryName.isin([ 'United States',  'Saudi Arabia', 'Russian Federation', 'Switzerland','Germany',  'Kuwait', 
                                                 'France', 'Luxembourg', 'Qatar','China'])]

TOP10_PAID_1 = TOP10_PAID_1.sort_values('6YRS TOTAL',ascending=False)

TOP10_PAID_1

TOP10_PAID_1['CountryClass'] = ['Developed', 'Developing', 'In-Transit', 'Developed', 'Developed', 
                                'Developing', 'Developed', 'Developed', 'Developing','Developing']

TOP10_PAID_1['Region'] = ['America', 'Western Asia', 'East Europe', 'Europe', 'Europe', 'Western Asia', 'Europe', 'Europe', 'Western Asia', 'Asia']

TOP10_PAID_NEW = TOP10_PAID_1[[ 'CountryName' , 'CountryCode', 'CountryClass' , 'Region', '2012', '2013' , '2014', '2015' , '2016', '2017', '6YRS TOTAL', 'YEARLY AVG']]

TOP10_PAID_NEW.head(10)

Unnamed: 0,CountryName,CountryCode,CountryClass,Region,2012,2013,2014,2015,2016,2017,6YRS TOTAL,YEARLY AVG
249,United States,USA,Developed,America,52652000000.0,55669000000.0,58882000000.0,61859000000.0,65111000000.0,67964000000.0,362.137,60.356167
203,Saudi Arabia,SAU,Developing,Western Asia,29492570000.0,34984190000.0,36924240000.0,38787370000.0,37843210000.0,36118880000.0,214.15045,35.691742
200,Russian Federation,RUS,In-Transit,East Europe,31647700000.0,37216680000.0,32640360000.0,19688840000.0,16244420000.0,20610140000.0,158.04814,26.341357
35,Switzerland,CHE,Developed,Europe,22927180000.0,24663450000.0,25869020000.0,25400420000.0,25774490000.0,26597790000.0,151.232354,25.205392
53,Germany,DEU,Developed,Europe,15587630000.0,19978670000.0,20077990000.0,18033310000.0,20289740000.0,22090770000.0,116.058123,19.34302
125,Kuwait,KWT,Developing,Western Asia,15459200000.0,17711480000.0,18128480000.0,15202540000.0,15287570000.0,13760110000.0,95.549372,15.924895
75,France,FRA,Developed,Europe,12565630000.0,13425170000.0,13725930000.0,12787010000.0,13311520000.0,13503430000.0,79.3187,13.219783
142,Luxembourg,LUX,Developed,Europe,11343200000.0,12236860000.0,12864890000.0,11178500000.0,11640260000.0,12665820000.0,71.929525,11.988254
198,Qatar,QAT,Developing,Western Asia,10412910000.0,11281040000.0,11230220000.0,12192030000.0,11981870000.0,12759340000.0,69.857418,11.642903
38,China,CHN,Developing,Asia,1788059000.0,1714203000.0,4155302000.0,20421780000.0,20286000000.0,16177710000.0,64.543043,10.757174


In [27]:
#Plotting chart of top 10 countries


fig2 = plt.figure()


plt.bar(x=np.arange(10),color=('darkgrey'),height=TOP10_PAID_NEW['YEARLY AVG'])

#Give it a title
plt.title("AVG YEARLY PR PAID (TOP 10 COUNTRIES)")

#Give the x axis some labels across the tick marks.
#Argument one is the position for each label
#Argument two is the label values and the final one is to rotate our labels
plt.xticks(np.arange(10), TOP10_PAID_NEW['CountryCode'], rotation=45)

#Give the x and y axes a title
plt.xlabel("Country Code")
plt.ylabel("Yearly Avg PR Paid (US$ billions)")

plt.savefig('PR_PAID_TOP_10_Countries.png')


# THE UNITED STATES IS THE BIGGEST COUNTRY FOR THE OUTFLOW OF PERSONAL REMITTANCE FUNDS, WITH OVER $60 BILLION SENT OUT ANNUALLY

# TOP 10 PR PAYING COUNTRIES BY COUNTRY CLASS

In [28]:
#group by type to be used by all pie charts
by_class2 = TOP10_PAID_NEW.groupby('CountryClass')[['6YRS TOTAL']]

fig3 = plt.figure()

#total remittances paid (6 years, US$ billions) by country class
grp12 = by_class2.sum()['6YRS TOTAL']

#pie chart build
labels = grp12.index


#colors and explode settings for the pie chart
colors = [ 'lightblue', 'coral', 'skyblue']
explode = [0 ,0, 0]
plt.pie(grp12, startangle = 120, colors = colors, explode = explode, labels = labels, autopct = "%1.1f%%", shadow = True, wedgeprops = {'linewidth': .5, 'edgecolor': 'black'})


# save chart


#pie chart display
plt.title('% of Top 10 Countries Sending PR (by Class)')
plt.axis('equal')
plt.savefig('PR PAID_Top 10_by Class.png')



In [29]:
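TOP10_TTL = TOP10_PAID_NEW['6YRS TOTAL'].sum()
WORLD_TTL = FLIP_2['PR Paid'].sum()

# NOTE: TOP10_TTL is in billions of US$ while WORLD_TTL is in US$, so the ratio printed below
# is understated by a factor of 1e9; with matching units the top 10 share is about 0.586 (58.6%)
avg = TOP10_TTL / WORLD_TTL 

print(TOP10_TTL)
print(WORLD_TTL )
print(avg)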

TOP10_PAID_NEW.nunique()

1382.8241240729997
2360054416882.0
5.859289151052398e-10


CountryName     10
CountryCode     10
CountryClass     3
Region           5
2012            10
2013            10
2014            10
2015            10
2016            10
2017            10
6YRS TOTAL      10
YEARLY AVG      10
dtype: int64
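
The two totals printed above are in different units, so a share only makes sense after converting one of them. A minimal sketch, using the printed values directly (the variable names are illustrative, not the notebook's):

```python
# Values copied from the printed output above
top10_total_bn = 1382.8241240729997   # sum of '6YRS TOTAL', in US$ billions
world_total_usd = 2360054416882.0     # sum of 'PR Paid', in US$

# Convert the world total to billions so both figures share a unit
world_total_bn = world_total_usd / 1e9

# Share of world remittances paid by the top 10 countries
share = top10_total_bn / world_total_bn

print(f"Top 10 share of world remittances paid: {share:.1%}")
```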

# TOP 10 PAYING COUNTRIES BY REGION 

In [31]:
#group by type to be used by all pie charts
by_region2 = TOP10_PAID_NEW.groupby('Region')[['6YRS TOTAL']]


fig4 = plt.figure()


#total remittances paid (6 years, US$ billions) by region
grp22 = by_region2.sum()['6YRS TOTAL']

#pie chart build
label2 = grp22.index


#colors and explode settings for the pie chart
colors = [ 'lightskyblue', 'grey', 'lightblue', 'lightgreen','green']
explode = [0,0,0,0,0]
plt.pie(grp22, startangle = 120,  explode = explode, labels = label2, colors= colors, autopct = "%1.1f%%", shadow = True, wedgeprops = {'linewidth': .5, 'edgecolor': 'black'})


# save chart


#pie chart display
plt.title('% of Top 10 Countries Sending PR (by Region)')
plt.axis('equal')
plt.savefig('PR PAID_Top 10_by Region.png')


# INSERTING COUNTRY GDP DATA 

In [32]:
csv_22 ="Country_GDP.csv"

Data_GDP1 = pd.read_csv(csv_22, encoding="ISO-8859-1")


Data_GDP1.head()
# The file has 269 rows (country names) and many missing values

Unnamed: 0,CountryName,CountryCode,IndicatorName,IndicatorCode,1960,1961,1962,1963,1964,1965,...,2009,2010,2011,2012,2013,2014,2015,2016,2017,2018
0,Aruba,ABW,GDP (current US$),NY.GDP.MKTP.CD,,,,,,,...,2498883000.0,2390503000.0,2549721000.0,2534637000.0,2581564000.0,2649721000.0,2691620000.0,2646927000.0,2700559000.0,
1,Afghanistan,AFG,GDP (current US$),NY.GDP.MKTP.CD,537777811.0,548888896.0,546666678.0,751111191.0,800000044.0,1006667000.0,...,12439090000.0,15856570000.0,17804290000.0,19907320000.0,20561070000.0,20484890000.0,19907110000.0,19046360000.0,19543980000.0,
2,Angola,AGO,GDP (current US$),NY.GDP.MKTP.CD,,,,,,,...,70307160000.0,83799500000.0,111790000000.0,128053000000.0,136710000000.0,145712000000.0,116194000000.0,101124000000.0,122124000000.0,
3,Albania,ALB,GDP (current US$),NY.GDP.MKTP.CD,,,,,,,...,12044210000.0,11926960000.0,12890870000.0,12319780000.0,12776280000.0,13228250000.0,11386930000.0,11883680000.0,13038540000.0,
4,Andorra,AND,GDP (current US$),NY.GDP.MKTP.CD,,,,,,,...,3660531000.0,3355695000.0,3442063000.0,3164615000.0,3281585000.0,3350736000.0,2811489000.0,2877312000.0,3012914000.0,


# DATAFRAME TO SHOW ONLY THE NECESSARY PERIODS 

In [33]:
# Creating a new dataframe with only necessary columns

GDP_PAID = Data_GDP1[['CountryName', 'CountryCode', 'IndicatorName', '2012', '2013', '2014', '2015', '2016', '2017']].dropna(how = 'any')



GDP_PAID['6YRS GDP TOTAL'] = (GDP_PAID['2012'] /1000000000 + GDP_PAID['2013']/1000000000 + GDP_PAID['2014']/1000000000 + GDP_PAID['2015']/1000000000 + GDP_PAID['2016']/1000000000 + GDP_PAID['2017']/1000000000)


GDP_PAID['YEARLY GDP AVG'] = (GDP_PAID['6YRS GDP TOTAL']/6)

GDP_PAID


Unnamed: 0,CountryName,CountryCode,IndicatorName,2012,2013,2014,2015,2016,2017,6YRS GDP TOTAL,YEARLY GDP AVG
0,Aruba,ABW,GDP (current US$),2.534637e+09,2.581564e+09,2.649721e+09,2.691620e+09,2.646927e+09,2.700559e+09,15.805028,2.634171
1,Afghanistan,AFG,GDP (current US$),1.990732e+10,2.056107e+10,2.048489e+10,1.990711e+10,1.904636e+10,1.954398e+10,119.450718,19.908453
2,Angola,AGO,GDP (current US$),1.280530e+11,1.367100e+11,1.457120e+11,1.161940e+11,1.011240e+11,1.221240e+11,749.917000,124.986167
3,Albania,ALB,GDP (current US$),1.231978e+10,1.277628e+10,1.322825e+10,1.138693e+10,1.188368e+10,1.303854e+10,74.633466,12.438911
4,Andorra,AND,GDP (current US$),3.164615e+09,3.281585e+09,3.350736e+09,2.811489e+09,2.877312e+09,3.012914e+09,18.498652,3.083109
5,Arab World,ARB,GDP (current US$),2.786790e+12,2.866860e+12,2.908390e+12,2.560750e+12,2.513940e+12,2.586310e+12,16223.040000,2703.840000
6,United Arab Emirates,ARE,GDP (current US$),3.745910e+11,3.901080e+11,4.031370e+11,3.581350e+11,3.570450e+11,3.825750e+11,2265.591000,377.598500
7,Argentina,ARG,GDP (current US$),5.459820e+11,5.520250e+11,5.263200e+11,5.947490e+11,5.548610e+11,6.374300e+11,3411.367000,568.561167
8,Armenia,ARM,GDP (current US$),1.061932e+10,1.112147e+10,1.160951e+10,1.055334e+10,1.054614e+10,1.153659e+10,65.986362,10.997727
9,American Samoa,ASM,GDP (current US$),6.440000e+08,6.410000e+08,6.430000e+08,6.610000e+08,6.530000e+08,6.340000e+08,3.876000,0.646000


In [34]:
GDP_PAID_TOP10 = GDP_PAID.loc[GDP_PAID['CountryName'].isin (['United States',  'Saudi Arabia', 'Russian Federation', 'Switzerland','Germany',  'Kuwait', 
                                                 'France', 'Luxembourg', 'Qatar','China'])]

#print(df.loc[df['A'] == 'foo']) - single row value
#df.loc[df['B'].isin(['one','three'])] - multiple row value 

PR_WORLD_GDP = GDP_PAID_TOP10[['CountryName', 'CountryCode', '2012', '2013', '2014', '2015', '2016', '2017', '6YRS GDP TOTAL', 'YEARLY GDP AVG']]

PR_WORLD_GDP

Unnamed: 0,CountryName,CountryCode,2012,2013,2014,2015,2016,2017,6YRS GDP TOTAL,YEARLY GDP AVG
35,Switzerland,CHE,668044000000.0,688504000000.0,709183000000.0,679289000000.0,668745000000.0,678887000000.0,4092.652,682.108667
38,China,CHN,8560550000000.0,9607220000000.0,10482400000000.0,11064700000000.0,11191000000000.0,12237700000000.0,63143.57,10523.928333
53,Germany,DEU,3543980000000.0,3752510000000.0,3890610000000.0,3375610000000.0,3477800000000.0,3677440000000.0,21717.95,3619.658333
75,France,FRA,2683830000000.0,2811080000000.0,2852170000000.0,2438210000000.0,2465130000000.0,2582500000000.0,15832.92,2638.82
125,Kuwait,KWT,174070000000.0,174161000000.0,162631000000.0,114567000000.0,110912000000.0,120126000000.0,856.467,142.7445
142,Luxembourg,LUX,56677960000.0,61739350000.0,66327340000.0,57784500000.0,58631320000.0,62404460000.0,363.564939,60.594157
198,Qatar,QAT,186834000000.0,198728000000.0,206225000000.0,161740000000.0,151732000000.0,166929000000.0,1072.188,178.698
200,Russian Federation,RUS,2210260000000.0,2297130000000.0,2063660000000.0,1368400000000.0,1284730000000.0,1577520000000.0,10801.7,1800.283333
203,Saudi Arabia,SAU,735975000000.0,746647000000.0,756350000000.0,654270000000.0,644936000000.0,686738000000.0,4224.916,704.152667
249,United States,USA,16155300000000.0,16691500000000.0,17427600000000.0,18120700000000.0,18624500000000.0,19390600000000.0,106410.2,17735.033333


In [35]:
combined22 = PR_WORLD_GDP.merge(TOP10_PAID_NEW, on=['CountryName', 'CountryCode']).reset_index()

In [36]:
combined22 = combined22[['CountryName', 'CountryCode', 'Region', 'CountryClass','6YRS TOTAL', '6YRS GDP TOTAL', 'YEARLY AVG', 'YEARLY GDP AVG']]

In [37]:
combined22

Unnamed: 0,CountryName,CountryCode,Region,CountryClass,6YRS TOTAL,6YRS GDP TOTAL,YEARLY AVG,YEARLY GDP AVG
0,Switzerland,CHE,Europe,Developed,151.232354,4092.652,25.205392,682.108667
1,China,CHN,Asia,Developing,64.543043,63143.57,10.757174,10523.928333
2,Germany,DEU,Europe,Developed,116.058123,21717.95,19.34302,3619.658333
3,France,FRA,Europe,Developed,79.3187,15832.92,13.219783,2638.82
4,Kuwait,KWT,Western Asia,Developing,95.549372,856.467,15.924895,142.7445
5,Luxembourg,LUX,Europe,Developed,71.929525,363.564939,11.988254,60.594157
6,Qatar,QAT,Western Asia,Developing,69.857418,1072.188,11.642903,178.698
7,Russian Federation,RUS,East Europe,In-Transit,158.04814,10801.7,26.341357,1800.283333
8,Saudi Arabia,SAU,Western Asia,Developing,214.15045,4224.916,35.691742,704.152667
9,United States,USA,America,Developed,362.137,106410.2,60.356167,17735.033333
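
Both input frames carry year columns with the same names, so a merge like the one above renames the overlapping columns with pandas' default `_x`/`_y` suffixes. A minimal sketch of making that explicit, using two rows of 2017 values from the tables above (the suffix names and the `validate` check are illustrative choices, not part of the original notebook):

```python
import pandas as pd

# GDP for two countries, 2017, current US$ (from the GDP table above)
gdp = pd.DataFrame({
    "CountryCode": ["CHE", "USA"],
    "2017": [678887000000.0, 19390600000000.0],
})

# Personal remittances paid for the same countries, 2017, current US$
paid = pd.DataFrame({
    "CountryCode": ["USA", "CHE"],
    "2017": [67964000000.0, 26597790000.0],
})

# Explicit suffixes label the overlapping '2017' columns;
# validate raises an error if a country code is duplicated on either side
merged = gdp.merge(paid, on="CountryCode", how="inner",
                   suffixes=("_gdp", "_paid"), validate="one_to_one")

print(merged)
```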


In [38]:
# Select and rename the columns needed for the bubble plot

# .copy() avoids pandas' SettingWithCopyWarning when the 'Class' column is added below
combined_22 = combined22[['CountryClass', "CountryName" ,'YEARLY AVG', 'YEARLY GDP AVG']].copy()

# Numeric code per country class, used to scale the bubble sizes
combined_22['Class'] = combined_22['CountryClass'].map({'Developed': 1, 'In-Transit': 2, 'Developing': 3})

combined_22 = combined_22.rename(columns={'YEARLY AVG' : "PR_YR_AVG", "YEARLY GDP AVG" : "GDP_YR_AVG"})

combined_22


Unnamed: 0,CountryClass,CountryName,PR_YR_AVG,GDP_YR_AVG,Class
0,Developed,Switzerland,25.205392,682.108667,1
1,Developing,China,10.757174,10523.928333,3
2,Developed,Germany,19.34302,3619.658333,1
3,Developed,France,13.219783,2638.82,1
4,Developing,Kuwait,15.924895,142.7445,3
5,Developed,Luxembourg,11.988254,60.594157,1
6,Developing,Qatar,11.642903,178.698,3
7,In-Transit,Russian Federation,26.341357,1800.283333,2
8,Developing,Saudi Arabia,35.691742,704.152667,3
9,Developed,United States,60.356167,17735.033333,1


# BUBBLE PLOT TO SHOW THE RELATIONSHIP BETWEEN AVERAGE YEARLY REMITTANCES SENT (x-axis) AND THE GDP VALUE OF THE COUNTRY (y-axis)

In [40]:
#plt.scatter(combined_22.GDP_YR_AVG, combined_22.PR_YR_AVG, s=combined_22.Class)

# DataFrame.plot has no 'axis' keyword; the axis limits are set with xlim and ylim instead
combined_22.plot(kind='scatter', x="PR_YR_AVG", y='GDP_YR_AVG', s=combined_22.Class, xlim=(10, 65), ylim=(0, 18000))

In [41]:
Developed = combined_22[combined_22['CountryClass']=='Developed']
Developing = combined_22[combined_22['CountryClass']=='Developing']
Transiting = combined_22[combined_22['CountryClass']=='In-Transit']

# Start a new figure so the bubbles are not drawn onto the previous chart
fig5 = plt.figure()

plt.scatter(Developed['PR_YR_AVG'], Developed['GDP_YR_AVG'],  s = Developed['Class']*800, marker='o', color ='Orange', edgecolor ='black', alpha=0.99, label='Developed' )
plt.scatter(Developing['PR_YR_AVG'], Developing['GDP_YR_AVG'], s = Developing['Class']*600, marker='o', color ='LightSkyBlue', edgecolor ='black', alpha=0.99, label='Developing' )
plt.scatter(Transiting['PR_YR_AVG'], Transiting['GDP_YR_AVG'], s = Transiting['Class']*500, marker='o', color ='LightCoral', edgecolor ='black', alpha=0.99, label='Transiting' )
plt.grid()



# labels definition 

plt.title("Personal Remittances Sent (Top 10 Countries)")
plt.xlabel("Average PR Sent Yearly (US$ billions)")
plt.ylabel("Average Yearly GDP (US$ billions)")

# country class legend

lgnd=plt.legend(title="CountryClass", loc = "lower right", frameon = True )

# Set the axis limits as [xmin, xmax, ymin, ymax]
plt.axis([10, 65, 0, 19000])

# Use equal-sized legend markers (the attribute is legend_handles in Matplotlib >= 3.7, legendHandles before)
for handle in getattr(lgnd, "legend_handles", None) or lgnd.legendHandles:
    handle.set_sizes([15])

plt.savefig('BubbleploT PR Sent (Top 10 Countries).png')      

plt.show()


# CHART RESULTS EXPLAINED IN POWER POINT FILE

# THE END 