## QOLT Tracker

This notebook examines data gathered from web scraping and from a CSV file.
The data covers countries that have publicly available software engineer salary figures. I will then collect the benefits offered in each country to determine the best place to live as a software developer.

In [None]:
import pandas as pd
import numpy as np
import web_scrape
import nbformat
import plotly.express as px
import plotly.io as pio
import plotly.graph_objects as go

# TODO: resolve the rendering errors Plotly raises in this notebook.
pio.renderers.default = 'notebook'

df = pd.read_csv('./data/Cost of Living Index.csv', header='infer')

### Cost of Living Index Scores

1. <u>Cost of Living Index (Excl. Rent)</u>: This index measures the relative prices of consumer goods and services (including groceries, dining, transportation, and utilities) but excludes housing costs like rent or mortgage payments. For example, a Cost of Living Index of <u>120 suggests that everyday expenses are 20% higher</u> than in New York City (NYC), excluding rent.
2. <u>Rent Index</u>: This index evaluates the average cost of renting apartments in a city compared to New York City. An <u>index value of 80 indicates</u> that rental prices are approximately <u>20% lower</u> than those in New York City.
3. <u>Groceries Index</u>: This index assesses the <u>cost of grocery items</u> in a city relative to New York City. It is calculated using the weighted prices of various food items commonly found in the "Markets" section of Numbeo's database.
4. <u>Restaurants Index</u>: This index compares the <u>prices of meals and beverages in restaurants and bars</u> between a given city and New York City, reflecting dining out expenses.
5. <u>Cost of Living Plus Rent Index</u>: This comprehensive index combines the costs captured in the <u>Cost of Living Index and the Rent Index</u>, providing an <u>overall comparison</u> of both everyday expenses and housing costs relative to New York City.
6. <u>Local Purchasing Power</u>: This index indicates the relative purchasing power of residents in a city based on the average net salary. A value of <u>40 implies that, on average</u>, residents can afford <u>60% fewer goods</u> and services than those in New York City earning an average salary.
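
Because each index above is read against a New York City baseline of 100, the percentage gap is the index value minus 100. A minimal sketch of that arithmetic, using the three example values from the list (the helper name `percent_vs_nyc` is hypothetical and not part of the notebook):

```python
def percent_vs_nyc(index_value: float) -> float:
    """Return the gap to the NYC baseline of 100, in percent (+ is higher, - is lower)."""
    return index_value - 100.0


print(percent_vs_nyc(120))  # 20.0  -> 20% higher than NYC
print(percent_vs_nyc(80))   # -20.0 -> 20% lower than NYC
print(percent_vs_nyc(40))   # -60.0 -> 60% lower than NYC
```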


In [2]:
# Preview the first rows. 'Rank' stays in df because later cells select it.
df.head()

Unnamed: 0,Rank,Country,Cost of Living Index,Rent Index,Cost of Living Plus Rent Index,Groceries Index,Restaurant Price Index,Local Purchasing Power Index
0,1,Cayman Islands,108.2,76.3,94.3,113.9,96.6,149.7
1,2,Switzerland,106.8,50.6,82.3,111.9,100.3,177.8
2,3,Iceland,94.5,50.1,75.2,102.5,99.3,120.1
3,4,Bahamas,85.4,47.1,68.7,88.7,89.6,62.8
4,5,Singapore,85.3,75.1,80.9,77.7,54.3,103.0


In [4]:
# Column names for the scraped (country, salary) records.
column_names = ['Country', 'Salary']


df2 = pd.DataFrame.from_records(web_scrape.results, columns=column_names)
df2.head()

Unnamed: 0,Country,Salary
0,United States,"$110,140"
1,Switzerland,"$97,518"
2,Israel,"$71,559"
3,Denmark,"$63,680"
4,Canada,"$61,680"


In [7]:
df_joined = df.merge(df2, on='Country', how='inner')
df_joined

Unnamed: 0,Rank,Country,Cost of Living Index,Rent Index,Cost of Living Plus Rent Index,Groceries Index,Restaurant Price Index,Local Purchasing Power Index,Salary
0,2,Switzerland,106.8,50.6,82.3,111.9,100.3,177.8,"$97,518"
1,5,Singapore,85.3,75.1,80.9,77.7,54.3,103.0,"$41,864"
2,6,Norway,78.9,28.2,56.8,81.1,82.4,129.9,"$57,013"
3,7,Denmark,74.1,28.8,54.3,67.6,85.1,148.6,"$63,680"
4,11,Israel,69.6,28.7,51.8,65.1,75.3,124.2,"$71,559"
5,13,Netherlands,68.1,37.4,54.7,63.0,74.0,139.5,"$45,180"
6,15,Ireland,66.6,44.7,57.0,64.7,69.6,117.4,"$48,427"
7,17,United States,64.8,41.3,54.6,71.3,65.5,157.2,"$110,140"
8,18,Germany,64.7,25.4,47.6,62.1,60.8,140.4,"$52,275"
9,19,Finland,64.5,21.1,45.6,64.9,67.0,139.5,"$47,850"
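
An inner merge keeps only the countries present in both tables, so any country whose name is missing or spelled differently on one side disappears without a message. The following is a rough sketch of one way to see what would be dropped, using an outer merge with `indicator=True` on toy data (the frames `cost` and `salary` are stand-ins with a few values copied from the outputs above, not the real tables):

```python
import pandas as pd

# Toy stand-ins for the cost-of-living table and the scraped salary table.
cost = pd.DataFrame({
    "Country": ["Cayman Islands", "Switzerland", "Iceland"],
    "Cost of Living Index": [108.2, 106.8, 94.5],
})
salary = pd.DataFrame({
    "Country": ["United States", "Switzerland"],
    "Salary": ["$110,140", "$97,518"],
})

# An outer merge with indicator=True adds a '_merge' column that records
# whether each row came from the left table, the right table, or both.
check = cost.merge(salary, on="Country", how="outer", indicator=True)

# Rows not marked 'both' are the ones how='inner' would drop.
unmatched = check.loc[check["_merge"] != "both", ["Country", "_merge"]]
print(unmatched)
```

Reviewing `unmatched` before switching back to `how='inner'` makes it easier to spot naming mismatches between the two sources.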


In [None]:
# Convert salary strings such as "$110,140" to integers.
df_joined['Salary'] = df_joined['Salary'].replace("[$,]", "", regex=True).astype(int)

# Map each source column to the name of its difference-from-US column.
difference_columns = {
    'Cost of Living Index': 'Cost of Living Difference',
    'Rent Index': 'Rent Difference',
    'Cost of Living Plus Rent Index': 'Cost of Living Plus Rent Index Difference',
    'Groceries Index': 'Groceries Difference',
    'Restaurant Price Index': 'Restaurant Price Difference',
    'Local Purchasing Power Index': 'Local Purchasing Power Difference',
    'Salary': 'Salary Difference',
}

# Subtract the United States value from every row of each source column.
is_us = df_joined['Country'] == 'United States'
for source, target in difference_columns.items():
    df_joined[target] = df_joined[source] - df_joined.loc[is_us, source].values[0]
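
The salary step relies on two ideas: stripping the currency formatting so the strings can be cast to integers, and subtracting a single baseline value from a whole column. A small self-contained sketch on toy data (the frame `toy` is hypothetical; its three salaries are copied from the scraped output above):

```python
import pandas as pd

toy = pd.DataFrame({
    "Country": ["United States", "Switzerland", "Denmark"],
    "Salary": ["$110,140", "$97,518", "$63,680"],
})

# Remove the dollar sign and thousands separators, then cast to int.
toy["Salary"] = toy["Salary"].str.replace(r"[$,]", "", regex=True).astype(int)

# Look up the single baseline value, then subtract it from every row.
us_salary = toy.loc[toy["Country"] == "United States", "Salary"].iloc[0]
toy["Salary Difference"] = toy["Salary"] - us_salary

print(toy)
```

A negative difference means the country's listed salary is below the United States figure.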

In [9]:
# Reorder columns so each index sits next to its difference from the United States.
df_joined = df_joined[[
    'Rank', 'Country',
    'Cost of Living Index', 'Cost of Living Difference',
    'Rent Index', 'Rent Difference',
    'Cost of Living Plus Rent Index', 'Cost of Living Plus Rent Index Difference',
    'Groceries Index', 'Groceries Difference',
    'Restaurant Price Index', 'Restaurant Price Difference',
    'Local Purchasing Power Index', 'Local Purchasing Power Difference',
    'Salary', 'Salary Difference',
]]
df_joined

Unnamed: 0,Rank,Country,Cost of Living Index,Cost of Living Difference,Rent Index,Rent Difference,Cost of Living Plus Rent Index,Cost of Living Plus Rent Index Difference,Groceries Index,Groceries Difference,Restaurant Price Index,Restaurant Price Difference,Local Purchasing Power Index,Local Purchasing Power Difference,Salary,Salary Difference
0,2,Switzerland,106.8,42.0,50.6,9.3,82.3,27.7,111.9,40.6,100.3,34.8,177.8,20.6,97518,-12622
1,5,Singapore,85.3,20.5,75.1,33.8,80.9,26.3,77.7,6.4,54.3,-11.2,103.0,-54.2,41864,-68276
2,6,Norway,78.9,14.1,28.2,-13.1,56.8,2.2,81.1,9.8,82.4,16.9,129.9,-27.3,57013,-53127
3,7,Denmark,74.1,9.3,28.8,-12.5,54.3,-0.3,67.6,-3.7,85.1,19.6,148.6,-8.6,63680,-46460
4,11,Israel,69.6,4.8,28.7,-12.6,51.8,-2.8,65.1,-6.2,75.3,9.8,124.2,-33.0,71559,-38581
5,13,Netherlands,68.1,3.3,37.4,-3.9,54.7,0.1,63.0,-8.3,74.0,8.5,139.5,-17.7,45180,-64960
6,15,Ireland,66.6,1.8,44.7,3.4,57.0,2.4,64.7,-6.6,69.6,4.1,117.4,-39.8,48427,-61713
7,17,United States,64.8,0.0,41.3,0.0,54.6,0.0,71.3,0.0,65.5,0.0,157.2,0.0,110140,0
8,18,Germany,64.7,-0.1,25.4,-15.9,47.6,-7.0,62.1,-9.2,60.8,-4.7,140.4,-16.8,52275,-57865
9,19,Finland,64.5,-0.3,21.1,-20.2,45.6,-9.0,64.9,-6.4,67.0,1.5,139.5,-17.7,47850,-62290


In [39]:
# assign() returns a new DataFrame, which avoids pandas' SettingWithCopyWarning here.
us_salary = df_joined.loc[df_joined['Country'] == 'United States', 'Salary'].values[0]
df_joined = df_joined.assign(**{'US Salary': us_salary})
df_joined.head()

Unnamed: 0,Rank,Country,Cost of Living Index,Cost of Living Difference,Rent Index,Rent Difference,Cost of Living Plus Rent Index,Cost of Living Plus Rent Index Difference,Groceries Index,Groceries Difference,Restaurant Price Index,Restaurant Price Difference,Local Purchasing Power Index,Local Purchasing Power Difference,Salary,Salary Difference,US Salary
0,2,Switzerland,106.8,42.0,50.6,9.3,82.3,27.7,111.9,40.6,100.3,34.8,177.8,20.6,97518,-12622,110140
1,5,Singapore,85.3,20.5,75.1,33.8,80.9,26.3,77.7,6.4,54.3,-11.2,103.0,-54.2,41864,-68276,110140
2,6,Norway,78.9,14.1,28.2,-13.1,56.8,2.2,81.1,9.8,82.4,16.9,129.9,-27.3,57013,-53127,110140
3,7,Denmark,74.1,9.3,28.8,-12.5,54.3,-0.3,67.6,-3.7,85.1,19.6,148.6,-8.6,63680,-46460,110140
4,11,Israel,69.6,4.8,28.7,-12.6,51.8,-2.8,65.1,-6.2,75.3,9.8,124.2,-33.0,71559,-38581,110140
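
When a DataFrame is built by selecting columns from another one, as `df_joined` is in the column-reordering cell, pandas may emit a `SettingWithCopyWarning` on later column assignments. A minimal sketch of one common way to sidestep it, taking an explicit `.copy()` before assigning (toy data; the names `full` and `subset` are hypothetical):

```python
import pandas as pd

full = pd.DataFrame({
    "Country": ["United States", "Switzerland"],
    "Rank": [17, 2],
    "Salary": [110140, 97518],
})

# .copy() makes the column subset an independent DataFrame, so the
# assignment below cannot be mistaken for a write into `full`.
subset = full[["Country", "Salary"]].copy()

# Broadcast the single United States salary to every row.
subset["US Salary"] = subset.loc[subset["Country"] == "United States", "Salary"].iloc[0]

print(subset)
```

The original frame is left untouched, which is usually what is intended when a reordered or trimmed view is only used for display and plotting.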


In [52]:
#fig = px.bar(x = df_joined['Country'], y = df_joined['Salary'])
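fig = go.Figure()

# Background bars: the United States salary repeated for every country.
# With barmode='overlay', the gray left visible above each blue bar is the gap to the US.
fig.add_trace(go.Bar(
    x=df_joined['Country'],
    y=df_joined['US Salary'],
    name='Difference from United States',
    marker_color='lightgray'
))

# Foreground bars: each country's own salary.
fig.add_trace(go.Bar(
    x=df_joined['Country'],
    y=df_joined['Salary'],
    name='Salary',
    marker_color='lightblue'
))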

fig.update_layout(
    barmode='overlay',
    title='Salary Compared to United States',
    xaxis_title='Country',
    yaxis_title='Salary (USD)',
    legend_title='Legend',
    xaxis=dict(
        tickangle=-45
    )
)

fig.show()

ValueError: Mime type rendering requires nbformat>=4.2.0 but it is not installed
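
The `ValueError` above reports that Plotly's notebook renderer needs `nbformat` 4.2.0 or newer. A minimal sketch for checking what the running kernel actually sees, using only the standard library (the helper name `meets_minimum` is hypothetical, and the simple numeric parsing is an assumption that ignores pre-release suffixes):

```python
import re
from importlib.metadata import PackageNotFoundError, version


def meets_minimum(package: str, minimum: tuple) -> bool:
    """Return True if `package` is installed with a version >= `minimum`."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        # The package is not visible to this interpreter at all.
        return False
    # Keep the leading numeric components, e.g. "5.10.4" -> (5, 10, 4),
    # padding short versions such as "5.10" with zeros.
    numbers = tuple(int(part) for part in re.findall(r"\d+", installed)[:3])
    numbers = (numbers + (0, 0, 0))[:3]
    return numbers >= minimum


print(meets_minimum("nbformat", (4, 2, 0)))
```

If this prints `False`, installing or upgrading `nbformat` in the same environment as the kernel and then restarting the kernel is the usual remedy, though whether that resolves the error in this particular setup is an assumption.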