<a href="https://colab.research.google.com/github/JorgeJaramilo060892/Data-Analyst/blob/main/Plotly_Gap_Minder.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Data Exploration
In this notebook I focus only on the year 2007 and use it to compare different types of data visualization.

In [20]:
import plotly.express as px

In [21]:
df = px.data.gapminder().query("year == 2007")
df

index,country,continent,year,lifeExp,pop,gdpPercap,iso_alpha,iso_num
11,Afghanistan,Asia,2007,43.828,31889923,974.580338,AFG,4
23,Albania,Europe,2007,76.423,3600523,5937.029526,ALB,8
35,Algeria,Africa,2007,72.301,33333216,6223.367465,DZA,12
47,Angola,Africa,2007,42.731,12420476,4797.231267,AGO,24
59,Argentina,Americas,2007,75.320,40301927,12779.379640,ARG,32
...,...,...,...,...,...,...,...,...
1655,Vietnam,Asia,2007,74.249,85262356,2441.576404,VNM,704
1667,West Bank and Gaza,Asia,2007,73.422,4018332,3025.349798,PSE,275
1679,"Yemen, Rep.",Asia,2007,62.698,22211743,2280.769906,YEM,887
1691,Zambia,Africa,2007,42.384,11746035,1271.211593,ZMB,894


## To do some basic data exploration, let's look at "Life Expectancy"

In [28]:
fig = px.strip(df, x="lifeExp", color="continent", hover_name="country")
fig.show()

- The lowest life expectancy is about 39.6 years, in Swaziland.
- The highest life expectancy is about 82.6 years, in Japan.

Both values are for the year 2007.

In [32]:
fig = px.bar(df, color="lifeExp", x="pop", y="continent", hover_name="country", height=500)
fig.show()

This chart uses the same data with three variables:
- Continent
- Population
- Life Expectancy

Each colored segment of a bar is one country, so the presentation is totally different from the strip plot.

Another perspective on this data is to treat it as a part-to-whole relationship, where every person lives in a country and every country is in a continent.

In [35]:
fig = px.sunburst(df, color="lifeExp", values="pop", path=["continent", "country"], hover_name="country", height=500)
fig.show()

The sunburst diagram gives a nested representation of three variables: the continent/country hierarchy, population (sector size), and life expectancy (color).

In [36]:
fig = px.treemap(df, color="lifeExp", values="pop", path=["continent", "country"], hover_name="country", height=500)
fig.show()

The treemap takes the same set of arguments, but draws nested rectangles instead of concentric rings.

In [37]:
fig = px.choropleth(df, color="lifeExp", locations="iso_alpha", hover_name="country", height=500)
fig.show()

# Relationship between "Life Expectancy" and "GDP per Capita"

In [40]:
fig = px.scatter(df, y="lifeExp", x="gdpPercap", hover_name="country", color="continent", size="pop", size_max=60, log_x=True, height=500)
fig.show()

In [62]:
fig = px.scatter(df, y="lifeExp", x="gdpPercap", hover_name="country", color="continent", size="pop", size_max=60,
                 log_x=True, height=800, width=1000, template="simple_white",
                 color_discrete_sequence=px.colors.qualitative.G10,
                 title="Health vs Wealth 2007",
                 labels=dict(
                    continent="Continent", pop="Population",
                    gdpPercap="GDP per Capita (US$, price-adjusted)",
                    lifeExp= "Life Expectancy (years)"))
# Update the graphic design
fig.update_layout(font_family="Rockwell",
                  legend=dict(orientation="h", title="", y=1.1, x=1, xanchor="right", yanchor="bottom"))

# Update the x and y axes (x is logarithmic, so its range is in log10 units: 10^2 to 10^5)
fig.update_xaxes(tickprefix="$", range=[2,5], dtick=1)
fig.update_yaxes(range=[30,90])
# Add dotted reference lines at the population-weighted averages
fig.add_hline(y=((df['lifeExp']*df['pop']).sum()/df['pop'].sum()), line=dict(width=1, dash='dot'))  # Horizontal line
fig.add_vline(x=((df['gdpPercap']*df['pop']).sum()/df['pop'].sum()), line=dict(width=1, dash='dot'))  # Vertical line
fig.show()


fig.write_html("gapminder_2007.html")  # interactive export
fig.write_json("gapminder_2007.json")  # serialized export


# Now let's take a look at the DataFrame "data.wind".

In [63]:
df = px.data.wind()
fig = px.bar_polar(df, r="frequency", theta="direction", height=600,
                   color="strength", template="plotly_dark",
                   color_discrete_sequence= px.colors.sequential.Plasma_r)
fig.show()
