![Callysto.ca Banner](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-top.jpg?raw=true)

In this notebook we are going to analyze some statistics by country.

## Prep work

Run the cell below to load the libraries:

In [None]:
import pandas as pd

#to enable plotting in colab: load require.js before every cell runs
def enable_plotly_in_cell():
    from IPython.display import display, HTML
    from plotly.offline import init_notebook_mode
    display(HTML('''
        <script src="/static/components/requirejs/require.js"></script>
    '''))
    init_notebook_mode(connected=False)

#register the function so it is called before each cell is executed
get_ipython().events.register('pre_run_cell', enable_plotly_in_cell)

#set of 22 colors (20 distinct colors plus white and black) that may be useful for plotting
colors20 = ['#e6194b', '#3cb44b', '#ffe119', '#4363d8', '#f58231', '#911eb4', '#46f0f0', 
          '#f032e6', '#bcf60c', '#fabebe', '#008080', '#e6beff', '#9a6324', '#fffac8', 
          '#800000', '#aaffc3', '#808000', '#ffd8b1', '#000075', '#808080', '#ffffff', '#000000']

## Read and explore input DataFrame

### countries
This dataframe was created by [Bootstrap](https://www.bootstrapworld.org/index.shtml) and can be downloaded from [this Google spreadsheet](https://docs.google.com/spreadsheets/d/19VoYxPw0tmuSViN1qFIkyUoepjNSRsuQCe0TZZDmrZs/edit#gid=213565368).


Data were aggregated from the following sources:
- The World Factbook:
  - [GDP (PPP)](https://www.cia.gov/library/publications/the-world-factbook/rankorder/2001rank.html)
  - [Life expectancy at birth](https://www.cia.gov/library/publications/the-world-factbook/fields/355rank.html)
  - [Population](https://www.cia.gov/library/publications/the-world-factbook/fields/335rank.html)
- Wikipedia:
  - [Universal Health Care](https://en.wikipedia.org/wiki/List_of_countries_with_universal_health_care)
 
Some countries/territories/regions were omitted from the dataset due to incomplete data.

Column description:

**gdp (\$US)** - the total value of all goods and services produced in the country, valued at prices prevailing in the United States.

**life-expectancy (yrs)** - the average number of years to be lived by a group of people born in the same year, if mortality at each age remains constant in the future. Life expectancy at birth also serves as a measure of overall quality of life in a country and summarizes mortality at all ages.

**population** - population of the country.

**has-univ-healthcare** - whether the country has universal health care. Universal health coverage is a broad concept that has been implemented in several ways. The common denominator for all such programs is some form of government action aimed at extending access to health care as widely as possible and setting minimum standards.

**code** - Country code

In [None]:
#the csv file is stored in the cloud
url = "https://swift-yeg.cloud.cybera.ca:8080/v1/AUTH_d22d1e3f28be45209ba8f660295c84cf/hackaton/countries2.csv"

#read csv file from url and save it as dataframe
countries = pd.read_csv(url)

#print first 5 rows
countries.head()

In [None]:
#how many rows and columns does the dataframe have?
countries.shape

In [None]:
#print column names
countries.columns
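
The loading and inspection steps above can also be tried without a network connection. The sketch below is an illustration only: `csv_text`, `toy`, and all values in them are made up, and `io.StringIO` stands in for the URL.

```python
import io
import pandas as pd

# Hypothetical CSV content; the values are invented for illustration
csv_text = """country,population,code
Country A,1000,AAA
Country B,2000,BBB
Country C,3000,CCC
"""

# read_csv accepts a URL, a file path, or a file-like object such as StringIO
toy = pd.read_csv(io.StringIO(csv_text))

# shape is a (rows, columns) tuple
n_rows, n_cols = toy.shape

# columns is an Index object; list() turns it into a plain Python list
column_names = list(toy.columns)

print(n_rows, n_cols)
print(column_names)
```

The same `shape` and `columns` attributes are what the cells above read from `countries`.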

# Suggested group goals part 1
   
1. Run the cells below to load the `plotly.express` package and draw a map colored by life expectancy.
     - Looking at the map, which country has the highest life expectancy?
     - Print the dataframe row corresponding to this country.
2. Using the cells below as an example, create new cells and draw maps colored by `population` and by `gdp ($US)`.
     - Compare the two maps you created. Do they look similar? Why do you think that is?
     - Print the exact population of China.
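
Printing a single row or a single value usually comes down to two pandas idioms: `idxmax()` to locate the largest value, and boolean filtering to pick rows by name. Here is a minimal sketch on a tiny made-up table. `toy`, the country names, and every number in it are hypothetical stand-ins, not the real `countries` data.

```python
import pandas as pd

# Hypothetical toy data: values are invented for illustration only
toy = pd.DataFrame({
    "country": ["Country A", "Country B", "Country C"],
    "life-expectancy (yrs)": [71.5, 82.3, 78.0],
    "population": [1200000, 350000, 9800000],
})

# idxmax() returns the index label of the largest value in the column
top_label = toy["life-expectancy (yrs)"].idxmax()

# passing a list to .loc keeps the result as a one-row DataFrame
top_row = toy.loc[[top_label]]
print(top_row)

# boolean filtering selects rows by country name; .iloc[0] takes the first match
b_population = toy.loc[toy["country"] == "Country B", "population"].iloc[0]
print(b_population)
```

The same calls should work on `countries`, provided the column names match those printed by `countries.columns`.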


In [None]:
#library should be installed already
#!pip install plotly_express

In [None]:
import plotly.express as px

In [None]:
fig = px.choropleth(countries, locations="code",
                    color="life-expectancy (yrs)", #coloring by life-expectancy
                    hover_name="country") #country name will appear when you hover your mouse over it
fig.show()

# Suggested group goals part 2

1. Run the cells below to load the `cufflinks` library (uncomment the `pip install` line first if the library is not installed).
2. Create a new column `gdp ($US) person` by dividing the `gdp ($US)` column by the `population` column.
    - Find the 20 countries with the highest `gdp ($US) person` value.
        - Plot the results (hint: set the index to country with `set_index("country")`).
        - What is the population of these countries?
3. Find the 20 countries with the lowest life expectancy.
     - Do these countries have universal health care?
4. How many countries are there on each continent?
     - Plot the results.
5. What is the population of each continent?
     - Plot the results.
6. Calculate the number of countries with universal health care on each continent.
    - Plot the results.
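
Most of the goals above rely on three pandas operations: dividing one column by another, ranking with `nlargest`/`nsmallest`, and summarizing with `groupby`. The sketch below shows them on a small made-up table. All names and values in `toy` are hypothetical, and the `continent` column name is an assumption, so check `countries.columns` for the real one.

```python
import pandas as pd

# Hypothetical toy data; the "continent" column name is an assumption
toy = pd.DataFrame({
    "country": ["Country A", "Country B", "Country C", "Country D"],
    "continent": ["X", "X", "Y", "Y"],
    "gdp ($US)": [1000.0, 6000.0, 900.0, 400.0],
    "population": [10, 20, 30, 40],
    "life-expectancy (yrs)": [70.0, 80.0, 65.0, 75.0],
})

# new column: element-wise division of two columns
toy["gdp ($US) person"] = toy["gdp ($US)"] / toy["population"]

# nlargest/nsmallest return the n rows with the highest/lowest values
richest = toy.nlargest(2, "gdp ($US) person").set_index("country")
shortest_lived = toy.nsmallest(2, "life-expectancy (yrs)")

# groupby + size counts rows per group; groupby + sum totals a column per group
countries_per_continent = toy.groupby("continent").size()
population_per_continent = toy.groupby("continent")["population"].sum()

print(richest[["gdp ($US) person", "population"]])
print(shortest_lived[["country", "life-expectancy (yrs)"]])
print(countries_per_continent)
print(population_per_continent)
```

Setting the index to `country` before plotting means the country names, rather than row numbers, label the bars. Once `cufflinks` is loaded, a call such as `richest["gdp ($US) person"].iplot(kind="bar")` should draw an interactive bar chart; pandas' built-in `.plot(kind="bar")` is an alternative if `cufflinks` is unavailable.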
  
**Extra challenge**:

Is there anything else interesting you can find and visualize for these data?

In [None]:
#library should be installed already
#!pip install cufflinks ipywidgets

In [None]:
#load "cufflinks" library under short name "cf"
import cufflinks as cf

#command to display graphics correctly in Jupyter notebook
cf.go_offline()

![alt text](https://github.com/callysto/callysto-sample-notebooks/blob/master/notebooks/images/Callysto_Notebook-Banners_Bottom_06.06.18.jpg?raw=true)