# World Alcohol Consumption

This project was inspired by the <i>FiveThirtyEight</i> article, "<a href="https://fivethirtyeight.com/features/dear-mona-followup-where-do-people-drink-the-most-beer-wine-and-spirits/">Dear Mona Followup: Where Do People Drink The Most Beer, Wine And Spirits?</a>". The 2010 alcohol consumption data included in the article was provided by <i>FiveThirtyEight</i> on <a href="https://github.com/fivethirtyeight/data/blob/master/alcohol-consumption/drinks.csv">GitHub</a>. Plotly was used to take this data one step further and map world beer, wine, spirit, and total alcohol consumption. My workflow is structured as follows:
1. Data cleaning and merging country codes using pandas
2. Mapping world alcohol consumption using plotly
3. Final thoughts

In [12]:
import pandas as pd
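# Note: in plotly 4 and later this module lives in the separate chart_studio package
# (import chart_studio.plotly as py)
import plotly.plotly as py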

In [13]:
#Read in .csv file 
world_url = "https://raw.githubusercontent.com/fivethirtyeight/data/master/alcohol-consumption/drinks.csv"
world_alcohol = pd.read_csv(world_url)
world_alcohol.head()

Unnamed: 0,country,beer_servings,spirit_servings,wine_servings,total_litres_of_pure_alcohol
0,Afghanistan,0,0,0,0.0
1,Albania,89,132,54,4.9
2,Algeria,25,0,14,0.7
3,Andorra,245,138,312,12.4
4,Angola,217,57,45,5.9


## Data Cleaning and Merging
We're missing the country codes needed to map these figures. Let's find this data online and incorporate it into our existing dataframe.

In [14]:
#Read in country codes data found on GitHub
code_url = "https://gist.githubusercontent.com/tadast/8827699/raw/7255fdfbf292c592b75cf5f7a19c16ea59735f74/countries_codes_and_coordinates.csv"
code_df = pd.read_csv(code_url)

code_df.head()

Unnamed: 0,Country,Alpha-2 code,Alpha-3 code,Numeric code,Latitude (average),Longitude (average)
0,Afghanistan,"""AF""","""AFG""","""4""","""33""","""65"""
1,Albania,"""AL""","""ALB""","""8""","""41""","""20"""
2,Algeria,"""DZ""","""DZA""","""12""","""28""","""3"""
3,American Samoa,"""AS""","""ASM""","""16""","""-14.3333""","""-170"""
4,Andorra,"""AD""","""AND""","""20""","""42.5""","""1.6"""


We'll first clean this data by keeping only the country name and corresponding Alpha-3 code and stripping the quotation marks.

In [15]:
code_df['Alpha-3 code'] = code_df['Alpha-3 code'].apply(lambda x: x.replace('"','').strip())
only_code_df = code_df[['Country','Alpha-3 code']]
#Rename columns to match naming convention of existing dataframe
only_code_df.columns = ['country','code'] 

only_code_df.head()

Unnamed: 0,country,code
0,Afghanistan,AFG
1,Albania,ALB
2,Algeria,DZA
3,American Samoa,ASM
4,Andorra,AND
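
The same cleanup can also be written with pandas' vectorized string methods instead of `apply` with a lambda. The sketch below is one possible alternative, not the approach used in the cell above. The two sample rows are copied from the preview of `code_df`; the leading space in the raw values is an assumption (suggested by the `.strip()` call), and `sample_codes` and `clean_codes` are illustrative names.

```python
import pandas as pd

# Two rows copied from the code_df preview; the raw values carry literal
# double quotes. The leading space is an assumption for illustration.
sample_codes = pd.DataFrame({
    "Country": ["Afghanistan", "Albania"],
    "Alpha-3 code": [' "AFG"', ' "ALB"'],
})

# Select, rename, and clean in one chain.
# .str.replace(..., regex=False) removes the literal quote characters,
# .str.strip() removes surrounding whitespace.
clean_codes = (
    sample_codes[["Country", "Alpha-3 code"]]
    .rename(columns={"Country": "country", "Alpha-3 code": "code"})
    .assign(code=lambda df: df["code"].str.replace('"', "", regex=False).str.strip())
)

print(clean_codes)
```

Chaining `rename` and `assign` returns a new dataframe and leaves the original untouched, which keeps the cell safe to re-run.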


In [16]:
#Merge country codes with alcohol data and check null values
combined_df = world_alcohol.merge(only_code_df, how="left", on="country")
combined_df[combined_df['code'].isnull()] 

Unnamed: 0,country,beer_servings,spirit_servings,wine_servings,total_litres_of_pure_alcohol,code
5,Antigua & Barbuda,102,128,45,4.9,
21,Bosnia-Herzegovina,76,173,8,4.6,
28,Cote d'Ivoire,37,1,7,4.0,
29,Cabo Verde,144,56,16,4.0,
46,North Korea,0,0,0,0.0,
47,DR Congo,32,3,1,2.3,
79,Iran,0,0,0,0.0,
92,Laos,62,0,123,6.2,
110,Micronesia,62,50,18,2.3,
139,Moldova,109,226,18,6.3,


Unfortunately, the naming conventions for a handful of countries differ between the two datasets, so the merge left their codes empty. We'll have to manually look up the Alpha-3 codes for these specific countries.
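
One way to surface such mismatches directly is the `indicator` option of `merge`, which adds a `_merge` column recording where each row came from. The sketch below uses a few rows taken from the tables above; `left_sample`, `right_sample`, and `unmatched` are illustrative names, not part of the original notebook.

```python
import pandas as pd

# A few rows from the alcohol data (left) and the country codes (right).
left_sample = pd.DataFrame({
    "country": ["Albania", "Antigua & Barbuda", "Bosnia-Herzegovina"],
    "beer_servings": [89, 102, 76],
})
right_sample = pd.DataFrame({
    "country": ["Albania", "Algeria"],
    "code": ["ALB", "DZA"],
})

# indicator=True adds a "_merge" column with the values
# "both", "left_only", or "right_only".
merged = left_sample.merge(right_sample, how="left", on="country", indicator=True)

# Countries present in the alcohol data but absent from the code table.
unmatched = merged.loc[merged["_merge"] == "left_only", "country"].tolist()
print(unmatched)
```

The resulting list can serve as a checklist of names that need a manual lookup.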

In [17]:
#Assign missing Alpha-3 codes
country_codes = {"Antigua & Barbuda":"ATG",
                "Bosnia-Herzegovina":"BIH",
                "Cote d'Ivoire":"CIV",
                "Cabo Verde":"CPV",
                "North Korea":"PRK",
                "DR Congo":"COD",
                "Iran":"IRN",
                "Laos":"LAO",
                "Micronesia":"FSM",
                "Moldova":"MDA",
                "St. Kitts & Nevis":"KNA",
                "St. Lucia":"LCA",
                "St. Vincent & the Grenadines":"VCT",
                "Sao Tome & Principe":"STP",
                "Syria":"SYR",
                "Macedonia":"MKD",
                "Tanzania":"TZA",
                "USA":"USA"}

#Use dictionary to map these country codes in our dataframe
combined_df['code'] = combined_df['code'].fillna(combined_df['country'].map(country_codes))
#Check to make sure there are no more null codes
combined_df['code'].isnull().sum()                                    

0
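
Beyond counting nulls, a couple of cheap sanity checks can catch typos in hand-entered codes. The sketch below assumes every code should consist of three uppercase ASCII letters and appear only once. `check_codes` is a hypothetical helper, and the sample series contains two deliberate errors (a lowercase code and a repeated one) purely for illustration.

```python
import pandas as pd

def check_codes(codes: pd.Series) -> dict:
    """Return codes that are malformed or that occur more than once."""
    codes = codes.astype(str)
    # Anything that is not exactly three uppercase letters counts as malformed.
    malformed = codes[~codes.str.match(r"[A-Z]{3}$")].tolist()
    # keep=False marks every occurrence of a repeated value.
    duplicated = codes[codes.duplicated(keep=False)].unique().tolist()
    return {"malformed": malformed, "duplicated": duplicated}

# Illustrative input: "atg" and the second "ALB" are intentional mistakes.
sample = pd.Series(["AFG", "ALB", "DZA", "atg", "ALB"])
report = check_codes(sample)
print(report)
```

In the notebook the same call would be `check_codes(combined_df['code'])`; two empty lists would indicate that the codes pass both checks.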

## Mapping World Alcohol Consumption 

Now that we have Alpha-3 country codes and a clean dataset, we can map alcohol consumption data on an interactive world map using plotly.

We'll first define some functions that will be used to format the maps.

In [18]:
def set_layout(title):
    layout = dict(
        title = title,
        geo = dict(
            showframe = False,
            showcoastlines = False,
            projection = dict(
                type = 'robinson'
            )
        )
    )
    return layout

def set_data(bar_title, column, colorscale):
    data = [dict(
            type = 'choropleth',
            locations = combined_df['code'],
            z = combined_df[column],
            text = combined_df['country'],
            colorscale = colorscale,
            autocolorscale = False,
            marker = dict(
                line = dict(
                    color = 'rgb(180,180,180)',
                    width = 0.3
                )),
            colorbar = dict(
                autotick = True,
                title = bar_title),
          )]
    return data
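
Both helpers read `combined_df` from the notebook's global scope. A possible variation, sketched below, passes the data in explicitly and returns the complete figure dictionary, which makes the helper easier to reuse and to test. `build_figure` is a hypothetical name; the dictionary keys are the same ones used by `set_layout` and `set_data`, and a plain dict of columns stands in for the dataframe.

```python
def build_figure(frame, column, title, bar_title, colorscale):
    """Assemble a choropleth figure dict from any object indexable by column name."""
    data = [dict(
        type="choropleth",
        locations=frame["code"],
        z=frame[column],
        text=frame["country"],
        colorscale=colorscale,
        autocolorscale=False,
        marker=dict(line=dict(color="rgb(180,180,180)", width=0.3)),
        colorbar=dict(autotick=True, title=bar_title),
    )]
    layout = dict(
        title=title,
        geo=dict(
            showframe=False,
            showcoastlines=False,
            projection=dict(type="robinson"),
        ),
    )
    return dict(data=data, layout=layout)

# Stand-in for combined_df: two rows taken from the tables above.
sample = {
    "country": ["Albania", "Andorra"],
    "code": ["ALB", "AND"],
    "total_litres_of_pure_alcohol": [4.9, 12.4],
}

fig = build_figure(
    sample,
    "total_litres_of_pure_alcohol",
    "2010 World Alcohol Consumption",
    "Total Liters of Pure Alcohol",
    [[0, "rgb(242, 242, 242)"], [1, "rgb(13, 26, 38)"]],
)
print(fig["layout"]["title"])
```

The returned dictionary has the same `data` and `layout` structure as the figures passed to `py.iplot` in the cells that follow.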

### Mapping Total Alcohol Consumption

In [19]:
alcohol_colorscale = [[0,"rgb(242, 242, 242)"],
            [0.2,"rgb(196, 216, 237)"],
            [0.4,"rgb(138, 177, 219)"],
            [0.6,"rgb(61, 125, 194)"],
            [0.8,"rgb(37, 75, 116)"],
            [1,"rgb(13, 26, 38)"]]

alcohol_title = '2010 World Alcohol Consumption'
alcohol_bar_title = 'Total Liters of Pure Alcohol <br>Consumed (per person)</br>'
alcohol_column = 'total_litres_of_pure_alcohol'

alcohol_data = set_data(alcohol_bar_title, alcohol_column, alcohol_colorscale)
alcohol_layout = set_layout(alcohol_title)

alcohol_fig = dict(data=alcohol_data, layout=alcohol_layout )
py.iplot(alcohol_fig, validate=False, filename='alcohol-world-map', sharing='public')

It comes as no surprise that alcohol consumption seems to be highest in Europe and lowest in North Africa and the Middle East. Whereas alcohol consumption has been ingrained in European culture for centuries, alcohol is generally not consumed in Islamic countries for religious reasons.
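
A visual impression like this can be cross-checked against the numbers by ranking countries. The sketch below uses only the five rows shown in the preview of `world_alcohol`, so it illustrates the calls rather than the regional pattern; on the full data the same calls would be made on `combined_df`. `preview`, `top`, and `bottom` are illustrative names.

```python
import pandas as pd

# The five rows from the world_alcohol preview.
preview = pd.DataFrame({
    "country": ["Afghanistan", "Albania", "Algeria", "Andorra", "Angola"],
    "total_litres_of_pure_alcohol": [0.0, 4.9, 0.7, 12.4, 5.9],
})

# Highest and lowest consumers by total litres of pure alcohol.
top = preview.nlargest(3, "total_litres_of_pure_alcohol")
bottom = preview.nsmallest(2, "total_litres_of_pure_alcohol")

print(top["country"].tolist())
print(bottom["country"].tolist())
```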

Let's plot specific alcohol consumption and see what we find. 
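
Each map in this notebook uses a colorscale made of six `[position, color]` pairs with evenly spaced positions from 0 to 1. A small helper could generate those positions from a list of colors instead of typing them by hand. This is only a sketch: `even_colorscale` is a hypothetical name, and rounding to two decimals is a choice made for readability.

```python
def even_colorscale(colors):
    """Pair each color with an evenly spaced position between 0 and 1."""
    if len(colors) < 2:
        raise ValueError("need at least two colors")
    last = len(colors) - 1
    return [[round(i / last, 2), color] for i, color in enumerate(colors)]

# The six colors of the total-alcohol map, lightest to darkest.
alcohol_colors = [
    "rgb(242, 242, 242)",
    "rgb(196, 216, 237)",
    "rgb(138, 177, 219)",
    "rgb(61, 125, 194)",
    "rgb(37, 75, 116)",
    "rgb(13, 26, 38)",
]

scale = even_colorscale(alcohol_colors)
for position, color in scale:
    print(position, color)
```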

### Mapping Beer Consumption

In [20]:
beer_colorscale = [[0,'rgb(242, 242, 242)'],
                    [0.2,'rgb(255, 235, 179)'],
                    [0.4,'rgb(255, 214, 102)'],
                    [0.6,'rgb(255, 187, 0)'],
                    [0.8,'rgb(153, 112, 0)'],
                    [1,'rgb(77, 57, 0)']]

beer_title = '2010 World Beer Consumption'
beer_bar_title = 'Total Servings of Beer <br>Consumed (per person)</br>'
beer_column = 'beer_servings'

beer_data = set_data(beer_bar_title, beer_column, beer_colorscale)
beer_layout = set_layout(beer_title)

beer_fig = dict(data=beer_data, layout=beer_layout )
py.iplot(beer_fig, validate=False, filename='beer-world-map', sharing='public')

Beer seems to be popular in the former British colonies and of course extremely popular in Germany, Ireland, Poland, and the Czech Republic. However, beer is also extremely popular in a few unexpected countries: Venezuela, Namibia, and Gabon. I also expected beer consumption to be higher in the UK and Scandinavia. 

After some googling, I found an <a href="https://www.cnn.com/travel/article/namibia-beer/index.html">article</a> explaining how beer was introduced to Namibia when it was a German colony. Apparently brewing beer is still central to Namibia's culture, as the country produces some of Africa's best beers.

### Mapping Wine Consumption

In [21]:
wine_colorscale = [[0,"rgb(242, 242, 242)"],
            [0.2,"rgb(242, 217, 217)"],
            [0.4,"rgb(215, 142, 142)"],
            [0.6,"rgb(189, 66, 66)"],
            [0.8,"rgb(113, 40, 40)"],
            [1,"rgb(57, 20, 20)"]]

wine_title = '2010 World Wine Consumption'
wine_bar_title = 'Total Servings of Wine <br>Consumed (per person)</br>'
wine_column = 'wine_servings'

wine_data = set_data(wine_bar_title, wine_column, wine_colorscale)
wine_layout = set_layout(wine_title)

wine_fig = dict(data=wine_data, layout=wine_layout )
py.iplot(wine_fig, validate=False, filename='wine-map', sharing='public')

Unsurprisingly, wine appears to be most popular in France and French Guiana. Portugal seems to come right behind France in wine consumption, to my surprise. Wine is predictably popular in Italy but also almost equally popular in a few unexpected countries: Argentina, Uruguay, and Australia.

### Mapping Spirit Consumption

In [22]:
spirits_colorscale = [[0,"rgb(242, 242, 242)"],
              [0.2,"rgb(235, 179, 255)"],
              [0.4,"rgb(214, 102, 255)"],
              [0.6,"rgb(187, 0, 255)"],
              [0.8,"rgb(112, 0, 153)"],
              [1,"rgb(57, 0, 77)"]]

spirits_title = '2010 World Spirit Consumption'
spirits_bar_title = 'Total Servings of Spirits <br>Consumed (per person)</br>'
spirits_column = 'spirit_servings'

spirits_data = set_data(spirits_bar_title, spirits_column, spirits_colorscale)
spirits_layout = set_layout(spirits_title)

spirits_fig = dict(data=spirits_data, layout=spirits_layout )
py.iplot(spirits_fig, validate=False, filename='spirits-map', sharing='public')

Russia and the other former Soviet republics predictably consume the most servings of spirits (presumably a lot of vodka).
One country that stands out to me is Thailand, which consumes more servings of spirits than most of Eastern Europe. After some googling, it appears that a special kind of rice-based whiskey is extremely popular in Thailand, along with an assortment of Thai cocktails.

China also appears to consume a lot of spirits. After some further research, I learned that baijiu is the most popular distilled spirit in China and the most-consumed distilled spirit in the world. 

## Conclusion

Using world alcohol consumption data, I came across some interesting findings and have added the following to my beverage bucket list:
* Namibian and Venezuelan beer
* Argentine and Australian wine 
* Thai whiskey
* Baijiu (China)

### Possible Further Steps
* Analyze trends in world alcohol consumption 
* Map most popular alcoholic beverage in each country
* Analyze alcohol consumption among US states
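
As a starting point for the second item, the most popular beverage per country could be derived with `idxmax` across the three servings columns. The sketch below uses the five rows from the preview of `world_alcohol`. Treating the largest servings count as "most popular" is an assumption, and `most_popular` is an illustrative column name.

```python
import pandas as pd

serving_cols = ["beer_servings", "spirit_servings", "wine_servings"]

# The five rows from the world_alcohol preview.
preview = pd.DataFrame({
    "country": ["Afghanistan", "Albania", "Algeria", "Andorra", "Angola"],
    "beer_servings": [0, 89, 25, 245, 217],
    "spirit_servings": [0, 132, 0, 138, 57],
    "wine_servings": [0, 54, 14, 312, 45],
})

# Column name with the largest value in each row, e.g. "wine_servings" -> "wine".
favourite = (
    preview[serving_cols]
    .idxmax(axis=1)
    .str.replace("_servings", "", regex=False)
)

# idxmax returns the first column on ties, so rows with no recorded
# consumption at all are labelled "none" instead of "beer".
has_consumption = preview[serving_cols].sum(axis=1) > 0
preview["most_popular"] = favourite.where(has_consumption, "none")

print(preview[["country", "most_popular"]])
```

The resulting column is categorical, so mapping it would call for a discrete color per beverage rather than the continuous colorscales used above.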