In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 5GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

# Additional packages needed : 
import matplotlib.pyplot as plt
import seaborn as sb

import plotly as ply
from plotly import tools
import plotly.graph_objs as go
from plotly.offline import iplot, init_notebook_mode, plot
import plotly.io as pio
from plotly.subplots import make_subplots
init_notebook_mode(connected=True)

pio.templates.default = "none"

# __2003-2018, 15 Years of All Blacks Superiority!__

Any rugby fan knows that the New Zealand national team, aka the __All Blacks__, has been the benchmark of the sport for years. Whoever, wherever and whenever they play, the All Blacks are systematically the favorites.  
But beyond the feeling that they dominate this sport, what do the statistics say over 15 years ?  

_Are they really that amazing ?_  
_Is there any place, team or time that they don't like ?_  
_What can be their kryptonite ?_  

Making good use of international match data collected between 2003 and 2018, we investigate the benchmark status acquired by the New Zealand rugby team.
Can we question it ?

## __The DataSet__

The data collected covers approximately 15 years from October 2003 until November 2018.  
To give some context, the first game recorded is New Zealand's second game of the 2003 Rugby World Cup in Australia (don't ask me why their first game, a 70-7 victory against Italy, is not included...) while the last record is an international test match, their last of 2018.  
During this 15-year period the All Blacks participated in four World Cups and won two of them, in 2011 and 2015.  

For each game we have the following information :

In [None]:
df = pd.read_csv('../input/all-black-match-data20032018/ABMatchData(FULL).csv', parse_dates=['Date'])
df = df.set_index(df['Date'], drop=True)
df.sort_index(inplace=True)
df.rename(columns={'Result':'Score'}, inplace=True)

def wdl(x):
    if x>0:
        return 'Win'
    elif x==0:
        return 'Draw'
    else:
        return 'Lost'
    
df['Result'] = df['Score'].apply(wdl)
df.info(memory_usage='deep')

I think most field names are self-explanatory... :)  
Note however that I have renamed the original `Result` column to `Score` (it holds the final score difference) and created a new `Result` column that can take only three values, `Win`, `Lost` or `Draw`, based on the sign of `Score`.
This is simply for later analytic use (you'll see).
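
For the curious, the same win/draw/loss labelling can also be written without an explicit `if/else`. The snippet below is only a sketch on made-up score differences (the values are hypothetical, not taken from the dataset) :

```python
import numpy as np
import pandas as pd

# Toy score differences (hypothetical values, not taken from the dataset)
scores = pd.Series([12, 0, -3, 25, -1], name="Score")

# np.sign maps positive / zero / negative margins to 1 / 0 / -1,
# which are then translated to the three labels used in this notebook
labels = {1: "Win", 0: "Draw", -1: "Lost"}
result = np.sign(scores).map(labels)

print(result.tolist())
```

Both approaches should give the same labels; the function-based version used in the notebook is arguably easier to read.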

Anyway, now we have the data, let's dive in...

---
# __All Blacks Overall Stats 2003-2018__

In this analysis we take a classic top-down approach : we first look at global stats and indicators before going into more detail, especially on things found along the way (because I've been told that's where the devil resides).

In [None]:
# creating a temporary dataframe for the section
# naming the column explicitly keeps `tmp_df['Result']` valid across pandas versions
tmp_df = df['Result'].value_counts().to_frame(name='Result')
tmp_df['Fraction'] = 100 * df['Result'].value_counts(normalize=True)

So, first things first, how many games did the All Blacks play, win and lose between 2003 and 2018 :

In [None]:
print("{} international games played\n".format(len(df)))
print("{} Victories ({:.2f}%)\n{} Defeats ({:.2f}%)\n{} Draws ({:.2f}%)".format(
    tmp_df.loc['Win', 'Result'], tmp_df.loc['Win', 'Fraction'],
    tmp_df.loc['Lost', 'Result'], tmp_df.loc['Lost', 'Fraction'],
    tmp_df.loc['Draw', 'Result'], tmp_df.loc['Draw', 'Fraction']
))

... well, that's a good start.  
Over __205 games__ the All Blacks have a victory ratio of __86.83%__ from 2003 until 2018.  
Visually it looks like this :

In [None]:
fig = go.Figure(go.Pie(labels=tmp_df['Result'].index, values=tmp_df['Result'].values,
                      text=['{} games<br>{:.2f}%'.format(n,f) for n,f in zip(tmp_df['Result'].values,tmp_df['Fraction'].values)], textinfo='text',
                      hoverinfo='label+text'))

fig.update_layout({'height':500,
                   'title':'All Blacks Game Results 2003-2018'})
fig.show()

As is often the case, a good visualization is worth a thousand data tables, and this pie chart gives the right first impression : they have won a lot!!!  

To go a step further we can look at the teams they played and who they have been able to beat :

In [None]:
result_df = df.groupby('Result').agg({'Opposition Name':'nunique'}).T

print("Played against {} different teams\n".format(df['Opposition Name'].nunique()))
print("Beat {} of them at least once.".format(result_df['Win'].values[0]))
print("Lost against {} teams at least once.".format(result_df[['Lost']].sum(axis=1).values[0]))

Here is another achievement : of the __21 different teams__ they played between 2003 and 2018, __the All Blacks have beaten every one at least once__.  
Furthermore __only 6 teams have been able to beat them__ at least once, namely :

In [None]:
df[df['Result']=='Lost']['Opposition Name'].value_counts(ascending=False)\
    .to_frame().rename(columns={'Opposition Name':'Number of Victories against New Zealand'})

We can already notice that Australia and South Africa have apparently given New Zealand a hard time more than once.
We'll see that in more detail later in this analysis.

The All Blacks have an impressive overall victory ratio over 15 years... but have they actually been able to maintain it year after year ?  
Well, here is how the ratio of game outcomes looks for each year :

In [None]:
# define temporary dataframes just for the visualization
year_abs_df = df.resample('Y')['Result'].value_counts().unstack()
# `normalize` expects a boolean (the string 'index' only worked because it is truthy)
year_rel_df = df.resample('Y')['Result'].value_counts(normalize=True).unstack()

In [None]:
traces = []

for col in ['Win', 'Lost', 'Draw']:
    tmp_abs_df = year_abs_df[col].fillna(0.0)
    tmp_rel_df = year_rel_df[col].fillna(0.0)

    traces.append(go.Bar(x=tmp_rel_df.index.year, y=100 * tmp_rel_df.dropna().values, name=col,
                         text=['{:.1f}%'.format(100 * e) for e in tmp_rel_df.dropna().values],
                         textposition='inside',
                         hovertext=tmp_abs_df.dropna().values,
                         hovertemplate="<b>%{hovertext}</b> game(s)" + " <b>%{y}%</b>"))
    
fig = go.Figure(data=traces,
                layout={'barmode':'stack',
                        'hovermode':'x unified',
                        'height':500,
                        'xaxis': dict(title = 'Year', dtick=1, tickangle=-60),
                        'yaxis': dict(title = 'Percentage', dtick=10),
                        'title':'All Blacks Games Result over the Years'})
fig.show()

From that figure we can notice two things :
* Over 15 years __New Zealand has been able to maintain a high level of performance__, reaching a 100% victory ratio in 2013.
* Their __lowest victory ratio happened in 2009 with "only" 71.4% of games won__ that year.

The answer is hence quite clear : yes, they have maintained a high victory ratio over the years. 
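
If you want to reproduce this kind of yearly breakdown on your own data, here is a minimal sketch. It uses `pd.crosstab` rather than the `resample` call of the notebook, and the mini-dataset is entirely hypothetical :

```python
import pandas as pd

# Hypothetical mini-dataset : one row per game
games = pd.DataFrame({
    "Date": pd.to_datetime(["2008-06-07", "2008-07-12", "2009-06-13",
                            "2009-07-25", "2009-08-01", "2009-11-07"]),
    "Result": ["Win", "Win", "Lost", "Lost", "Win", "Win"],
})

# Rows = year, columns = outcome, values = share of that year's games
# (normalize="index" makes each row sum to 1)
year_share = pd.crosstab(games["Date"].dt.year, games["Result"], normalize="index")

print((100 * year_share).round(1))
```

Each row of `year_share` sums to 1, so multiplying by 100 gives the percentages that would be stacked in a bar chart like the one above.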

So what happened in 2009 ? How did they come to lose 4 games ?  
It is worth noting that 2009 sits right in between two World Cups (2007 won by South Africa and 2011 won by New Zealand), and we can assume that the head coach used that time to test new players and/or tactics...  
Fortunately for us we can check the first assumption by looking at the number of debutants for each game of 2009.  

In the following figure we look at the final score difference versus the number of debutants.
In addition the color indicates the opposing team as follows :
* Green for South Africa
* Yellow for Australia
* Blue for France
* LightBlue for Italy
* Red for Wales
* White for England

In [None]:
tmp_df = df[df.index.year==2009]

fig = go.Figure(go.Scatter(x=tmp_df['Debutants'], y=tmp_df['Score'],
                           name='Game', mode='markers',
                           text=tmp_df['Opposition Name'],
                           hovertemplate="<b>%{y}</b> points difference<br>against <b>%{text}</b><br>" +
                           "<b>%{x}</b> debutants<br>",
                           marker={'color':tmp_df['Opposition Name'].replace({'South Africa':'green',
                                                                              'Australia':'gold',
                                                                              'France':'mediumblue',
                                                                              'Italy':'lightskyblue',
                                                                              'Wales':'crimson',
                                                                              'England':'whitesmoke'}),
                                   'size':10,
                                   'colorscale':'Bluered_r', 'opacity':1.0,
                                   'autocolorscale':False}))

fig.update_layout({'hovermode':'closest',
                   'height':500,
                   'xaxis': dict(title = 'Number of Debutants', dtick=1),
                   'yaxis': dict(title = 'Final Score Difference', dtick=10),
                   'title':'Final Score Difference vs Number of Debutants for games in 2009'})

fig.show()

Looking at the graph there is __no clear correlation between a defeat and the number of debutants in the squad__; instead, defeats seem much more tied to the opposing team.  
Basically __South Africa was clearly responsible for the lower victory ratio in 2009__ (then again, they were the reigning world champions that year).  

It's interesting to note that this is already the second time that the Springboks (the South African rugby team) stand out in our analysis.
Maybe they are New Zealand's kryptonite.  
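
The "no clear correlation" reading above is purely visual. One possible way to put a number on it, sketched here on hypothetical values (not the real 2009 games), is the Pearson correlation coefficient between the two columns :

```python
import pandas as pd

# Hypothetical sample : debutants per game and final score difference
sample = pd.DataFrame({
    "Debutants": [0, 1, 2, 0, 3, 1],
    "Score": [-5, 10, -3, 20, 4, -12],
})

# Pearson correlation coefficient, between -1 and 1 :
# values near 0 suggest no linear relationship between the two columns
corr = sample["Debutants"].corr(sample["Score"])
print(round(corr, 3))
```

With only a handful of games in a single year such a coefficient is very noisy, so it should be taken as a rough indication at best.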

Zooming out of 2009, one would reasonably assume that game outcomes are correlated with the "strength difference" between the two sides.
The only "strength" indicator available is each team's rating, and here is how results look as a function of the rating difference (opposition rating minus New Zealand rating) :  
_(games won in blue, lost in orange and drawn in green)_

In [None]:
# temporary df for visualization
tmp_df = df[['Opposition Name', 'Opposition Rating','Rating', 'Result']].copy()
tmp_df['Rating Difference'] = df['Opposition Rating'] - df['Rating']
tmp_df.sort_values(by='Rating Difference', ascending=False, inplace=True)

In [None]:
fig = go.Figure(data=go.Bar(x=np.arange(len(tmp_df)), y=tmp_df['Rating Difference'],name='Game',
                            marker_color=tmp_df['Result'].replace({'Win':'#1f77b4',
                                                                   'Lost':'#ff7f0e',
                                                                   'Draw':'#2ca02c'}),
                            text=['vs ' + e[1] + '<br>' + e[2] for e in tmp_df[['Opposition Name', 'Result']].itertuples()],
                            hovertemplate="%{text}<br>" + " <b>Rating Difference %{y}</b>"))

                
fig.update_layout({'hovermode':'x unified', 'height':500,
                   'xaxis': dict(showticklabels=False,
                                 tickvals=np.arange(len(tmp_df)),
                                 ticktext=tmp_df.index.strftime('%Y-%m-%d')),
                   'yaxis': dict(title = 'Rating Difference', dtick=10),
                   'title':'Game Result versus Rating Difference'})
fig.show()

Well... no clear correlation beyond the obvious : New Zealand struggled more against teams that were close in rating, which totally makes sense. Nothing to see here.  
For a better understanding it would be necessary to look at how the rating system works, but this is beyond the scope of this analysis.
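
A possible follow-up, not done in this notebook, would be to group games into rating-difference bands and compare the win rate per band. Here is a sketch on hypothetical data, with bin edges chosen arbitrarily for illustration :

```python
import pandas as pd

# Hypothetical games : rating difference (opposition minus New Zealand) and outcome
games = pd.DataFrame({
    "Rating Difference": [-25.0, -18.0, -12.0, -6.0, -4.0, -2.0, -1.0, 1.0],
    "Result": ["Win", "Win", "Win", "Win", "Lost", "Win", "Lost", "Lost"],
})

# Group games into rating-difference bands (arbitrary bin edges)
bands = pd.cut(games["Rating Difference"], bins=[-30, -10, -5, 5])

# Share of wins within each band
win_rate = (games["Result"] == "Win").groupby(bands, observed=False).mean()
print(win_rate)
```

If the win rate drops in the bands closest to zero, that supports the idea that evenly rated opponents are the ones causing trouble.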

At this stage we know that between 2003 and 2018 the All Blacks won most of their games, and against any team.  
But maybe these were only tight victories... maybe they suffered heavy defeats... or maybe it is the other way round.
Let's have a look.

In [None]:
# freeing up a little memory
del tmp_df, year_abs_df, year_rel_df, result_df

---
# __All Blacks Final Score Habits__

In this section we investigate final scores in more detail.  
The idea is to see whether they reinforce New Zealand's benchmark status or start to crack it.

First we can look at a few descriptive stats :

In [None]:
# defining helper dataframes for victories and defeats
win_df = df[df['Result']=='Win']
lost_df =  df[df['Result']=='Lost']

biggest_win = win_df['Score'].idxmax()
smallest_wins = win_df[win_df['Score'] == win_df['Score'].min()]

In [None]:
print("Largest victory by {} points (against {} on {})\n".format(
    win_df.loc[biggest_win, 'Score'],
    win_df.loc[biggest_win, 'Opposition Name'],
    win_df.loc[biggest_win, 'Date'].strftime('%d %b %Y')
    ))

print("Smallest victory by 1 point.\n {} times, against :\n".format(len(smallest_wins)) + 
      (',\n').join([e[1] + ' on ' + e[0].strftime('%d %b %Y') for e in smallest_wins['Opposition Name'].items()]))

print("\nMedian score difference for victories : {}".format(int(win_df['Score'].median())))

That last statistic is probably the most telling about their ability not only to beat their opponents but to do so by a large margin :  
__Half of the All Blacks victories have been by more than 19 points__  
For those of you who might not be familiar with rugby, it means that the opposing team would have to score at least three times (three converted tries, worth 7 points each) to catch up.
In rugby this is generally considered very hard to achieve.  
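
The arithmetic behind the "three converted tries" remark can be sketched as follows. The helper function name is made up for this illustration; the point values (5 for a try, 2 for a conversion) are the standard rugby union ones :

```python
import math

TRY_POINTS = 5         # points for a try
CONVERSION_POINTS = 2  # points for the conversion kick after a try

def converted_tries_to_level(margin):
    """Minimum number of converted tries needed to at least level a deficit."""
    return math.ceil(margin / (TRY_POINTS + CONVERSION_POINTS))

print(converted_tries_to_level(19))  # 3, since two converted tries only give 14 points
```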

+1 for reinforcing New Zealand reference status.  

Ok fine, but what about their defeats :

In [None]:
biggest_lost = lost_df['Score'].abs().idxmax()
smallest_lost = lost_df['Score'].abs().idxmin()

In [None]:
print("Largest defeat by {} points (against {} on {})\n".format(
    abs(lost_df.loc[biggest_lost, 'Score']),
    lost_df.loc[biggest_lost, 'Opposition Name'],
    lost_df.loc[biggest_lost, 'Date'].strftime('%d %b %Y')
    ))
print("Closest defeat by {} points (against {} on {})\n".format(
    abs(lost_df.loc[smallest_lost, 'Score']),
    lost_df.loc[smallest_lost, 'Opposition Name'],
    lost_df.loc[smallest_lost, 'Date'].strftime('%d %b %Y')
    ))

print("Median score difference for defeats : {}".format(int(lost_df['Score'].abs().median())))

Well, one can argue that a defeat by 17 points is quite something (all credit to England that day) but it is interesting to note that for the All Blacks __the largest defeat is by fewer points than the median margin of their victories__... I don't know how many teams can claim the same.  

Moreover __half of their defeats have been by less than 5 points__, in other words less than an unconverted try.

Again, +1 for the All Blacks legend.

Below we visualize the results of all international games over the 15-year period.  
Victories are represented by blue bars, defeats by orange bars and draws by green dots.
The y-axis gives the game final score difference and the x-axis gives the game date.  
_(note : within the figure you can draw boxes with the mouse to zoom into the graph.)_

In [None]:
traces = []

for col in ['Win', 'Lost', 'Draw']:
    tmp_df = df[df['Result']==col].dropna()
    text = ['<b>{}</b> points difference<br> against <b>{}</b>'\
            .format(tmp_df.loc[date, 'Score'],tmp_df.loc[date, 'Opposition Name']) for date in tmp_df.index]

    if col=='Draw':
        traces.append(go.Scatter(x=tmp_df.index, y=tmp_df.dropna()['Score'].values,
                                 name=col, mode='markers',
                                 hovertext=text,
                                 hovertemplate="<b>"+col+"</b><br>" + "%{hovertext}"))
    else:
        traces.append(go.Bar(x=tmp_df.index, y=tmp_df.dropna()['Score'].values,
                             name=col, width=5e8,
                             hovertext=text,
                             hovertemplate="<b>"+col+"</b><br>" + "%{hovertext}"))

fig = go.Figure(data=traces)

fig.update_layout({'height':500,
                   'hovermode':'x unified',
                   'xaxis': dict(title = 'Date', tickangle=-60),
                   'yaxis': dict(title = 'Final Score Difference', dtick=10),
                   'title':'All Blacks Games Score Differences from 2003 until 2018'})
fig.show()

As expected, the blue bars representing victories are much taller than the orange bars are deep, illustrating that the All Blacks beat their opponents by quite a significant margin.  
Interestingly, 2009 appears again as a relatively more complicated year in terms of results.

Another way to visualize this dominance is to look at the 3-game rolling average of the score difference over the 15 years :

In [None]:
tmp_df = df.reset_index(drop=True)['Score'].rolling(3).mean()

fig = go.Figure(go.Scatter(x=tmp_df.index, y=tmp_df.values.round(decimals=2), name='3 games rolling average',
                          mode='markers+lines',
                          text=df.index.strftime('%d %b %Y'),
                          hovertemplate="<b>%{text}</b><br>"+
                          "<b>%{y}</b> points average",
                          line = dict(width=1.0, color='black'),
                          marker = dict(color=tmp_df.values, 
                                    colorscale='RdYlBu',
                                    cmin=-tmp_df.max()/4,
                                    cmax=tmp_df.max()/4)))
    
fig.update_layout({'height':450,
                   'hovermode':'closest',
                   'xaxis': dict(title = 'Games', range=[0, 206],
                                 ticktext=df.index.year.unique(),
                                 tickvals=df['Date'].reset_index(drop=True).dt.year.drop_duplicates().index,
                                 dtick=10, tickangle=-60),
                   'yaxis': dict(title = 'Score Difference Average', dtick=10),
                   'title':'All Blacks 3-games Average Score Difference over the Years'})
fig.show()

So, on average over 3 games, they have frequently outscored their opponents by more than 10 points.  
Furthermore, since August 2011 the average score difference has never been negative.
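
In case the rolling average is unfamiliar, here is a small sketch of what `rolling(3).mean()` computes, using six hypothetical score differences :

```python
import pandas as pd

# Hypothetical score differences for six consecutive games
scores = pd.Series([10, -4, 15, 30, 0, 6])

# Mean over a sliding window of 3 games; the first two entries are NaN
# because fewer than 3 games are available at that point
rolling_avg = scores.rolling(3).mean()
print(rolling_avg.round(2).tolist())
```

A single heavy defeat is smoothed out by the two games around it, which is why a rolling average that stays positive for years is a fairly strong statement.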

In itself it might not seem like much, but it would be worth comparing with other national teams. I'm guessing that this is not that common.

+1 again in favor of the legend.  
Looking at final scores has only reinforced our view.


So far we've looked at performances globally, but we have already noticed that South Africa, as well as Australia, has been strong opposition.
Maybe it is worth looking at how the All Blacks performed against each team individually.

In [None]:
# time to clean again
del tmp_df, win_df, lost_df, biggest_lost, biggest_win, smallest_lost, smallest_wins

---
# __All Blacks Favorite Victims and Best Rival__

As we saw in the previous sections, between 2003 and 2018, the All Blacks have played against 21 different teams and have beaten all of them at least once.  
We also noticed that the Springboks were their toughest opponent but what about the other teams ?  

Do the All Blacks have any favorite victim(s) ?  
Who is their best rival ?

Let's look at their performances against each team :

In [None]:
# necessary temporary dataframes for this section
abs_df = pd.crosstab(df['Opposition Name'], df['Result']).sort_values('Win', ascending=False)
rel_df = pd.crosstab(df['Opposition Name'], df['Result'], normalize='index')

lost_df = abs_df[['Lost', 'Draw']].sum(axis=1)
lost_df = lost_df[lost_df!=0.0]

In [None]:
text_win = ['{:.1f}% of {} games'.format(100*rel_df.loc[ctry]['Win'], abs_df.sum(axis=1).loc[ctry])\
            for ctry in abs_df.index]
text_lad = ['{:.1f}% of {} games'.format(100*rel_df.loc[ctry][['Draw', 'Lost']].sum(), abs_df.sum(axis=1).loc[ctry])\
            for ctry in lost_df.index]

win_trace = go.Bar(x=abs_df.index, y=abs_df['Win'].values, name='Victories',
                   text=abs_df['Win'].values, textposition='outside',
                   hovertext=text_win,
                   hovertemplate="<b>%{y}</b><br>"+ "%{hovertext}")

lost_trace = go.Bar(x=lost_df.index, y=-lost_df.values, name='Lost or Draw',
                    text=lost_df.values, textposition='outside',
                    hovertext=text_lad,
                    hovertemplate="<b>%{text}</b><br>"+ "%{hovertext}")

fig = go.Figure(data=[win_trace, lost_trace])

fig.update_layout({'height':500,
                   'hovermode':'x unified',
                   'barmode':'relative',
                   'xaxis': dict(title = 'Team', tickangle=-60),
                   'title':'New Zealand Results by Opposition (sorted by Number of Victories)'})

fig.show()

As we already know, the Wallabies (Australia) and the Springboks (South Africa) have been tough opposition for New Zealand, each of them managing to avoid defeat 10 times.  
We can also notice that the Lions played only 6 games against the All Blacks and lost only 4 of them, which is quite an achievement.

To better interpret our observations there are two things you must know :
* Every year the Rugby Championship (formerly known as the Tri Nations) takes place. This competition involves New Zealand, South Africa, Australia and Argentina (since 2012), and each team plays the other three twice. That's why Australia, South Africa and Argentina have played New Zealand so many times and therefore had more chances to beat them... and more chances to be beaten as well.
* The Lions are not a nation but a selection of the best players from Great Britain and Ireland. They don't play very often : they tour only once every four years, visiting Australia, New Zealand and South Africa in turn, and traditionally play the host nation in a 3-game series.

Here now is another look at game results per opposing team, this time as victory/draw/defeat ratios :

In [None]:
sort_rel_df = rel_df.sort_values('Win', ascending=False)
traces = []

for col in ['Win', 'Lost', 'Draw']:
    tmp_df = sort_rel_df[col].dropna()
    traces.append(go.Bar(x=tmp_df.index, y=100*tmp_df.values.round(3), name=col,
                         hovertemplate="<b>%{y}%</b>"))
    
fig = go.Figure(data=traces)

fig.update_layout({'height':400,
                   'barmode':'stack',
                   'hovermode':'x unified',
                   'xaxis': dict(title = 'Team', tickangle=-60),
                   'yaxis': dict(title = 'Ratio [%]', dtick=10),
                   'title':'All Blacks Results ratio Versus other Teams'})
fig.show()

At this stage we can say that __New Zealand__ did not really have a favorite victim : they have systematically beaten so many teams that the __victims list would be too long__.

Regarding the best rivals, while South Africa and Australia were indeed strong opposition for New Zealand, it seems that we had overlooked the Lions.
__The selection of players from Great Britain and Ireland has the lowest ratio of defeats against the All Blacks, "only" 66.7%.__  

__Overall, the All Blacks' best rival between 2003 and 2018 was South Africa, with a 28.6% victory ratio for the latter.__
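
The Lions versus South Africa comparison shows how much a ratio depends on the number of games behind it. The sketch below uses hypothetical teams and results, and a minimum of 3 games that is an arbitrary threshold picked for the example, to show how the "best rival" can change once rarely played teams are filtered out :

```python
import pandas as pd

# Hypothetical results against three opponents
games = pd.DataFrame({
    "Opposition Name": ["Team A"] * 4 + ["Team B"] * 2 + ["Team C"] * 3,
    "Result": ["Win", "Win", "Lost", "Win",   # Team A
               "Win", "Lost",                 # Team B
               "Win", "Win", "Win"],          # Team C
})

# Share of each outcome per opponent (each row sums to 1)
rel = pd.crosstab(games["Opposition Name"], games["Result"], normalize="index")

# Opponents met at least 3 times (arbitrary threshold for this sketch)
n_games = games["Opposition Name"].value_counts()
frequent = n_games[n_games >= 3].index

best_overall = rel["Lost"].idxmax()                  # highest loss share, any sample size
best_frequent = rel.loc[frequent, "Lost"].idxmax()   # same, among frequent opponents only

print(best_overall, best_frequent)
```

Here the team with only two games tops the unfiltered ranking, while the filtered ranking points to a team met more often.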


Now that we have looked at teams individually, it's worth investigating New Zealand's performance at home and away.

In [None]:
# cleaning, cleaning, cleaning
del tmp_df, sort_rel_df, rel_df, abs_df, lost_df

---
# __Do the All Blacks Like Traveling ?__

In sport in general we often hear about the advantage of playing at home and how some teams are unbeatable in some specific places.  
For a team like the All Blacks, which was head and shoulders above all the others between 2003 and 2018, does this still mean anything, or do all places just feel like home ?

In [None]:
# here is a dict of latitude and longitude of the different places where New Zealand has played.
# Note : it is not unlikely that for a few places coordinates are wrong, please let me know if that's the case

latlon_dict = {'Melbourne':{'latitude':-37.8249813, 'longitude':144.983613},
        'Brisbane':{'latitude':-27.4648603, 'longitude':153.0096055},
        'Sydney':{'latitude':-33.8471082, 'longitude':151.0643911},
        'Dunedin':{'latitude':-45.86931228637695, 'longitude':170.5242156982422},
        'Auckland':{'latitude':-36.8750114440918, 'longitude':174.7447509765625},
        'Hamilton':{'latitude':-37.7778596, 'longitude':175.2678314}, 
        'North Shore City':{'latitude':-36.8257515, 'longitude':174.8047152},
        'Wellington':{'latitude':-41.2730, 'longitude':174.7858},
        'Christchurch':{'latitude':-43.53333, 'longitude':172.63333},
        'Johannesburg':{'latitude':-26.1987522, 'longitude':28.0588697},
        'Rome':{'latitude':41.9339521, 'longitude':12.4547231},
        'Millennium Stadium':{'latitude':51.4782102, 'longitude':-3.1826127},
        'Stade de France':{'latitude':48.9244725, 'longitude':2.3601325},
        'Cape Town':{'latitude':-33.9034601, 'longitude':18.4111458},
        'Lansdowne Road':{'latitude':53.3352331, 'longitude':-6.2281785},
        'Twickenham':{'latitude':51.456027, 'longitude':-0.3415658},
        'Murrayfield':{'latitude':55.9420314, 'longitude':-3.237366},
        'Buenos Aires':{'latitude':-34.6354623, 'longitude':-58.5206591},
        'Pretoria':{'latitude':-25.7531683, 'longitude':28.2229414},
        'Rustenberg':{'latitude':-25.57863, 'longitude':27.16067},
        'Lyon':{'latitude':45.7237761, 'longitude':4.8322542},
        'Durban':{'latitude':-29.8306668, 'longitude':31.0322801},
        'Marseille':{'latitude':43.269972, 'longitude':5.390935},
        'Toulouse':{'latitude':43.5934385, 'longitude':1.4441164},
        'New Plymouth':{'latitude':-39.0702614, 'longitude':174.0651382},
        'Hong Kong':{'latitude':22.2727391, 'longitude':114.189227},
        'Croke Park':{'latitude':53.3606275, 'longitude':-6.2526326},
        'Bloemfontein':{'latitude':-29.1172351, 'longitude':26.2087696},
        'Tokyo':{'latitude':35.6678, 'longitude':139.5271},
        'Milan':{'latitude':45.4782004, 'longitude':9.123964},
        'Port Elizabeth':{'latitude':-33.937792, 'longitude':25.5994087},
        'La Plata':{'latitude':-34.9138407, 'longitude':-57.9890476},
        'Napier':{'latitude':-39.501834, 'longitude':176.9124712},
        'Chicago':{'latitude':41.8624515, 'longitude':-87.6167151},
        'Apia':{'latitude':-13.8364613, 'longitude':-171.7513179},
        'Wembley':{'latitude':51.5559838, 'longitude':-0.2795789},
        'London':{'latitude':51.53846, 'longitude':-0.01635},
        'Newcastle':{'latitude':54.9754693, 'longitude':-1.6218737},
        'Nelson':{'latitude':-41.2668521, 'longitude':173.2831593},
        'Yokohama':{'latitude':35.5099882, 'longitude':139.6063825}}

latlon_df = pd.DataFrame.from_dict(latlon_dict, orient='index')

In [None]:
# including places coordinates
tmp_df = pd.merge(df, latlon_df.reset_index().rename(columns={'index':'Location'}), on='Location')

# a game counts as 'Home' when its venue lies inside a rough bounding box around New Zealand
is_home = (tmp_df['latitude'] < -30) & (tmp_df['longitude'] > 160)
tmp_df['Ground'] = np.where(is_home, 'Home', 'Away')

Over 15 years New Zealand played 205 games, split as follows : 

In [None]:
tmp_df['Ground'].value_counts()

In more detail, here are the results at home and away :

In [None]:
tmp_df.pivot_table(index=['Result'], columns=['Ground'], aggfunc='size').style.highlight_max()

A more visual representation of the previous table gives you these charts :

In [None]:
home_df = tmp_df.pivot_table(index=['Result'], columns=['Ground'], aggfunc='size')['Home']
away_df = tmp_df.pivot_table(index=['Result'], columns=['Ground'], aggfunc='size')['Away']

home_pie = go.Pie(labels=home_df.index, values=home_df.values, hole=0.4, name='Home Games',
                  hovertext=['{} games<br>{:.2f}%'.format(e, 100 * e/ home_df.sum()) for e in home_df.values],
                  hovertemplate="%{hovertext}")
                      #hoverinfo='label+text'))
away_pie = go.Pie(labels=away_df.index, values=away_df.values, hole=0.4, name='Away Games',
                  hovertext=['{} games<br>{:.2f}%'.format(e, 100 * e/ away_df.sum()) for e in away_df.values],
                  hovertemplate="%{hovertext}")

fig = make_subplots(rows=1, cols=2,
                    specs=[[{"type":"domain"}, {"type":"domain"}]],
                    subplot_titles=("Home results", "Away Results"))

fig.add_trace(home_pie,
              row=1, col=1)

fig.add_trace(away_pie,
              row=1, col=2)

fig.update_layout({'height':500,
                   'title':'All Blacks Game Results 2003-2018',
                   'annotations':[dict(text='Home', x=0.23, y=0.45, font_size=20, showarrow=False),
                                  dict(text='Away', x=0.78, y=0.45, font_size=20, showarrow=False)]})

fig.show()

As the numbers of games played at home and away are roughly the same, it is clear that New Zealand is also subject to the home/away effect.  
Indeed __out of 24 games lost, 19 were away from home, almost 4 times more than at home.__  
In terms of fraction, the All Blacks were three times more likely to lose away than at home.  
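
Counting defeats and comparing defeat rates are two different things, so here is a minimal sketch of the rate-based comparison. The ten games below are hypothetical and only illustrate the computation :

```python
import pandas as pd

# Hypothetical games : 5 at home, 5 away
games = pd.DataFrame({
    "Ground": ["Home"] * 5 + ["Away"] * 5,
    "Result": ["Win", "Win", "Win", "Win", "Lost",
               "Win", "Win", "Lost", "Lost", "Draw"],
})

# Share of games lost on each type of ground
loss_rate = (games["Result"] == "Lost").groupby(games["Ground"]).mean()

# How many times more likely a defeat is away than at home
ratio = loss_rate["Away"] / loss_rate["Home"]
print(loss_rate.to_dict(), ratio)
```

The rate-based ratio only matches the ratio of raw defeat counts when the same number of games is played at home and away.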

Are there actually any specific places where they lost a lot or won a lot ?

In [None]:
geo_df = pd.merge(pd.crosstab(df['Location'], df['Result'], normalize='index'), latlon_df,
                  left_index=True, right_index=True)

So, first, let's check where New Zealand never lost and where they always did :

In [None]:
print("All {} places where New Zealand was unbeaten between 2003 and 2018 :\n".format(len(geo_df['Win'].nlargest(1, keep='all')))
      + ('\n').join(geo_df['Win'].nlargest(1, keep='all').index.tolist()))

In [None]:
print("All {} places where New Zealand never won between 2003 and 2018 :\n".format(len(geo_df['Lost'].nlargest(1, keep='all')))
      + ('\n').join(geo_df['Lost'].nlargest(1, keep='all').index.tolist()))

Note that all three places where New Zealand was unable to win are in South Africa,
reinforcing our previous point that the Springboks represented the biggest challenge over these 15 years.
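
As a side note, the two lists above rely on `nlargest`, which implicitly assumes that the best win ratio is 100% and the worst loss ratio is 100%. A more explicit alternative, sketched here on hypothetical venues, is to test directly whether every game (or no game) at a venue was won :

```python
import pandas as pd

# Hypothetical games at three venues
games = pd.DataFrame({
    "Location": ["Venue A", "Venue A", "Venue B", "Venue B", "Venue C", "Venue C"],
    "Result": ["Win", "Win", "Lost", "Draw", "Win", "Lost"],
})

# For each venue : were all games won ? was any game won ?
won = games["Result"] == "Win"
per_venue = won.groupby(games["Location"]).agg(["all", "any"])

always_won = per_venue[per_venue["all"]].index.tolist()
never_won = per_venue[~per_venue["any"]].index.tolist()

print(always_won, never_won)
```

Note that with this definition a venue with only draws and defeats counts as "never won", which may or may not be what you want.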

Here now is a world map of all games played by the All Blacks.  
The size of each dot is related to the number of games played and the color represents the winning ratio (Blue = high victory ratio, Red = low victory ratio).
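
The map code sizes each dot as `10 * np.log(n + 1)`, where `n` is the number of games at the venue. Here is a quick sketch of what that scaling does, with hypothetical game counts :

```python
import numpy as np

# Hypothetical numbers of games played at four venues
n_games = np.array([1, 5, 20, 40])

# Size grows with log(n + 1) : a venue with 40 games is drawn larger
# than one with a single game, but far from 40 times larger
sizes = 10 * np.log(n_games + 1)
print(sizes.round(1))
```

The logarithm keeps the most visited venues from swallowing the rest of the map, and the `+ 1` keeps a venue with a single game from collapsing to a very small dot.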

In [None]:
tmp_df = pd.merge(df['Location'].value_counts().to_frame(), latlon_df, left_index=True, right_index=True).loc[geo_df.index]

fig = go.Figure(go.Scattermapbox(
    name='All Blacks Games',
    lon = tmp_df['longitude'], lat = tmp_df['latitude'],
    marker=go.scattermapbox.Marker(
        size=10 * np.log(tmp_df['Location']+1),
        color=100 * geo_df['Win'].round(decimals=3),
        showscale=True,
        colorscale='RdYlBu',
        autocolorscale=False,
        cmin=0, cmax=100,
        opacity=0.9
    ),
    text=[i[0] + ": {} games".format(i[1]) for i in tmp_df['Location'].items()],
    hovertemplate="<b>%{text}</b><br>" + "%{marker.color}% winning ratio"
))

fig.update_layout(mapbox_style="open-street-map",
                  mapbox_center_lon=40, mapbox_zoom=0.4,
                  margin={"r":0,"t":0,"l":0,"b":0})
fig.show()

From this map we can confidently say that, despite a higher defeat ratio away, __the All Blacks travel well and take full advantage of playing at home__.

Just for fun now, it is also possible to look at the world as a heatmap.  
This is the purpose of the next two world maps : places in blue are where New Zealand plays confidently and is hard to beat, and red-hot places are where winning is a struggle and games can be hell.  

Here is where New Zealand feels at home :

In [None]:
fig = go.Figure(go.Densitymapbox(colorscale='Blues', name='Winning Heatmap',
    lat=geo_df['latitude'],
    lon=geo_df['longitude'],
    z=100 * geo_df['Win'].round(decimals=4),
    zmin=0.0, zmax=100, radius=20,
    text=geo_df.index.tolist(),
    hovertemplate="%{text}<br><b>%{z}% Winning</b>"))

fig.update_layout(mapbox_style="open-street-map",
                  mapbox_center_lon=40, mapbox_zoom=0.4,
                  margin={"r":0,"t":0,"l":0,"b":0})
fig.show()

And here is where it is hell for them : 

In [None]:
fig = go.Figure(go.Densitymapbox(colorscale='Reds', name='Losing Heatmap',
    lat=geo_df['latitude'],
    lon=geo_df['longitude'],
    z=100 * geo_df[['Lost', 'Draw']].sum(axis=1).round(decimals=4),
    zmin=0.0, zmax=100, opacity=1.0, radius=20,
    text=geo_df.index.tolist(),                         
    hovertemplate="%{text}<br><b>%{z}% Losing</b>"))

fig.update_layout(mapbox_style="open-street-map",
                  mapbox_center_lon=40, mapbox_zoom=0.4,
                  margin={"r":0,"t":0,"l":0,"b":0})
fig.show()

Well... yep, South Africa was really the worst place for New Zealand.

---
# __Short Summary__

What can we actually conclude from our little analysis ?  

First of all, New Zealand was head and shoulders above all other nations between 2003 and 2018.  
Even though we lack statistics about other teams, no indicator allows us to question the All Blacks' 15-year-long hegemony.  

However they were not unbeatable.  
While 6 teams have been able to beat New Zealand, two stand out :
* the __Lions__, with the lowest defeat rate against the All Blacks (66.7%).
* __South Africa__, who were able to win 10 times (28.6% of their games) and stand strong at home.

Finally, and unsurprisingly, we found that if you want to beat them, you'd better play them at home.
It will increase your chances, but not by that much...