<img src="https://clipart.info/images/ccovers/1526942229Fifa-World-Cup_Russia-2018-logo-text.png" width=500>

![](https://e2k9ube.cloudimg.io/s/cdn/x/https://edienet.s3.amazonaws.com/library/features/images/full_6822.jpg?v=14/06/2018%2011:20:00)

**The 2018 FIFA World Cup** was an international football tournament contested by men's national teams and took place between **14 June** and **15 July 2018** in **Russia.** It was the **21st FIFA World Cup,** a worldwide football tournament held once every **four years**. It was the eleventh time the championships had been held in Europe, and the first time they were held in Eastern Europe. At an estimated cost of over $14.2 billion, it was the most expensive World Cup to date.

The tournament phase involved **32 teams,** of which 31 came through qualifying competitions, while Russia qualified automatically as the host nation. Of the 32, 20 had also appeared in the 2014 event, while Iceland and Panama both made their first appearances at the World Cup. A total of 64 matches were played in 12 venues across 11 cities. Germany, the defending champions, were eliminated in the group stage. Host nation Russia was eliminated in the quarter-finals. In the final, **France** played **Croatia** on **15 July** at the **Luzhniki** Stadium in **Moscow.** **France won the match 4–2 to claim their second World Cup.**

The event featured a number of accolades. Croatian player **Luka Modrić** was voted the tournament's best player, winning the Golden Ball. England's **Harry Kane** scored the most goals during the tournament, with six. Belgium's **Thibaut Courtois** won the Golden Glove, awarded to the tournament's best goalkeeper.

![](https://media.contentapi.ea.com/content/dam/ea/easports/fifa/features/2018/world-cup-announce-april-30/trophy-hero/f18wc-features-trophy-hero-bg-xs.jpg)

### **World Cup Details**

#### **Tournament Details**

* **Host country:** Russia<br>
* **Dates:** 14 June – 15 July<br>
* **Teams:** 32 (from 5 confederations)<br>
* **Venue(s):** 12 (in 11 host cities)<br>

#### **Final Positions**

* **Champions:** France (2nd title)<br>
* **Runners-up:** Croatia<br>
* **Third place:** Belgium<br>
* **Fourth place:** England<br>

#### **Tournament Statistics**

* **Matches played:** 64<br>
* **Goals scored:** 169 (2.64 per match)<br>
* **Attendance:** 3,031,768 (47,371 per match)<br>
* **Top scorer(s):** Harry Kane, England (6 goals)<br>
* **Best player(s):** Luka Modrić, Croatia<br>
* **Best young player:** Kylian Mbappé, France<br>
* **Best goalkeeper:** Thibaut Courtois, Belgium<br>
* **Fair play award:** Spain<br>

### **Venues for World Cup 2018**

![](https://redfireonline.files.wordpress.com/2018/04/skysports-russia-world-cup-venues-stadiums_4121055.jpg)

<img src="https://arabicfonts.net/fonts/dusha-v5-regular.png?forcegenerate=True&text=Let%27s%20Kick%20Off%20Now" width=400 align=left>

**Load the required libraries**

In [None]:
import pandas as pd
import numpy as np 
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

import warnings
warnings.filterwarnings('ignore')

**Let's load the dataset**

In [None]:
fifa18 = pd.read_csv("../input/fifa-worldcup-2018/2018_worldcup_v3.csv")

### **Data Analysis On FIFA World Cup 2018 Dataset**

**Checking the first 5 and last 5 records of the dataset**

In [None]:
fifa18.head(5)

In [None]:
fifa18.tail(5)

**Let's check for duplicate rows in the dataset**

In [None]:
fifa18.duplicated().sum()

In [None]:
fifa18.shape

In [None]:
fifa18.info()

**So, there are 64 records in 14 columns. There are no null values and no duplicate rows.**
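The same three checks (shape, duplicates, missing values) can be collapsed into a few numbers that are easy to assert on. The following is only a minimal sketch on a hypothetical three-row stand-in table, not the real file; the `matches` frame and its values are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the match table (values are illustrative only).
matches = pd.DataFrame({
    "Home Team Name": ["Russia", "Egypt", "Russia"],
    "Away Team Name": ["Saudi Arabia", "Uruguay", "Saudi Arabia"],
    "Home Team Goals": [5, 0, 5],
})

# Rows that are exact copies of an earlier row.
n_duplicates = int(matches.duplicated().sum())
# Missing cells summed over every column.
n_missing = int(matches.isnull().sum().sum())

print(matches.shape, n_duplicates, n_missing)
```

On the toy table the third row repeats the first, so one duplicate is reported; on a clean dataset both counts would be zero.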

**Let's extract the kick-off hour from the Datetime column and store it in a new column.**

In [None]:
fifa18['Hour'] = fifa18.Datetime.apply(lambda x: x.split(' - ')[1])
fifa18.Datetime = fifa18.Datetime.apply(lambda x: x.split(' - ')[0])

In [None]:
fifa18.head()
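As an alternative to two `apply` calls, pandas can split the string once and return both parts as columns. This is a sketch on assumed sample values; the exact date format in the real file may differ, and the only thing relied on is the `' - '` separator used above.

```python
import pandas as pd

# Assumed sample values; only the ' - ' separator is taken from the notebook.
sample = pd.DataFrame({"Datetime": ["14 Jun 2018 - 18:00", "15 Jun 2018 - 15:00"]})

# Split once on the separator; expand=True returns the pieces as columns 0 and 1.
parts = sample["Datetime"].str.split(" - ", n=1, expand=True)
sample["Datetime"] = parts[0]   # date part
sample["Hour"] = parts[1]       # kick-off time part

print(sample)
```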

**Let's add a total goals column by summing the home and away goals.**

In [None]:
fifa18['Total_Goals'] = fifa18['Home Team Goals']+fifa18['Away Team Goals']

In [None]:
fifa18.head()

**Let's rename a few columns.**

In [None]:
fifa18.rename(columns={'Home Team Name': 'Home_Team',
                    'Away Team Name': 'Away_Team',
                    'Home Team Goals': 'Home_Team_Goals',
                    'Away Team Goals': 'Away_Team_Goals',}, inplace=True)

In [None]:
fifa18.head()

### **Exploratory Data Analysis - EDA**

In [None]:
fifa18['City'].value_counts().sort_index()

In [None]:
plt.figure(figsize=(10,5))
plt.title('Number Of Matches Held In Each Russian City', fontsize=14)
plt.xlabel("City", fontsize=12)
plt.ylabel("Count", fontsize=12)
plt.xticks(rotation=90, fontsize=12)
plt.yticks(fontsize=12)
sns.countplot(x = "City", data = fifa18, palette='rocket', order=fifa18["City"].value_counts().index)

**From the above table and plot, we can observe that the most matches were held in Moscow.**
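The bar order in the plot comes from `value_counts()`, which sorts categories by frequency in descending order. A minimal sketch of that step on a made-up city list (the values are assumptions, not the dataset):

```python
import pandas as pd

# Made-up city column for illustration.
cities = pd.Series(["Moscow", "Sochi", "Moscow", "Kazan", "Moscow", "Sochi"], name="City")

counts = cities.value_counts()      # sorted by count, largest first
order = counts.index.tolist()       # this list is what `order=` receives in countplot
top_city = counts.idxmax()
top_count = int(counts.max())

print(order, top_city, top_count)
```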

In [None]:
fifa18['Hour'].value_counts().sort_index()

In [None]:
plt.figure(figsize=(10,5))
plt.title('Number Of Matches Held In Each Hour', fontsize=14)
plt.xlabel("Hour", fontsize=12)
plt.ylabel("Count", fontsize=12)
plt.xticks(rotation=90, fontsize=12)
plt.yticks(fontsize=12)
sns.countplot(x = "Hour", data = fifa18, palette='mako', order=fifa18["Hour"].value_counts().index)

**From the above table and plot, we can observe that the most matches kicked off at 21:00.**

In [None]:
fifa18['Stadium'].value_counts().sort_index()

In [None]:
plt.figure(figsize=(10,5))
plt.title('Number Of Matches Held In Each Stadium', fontsize=14)
plt.xlabel("Stadiums", fontsize=12)
plt.ylabel("Count", fontsize=12)
plt.xticks(rotation=90, fontsize=12)
plt.yticks(fontsize=12)
sns.countplot(x = "Stadium", data = fifa18, palette='Greens_d', order=fifa18["Stadium"].value_counts().index)

**From the above table and plot, we can observe that the Luzhniki Stadium and the Saint Petersburg Stadium hosted the most matches, with 7 each.**

In [None]:
# Select the column before summing so that only Total_Goals is aggregated
goals_by_day = fifa18.groupby('Datetime')['Total_Goals'].sum().reset_index()
goals_by_day.columns = ['Datetime','Total Goals By Day']
goals_by_day = goals_by_day.sort_values('Total Goals By Day', ascending=False)
goals_by_day

In [None]:
plt.figure(figsize=(12,8))
sns.barplot(y=goals_by_day['Datetime'], x=goals_by_day['Total Goals By Day'], palette='twilight', orient='h')
plt.title('No Of Goals Scored On Each Day', fontsize=15)
plt.xlabel('Goals', fontsize=12)
plt.ylabel('Date', fontsize=12)
plt.xticks(fontsize=12)
plt.yticks(fontsize=12)

**From the above table and plot, we can observe that the most goals were scored on 24th June 2018, whereas the fewest were scored on 10th July 2018.**

**However, the number of matches per day differs between the group stage, quarter-finals and semi-finals, so we also need to look at the number of matches held on each day.**

In [None]:
fifa18['Datetime'].value_counts().sort_index()

In [None]:
plt.figure(figsize=(10,5))
plt.title('Number Of Matches Held On Each Day', fontsize=14)
plt.xlabel("Days", fontsize=12)
plt.ylabel("Count", fontsize=12)
plt.xticks(rotation=90, fontsize=12)
plt.yticks(fontsize=12)
sns.countplot(x = "Datetime", data = fifa18, palette='cool_d', order=fifa18["Datetime"].value_counts().index)

**From the above table and plot, we can observe that there were 4 days on which 4 matches were held.**
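Since days with more matches naturally see more goals, one way to compare days fairly is goals per match. The sketch below computes both aggregates in one pass on a small illustrative table; the values are not read from the dataset, and the column names `Goals`, `Matches` and `Goals_Per_Match` are introduced here only for the example.

```python
import pandas as pd

# Illustrative rows: one row per match, with the day and the goals in that match.
toy = pd.DataFrame({
    "Datetime": ["24 Jun 2018", "24 Jun 2018", "24 Jun 2018", "10 Jul 2018"],
    "Total_Goals": [7, 4, 3, 1],
})

# Named aggregation: total goals and number of matches for each day.
per_day = (
    toy.groupby("Datetime")["Total_Goals"]
       .agg(Goals="sum", Matches="count")
       .reset_index()
)
per_day["Goals_Per_Match"] = per_day["Goals"] / per_day["Matches"]

print(per_day)
```

Ranking by `Goals_Per_Match` instead of `Goals` would separate busy days from genuinely high-scoring ones.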

**Now, let's work on the total goals per team, both home and away.**

In [None]:
# Goals scored and conceded by each team when listed as the home side
goals_by_home = fifa18.groupby('Home_Team')[['Home_Team_Goals', 'Away_Team_Goals']].sum()
goals_by_home.columns = ['Home_Scored', 'Home_Conceded']
# Goals scored and conceded by each team when listed as the away side
goals_by_away = fifa18.groupby('Away_Team')[['Away_Team_Goals', 'Home_Team_Goals']].sum()
goals_by_away.columns = ['Away_Scored', 'Away_Conceded']
# Join on the team name (the index) instead of on row position, so the totals
# stay correct even if a team never appears on one of the two sides
goals_total = goals_by_home.join(goals_by_away, how='outer').fillna(0).astype(int)
goals_total['Scored'] = goals_total.Home_Scored + goals_total.Away_Scored
goals_total['Conceded'] = goals_total.Home_Conceded + goals_total.Away_Conceded
goals_total = goals_total[['Scored', 'Conceded']].rename_axis('Home_Team').reset_index()
goals_total

In [None]:
goals_total['Goal_Diff'] = goals_total.Scored - goals_total.Conceded
goals_total = goals_total.rename(columns={'Home_Team': 'Team_Name'})
goals_total

**The above table shows the goals scored, the goals conceded and the goal difference for each team.**
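Another way to build the same kind of table is to stack the home and away views of every match into one long table and group once by team. This is a sketch on a hypothetical three-match table with teams `A`, `B` and `C`; the helper name `per_team_rows` is an assumption made for the example.

```python
import pandas as pd

# Hypothetical results: A 2-1 B, B 1-1 C, C 0-3 A.
toy = pd.DataFrame({
    "Home_Team": ["A", "B", "C"],
    "Away_Team": ["B", "C", "A"],
    "Home_Team_Goals": [2, 1, 0],
    "Away_Team_Goals": [1, 1, 3],
})

cols = ["Team_Name", "Scored", "Conceded"]
# View each match from the home side...
home = toy.rename(columns={"Home_Team": "Team_Name",
                           "Home_Team_Goals": "Scored",
                           "Away_Team_Goals": "Conceded"})[cols]
# ...and from the away side.
away = toy.rename(columns={"Away_Team": "Team_Name",
                           "Away_Team_Goals": "Scored",
                           "Home_Team_Goals": "Conceded"})[cols]

# One row per (team, match), then a single groupby gives the totals.
per_team_rows = pd.concat([home, away], ignore_index=True)
table = per_team_rows.groupby("Team_Name")[["Scored", "Conceded"]].sum()
table["Goal_Diff"] = table["Scored"] - table["Conceded"]

print(table)
```

Because every goal scored by one team is conceded by another, the `Goal_Diff` column sums to zero, which makes a handy sanity check.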

In [None]:
goals_total = goals_total.sort_values('Scored', ascending=False)
plt.figure(figsize=(12,8))
sns.barplot(x=goals_total['Team_Name'], y=goals_total['Scored'], palette='coolwarm')
plt.title('No Of Goals Scored By Each Team', fontsize=15)
plt.xlabel('Teams', fontsize=12)
plt.ylabel('No of Goals', fontsize=12)
plt.xticks(rotation=90, fontsize=12)
plt.yticks(fontsize=12)

**As is quite evident, Belgium scored the most goals in the tournament.**

In [None]:
goals_total = goals_total.sort_values('Conceded', ascending=False)
plt.figure(figsize=(12,8))
sns.barplot(x=goals_total['Team_Name'], y=goals_total['Conceded'], palette='cubehelix')
plt.title('No Of Goals Conceded By Each Team', fontsize=15)
plt.xlabel('Teams')
plt.ylabel('No of Goals')
plt.xticks(rotation=90, fontsize=12)
plt.yticks(fontsize=12)

**From the above plot, Panama conceded the most goals, while Argentina and Croatia share 2nd position in the chart.**

In [None]:
goalbyhome_city = fifa18.groupby('City')['Home_Team_Goals'].sum().reset_index()
goalbyhome_city.columns = ['City','Total Goals By Home Team']
goalbyhome_city = goalbyhome_city.sort_values('Total Goals By Home Team', ascending=False)
goalbyhome_city

In [None]:
plt.figure(figsize=(12,8))
sns.barplot(x=goalbyhome_city['City'], y=goalbyhome_city['Total Goals By Home Team'], palette='inferno')
plt.title('No Of Goals Scored By Home Team In Each city', fontsize=15)
plt.xlabel('City', fontsize=12)
plt.ylabel('No of Goals', fontsize=12)
plt.xticks(rotation=90, fontsize=12)
plt.yticks(fontsize=12)

**From the above plot, we can see that home teams scored the most goals in Moscow.**

In [None]:
goalbyaway_city = fifa18.groupby('City')['Away_Team_Goals'].sum().reset_index()
goalbyaway_city.columns = ['City','Total Goals By Away Team']
goalbyaway_city = goalbyaway_city.sort_values('Total Goals By Away Team', ascending=False)
goalbyaway_city

In [None]:
plt.figure(figsize=(12,8))
sns.barplot(x=goalbyaway_city['City'], y=goalbyaway_city['Total Goals By Away Team'], palette='icefire')
plt.title('No Of Goals Scored By Away Team In Each City', fontsize=15)
plt.xlabel('City', fontsize=12)
plt.ylabel('No of Goals', fontsize=12)
plt.xticks(rotation=90, fontsize=12)
plt.yticks(fontsize=12)

**From the above plot, we can see that away teams also scored the most goals in Moscow.**

**Therefore, we can also say that Moscow is the most eventful city, with the most goals scored by both home and away teams.**
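Because Moscow also hosted the most matches, a high goal total there is partly a matter of volume. A possible follow-up is to put total goals next to goals per match for each city. The sketch below uses made-up numbers purely to show the calculation; the column names `Goals`, `Matches` and `Goals_Per_Match` are assumptions introduced for the example.

```python
import pandas as pd

# Made-up matches: three in Moscow, one in Sochi.
toy = pd.DataFrame({
    "City": ["Moscow", "Moscow", "Moscow", "Sochi"],
    "Home_Team_Goals": [1, 0, 2, 3],
    "Away_Team_Goals": [1, 1, 0, 1],
})
toy["Total_Goals"] = toy["Home_Team_Goals"] + toy["Away_Team_Goals"]

# Total goals and match count per city, then the per-match rate.
by_city = toy.groupby("City")["Total_Goals"].agg(Goals="sum", Matches="count")
by_city["Goals_Per_Match"] = by_city["Goals"] / by_city["Matches"]

print(by_city.sort_values("Goals_Per_Match", ascending=False))
```

In this toy table Moscow leads on total goals while Sochi leads on goals per match, which is the kind of difference the two metrics can reveal.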

In [None]:
goalbyhome_standium = fifa18.groupby('Stadium')['Home_Team_Goals'].sum().reset_index()
goalbyhome_standium.columns = ['Stadium','Total Goals By Home Team']
goalbyhome_standium = goalbyhome_standium.sort_values('Total Goals By Home Team', ascending=False)
goalbyhome_standium

In [None]:
plt.figure(figsize=(12,8))
sns.barplot(x=goalbyhome_standium['Stadium'], y=goalbyhome_standium['Total Goals By Home Team'], palette='flare')
plt.title('No Of Goals Scored By Home Team In Each Stadium', fontsize=15)
plt.xlabel('Stadium', fontsize=12)
plt.ylabel('No of Goals', fontsize=12)
plt.xticks(rotation=90, fontsize=12)
plt.yticks(fontsize=12)

**From the above plot, we can see that home teams scored the most goals at the Luzhniki Stadium.**

In [None]:
goalbyaway_standium = fifa18.groupby('Stadium')['Away_Team_Goals'].sum().reset_index()
goalbyaway_standium.columns = ['Stadium','Total Goals By Away Team']
goalbyaway_standium = goalbyaway_standium.sort_values('Total Goals By Away Team', ascending=False)
goalbyaway_standium

In [None]:
plt.figure(figsize=(12,8))
sns.barplot(x=goalbyaway_standium['Stadium'], y=goalbyaway_standium['Total Goals By Away Team'], palette='twilight_shifted_r')
plt.title('No Of Goals Scored By Away Team In Each Stadium', fontsize=15)
plt.xlabel('Stadium', fontsize=12)
plt.ylabel('No of Goals', fontsize=12)
plt.xticks(rotation=90, fontsize=12)
plt.yticks(fontsize=12)

**From the above plot, we can see that away teams scored the most goals at the Kazan Arena.**