# Introduction
In this article, we analyze the match results of the Vietnam national football team from 2010 to 2019 (counting only official matches recognized by FIFA).

# Data
The data for this article is taken from this [Wikipedia page](https://en.wikipedia.org/wiki/Vietnam_national_football_team_results).

The data includes the following columns:
- `date` : date of the match.
- `opponent`: opponent of the Vietnamese team.
- `score`: match score (regulation 90 minutes only). The number of goals scored by the Vietnamese team is always written first.
- `venue`: venue for the match.
- `venue_type`: whether the Vietnamese team played at home, away, or on neutral ground.
     - `H`: home match.
     - `A`: away match.
     - `N`: match on neutral ground.
- `competition`: tournament or competition the match belongs to.

# Exercise

**Question 1**: Read the data into a `DataFrame`.

In [1]:
import pandas as pd
data = pd.read_csv('Data.txt')
data

Unnamed: 0,date,opponent,score,venue,competition,venue_type
0,2010-01-06,Lebanon,1-1,"Saida International Stadium, Saida",2011 AFC Asian Cup qualification,A
1,2010-01-17,China PR,1-2,"Mỹ Đình National Stadium, Hanoi",2011 AFC Asian Cup qualification,H
2,2010-09-20,Kuwait,3-0,"Mỹ Đình National Stadium, Hanoi",2010 Millennial Anniversary of Hanoi Football ...,H
3,2010-09-22,Australia,0-2,"Mỹ Đình National Stadium, Hanoi",2010 Millennial Anniversary of Hanoi Football ...,H
4,2010-09-24,North Korea,0-0,"Mỹ Đình National Stadium, Hanoi",2010 Millennial Anniversary of Hanoi Football ...,H
...,...,...,...,...,...,...
90,2019-09-05,Thailand,0-0,"Thammasat Stadium, Pathum Thani",2022 FIFA World Cup qualification - AFC Second...,A
91,2019-10-10,Malaysia,1-0,"Mỹ Đình National Stadium, Hanoi",2022 FIFA World Cup qualification - AFC Second...,H
92,2019-10-15,Indonesia,3-1,"Kapten I Wayan Dipta Stadium, Gianyar",2022 FIFA World Cup qualification - AFC Second...,A
93,2019-11-14,United Arab Emirates,1-0,"Mỹ Đình National Stadium, Hanoi",2022 FIFA World Cup qualification - AFC Second...,H


**Question 2**: Check the data types of the columns and make the necessary conversions.

In [2]:
data = data.assign(
    date = pd.to_datetime(data.date)
)
data.dtypes

date           datetime64[ns]
opponent               object
score                  object
venue                  object
competition            object
venue_type             object
dtype: object
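
The conversion above can also be done while the file is being read. The following is a minimal sketch, assuming a CSV with the same `date` column; the two sample rows are copied from the table above, and the `io.StringIO` buffer is only a stand-in for `Data.txt`.

```python
import io
import pandas as pd

# Two sample rows standing in for Data.txt (assumption: same column names as above)
sample = io.StringIO(
    "date,opponent,score\n"
    "2010-01-06,Lebanon,1-1\n"
    "2010-01-17,China PR,1-2\n"
)

# parse_dates converts the listed columns to datetime64 while reading
df = pd.read_csv(sample, parse_dates=["date"])
print(df.dtypes)
```

Parsing at load time keeps the type conversion next to the file it applies to, while `pd.to_datetime` remains handy when the column is created later.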

**Question 3**: How many matches did the Vietnamese team play from 2010 to 2019, and how many of them were played at home, away, and on neutral ground?

In [None]:
print('Total number of matches:', data.shape[0])
# Count the matches for each venue type (H = home, A = away, N = neutral)
for venue_type in data.venue_type.unique():
  print('Venue type', venue_type, '- number of matches:', data[data['venue_type'] == venue_type].shape[0])

Total number of matches: 95
Venue type A - number of matches: 41
Venue type H - number of matches: 50
Venue type N - number of matches: 4
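
The same per-venue tally can be produced in a single call with `value_counts()`. This is a small sketch on made-up values (the `toy` frame is hypothetical, not the real data set).

```python
import pandas as pd

# Toy venue_type column (made-up values, not the real data)
toy = pd.DataFrame({"venue_type": ["H", "A", "H", "N", "H", "A"]})

# value_counts tallies each category in one call, so no explicit loop is needed
counts = toy["venue_type"].value_counts()
print("Total matches:", len(toy))
print(counts)
```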


**Question 4**: How many different venues did the Vietnamese team use for its home matches? Print those venues.

In [4]:
# Keep only the home matches, then count and list the distinct venues
place_H = data[data['venue_type'] == 'H']
print('Number of home venues:', place_H.venue.nunique())
place_H.venue.unique()

Number of home venues: 5


array(['Mỹ Đình National Stadium, Hanoi',
       'Thống Nhất Stadium, Ho Chi Minh City',
       'Gò Đậu Stadium, Bình Dương', 'Lạch Tray Stadium, Hải Phòng',
       'Hàng Đẫy Stadium, Hanoi'], dtype=object)

**Question 5**: On average, how many matches does the Vietnamese team play each year?

In [6]:
# Number of distinct calendar years that appear in the data
year_play = data.date.dt.year.nunique()
print('Average number of matches per year:', data.shape[0] / year_play)

Average number of matches per year: 9.5
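
Dividing the total by the number of distinct years gives the same value as averaging the per-year counts; note that a year without any match would be absent from both. The sketch below illustrates this on a hypothetical `toy` frame with a handful of dates.

```python
import pandas as pd

# Toy match dates (illustrative subset, not the full data set)
toy = pd.DataFrame({
    "date": pd.to_datetime([
        "2010-01-06", "2010-09-20",
        "2011-07-03", "2011-10-07", "2011-11-15",
        "2012-06-08",
    ])
})

# Matches per calendar year, then the mean of those counts
per_year = toy.groupby(toy["date"].dt.year).size()
avg = per_year.mean()
print(per_year)
print("Average matches per year:", avg)
```

The intermediate `per_year` series is also useful on its own, for example to spot unusually busy or quiet years.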


**Question 6**: Print out the number of matches the Vietnamese team played in 2015.

In [7]:
print('Number of matches played in 2015:', data[data['date'].dt.year == 2015].shape[0])

Number of matches played in 2015: 5


**Question 7**: From the `score` column, create two new columns:
- `goal`: number of goals scored by the Vietnamese team.
- `conceded_goal`: number of goals conceded by the Vietnamese Team.

---
_Hint_: use `.str.split('-')` to split the string from `score`.

In [8]:
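# Split 'score' on '-' as the hint suggests: column 0 = goals scored, column 1 = goals conceded
parts = data['score'].str.split('-', expand=True)
# Convert to numbers so that comparisons and max() work numerically, not as text
goal = pd.to_numeric(parts[0])
conceded_goal = pd.to_numeric(parts[1])
data = data.assign(
    goal = goal,
    conceded_goal = conceded_goal
)
data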

Unnamed: 0,date,opponent,score,venue,competition,venue_type,goal,conceded_goal
0,2010-01-06,Lebanon,1-1,"Saida International Stadium, Saida",2011 AFC Asian Cup qualification,A,1,1
1,2010-01-17,China PR,1-2,"Mỹ Đình National Stadium, Hanoi",2011 AFC Asian Cup qualification,H,1,2
2,2010-09-20,Kuwait,3-0,"Mỹ Đình National Stadium, Hanoi",2010 Millennial Anniversary of Hanoi Football ...,H,3,0
3,2010-09-22,Australia,0-2,"Mỹ Đình National Stadium, Hanoi",2010 Millennial Anniversary of Hanoi Football ...,H,0,2
4,2010-09-24,North Korea,0-0,"Mỹ Đình National Stadium, Hanoi",2010 Millennial Anniversary of Hanoi Football ...,H,0,0
...,...,...,...,...,...,...,...,...
90,2019-09-05,Thailand,0-0,"Thammasat Stadium, Pathum Thani",2022 FIFA World Cup qualification - AFC Second...,A,0,0
91,2019-10-10,Malaysia,1-0,"Mỹ Đình National Stadium, Hanoi",2022 FIFA World Cup qualification - AFC Second...,H,1,0
92,2019-10-15,Indonesia,3-1,"Kapten I Wayan Dipta Stadium, Gianyar",2022 FIFA World Cup qualification - AFC Second...,A,3,1
93,2019-11-14,United Arab Emirates,1-0,"Mỹ Đình National Stadium, Hanoi",2022 FIFA World Cup qualification - AFC Second...,H,1,0
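
The hint recommends `.str.split('-')` rather than fixed-position slicing, and the reason shows up as soon as a score has two digits. In this short sketch, `"10-0"` is a made-up score added only to show the difference.

```python
import pandas as pd

# "1-1" and "7-1" appear in the data above; "10-0" is a made-up double-digit score
scores = pd.Series(["1-1", "7-1", "10-0"])

# Fixed-position slicing keeps only the first character, so "10" becomes "1"
sliced_goal = scores.str[:1]

# Splitting on the separator works for any number of digits
parts = scores.str.split("-", expand=True)
split_goal = pd.to_numeric(parts[0])
split_conceded = pd.to_numeric(parts[1])

print(sliced_goal.tolist())
print(split_goal.tolist(), split_conceded.tolist())
```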


**Question 8**: Print the matches in which the Vietnamese team scored the most goals.

In [9]:
max_goal = data.goal.max()
data[data['goal'] == max_goal]

Unnamed: 0,date,opponent,score,venue,competition,venue_type,goal,conceded_goal
10,2010-12-02,Myanmar,7-1,"Mỹ Đình National Stadium, Hanoi",2010 AFF Championship,H,7,1
16,2011-07-03,Macau,7-1,"Estádio Campo Desportivo, Taipa",2014 FIFA World Cup qualification - AFC First ...,A,7,1
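
An alternative to filtering on the maximum is `DataFrame.nlargest` with `keep='all'`, which returns every row tied for the top value. This is a sketch on four rows taken from the tables above; it assumes the `goal` column is already numeric, because `nlargest` does not accept text columns.

```python
import pandas as pd

# Four matches from the tables above, with goal stored as integers
toy = pd.DataFrame({
    "opponent": ["Myanmar", "Macau", "Kuwait", "Lebanon"],
    "goal": [7, 7, 3, 1],
})

# n=1 asks for the top value; keep="all" retains every row tied at that value
top = toy.nlargest(1, "goal", keep="all")
print(top)
```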


**Question 9**: Calculate the average number of goals scored by the Vietnamese team per match. (Note: the number of goals is not the same as the number of wins.)

In [10]:
# Make sure both goal columns are numeric before aggregating
data = data.assign(
    goal = pd.to_numeric(data['goal']),
    conceded_goal = pd.to_numeric(data['conceded_goal'])
)
mean_goal = data.goal.mean()
print('Average goals per match:', mean_goal)

Average goals per match: 1.5473684210526315


**Question 10**: Calculate the winning rate of the Vietnamese Team.

In [12]:
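# Count the matches in which Vietnam scored more goals than it conceded
win = 0
for match in range(data.shape[0]):
  if data.iloc[match].goal > data.iloc[match].conceded_goal:
    win = win + 1
# Express the number of wins as a percentage of all matches
print('Win rate:' ,(100/data.shape[0])*win)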

Win rate: 43.1578947368421
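
The row-by-row loop can be replaced by a column comparison: comparing two columns yields a boolean series, and its mean is the share of `True` values. The sketch below uses the first four scores from the table above (1-1, 1-2, 3-0, 0-2) in a hypothetical `toy` frame.

```python
import pandas as pd

# First four scores from the table above: 1-1, 1-2, 3-0, 0-2
toy = pd.DataFrame({
    "goal": [1, 1, 3, 0],
    "conceded_goal": [1, 2, 0, 2],
})

# True where Vietnam scored more than it conceded
is_win = toy["goal"] > toy["conceded_goal"]

# The mean of a boolean series is the fraction of True values
win_rate = is_win.mean() * 100
print("Win rate (%):", win_rate)
```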


**Question 11**: Select the friendly matches played during this period. Note that friendly matches are recorded as `'Friendly'` in the `competition` column.

In [13]:
data_friendly = data[data.competition.str.find('Friendly') != -1]
data_friendly

Unnamed: 0,date,opponent,score,venue,competition,venue_type,goal,conceded_goal
5,2010-10-08,India,1-3,India,Friendly,A,1,3
6,2010-10-12,Kuwait,1-3,"Jaber Al-Ahmad International Stadium, Kuwait City",Friendly,A,1,3
19,2011-10-07,Japan,0-1,Japan,Friendly,A,0,1
20,2012-06-08,China PR,0-3,"Wuhan Sports Centre Stadium, Wuhan",Friendly,A,0,3
21,2012-06-10,Hong Kong,2-1,"Mong Kok Stadium, Kowloon",Friendly,A,2,1
22,2012-06-23,Mozambique,1-0,"Thống Nhất Stadium, Ho Chi Minh City",Friendly,H,1,0
23,2012-09-11,Malaysia,2-0,"Shah Alam Stadium, Shah Alam",Friendly,A,2,0
24,2012-09-15,Indonesia,0-0,"Gelora Bung Tomo Stadium, Surabaya",Friendly,A,0,0
25,2012-10-16,Indonesia,0-0,"Mỹ Đình National Stadium, Hanoi",Friendly,H,0,0
29,2012-11-03,Malaysia,1-0,"Mỹ Đình National Stadium, Hanoi",Friendly,H,1,0


**Question 12**: Calculate the win rate of the Vietnamese Team in friendly matches.

In [14]:
# Count the friendly matches that Vietnam won
win_friendly = 0
for match_friendly in range(data_friendly.shape[0]):
  if data_friendly.iloc[match_friendly].goal > data_friendly.iloc[match_friendly].conceded_goal:
    win_friendly = win_friendly + 1
print('Win rate:' ,(100/data_friendly.shape[0])*win_friendly)

Win rate: 50.00000000000001
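
The trailing `01` in the result above is floating-point rounding: `100 / n` is rounded before it is multiplied by the number of wins. Multiplying first keeps the numerator an exact integer, so only the final division is rounded. A minimal sketch, with hypothetical counts chosen to give a 50% rate:

```python
# Hypothetical counts for illustration (half of the matches won)
wins, matches = 11, 22

# Multiply first: 100 * wins is an exact integer, so only the division rounds
rate = 100 * wins / matches
print("Win rate:", rate)
```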


**Question 13**: Average number of goals conceded in **non** friendly matches.

In [15]:
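# Keep only the matches whose competition name does not contain 'Friendly'
data_not_friend = data[data.competition.str.find('Friendly') == -1]
# Count the non-friendly matches that Vietnam lost and express them as a percentage
lost = 0
for match_lost in range(data_not_friend.shape[0]):
  if data_not_friend.iloc[match_lost].goal < data_not_friend.iloc[match_lost].conceded_goal:
    lost = lost + 1
print('Lose rate:' ,(100/data_not_friend.shape[0])*lost)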

Lose rate: 31.506849315068493
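
The cell above reports the share of non-friendly matches that were lost rather than an average. The average number of goals conceded that the question asks for can be computed by taking the mean of `conceded_goal` over the same filtered rows. The sketch below uses four matches from the tables above in a hypothetical `toy` frame, so its result is not the figure for the full data set.

```python
import pandas as pd

# Four matches from the tables above: Lebanon 1-1, India 1-3, Myanmar 7-1, Mozambique 1-0
toy = pd.DataFrame({
    "competition": [
        "2011 AFC Asian Cup qualification",
        "Friendly",
        "2010 AFF Championship",
        "Friendly",
    ],
    "conceded_goal": [1, 3, 1, 0],
})

# ~ negates the mask, keeping rows whose competition does not contain 'Friendly'
non_friendly = toy[~toy["competition"].str.contains("Friendly")]

# Average goals conceded per non-friendly match
avg_conceded = non_friendly["conceded_goal"].mean()
print("Average goals conceded (non-friendly):", avg_conceded)
```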
