<a href="https://colab.research.google.com/github/yashguptaab99/Cricket-Prediction/blob/master/Cricket_Predictions_Batting.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# INCREASED PREDICTION ACCURACY IN THE GAME OF CRICKET USING MACHINE LEARNING

Player selection is one the most important tasks for any sport and cricket is no exception. The performance of the players depends on various factors such as the opposition team, the venue, his current form etc. The team management, the coach and the captain select 11 players for each match from a squad of 15 to 20 players. They analyze different characteristics and the statistics of the players to select the best playing 11 for each match. Each batsman contributes by scoring maximum runs possible and each bowler contributes by taking maximum wickets and conceding minimum runs. This paper attempts to predict the performance of players as how many runs will each batsman score and how many wickets will each bowler take for both the teams. Both the problems are targeted as classification problems where number of runs and number of wickets are classified in different ranges. We used naïve bayes, random forest, multiclass SVM and decision tree classifiers to generate the prediction models for both the problems. Random Forest classifier was found to be the most accurate for both the problems. 

# Importing Libraray

In [1]:
import pandas as pd
import re

# Importing Data

In [2]:
# All Innings list after 14 Jan 2005
innings = pd.read_csv("/content/drive/My Drive/Projects/Cricket Prediction/Batting.csv")

# All Ininngs list from 18 Dec 1989 to 13 Jan 2005
inningsExtra = pd.read_csv("/content/drive/My Drive/Projects/Cricket Prediction/Batting89-05.csv")

# Data Preprocessing

## Batting data

In [3]:
innings = innings.drop(columns=['Mins', '4s', '6s', 'Sr', 'Inns'])
inningsExtra = inningsExtra.drop(columns=['Mins', '4s', '6s', 'Sr', 'Inns'])

In [4]:
# Cleaning data

innings = innings[innings.Runs != 'DNB']
innings = innings[innings.Runs != 'TDNB']
innings = innings[innings.Runs != 'sub']
innings = innings[innings.Runs != 'absent']
innings = innings.rename(columns={"Player 1":"Player", "Start Date":"StartDate"})

inningsExtra = inningsExtra[inningsExtra.Runs != 'DNB']
inningsExtra = inningsExtra[inningsExtra.Runs != 'TDNB']
inningsExtra = inningsExtra[inningsExtra.Runs != 'sub']
inningsExtra = inningsExtra[inningsExtra.Runs != 'absent']
inningsExtra = inningsExtra.rename(columns={"Player 1":"Player", "Start Date":"StartDate"})


In [5]:
#List of all players who played after 14 Jan 2005

listOfBatsman = list(innings['Player'].unique())

In [6]:
#Merging player past performance innings which were present in matches after 2005
# for ex. Sachine was senior most so his mast matches performance shoould be added 

for player in listOfBatsman:
  playerframe = inningsExtra[inningsExtra.Player == player]
  innings = innings.append(playerframe)


In [7]:
innings['StartDate'] = pd.to_datetime(innings['StartDate'])
# Now innings variable contains all players past played innings
innings.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 47844 entries, 2 to 35439
Data columns (total 7 columns):
 #   Column      Non-Null Count  Dtype         
---  ------      --------------  -----         
 0   Player      47844 non-null  object        
 1   Team        47844 non-null  object        
 2   Runs        47844 non-null  object        
 3   Bf          47844 non-null  object        
 4   Opposition  47844 non-null  object        
 5   Ground      47844 non-null  object        
 6   StartDate   47844 non-null  datetime64[ns]
dtypes: datetime64[ns](1), object(6)
memory usage: 2.9+ MB


In [8]:
#Converting bf to integer and cleaning it
bf = []
for st in innings['Bf'].values:
  st = re.findall(r'[0-9]+', st)
  if not st:
    st.append('0')
  bf.append(float(st[0]))

In [9]:
innings['Bf'] = bf

In [10]:
innings.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 47844 entries, 2 to 35439
Data columns (total 7 columns):
 #   Column      Non-Null Count  Dtype         
---  ------      --------------  -----         
 0   Player      47844 non-null  object        
 1   Team        47844 non-null  object        
 2   Runs        47844 non-null  object        
 3   Bf          47844 non-null  float64       
 4   Opposition  47844 non-null  object        
 5   Ground      47844 non-null  object        
 6   StartDate   47844 non-null  datetime64[ns]
dtypes: datetime64[ns](1), float64(1), object(5)
memory usage: 2.9+ MB


### Calculating The Derived Attributes

#### Consistency

This attribute describes how experienced the player is and how consistent he has been throughout his career. All the traditional attributes used in this formula are calculated over the entire career of the player. 

**Consistency = (0.4262 X average) + (0.2566 X no. of innings) + (0.1510 X SR) + (0.0787 X Centuries) + (0.0556 X Fifties) – (0.0328 X Zeros)**


In [11]:
#Please Rate then Calculate

## Consistency 
Consistency = []

for player in listOfBatsman:
  not_outs = 0
  runs_score = 0
  balls_faced = 0
  playerframe = innings[innings.Player == player]

  ######### Number of innings #########
  numInnings = playerframe.shape[0]

  ######### Amount of not out #########
  for st in playerframe['Runs'].values:
    if st.endswith("*"):
      not_outs+=1

   ######### Number of Dismisal #########
  num_of_dismisal = numInnings - not_outs

  ######### Total Runs #########
  #converting to int
  playruns = []
  for st in playerframe['Runs'].values:
    st = re.findall(r'[0-9]+', st)
    if not st:
      st.append('0')
    playruns.append(float(st[0]))
  playerframe['Runs'] = playruns
  runs_score = playerframe['Runs'].sum()

  ######### Total Ball Faced #########
  balls_faced = playerframe['Bf'].sum()

  ######### Batting Average #########
  if (num_of_dismisal==0):
    average = 0
  else:
    average = runs_score/num_of_dismisal

  ######### Strike Rate #########
  if (balls_faced==0):
    sr = 0
  else:
    sr = (runs_score/balls_faced) * 100

  ######### Number of Centuries #########
  cen = playerframe[playerframe.Runs >= 100].shape[0]

  ######### Number of Fifties #########
  fif = playerframe[playerframe.Runs >= 50].shape[0]
  fif = fif - cen

  ######### Highest Score #########
  h = playerframe['Runs'].max()

  ######### Number of Zeros #########
  zero = playerframe[playerframe.Runs == 0].shape[0]


####################  Rate the Elements Before Calculation  ####################

  #### numInnings ####
  if (numInnings>=1 and numInnings<=49):
    numInnings = 1
  elif (numInnings>=50 and numInnings<=99):
    numInnings = 2
  elif (numInnings>=100 and numInnings<=124):
    numInnings = 3
  elif (numInnings>=125 and numInnings<=149):
    numInnings = 4
  elif (numInnings>=150):
    numInnings = 5 

  #### average ####
  if (average>=0.0 and average<=9.9):
    average = 1
  elif (average>=10.0 and average<=19.9):
    average = 2
  elif (average>=20.0 and average<=29.9):
    average = 3
  elif (average>=30.0 and average<=39.9):
    average = 4
  elif (average>=40.0):
    average = 5   

  #### sr ####
  if (sr>=0.0 and sr<=49.9):
    sr = 1
  elif (sr>=50.0 and sr<=59.9):
    sr = 2
  elif (sr>=60.0 and sr<=79.0):
    sr = 3
  elif (sr>=80.0 and sr<=99.9):
    sr = 4
  elif (sr>=100.0):
    sr = 5  

  #### cen ####
  if (cen>=1 and cen<=4):
    cen = 1
  elif (cen>=5 and cen<=9):
    cen = 2
  elif (cen>=10 and cen<=14):
    cen = 3
  elif (cen>=15 and cen<=19):
    cen = 4
  elif (cen>=20):
    cen = 5 

  #### fif ####
  if (fif>=1 and fif<=9):
    fif = 1
  elif (fif>=10 and fif<=19):
    fif = 2
  elif (fif>=20 and fif<=29):
    fif = 3
  elif (fif>=30 and fif<=39):
    fif = 4
  elif (fif>=40):
    fif = 5 

  #### zero ####
  if (zero>=1 and zero<=4):
    zero = 1
  elif (zero>=5 and zero<=9):
    zero = 2
  elif (zero>=10 and zero<=14):
    zero = 3
  elif (zero>=15 and zero<=19):
    zero = 4
  elif (zero>=20):
    zero = 5 


  consistency = (0.4262 * average) + (0.2566 * numInnings) + (0.1510 * sr) + (0.0787 * cen) + (0.0556 * fif) - (0.0328 * zero)
  Consistency.append(consistency)


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy


In [12]:
ConsistencyFrame = pd.DataFrame(Consistency, columns = ["Consistency"])

In [13]:
ConsistencyFrame

Unnamed: 0,Consistency
0,1.4050
1,2.2902
2,3.0472
3,1.7130
4,1.3782
...,...
1395,2.4700
1396,1.6474
1397,2.0838
1398,1.7130


#### Form

Form of a player describes his performance over last one year. All the traditional attributes used in this formula are calculated over the matches played by the player in last 12 months from the day of the match. 

**Form = 0.4262 X average + 0.2566 X no. of innings + 0.1510 X SR + 0.0787 X Centuries + 0.0556 X Fifties – 0.0328 X Zeros**

In [14]:
#Please Rate then Calculate

## Form
Form = []

for player in listOfBatsman:
  playerframe = innings[innings.Player == player]
  playerframe = playerframe[playerframe.StartDate > "2019-01-01"]

  if not playerframe.empty:
    ######### Number of innings #########
    numInnings = playerframe.shape[0]

    ######### Amount of not out #########
    for st in playerframe['Runs'].values:
      if st.endswith("*"):
        not_outs+=1

    ######### Number of Dismisal #########
    num_of_dismisal = numInnings - not_outs

    ######### Total Runs #########
    #converting to int
    playruns = []
    for st in playerframe['Runs'].values:
      st = re.findall(r'[0-9]+', st)
      if not st:
        st.append('0')
      playruns.append(float(st[0]))
    playerframe['Runs'] = playruns
    runs_score = playerframe['Runs'].sum()

    ######### Total Ball Faced #########
    balls_faced = playerframe['Bf'].sum()

    ######### Batting Average #########
    if (num_of_dismisal==0):
      average = 0
    else:
      average = runs_score/num_of_dismisal

    ######### Strike Rate #########
    if (balls_faced==0):
      sr = 0
    else:
      sr = (runs_score/balls_faced) * 100

    ######### Number of Centuries #########
    cen = playerframe[playerframe.Runs >= 100].shape[0]

    ######### Number of Fifties #########
    fif = playerframe[playerframe.Runs >= 50].shape[0]
    fif = fif - cen

    ######### Highest Score #########
    h = playerframe['Runs'].max()

    ######### Number of Zeros #########
    zero = playerframe[playerframe.Runs == 0].shape[0]


  ####################  Rate the Elements Before Calculation  ####################

    #### numInnings ####
    if (numInnings>=1 and numInnings<=4):
      numInnings = 1
    elif (numInnings>=5 and numInnings<=9):
      numInnings = 2
    elif (numInnings>=10 and numInnings<=11):
      numInnings = 3
    elif (numInnings>=12 and numInnings<=14):
      numInnings = 4
    elif (numInnings>=15):
      numInnings = 5 

    #### average ####
    if (average>=0.0 and average<=9.9):
      average = 1
    elif (average>=10.0 and average<=19.9):
      average = 2
    elif (average>=20.0 and average<=29.9):
      average = 3
    elif (average>=30.0 and average<=39.9):
      average = 4
    elif (average>=40.0):
      average = 5   

    #### sr ####
    if (sr>=0.0 and sr<=49.9):
      sr = 1
    elif (sr>=50.0 and sr<=59.9):
      sr = 2
    elif (sr>=60.0 and sr<=79.0):
      sr = 3
    elif (sr>=80.0 and sr<=99.9):
      sr = 4
    elif (sr>=100.0):
      sr = 5  

    #### cen ####
    if (cen==1):
      cen = 1
    elif (cen==2):
      cen = 2
    elif (cen==3):
      cen = 3
    elif (cen==4):
      cen = 4
    elif (cen==5):
      cen = 5 

    #### fif ####
    if (fif>=1 and fif<=2):
      fif = 1
    elif (fif>=3 and fif<=4):
      fif = 2
    elif (fif>=5 and fif<=6):
      fif = 3
    elif (fif>=7 and fif<=9):
      fif = 4
    elif (fif>=10):
      fif = 5 

    #### zero ####
    if (zero==1):
      zero = 1
    elif (zero==2):
      zero = 2
    elif (zero==3):
      zero = 3
    elif (zero==4):
      zero = 4
    elif (zero==5):
      zero = 5 

    form = (0.4262 * average) + (0.2566 * numInnings) + (0.1510 * sr) + (0.0787 * cen) + (0.0556 * fif) - (0.0328 * zero)
  else:
    form = 0

  Form.append(form)

In [15]:
FormFrame = pd.DataFrame(Form, columns=['Form'])

In [16]:
FormFrame

Unnamed: 0,Form
0,-2.004600
1,-8.791000
2,0.000000
3,-0.986267
4,2.946200
...,...
1395,0.000000
1396,0.000000
1397,0.000000
1398,0.000000


#### Opposition

Opposition describes a player’s performance against a particular team. All the traditional attributes used in this formula are calculated over all the matches played by the player against the opposition team in his entire career till the day of the match. 

**Opposition = 0.4262 X average + 0.2566 X no. of innings + 0.1510 X SR + 0.0787 X Centuries + 0.0556 X Fifties – 0.0328 X Zeros** 

In [17]:
listOfOpposition = list(innings['Opposition'].unique())

In [18]:
#Please Rate then Calculate

## Opposition 
Oppositions = []

for player in listOfBatsman:
  playerframe = innings[innings.Player == player]
  perPlayerOpposition = []
  for opposition in listOfOpposition:
    oppositionframe = playerframe[playerframe.Opposition == opposition]
    if not oppositionframe.empty:
      ######### Number of innings #########
      numInnings = oppositionframe.shape[0]

      ######### Amount of not out #########
      for st in oppositionframe['Runs'].values:
        if st.endswith("*"):
          not_outs+=1

      ######### Number of Dismisal #########
      num_of_dismisal = numInnings - not_outs

      ######### Total Runs #########
      #converting to int
      playruns = []
      for st in oppositionframe['Runs'].values:
        st = re.findall(r'[0-9]+', st)
        if not st:
          st.append('0')
        playruns.append(float(st[0]))
      oppositionframe['Runs'] = playruns
      runs_score = oppositionframe['Runs'].sum()

      ######### Total Ball Faced #########
      balls_faced = oppositionframe['Bf'].sum()

      ######### Batting Average #########
      if (num_of_dismisal==0):
        average = 0
      else:
        average = runs_score/num_of_dismisal

      ######### Strike Rate #########
      if (balls_faced==0):
        sr = 0
      else:
        sr = (runs_score/balls_faced) * 100

      ######### Number of Centuries #########
      cen = oppositionframe[oppositionframe.Runs >= 100].shape[0]

      ######### Number of Fifties #########
      fif = oppositionframe[oppositionframe.Runs >= 50].shape[0]
      fif = fif - cen

      ######### Highest Score #########
      h = oppositionframe['Runs'].max()

      ######### Number of Zeros #########
      zero = oppositionframe[oppositionframe.Runs == 0].shape[0]


    ####################  Rate the Elements Before Calculation  ####################

      #### numInnings ####
      if (numInnings>=1 and numInnings<=2):
        numInnings = 1
      elif (numInnings>=3 and numInnings<=4):
        numInnings = 2
      elif (numInnings>=5 and numInnings<=6):
        numInnings = 3
      elif (numInnings>=7 and numInnings<=9):
        numInnings = 4
      elif (numInnings>=10):
        numInnings = 5 

      #### average ####
      if (average>=0.0 and average<=9.9):
        average = 1
      elif (average>=10.0 and average<=19.9):
        average = 2
      elif (average>=20.0 and average<=29.9):
        average = 3
      elif (average>=30.0 and average<=39.9):
        average = 4
      elif (average>=40.0):
        average = 5   

      #### sr ####
      if (sr>=0.0 and sr<=49.9):
        sr = 1
      elif (sr>=50.0 and sr<=59.9):
        sr = 2
      elif (sr>=60.0 and sr<=79.0):
        sr = 3
      elif (sr>=80.0 and sr<=99.9):
        sr = 4
      elif (sr>=100.0):
        sr = 5  

      #### cen ####
      if (cen==1):
        cen = 3
      elif (cen==2):
        cen = 4
      elif (cen>=3):
        cen = 5

      #### fif ####
      if (fif>=1 and fif<=2):
        fif = 1
      elif (fif>=3 and fif<=4):
        fif = 2
      elif (fif>=5 and fif<=6):
        fif = 3
      elif (fif>=7 and fif<=9):
        fif = 4
      elif (fif>=10):
        fif = 5 

      #### zero ####
      if (zero==1):
        zero = 1
      elif (zero==2):
        zero = 2
      elif (zero==3):
        zero = 3
      elif (zero==4):
        zero = 4
      elif (zero==5):
        zero = 5 

      oppo = (0.4262 * average) + (0.2566 * numInnings) + (0.1510 * sr) + (0.0787 * cen) + (0.0556 * fif) - (0.0328 * zero)
    else:
      oppo = 0
    perPlayerOpposition.append(oppo)
  Oppositions.append(perPlayerOpposition)   

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy


In [19]:
OppositionsFrame = pd.DataFrame(Oppositions, columns = listOfOpposition) 

In [20]:
OppositionsFrame

Unnamed: 0,U.A.E.,Scotland,Namibia,Nepal,Zimbabwe,New Zealand,Netherlands,Oman,U.S.A.,West Indies,P.N.G.,Bangladesh,Pakistan,England,Australia,Kenya,India,Africa XI,Sri Lanka,South Africa,Canada,Hong Kong,Ireland,Afghanistan,Bermuda,Asia XI,ICC World XI
0,1.010860,1.007168,0.000000,0.801,0.000000,0.000000,0.000000,0.000000,0.00000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.00000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.0
1,0.856168,0.000000,0.982054,0.000,0.000000,0.000000,0.000000,0.000000,0.00000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.00000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.0
2,0.000000,0.000000,0.000000,0.000,0.869665,0.988045,0.000000,0.000000,0.00000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.00000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.0
3,0.000000,0.851031,0.000000,0.000,0.000000,0.000000,0.000000,0.000000,0.00000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.00000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.0
4,0.908585,0.000000,0.000000,0.000,0.000000,0.000000,0.363022,0.684573,0.40613,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.00000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1395,0.000000,0.000000,0.000000,0.000,0.707442,0.000000,0.000000,0.000000,0.00000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,1.017295,0.0,0.00000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.0
1396,0.708615,0.000000,0.000000,0.000,1.372533,1.776569,0.000000,0.000000,0.00000,0.965262,0.0,0.000000,1.785083,1.051178,1.232964,0.000000,1.412722,0.0,0.00000,0.920191,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.0
1397,0.709460,1.423501,0.709366,0.000,0.406898,0.407553,0.000000,0.000000,0.00000,0.407085,0.0,0.858963,0.965217,0.717133,0.406945,1.578994,0.768200,0.0,0.78095,0.813189,1.237507,0.0,0.882058,0.597337,9.866605,0.0,0.0
1398,0.000000,0.000000,0.000000,0.000,0.000000,0.709506,0.000000,0.000000,0.00000,0.000000,0.0,0.407553,0.000000,0.000000,1.115704,0.000000,0.000000,0.0,0.00000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.0


In [21]:
OppositionsFrame.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1400 entries, 0 to 1399
Data columns (total 27 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   U.A.E.        1400 non-null   float64
 1   Scotland      1400 non-null   float64
 2   Namibia       1400 non-null   float64
 3   Nepal         1400 non-null   float64
 4   Zimbabwe      1400 non-null   float64
 5   New Zealand   1400 non-null   float64
 6   Netherlands   1400 non-null   float64
 7   Oman          1400 non-null   float64
 8   U.S.A.        1400 non-null   float64
 9   West Indies   1400 non-null   float64
 10  P.N.G.        1400 non-null   float64
 11  Bangladesh    1400 non-null   float64
 12  Pakistan      1400 non-null   float64
 13  England       1400 non-null   float64
 14  Australia     1400 non-null   float64
 15  Kenya         1400 non-null   float64
 16  India         1400 non-null   float64
 17  Africa XI     1400 non-null   float64
 18  Sri Lanka     1400 non-null 

#### Venue

Venue describes a player’s performance at a particular venue. All the traditional attributes used in this formula are calculated over all the matches played by the player at the venue in his entire career till the day of the match. 

**Venue = 0.4262 X average + 0.2566 X no. of innings + 0.1510 X SR + 0.0787X Centuries + 0.0556X Fifties + 0.0328 X HS**

In [22]:
listOfVenue = list(innings['Ground'].unique())

In [23]:
#Please Rate then Calculate

## Venue
Venues = []

for player in listOfBatsman:
  playerframe = innings[innings.Player == player]
  perPlayerVenue = []
  for venue in listOfVenue:
    venueframe = playerframe[playerframe.Ground == venue]
    if not venueframe.empty:
      ######### Number of innings #########
      numInnings = venueframe.shape[0]

      ######### Amount of not out #########
      for st in venueframe['Runs'].values:
        if st.endswith("*"):
          not_outs+=1

      ######### Number of Dismisal #########
      num_of_dismisal = numInnings - not_outs

      ######### Total Runs #########
      #converting to int
      playruns = []
      for st in venueframe['Runs'].values:
        st = re.findall(r'[0-9]+', st)
        if not st:
          st.append('0')
        playruns.append(float(st[0]))
      venueframe['Runs'] = playruns
      runs_score = venueframe['Runs'].sum()

      ######### Total Ball Faced #########
      balls_faced = venueframe['Bf'].sum()

      ######### Batting Average #########
      if (num_of_dismisal==0):
        average = 0
      else:
        average = runs_score/num_of_dismisal

      ######### Strike Rate #########
      if (balls_faced==0):
        sr = 0
      else:
        sr = (runs_score/balls_faced) * 100

      ######### Number of Centuries #########
      cen = venueframe[venueframe.Runs >= 100].shape[0]

      ######### Number of Fifties #########
      fif = venueframe[venueframe.Runs >= 50].shape[0]
      fif = fif - cen

      ######### Highest Score #########
      h = venueframe['Runs'].max()

      ######### Number of Zeros #########
      zero = venueframe[venueframe.Runs == 0].shape[0]


    ####################  Rate the Elements Before Calculation  ####################

      #### numInnings ####
      if (numInnings==1):
        numInnings = 1
      elif (numInnings==2):
        numInnings = 2
      elif (numInnings==3):
        numInnings = 3
      elif (numInnings==4):
        numInnings = 4
      elif (numInnings>=5):
        numInnings = 5 

      #### average ####
      if (average>=0.0 and average<=9.9):
        average = 1
      elif (average>=10.0 and average<=19.9):
        average = 2
      elif (average>=20.0 and average<=29.9):
        average = 3
      elif (average>=30.0 and average<=39.9):
        average = 4
      elif (average>=40.0):
        average = 5   

      #### sr ####
      if (sr>=0.0 and sr<=49.9):
        sr = 1
      elif (sr>=50.0 and sr<=59.9):
        sr = 2
      elif (sr>=60.0 and sr<=79.0):
        sr = 3
      elif (sr>=80.0 and sr<=99.9):
        sr = 4
      elif (sr>=100.0):
        sr = 5  

      #### cen ####
      if (cen==1):
        cen = 4
      elif (cen>=2):
        cen = 5

      #### fif ####
      if (fif==1):
        fif = 4
      elif (fif>=2):
        fif = 5

      #### h ####
      if (h>=1 and h<=24):
        h = 1
      elif (h>=25 and h<=49):
        h = 2
      elif (h>=50 and h<=99):
        h = 3
      elif (h>=100 and h<=149):
        h = 4
      elif (h>=150):
        h = 5   

      ven = (0.4262 * average) + (0.2566 * numInnings) + (0.1510 * sr) + (0.0787 * cen) + (0.0556 * fif) + (0.0328 * h)
    else:
      ven = 0
    perPlayerVenue.append(ven)
  Venues.append(perPlayerVenue)    

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy


In [24]:
VenuesFrame = pd.DataFrame(Venues, columns = listOfVenue)

In [25]:
VenuesFrame

Unnamed: 0,ICCA Dubai,Al Amerat,Kirtipur,Harare,Wellington,Aberdeen,Amstelveen,Kuala Lumpur,Cuttack,Auckland,Windhoek,Lauderhill,Sharjah,Bulawayo,Nairobi (Gym),Centurion,Port Elizabeth,Dhaka,Chattogram,Durban,Johannesburg,Providence,Basseterre,Cape Town,Hobart,Sydney,Adelaide,Perth,Melbourne,Kochi,Visakhapatnam,Jamshedpur,Ahmedabad,Kanpur,Delhi,Kingstown,Gros Islet,Lahore,Karachi,Rawalpindi,...,Nairobi (Jaff),Schiedam,Glasgow,Deventer,Port Moresby,Hyderabad (Sind),Sheikhupura,Galle,Georgetown,Chandigarh,Nairobi,Derby,Gujranwala,Singapore,Tangier,Leicester,Jodhpur,Vijayawada,Pietermaritzburg,Cairns,Amritsar,Sialkot,Chelmsford,Taupo,Sargodha,Northampton,Worcester,Kandy,Quetta,Hove,Patna,Nairobi (Aga),Kwekwe,New Plymouth,Ballarat,Moratuwa,Jalandhar,Nairobi (Club),Berri,New Delhi
0,1.300673,0.000000,0.833800,0.000000,0.000000,0.000000,0.000000,0.00000,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.00000,0.000000,0.000000,0.0,0.0,0.0,0.0,0.00000,0.000000,0.0,0.0,0.0,0.0,0.0000,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.000000,0.0,0.000000,...,0.000000,0.0,0.000000,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,0.000000,1.331651,0.000000,0.000000,0.000000,0.000000,0.000000,0.00000,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.00000,0.000000,0.000000,0.0,0.0,0.0,0.0,0.00000,0.000000,0.0,0.0,0.0,0.0,0.0000,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.000000,0.0,0.000000,...,0.000000,0.0,0.000000,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,0.000000,0.000000,0.000000,1.435057,1.075705,0.000000,0.000000,0.00000,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.00000,0.000000,0.000000,0.0,0.0,0.0,0.0,0.00000,0.000000,0.0,0.0,0.0,0.0,0.0000,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.000000,0.0,0.000000,...,0.000000,0.0,0.000000,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,0.000000,0.000000,0.000000,0.000000,0.000000,0.892793,0.000000,0.00000,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.00000,0.000000,0.000000,0.0,0.0,0.0,0.0,0.00000,0.000000,0.0,0.0,0.0,0.0,0.0000,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.000000,0.0,0.000000,...,0.000000,0.0,0.000000,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,1.284897,0.000000,1.135718,0.000000,0.000000,0.000000,0.696253,0.44026,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.00000,0.000000,0.000000,0.0,0.0,0.0,0.0,0.00000,0.000000,0.0,0.0,0.0,0.0,0.0000,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.000000,0.0,0.000000,...,0.000000,0.0,0.000000,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1395,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.00000,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,1.28509,0.000000,0.000000,0.0,0.0,0.0,0.0,0.00000,0.439989,0.0,0.0,0.0,0.0,0.0000,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.000000,0.0,0.000000,...,0.000000,0.0,0.000000,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1396,0.000000,0.000000,0.000000,1.044327,1.300710,0.000000,0.000000,0.00000,0.0,0.0,0.0,0.0,2.067947,0.893062,1.044158,0.00000,0.440376,0.998807,0.0,0.0,0.0,0.0,0.00000,0.440328,0.0,0.0,0.0,0.0,0.8338,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.742328,0.0,0.000000,...,0.000000,0.0,0.000000,0.0,0.0,0.0,0.0,1.044303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1397,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,2.053636,0.00000,0.0,0.0,0.0,0.0,0.000000,0.440038,1.646254,0.00000,0.000000,0.000000,0.0,0.0,0.0,0.0,1.13612,0.000000,0.0,0.0,0.0,0.0,0.0000,0.0,0.0,0.0,0.0,0.0,0.696976,0.0,0.0,1.149783,0.0,0.623548,...,0.440207,0.0,0.925355,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1398,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.00000,0.0,0.0,0.0,0.0,1.044110,0.000000,0.000000,0.00000,0.000000,0.440376,0.0,0.0,0.0,0.0,0.00000,0.000000,0.0,0.0,0.0,0.0,0.0000,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.000000,0.0,0.000000,...,0.000000,0.0,0.000000,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [26]:
VenuesFrame.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1400 entries, 0 to 1399
Columns: 170 entries, ICCA Dubai to New Delhi
dtypes: float64(170)
memory usage: 1.8 MB


In [27]:
listOfBatsmanFrame = pd.DataFrame(listOfBatsman, columns=["Players"])

In [28]:
listOfBatsmanFrame

Unnamed: 0,Players
0,NP Kenjige
1,ME Sanuth
2,Aamer Yamin
3,Aamir Kaleem
4,Aarif Sheikh
...,...
1395,K Zondo
1396,DNT Zoysa
1397,B Zuiderent
1398,Zulfiqar Babar


#### Final Data for training the model

In [29]:
playerPerformance = pd.concat([listOfBatsmanFrame, ConsistencyFrame, FormFrame, OppositionsFrame, VenuesFrame], axis = 1)

In [30]:
playerPerformance

Unnamed: 0,Players,Consistency,Form,U.A.E.,Scotland,Namibia,Nepal,Zimbabwe,New Zealand,Netherlands,Oman,U.S.A.,West Indies,P.N.G.,Bangladesh,Pakistan,England,Australia,Kenya,India,Africa XI,Sri Lanka,South Africa,Canada,Hong Kong,Ireland,Afghanistan,Bermuda,Asia XI,ICC World XI,ICCA Dubai,Al Amerat,Kirtipur,Harare,Wellington,Aberdeen,Amstelveen,Kuala Lumpur,Cuttack,Auckland,...,Nairobi (Jaff),Schiedam,Glasgow,Deventer,Port Moresby,Hyderabad (Sind),Sheikhupura,Galle,Georgetown,Chandigarh,Nairobi,Derby,Gujranwala,Singapore,Tangier,Leicester,Jodhpur,Vijayawada,Pietermaritzburg,Cairns,Amritsar,Sialkot,Chelmsford,Taupo,Sargodha,Northampton,Worcester,Kandy,Quetta,Hove,Patna,Nairobi (Aga),Kwekwe,New Plymouth,Ballarat,Moratuwa,Jalandhar,Nairobi (Club),Berri,New Delhi
0,NP Kenjige,1.4050,-2.004600,1.010860,1.007168,0.000000,0.801,0.000000,0.000000,0.000000,0.000000,0.00000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.00000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.0,1.300673,0.000000,0.833800,0.000000,0.000000,0.000000,0.000000,0.00000,0.0,0.0,...,0.000000,0.0,0.000000,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,ME Sanuth,2.2902,-8.791000,0.856168,0.000000,0.982054,0.000,0.000000,0.000000,0.000000,0.000000,0.00000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.00000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.0,0.000000,1.331651,0.000000,0.000000,0.000000,0.000000,0.000000,0.00000,0.0,0.0,...,0.000000,0.0,0.000000,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Aamer Yamin,3.0472,0.000000,0.000000,0.000000,0.000000,0.000,0.869665,0.988045,0.000000,0.000000,0.00000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.00000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.0,0.000000,0.000000,0.000000,1.435057,1.075705,0.000000,0.000000,0.00000,0.0,0.0,...,0.000000,0.0,0.000000,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Aamir Kaleem,1.7130,-0.986267,0.000000,0.851031,0.000000,0.000,0.000000,0.000000,0.000000,0.000000,0.00000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.00000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.892793,0.000000,0.00000,0.0,0.0,...,0.000000,0.0,0.000000,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Aarif Sheikh,1.3782,2.946200,0.908585,0.000000,0.000000,0.000,0.000000,0.000000,0.363022,0.684573,0.40613,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.00000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.0,1.284897,0.000000,1.135718,0.000000,0.000000,0.000000,0.696253,0.44026,0.0,0.0,...,0.000000,0.0,0.000000,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1395,K Zondo,2.4700,0.000000,0.000000,0.000000,0.000000,0.000,0.707442,0.000000,0.000000,0.000000,0.00000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,1.017295,0.0,0.00000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.00000,0.0,0.0,...,0.000000,0.0,0.000000,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1396,DNT Zoysa,1.6474,0.000000,0.708615,0.000000,0.000000,0.000,1.372533,1.776569,0.000000,0.000000,0.00000,0.965262,0.0,0.000000,1.785083,1.051178,1.232964,0.000000,1.412722,0.0,0.00000,0.920191,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.0,0.000000,0.000000,0.000000,1.044327,1.300710,0.000000,0.000000,0.00000,0.0,0.0,...,0.000000,0.0,0.000000,0.0,0.0,0.0,0.0,1.044303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1397,B Zuiderent,2.0838,0.000000,0.709460,1.423501,0.709366,0.000,0.406898,0.407553,0.000000,0.000000,0.00000,0.407085,0.0,0.858963,0.965217,0.717133,0.406945,1.578994,0.768200,0.0,0.78095,0.813189,1.237507,0.0,0.882058,0.597337,9.866605,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,2.053636,0.00000,0.0,0.0,...,0.440207,0.0,0.925355,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1398,Zulfiqar Babar,1.7130,0.000000,0.000000,0.000000,0.000000,0.000,0.000000,0.709506,0.000000,0.000000,0.00000,0.000000,0.0,0.407553,0.000000,0.000000,1.115704,0.000000,0.000000,0.0,0.00000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.00000,0.0,0.0,...,0.000000,0.0,0.000000,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [31]:
batting = pd.read_csv("/content/drive/My Drive/Projects/Cricket Prediction/Batting.csv")

In [32]:
# Cleaning data

batting = batting[batting.Runs != 'DNB']
batting = batting[batting.Runs != 'TDNB']
batting = batting[batting.Runs != 'sub']
batting = batting[batting.Runs != 'absent']
batting = batting.rename(columns={"Player 1":"Players", "Start Date":"StartDate"})

In [33]:
batting = batting.drop(columns=['Team', 'Mins', 'Bf', '4s', '6s', 'Sr', 'Inns', 'Opposition', 'Ground', 'StartDate'])

In [34]:
batting

Unnamed: 0,Players,Runs
2,NP Kenjige,1*
3,NP Kenjige,6*
4,ME Sanuth,6
5,ME Sanuth,40
7,NP Kenjige,0
...,...,...
45281,Zulfiqar Babar,1*
45282,Zulqarnain Haider,12*
45283,Zulqarnain Haider,6
45284,Zulqarnain Haider,11


In [35]:
runs = []
for st in batting['Runs'].values:
  st = re.findall(r'[0-9]+', st)
  if not st:
    st.append('0')
  r = float(st[0])
  ######## Rate the run attribute ########
  if (r>=0 and r<=24):
    r = 1
  elif (r>=25 and r<=49):
    r = 2
  elif (r>=50 and r<=74):
    r = 3
  elif (r>=75 and r<=99):
    r = 4
  elif (r>=100):
    r = 5  

  runs.append(r)
batting['Runs'] = runs

In [36]:
batting.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 36109 entries, 2 to 45285
Data columns (total 2 columns):
 #   Column   Non-Null Count  Dtype 
---  ------   --------------  ----- 
 0   Players  36109 non-null  object
 1   Runs     36109 non-null  int64 
dtypes: int64(1), object(1)
memory usage: 846.3+ KB


In [37]:
batting

Unnamed: 0,Players,Runs
2,NP Kenjige,1
3,NP Kenjige,1
4,ME Sanuth,1
5,ME Sanuth,2
7,NP Kenjige,1
...,...,...
45281,Zulfiqar Babar,1
45282,Zulqarnain Haider,1
45283,Zulqarnain Haider,1
45284,Zulqarnain Haider,1


In [38]:
# Now we have to join playerPerformance and batting dataframe to create our final dataset

finalBatting = pd.merge(batting, playerPerformance, on="Players")

In [39]:
finalBatting.isna().any()

Players           False
Runs              False
Consistency       False
Form              False
U.A.E.            False
                  ...  
Moratuwa          False
Jalandhar         False
Nairobi (Club)    False
Berri             False
New Delhi         False
Length: 201, dtype: bool

In [40]:
finalBatting.to_csv("/content/drive/My Drive/Projects/Cricket Prediction/finalBatting.csv")

In [42]:
finalBatting = pd.read_csv("/content/drive/My Drive/Projects/Cricket Prediction/finalBatting.csv")

In [56]:
X_batting = finalBatting.drop(columns=['Runs', 'Unnamed: 0', 'Players'])
y_batting = finalBatting['Runs']

In [57]:
X_batting

Unnamed: 0,Consistency,Form,U.A.E.,Scotland,Namibia,Nepal,Zimbabwe,New Zealand,Netherlands,Oman,U.S.A.,West Indies,P.N.G.,Bangladesh,Pakistan,England,Australia,Kenya,India,Africa XI,Sri Lanka,South Africa,Canada,Hong Kong,Ireland,Afghanistan,Bermuda,Asia XI,ICC World XI,ICCA Dubai,Al Amerat,Kirtipur,Harare,Wellington,Aberdeen,Amstelveen,Kuala Lumpur,Cuttack,Auckland,Windhoek,...,Nairobi (Jaff),Schiedam,Glasgow,Deventer,Port Moresby,Hyderabad (Sind),Sheikhupura,Galle,Georgetown,Chandigarh,Nairobi,Derby,Gujranwala,Singapore,Tangier,Leicester,Jodhpur,Vijayawada,Pietermaritzburg,Cairns,Amritsar,Sialkot,Chelmsford,Taupo,Sargodha,Northampton,Worcester,Kandy,Quetta,Hove,Patna,Nairobi (Aga),Kwekwe,New Plymouth,Ballarat,Moratuwa,Jalandhar,Nairobi (Club),Berri,New Delhi
0,1.4050,-2.0046,1.010860,1.007168,0.000000,0.801,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.300673,0.000000,0.8338,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,1.4050,-2.0046,1.010860,1.007168,0.000000,0.801,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.300673,0.000000,0.8338,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,1.4050,-2.0046,1.010860,1.007168,0.000000,0.801,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.300673,0.000000,0.8338,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,2.2902,-8.7910,0.856168,0.000000,0.982054,0.000,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,1.331651,0.0000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,2.2902,-8.7910,0.856168,0.000000,0.982054,0.000,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,1.331651,0.0000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
36104,1.7130,0.0000,0.000000,0.000000,0.000000,0.000,0.0,0.709506,0.0,0.0,0.0,0.0,0.0,0.407553,0.0,0.0,1.115704,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.0000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
36105,1.9882,0.0000,0.000000,0.000000,0.000000,0.000,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.963956,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.0000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
36106,1.9882,0.0000,0.000000,0.000000,0.000000,0.000,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.963956,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.0000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
36107,1.9882,0.0000,0.000000,0.000000,0.000000,0.000,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.963956,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.0000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [58]:
y_batting

0        1
1        1
2        1
3        1
4        2
        ..
36104    1
36105    1
36106    1
36107    1
36108    1
Name: Runs, Length: 36109, dtype: int64

### Bowling Data

# Oversampling SMOTE

In [59]:
pip install -U imbalanced-learn

Requirement already up-to-date: imbalanced-learn in /usr/local/lib/python3.6/dist-packages (0.7.0)


In [60]:
from imblearn.over_sampling import SMOTE
X_resample_batting, y_resample_batting = SMOTE().fit_sample(X_batting,y_batting.values.ravel())

In [61]:
X_resample_batting.shape

(118315, 199)

In [63]:
X_batting.shape

(36109, 199)

# Splitting the datasets into the Training set and Test set

In [65]:
###############   FOR BATTING   ###############
from sklearn.model_selection import train_test_split
X_train_batting, X_test_batting, y_train_batting, y_test_batting = train_test_split(X_resample_batting, y_resample_batting, test_size=0.3, random_state = 1)

In [66]:
print(X_train_batting.shape)

print(X_test_batting.shape)

print(y_train_batting.shape)

print(y_test_batting.shape)

(82820, 199)
(35495, 199)
(82820,)
(35495,)


In [69]:
X_train_batting

Unnamed: 0,Consistency,Form,U.A.E.,Scotland,Namibia,Nepal,Zimbabwe,New Zealand,Netherlands,Oman,U.S.A.,West Indies,P.N.G.,Bangladesh,Pakistan,England,Australia,Kenya,India,Africa XI,Sri Lanka,South Africa,Canada,Hong Kong,Ireland,Afghanistan,Bermuda,Asia XI,ICC World XI,ICCA Dubai,Al Amerat,Kirtipur,Harare,Wellington,Aberdeen,Amstelveen,Kuala Lumpur,Cuttack,Auckland,Windhoek,...,Nairobi (Jaff),Schiedam,Glasgow,Deventer,Port Moresby,Hyderabad (Sind),Sheikhupura,Galle,Georgetown,Chandigarh,Nairobi,Derby,Gujranwala,Singapore,Tangier,Leicester,Jodhpur,Vijayawada,Pietermaritzburg,Cairns,Amritsar,Sialkot,Chelmsford,Taupo,Sargodha,Northampton,Worcester,Kandy,Quetta,Hove,Patna,Nairobi (Aga),Kwekwe,New Plymouth,Ballarat,Moratuwa,Jalandhar,Nairobi (Club),Berri,New Delhi
67833,2.087982,0.000000,0.000000,0.000000,0.000000,0.0,0.002770,0.000000,0.000000,0.0,0.000000,0.971799,0.0,0.004448,0.000000,0.624444,0.000000,0.000000,0.995541,0.0,1.174684,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.00000,0.000000,0.000000,0.0,0.0,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.0,...,0.000000,0.0,0.000000,0.0,0.0,0.000000,0.0,0.000000,0.0,0.000000,0.0,0.0,0.0000,0.000000,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.0,0.0000,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.0,0.0,0.000000
57968,1.589564,0.000000,0.000000,0.007915,0.000000,0.0,0.007895,0.000000,0.627574,0.0,0.000000,1.189215,0.0,0.374175,0.696906,0.000000,0.547329,0.773533,0.556994,0.0,0.011304,0.689770,1.049603,0.000000,0.404474,0.000000,0.402515,0.00000,0.000000,0.000000,0.0,0.0,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.0,...,0.448013,0.0,0.000000,0.0,0.0,0.000000,0.0,0.000000,0.0,0.000000,0.0,0.0,0.0000,0.000000,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.0,0.0000,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.0,0.0,0.000000
19269,2.776700,1.291816,1.415500,1.691164,0.000000,0.0,2.113822,0.406410,1.217563,0.0,0.000000,1.520479,0.0,1.583237,0.960253,1.009618,0.810840,1.533647,1.163132,0.0,1.110542,0.558521,1.565340,0.705481,1.721355,0.000000,0.000000,0.00000,0.000000,1.542182,0.0,0.0,1.946748,0.000000,0.0,1.588471,1.586214,0.000000,0.000000,0.0,...,0.000000,0.0,0.000000,0.0,0.0,0.000000,0.0,0.000000,0.0,0.000000,0.0,0.0,0.0000,0.000000,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.0,0.0000,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.0,0.0,0.000000
95167,4.525500,0.000000,12.299160,0.000000,1.239789,0.0,2.375396,2.280263,0.912092,0.0,0.000000,2.161422,0.0,2.152911,2.262323,2.403981,2.331931,2.045917,0.000000,0.0,2.297287,2.214753,0.000000,0.000000,0.707424,0.000000,1.064248,0.00000,0.000000,0.000000,0.0,0.0,2.603002,2.258076,0.0,0.000000,2.293146,2.599053,1.843359,0.0,...,0.000000,0.0,0.000000,0.0,0.0,0.440349,0.0,0.440044,0.0,1.284967,0.0,0.0,0.8338,2.548971,0.000000,0.0,1.302891,0.0,1.486538,0.000000,0.774209,0.0,0.0,0.8338,0.0,0.0,0.0,0.0,0.0,0.774489,0.0,0.0,0.0,0.0,0.0,0.774565,1.180079,0.0,0.0,0.440375
104833,4.259900,0.000000,0.000000,0.908377,1.055727,0.0,1.470722,13.615169,0.701524,0.0,1.008605,13.573606,0.0,1.646129,1.992352,1.720838,0.000000,0.852806,2.303100,0.0,1.772234,1.900581,0.000000,0.000000,0.000000,0.000000,0.000000,0.40734,0.000000,0.000000,0.0,0.0,1.434861,1.593825,0.0,1.132826,1.283285,0.000000,1.691930,0.0,...,0.000000,0.0,0.000000,0.0,0.0,0.000000,0.0,0.000000,0.0,1.329085,0.0,0.0,0.0000,0.000000,0.000000,0.0,0.000000,0.0,0.000000,12.296405,0.000000,0.0,0.0,0.0000,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.0,0.0,0.000000
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
50057,3.620700,1.204422,0.859891,1.061705,0.000000,0.0,1.808936,2.024133,0.858124,0.0,0.000000,1.921469,0.0,0.000000,1.548357,1.960083,1.561701,0.000000,1.730646,0.0,1.720874,1.638756,0.000000,0.000000,1.361133,1.511244,0.000000,0.00000,0.000000,0.000000,0.0,0.0,1.254898,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.0,...,0.000000,0.0,0.925309,0.0,0.0,0.000000,0.0,0.000000,0.0,0.000000,0.0,0.0,0.0000,0.000000,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.0,0.0000,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.0,0.0,0.000000
98047,3.896000,1.392242,0.000000,0.708361,0.000000,0.0,2.198139,1.898700,0.000000,0.0,0.000000,2.257706,0.0,1.891157,0.000000,1.787949,1.721070,1.286559,2.425052,0.0,1.883540,1.770751,0.000000,1.495220,1.166635,1.064343,0.000000,0.00000,0.000000,0.000000,0.0,0.0,1.946571,1.149656,0.0,1.435250,0.000000,0.000000,1.030848,0.0,...,0.000000,0.0,0.000000,0.0,0.0,0.440294,0.0,0.000000,0.0,0.000000,0.0,0.0,0.0000,0.000000,0.440083,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.0,0.0000,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.0,0.0,0.000000
5192,4.204000,0.000000,0.000000,1.098601,0.000000,0.0,1.812366,13.183487,1.046239,0.0,0.000000,1.726355,0.0,1.738209,2.004727,1.891033,0.000000,1.046404,2.053406,0.0,2.014435,1.743237,0.857056,0.000000,0.801000,12.343459,0.000000,0.00000,0.960438,0.000000,0.0,0.0,2.290870,1.690841,0.0,1.332043,1.999894,0.000000,1.795092,0.0,...,0.000000,0.0,0.000000,0.0,0.0,0.000000,0.0,0.000000,0.0,0.440156,0.0,0.0,0.0000,0.000000,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.0,0.0000,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.0,0.0,0.000000
77708,4.164900,0.000000,0.704157,1.227742,0.000000,0.0,2.189564,2.207631,1.270251,0.0,0.000000,1.975971,0.0,2.180178,2.079605,2.160530,1.944196,1.003731,2.013932,0.0,0.000000,2.083580,0.907287,0.000000,0.000000,0.801000,0.858461,0.00000,0.000000,0.000000,0.0,0.0,2.597460,1.538046,0.0,1.929443,0.000000,1.331493,0.773323,0.0,...,0.000000,0.0,0.000000,0.0,0.0,0.000000,0.0,0.000000,0.0,0.000000,0.0,0.0,0.8338,0.000000,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.0,0.0000,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.0,0.0,0.000000


# Feature Scaleing

In [70]:
###############   FOR BATTING   ###############
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train_batting = pd.DataFrame(sc.fit_transform(X_train_batting), columns = X_train_batting.columns.values, index = X_train_batting.index)
X_test_batting = pd.DataFrame(sc.transform(X_test_batting), columns = X_test_batting.columns.values, index = X_test_batting.index)

In [71]:
X_train_batting

Unnamed: 0,Consistency,Form,U.A.E.,Scotland,Namibia,Nepal,Zimbabwe,New Zealand,Netherlands,Oman,U.S.A.,West Indies,P.N.G.,Bangladesh,Pakistan,England,Australia,Kenya,India,Africa XI,Sri Lanka,South Africa,Canada,Hong Kong,Ireland,Afghanistan,Bermuda,Asia XI,ICC World XI,ICCA Dubai,Al Amerat,Kirtipur,Harare,Wellington,Aberdeen,Amstelveen,Kuala Lumpur,Cuttack,Auckland,Windhoek,...,Nairobi (Jaff),Schiedam,Glasgow,Deventer,Port Moresby,Hyderabad (Sind),Sheikhupura,Galle,Georgetown,Chandigarh,Nairobi,Derby,Gujranwala,Singapore,Tangier,Leicester,Jodhpur,Vijayawada,Pietermaritzburg,Cairns,Amritsar,Sialkot,Chelmsford,Taupo,Sargodha,Northampton,Worcester,Kandy,Quetta,Hove,Patna,Nairobi (Aga),Kwekwe,New Plymouth,Ballarat,Moratuwa,Jalandhar,Nairobi (Club),Berri,New Delhi
67833,-0.706262,0.047489,-0.387885,-0.555357,-0.274363,-0.098952,-0.835630,-0.805423,-0.843367,-0.109143,-0.215099,-0.348335,-0.139769,-0.945780,-1.230720,-0.541640,-0.758261,-0.475608,-0.316365,-0.278272,-0.196431,-1.013122,-0.373632,-0.445290,-0.541980,-0.543819,-0.400392,-0.261525,-0.283701,-0.306813,-0.0914,-0.063281,-0.796638,-0.840634,-0.35939,-0.534051,-0.390145,-0.567593,-0.703337,-0.095014,...,-0.184238,-0.115807,-0.258358,-0.169538,-0.119808,-0.280463,-0.230254,-0.347692,-0.278278,-0.242731,-0.157247,-0.099195,-0.222341,-0.223663,-0.271868,-0.131657,-0.214240,-0.195955,-0.175946,-0.077999,-0.089495,-0.083294,-0.127936,-0.086296,-0.071734,-0.128096,-0.164824,-0.19104,-0.065224,-0.151485,-0.091567,-0.080671,-0.067384,-0.064399,-0.064399,-0.095863,-0.097622,-0.068437,-0.037925,-0.074979
57968,-0.963421,0.047489,-0.387885,-0.548994,-0.274363,-0.098952,-0.832385,-0.805423,0.391258,-0.109143,-0.215099,-0.231266,-0.139769,-0.703843,-0.629850,-0.924086,-0.487293,0.074949,-0.635603,-0.278272,-0.978514,-0.500009,0.372865,-0.445290,-0.331161,-0.543819,0.433252,-0.261525,-0.283701,-0.306813,-0.0914,-0.063281,-0.796638,-0.840634,-0.35939,-0.534051,-0.390145,-0.567593,-0.703337,-0.095014,...,0.950287,-0.115807,-0.258358,-0.169538,-0.119808,-0.280463,-0.230254,-0.347692,-0.278278,-0.242731,-0.157247,-0.099195,-0.222341,-0.223663,-0.271868,-0.131657,-0.214240,-0.195955,-0.175946,-0.077999,-0.089495,-0.083294,-0.127936,-0.086296,-0.071734,-0.128096,-0.164824,-0.19104,-0.065224,-0.151485,-0.091567,-0.080671,-0.067384,-0.064399,-0.064399,-0.095863,-0.097622,-0.068437,-0.037925,-0.074979
19269,-0.350918,0.452880,1.003172,0.804139,-0.274363,-0.098952,0.501009,-0.599045,1.551941,-0.109143,-0.215099,-0.052893,-0.139769,0.087330,-0.402793,-0.305738,-0.356837,0.615955,-0.194368,-0.278272,-0.239551,-0.597645,0.739667,1.461817,0.355221,-0.543819,-0.400392,-0.261525,-0.283701,2.621496,-0.0914,-0.063281,0.185768,-0.840634,-0.35939,2.245163,2.295881,-0.567593,-0.703337,-0.095014,...,-0.184238,-0.115807,-0.258358,-0.169538,-0.119808,-0.280463,-0.230254,-0.347692,-0.278278,-0.242731,-0.157247,-0.099195,-0.222341,-0.223663,-0.271868,-0.131657,-0.214240,-0.195955,-0.175946,-0.077999,-0.089495,-0.083294,-0.127936,-0.086296,-0.071734,-0.128096,-0.164824,-0.19104,-0.065224,-0.151485,-0.091567,-0.080671,-0.067384,-0.064399,-0.064399,-0.095863,-0.097622,-0.068437,-0.037925,-0.074979
95167,0.551375,0.047489,11.698889,-0.555357,5.381872,-0.098952,0.666628,0.352511,0.950989,-0.109143,-0.215099,0.292230,-0.139769,0.460107,0.719848,0.548251,0.396212,0.980559,-1.041063,-0.278272,0.558239,0.634409,-0.373632,-0.445290,-0.173257,-0.543819,1.803760,-0.261525,-0.283701,-0.306813,-0.0914,-0.063281,0.516940,1.615315,-0.35939,-0.534051,3.492971,3.129390,0.789632,-0.095014,...,-0.184238,-0.115807,-0.258358,-0.169538,-0.119808,1.653313,-0.230254,0.864310,-0.278278,4.969464,-0.157247,-0.099195,3.774328,7.053251,-0.271868,-0.131657,5.907756,-0.195955,7.361358,-0.077999,11.432294,-0.083294,-0.127936,2.074768,-0.071734,-0.128096,-0.164824,-0.19104,-0.065224,5.485273,-0.091567,-0.080671,-0.067384,-0.064399,-0.064399,11.933229,11.433220,-0.068437,-0.037925,13.337050
104833,0.414339,0.047489,-0.387885,0.174871,4.542133,-0.098952,0.093822,6.108456,0.536739,-0.109143,5.484100,6.437245,-0.139769,0.128485,0.487080,0.129854,-0.758261,0.131371,0.635465,-0.278272,0.205271,0.400700,-0.373632,-0.445290,-0.541980,-0.543819,-0.400392,1.277589,-0.283701,-0.306813,-0.0914,-0.063281,-0.072550,0.892855,-0.35939,1.447959,1.782914,-0.567593,0.666987,-0.095014,...,-0.184238,-0.115807,-0.258358,-0.169538,-0.119808,-0.280463,-0.230254,-0.347692,-0.278278,5.148418,-0.157247,-0.099195,-0.222341,-0.223663,-0.271868,-0.131657,-0.214240,-0.195955,-0.175946,18.335228,-0.089495,-0.083294,-0.127936,-0.086296,-0.071734,-0.128096,-0.164824,-0.19104,-0.065224,-0.151485,-0.091567,-0.080671,-0.067384,-0.064399,-0.064399,-0.095863,-0.097622,-0.068437,-0.037925,-0.074979
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
50057,0.084543,0.425454,0.457157,0.298128,-0.274363,-0.098952,0.307967,0.222446,0.844818,-0.109143,-0.215099,0.163025,-0.139769,-0.948691,0.104269,0.276382,0.014893,-0.475608,0.218751,-0.278272,0.170745,0.205931,-0.373632,-0.445290,0.167467,0.648825,-0.400392,-0.261525,-0.283701,-0.306813,-0.0914,-0.063281,-0.163367,-0.840634,-0.35939,-0.534051,-0.390145,-0.567593,-0.703337,-0.095014,...,-0.184238,-0.115807,3.902236,-0.169538,-0.119808,-0.280463,-0.230254,-0.347692,-0.278278,-0.242731,-0.157247,-0.099195,-0.222341,-0.223663,-0.271868,-0.131657,-0.214240,-0.195955,-0.175946,-0.077999,-0.089495,-0.083294,-0.127936,-0.086296,-0.071734,-0.128096,-0.164824,-0.19104,-0.065224,-0.151485,-0.091567,-0.080671,-0.067384,-0.064399,-0.064399,-0.095863,-0.097622,-0.068437,-0.037925,-0.074979
98047,0.226584,0.484395,-0.387885,0.014082,-0.274363,-0.098952,0.554395,0.158750,-0.843367,-0.109143,-0.215099,0.344075,-0.139769,0.288823,-1.230720,0.170957,0.093792,0.440091,0.724240,-0.278272,0.280097,0.304120,-0.373632,3.596694,0.066092,0.296140,-0.400392,-0.261525,-0.283701,-0.306813,-0.0914,-0.063281,0.185679,0.409765,-0.35939,1.977085,-0.390145,-0.567593,0.131565,-0.095014,...,-0.184238,-0.115807,-0.258358,-0.169538,-0.119808,1.653072,-0.230254,-0.347692,-0.278278,-0.242731,-0.157247,-0.099195,-0.222341,-0.223663,0.655257,-0.131657,-0.214240,-0.195955,-0.175946,-0.077999,-0.089495,-0.083294,-0.127936,-0.086296,-0.071734,-0.128096,-0.164824,-0.19104,-0.065224,-0.151485,-0.091567,-0.080671,-0.067384,-0.064399,-0.064399,-0.095863,-0.097622,-0.068437,-0.037925,-0.074979
5192,0.385497,0.047489,-0.387885,0.327788,-0.274363,-0.098952,0.310138,5.889245,1.214897,-0.109143,-0.215099,0.057963,-0.139769,0.188739,0.497749,0.234092,-0.758261,0.269163,0.453702,-0.278272,0.368091,0.283654,0.235922,-0.445290,-0.124484,9.197394,-0.400392,-0.261525,3.310487,-0.306813,-0.0914,-0.063281,0.359426,0.998373,-0.35939,1.796513,2.996389,-0.567593,0.750540,-0.095014,...,-0.184238,-0.115807,-0.258358,-0.169538,-0.119808,-0.280463,-0.230254,-0.347692,-0.278278,1.542667,-0.157247,-0.099195,-0.222341,-0.223663,-0.271868,-0.131657,-0.214240,-0.195955,-0.175946,-0.077999,-0.089495,-0.083294,-0.127936,-0.086296,-0.071734,-0.128096,-0.164824,-0.19104,-0.065224,-0.151485,-0.091567,-0.080671,-0.067384,-0.064399,-0.064399,-0.095863,-0.097622,-0.068437,-0.037925,-0.074979
77708,0.365323,0.047489,0.304112,0.431602,-0.274363,-0.098952,0.548966,0.315628,1.655595,-0.109143,-0.215099,0.192372,-0.139769,0.477950,0.562309,0.399147,0.204255,0.238790,0.424967,-0.278272,-0.986113,0.536831,0.271648,-0.445290,-0.541980,0.088314,1.377556,-0.261525,-0.283701,-0.306813,-0.0914,-0.063281,0.514143,0.832189,-0.35939,2.841734,-0.390145,1.326368,-0.077009,-0.095014,...,-0.184238,-0.115807,-0.258358,-0.169538,-0.119808,-0.280463,-0.230254,-0.347692,-0.278278,-0.242731,-0.157247,-0.099195,3.774328,-0.223663,-0.271868,-0.131657,-0.214240,-0.195955,-0.175946,-0.077999,-0.089495,-0.083294,-0.127936,-0.086296,-0.071734,-0.128096,-0.164824,-0.19104,-0.065224,-0.151485,-0.091567,-0.080671,-0.067384,-0.064399,-0.064399,-0.095863,-0.097622,-0.068437,-0.037925,-0.074979


In [72]:
X_test_batting

Unnamed: 0,Consistency,Form,U.A.E.,Scotland,Namibia,Nepal,Zimbabwe,New Zealand,Netherlands,Oman,U.S.A.,West Indies,P.N.G.,Bangladesh,Pakistan,England,Australia,Kenya,India,Africa XI,Sri Lanka,South Africa,Canada,Hong Kong,Ireland,Afghanistan,Bermuda,Asia XI,ICC World XI,ICCA Dubai,Al Amerat,Kirtipur,Harare,Wellington,Aberdeen,Amstelveen,Kuala Lumpur,Cuttack,Auckland,Windhoek,...,Nairobi (Jaff),Schiedam,Glasgow,Deventer,Port Moresby,Hyderabad (Sind),Sheikhupura,Galle,Georgetown,Chandigarh,Nairobi,Derby,Gujranwala,Singapore,Tangier,Leicester,Jodhpur,Vijayawada,Pietermaritzburg,Cairns,Amritsar,Sialkot,Chelmsford,Taupo,Sargodha,Northampton,Worcester,Kandy,Quetta,Hove,Patna,Nairobi (Aga),Kwekwe,New Plymouth,Ballarat,Moratuwa,Jalandhar,Nairobi (Club),Berri,New Delhi
25449,-0.362682,0.047489,0.307426,-0.227745,-0.274363,-0.098952,0.255541,0.053700,-0.843367,-0.109143,-0.215099,-0.871612,-0.139769,0.060528,10.038142,0.345679,0.192623,-0.475608,0.369120,-0.278272,-0.236927,0.284079,-0.373632,-0.445290,-0.172645,-0.543819,-0.400392,-0.261525,-0.283701,-0.306813,-0.0914,-0.063281,-0.140438,0.196347,-0.359390,-0.534051,-0.390145,0.423053,0.376031,-0.095014,...,-0.184238,-0.115807,-0.258358,-0.169538,-0.119808,-0.280463,-0.230254,-0.347692,1.263737,-0.242731,-0.157247,-0.099195,-0.222341,-0.223663,-0.271868,-0.131657,-0.214240,-0.195955,-0.175946,-0.077999,-0.089495,-0.083294,-0.127936,-0.086296,-0.071734,-0.128096,-0.164824,-0.191040,-0.065224,-0.151485,-0.091567,-0.080671,-0.067384,-0.064399,-0.064399,-0.095863,-0.097622,-0.068437,-0.037925,-0.074979
54684,0.177518,0.047489,0.011265,-0.064066,-0.274363,-0.098952,0.437340,0.004894,-0.843367,-0.109143,-0.215099,-0.871612,1.297886,0.369456,0.548272,0.143319,0.205643,0.208250,0.594384,-0.278272,0.454715,0.440163,-0.373632,-0.445290,-0.040742,-0.105477,-0.400392,-0.261525,-0.283701,-0.306813,-0.0914,-0.063281,0.264720,0.330161,-0.359390,-0.534051,1.301288,0.638206,-0.703337,-0.095014,...,-0.184238,-0.115807,-0.258358,-0.169538,-0.119808,-0.280463,-0.230254,-0.347692,2.320794,-0.242731,-0.157247,-0.099195,-0.222341,-0.223663,-0.271868,-0.131657,1.854719,7.314080,-0.175946,-0.077999,-0.089495,-0.083294,-0.127936,-0.086296,-0.071734,-0.128096,-0.164824,2.807609,-0.065224,-0.151485,-0.091567,-0.080671,-0.067384,-0.064399,-0.064399,-0.095863,-0.097622,-0.068437,-0.037925,-0.074979
12248,-1.387204,0.212930,-0.387885,0.088552,-0.274363,3.140544,-0.438668,-0.805423,-0.843367,-0.109143,-0.215099,-0.871612,1.232073,-0.948691,-1.230720,-0.924086,-0.758261,-0.475608,-1.041063,-0.278272,-0.986113,-1.013122,-0.373632,-0.445290,-0.230353,-0.222244,-0.400392,-0.261525,-0.283701,1.989520,-0.0914,-0.063281,-0.056788,-0.840634,-0.359390,-0.534051,-0.390145,-0.567593,-0.703337,-0.095014,...,-0.184238,-0.115807,-0.258358,-0.169538,-0.119808,-0.280463,-0.230254,-0.347692,-0.278278,-0.242731,-0.157247,-0.099195,-0.222341,-0.223663,-0.271868,-0.131657,-0.214240,-0.195955,-0.175946,-0.077999,-0.089495,-0.083294,-0.127936,-0.086296,-0.071734,-0.128096,-0.164824,-0.191040,-0.065224,-0.151485,-0.091567,-0.080671,-0.067384,-0.064399,-0.064399,-0.095863,-0.097622,-0.068437,-0.037925,-0.074979
36762,0.490390,0.047489,-0.020552,0.441712,-0.274363,-0.098952,0.311204,0.279567,1.341046,-0.109143,-0.215099,0.107638,-0.139769,0.584222,0.603628,0.491039,0.308638,0.481898,0.632168,2.837981,-0.986113,0.703525,0.381338,-0.445290,-0.015024,0.056184,1.487821,-0.261525,3.496632,-0.306813,-0.0914,-0.063281,0.266007,1.900729,-0.359390,1.796074,-0.390145,1.327373,0.544808,-0.095014,...,-0.184238,-0.115807,-0.258358,-0.169538,-0.119808,-0.280463,-0.230254,3.188109,-0.278278,-0.242731,-0.157247,-0.099195,-0.222341,-0.223663,4.051101,-0.131657,-0.214240,-0.195955,-0.175946,-0.077999,-0.089495,-0.083294,-0.127936,-0.086296,-0.071734,-0.128096,-0.164824,-0.191040,-0.065224,-0.151485,-0.091567,-0.080671,-0.067384,-0.064399,-0.064399,-0.095863,-0.097622,-0.068437,-0.037925,-0.074979
23381,-0.696760,0.319247,-0.387885,-0.227963,-0.274363,-0.098952,-0.837383,-0.805423,-0.843367,-0.109143,-0.215099,-0.871612,-0.139769,-0.220688,-1.230720,-0.924086,-0.758261,-0.475608,-0.255082,-0.278272,-0.308442,-1.013122,-0.373632,-0.445290,-0.541980,0.108622,-0.400392,-0.261525,-0.283701,-0.306813,-0.0914,-0.063281,-0.163315,-0.840634,-0.359390,-0.534051,-0.390145,-0.567593,-0.703337,-0.095014,...,-0.184238,-0.115807,-0.258358,-0.169538,-0.119808,-0.280463,-0.230254,-0.347692,-0.278278,-0.242731,-0.157247,-0.099195,-0.222341,-0.223663,-0.271868,-0.131657,-0.214240,-0.195955,-0.175946,-0.077999,-0.089495,-0.083294,-0.127936,-0.086296,-0.071734,-0.128096,-0.164824,-0.191040,-0.065224,-0.151485,-0.091567,-0.080671,-0.067384,-0.064399,-0.064399,-0.095863,-0.097622,-0.068437,-0.037925,-0.074979
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
65358,0.602144,0.448037,-0.387885,0.299596,-0.274363,-0.098952,0.061272,-0.805423,-0.843367,-0.109143,-0.215099,0.311547,-0.139769,0.603181,0.822280,0.508928,0.313472,8.281026,0.704172,-0.278272,0.464909,0.480217,0.382634,-0.445290,0.172418,0.013218,-0.400392,-0.261525,-0.283701,-0.306813,-0.0914,-0.063281,-0.009372,1.822781,3.380655,-0.534051,-0.390145,-0.567593,1.278934,-0.095014,...,-0.184238,-0.115807,-0.258358,-0.169538,-0.119808,-0.280463,-0.230254,-0.347692,-0.278278,-0.242731,-0.157247,-0.099195,-0.222341,-0.223663,-0.271868,-0.131657,-0.214240,-0.195955,-0.175946,-0.077999,-0.089495,-0.083294,-0.127936,-0.086296,-0.071734,-0.128096,-0.164824,-0.191040,-0.065224,-0.151485,-0.091567,-0.080671,-0.067384,-0.064399,-0.064399,-0.095863,-0.097622,-0.068437,-0.037925,-0.074979
49584,6.074687,0.370626,0.605088,0.298637,-0.274363,-0.098952,0.517360,0.060856,0.783571,-0.109143,-0.215099,0.130315,-0.139769,-0.948691,0.639379,0.277892,0.057598,-0.475608,0.441848,-0.278272,0.309490,0.288446,-0.373632,-0.445290,0.224334,0.771350,-0.400392,-0.261525,-0.283701,-0.306813,-0.0914,-0.063281,0.440287,-0.840634,-0.359390,-0.534051,-0.390145,-0.567593,-0.028027,-0.095014,...,-0.184238,-0.115807,3.490771,-0.169538,-0.119808,-0.280463,-0.230254,-0.347692,-0.278278,-0.242731,-0.157247,-0.099195,-0.222341,-0.223663,-0.271868,-0.131657,-0.214240,-0.195955,-0.175946,-0.077999,-0.089495,-0.083294,-0.127936,-0.086296,-0.071734,-0.128096,-0.164824,-0.191040,-0.065224,-0.151485,-0.091567,-0.080671,-0.067384,-0.064399,-0.064399,-0.095863,-0.097622,-0.068437,-0.037925,-0.074979
107910,0.443025,0.047489,-0.387885,0.251925,-0.274363,-0.098952,0.400787,0.136452,0.550703,-0.109143,-0.215099,-0.871612,-0.139769,0.199438,0.392924,0.297413,0.043838,0.497361,0.393068,-0.278272,0.219091,0.375819,0.389094,-0.445290,0.050410,-0.543819,-0.400392,-0.261525,-0.283701,-0.306813,-0.0914,-0.063281,0.438371,0.556609,-0.359390,-0.534051,3.339251,1.044401,0.008293,-0.095014,...,-0.184238,-0.115807,-0.258358,-0.169538,-0.119808,-0.280463,-0.230254,-0.347692,6.718538,-0.242731,5.700884,-0.099195,-0.222341,3.446027,-0.271868,8.774329,5.325642,2.076426,-0.175946,-0.077999,-0.089495,-0.083294,-0.127936,1.137373,-0.071734,-0.128096,-0.164824,-0.191040,-0.065224,-0.151485,-0.091567,-0.080671,-0.067384,-0.064399,-0.064399,-0.095863,-0.097622,-0.068437,-0.037925,-0.074979
18498,-0.627467,0.297290,0.807399,0.215335,-0.274363,-0.098952,-0.837383,-0.805423,0.844567,-0.109143,-0.215099,-0.328876,-0.139769,0.026788,-0.749658,-0.924086,-0.758261,-0.475608,-0.744886,-0.278272,0.333538,-0.320956,-0.373632,2.284940,-0.118780,0.951393,-0.400392,-0.261525,-0.283701,-0.306813,-0.0914,-0.063281,0.313836,-0.840634,-0.359390,-0.534051,-0.390145,-0.567593,-0.346874,-0.095014,...,-0.184238,-0.115807,-0.258358,5.711458,-0.119808,-0.280463,-0.230254,4.364142,-0.278278,-0.242731,-0.157247,-0.099195,-0.222341,-0.223663,-0.271868,-0.131657,-0.214240,-0.195955,-0.175946,-0.077999,-0.089495,-0.083294,-0.127936,-0.086296,-0.071734,-0.128096,-0.164824,-0.191040,-0.065224,-0.151485,-0.091567,-0.080671,-0.067384,-0.064399,-0.064399,-0.095863,-0.097622,-0.068437,-0.037925,-0.074979


In [73]:
import numpy as np

X_train_batting = np.array(X_train_batting)
X_test_batting = np.array(X_test_batting)
y_train_batting = np.array(y_train_batting)
y_test_batting = np.array(y_test_batting)

In [74]:
X_train_batting

array([[-0.70626212,  0.04748891, -0.38788511, ..., -0.06843701,
        -0.03792511, -0.0749791 ],
       [-0.96342098,  0.04748891, -0.38788511, ..., -0.06843701,
        -0.03792511, -0.0749791 ],
       [-0.35091829,  0.45287954,  1.00317194, ..., -0.06843701,
        -0.03792511, -0.0749791 ],
       ...,
       [ 0.38549707,  0.04748891, -0.38788511, ..., -0.06843701,
        -0.03792511, -0.0749791 ],
       [ 0.36532343,  0.04748891,  0.30411208, ..., -0.06843701,
        -0.03792511, -0.0749791 ],
       [ 0.01825139,  0.31858791, -0.38788511, ..., -0.06843701,
        -0.03792511, -0.0749791 ]])

# Model Building

In [75]:
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from xgboost import XGBClassifier

from sklearn.metrics import classification_report, confusion_matrix, accuracy_score
from sklearn.model_selection import cross_val_score

In [76]:
models_batting = []
models_batting.append(('DTC', DecisionTreeClassifier(criterion= 'entropy', random_state=0)))
models_batting.append(('KNC', KNeighborsClassifier(n_neighbors=5, metric='minkowski', p=2)))
models_batting.append(('NB', GaussianNB()))
models_batting.append(('RFC', RandomForestClassifier(n_estimators=500, criterion='entropy', random_state=0)))
models_batting.append(('SVC', SVC(random_state = 0, kernel = 'rbf')))
models_batting.append(('XGB', XGBClassifier()))

In [None]:
results = []
names = []
kFold = []

for name, model in models_batting:
  model.fit(X_train_batting, y_train_batting)
  y_pred = model.predict(X_test_batting)
  accuracies = accuracy_score(y_test_batting, y_pred)
  fold = cross_val_score(estimator = model, X = X_train_batting, y = y_train_batting, cv = 10)
  results.append(accuracies*100)
  names.append(name)
  kFold.append(fold.mean()*100)
final_comparison_batting = pd.DataFrame(list(zip(names, results, kFold)), columns = ['Model Name', 'Accuracy', 'K-Fold'])

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logist

In [None]:
final_comparison_batting.sort_values(by=['Accuracy'], ascending=False)