# PUBG Finish Placement Prediction

Battle Royale-style video games have taken the world by storm. 100 players are dropped onto an island empty-handed and must explore, scavenge, and eliminate other players until only one is left standing, all while the play zone continues to shrink.

PlayerUnknown's BattleGrounds (PUBG) has enjoyed massive popularity. With over 50 million copies sold, it's the fifth best-selling game of all time, and has millions of active monthly players.

The team at PUBG has made official game data available for the public to explore and scavenge outside of "The Blue Circle." This competition is not an official or affiliated PUBG site - Kaggle collected data made possible through the PUBG Developer API.

You are given over 65,000 games' worth of anonymized player data, split into training and testing sets, and asked to predict final placement from final in-game stats and initial player ratings.

What's the best strategy to win in PUBG? Should you sit in one spot and hide your way into victory, or do you need to be the top shot? Let's let the data do the talking!

### Check out the dataset description here.
### Let's start the predictive analysis.

## Importing the libraries

In [1]:
import numpy as np
import pandas as pd
from sklearn.utils import shuffle
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

### Importing the dataset

In [2]:
dataset = pd.read_csv('dataset/train_V2.csv')
test = pd.read_csv('dataset/test_V2.csv')

### Taking care of missing data

In [3]:
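# Fill missing numeric values with the column median; numeric_only=True
# skips the string columns (IDs, match type), which have no median.
dataset = dataset.fillna(dataset.median(numeric_only=True))
test = test.fillna(test.median(numeric_only=True))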
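
To make the imputation step concrete, here is a minimal sketch on a made-up frame. The column names and values are illustrative assumptions, not rows from the competition files.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the PUBG data (all values are made up).
toy = pd.DataFrame({
    "matchType": ["solo", "duo", "squad", "solo"],
    "kills": [0.0, 2.0, np.nan, 4.0],
    "walkDistance": [100.0, np.nan, 300.0, 500.0],
})

# numeric_only=True leaves the string column out of the median computation.
medians = toy.median(numeric_only=True)
filled = toy.fillna(medians)

print(toy.isna().sum().sum(), "missing before,", filled.isna().sum().sum(), "after")
```

Each gap is filled with the median of its own column, so a missing `kills` value never borrows from `walkDistance`.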

### We have to predict whether a player wins the match based on their stats. The target is the percentile winning placement, where 1 corresponds to 1st place and 0 corresponds to last place in the match. So we convert this field to 1 or 0 according to whether the player won or not.

In [4]:
# Integer conversion truncates: 1.0 becomes 1, every value below 1.0 becomes 0
dataset['winPlacePerc'] = dataset['winPlacePerc'].astype(np.int64)
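
The conversion relies on integer truncation, which is easy to misread. The sketch below uses made-up percentile values to show what the cast does, next to an explicit comparison that gives the same labels (assuming the values lie in [0, 1]).

```python
import numpy as np
import pandas as pd

# Made-up placement percentiles in [0, 1].
perc = pd.Series([0.0, 0.25, 0.9999, 1.0, 0.5, 1.0], name="winPlacePerc")

# astype(np.int64) truncates toward zero, so only exact 1.0 survives as 1.
truncated = perc.astype(np.int64)

# A more explicit way to write the same rule.
explicit = (perc >= 1.0).astype(np.int64)

print(truncated.tolist())
print("share of winners:", truncated.mean())
```

Only the first-placed players get label 1, so on real matches the positive class is probably a small minority. That is worth keeping in mind when reading the accuracy figures later on.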

### Data Shuffling

In [5]:
dataset = shuffle(dataset)
test = shuffle(test)

### Selecting desirable fields

In [6]:
x = dataset.iloc[:, 3:28].values
y = dataset.iloc[:, -1].values
X_test = test.iloc[:, 3:28].values
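
Positional slices such as `3:28` depend on the exact column order of the file. A hedged alternative is to select by name. The sketch uses a tiny made-up frame, and the identifier and feature column names in it (`Id`, `groupId`, `matchId`, `kills`, `matchType`) are assumptions for illustration.

```python
import pandas as pd

# Tiny stand-in frame; the real file has many more feature columns.
toy = pd.DataFrame({
    "Id": ["a", "b"], "groupId": ["g1", "g2"], "matchId": ["m1", "m1"],
    "kills": [1, 3], "matchType": ["solo", "solo"], "winPlacePerc": [0, 1],
})

id_cols = ["Id", "groupId", "matchId"]   # assumed identifier columns
target = "winPlacePerc"
feature_cols = [c for c in toy.columns if c not in id_cols + [target]]

x_toy = toy[feature_cols].values
y_toy = toy[target].values

# Position of the categorical column inside the feature matrix.
cat_idx = feature_cols.index("matchType")
print(feature_cols, cat_idx)
```

Looking up `cat_idx` by name also avoids hard-coding the index of the categorical column in the encoding step.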

### Train-Test Split

In [7]:
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.25)
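
With a rare positive class, a plain random split can leave the two parts with slightly different shares of winners. As a sketch on synthetic labels (the 5 % rate is an assumption, not measured from the data), `stratify` keeps the class ratio stable and `random_state` makes the split repeatable.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
x_toy = rng.normal(size=(1000, 3))
y_toy = np.zeros(1000, dtype=np.int64)
y_toy[:50] = 1  # 5 % positives, chosen only for illustration

x_tr, x_te, y_tr, y_te = train_test_split(
    x_toy, y_toy, test_size=0.25, stratify=y_toy, random_state=0
)

print("train positive rate:", y_tr.mean())
print("test positive rate:", y_te.mean())
```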

## Data Preprocessing

### Taking care of Categorical Data

In [8]:
from sklearn.compose import ColumnTransformer

# Column 12 of the selected features is the categorical one. Fit a single
# encoder on the training split and reuse it for the other two arrays, so
# all three end up with the same columns in the same order.
# (OneHotEncoder's categorical_features argument was removed in scikit-learn 0.22.)
ct = ColumnTransformer([('onehot', OneHotEncoder(handle_unknown = 'ignore'), [12])],
                       remainder = 'passthrough', sparse_threshold = 0)
x_train = ct.fit_transform(x_train).astype(np.float64)
x_test = ct.transform(x_test).astype(np.float64)
X_test = ct.transform(X_test).astype(np.float64)

In case you used a LabelEncoder before this OneHotEncoder to convert the categories to integers, then you can now use the OneHotEncoder directly.
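
One point worth illustrating: the encoder should learn its categories from the training data once and then be reused, otherwise two arrays can end up with different one-hot columns. Here is a minimal sketch with made-up rows; the category names, including the unseen `"crash"` value, are assumptions.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

train_toy = np.array([[1.0, "solo"], [2.0, "duo"], [3.0, "squad"]], dtype=object)
new_toy = np.array([[4.0, "duo"], [5.0, "crash"]], dtype=object)

ct_toy = ColumnTransformer(
    [("match_type", OneHotEncoder(handle_unknown="ignore"), [1])],
    remainder="passthrough",   # keep the numeric column as-is
    sparse_threshold=0,        # always return a dense array
)
train_enc = ct_toy.fit_transform(train_toy).astype(np.float64)
new_enc = ct_toy.transform(new_toy).astype(np.float64)

print(train_enc.shape, new_enc.shape)
print(new_enc)
```

Both arrays come out with four columns (three one-hot columns followed by the numeric one), and the category that was never seen during fitting is encoded as all zeros instead of raising an error.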

### Feature Scaling

In [9]:
sc = StandardScaler()
x_train = sc.fit_transform(x_train)
x_test = sc.transform(x_test)
X_test = sc.transform(X_test)
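
The scaler is fitted on the training split only; the other arrays are transformed with the training mean and standard deviation. A small sketch with made-up numbers shows the effect.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

train_toy = np.array([[0.0, 10.0], [2.0, 30.0], [4.0, 50.0]])
held_out_toy = np.array([[2.0, 30.0], [6.0, 70.0]])

sc_toy = StandardScaler()
train_scaled = sc_toy.fit_transform(train_toy)    # learns mean and std here
held_out_scaled = sc_toy.transform(held_out_toy)  # reuses them, no refit

print("training means:", sc_toy.mean_)
print(held_out_scaled)
```

The first held-out row equals the training mean, so it maps to zeros. The second lies outside the training range and maps to values above 2, which is expected because the scaler never saw it.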

## Classification

### 1. Decision Tree Classifier

### Fitting Decision Tree Classification to the Training set

In [10]:
classifier = DecisionTreeClassifier(criterion = 'entropy')
classifier.fit(x_train, y_train)

DecisionTreeClassifier(class_weight=None, criterion='entropy', max_depth=None,
            max_features=None, max_leaf_nodes=None,
            min_impurity_decrease=0.0, min_impurity_split=None,
            min_samples_leaf=1, min_samples_split=2,
            min_weight_fraction_leaf=0.0, presort=False, random_state=None,
            splitter='best')

### Predicting the Test set results

In [16]:
y_pred = classifier.predict(x_test)
Y_pred = classifier.predict(X_test)

### Accuracy

In [17]:
print("{:0.2f} % Accuracy".format(classifier.score(x_test,y_test)*100))

95.90 % Accuracy
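
Accuracy alone can flatter a model when one class dominates. As a hedged illustration on synthetic labels (the 95/5 split is an assumption, not the real class ratio), a baseline that always predicts "no win" already scores high on accuracy, while balanced accuracy and the confusion matrix show that it finds no winners at all.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score, balanced_accuracy_score, confusion_matrix

# Synthetic labels: 95 losers, 5 winners.
y_true = np.array([0] * 95 + [1] * 5)
X_dummy = np.zeros((100, 1))  # features are irrelevant for this baseline

baseline = DummyClassifier(strategy="most_frequent").fit(X_dummy, y_true)
y_base = baseline.predict(X_dummy)

acc = accuracy_score(y_true, y_base)
bal_acc = balanced_accuracy_score(y_true, y_base)
cm = confusion_matrix(y_true, y_base)

print("accuracy:", acc)
print("balanced accuracy:", bal_acc)
print(cm)
```

Comparing each classifier's score against such a baseline gives a better sense of how much it has actually learned.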


### 2. Random Forest Classifier

### Fitting Random Forest Classification to the Training set

In [None]:
classifier = RandomForestClassifier(n_estimators = 1000, criterion = 'entropy')
classifier.fit(x_train, y_train)

### Predicting the Test set results

In [16]:
y_pred = classifier.predict(x_test)
Y_pred = classifier.predict(X_test)

### Accuracy

In [17]:
print("{:0.2f} % Accuracy".format(classifier.score(x_test,y_test)*100))

95.90 % Accuracy


### 3. Naive Bayes

### Fitting Naive Bayes to the Training set

In [10]:
classifier = GaussianNB()
classifier.fit(x_train, y_train)

GaussianNB(priors=None, var_smoothing=1e-09)

### Predicting the Test set results

In [11]:
y_pred = classifier.predict(x_test)
Y_pred = classifier.predict(X_test)

### Accuracy

In [12]:
print("{:0.2f} % Accuracy".format(classifier.score(x_test,y_test)*100))

90.85 % Accuracy


### 4. Logistic Regression

### Fitting Logistic Regression to the Training set

In [13]:
classifier = LogisticRegression()
classifier.fit(x_train, y_train)



LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
          intercept_scaling=1, max_iter=100, multi_class='warn',
          n_jobs=None, penalty='l2', random_state=None, solver='warn',
          tol=0.0001, verbose=0, warm_start=False)

### Predicting the Test set results

In [14]:
y_pred = classifier.predict(x_test)
Y_pred = classifier.predict(X_test)

### Accuracy

In [15]:
print("{:0.2f} % Accuracy".format(classifier.score(x_test,y_test)*100))

97.22 % Accuracy


### 5. K-NN

### Fitting K-NN to the Training set

In [None]:
classifier = KNeighborsClassifier()
classifier.fit(x_train, y_train)

### Predicting the Test set results

In [None]:
y_pred = classifier.predict(x_test)
Y_pred = classifier.predict(X_test)

### Accuracy

In [None]:
print("{:0.2f} % Accuracy".format(classifier.score(x_test,y_test)*100))

### 6. SVM

### Fitting SVM to the Training set

In [None]:
classifier = SVC(kernel = 'linear')
classifier.fit(x_train, y_train)
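
Fitting `SVC` scales at least quadratically with the number of samples, so on a dataset this large it may not finish in reasonable time. One possible substitute, sketched here on synthetic data, is `LinearSVC`. It is a different implementation (squared hinge loss by default, and the intercept is regularized), so its results will not match `SVC(kernel = 'linear')` exactly.

```python
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

# Synthetic stand-in for the scaled training data.
X_syn, y_syn = make_classification(n_samples=2000, n_features=10, random_state=0)

svm_toy = LinearSVC(C=1.0, max_iter=5000)
svm_toy.fit(X_syn, y_syn)

train_acc = svm_toy.score(X_syn, y_syn)
print("{:0.2f} % Accuracy (training data)".format(train_acc * 100))
```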

### Predicting the Test set results

In [None]:
y_pred = classifier.predict(x_test)
Y_pred = classifier.predict(X_test)

### Accuracy

In [None]:
print("{:0.2f} % Accuracy".format(classifier.score(x_test,y_test)*100))
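
The fit, predict and score steps repeat for every model, so they can be folded into one loop. Below is a minimal sketch on synthetic data; the dataset, the model selection, and the `random_state` values are assumptions made for the example, and the printed numbers say nothing about the PUBG data.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the preprocessed features and labels.
X_syn, y_syn = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X_syn, y_syn, test_size=0.25, random_state=0)

models = {
    "Decision Tree": DecisionTreeClassifier(criterion="entropy", random_state=0),
    "Naive Bayes": GaussianNB(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}

scores = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    scores[name] = model.score(X_te, y_te)  # accuracy on the held-out part
    print("{}: {:0.2f} % Accuracy".format(name, scores[name] * 100))
```

Keeping the scores in a dictionary makes it easy to compare models side by side.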