# Weedle's Cave challenge in R 
## Predicting wins and losses between pokémon in Gen 1

### This is the "Weedle's Cave" challenge. Can we build a model that predicts which pocket monster will win a battle?

##### Personal objectives: Get familiar with R.

###### First, read in the data! We'll read in the pokémon list, the combats that we'll model from, and the final test data that we need to populate for our Kaggle submission. The high-level workflow is as follows:

1. Profile/clean/investigate the data
2. Build some models and see what works well
3. Apply the final chosen model to our test data for our Kaggle submission

In [23]:
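# read.csv() treats the first row as variable names and uses a comma as the separator.
# Paths use / rather than \, which also works on Windows systems.
# The commented-out read.table() call below would additionally assign an id variable to the row names.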

data_pokemon <- read.csv("/Users/kintesh/Documents/kaggle/pokemon/data/pokemon.csv")
data_combats <- read.csv("/Users/kintesh/Documents/kaggle/pokemon/data/combats.csv")
data_tests <- read.csv("/Users/kintesh/Documents/kaggle/pokemon/data/tests.csv")
# mydata <- read.table("/Users/kintesh/Documents/kaggle/pokemon/data/pokemon.csv", header=TRUE, 
#    sep=",", row.names="id", quote="")

colnames(data_pokemon) = c('PokeNum', 'Name', 'Type1', 'Type2', 'HP', 'Attack', 
                           'Defense', 'SpAtk', 'SpDef', 'Speed', 'Generation', 'Legendary')
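
The three `read.csv()` calls above repeat the same absolute path. A minimal sketch of keeping the folder in one variable and building each path with `file.path()` follows. The names `data_dir`, `toy` and `combats_toy.csv` are made up for illustration, and a temporary folder stands in for the real data folder so the snippet runs without the Kaggle files.

```r
# Hypothetical base directory: a temporary folder stands in for the real data folder.
data_dir <- tempdir()

# Write a two-row toy file so the example runs without the Kaggle data.
toy <- data.frame(First_pokemon  = c(266, 702),
                  Second_pokemon = c(298, 701),
                  Winner         = c(298, 701))
write.csv(toy, file.path(data_dir, "combats_toy.csv"), row.names = FALSE)

# file.path() joins the pieces with "/", which R accepts on every platform.
toy_combats <- read.csv(file.path(data_dir, "combats_toy.csv"))
str(toy_combats)
```

With this layout, pointing the notebook at a different machine means changing only `data_dir`.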

In [24]:
head(data_pokemon,5)
head(data_combats,5)
head(data_tests,5)

PokeNum,Name,Type1,Type2,HP,Attack,Defense,SpAtk,SpDef,Speed,Generation,Legendary
1,Bulbasaur,Grass,Poison,45,49,49,65,65,45,1,False
2,Ivysaur,Grass,Poison,60,62,63,80,80,60,1,False
3,Venusaur,Grass,Poison,80,82,83,100,100,80,1,False
4,Mega Venusaur,Grass,Poison,80,100,123,122,120,80,1,False
5,Charmander,Fire,,39,52,43,60,50,65,1,False


First_pokemon,Second_pokemon,Winner
266,298,298
702,701,701
191,668,668
237,683,683
151,231,151


First_pokemon,Second_pokemon
129,117
660,211
706,115
195,618
27,656
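
As part of the profiling step, a quick sanity check on the combats table can be worthwhile before any modelling. The sketch below uses a hand-typed copy of the five combats printed above (not the full dataset) and checks that every winner actually took part in the fight, then computes how often the first-listed pokemon wins. The names `winner_is_valid` and `first_win_rate` are illustrative.

```r
# Toy copy of the five combats printed above (not the full dataset).
combats <- data.frame(
  First_pokemon  = c(266L, 702L, 191L, 237L, 151L),
  Second_pokemon = c(298L, 701L, 668L, 683L, 231L),
  Winner         = c(298L, 701L, 668L, 683L, 151L)
)

# Every winner should be one of the two pokemon that fought.
winner_is_valid <- combats$Winner == combats$First_pokemon |
  combats$Winner == combats$Second_pokemon
all(winner_is_valid)

# Share of combats won by the pokemon listed first.
first_win_rate <- mean(combats$Winner == combats$First_pokemon)
first_win_rate
```

On these five rows the check passes and the first pokemon wins one combat in five. Running the same two expressions on `data_combats` would show whether the full data is clean and whether battle order matters.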


######  Let's get information on the battles and their outcomes: which pokemon fought first, what was its type, and so on. But first, add a combat ID and rename the columns so we can merge more easily.

In [29]:
# Create a CombatID for each combat in the combats dataset, in case we need to join back later.
# cbind() (column bind) adds it as a new first column.

id <- rownames(data_combats)
data_combats2 <- cbind(id=id, data_combats)

colnames(data_combats2) <- c('CombatID', 'First_pokemon', 'Second_pokemon', 'Winner')
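# rownames() returns character strings (a factor after cbind() in R versions before 4.0),
# so go through as.character() before converting CombatID to numeric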
data_combats2[,'CombatID'] <- as.numeric(as.character(data_combats2[,'CombatID']))

pokemon_first <- data_combats2[c('CombatID', 'First_pokemon')]
colnames(pokemon_first) = c("CombatID", "PokeNum")

pokemon_second <- data_combats2[c('CombatID', 'Second_pokemon')]
colnames(pokemon_second) = c("CombatID", "PokeNum")

head(data_combats2,5)
head(pokemon_first,5)
head(pokemon_second,5)

CombatID,First_pokemon,Second_pokemon,Winner
1,266,298,298
2,702,701,701
3,191,668,668
4,237,683,683
5,151,231,151


CombatID,PokeNum
1,266
2,702
3,191
4,237
5,151


CombatID,PokeNum
1,298
2,701
3,668
4,683
5,231
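
Since the CombatID built above is just the row position, a shorter route is possible. The sketch below assumes the data frame still has its default row names (1 to n), as it does straight after `read.csv()`. It creates the ID as an integer directly, so no character-to-numeric conversion is needed. `combats` and `combats_with_id` are toy names for illustration.

```r
# Toy copy of the first three combats shown above.
combats <- data.frame(
  First_pokemon  = c(266L, 702L, 191L),
  Second_pokemon = c(298L, 701L, 668L),
  Winner         = c(298L, 701L, 668L)
)

# seq_len(nrow(x)) yields 1, 2, ..., n as integers, one per row.
combats_with_id <- cbind(CombatID = seq_len(nrow(combats)), combats)

class(combats_with_id$CombatID)
combats_with_id
```

Naming the column inside `cbind()` also removes the need to rename all four columns afterwards.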


######  Merge the pokemon stats onto the first pokemon of each combat, then do the same for the second pokemon

In [30]:
# Check the types in each column before we begin.
sapply(data_pokemon, class)
sapply(pokemon_first, class)
sapply(pokemon_second, class)

combats_1_stats <- merge(x = pokemon_first,  y = data_pokemon, by = "PokeNum", all.x = TRUE)
combats_2_stats <- merge(x = pokemon_second, y = data_pokemon, by = 'PokeNum', all.x = TRUE)

# merge() returns rows sorted by PokeNum, so restore the original combat order
combats_1_stats <- combats_1_stats[order(combats_1_stats$CombatID),]
combats_2_stats <- combats_2_stats[order(combats_2_stats$CombatID),] 

head(combats_1_stats,5)
head(combats_2_stats,5)

Unnamed: 0,PokeNum,CombatID,Name,Type1,Type2,HP,Attack,Defense,SpAtk,SpDef,Speed,Generation,Legendary
16474,266,1,Larvitar,Rock,Ground,50,64,50,45,50,41,2,False
43959,702,2,Virizion,Grass,Fighting,91,90,72,90,129,108,5,True
11690,191,3,Togetic,Fairy,Flying,55,40,85,80,105,40,2,False
14633,237,4,Slugma,Fire,,40,40,40,70,40,20,2,False
9235,151,5,Omastar,Rock,Water,70,60,125,115,70,55,1,False


Unnamed: 0,PokeNum,CombatID,Name,Type1,Type2,HP,Attack,Defense,SpAtk,SpDef,Speed,Generation,Legendary
18329,298,1,Nuzleaf,Grass,Dark,70,70,40,60,40,60,3,False
43682,701,2,Terrakion,Rock,Fighting,91,129,90,72,90,108,5,True
41535,668,3,Beheeyem,Psychic,,75,75,75,125,95,40,5,False
42497,683,4,Druddigon,Dragon,,77,120,90,60,90,48,5,False
14121,231,5,Shuckle,Bug,Rock,20,10,230,10,230,5,2,False
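
By default `merge()` sorts its result by the key column, which is why the rows above had to be re-ordered by CombatID. A sketch of an alternative that never loses the combat order is to look up each pokemon's row with `match()`. The toy tables below reuse three pokemon and their Speed values from the output above. `idx` and `first_stats` are illustrative names, and this is one possible approach rather than the notebook's method.

```r
# Toy stats table holding three of the pokemon shown above.
pokemon <- data.frame(
  PokeNum = c(191L, 266L, 702L),
  Name    = c("Togetic", "Larvitar", "Virizion"),
  Speed   = c(40L, 41L, 108L)
)

# First pokemon of combats 1 to 3, in combat order.
first <- data.frame(CombatID = 1:3, PokeNum = c(266L, 702L, 191L))

# match() gives, for each combat, the row of `pokemon` holding that PokeNum.
idx <- match(first$PokeNum, pokemon$PokeNum)

# Indexing by idx pulls the stats in combat order, so no re-sorting is needed.
first_stats <- cbind(first, pokemon[idx, c("Name", "Speed")])
rownames(first_stats) <- NULL
first_stats
```

A PokeNum missing from the stats table produces `NA` in `idx` and a row of `NA` stats, which matches what `all.x = TRUE` does in `merge()`.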
