# AI/ML Coding Interview Assessment: Pokemon Dataset

Welcome to the AI/ML coding interview assessment! In this assessment, you will work with the Pokemon dataset to demonstrate your data analysis and model building skills.

The assessment is divided into four steps:
1. Data Exploration
2. Distribution Analysis
3. Feature Selection & Type Prediction Model
4. Attack Prediction Model

You will be evaluated on your ability to complete these steps, your understanding of statistics, and the predictive performance of your models.

Let's get started!

## Setup

First, let's import the necessary libraries and load the Pokemon dataset.

In [2]:
# Import standard libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Set plot style
# Note: this style name requires matplotlib >= 3.6 (older releases call it 'seaborn-whitegrid')
plt.style.use('seaborn-v0_8-whitegrid')
plt.rcParams['figure.figsize'] = (12, 8)

# Import the interview package
from interview import data

# Load the Pokemon dataset
pokemon_df = data.load_pokemon_data()

# Display the first few rows
pokemon_df.head()

| Unnamed: 0 | abilities | against_bug | against_dark | against_dragon | against_electric | against_fairy | against_fight | against_fire | against_flying | against_ghost | ... | percentage_male | pokedex_number | sp_attack | sp_defense | speed | type1 | type2 | weight_kg | generation | is_legendary |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ['Overgrow', 'Chlorophyll'] | 1.0 | 1.0 | 1.0 | 0.5 | 0.5 | 0.5 | 2.0 | 2.0 | 1.0 | ... | 88.1 | 1 | 65 | 65 | 45 | grass | poison | 6.9 | 1 | 0 |
| 1 | ['Overgrow', 'Chlorophyll'] | 1.0 | 1.0 | 1.0 | 0.5 | 0.5 | 0.5 | 2.0 | 2.0 | 1.0 | ... | 88.1 | 2 | 80 | 80 | 60 | grass | poison | 13.0 | 1 | 0 |
| 2 | ['Overgrow', 'Chlorophyll'] | 1.0 | 1.0 | 1.0 | 0.5 | 0.5 | 0.5 | 2.0 | 2.0 | 1.0 | ... | 88.1 | 3 | 122 | 120 | 80 | grass | poison | 100.0 | 1 | 0 |
| 3 | ['Blaze', 'Solar Power'] | 0.5 | 1.0 | 1.0 | 1.0 | 0.5 | 1.0 | 0.5 | 1.0 | 1.0 | ... | 88.1 | 4 | 60 | 50 | 65 | fire |  | 8.5 | 1 | 0 |
| 4 | ['Blaze', 'Solar Power'] | 0.5 | 1.0 | 1.0 | 1.0 | 0.5 | 1.0 | 0.5 | 1.0 | 1.0 | ... | 88.1 | 5 | 80 | 65 | 80 | fire |  | 19.0 | 1 | 0 |


## Step 1: Data Exploration

In this step, you will explore the Pokemon dataset to understand its structure, features, and basic statistics.

Tasks:
1. Examine the dataset structure (shape, columns, data types)
2. Check for missing values and handle them appropriately
3. Calculate basic statistics for numeric columns
4. Explore the distribution of Pokemon types
5. Identify any interesting patterns or relationships in the data
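
As a starting point for the tasks above, here is a minimal sketch on a tiny stand-in frame rather than the real dataset. The first four rows reuse values shown in the `head()` output; the fifth row is made up to show a missing numeric value. Filling `type2` with the placeholder `"none"` and `weight_kg` with the median are assumptions for illustration, not the expected answer.

```python
import numpy as np
import pandas as pd

# Tiny stand-in frame; the real data comes from data.load_pokemon_data().
# The last row is hypothetical and exists only to show a missing weight.
sample_df = pd.DataFrame({
    "type1": ["grass", "grass", "fire", "fire", "water"],
    "type2": ["poison", "poison", np.nan, np.nan, np.nan],
    "weight_kg": [6.9, 13.0, 8.5, 19.0, np.nan],
    "speed": [45, 60, 65, 80, 43],
})

# Task 1: structure (row/column counts and column dtypes)
n_rows, n_cols = sample_df.shape
column_dtypes = sample_df.dtypes

# Task 2: missing values per column
missing_counts = sample_df.isna().sum()

# One possible handling (an assumption): treat a missing type2 as
# "no secondary type" and fill a missing weight with the column median.
cleaned_df = sample_df.assign(
    type2=sample_df["type2"].fillna("none"),
    weight_kg=sample_df["weight_kg"].fillna(sample_df["weight_kg"].median()),
)

# Task 3: summary statistics for the numeric columns
numeric_summary = sample_df.describe()

# Task 4: how many Pokemon have each primary type
type_counts = sample_df["type1"].value_counts()
```

The same calls apply to `pokemon_df`; whether to fill, drop, or flag missing values depends on why they are missing.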

In [None]:
# Examine the dataset structure
# Your code here

In [None]:
# Check for missing values
# Your code here

In [None]:
# Calculate basic statistics for numeric columns
# Your code here

In [None]:
# Explore the distribution of Pokemon types
# Your code here

In [None]:
# Identify interesting patterns or relationships
# Your code here

## Step 2: Distribution Analysis

In this step, you will analyze the distributions of key Pokemon attributes: weight, attack, defense, speed, and type.

Tasks:
1. Create visualizations to show the distributions of weight, attack, defense, and speed
2. Analyze how these attributes vary across different Pokemon types
3. Identify any outliers and discuss their impact
4. Calculate and visualize correlations between numeric attributes
5. Draw insights from your analysis
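
One common way to approach the per-type, outlier, and correlation tasks above is sketched below without plots. The column names `attack` and `defense` are assumed from the task description (they are elided in the `head()` output), every value in the frame is illustrative, and the 1.5 × IQR fence is a convention rather than a requirement.

```python
import pandas as pd

# Illustrative values only; swap in pokemon_df for the real analysis.
stats_df = pd.DataFrame({
    "type1": ["grass", "grass", "grass", "fire", "fire", "fire", "rock", "rock"],
    "attack": [50, 60, 80, 55, 65, 85, 80, 95],
    "defense": [50, 65, 85, 45, 60, 80, 100, 115],
    "speed": [45, 60, 80, 65, 80, 100, 20, 35],
    "weight_kg": [6.9, 13.0, 100.0, 8.5, 19.0, 90.5, 20.0, 999.9],
})
numeric_cols = ["attack", "defense", "speed", "weight_kg"]


def iqr_outlier_mask(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Flag values more than k * IQR below the first or above the third quartile."""
    q1, q3 = values.quantile(0.25), values.quantile(0.75)
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)


# Task 2: median of each attribute per primary type (the median resists outliers)
type_medians = stats_df.groupby("type1")[numeric_cols].median()

# Task 3: rows whose weight falls outside the IQR fences
weight_outliers = stats_df[iqr_outlier_mask(stats_df["weight_kg"])]

# Task 4: Pearson correlation matrix of the numeric attributes
correlations = stats_df[numeric_cols].corr()
```

For the visual tasks, seaborn can plot the same frames, for example `sns.histplot(data=stats_df, x="weight_kg")` or `sns.heatmap(correlations, annot=True)`.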

In [None]:
# Create visualizations for weight, attack, defense, and speed distributions
# Your code here

In [None]:
# Analyze attribute variations across Pokemon types
# Your code here

In [None]:
# Identify and discuss outliers
# Your code here

In [None]:
# Calculate and visualize correlations
# Your code here

In [None]:
# Draw insights from your analysis
# Your code here

## Step 3: Feature Selection & Type Prediction Model

In this step, you will explore which features are most important for determining Pokemon type, select the most relevant features, and build a model to predict Pokemon type.

Tasks:
1. Explore the relationship between various features and Pokemon types
2. Identify which features are most predictive of Pokemon type
3. Select the most relevant features for your model
4. Build a classification model using your selected features
5. Evaluate the model's performance
6. Interpret the results and discuss feature importance
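
The workflow in the task list can be sketched as follows, assuming scikit-learn is available (it is not imported in the setup cell) and using synthetic data with two made-up, well-separated types. `RandomForestClassifier` is one reasonable choice, not the required one. On the real data, note that the `against_*` columns look like they are derived from a Pokemon's types, so including them may amount to target leakage.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_per_type = 60

# Synthetic stand-in: "rock" is slow and sturdy, "electric" is fast and frail.
# The "noise" column carries no information about the type.
toy_df = pd.DataFrame({
    "defense": np.concatenate([rng.normal(120, 5, n_per_type), rng.normal(50, 5, n_per_type)]),
    "speed": np.concatenate([rng.normal(30, 5, n_per_type), rng.normal(100, 5, n_per_type)]),
    "noise": rng.normal(0, 1, 2 * n_per_type),
    "type1": ["rock"] * n_per_type + ["electric"] * n_per_type,
})

feature_cols = ["defense", "speed", "noise"]

# Stratify so both types keep the same proportions in train and test.
X_train, X_test, y_train, y_test = train_test_split(
    toy_df[feature_cols],
    toy_df["type1"],
    test_size=0.25,
    stratify=toy_df["type1"],
    random_state=0,
)

clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X_train, y_train)

# Accuracy on the held-out rows
accuracy = accuracy_score(y_test, clf.predict(X_test))

# Impurity-based importances, largest first (they sum to 1)
importances = pd.Series(clf.feature_importances_, index=feature_cols).sort_values(ascending=False)
```

The real dataset has many more types, likely with uneven counts, so accuracy alone can mislead; a per-class summary such as `sklearn.metrics.classification_report` is worth adding.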

In [None]:
# Explore feature importance for type prediction
# Your code here

In [None]:
# Select the most relevant features
# Your code here

In [None]:
# Build a model with selected features
# Your code here

In [None]:
# Evaluate the model's performance
# Your code here

In [None]:
# Interpret the results and discuss feature importance
# Your code here

## Step 4: Attack Prediction Model

In this step, you will build a model to predict a Pokemon's attack stat based on multiple features.

Tasks:
1. Identify which features might be relevant for predicting the attack stat
2. Prepare the data for modeling (features and target)
3. Split the data into training and testing sets
4. Select an appropriate algorithm for this regression task
5. Train the model and evaluate its performance
6. Interpret the results and discuss the model's strengths and limitations
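
A minimal sketch of tasks 2 through 5, assuming scikit-learn and a synthetic target generated from a known linear rule so the fitted coefficients can be checked. The feature choice, the 75/25 split, and `LinearRegression` are illustrative assumptions; the mean-prediction baseline gives the error figures something to be compared against.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n_rows = 200

# Synthetic data: attack = 0.6 * defense + 0.3 * speed + 10 + Gaussian noise
features = pd.DataFrame({
    "defense": rng.uniform(30, 130, n_rows),
    "speed": rng.uniform(20, 120, n_rows),
})
attack = 0.6 * features["defense"] + 0.3 * features["speed"] + 10 + rng.normal(0, 3, n_rows)

# Hold out 25% of the rows for evaluation
X_train, X_test, y_train, y_test = train_test_split(
    features, attack, test_size=0.25, random_state=42
)

model = LinearRegression().fit(X_train, y_train)
predictions = model.predict(X_test)

# Error metrics on the held-out rows
rmse = np.sqrt(mean_squared_error(y_test, predictions))
r2 = r2_score(y_test, predictions)

# Baseline: always predict the mean attack seen in training
baseline_predictions = np.full(len(y_test), y_train.mean())
baseline_rmse = np.sqrt(mean_squared_error(y_test, baseline_predictions))

# Fitted slopes, which should land near the generating values 0.6 and 0.3
coefficients = pd.Series(model.coef_, index=features.columns)
```

On the real data the relationship is unlikely to be this clean, so compare against the baseline before reading much into the coefficients.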

In [None]:
# Identify relevant features for predicting attack
# Your code here

In [None]:
# Prepare the data for modeling
# Your code here

In [None]:
# Split the data into training and testing sets
# Your code here

In [None]:
# Select and train a regression model
# Your code here

In [None]:
# Evaluate the model's performance
# Your code here

In [None]:
# Interpret the results
# Your code here

## Conclusion

Congratulations on completing the AI/ML coding interview assessment!

In this assessment, you have:
1. Explored the Pokemon dataset and understood its structure
2. Analyzed distributions of key Pokemon attributes
3. Identified important features and built a classification model to predict Pokemon type
4. Built a regression model to predict a Pokemon's attack stat using multiple features

Please take a moment to summarize your findings and reflect on your approach to each task.

In [None]:
# Summary and reflection
# Your code here