A standardized environment for conducting AI/ML coding interviews using the Kaggle Pokemon dataset. This platform allows candidates to demonstrate their data analysis and model building skills in a live interview setting.
- Standardized Assessment: Structured prompts to evaluate candidates' skills in data exploration, statistical analysis, and model building
- Cross-Platform Compatibility: Runs on macOS, Linux, and Windows
- Python Version Support: Compatible with Python 3.9-3.12
- Flexible Environment: Supports both external Jupyter notebooks and VSCode's integrated notebook experience
- Essential Libraries: Includes pandas, numpy, scikit-learn, tensorflow, pytorch, matplotlib, seaborn, and other common data science packages
The assessment follows a structured four-step process:
- Data Exploration: Evaluate the candidate's familiarity with Python and data analysis libraries
- Distribution Analysis: Assess statistical understanding and visualization skills
- Type Prediction Model: Evaluate model building skills for classification
- Attack Prediction Model: Assess model building skills for regression
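As a rough illustration of what the Distribution Analysis step asks for, the sketch below summarizes one numeric column using only the standard library. The `summarize` helper and the `toy_attack` values are hypothetical placeholders, not taken from the dataset or from the assessment notebook; in the interview itself a candidate would more likely reach for pandas (for example `DataFrame.describe`).

```python
import statistics


def summarize(values):
    """Return basic distribution statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


# Made-up values for illustration only; not real rows from the dataset.
toy_attack = [40, 50, 60, 70, 80]
summary = summarize(toy_attack)

for name, value in summary.items():
    print(f"{name}: {value}")
```

A symmetric toy sample like this one has equal mean and median; a gap between the two on a real column is one quick hint of skew worth discussing with the candidate.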
- Python 3.9 or higher (supported range: 3.9-3.12)
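Before running the setup, it can help to confirm that the interpreter falls inside the supported range. The following is a minimal sketch of such a check; the `is_supported` function and the two constants are hypothetical and are not part of the repository's setup script.

```python
import sys

# Supported range as stated in the feature list (3.9-3.12).
MIN_VERSION = (3, 9)
MAX_VERSION = (3, 12)


def is_supported(version_info=sys.version_info) -> bool:
    """Return True if the major.minor version is within the supported range."""
    major_minor = tuple(version_info[:2])
    return MIN_VERSION <= major_minor <= MAX_VERSION


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if is_supported():
        print(f"Python {current} is within the supported range.")
    else:
        print(f"Python {current} is outside the supported range 3.9-3.12.")
```

Comparing only the major and minor components means any patch release of a supported minor version passes the check.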
- Clone the repository:
    git clone https://github.com/yourusername/agate.git
    cd agate
- Run the setup script to create a virtual environment and install dependencies:
    python scripts/setup_env.py
- Activate the virtual environment:
  - Windows:
      .venv\Scripts\activate
  - Unix/Linux/macOS:
      source .venv/bin/activate
If you prefer to set up the environment manually:
- Create a virtual environment:
    python -m venv .venv
- Activate the virtual environment:
  - Windows:
      .venv\Scripts\activate
  - Unix/Linux/macOS:
      source .venv/bin/activate
- Install the package:
    pip install -e .
- Activate the virtual environment as described above.
- Open the assessment notebook:
    jupyter notebook notebooks/pokemon_assessment.ipynb
  Or with VSCode:
    code notebooks/pokemon_assessment.ipynb
- Follow the prompts in the notebook to guide the candidate through the assessment.
agate/
├── datasets/ # Dataset files
│ └── pokemon/ # Pokemon dataset
├── interview/ # Main module package
│ ├── data.py # Data loading and processing
│ ├── visualization.py # Plotting and visualization
│ ├── models.py # Model building utilities
│ └── utils.py # General utilities
├── notebooks/ # Jupyter notebooks
│ ├── pokemon_assessment.ipynb # Main assessment
│ └── solutions/ # Reference solutions
├── scripts/ # Utility scripts
│ └── setup_env.py # Environment setup script
└── tests/ # Test suite
The platform uses a consistent naming convention for dataset columns:
- All column names are lowercase (e.g., `attack` instead of `Attack`)
- Multi-word column names use underscores (e.g., `sp_attack` instead of `Sp. Atk`)
- Type columns are named `type1` and `type2`
- Stat columns include: `hp`, `attack`, `defense`, `sp_attack`, `sp_def`, `speed`
- Other columns include: `weight_kg`, `height_m`, `is_legendary`, `generation`
This convention makes the code more pythonic and easier to work with programmatically.
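One way such a convention could be applied to raw headers is sketched below. This is an illustration only, not the code in `interview/data.py`: the `normalize_column` function and the `OVERRIDES` table are hypothetical, and every raw header spelling other than `Attack` and `Sp. Atk` is an assumption.

```python
import re

# Hypothetical overrides for raw headers whose target name is not a plain
# lowercase/underscore transform. Raw spellings other than "Sp. Atk" are assumed.
OVERRIDES = {
    "Sp. Atk": "sp_attack",
    "Sp. Def": "sp_def",
    "Type 1": "type1",
    "Type 2": "type2",
}


def normalize_column(name: str) -> str:
    """Map a raw column header to a lowercase, underscore-separated name."""
    name = name.strip()
    if name in OVERRIDES:
        return OVERRIDES[name]
    # Collapse each run of non-alphanumeric characters into one underscore,
    # then drop any leading or trailing underscores.
    cleaned = re.sub(r"[^0-9A-Za-z]+", "_", name).strip("_")
    return cleaned.lower()


raw_headers = ["Attack", "Sp. Atk", "Type 1", "Weight kg", "is_legendary"]
print([normalize_column(header) for header in raw_headers])
```

With pandas, a function like this could be passed directly as `df.rename(columns=normalize_column)`. The explicit override table is there because names such as `sp_attack` expand an abbreviation and cannot be derived mechanically from `Sp. Atk`.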
pytest
- Fork the repository
- Create your feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add some amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- The Pokemon dataset is sourced from Kaggle and is in the public domain (CC0)