Skip to content

victorhrls/scikit-learn-intro

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

6 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

πŸ“˜ Scikit-Learn Intro Study Plan

This repository contains a concise, hands-on study plan to get comfortable applying Scikit-Learn for practical machine learning tasks. The focus is on understanding the core API, building end-to-end workflows, and preparing for code interviews with minimal but effective coverage.


🎯 Goals

  • Understand the Scikit-Learn API and the fit/transform/predict pattern.
  • Build reproducible pipelines for preprocessing and modeling.
  • Evaluate models with proper train/test splits and metrics.
  • Tune models quickly with cross-validation and simple search.
  • Apply to small tabular datasets end-to-end.

🧱 Core Concepts to Learn

  • Estimators, Transformers, Pipelines
  • Train/test split, cross-validation
  • Feature preprocessing: scaling, encoding, imputation
  • Model evaluation: classification vs regression metrics
  • Model selection: GridSearchCV, RandomizedSearchCV
  • Saving/loading models (joblib)

πŸ› οΈ Minimal Toolkit

  • Pipeline, ColumnTransformer
  • StandardScaler, OneHotEncoder, SimpleImputer
  • train_test_split, cross_val_score
  • GridSearchCV, RandomizedSearchCV
  • Baselines: DummyClassifier, DummyRegressor
  • Models: LogisticRegression, RandomForestClassifier, RandomForestRegressor, GradientBoostingRegressor

πŸ“š Study Sequence (3–5 hours)

  1. Quick API tour with a toy dataset (iris, boston alternative: fetch_california_housing).
  2. Preprocessing with ColumnTransformer (numeric vs categorical).
  3. Pipelines: preprocessing + model in one object.
  4. Evaluation: train_test_split, cross_val_score, metrics (accuracy, f1, roc_auc, rmse/mae).
  5. Hyperparameter tuning with GridSearchCV/RandomizedSearchCV.
  6. Export model with joblib and reload for inference.

πŸ§ͺ Template: Tabular Classification

About

study on scikit-learn to apply to other projects

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published