AddyM commented on Oct 18, 2025

Description

Fixes #267

This PR resolves the ImportError: cannot import name 'tosequence' from 'sklearn.utils' that occurs with scikit-learn >= 1.7.0.

Root Cause

The tosequence utility function was deprecated in scikit-learn 1.5 and completely removed in version 1.7.0, breaking sklearn-pandas on import.

Changes Made

In sklearn_pandas/pipeline.py:

  • Removed the import of sklearn.utils.tosequence
  • Added steps = list(steps) at the beginning of TransformerPipeline.__init__
  • Changed self.steps = tosequence(steps) to self.steps = steps
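
As a rough illustration of the pattern described above, here is a minimal sketch. PipelineSketch is a hypothetical stand-in written for this example, not the actual TransformerPipeline class, and it leaves out everything else the real constructor does. It only shows steps being materialized with list() before the zip() call and then assigned directly.

```python
class PipelineSketch:
    """Hypothetical stand-in for TransformerPipeline (illustration only)."""

    def __init__(self, steps):
        # Materialize the input once, so lists, tuples, and generators
        # all end up as a plain list before anything iterates over them.
        steps = list(steps)
        # This pass would exhaust a generator if it had not been copied.
        names, estimators = zip(*steps)
        self.names = names
        # Direct assignment replaces the former tosequence(steps) call.
        self.steps = steps


def make_steps():
    # Placeholder objects stand in for real transformers.
    yield ("scale", object())
    yield ("model", object())


pipe = PipelineSketch(make_steps())
step_names = [name for name, _ in pipe.steps]
```

With a generator as input, pipe.steps still holds both steps, because the copy was made before zip() consumed anything.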

In tests/test_pipeline.py:

  • Added regression tests for list input
  • Added regression tests for tuple input
  • Added regression tests for generator input
  • Added test verifying steps is always a list type

Why This Works

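The tosequence utility cast an iterable to a sequence: inputs that were already sequences were returned unchanged, and other iterables, such as generators, were converted to a list. By calling list(steps) at the start of __init__, we:

  • Cover the same inputs as tosequence, with one difference: steps is now always stored as a list (a tuple is copied into a list rather than kept as-is)
  • Handle generators/iterators correctly (consumed once, up front, before zip() iterates over them)
  • Remove the dependency on a utility that sklearn no longer ships
  • Work with any scikit-learn version, since list() is a Python builtin
  • Simplify the codebase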
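
The point about consuming generators early can be seen in isolation. The snippet below is a small sketch that uses only Python builtins; gen_steps is a hypothetical helper invented for the example and is not part of sklearn-pandas.

```python
def gen_steps():
    # Hypothetical (name, value) pairs standing in for pipeline steps.
    yield ("a", 1)
    yield ("b", 2)


# Without materializing: zip() exhausts the generator on the first pass,
# so a later attempt to read the steps finds nothing left.
g = gen_steps()
names_first, _ = zip(*g)
leftover = list(g)

# With list() up front: the same data can be traversed more than once.
steps = list(gen_steps())
names_second, _ = zip(*steps)
kept = list(steps)
```

Here leftover is empty, while kept still contains both pairs, which is why the conversion has to happen before the zip() call rather than after it.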

Testing

  • All existing tests pass (4/4 original tests)
  • All new regression tests pass (4/4 new tests)
  • Manually tested with sklearn 1.7.2
  • Verified backward compatibility
  • Tested with list, tuple, and generator inputs

Test Results

8 passed in 0.76s

Compatibility

  • scikit-learn >= 1.7.0 (broken before this fix)
  • scikit-learn 1.5.x - 1.6.x (works before and after)
  • scikit-learn < 1.5.0 (works before and after)

Checklist

  • Code changes implement the fix
  • Tests added for regression prevention
  • All tests passing
  • No breaking changes to public API
  • Backward compatible with older sklearn versions

- Remove deprecated sklearn.utils.tosequence import (removed in sklearn 1.7)
- Convert steps to list at start of __init__ to handle all input types
- Replace tosequence(steps) with direct list assignment
- Fixes compatibility with scikit-learn >= 1.7.0
- Maintains backward compatibility with older sklearn versions
- Add regression tests for list, tuple, and generator inputs
- Verify steps attribute is always a list type

The tosequence utility was deprecated in sklearn 1.5 and removed in 1.7.
Using list() directly provides the same functionality without depending
on sklearn internals.

Fixes scikit-learn-contrib#267

Development

Successfully merging this pull request may close these issues.

ImportError: cannot import name 'tosequence' from 'sklearn.utils' with sklearn >=1.7.0