Describe the Bug
The current dependency constraint for datasets is loose, so newer versions such as datasets==4.8.5 may be installed. In that version, ShufflingConfig can no longer be imported from datasets, which causes flame/data.py to fail on import:
ImportError: cannot import name 'ShufflingConfig' from 'datasets'
and crashes the training process.
Suggested local fix:
pip install "datasets==4.5.0"
Easy dependency fix:
Pin datasets in pyproject.toml to the version used in the local fix above, which is the last version that supports ShufflingConfig.
Steps to Reproduce the Bug
Install a newer datasets version, for example:
pip install "datasets==4.8.5"
Then run:
python - <<'PY'
import flame.data
PY
Actual result:
ImportError: cannot import name 'ShufflingConfig' from 'datasets'
This also causes training to fail during startup when flame.data is imported.
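To confirm whether a given environment is affected before starting training, a small check along these lines may help. This is only a sketch: the helper name check_symbol is hypothetical and not part of flame or datasets, and it uses hasattr on the imported package as an approximation of what `from datasets import ShufflingConfig` does.

```python
import importlib
from importlib import metadata
from typing import Optional, Tuple


def check_symbol(package: str, symbol: str) -> Tuple[Optional[str], bool]:
    """Return (installed version or None, whether `symbol` is exposed by `package`).

    Hypothetical helper for diagnosing this issue; not part of flame or datasets.
    """
    try:
        # Version of the installed distribution, if any.
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        version = None
    try:
        module = importlib.import_module(package)
    except ImportError:
        # Package is missing or fails to import at all.
        return version, False
    # Approximates `from <package> import <symbol>`.
    return version, hasattr(module, symbol)


if __name__ == "__main__":
    version, ok = check_symbol("datasets", "ShufflingConfig")
    print(f"datasets version: {version}, ShufflingConfig importable: {ok}")
```

If the second value is False, the import in flame/data.py can be expected to fail in the same way as shown above.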
Expected Behavior
flame.data should import successfully with the dependency versions allowed by pyproject.toml.
Environment Information
- Torch: 2.12.0+cu130
- Triton: 3.7.0