kedro ipython immediately fails for starter=spaceflights-pyspark-viz #210

Closed
gpierard opened this issue Jan 23, 2024 · 6 comments

gpierard commented Jan 23, 2024

Description

I installed the dependencies from requirements.txt and get this error immediately after running kedro ipython:
kedro.io.core.DatasetError: Class 'spark.SparkDataset' not found, is this a typo?

Is any additional configuration needed for starter=spaceflights-pyspark-viz?

I tried this on Windows with Kedro 0.19.1 and 0.19.2; both yield the same error. kedro-datasets is 1.5.1.

gpierard changed the title from "kedro ipython immediately fails" to "kedro ipython immediately fails for starter=spaceflights-pyspark-viz" on Jan 23, 2024
@datajoely
Contributor

datajoely commented Jan 23, 2024

I answered this on Stack Overflow as well, but it would be easiest to track here.

If you explicitly run pip install kedro-datasets[spark.SparkDataset], does it work? This should be part of the requirements.txt.

@merelcht
Member

Hi @gpierard, thanks for flagging this issue. Could you provide some more info so it's easier for the team to find out what's going on?

  • Which kedro version are you using?
  • If you have kedro-datasets installed, which version are you using?
  • Which python version are you using?
  • Which OS (Unix/Windows)?

Thanks!
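
The details requested above can be gathered in one go. This is a minimal sketch using only the Python standard library; the helper name collect_environment_info is hypothetical and not part of Kedro, and it assumes the script runs in the same environment as the project.

```python
import platform
from importlib import metadata


def collect_environment_info():
    """Return the requested version details as a plain dict (hypothetical helper)."""
    info = {}
    for package in ("kedro", "kedro-datasets"):
        try:
            info[package] = metadata.version(package)
        except metadata.PackageNotFoundError:
            # The package is missing from the active environment.
            info[package] = "not installed"
    info["python"] = platform.python_version()
    info["os"] = platform.system()  # e.g. "Windows", "Linux", "Darwin"
    return info


if __name__ == "__main__":
    for name, value in collect_environment_info().items():
        print(f"{name}: {value}")
```

Pasting the printed lines into the issue answers all four questions at once.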

@astrojuanlu
Member

IIUC from kedro-org/kedro#3545 (comment), you are on Kedro 0.17.6 and Python 3.8.5. Am I right, @gpierard?

Any chance you can upgrade to a newer version? 0.17.6 is very old.

@gpierard
Author

gpierard commented Jan 23, 2024

Thanks for your answers. Sorry, I should have mentioned that I tried this on Windows with Kedro 0.19.1 and 0.19.2, both yielding the same issue. kedro-datasets is 1.5.1, and kedro-datasets[spark.SparkDataset] is installed as well. This issue is fully separate from the others mentioned.

@merelcht
Member

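@gpierard The datasets were only renamed in kedro-datasets 1.7.0, so earlier versions (including your 1.5.1) only provide the class under the old name, spark.SparkDataSet, while the starter's catalog asks for spark.SparkDataset. Can you try using the latest version, 2.0.0?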

@gpierard
Author

I confirm that kedro-datasets 2.0.0 solves the issue and that the Spark session is correctly configured. Thanks.
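
For anyone hitting the same error, the version dependence discussed in this thread can be sketched as follows. The function names are hypothetical helpers, not part of Kedro or kedro-datasets; the only fact taken from the thread is that the rename to the "Dataset" spelling landed in kedro-datasets 1.7.0. The parser assumes plain numeric release versions such as 1.5.1 and does not handle pre-release suffixes.

```python
def parse_version(version):
    """Turn '1.5.1' into (1, 5, 1); pads short versions like '1.7' with zeros."""
    parts = [int(part) for part in version.split(".")[:3]]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def expected_spark_type(kedro_datasets_version):
    """Return the catalog 'type' spelling this kedro-datasets version recognises.

    Assumption from the thread: the new spelling exists only from 1.7.0 onwards.
    """
    if parse_version(kedro_datasets_version) < (1, 7, 0):
        return "spark.SparkDataSet"  # old spelling, capital "S" in "Set"
    return "spark.SparkDataset"  # new spelling used by the starter's catalog


if __name__ == "__main__":
    for version in ("1.5.1", "2.0.0"):
        print(version, "->", expected_spark_type(version))
```

With kedro-datasets 1.5.1 the helper returns the old spelling, which is why a catalog entry using spark.SparkDataset fails with "not found, is this a typo?" until the package is upgraded.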
