Error happens when I try to load data #359
Comments
@andrewzhang1 I cannot reproduce. This works for me with Python 2.7.11 and Spark 2.0.2 with Pixiedust 1.0.6.
@rajrsingh I attended David Taieb's demo yesterday, and I did not use IBM's Data Science cloud environment; instead, I used my own local Jupyter on my Linux machine. I showed this error to David, and he told me there might be a bug. I'd like to confirm: no Spark needs to be installed (or connected) on my local machine in order to use PixieDust.
By the way, I'm not the only one getting this error with a local Jupyter notebook.
@andrewzhang1 Yes, this is a valid bug that happens when running PixieDust on a plain Python Notebook without Spark. |
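As a workaround until this is fixed, the same CSV can be loaded without Spark by reading it directly with pandas. This is only a sketch, assuming pandas is installed in the local environment; `load_csv` is an illustrative helper, not part of PixieDust:

```python
import pandas as pd

def load_csv(source):
    """pandas accepts local file paths as well as http(s) URLs."""
    return pd.read_csv(source)

# Example (network access required for the URL):
# df = load_csv("https://data.sfgov.org/api/views/vv57-2fgy/rows.csv?accessType=DOWNLOAD")
```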
Update the sampleData() API to successfully load data when the notebook is not using Spark.
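One possible shape for that fix is sketched below. This is not the actual PixieDust implementation; `data_loader` and its parameters are hypothetical. The idea is to check whether a Spark sqlContext is available and fall back to pandas when it is not:

```python
import pandas as pd

def data_loader(path, sql_context=None, schema=None):
    """Load a CSV as a Spark DataFrame when a sqlContext is available,
    otherwise fall back to a plain pandas DataFrame."""
    if sql_context is not None:
        # Spark path, as in the current code (spark-csv reader)
        return (sql_context.read
                .format("com.databricks.spark.csv")
                .option("header", "true")
                .load(path))
    # Plain Python notebook: no Spark, so use pandas instead
    return pd.read_csv(path)
```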
Steps to reproduce the behavior
I used a Jupyter notebook with Anaconda Python 2.7 on my local PC for a project with PixieDust. Here are the errors when I tried to load a data frame:
!pip install pixiedust
import pixiedust
df = pixiedust.sampleData("https://data.sfgov.org/api/views/vv57-2fgy/rows.csv?accessType=DOWNLOAD")
Downloading 'https://data.sfgov.org/api/views/vv57-2fgy/rows.csv?accessType=DOWNLOAD' from https://data.sfgov.org/api/views/vv57-2fgy/rows.csv?accessType=DOWNLOAD
Downloaded 214106 of 214106 bytes
Creating pySpark DataFrame for 'https://data.sfgov.org/api/views/vv57-2fgy/rows.csv?accessType=DOWNLOAD'. Please wait...
Successfully created pySpark DataFrame for 'https://data.sfgov.org/api/views/vv57-2fgy/rows.csv?accessType=DOWNLOAD'
AttributeError Traceback (most recent call last)
in ()
----> 1 df = pixiedust.sampleData("https://data.sfgov.org/api/views/vv57-2fgy/rows.csv?accessType=DOWNLOAD")
/home/azhang/anaconda/lib/python2.7/site-packages/pixiedust/utils/environment.pyc in wrapper(*args, **kwargs)
83 kwargs.pop("fromScala")
84 fromScala = True
---> 85 retValue = func(*args, **kwargs)
86 if fromScala and retValue is not None:
87 from pixiedust.utils.javaBridge import JavaWrapper
/home/azhang/anaconda/lib/python2.7/site-packages/pixiedust/utils/sampleData.pyc in sampleData(dataId)
78 def sampleData(dataId=None):
79 global dataDefs
---> 80 return SampleData(dataDefs).sampleData(dataId)
81
82 class SampleData(object):
/home/azhang/anaconda/lib/python2.7/site-packages/pixiedust/utils/sampleData.pyc in sampleData(self, dataId)
91 return self.loadSparkDataFrameFromSampleData(dataDefs[str(dataId)])
92 elif "https://" in str(dataId) or "http://" in str(dataId) or "file://" in str(dataId):
---> 93 return self.loadSparkDataFrameFromUrl(str(dataId))
94 else:
95 print("Unknown sample data identifier. Please choose an id from the list below")
/home/azhang/anaconda/lib/python2.7/site-packages/pixiedust/utils/sampleData.pyc in loadSparkDataFrameFromUrl(self, dataUrl)
126 "url": dataUrl
127 }
--> 128 return Downloader(dataDef).download(self.dataLoader)
129
130
/home/azhang/anaconda/lib/python2.7/site-packages/pixiedust/utils/sampleData.pyc in download(self, dataLoader)
150 try:
151 print("Creating pySpark DataFrame for '{0}'. Please wait...".format(displayName))
--> 152 return dataLoader(path, self.dataDef.get("schema", None))
153 finally:
154 print("Successfully created pySpark DataFrame for '{0}'".format(displayName))
/home/azhang/anaconda/lib/python2.7/site-packages/pixiedust/utils/sampleData.pyc in dataLoader(self, path, schema)
103 def dataLoader(self, path, schema=None):
104 #TODO: if in Spark 2.0 or higher, use new API to load CSV
--> 105 load = ShellAccess["sqlContext"].read.format('com.databricks.spark.csv')
106 if schema is not None:
107 def getType(t):
AttributeError: 'NoneType' object has no attribute 'read'
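The failing line is 105 in sampleData.py: in a notebook without Spark, `ShellAccess["sqlContext"]` resolves to None, and calling `.read` on None raises the AttributeError above. Below is a minimal sketch of that failure pattern plus a guard that would report it cleanly; the plain dict stands in for the shell namespace, and `load_with_guard` is a hypothetical helper, not PixieDust code:

```python
def load_with_guard(shell_namespace, path):
    # Equivalent of ShellAccess["sqlContext"]: absent in a plain Python kernel
    sql_context = shell_namespace.get("sqlContext")
    if sql_context is None:
        # Fail fast with an actionable message instead of
        # "'NoneType' object has no attribute 'read'"
        raise RuntimeError(
            "No Spark sqlContext found: sampleData() currently requires "
            "Spark, or needs a non-Spark fallback."
        )
    return sql_context.read.format("com.databricks.spark.csv").load(path)
```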