BUG: Apparent Canopy IPython bugs with PySide bindings. #5116

Closed
jtratner opened this issue Oct 5, 2013 · 7 comments

@jtratner
Contributor

jtratner commented Oct 5, 2013

Opening up a separate issue about some Canopy IPython bugs that pandas (or the new BigQuery IO module) may be triggering. @azbones, @jacobschaer, @sean-schaefer and others, let's work here to figure it out.

Related comments in #4140.

@jtratner
Contributor Author

jtratner commented Oct 5, 2013

To start with, @azbones can you clarify something: does the behavior you cite only occur when using the BigQuery module or does it happen with pandas generally? What happens if you remove the print statements from the PR?

@jtratner
Contributor Author

jtratner commented Oct 5, 2013

Snippet of @azbones's report here:

Hey @sean-schaefer if you have some time, could you try a clean install of this branch of Pandas using the standard 2.7.x and try the read_gbq and write_gbq? Other than any Pandas dependencies, you should only need to easy_install bigquery. I've tested it on Canopy on Mac OSX and Windows 7 and using Anaconda on Windows 7. I have a trouble ticket in with Enthought and they gave me some guidance on how to isolate the issue we saw with the Canopy IPython editor throwing that "pkgutil.py find_module() takes exactly 3 arguments (2 given)" error, which I will try tomorrow.

If you could try the base 2.7.x Python with whatever OSs you have that would be great.


@sean-schaefer it seems to be a problem just with the Canopy app editor version of IPython, as ipython --pylab=qt, which uses the Qt console, also does not seem to throw that error. They told me that is the IPython mode that the app runs in, so I'm not sure what is going on. I'll post when I know more. You might try just ipython and then ipython --pylab=qt.

@azbones

azbones commented Oct 8, 2013

@jtratner I've only seen it with the BigQuery module so far and not Pandas in general. Specifically, it seems to occur only within the Enthought Canopy distribution, when using the IPython editor launched from the Canopy UI. I've seen it on both Windows 7 and Mac OSX. I updated my ticket with Enthought and was waiting to see if they had any other thoughts. Initially, they thought it was the QT_API binding, since they use pyside, but I think I have eliminated that as the sole potential cause.

Here is the last message I left on the Enthought ticket I opened:

I finally got to try this again. I tried it on a cleaner Win 7 system that had 64-bit Canopy Version: 1.1.0.1371 installed as the default Python environment. What I found was:

Setting the QT binding:

set QT_API=pyqt

Launching IPython three different ways:

ipython
ipython qtconsole
ipython qtconsole --pylab=qt

In all three configurations, our code worked.

Changing the QT binding to:

set QT_API=pyside

All three configurations of IPython also worked correctly with our code. I know per your earlier link (https://support.enthought.com/entries/22305234-IPython-updating-this-package-affects-IPython-in-terminal-but-not-in-Canopy-application) that the command line IPython and the IPython within the Canopy GUI may be different. version yields 1.7.1 in both the Canopy App version and the pylab command line IPython.

So, pyside in the command line doesn't seem to be the issue. The other thing I tried was changing the Canopy Pylab backend preference to 'wx'. This, like disabling Pylab in the preferences, also allows our code to work without the error. I've never used wx, so I'm not sure whether it is a viable alternative to Qt.
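
For anyone comparing the Canopy GUI session against a command line session, here is a rough sketch of how one might check which Qt binding a running session actually picked up. The binding module names (PyQt4, PySide) are assumptions based on the packages mentioned in this thread, and describe_qt_session is just a hypothetical helper name.

```python
import os
import sys

# Assumed module names for the Qt bindings discussed in this thread.
KNOWN_BINDINGS = ("PyQt4", "PySide")


def describe_qt_session(environ=None, modules=None):
    """Report the QT_API setting and which Qt bindings are already imported."""
    environ = os.environ if environ is None else environ
    modules = sys.modules if modules is None else modules
    loaded = [name for name in KNOWN_BINDINGS if name in modules]
    return {"QT_API": environ.get("QT_API"), "loaded_bindings": loaded}


if __name__ == "__main__":
    # Run this in each IPython flavour and compare the results.
    print(describe_qt_session())
```

If the two sessions report different loaded bindings despite the same QT_API value, that would at least point at the GUI embedding rather than the environment variable.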

The issue seems to be isolated to just IPython within the Canopy UI. I have several workarounds:

- disable Pylab
- change backend to wx
- (the overly hackish) insert an empty list like so "loader = importer.find_module(fullname,[])" in C:\Users\collin\AppData\Local\Enthought\Canopy\App\appdata\canopy-1.1.0.1371.win-x86_64\Lib\pkgutil.py
- use qtconsole from the command line
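
To illustrate why the pkgutil.py hack makes the error go away, here is a minimal sketch. It assumes the error comes from some importer whose find_module() requires a second path argument; which importer that actually is inside Canopy is not known here. PathRequiredImporter, find_loader_strict, and find_loader_patched are hypothetical names made up for the illustration.

```python
class PathRequiredImporter(object):
    """Hypothetical stand-in for an importer whose find_module() needs a path."""

    def find_module(self, fullname, path):
        return None  # pretend this importer does not handle the module


def find_loader_strict(importers, fullname):
    """Calls find_module(fullname) with no path, like the unpatched code."""
    for importer in importers:
        loader = importer.find_module(fullname)
        if loader is not None:
            return loader
    return None


def find_loader_patched(importers, fullname):
    """Same loop, but passes an empty list as the path (the workaround)."""
    for importer in importers:
        loader = importer.find_module(fullname, [])
        if loader is not None:
            return loader
    return None


importers = [PathRequiredImporter()]

try:
    find_loader_strict(importers, "bigquery_client")
    strict_failed = False
except TypeError:
    # The importer demanded a path argument that was never supplied.
    strict_failed = True

patched_result = find_loader_patched(importers, "bigquery_client")
```

The patched call only silences the TypeError. Whether an empty path list is a sensible value for the real importer is a separate question, which is part of why the hack feels fragile.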

@jtratner
Contributor Author

jtratner commented Oct 8, 2013

It's incredible to me that the code could do that. Can you tell if it's the
pandas module vs the Google library?

@jtratner
Contributor Author

jtratner commented Oct 8, 2013

I'm going to see if I can reproduce it locally. Btw - have you tried
IPython notebook? Interactive like QT console but it's much nicer to use
(with better tab completion)

@azbones

azbones commented Oct 8, 2013

@jtratner here is the stack trace. I'm guessing it has something to do with the IPython within the Canopy app using different components than the command line. It took a while to isolate it because we were all using different flavors of IPython.

I'll try it with notebook as I haven't yet.

Note: gbq.read_gbq is our module. Also, we have seen this error on both Mac OSX and Windows 7 so far.

gbq.read_gbq('SELECT * FROM [publicdata:samples.shakespeare]') 
--------------------------------------------------------------------------- 
TypeError Traceback (most recent call last) 
<ipython-input-2-c86774d931e9> in <module>() 
----> 1 gbq.read_gbq('SELECT * FROM [publicdata:samples.shakespeare]')

/Users/uschaj3/Repositories/pandas/pandas/io/gbq.pyc in read_gbq(query, project_id, destination_table, index_col, col_order, **kwargs) 
452 raise 
453 try: 
--> 454 job = client.Query(**query_args) 
455 except bigquery_client.BigqueryInvalidQueryError as Ex: 
456 print('Error Parsing Query: ' + str(query))

/Users/uschaj3/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/bigquery-2.0.15-py2.7.egg/bigquery_client.pyc in Query(self, query, destination_table, create_disposition, write_disposition, priority, preserve_nulls, allow_large_results, dry_run, use_cache, min_completion_ratio, **kwds) 
1923 request = {'query': query_config} 
1924 _ApplyParameters(request, dry_run=dry_run) 
-> 1925 return self.ExecuteJob(request, **kwds) 
1926 
1927 def Load(self, destination_table_reference, source,

/Users/uschaj3/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/bigquery-2.0.15-py2.7.egg/bigquery_client.pyc in ExecuteJob(self, configuration, sync, project_id, upload_file, job_id) 
1588 job = self.RunJobSynchronously( 
1589 configuration, project_id=project_id, upload_file=upload_file, 
-> 1590 job_id=job_id) 
1591 else: 
1592 job = self.StartJob(

/Users/uschaj3/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/bigquery-2.0.15-py2.7.egg/bigquery_client.pyc in RunJobSynchronously(self, configuration, project_id, upload_file, job_id) 
1573 upload_file=None, job_id=None): 
1574 result = self.StartJob(configuration, project_id=project_id, 
-> 1575 upload_file=upload_file, job_id=job_id) 
1576 if result['status']['state'] != 'DONE': 
1577 job_reference = BigqueryClient.ConstructObjectReference(result)

/Users/uschaj3/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/bigquery-2.0.15-py2.7.egg/bigquery_client.pyc in StartJob(self, configuration, project_id, upload_file, job_id) 
1478 filename=upload_file, mimetype='application/octet-stream', 
1479 resumable=resumable) 
-> 1480 result = self.apiclient.jobs().insert( 
1481 body=job_request, media_body=media_upload, 
1482 projectId=project_id).execute()

/Users/uschaj3/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/bigquery-2.0.15-py2.7.egg/bigquery_client.pyc in apiclient(self) 
469 discovery_document = pkgutil.get_data( 
470 'bigquery_client', 'discovery/%s.bigquery.%s.rest.json' 
--> 471 % (_ToFilename(self.api), self.api_version)) 
472 except IOError: 
473 discovery_document = None

/Applications/Canopy.app/appdata/canopy-1.1.0.1371.macosx-x86_64/Canopy.app/Contents/lib/python2.7/pkgutil.pyc in get_data(package, resource) 
576 """ 
577 
--> 578 loader = get_loader(package) 
579 if loader is None or not hasattr(loader, 'get_data'): 
580 return None

/Applications/Canopy.app/appdata/canopy-1.1.0.1371.macosx-x86_64/Canopy.app/Contents/lib/python2.7/pkgutil.pyc in get_loader(module_or_name) 
462 else: 
463 fullname = module_or_name 
--> 464 return find_loader(fullname) 
465 
466 def find_loader(fullname):

/Applications/Canopy.app/appdata/canopy-1.1.0.1371.macosx-x86_64/Canopy.app/Contents/lib/python2.7/pkgutil.pyc in find_loader(fullname) 
473 """ 
474 for importer in iter_importers(fullname): 
--> 475 loader = importer.find_module(fullname) 
476 if loader is not None: 
477 return loader

TypeError: find_module() takes exactly 3 arguments (2 given)
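
One way to narrow this down might be to list which importers have a find_module() that needs more than the module name. The sketch below is written against inspect.signature, which is a Python 3 API; on the 2.7 interpreter used in this thread, inspect.getargspec would be the nearest equivalent. The helper name importers_needing_path is hypothetical.

```python
import inspect


def importers_needing_path(importers):
    """Return importers whose find_module() has required args beyond fullname."""
    suspects = []
    for importer in importers:
        find_module = getattr(importer, "find_module", None)
        if find_module is None:
            continue
        try:
            params = inspect.signature(find_module).parameters.values()
        except (TypeError, ValueError):
            continue  # signature is not introspectable, skip it
        required = [
            p for p in params
            if p.default is inspect.Parameter.empty
            and p.kind in (inspect.Parameter.POSITIONAL_ONLY,
                           inspect.Parameter.POSITIONAL_OR_KEYWORD)
        ]
        # More than one required argument means find_module(fullname) fails.
        if len(required) > 1:
            suspects.append(importer)
    return suspects
```

Running something like importers_needing_path(sys.meta_path) inside the Canopy GUI session and again in a command line session could show whether the GUI installs an extra import hook.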

@jtratner
Contributor Author

jtratner commented Oct 8, 2013

Okay, this error pretty clearly comes from the call to pkgutil in Google's bigquery module (line 469: discovery_document = pkgutil.get_data(...)), so there's no way for pandas to fix it, and you'd encounter it if you were using Google's bigquery directly. I'm going to close this, because we have no control over it. If you/we can figure out a minimal working example of the error, it might be useful to pass it along to the bigquery people.
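
As a starting point for such a minimal example, something like the sketch below could be run inside the Canopy GUI session. It assumes the failing importer is consulted for any top-level package, not just bigquery_client, which has not been confirmed here. probe_get_data is a hypothetical helper, and "json" is just a convenient standard-library package to probe with.

```python
import pkgutil


def probe_get_data(package, resource):
    """Call pkgutil.get_data() and report whether it raised a TypeError."""
    try:
        data = pkgutil.get_data(package, resource)
    except TypeError as exc:
        return ("TypeError", str(exc))
    return ("ok", None if data is None else len(data))


if __name__ == "__main__":
    # No pandas or bigquery involved, only the standard library.
    print(probe_get_data("json", "__init__.py"))
```

If this reports a TypeError in the Canopy GUI but "ok" from the command line, the report to pass along would not need pandas or bigquery in it at all.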

@jtratner jtratner closed this as completed Oct 8, 2013