Failed scheduling due to utils.py 'basestring' is not defined? #17

Closed
trzytematyczna opened this issue May 16, 2017 · 2 comments

trzytematyczna commented May 16, 2017

Hi,
I am trying to use your parser, but when I run python pipeline.py BuildDataset --start 2000 --end 2001 --local-scheduler from the pipeline directory, I get an error that comes down to "NameError: name 'basestring' is not defined" in util.py. I looked at the code, but to be honest I can't work out what basestring is supposed to be. Any pointers on what to check or how to solve this would be much appreciated!

Error:

DEBUG: Checking if BuildDataset(start=2000, end=2001) is complete
/Users/admin/anaconda/lib/python3.5/site-packages/luigi/worker.py:328: UserWarning: Task BuildDataset(start=2000, end=2001) without outputs has no custom complete() method
  is_complete = task.complete()
DEBUG: Checking if BuildLCCAuthorRepdocCorpusTfidf(start=2000, end=2001) is complete
INFO: Informed scheduler that task   BuildDataset_2001_2000_429339e3d6   has status   PENDING
WARNING: Will not run BuildLCCAuthorRepdocCorpusTfidf(start=2000, end=2001) or any dependencies due to error in complete() method:
Traceback (most recent call last):
  File "/Users/admin/anaconda/lib/python3.5/site-packages/luigi/worker.py", line 328, in check_complete
    is_complete = task.complete()
  File "/Users/admin/anaconda/lib/python3.5/site-packages/luigi/task.py", line 533, in complete
    outputs = flatten(self.output())
  File "/Users/admin/Desktop/DBLP_parser/dblp-master/pipeline/util.py", line 39, in output
    if isinstance(self.base_paths, basestring):
NameError: name 'basestring' is not defined

INFO: Informed scheduler that task   BuildLCCAuthorRepdocCorpusTfidf_2001_2000_429339e3d6   has status   UNKNOWN
INFO: Done scheduling tasks
INFO: Running Worker with 1 processes
DEBUG: Asking scheduler for work...
DEBUG: Done
DEBUG: There are no more tasks to run at this time
DEBUG: There are 1 pending tasks possibly being run by other workers
DEBUG: There are 1 pending tasks unique to this worker
DEBUG: There are 1 pending tasks last scheduled by this worker
INFO: Worker Worker(salt=038425231, workers=1, host=Monikas-MacBook-Pro.local, username=admin, pid=75675) was stopped. Shutting down Keep-Alive thread
INFO: 
===== Luigi Execution Summary =====

Scheduled 2 tasks of which:
* 1 failed scheduling:
    - 1 BuildLCCAuthorRepdocCorpusTfidf(start=2000, end=2001)
* 1 were left pending, among these:
    * 1 had dependencies whose scheduling failed:
        - 1 BuildDataset(start=2000, end=2001)

Did not run any tasks
This progress looks :( because there were tasks whose scheduling failed

===== Luigi Execution Summary =====

Owner

macks22 commented May 16, 2017

This is due to an incompatibility with Python 3. In Python 2, basestring was the common base type of str and unicode; it was removed in Python 3, where str became the Unicode text type. The easiest fix is to run the pipeline with Python 2.

If you'd like, you can replace that line with if isinstance(self.base_paths, str):. There may be other incompatibilities with Python 3 after this though. If you're up to it, I'm glad to merge in a PR that gives full compatibility. You can try using the 2to3 tool to help.
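If you want the check to work on both Python 2 and Python 3 in the meantime, something along these lines may help. This is only a rough sketch: string_types and normalize_paths are hypothetical names made up for illustration, and wrapping a single string in a list is an assumption about what the code after the isinstance check is meant to do, not a copy of util.py.

```python
# Compatibility shim: pick the right "string" base type for the running interpreter.
try:
    string_types = (basestring,)  # Python 2: common base of str and unicode
except NameError:
    string_types = (str,)  # Python 3: basestring no longer exists


def normalize_paths(base_paths):
    """Hypothetical helper: return base_paths as a list of paths.

    Assumes a single string should be treated as a one-element list;
    the real handling in util.py may differ.
    """
    if isinstance(base_paths, string_types):
        return [base_paths]
    return list(base_paths)
```

With this in place, the isinstance check would use string_types instead of basestring, so the same source file runs under either interpreter.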

Owner

macks22 commented May 22, 2017

Closing due to inactivity.

macks22 closed this as completed May 22, 2017