Fix #62 Optional path argument in JoinMapper #65

Open
wants to merge 5 commits into
from

To get the source path inside a mapper, just add **kwargs to its argument list. Here are some examples.

@dumbo.decor.primary
def map_primary(key, value, **kwargs):
  key, value = value.strip().split('\t')
  print >> sys.stderr, key, value, kwargs['path']
  yield key, value

Or you can name the desired argument directly:

@dumbo.decor.primary
def map_primary(key, value, path, **kwargs):
  key, value = value.strip().split('\t')
  print >> sys.stderr, key, value, path
  yield key, value

Callable instances are also supported:

@dumbo.decor.secondary
class MapSecondary(object):
  def __call__(self, key, value, path, **kwargs):
    key, value = value.strip().split(' ')
    print >> sys.stderr, value, path
    yield key, value

The previous mapper interface still works as well:

@dumbo.decor.primary
def map_primary(key, value):
  key, value = value.strip().split('\t')
  yield key, value

This approach makes it easy to extend the interface to pass other arguments in the future.
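
To illustrate the idea (this is not the code in this PR, just a minimal sketch), a wrapper could inspect the mapper's signature and pass only the extra arguments it can accept. The helper name `wrap_mapper` and the file name `a.txt` are hypothetical, and the sketch uses `inspect.signature`, which assumes Python 3:

```python
import inspect


def wrap_mapper(mapper, **extra):
    """Hypothetical helper: bind only the extra arguments the mapper accepts."""
    params = inspect.signature(mapper).parameters
    takes_kwargs = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    if takes_kwargs:
        # A **kwargs mapper receives every extra argument.
        accepted = dict(extra)
    else:
        # Otherwise pass only the extras that are named in the signature.
        accepted = {k: v for k, v in extra.items() if k in params}
    if not accepted:
        # Old-style mapper(key, value): return it unchanged, no added call.
        return mapper

    def wrapped(key, value):
        return mapper(key, value, **accepted)

    return wrapped


def old_style(key, value):
    yield key, value


def with_path(key, value, path):
    yield key, path


print(list(wrap_mapper(with_path, path="a.txt")(1, "v")))
print(wrap_mapper(old_style, path="a.txt") is old_style)
```

With a scheme like this, adding another argument later only means passing one more keyword to the wrapper; existing mappers keep working because they never see arguments they did not ask for.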

Owner

klbostee commented Jan 8, 2013

Sounds good! Will try to find some time to review and merge this soonish.

Did you measure the impact of this on performance in any way? I'm a bit worried that doing an additional (lambda) function call whenever the mapper gets called could make things substantially slower...
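
One rough way to estimate that overhead is a `timeit` micro-benchmark comparing a direct call with the same call routed through a lambda. This is only a sketch under assumptions: `direct`, `via_lambda`, `measure`, and the path `part-00000` are made-up names for illustration, the mapper here is a plain function rather than a generator, and real numbers would have to come from an actual job.

```python
import timeit


def direct(key, value, path=None):
    # Stand-in for a mapper body; deliberately trivial.
    return key, value


def via_lambda(path):
    # Assumed shape of the extra indirection: one more call per record.
    return lambda key, value: direct(key, value, path=path)


def measure(n=100000):
    wrapped = via_lambda("part-00000")
    # Both variants go through the same timeit closure, so the measurement
    # overhead is identical and the difference reflects the extra call.
    t_direct = timeit.timeit(lambda: direct(1, "a\tb"), number=n)
    t_wrapped = timeit.timeit(lambda: wrapped(1, "a\tb"), number=n)
    return t_direct, t_wrapped


if __name__ == "__main__":
    t_direct, t_wrapped = measure()
    print("direct: %.4fs  wrapped: %.4fs" % (t_direct, t_wrapped))
```

Since a real mapper also parses and yields records, the relative cost of the extra call in practice is likely smaller than what such a micro-benchmark shows.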
