Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

change DataIterator to work on get_example and get_batch and add methods... #40

Closed
wants to merge 8 commits into from

Conversation

mducoffe
Copy link

@mducoffe mducoffe commented Mar 6, 2015

... and batch attributes to Transformer object

…ods and batch attributes to Transformer object
@mducoffe
Copy link
Author

mducoffe commented Mar 6, 2015

Just to know if the code was okay. Also do I have to call get_epoch_iterator on every transformer class with example or batch stream for tests ? (Have also to correct indentation error)

@@ -23,9 +23,15 @@ def __iter__(self):

def __next__(self):
if self.request_iterator is not None:
data = self.data_stream.get_data(next(self.request_iterator))
if self.data_stream.batch :
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mmmm, maybe it's better to move this logic to the Transformer base class. That way child classes can override it if necessary and don't need to duplicate methods if the implementation for batch/example is the same.

@bartvm
Copy link
Member

bartvm commented Mar 6, 2015

There are a few transformers which can share the implementation of their get_example and get_batch method, so here's how I would propose doing it:

  • Implement a get_data method on the Transformer base class that calls get_example and get_batch (like the logic you implemented in the DataIterator now). Create two get_example and get_batch methods which raise a NotImplementedError when called
  • Transformers that have the same logic for batch and example just override the get_data method
  • Transformers for which the methods are different, or which only implement one, override get_example/get_batch instead.

Hope that was clear

def __init__(self, data_stream, mapping, add_sources=None):
super(Mapping, self).__init__(data_stream)
def __init__(self, data_stream, mapping, batch=False, add_sources=None):
super(Mapping, self).__init__(data_stream, batch)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you can remove it here too.

@bartvm
Copy link
Member

bartvm commented Mar 9, 2015

Seems like there's a merge conflict, so you'll need to rebase. Let me know if you need some help doing that!

Melanie Ducoffe added 5 commits March 10, 2015 09:31
…a_from_example, get_batch, swap Unpack and Batch, remove batch from the inputs of the classes derived from Transformer
…to CCW_Transformer

Conflicts:
	fuel/transformers/__init__.py
@mducoffe
Copy link
Author

@bartvm it is eventually done !

@mducoffe
Copy link
Author

@bartvm do u know what is coverage test in the github tests ?

@bartvm
Copy link
Member

bartvm commented Mar 16, 2015

Coverage is the number of lines of code that are run by unit tests divided by the total number of lines of code. If it decreases it means that this percentage dropped. In this case it's because you added 9 new lines of code to transformers/__init__.py, but only 7 of those are tested, so the coverage decreases slightly. There's no need to worry about this too much, it's mostly to remind people that their code needs to be tested, and if the drop is very large then it is a sign that tests need to be written. You can actually click on the details and see exactly which line is currently untested.

@mducoffe
Copy link
Author

@bartvm is there any way to see which lines are not covered by the tests ? (sources are not available)

@bartvm
Copy link
Member

bartvm commented Mar 16, 2015

Not sure, never seen that before! It might be because of the merge conflicts earlier. You might want to try rebasing?

@@ -22,10 +22,14 @@ class Transformer(AbstractDataStream):
this attribute. Use it to access data from the wrapped data stream
by calling ``next(self.child_epoch_iterator)``.

batch : boolean
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

batch_input instead of batch, and there should be no new line between parameter descriptions.

@bartvm
Copy link
Member

bartvm commented Mar 20, 2015

Wait, I just realised you're working in #45 now, so I'll close this one then?

@bartvm bartvm closed this Mar 20, 2015
@mducoffe
Copy link
Author

yes please. And by the way thanks for the dropout on blocks, it is really
convenient to use with the computation graph

2015-03-20 13:48 GMT-04:00 Bart van Merriënboer notifications@github.com:

Wait, I just realised you're working in #45
#45 now, so I'll close this one then?


Reply to this email directly or view it on GitHub
#40 (comment).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants