
RL4J refac: Added some observation transform classes #7958

Merged
merged 4 commits into from Jul 20, 2019

Conversation

@aboulang2002 (Contributor) commented Jun 29, 2019

What changes were proposed in this pull request?

This PR is a step toward a refactoring that I'd like to do on RL4J. Currently, transforms applied to observations are scattered throughout the code. For example, when using OpenAI's Gym, grayscale conversion and cropping are applied when frames are added to the HistoryProcessor; frame skipping is done in the Policy and Learning classes, along with the creation of the INDArray, its normalization, and the merging of the stored frames. It is very difficult to see clearly what happens to the observations. Also, changing the behavior is not possible without changing the code directly in the library.

To me, transforming the raw observations into something usable by Policy/Learning is a concern on its own and ideally shouldn't "leak" or be apparent in any way outside an "observation" namespace. More precisely, the MDP (the model representation of the game) should offer observations exactly as seen without assuming anything about how they will be used, and on the other side, the Policy/Learning should receive fully transformed observations.

With this PR, I add a series of small and simple transform classes that can be assembled to form a transform as complex as required. They are not used anywhere right now, but will be in future PRs. This way, one can easily swap a piece of the overall transform to test an idea. For example, if instead of skipping a fixed number of frames one wants to do what Google DeepMind does (skip two, then take the component-wise maximum over two consecutive frames), this should be really easy to implement and test.
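As an illustration of the intended composition (all class and method names here are hypothetical sketches, not the classes added by this PR), small transforms could be chained, with a null result signaling a skipped frame:

```java
import java.util.Arrays;
import java.util.List;

public class TransformSketch {
    // Hypothetical interface: transforms one observation, null means "skip this frame"
    interface ObservationTransform {
        double[] transform(double[] obs);
    }

    // Scales raw pixel values into [0, 1]
    static class NormalizeTransform implements ObservationTransform {
        public double[] transform(double[] obs) {
            double[] out = new double[obs.length];
            for (int i = 0; i < obs.length; i++) out[i] = obs[i] / 255.0;
            return out;
        }
    }

    // Passes through one frame out of every skipFrames
    static class SkippingTransform implements ObservationTransform {
        private final int skipFrames;
        private int counter = 0;
        SkippingTransform(int skipFrames) { this.skipFrames = skipFrames; }
        public double[] transform(double[] obs) {
            return (counter++ % skipFrames == 0) ? obs : null;
        }
    }

    // Applies the transforms in order, short-circuiting on a skipped frame
    static double[] apply(List<ObservationTransform> pipeline, double[] obs) {
        for (ObservationTransform t : pipeline) {
            obs = t.transform(obs);
            if (obs == null) return null;
        }
        return obs;
    }

    public static void main(String[] args) {
        List<ObservationTransform> pipeline =
                Arrays.asList(new SkippingTransform(2), new NormalizeTransform());
        double[] first = apply(pipeline, new double[] {255.0, 0.0});   // kept
        double[] second = apply(pipeline, new double[] {128.0, 64.0}); // skipped
        System.out.println(Arrays.toString(first) + " / " + second);
    }
}
```

Swapping the skipping behavior would then only mean replacing one link in the chain, which is the point of the refactoring.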

How was this patch tested?

Several unit tests and an integration test.

Quick checklist

The following checklist helps ensure your PR is complete:

  • Eclipse Contributor Agreement signed, and signed commits - see IP Requirements page for details
  • Reviewed the Contributing Guidelines and followed the steps within.
  • Created tests for any significant new code additions.
  • Relevant tests for your changes are passing.

Signed-off-by: unknown <aboulang2002@yahoo.com>
@saudet (Contributor) commented Jul 1, 2019

Thanks! It is good to see someone taking the time to figure out what we should be doing for this. Looking at these classes makes me realize that what we need might already be in DataVec, though. Have you considered adapting DataSetPreProcessor and using DataSet for the "observations"? We also need to consider the limitations of ND4J: it doesn't have any operations to "crop and resize" NDArrays yet, so we won't be able to use these classes for that until it's implemented in ND4J. @AlexDBlack What's your opinion?

@AlexDBlack (Contributor)

we have resize bilinear, resize nearest neighbor, and crop_and_resize ops if we need those
https://github.com/eclipse/deeplearning4j/blob/master/libnd4j/include/ops/declarable/generic/parity_ops/resize_linear.cpp
https://github.com/eclipse/deeplearning4j/blob/master/libnd4j/include/ops/declarable/generic/parity_ops/resize_neighbor.cpp
https://github.com/eclipse/deeplearning4j/blob/master/libnd4j/include/ops/declarable/generic/parity_ops/crop_and_resize.cpp
(Corresponding Java classes/wrappers for those already exist too)

Those all operate on 4d NHWC data IIRC (we can easily permute for NCHW if need-be)

Not sure on the details here (only quickly skimmed the PR, and haven't looked into RL4J internals too much in general). But using DataSet/DataSetPreProcessor would give us access to existing pipelines and normalizers etc, which is better than duplicating that functionality...

@aboulang2002 (Contributor, Author)

I am not too familiar with DataVec but I agree with both of you, using DataSet/DataSetPreProcessor is the way to go. I will make the changes.

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>
@aboulang2002 (Contributor, Author)

I changed the implementation to use DataSetPreProcessor. I did not port the SignalingTransform, which was to be used as a way to inspect or record data within the transform process. On second thought, recording should not be done within the transform pipeline, and one can easily code their own DataSetPreProcessor to inspect any point of the transformation.

Again, no part of this commit is used yet. That will come in future PRs.

@aboulang2002 (Contributor, Author) commented Jul 8, 2019

Something is weird. The test when_dataSetIs15wx10h_expect_3wx4hDataSet() fails when using nd4j version 1.0.0.0-SNAPSHOT but passes when using 1.0.0.0-beta4.

See issue #7993

@saudet (Contributor) commented Jul 12, 2019

At first glance, that looks great!! Thanks a lot. Since this is pretty generic functionality though, I think this should be moved into DataVec or ND4J, @AlexDBlack ?

About failing tests, master isn't especially stable at the moment, let's wait and see a bit.

@AlexDBlack (Contributor)

Yep, CropAndResizeDataSetPreProcessor, PermuteDataSetPreProcessor and RGBtoGrayscaleDataSetPreProcessor can all go into ND4J; might as well make them available everywhere.

PipelineDataSetPreProcessor is redundant - we already have CompositeDataSetPreProcessor, but feel free to add your builder there too. It already has varargs, which is probably even easier than the builder... https://github.com/eclipse/deeplearning4j/blob/master/nd4j/nd4j-backends/nd4j-api-parent/nd4j-api/src/main/java/org/nd4j/linalg/dataset/api/preprocessor/CompositeDataSetPreProcessor.java

@aboulang2002 (Contributor, Author)

There is one thing that RL4J needs: a way to tell whether a frame should be skipped. I decided to use an empty DataSet for that. So, for example, the SkippingDataSetPreProcessor will clear the input DataSet for skipped frames. That is not a behavior I'd expect from any preprocessor under ND4J, which is why I introduced PipelineDataSetPreProcessor: if one of the preprocessors clears the dataset, it does not call the remaining preprocessors and returns immediately. It must return immediately because I found that some existing preprocessors can throw an exception when preprocessing empty datasets.

But I agree this is mostly redundant code. I could write a wrapper DataSetPreProcessor that only calls the wrapped DataSetPreProcessor if the input dataset is not empty. That way, one could use CompositeDataSetPreProcessor or CombinedPreProcessor and simply wrap the misbehaving DataSetPreProcessors.

So, I will move these DataSetPreProcessors in ND4J:

  • CropAndResizeDataSetPreProcessor
  • PermuteDataSetPreProcessor (Since it will be in ND4J, I'd like to make it able to permute more than just NCHW <-> NHWC)
  • RGBtoGrayscaleDataSetPreProcessor

Do you prefer that I remove PipelineDataSetPreProcessor and instead add a wrapper so we can use misbehaving DataSetPreProcessors?
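The wrapper idea above could look something like this minimal sketch (the DataSet and DataSetPreProcessor types here are simplified stand-ins for the ND4J ones, and the wrapper name is hypothetical):

```java
import java.util.ArrayList;
import java.util.List;

public class SkipIfEmptyWrapperSketch {
    // Simplified stand-in for ND4J's DataSet
    static class DataSet {
        List<double[]> examples = new ArrayList<>();
        boolean isEmpty() { return examples.isEmpty(); }
    }

    // Simplified stand-in for org.nd4j.linalg.dataset.api.DataSetPreProcessor
    interface DataSetPreProcessor {
        void preProcess(DataSet ds);
    }

    // Only forwards to the wrapped preprocessor when the dataset is non-empty,
    // shielding preprocessors that would throw on empty input.
    static class SkipIfEmptyWrapper implements DataSetPreProcessor {
        private final DataSetPreProcessor wrapped;
        SkipIfEmptyWrapper(DataSetPreProcessor wrapped) { this.wrapped = wrapped; }
        public void preProcess(DataSet ds) {
            if (!ds.isEmpty()) wrapped.preProcess(ds);
        }
    }

    public static void main(String[] args) {
        // A "misbehaving" preprocessor that throws on empty datasets
        DataSetPreProcessor throwsOnEmpty = ds -> {
            if (ds.isEmpty()) throw new IllegalStateException("empty dataset");
        };
        DataSetPreProcessor safe = new SkipIfEmptyWrapper(throwsOnEmpty);
        DataSet empty = new DataSet();
        safe.preProcess(empty); // no exception: the wrapped call is skipped
        System.out.println("empty dataset handled: " + empty.isEmpty());
    }
}
```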

@saudet (Contributor) commented Jul 12, 2019

I think we could update CompositeDataSetPreProcessor to add the additional skipping behavior as well? Or is there a limitation that prevents us from modifying it that way?

@aboulang2002 (Contributor, Author)

Seems reasonable, but it should be an option. Maybe someone's code depends on the "continue-on-empty-dataset" behavior, so that should remain the default. I will add an option to change it to "stop-on-empty-dataset".
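A minimal sketch of what the opt-in behavior might look like (simplified stand-in types; the flag name is illustrative, not necessarily the final API):

```java
import java.util.ArrayList;
import java.util.List;

public class CompositeStopSketch {
    // Simplified stand-in for ND4J's DataSet
    static class DataSet {
        List<double[]> examples = new ArrayList<>();
        boolean isEmpty() { return examples.isEmpty(); }
        void clear() { examples.clear(); }
    }

    interface DataSetPreProcessor { void preProcess(DataSet ds); }

    static class Composite implements DataSetPreProcessor {
        private final boolean stopOnEmptyDataSet; // false keeps the existing behavior
        private final DataSetPreProcessor[] preProcessors;
        Composite(boolean stopOnEmptyDataSet, DataSetPreProcessor... preProcessors) {
            this.stopOnEmptyDataSet = stopOnEmptyDataSet;
            this.preProcessors = preProcessors;
        }
        public void preProcess(DataSet ds) {
            for (DataSetPreProcessor p : preProcessors) {
                if (stopOnEmptyDataSet && ds.isEmpty()) return; // new, opt-in
                p.preProcess(ds);
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = new int[1];
        DataSetPreProcessor clearing = DataSet::clear;   // simulates a skipped frame
        DataSetPreProcessor counting = ds -> calls[0]++; // should not see skipped frames

        DataSet ds1 = new DataSet();
        ds1.examples.add(new double[] {1.0});
        new Composite(true, clearing, counting).preProcess(ds1);
        int stoppedCalls = calls[0];   // counting was never called

        DataSet ds2 = new DataSet();
        ds2.examples.add(new double[] {1.0});
        new Composite(false, clearing, counting).preProcess(ds2);
        int continuedCalls = calls[0]; // default behavior: counting still ran
        System.out.println(stoppedCalls + " / " + continuedCalls);
    }
}
```

With the flag defaulting to false, existing users of the composite see no change in behavior.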

@aboulang2002 (Contributor, Author)

I don't see any test directory in nd4j-api. Are the tests elsewhere? Where should I move the tests of CropAndResizeDataSetPreProcessor, PermuteDataSetPreProcessor and RGBtoGrayscaleDataSetPreProcessor?

@saudet (Contributor) commented Jul 13, 2019

They're all under nd4j-tests, including the ones for preprocessors, so you can put them there too, that's fine.

@aboulang2002 (Contributor, Author)

Summary of last commit:

  • Renamed the RL4J namespace ...observation.preprocessors to ...observation.preprocessor
  • Moved some DataSetPreProcessors from RL4J to ND4J
  • Added option to make CompositeDataSetPreProcessor stop processing when dataset is cleared.
  • Changed RGBtoGrayscaleDataSetPreProcessor to process all the examples in the dataset (was processing just the first)

…; Some DataSetPreProcessors moving from RL4J to ND4J

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>
@saudet saudet changed the title RL4J refac: Added some observation transform classes [WIP] RL4J refac: Added some observation transform classes Jul 16, 2019
@saudet (Contributor) commented Jul 16, 2019

Change the title back when you feel you're done with everything! Thanks

@aboulang2002 aboulang2002 changed the title [WIP] RL4J refac: Added some observation transform classes RL4J refac: Added some observation transform classes Jul 16, 2019
@aboulang2002 (Contributor, Author)

I think I'm done with this PR. Thanks.

@saudet (Contributor) commented Jul 16, 2019

I think we can move all preprocessors to the same place though. Is there a reason to leave them in RL4J modules?

@aboulang2002 (Contributor, Author)

For now, I'd like to leave them in RL4J. The two remaining preprocessors, PoolingDataSetPreProcessor and SkippingDataSetPreProcessor, behave differently from those in ND4J. ND4J preprocessors act on each example of the dataset independently of the others, whereas these two RL4J preprocessors act on a series of datasets. Also, although there is no explicit guarantee of it, I would expect idempotency from ND4J preprocessors: two calls to a given ND4J preprocessor with the same dataset should always give the same result. This is not the case with the RL4J preprocessors, as they maintain internal state and need to be reset after each "game". Finally, the way I intend to use the preprocessors in RL4J requires datasets with exactly one example each; I don't know how they would act when large datasets are iterated with batching.

I like your idea though. There are probably some good use cases with RNNs where having pooling and skipping preprocessors would be nice; I think I will extend these two preprocessors to support that in the future. But for now, I'd like to keep my focus on RL4J.
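To illustrate the statefulness argument above, here is a minimal sketch (simplified stand-in types, not RL4J's actual API) of a skipping preprocessor whose result depends on how many frames it has seen since reset():

```java
import java.util.ArrayList;
import java.util.List;

public class SkippingStateSketch {
    // Simplified stand-in for ND4J's DataSet
    static class DataSet {
        List<double[]> examples = new ArrayList<>();
        boolean isEmpty() { return examples.isEmpty(); }
        void clear() { examples.clear(); }
    }

    // Stateful: the same input can be kept or cleared depending on internal count,
    // so two calls with the same dataset are not guaranteed the same result.
    static class SkippingPreProcessor {
        private final int skipFrame;
        private int count = 0;
        SkippingPreProcessor(int skipFrame) { this.skipFrame = skipFrame; }
        void preProcess(DataSet ds) {
            if (count++ % skipFrame != 0) ds.clear(); // skipped frames become empty
        }
        void reset() { count = 0; } // called between "games"
    }

    // Builds a dataset with exactly one example, as RL4J would feed it
    static DataSet singleExample() {
        DataSet ds = new DataSet();
        ds.examples.add(new double[] {1.0});
        return ds;
    }

    public static void main(String[] args) {
        SkippingPreProcessor skip = new SkippingPreProcessor(2);
        DataSet a = singleExample(); skip.preProcess(a); // kept
        DataSet b = singleExample(); skip.preProcess(b); // cleared (skipped)
        skip.reset();
        DataSet c = singleExample(); skip.preProcess(c); // kept again after reset
        System.out.println(a.isEmpty() + " " + b.isEmpty() + " " + c.isEmpty());
    }
}
```

This non-idempotent, per-series behavior is what sets these two preprocessors apart from the stateless, per-example ones moved into ND4J.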

@saudet (Contributor) commented Jul 17, 2019

I still think they should be with the rest in ND4J, but I'll leave the decision up to @AlexDBlack.

@AlexDBlack (Contributor) left a review comment

yeah, I'm fine with PoolingDataSetPreProcessor/ResettableDataSetPreProcessor/SkippingDataSetPreProcessor staying in RL4J only.

Also flagged some minor nitpicks, fix if you want, but otherwise I'm fine for this to be merged. 👍

Signed-off-by: Alexandre Boulanger <Alexandre.Boulanger@ia.ca>
Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>
@aboulang2002 (Contributor, Author)

I signed my last commit with the wrong user. I tried to amend it, but it didn't work. Does anyone know how to fix this?

@saudet (Contributor) commented Jul 18, 2019

Damn it, it looks like your contribution is too large, so Eclipse considers it as third-party content. I'll need to file a CQ for that...

@saudet (Contributor) commented Jul 18, 2019

Ok, it's opened here: https://dev.eclipse.org/ipzilla/show_bug.cgi?id=20415
Hopefully this won't take more than a week, but they have a tendency to take a very long time...
We can probably just merge this right away, and if they annoy us about this at release time, we'll point them back to that page. @AlexDBlack ?

@saudet (Contributor) commented Jul 18, 2019

We can use git commit --amend --reset-author to redo the last commit and reset the author, but I don't think it's necessary.

@aboulang2002 (Contributor, Author)

Thanks @saudet. I'll be careful to submit only small PRs in the future!

@AlexDBlack (Contributor)

> We can probably just merge this right away, and if they annoy us about this at release time, we'll point them back to that page. @AlexDBlack ?

+1,565 −1

> Some contributions of code to be maintained by Eclipse Projects must be reviewed by the IP Team; in particular if they contain third party content and/or are over 1,000 Lines of Code. Further information can be read via the Eclipse Handbook.

sigh
yeah, let's wait, I'm curious to see how long it takes for future reference.
Snapshots still have CUDA issues so it's not a huge deal if this is delayed a bit...

Also I can't find the 1,000 lines of code limit referenced anywhere in the handbook... https://www.eclipse.org/projects/handbook/

@saudet (Contributor) commented Jul 18, 2019

It's in the poster here:
https://www.eclipse.org/legal/EclipseLegalProcessPoster.pdf
Which is linked from this section:
https://www.eclipse.org/projects/handbook/#ip-project-code

@aboulang2002 (Contributor, Author) commented Jul 18, 2019

Alternatively, if it's less trouble for you guys, I can just close this PR and re-submit its changes fragmented among a few smaller PRs. @saudet @AlexDBlack

@AlexDBlack (Contributor)

The CQ has been accepted - https://dev.eclipse.org/ipzilla/show_bug.cgi?id=20415#c4
Thanks again @aboulang2002

@AlexDBlack AlexDBlack merged commit ee6aae2 into deeplearning4j:master Jul 20, 2019
@saudet mentioned this pull request Feb 6, 2020