Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apache Ignite Dataset #22210

Merged
merged 17 commits into from
Oct 1, 2018
Merged

Apache Ignite Dataset #22210

merged 17 commits into from
Oct 1, 2018

Conversation

dmitrievanthony
Copy link
Contributor

This is a proposal to add IgniteDataset that allows to work with Apache Ignite.

Apache Ignite is a memory-centric distributed database, caching, and processing platform for transactional, analytical, and streaming workloads, delivering in-memory speeds at petabyte scale. This proposal is a part of a more global initiative to implement so called "TensorFlow on Apache Ignite" (IGNITE-8335, Design Document).

The integration is based on Apache Ignite Binary Client Protocol and TensorFlow tf.data.Dataset. More information about supported features you can find in README.md of this module.

Tests have also been added. They use docker to hide configuration complexity, so that the implemented functionality can be tested quite simply by manual run.

It's a copy of #21853, because previous request has accidentally been closed because I cleaned up merge commits from the branch.

@dmitrievanthony
Copy link
Contributor Author

Hello @mrry, @martinwicke, @perfinion. Sorry for confusion, I've recreated #21853 here. The current state is:

  • I've fixed all comments from @mrry.
  • During last CI run only Windows builds and Clang checks failed.
  • I've fixed Clang checks.

Regarding Windows builds I see the same issues I had and asked here: https://groups.google.com/a/tensorflow.org/forum/#!topic/build/ePYss0Kxcu4. It looks like we need to add copt=-DWIN32_LEAN_AND_MEAN on CI server for Windows builds.

Guys, let me ask you to continue review.

@perfinion perfinion added the kokoro:force-run Tests on submitted change label Sep 11, 2018
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Sep 11, 2018
Copy link
Contributor

@mrry mrry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that many of the style/performance/correctness comments apply generally across the code, but I've only commented at the first instance.

# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Apache Ignite is a memory-centric distributed database, caching, and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style nit: all module, class, and function docstrings should begin with a one-line summary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated docs and added a short description in the first line.

limitations under the License.
==============================================================================*/

#include "ignite_binary_object_parser.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style nit: use the full absolute path to included headers in the same project.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


Status BinaryObjectParser::Parse(uint8_t** ptr,
std::vector<Tensor>* out_tensors,
std::vector<int32_t>* types) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The types argument seems to be unused, except in what seems to be a recursive call to this method. Delete it if it's unused.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, excellent point, I completely forgot about it. The reason of types to be here is to build a schema of every object that we receive and check it.

I added this logic today, so it affects this code and CheckTypes method in IgniteDatasetIterator. Please take a look.

==============================================================================*/

#include <vector>
#include "tensorflow/core/framework/dataset.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This header seems to be unused. If you depend on other headers indirectly, include them directly here or (preferably) in the .cc file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, look like the following two headers are enough:

#include "tensorflow/core/framework/tensor.h"
#include "tensorflow/core/lib/core/status.h"

int32_t length = *((int32_t*)*ptr);
*ptr += 4;
Tensor tensor(cpu_allocator(), DT_STRING, {});
tensor.scalar<std::string>()() = std::string((char*)*ptr, length);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use tensorflow::string instead of std::string everywhere. Some platforms use a different string implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

"""
if self.fields is None:
object_type = types[self.type_id]
if object_type is not None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

object_type can never be None, because None is not a value in types.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I added check like:

if self.type_id in types:

Done.

if is_array:
return tensor_shape.TensorShape([None])
return tensor_shape.TensorShape([])
raise Exception("Unsupported type [type_id=%d]" % self.type_id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use ValueError instead of Exception.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced (where it's appropriate).

return self.to_flat_rec([])

def to_permutation(self):
"""Returns a permutation that should be applied to order object leafs."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: s/leafs/leaves/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed.

name="page_size")
self.username = ops.convert_to_tensor("" if username is None else username,\
dtype=dtypes.string, name="username")
self.password = ops.convert_to_tensor("" if password is None else password,\
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This (and self.cert_password) will encode potentially sensitive information in the GraphDef and send it over insecure channels. Is it necessary to include this information here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not necessary, but it's the simple way to start using.

We provide two ways to specify these parameters: via parameters of dataset or via environment variables on the nodes where dataset will be actually instantiated. As far as I understand, in the second case sensitive information won't be included into GraphDef and passed via insecure channels.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we should only support the environment variable, in that case? I'm concerned that novice users won't understand the distinction, take the simple route, and leak sensitive information.

However, security isn't my domain, so @martinwicke can you please make a determination here?

"""
super(IgniteDataset, self).__init__()

with IgniteClient(host, port, username, password, certfile, keyfile,\
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no need for a line continuation character here (or in the cases below) because it is implied by the open parentheses.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed.

@dmitrievanthony
Copy link
Contributor Author

Hi @mrry. Thank you for very detailed and deep review, I really appreciate it. I think I fixed all your comments today, so could you please have a look my changes?

@dmitrievanthony
Copy link
Contributor Author

Meanwhile, @martinwicke, @perfinion, could you please rerun tests? I made a lot of changes, would be great to check CI statuses.

@yongtang yongtang added the kokoro:force-run Tests on submitted change label Sep 13, 2018
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Sep 13, 2018
@yongtang yongtang added kokoro:run kokoro:force-run Tests on submitted change labels Sep 13, 2018
@kokoro-team kokoro-team removed kokoro:run kokoro:force-run Tests on submitted change labels Sep 13, 2018
@yongtang
Copy link
Member

: added kokoro:run label to help start the tests.

@yongtang
Copy link
Member

@dmitrievanthony The test failure for Experimental clang-format Check is related to clang-format. You can use clang-format -is --style=google <filename.cc> to fix them.

However, different clang-format versions may generate different outputs. I think you could use clang-format bundled with Ubuntu 16.04 to get the outputs that matches the Experimental clang-format Check.

If you have Docker installed on your machine, I think you may use:

docker run -i -t --rm \
    -v $PWD:/tensorflow -w /tensorflow --net=host ubuntu:16.04 \
    sh -c 'apt-get -y update && apt-get -y install clang-format && clang-format -i --style=google tensorflow/contrib/ignite/kernels/ignite_byte_swapper.h'

to format the ignite_byte_swapper.h that is consistent with Experimental clang-format Check.

@dmitrievanthony
Copy link
Contributor Author

dmitrievanthony commented Sep 17, 2018

Hi @mrry. It's been 6 days since I fixed all review comments. Could you please take a look?

public:
ByteSwapper(bool big_endian) {
int x = 1;
bool is_little_endian = (*(char *)&x == 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the formulation here to work out the endianness statically:

constexpr bool kLittleEndian = __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated it similar way.

name="page_size")
self.username = ops.convert_to_tensor("" if username is None else username,\
dtype=dtypes.string, name="username")
self.password = ops.convert_to_tensor("" if password is None else password,\
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we should only support the environment variable, in that case? I'm concerned that novice users won't understand the distinction, take the simple route, and leak sensitive information.

However, security isn't my domain, so @martinwicke can you please make a determination here?

@dmitrievanthony
Copy link
Contributor Author

Regarding security question, @mrry, @martinwicke, I think we can add warning in case user specifies sensitive data via parameters. What do you think?

@martinwicke
Copy link
Member

I think we should make it hard for credentials to end up in files or on the network in the clear. Therefore, I would prefer if we insist here that all credentials are present on the executing machine already, and we make it impossible to use credentials that are stored in the graph.

I think environment variables would work fine in this case, can we restrict it to that for all sensitive information?

@dmitrievanthony
Copy link
Contributor Author

I agree, it's reasonable, @mrry, @martinwicke. I updated code so that sensitive information is not encoded in graph. Be aware that it's still used in python for initial access (that is required to get data schema), but after that only environment variables are used.

@dmitrievanthony
Copy link
Contributor Author

Do we have any open questions, @mrry?

mrry
mrry previously approved these changes Sep 23, 2018
Copy link
Contributor

@mrry mrry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's go ahead and merge this.

@martinwicke martinwicke added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Sep 24, 2018
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Sep 24, 2018
@dmitrievanthony
Copy link
Contributor Author

dmitrievanthony commented Sep 24, 2018

Guys, looks like I introduced a bug on Windows when changed ignite_byte_swapper.h. It's fixed now, so please rerun tests who can do it.

Also, as we found out previously, it was a bug in master that leads to Windows GPU build failure. See #22210 (comment):

@dmitrievanthony I created a PR #22258 to fix the Windows GPU build failure issue.

Should I merge master into my branch? Or it's fine that Windows GPU build is broken by non-related to my code reason?

@yongtang
Copy link
Member

@dmitrievanthony The windows GPU fix has been submitted internally so the build should pass. Let me help with rerun the test.

@yongtang yongtang added kokoro:run kokoro:force-run Tests on submitted change labels Sep 24, 2018
@kokoro-team kokoro-team removed kokoro:run kokoro:force-run Tests on submitted change labels Sep 24, 2018
@dmitrievanthony
Copy link
Contributor Author

Ok, @mrry, @martinwicke, @yongtang, all tests I expected to pass actually passed. XLA fails, but it fails all the time by reasons that are not related to my code.

Derek, sorry for bothering you again, but we need you approval again. After that this PR can be merged, right?

Copy link
Contributor

@mrry mrry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-approving. There are a few more internal steps required before the PR is merged, but you shouldn't need to do anything else from this point.

@dmitrievanthony
Copy link
Contributor Author

Hi, @mrry, @martinwicke. I'm not sure how long it takes to pass all internal steps, but it looks like it takes time. Meanwhile, as far as I see, there are several conflicts appeared. Shall I fix them?

@mrry
Copy link
Contributor

mrry commented Sep 28, 2018

No need to update the branch. We're in the process of getting it to merge internally.

FYI: You should check the diff between what we end up merging, and the original PR. Our internal checks seem to be more picky about style and other issues than the presubmit. It has also been necessary to disable the SSL tests and remove the checked-in private key file.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes ready to pull PR ready for merge process
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants