New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Apache Ignite File System #22194
Apache Ignite File System #22194
Conversation
So there's good news and bad news. 👍 The good news is that everyone that needs to sign a CLA (the pull request submitter and all commit authors) have done so. Everything is all good there. 😕 The bad news is that it appears that one or more commits were authored or co-authored by someone other than the pull request submitter. We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that here in the pull request. Note to project maintainer: This is a terminal state, meaning the |
Thanks @dmitrievanthony. Could you check the cla for all contributors who have commits in this PR? |
Hi, @yifeif. Thank you for attention. Did I understand correctly from the message above that all authors have signed CLA and the problem is only that the committer and the author of one commit are not the same? If so, I asked @artemmalykh to write here that he's ok with this PR. |
I'm totally OK with PR! |
Thanks @dmitrievanthony and @artemmalykh. Looks good now. |
A Googler has manually verified that the CLAs look good. (Googler, please make sure the reason for overriding the CLA status is clearly documented in these comments.) |
So there's good news and bad news. 👍 The good news is that everyone that needs to sign a CLA (the pull request submitter and all commit authors) have done so. Everything is all good there. 😕 The bad news is that it appears that one or more commits were authored or co-authored by someone other than the pull request submitter. We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that here in the pull request. Note to project maintainer: This is a terminal state, meaning the |
A Googler has manually verified that the CLAs look good. (Googler, please make sure the reason for overriding the CLA status is clearly documented in these comments.) |
So there's good news and bad news. 👍 The good news is that everyone that needs to sign a CLA (the pull request submitter and all commit authors) have done so. Everything is all good there. 😕 The bad news is that it appears that one or more commits were authored or co-authored by someone other than the pull request submitter. We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that here in the pull request. Note to project maintainer: This is a terminal state, meaning the |
Hi, @mrry, @martinwicke, @yifeif. After merge of #22210 I rebased and updated this pull request, so now it's ready for review. It's quite big, but less that Ignite Dataset. Derek, I'll be grateful if you have a look and start reviewing. Please let me know if changes required. Would be great if someone starts tests and clicks |
A Googler has manually verified that the CLAs look good. (Googler, please make sure the reason for overriding the CLA status is clearly documented in these comments.) |
Hi, @mrry. Any updates? |
tensorflow/contrib/igfs/__init__.py
Outdated
plugged into Hadoop or Spark deployments. This contrib package contains an | ||
intergration between IGFS and TensorFlow. | ||
|
||
@@IGFS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This module doesn't appear to import a symbol called IGFS
. Do you need this line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
|
||
namespace tensorflow { | ||
|
||
string GetEnvOrElse(const string &env, string default_value) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mark private functions as static
or put them in an anonymous namespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, marked as static
.
const string &user_name); | ||
~IGFSClient(); | ||
|
||
inline Status Handshake(CtrlResponse<HandshakeResponse> *res) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
inline
is implied for member functions defined in the class definition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. It's exactly member function defined in the class definition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, my point is that the inline
here is redundant, so it doesn't need to be there (even though the compiler accepts and ignores it).
The other headers don't use inline
when the function is defined inline in the class definition, so please delete the inline
specifiers in this file for consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I got it. I've removed inline
in headers here and in ignite_client.h
and ignite_byte_swapper.h
as well.
tensorflow/contrib/BUILD
Outdated
@@ -126,6 +126,7 @@ py_library( | |||
}) + select({ | |||
"//tensorflow:with_ignite_support": [ | |||
"//tensorflow/contrib/ignite", | |||
"//tensorflow/contrib/igfs", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we combine these two modules into one? IGFS seems like it would fit under "//tensorflow/contrib/ignite"
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to have them in separate modules by several reasons. First of all, they have very different functionality ("//tensorflow/contrib/ignite"
provides an ability to read data from Apache Ignite Cache while "//tensorflow/contrib/igfs"
provides allows to use file system, based on Apache Ignite). User may want to use only one feature, but not other. It looks like very reasonable in some cases. I think it's a case when "write programs that do one thing and do it well" is good, isn't it? And by the way, after contrib
sunsetting it makes even more sense to have them in separate packages.
Are you agree?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I don't agree. The two modules are coupled by dependencies from this new code to the tensorflow/contrib/ignite
module. The tf.contrib.igfs
module doesn't export any public symbols, so users won't interact with it directly... and (if built) it will be loaded if they like it or not.
Perhaps most importantly, the shared object rules in the BUILD
file mean that the code in "//tensorflow/contrib/ignite:ignite_client"
will be present in both tensorflow/contrib/igfs/_igfs_ops.so
and tensorflow/contrib/ignite/_dataset_ops.so
, which can lead to a One Definition Rule violation and undefined behavior.
So, for all those reasons, life would be much easier if the code were combined in one module, with a single shared object!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds reasonable. I merged two modules together.
return Status::OK(); | ||
} | ||
|
||
std::shared_ptr<IGFSClient> IGFS::CreateClient() const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the "shared"-ness ever used here? It looks like this could return a unique_ptr
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replaced by unique_ptr
, done.
TF_RETURN_IF_ERROR(ReadString(&key)); | ||
TF_RETURN_IF_ERROR(ReadString(&val)); | ||
|
||
res->insert(std::pair<string, string>(key, val)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could std::move(key), std::move(val)
here to avoid copying the strings again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Status ExtendedTCPClient::WriteString(string str) { | ||
if (!str.empty()) { | ||
TF_RETURN_IF_ERROR(WriteBool(false)); | ||
unsigned short l = str.length(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an unsafe cast and should raise a compiler warning: str.length()
returns a size_t
. You need to check that it fits in the range of an unsigned short
before casting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
const std::map<string, string> &properties) | ||
: Request(command_id_), | ||
user_name_(std::move(user_name)), | ||
path_(std::move(path)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
path
is a const string&
, so moving it has no effect. If you want to avoid a copy, you need to take path
as a (by-value) string
or a string&&
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, just removed copy()
.
int32_t len; | ||
TF_RETURN_IF_ERROR(client->ReadInt(&len)); | ||
|
||
entries = std::vector<T>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: entries.clear()
is slightly more readable, since the object must already have been constructed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Hi @mrry. Any updates? |
const string &user_name); | ||
~IGFSClient(); | ||
|
||
inline Status Handshake(CtrlResponse<HandshakeResponse> *res) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, my point is that the inline
here is redundant, so it doesn't need to be there (even though the compiler accepts and ignores it).
The other headers don't use inline
when the function is defined inline in the class definition, so please delete the inline
specifiers in this file for consistency.
if (!str.empty()) { | ||
TF_RETURN_IF_ERROR(WriteBool(false)); | ||
size_t l = str.length(); | ||
if (l > 0xFFFF) return errors::InvalidArgument("String is too long"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I remember correctly, WriteShort()
takes a (signed) int16_t
, so the maximum value is 0x7fff
. Maybe use std::numeric_limits<int16_t>::max
to avoid errors here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, done.
tensorflow/contrib/BUILD
Outdated
@@ -126,6 +126,7 @@ py_library( | |||
}) + select({ | |||
"//tensorflow:with_ignite_support": [ | |||
"//tensorflow/contrib/ignite", | |||
"//tensorflow/contrib/igfs", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I don't agree. The two modules are coupled by dependencies from this new code to the tensorflow/contrib/ignite
module. The tf.contrib.igfs
module doesn't export any public symbols, so users won't interact with it directly... and (if built) it will be loaded if they like it or not.
Perhaps most importantly, the shared object rules in the BUILD
file mean that the code in "//tensorflow/contrib/ignite:ignite_client"
will be present in both tensorflow/contrib/igfs/_igfs_ops.so
and tensorflow/contrib/ignite/_dataset_ops.so
, which can lead to a One Definition Rule violation and undefined behavior.
So, for all those reasons, life would be much easier if the code were combined in one module, with a single shared object!
Hi @mrry. I merged |
tensorflow/contrib/ignite/BUILD
Outdated
"kernels/igfs/igfs_writable_file.h", | ||
], | ||
deps = [ | ||
"//tensorflow/contrib/ignite:ignite_client", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: This should be ":ignite_client"
now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
tensorflow/contrib/ignite/BUILD
Outdated
@@ -31,39 +32,88 @@ tf_custom_op_library( | |||
deps = [":dataset_kernels"], | |||
) | |||
|
|||
tf_custom_op_library( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since ":igfs_kernels"
and ":dataset_kernels"
share some symbols (in ":ignite_client"
), to avoid the ODR issues, it'll be easiest to merge "_dataset_ops.so" and "_igfs_ops.so" into a single shared object (and hence tf_custom_op_library()
rule).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, good point. I've merged them into a single shared library _ignite_ops.so
.
Hi @mrry. I've updated the code, please have a look. |
It looks like Sanity failed because of |
Thanks @mrry, looks like everything is fixed. Currently only one test fails and it fails in non related to my code part. Hope the PR will be merged soon :) |
A Googler has manually verified that the CLAs look good. (Googler, please make sure the reason for overriding the CLA status is clearly documented in these comments.) |
PiperOrigin-RevId: 220969388
This is a proposal to support Apache Ignite File System in TensorFlow.
Apache Ignite is a memory-centric distributed database, caching, and processing platform for transactional, analytical, and streaming workloads, delivering in-memory speeds at petabyte scale. This proposal is a part of a more global initiative to implement so called "TensorFlow on Apache Ignite" (IGNITE-8335, Design Document).
The integration is based on custom filesystem plugin from TensorFlow side and IGFS Native API from Apache Ignite side. More information about this module you can find in README.md.
Tests have also been added. They use docker to hide configuration complexity, so that the implemented functionality can be tested quite simply by manual run.
Be aware that this pull request is based on previous #22210 that is not merged yet (significant part of the code you see here will be merged in #22210).