Skip to content

Conversation

@karllessard
Copy link
Collaborator

Fixes #320

Make sure the TF native libraries have been loaded properly before instantiating a String tensor, since we need to know in advance the size of the TF_TString native class to compute the required size of the whole buffer.

Also cleaned up a few unrelated unit tests... (@rnett if you want to take a look, especially at BooleanMaskTest)

@karllessard karllessard requested a review from Craigacp May 13, 2021 14:56

/** Make sure all TensorFlow native libraries have been loaded properly */
public static void init() {
// Do nothing, we'll let the class static initializer load the native library if needed
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should use Class.forName("org.tensorflow.TensorFlow") to trigger initialization? Having an empty method that we call to trigger the class's static initializer seems unpleasant (and likely to make static analysis or linters grumpy).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should already have been working correctly with init() in the static initialization block, and having that one call NativeLibrary.load(). Why do we need to change this?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ByteSequenceTensorBuffer can't see the TensorFlow.init() method as it's package private, and nothing on the codepath for TString.scalarOf triggered TensorFlow.init() by hitting a static initialiser in other classes in org.tensorflow.

Copy link
Contributor

@saudet saudet May 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we can make init() public or something, but that should be about it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it because right now, everytime TensorFlow.init() is called (which is done in several places), the JavaCPP loader is invoked to load TF classes, while it should basically be done only once. There is probably no issue calling it multiple times but no reasons neither afaik. So moving this call exclusively to the class static initializer of TensorFlow ensure that we try to load the library only once.

Now @Craigacp , you seem to foresee potential problems doing it that way, can you explain a bit more?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think an empty method that we call to ensure a class is initialised is a code smell. Class.forName is documented as triggering the class initialization (and it's mentioned in the language spec - https://docs.oracle.com/javase/specs/jls/se8/html/jls-12.html#jls-12.4), so that's always going to work, and we can document at the use site that it's to trigger library loading.

Samuel's suggestion of making init public also seems fine, though my preference in that case would be to have a private static boolean isLoaded, guard the NativeLibrary.load call by that, and also make the block synchronized on TensorFlow.class to ensure there aren't races in native library loading. @saudet everything in NativeLibrary.load after the call to Loader.load(org.tensorflow.internal.c_api.global.tensorflow.class) and the isLoaded check is irrelevant now right? It looks like that's all related to the old JNI library loader.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've pushed a new version using Class.forName as you suggested @Craigacp . Of course it makes that code more verbose than with my previous version but since it's only being called internally, I think it's fine.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Loader.load() was designed to be called multiple times. It already has the equivalent of an isLoaded flag.

Copy link
Collaborator

@Craigacp Craigacp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@karllessard karllessard merged commit 94f5b15 into tensorflow:master May 17, 2021
@karllessard karllessard deleted the string-size-bug branch May 17, 2021 20:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

TString.scalarOf core dumps on OSX

3 participants