[CLIPPER-143] String/JSON Container Output #134

Merged: 39 commits, May 9, 2017

Corey-Zumar
Contributor

No description provided.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/200/

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/208/

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/209/

@Corey-Zumar
Contributor Author

@dcrankshaw This is ready for review. Can you make sure that the tutorial still runs properly? I've verified that it works after rebuilding the sklearn container, but I haven't tested it with TensorFlow because of preexisting import issues on my machine.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/210/

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/211/

@Corey-Zumar left a comment
Contributor Author

@dcrankshaw Merged the latest develop and addressed your comments. I'll add the requested unit test later today.

// At minimum, the output contains an unsigned
// integer specifying the number of string
// outputs
int outputLenBytes = 4;

Contributor Author

Done

// for storing string lengths. Advance past this segment
// for now
responseBuffer.position(baseStringLengthsPosition + (4 * numOutputs));
for (int i = 0; i < predictions.size(); i++) {

Contributor Author

Done

*
* @return The number of bytes written to the buffer
*/
public int toBytes(ByteBuffer buffer) {

Contributor Author

Done - renamed to encodeUTF8ToBuffer

-    def predict_ints(self, inputs):
-        return np.array([np.sum(x) for x in inputs], dtype='float32')
+    def predict_ints(self, input_item):
+        return str(sum(input_item))

Contributor Author

We UTF-8 encode the string outputs before serialization. See PredictionResponse#add_output in rpc.py.
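
For readers following the serialization discussion, here is a minimal Python sketch of the layout that the excerpts in this review suggest: a 4-byte output count, one 4-byte length per string, then the UTF-8 bytes of each string. The function names are hypothetical, and the little-endian unsigned encoding of the per-string lengths is an assumption carried over from the "<I" format used for the count. This is not the actual code in rpc.py.

```python
import struct


def serialize_string_outputs(outputs):
    """Pack strings as: uint32 count | uint32 length per string | UTF-8 bytes.

    Hypothetical helper; the layout is inferred from the review excerpts.
    """
    encoded = [s.encode("utf-8") for s in outputs]
    header = struct.pack("<I", len(encoded))
    # Assumption: each length uses the same "<I" encoding as the count.
    lengths = b"".join(struct.pack("<I", len(e)) for e in encoded)
    return header + lengths + b"".join(encoded)


def deserialize_string_outputs(buf):
    """Inverse of serialize_string_outputs (hypothetical helper)."""
    (count,) = struct.unpack_from("<I", buf, 0)
    lengths = struct.unpack_from("<%dI" % count, buf, 4)
    # String content starts after the count and the length table.
    pos = 4 + 4 * count
    outputs = []
    for n in lengths:
        outputs.append(bytes(buf[pos:pos + n]).decode("utf-8"))
        pos += n
    return outputs
```

Lengths are measured in bytes after encoding, not in characters, so a non-ASCII string takes more space in the buffer than its character count suggests.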

total_string_length * MAXIMUM_UTF_8_CHAR_LENGTH_BYTES)
self.memview = memoryview(self.output_buffer)
struct.pack_into("<I", self.output_buffer, 0, num_outputs)
self.string_content_end_position = 4 + (4 * num_outputs)

Contributor Author

Done

// If the string output is not JSON-formatted, include
// it as a raw string in the query response
clipper::json::add_string(json_response, PREDICTION_RESPONSE_KEY_OUTPUT,
query_response.output_.y_hat_);

Contributor Author

Ah, if I'm not mistaken, the json::add_string method should take care of this for us. I'll change the documentation to this effect.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/241/

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/242/

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/243/

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/244/

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/246/

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/249/

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/250/

@dcrankshaw
Contributor

That may be true, but enabling these batch optimizations is critical to the whole purpose of the system. Otherwise, Clipper is a non-starter for any model that needs to run on a GPU or use BLAS libraries. I also don't think it's a huge added burden on the developer.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/251/

@dcrankshaw left a comment
Contributor

Okay, this looks good to me. Add a comment about the batch-prediction interface; once the unit tests pass, this is good to go.

* @return A JSON-formatted serializable string to be
* returned to Clipper as a prediction result
*/
public abstract SerializableString predict(I inputVector);

Contributor

@Corey-Zumar Note that Model.predict should actually take the full batch of input vectors at once, rather than having the RPC system call predict in a for-loop. This allows models to apply batch-processing optimizations if they want. For example, this is particularly critical for the TensorFlow model when running on a GPU.

I fixed this in the Python code but I'm leaving it as-is for now in the Java container code. I have a fix for this as part of my re-org of the Java codebase in #133.
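
To illustrate the batch-prediction interface described above, here is a minimal Python sketch in which the RPC layer hands the whole batch to the model in a single call. The class name, the `handle_batch` helper, and the call counter are hypothetical and exist only for illustration; they are not Clipper's actual container API.

```python
class SumModel:
    """Toy model (hypothetical) that returns one string output per input vector."""

    def __init__(self):
        self.calls = 0  # counts how often the RPC layer invokes predict

    def predict(self, inputs):
        # Receives the full batch at once, so a real model could vectorize
        # the work or run it as a single GPU batch.
        self.calls += 1
        return [str(float(sum(x))) for x in inputs]


def handle_batch(model, inputs):
    """Hypothetical RPC-side handler: one predict call per batch, not per item."""
    outputs = model.predict(inputs)
    if len(outputs) != len(inputs):
        raise ValueError("model must return exactly one output per input")
    return outputs
```

With this shape, a batch of two inputs results in one call to `predict` rather than two, which is what lets a model amortize per-call overhead across the batch.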

List<Float> eventCodes = new ArrayList<>();
for (int i = 0; i < eventHistory.length; i++) {
// Begin building a JSON array
StringBuilder eventCodeJson = new StringBuilder("[");

Contributor

For future reference, don't manually construct JSON. Use a JSON serialization library.
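
As an illustration of that advice, here is a short Python sketch that builds the same kind of JSON array with the standard-library `json` module instead of string concatenation. The function names and the "output" key are hypothetical; the Java container would use a Java JSON library instead.

```python
import json


def event_codes_to_json(event_codes):
    """Serialize numeric event codes as a JSON array string (hypothetical helper)."""
    return json.dumps([float(code) for code in event_codes])


def output_to_json(output):
    """Wrap a raw string output in a JSON object (hypothetical key name).

    The library escapes quotes and control characters, which hand-built
    JSON strings commonly get wrong.
    """
    return json.dumps({"output": output})
```

Parsing the result with `json.loads` recovers the original values, including strings that contain quotes or newlines.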

@@ -2,7 +2,7 @@ FROM clipper/py-rpc:latest

MAINTAINER Dan Crankshaw <dscrankshaw@gmail.com>

-RUN pip install tensorflow
+RUN pip install tensorflow==1.0

Contributor

This is to make sure that the TensorFlow version is compatible with our pre-trained model.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/252/

@dcrankshaw merged commit ea5647d into ucbrise:develop on May 9, 2017