
Adds support for deploying PySpark models #196

Merged
Corey-Zumar merged 22 commits into ucbrise:develop from dcrankshaw:deploy_pyspark_models on Jun 5, 2017

Conversation

@dcrankshaw
Contributor

You can see how it works by looking at the integration tests I added in integration-tests/.

This should wait for #192 before being merged.

Note that after this gets merged, we need to create an automated build for clipper/pyspark-container on DockerHub.

@dcrankshaw dcrankshaw force-pushed the deploy_pyspark_models branch from 69af1ad to a84d2c4 on June 4, 2017 02:05
@AmplabJenkins
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/382/

@AmplabJenkins
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/383/

@dcrankshaw
Contributor Author

jenkins test this please

@AmplabJenkins
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/384/

@AmplabJenkins
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/385/

@AmplabJenkins
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/386/

@AmplabJenkins
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/387/

@AmplabJenkins
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/388/

cd $DIR
if [ -z ${SPARK_HOME+x} ]; then
    echo "Downloading Spark"
    curl -o spark.tgz https://d3kbcqa49mib13.cloudfront.net/spark-2.1.1-bin-hadoop2.7.tgz
Contributor

should we pull vars for these versions?

import findspark
findspark.init()
import pyspark
from pyspark import SparkConf, SparkContext
Contributor

Is SparkConf used?

Contributor Author

removed

import findspark
findspark.init()
import pyspark
from pyspark import SparkConf, SparkContext
Contributor
@feynmanliang feynmanliang Jun 4, 2017

Is SparkConf used?

Contributor Author

removed

COPY containers/python/python_container_conda_deps.txt /lib/

RUN curl -o /spark.tgz https://d3kbcqa49mib13.cloudfront.net/spark-2.1.1-bin-hadoop2.7.tgz \
&& cd / && tar zxf /spark.tgz && mv /spark-2.1.1-bin-hadoop2.7 /spark \
Contributor
@feynmanliang feynmanliang Jun 4, 2017

Should we extract vars for these versions and envsubst or BUILDARGS them in?

Contributor
@Corey-Zumar Corey-Zumar left a comment

Preliminary style review + questions. Will attempt to train and deploy PySpark models after #192 is merged and we rebase on develop.

if local_path != remote_path:
if os.path.isdir(local_path):
self._copytree(local_path, remote_path)
# self._copytree(local_path, remote_path)
Contributor

Remove commented code

Contributor Author

done

# where dst may or may not exist. We cannot use
# shutil.copytree() alone because it stipulates that
# dst cannot already exist
def _copytree(self, src, dst, symlinks=False, ignore=None):
Contributor

What is the rationale behind deleting this method? Did you run into an issue with it?

Contributor Author

Yeah it wasn't working for recursive copying of directories.
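For context, `shutil.copytree` stipulates that the destination must not already exist, which is why a merge-style helper is needed here. A minimal sketch of such a helper (the name `copytree_merge` and its exact structure are illustrative, not the PR's actual implementation):

```python
import os
import shutil

def copytree_merge(src, dst):
    """Recursively copy the contents of src into dst.

    Unlike shutil.copytree, this tolerates (and merges into) an
    already-existing destination directory tree.
    """
    os.makedirs(dst, exist_ok=True)
    for name in os.listdir(src):
        s = os.path.join(src, name)
        d = os.path.join(dst, name)
        if os.path.isdir(s):
            # Recurse so nested directories are merged, not rejected.
            copytree_merge(s, d)
        else:
            shutil.copy2(s, d)
```

On Python 3.8+, `shutil.copytree(src, dst, dirs_exist_ok=True)` achieves the same effect directly.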

else:
print(
"Warning: Anaconda environment was either not found or exporting the environment "
"failed. Your function will still be serialized deployed, but may fail due to "
Contributor

I think the second sentence should read Your function will still be serialized and deployed, but may fail due to... (missing and)

Contributor Author

Ahh good

# Remove temp files
shutil.rmtree(serialization_dir)

return deploy_result
Contributor

Add a Returns section to documentation with type and explanation of deploy_result

Contributor Author

Done
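For reference, the requested Returns section might look like the following sketch. The function name, signature, and the assumption that `deploy_result` is a boolean success flag are all illustrative, not taken from the PR:

```python
def deploy_pyspark_model(name, version, spark_model, predict_func):
    """Deploy a PySpark model to Clipper.  (Signature is illustrative.)

    Returns
    -------
    bool
        True if the model was serialized and deployed successfully,
        False otherwise.
    """
    deploy_result = True  # placeholder for the real deployment logic
    return deploy_result
```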


return deploy_result

def deploy_predict_function(self,
Contributor

This method should also have Returns documentation

Contributor Author

Done

spark_model_path)

def predict_ints(self, inputs):
if self.input_type != rpc.INPUT_TYPE_INTS:
Contributor

Why are we performing an additional check on the validity of the input type? Lines 270-278 of rpc.py should take care of this validation.

Contributor Author

Oh I copied this code from the python_container.py code. I didn't realize it was redundant. I'll remove it.

Contributor

Got it. Can you remove these checks from python_container as well?

Contributor Author

Yeah done

preds = self.predict_func(self.spark, self.model, inputs)
return [str(p) for p in preds]

def _log_incorrect_input_type(self, input_type):
Contributor

Remove this if input type validation is redundant (see comment above regarding preexisting validation in rpc.py)

PORT_RANGE = [34256, 40000]


def find_unbound_port():
Contributor

This doesn't have to be done now, but we should eventually create a reusable module that we can import this functionality from because it's also defined in many_apps_many_models.py in the same directory.

Contributor Author

Agreed. Filed #201.
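The helper being discussed can be sketched as a simple bind-probe over the configured range; this is an assumption about how it works, not the PR's exact code:

```python
import random
import socket

PORT_RANGE = [34256, 40000]

def find_unbound_port():
    """Return a port in PORT_RANGE that is not currently bound on localhost."""
    while True:
        port = random.randint(PORT_RANGE[0], PORT_RANGE[1])
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            # Binding succeeds only if nothing else holds the port.
            sock.bind(("127.0.0.1", port))
            return port
        except socket.error:
            continue  # port in use; try another
        finally:
            sock.close()
```

Note the inherent race: the port is released before the caller binds it, so a concurrent process could grab it in between, which is acceptable for test scripts.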


def parseData(line, obj, pos_label):
fields = line.strip().split(',')
# return LabeledPoint(obj(int(fields[0]), pos_label), [float(v)/255.0 for v in fields[1:]])
Contributor

Remove commented code.

Contributor Author

Done
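The function under review parses a comma-separated record (MNIST-style: label first, then pixel values) into a labeled example. A cleaned-up, pure-Python sketch of the visible hunk; the `label_fn` parameter name and the `(label, features)` return pair are illustrative, and the real code wraps the pair in a pyspark `LabeledPoint`:

```python
def parse_data(line, label_fn, pos_label):
    """Parse a comma-separated record: field 0 is the raw class label,
    the remaining fields are pixel intensities scaled from [0, 255]
    to [0, 1].

    label_fn maps (raw_label, pos_label) to the training label,
    e.g. 1.0/0.0 for a binary classifier.
    """
    fields = line.strip().split(',')
    label = label_fn(int(fields[0]), pos_label)
    features = [float(v) / 255.0 for v in fields[1:]]
    return label, features
```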


vol = "{model_repo}/{name}/{version}".format(
model_repo=MODEL_REPO, name=name, version=version)
print("Vol is: %s" % vol)
Contributor

Consider removing debug print statement.

Contributor Author

done

@AmplabJenkins
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/397/

@dcrankshaw
Contributor Author

jenkins test this please

@AmplabJenkins
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/398/

@AmplabJenkins
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/399/

@AmplabJenkins
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/400/

@AmplabJenkins
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/401/

@AmplabJenkins
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/402/

Contributor
@Corey-Zumar Corey-Zumar left a comment

Just a couple minor comments. This works well!

correct_input_type = rpc.input_type_to_string(self.input_type)
print(
"Attempted to use prediction function for input type {incorrect_input_type}.\
This model-container was configured accept data for input type {correct_input_type}"
Contributor

Nit: Missing "to". This should be

Attempted to use prediction function for input type {incorrect_input_type}, but this model-container was configured to accept data for input type {correct_input_type}


CMD ["/container/pyspark_container_entry.sh"]

# vim: set filetype=dockerfile:
Contributor

Remove commented code

Contributor Author

This actually isn't commented code; it's a vim modeline directive: http://vim.wikia.com/wiki/Modeline_magic

Contributor
@Corey-Zumar Corey-Zumar left a comment

LGTM

@Corey-Zumar Corey-Zumar merged commit 946fc1d into ucbrise:develop Jun 5, 2017
@dcrankshaw dcrankshaw mentioned this pull request Jun 8, 2017