jupyter notebooks for transmogrify samples #231

Merged 2 commits on Mar 11, 2019

rajdeepd
Contributor

Related issues
Refer to issue(s) addressed in this pull request from [Issues]

#211

Describe the proposed solution
Jupyter notebook samples built on the BeakerX kernel

Describe alternatives you've considered

Apache Toree and almond-sh based kernels; these were not as straightforward as BeakerX

Additional context
none

codecov bot commented Feb 24, 2019

Codecov Report

Merging #231 into master will decrease coverage by 4.01%.
The diff coverage is n/a.

@@            Coverage Diff            @@
##           master    #231      +/-   ##
=========================================
- Coverage   86.42%   82.4%   -4.02%     
=========================================
  Files         312     312              
  Lines       10187   10187              
  Branches      336     548     +212     
=========================================
- Hits         8804    8395     -409     
- Misses       1383    1792     +409

| Impacted Files | Coverage Δ |
|---|---|
| ...alesforce/op/cli/gen/templates/SimpleProject.scala | 0% <0%> (-100%) ⬇️ |
| .../scala/com/salesforce/op/cli/gen/ProblemKind.scala | 0% <0%> (-100%) ⬇️ |
| ...cala/com/salesforce/op/cli/gen/FileInProject.scala | 0% <0%> (-100%) ⬇️ |
| ...in/scala/com/salesforce/op/cli/CommandParser.scala | 0% <0%> (-98.12%) ⬇️ |
| ...cala/com/salesforce/op/cli/gen/ProblemSchema.scala | 0% <0%> (-96.56%) ⬇️ |
| ...src/main/scala/com/salesforce/op/cli/gen/Ops.scala | 0% <0%> (-94%) ⬇️ |
| ...ain/scala/com/salesforce/op/cli/SchemaSource.scala | 0% <0%> (-87.94%) ⬇️ |
| ...a/com/salesforce/op/cli/gen/ProjectGenerator.scala | 0% <0%> (-87.5%) ⬇️ |
| ...in/scala/com/salesforce/op/cli/CliParameters.scala | 0% <0%> (-80.96%) ⬇️ |
| ...src/main/scala/com/salesforce/op/cli/CliExec.scala | 0% <0%> (-80%) ⬇️ |

... and 3 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 4da94f8...f786cb9.

codecov bot commented Feb 24, 2019

Codecov Report

Merging #231 into master will decrease coverage by 4%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master     #231      +/-   ##
==========================================
- Coverage   86.39%   82.39%   -4.01%     
==========================================
  Files         312      312              
  Lines       10183    10183              
  Branches      335      548     +213     
==========================================
- Hits         8798     8390     -408     
- Misses       1385     1793     +408

| Impacted Files | Coverage Δ |
|---|---|
| ...alesforce/op/cli/gen/templates/SimpleProject.scala | 0% <0%> (-100%) ⬇️ |
| .../scala/com/salesforce/op/cli/gen/ProblemKind.scala | 0% <0%> (-100%) ⬇️ |
| ...cala/com/salesforce/op/cli/gen/FileInProject.scala | 0% <0%> (-100%) ⬇️ |
| ...in/scala/com/salesforce/op/cli/CommandParser.scala | 0% <0%> (-98.12%) ⬇️ |
| ...cala/com/salesforce/op/cli/gen/ProblemSchema.scala | 0% <0%> (-96.56%) ⬇️ |
| ...src/main/scala/com/salesforce/op/cli/gen/Ops.scala | 0% <0%> (-94%) ⬇️ |
| ...ain/scala/com/salesforce/op/cli/SchemaSource.scala | 0% <0%> (-87.94%) ⬇️ |
| ...a/com/salesforce/op/cli/gen/ProjectGenerator.scala | 0% <0%> (-87.5%) ⬇️ |
| ...in/scala/com/salesforce/op/cli/CliParameters.scala | 0% <0%> (-80.96%) ⬇️ |
| ...src/main/scala/com/salesforce/op/cli/CliExec.scala | 0% <0%> (-80%) ⬇️ |

... and 3 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 05963bc...2f7b5e0.

Collaborator

tovbinm commented Feb 25, 2019

@rajdeepd please move the notebook files into the `helloworld/notebooks` folder

rajdeepd force-pushed the jupyter branch 3 times, most recently from 53e0571 to 05215ef (February 26, 2019 10:26)
@rajdeepd
Contributor Author

@tovbinm done

Collaborator

tovbinm commented Feb 26, 2019

  1. Please clear the evaluation results from the notebooks and keep only the commands.
  2. Between the commands, please copy over the docs from https://docs.transmogrif.ai/en/stable/examples/index.html for each notebook.

Contributor Author

rajdeepd commented Mar 1, 2019

@tovbinm made the changes

@@ -0,0 +1,128 @@
{
Collaborator

why do we need ListOfFiles.ipynb?

Contributor Author

test notebook to make sure training data is mounted in the container path, will remove it

Contributor Author

removed

Contributor Author

rajdeepd commented Mar 5, 2019

@tovbinm please review

Collaborator

leahmcguire left a comment

This is awesome!! Thanks for contributing!


```bash
docker run -p 8888:8888 -v /Users/rdua/work/github/rajdeepd/TransmogrifAI/helloworld/notebooks:/home/beakerx/helloworld-notebooks \
-v /Users/rdua/work/github/rajdeepd/TransmogrifAI/helloworld:/home/beakerx/helloworld beakerx/beakerx
```
Collaborator

maybe just change the example to use `$TransmogrifaiPATH`

Contributor Author

done

"cell_type": "markdown",
"metadata": {},
"source": [
"After model has been fitted we use `scoreAndEvaluate()` function to evaluate the metrics"
Collaborator

it would be nice to mention that you can swap out the data before doing this, either by setting a new path or by setting a new reader

Contributor Author

changed

"cell_type": "markdown",
"metadata": {},
"source": [
"After model has been fitted we use scoreAndEvaluate() function to evaluate the metrics"
Collaborator

it would be nice to mention that you can swap out the data before doing this, either by setting a new path or by setting a new reader

Contributor Author

changed

@@ -0,0 +1,100 @@
# Transmogrify on Jupyter
Collaborator

TransmogrifAI on Jupyter

Contributor Author

done

@@ -0,0 +1,100 @@
# Transmogrify on Jupyter

In this section we will look at how Transmogrify can be run within Scala notebooks on
Collaborator

TransmogrifAI

Contributor Author

done

In this section we will look at how Transmogrify can be run within Scala notebooks on
Jupyter.

We are going to leverage [BeakerX](http://beakerx.com/) scala kernel for Jupyter
Collaborator

Scala kernel

Contributor Author

done


* Apache Maven
* Python 3
* JDK 8 (JDK 10 or above can cause issues with Transmogrify)
Collaborator

simply JDK 8 only - don't mention anything else

Contributor Author

done

Installation using pip

```$xslt
sudo pip install beakerx
```
Collaborator

does this have to be sudo? (just curious)

Contributor Author

sometimes `pip install` fails because of a permissions issue

BeakerX provides a [docker container image](https://hub.docker.com/r/beakerx/beakerx/) on docker hub.

Assuming your Transmogrify source code is downloaded at `/Users/rdua/work/github/rajdeepd/TransmogrifAI`. You can use
the following command to start the container. We need the directory above so that we can mount sample notebooks and dataset
Collaborator

Which command are you referring to?

Contributor Author

docker run command - updated the doc


### Set TransmogrifaiPATH

export TransmogrifaiPATH=<transmogrify installation dir>
Collaborator

escape with `

Collaborator

export TransmogrifaiPATH=<TransmogrifAI installation dir>

Contributor Author

fixed
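
For readers following the thread, a minimal sketch of setting the variable and sanity-checking it before starting the container. The helper name `check_transmogrifai_path` and the `$HOME/TransmogrifAI` fallback are assumptions made for illustration and are not part of the README under review; the only detail taken from this PR is that the notebooks live under `helloworld/notebooks`.

```bash
# Assumption: this default is only a placeholder for your own checkout location.
export TransmogrifaiPATH="${TransmogrifaiPATH:-$HOME/TransmogrifAI}"

# Hypothetical helper: succeeds only if the directory contains helloworld/notebooks.
check_transmogrifai_path() {
  [ -n "$1" ] && [ -d "$1/helloworld/notebooks" ]
}

if check_transmogrifai_path "$TransmogrifaiPATH"; then
  echo "TransmogrifaiPATH looks usable: $TransmogrifaiPATH"
else
  echo "helloworld/notebooks not found under: $TransmogrifaiPATH" >&2
fi
```

Checking the directory up front may be useful because a wrong path would otherwise only show up later as an empty folder inside the container.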

### Run the beakerx Container

```bash
docker run -p 8888:8888 -v $TransmogrifaiPATH/helloworld/notebooks:/home/beakerx/helloworld-notebooks \
```
Collaborator

why do we need to mount both paths?

Contributor Author

One mount is for the Jupyter notebooks, the other is for the data. The data lives under `helloworld`, so it cannot be reached through the notebook mount point.
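
To make the two mounts concrete, here is a sketch that assembles the `docker run` command discussed in this PR and prints it instead of executing it. `NOTEBOOK_MOUNT`, `DATA_MOUNT`, `beakerx_command`, and the `$HOME/TransmogrifAI` fallback are names assumed for illustration; the port, container paths, and image name are the ones quoted earlier in this review.

```bash
# Assumption: placeholder default for the checkout location.
TransmogrifaiPATH="${TransmogrifaiPATH:-$HOME/TransmogrifAI}"

# Mount 1: the notebooks themselves.
NOTEBOOK_MOUNT="$TransmogrifaiPATH/helloworld/notebooks:/home/beakerx/helloworld-notebooks"
# Mount 2: the whole helloworld tree, so the notebooks can read the sample data.
DATA_MOUNT="$TransmogrifaiPATH/helloworld:/home/beakerx/helloworld"

# Hypothetical helper: prints the command rather than running it.
beakerx_command() {
  echo "docker run -p 8888:8888 -v $NOTEBOOK_MOUNT -v $DATA_MOUNT beakerx/beakerx"
}

beakerx_command
```

Printing first lets you confirm both host paths before the container starts. If the checkout path contains spaces, the mounts would need quoting when the command is run.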


On opening the image in the browser you will notice that in the home page

![notebook_home][notebooks_home]
Collaborator

missing image notebook_home?

Contributor Author

fixed


#### OpTitanicSimple

[OpTitanicSimple.ipynb](http://localhost:8888/notebooks/helloworld-jupyter/OpTitanicSimple.ipynb)
Collaborator

are you sure that the links for the notebooks are correct?

Contributor Author

thanks for pointing that out

"outputs": [],
"source": [
"val fittedWorkflow = workflow.train()\n",
"println(s\"Summary: ${fittedWorkflow.summary()}\")"
Collaborator

println(s\"Summary: ${fittedWorkflow.summaryPretty()

Contributor Author

fixed

"outputs": [],
"source": [
"val fittedWorkflow = workflow.train()\n",
"println(s\"Summary: ${fittedWorkflow.summary()}\")"
Collaborator

same here - println(s\"Summary: ${fittedWorkflow.summaryPretty()

Contributor Author

fixed

Collaborator

leahmcguire left a comment

LGTM!

Contributor Author

rajdeepd commented Mar 9, 2019

@tovbinm @leahmcguire please merge

tovbinm merged commit 6a593f6 into salesforce:master on Mar 11, 2019
Collaborator

tovbinm commented Mar 11, 2019

Thank you @rajdeepd

tovbinm mentioned this pull request on Apr 10, 2019
tovbinm mentioned this pull request on Jul 11, 2019
@salesforce-cla

Thanks for the contribution! Unfortunately we can't verify the commit author(s): Leah McGuire <l***@s***.com>. One possible solution is to add that email to your GitHub account. Alternatively you can change your commits to another email and force push the change. After getting your commits associated with your GitHub account, refresh the status of this Pull Request.
