feat: Added two classification examples using Vowpal Wabbit #733

Merged (10 commits) on Nov 21, 2019

@chenhuims (Contributor) commented Nov 12, 2019

  • One example uses the Sentiment140 data for Twitter sentiment classification
  • The other example applies the VW algorithm to the Adult Census dataset

welcome bot commented Nov 12, 2019

💖 Thanks for opening your first pull request! 💖 We use semantic commit messages to streamline the release process. Before your pull request can be merged, you should make sure your first commit and PR title start with a semantic prefix. This helps us to create release messages and credit you for your hard work!
Examples of commit messages with semantic prefixes:

  • fix: Fix LightGBM crashes with empty partitions
  • feat: Make HTTP on Spark back-offs configurable
  • docs: Update Spark Serving usage
  • build: Add codecov support
  • perf: improve LightGBM memory usage
  • refactor: make python code generation rely on classes
  • style: Remove nulls from CNTKModel
  • test: Add test coverage for CNTKModel
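
The prefix rule above could be checked mechanically before pushing. A minimal Python sketch, assuming the allowed prefixes are exactly the eight shown in the examples (the project's real list may differ) and that a prefix is followed by a colon and a space; `has_semantic_prefix` is a hypothetical helper, not part of the repository's tooling:

```python
import re

# Prefixes taken from the examples in this comment; the project's actual
# list may be longer (assumption).
SEMANTIC_PREFIXES = ("fix", "feat", "docs", "build", "perf", "refactor", "style", "test")

# A title qualifies if it starts with "<prefix>: " followed by some text.
_PATTERN = re.compile(r"^(?:%s): \S" % "|".join(SEMANTIC_PREFIXES))


def has_semantic_prefix(title: str) -> bool:
    """Return True if the commit or PR title starts with a semantic prefix."""
    return _PATTERN.match(title) is not None


if __name__ == "__main__":
    for title in (
        "Added two classification examples using Vowpal Wabbit",
        "feat: Added two classification examples using Vowpal Wabbit",
    ):
        print(has_semantic_prefix(title), title)
```

Run against the two titles this PR has carried, the sketch rejects the original title and accepts the retitled one.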

Make sure to check out the developer guide for guidance on testing your change.

@drdarshan (Contributor) left a comment

Thank you for this contribution! Could you please remove the output cells (esp. ones with images) and resubmit the PR? I'll sign off.

@drdarshan previously approved these changes Nov 14, 2019

@drdarshan (Contributor) left a comment

Thanks again for your contribution.

@chenhuims changed the title from "Added two classification examples using Vowpal Wabbit" to "feat: Added two classification examples using Vowpal Wabbit" on Nov 14, 2019

@mhamilton723 (Collaborator) commented

/azp run

@azure-pipelines bot commented

Azure Pipelines successfully started running 1 pipeline(s).

codecov bot commented Nov 14, 2019

Codecov Report

Merging #733 into master will increase coverage by 11.56%.
The diff coverage is n/a.

@@             Coverage Diff             @@
##           master     #733       +/-   ##
===========================================
+ Coverage   68.56%   80.13%   +11.56%     
===========================================
  Files         230      230               
  Lines        9197     9197               
  Branches      504      504               
===========================================
+ Hits         6306     7370     +1064     
+ Misses       2891     1827     -1064

Impacted Files                                         Coverage Δ
...osoft/ml/spark/io/http/PartitionConsolidator.scala  95.55% <0%> (+2.22%) ⬆️
...m/microsoft/ml/spark/io/http/HTTPTransformer.scala  97.5% <0%> (+2.5%) ⬆️
...om/microsoft/ml/spark/lightgbm/LightGBMUtils.scala  94.93% <0%> (+2.53%) ⬆️
...com/microsoft/ml/spark/core/contracts/Params.scala  95.74% <0%> (+4.25%) ⬆️
...a/com/microsoft/ml/spark/io/http/HTTPClients.scala  57.14% <0%> (+5.35%) ⬆️
...icrosoft/ml/spark/downloader/ModelDownloader.scala  85.88% <0%> (+5.88%) ⬆️
...scala/com/microsoft/ml/spark/io/http/Parsers.scala  75% <0%> (+6.25%) ⬆️
...a/com/microsoft/ml/spark/lightgbm/TrainUtils.scala  91.03% <0%> (+8.96%) ⬆️
...n/scala/org/apache/spark/ml/param/ArrayParam.scala  70% <0%> (+10%) ⬆️
...rosoft/ml/spark/core/schema/BinaryFileSchema.scala  100% <0%> (+12.5%) ⬆️
... and 24 more
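
The headline percentages in this report can be reproduced from the Hits and Lines rows of the coverage diff. A small Python sketch; it assumes coverage is hits divided by lines and that the figures are truncated (not rounded) to two decimals, which matches the numbers shown here but is not something the report itself states:

```python
def coverage_percent(hits: int, lines: int) -> float:
    """Coverage as a percentage, truncated to two decimals (assumed display rule)."""
    # Integer arithmetic avoids floating-point surprises when truncating.
    return (hits * 10000 // lines) / 100


LINES = 9197          # "Lines" row, identical on both sides of the diff
MASTER_HITS = 6306    # "Hits" on master
PR_HITS = 7370        # "Hits" with #733 applied

master = coverage_percent(MASTER_HITS, LINES)
pr = coverage_percent(PR_HITS, LINES)
# The delta is derived from the extra hits, not from the two truncated figures.
delta = coverage_percent(PR_HITS - MASTER_HITS, LINES)

print(master, pr, delta)  # prints: 68.56 80.13 11.56
```

Subtracting the two truncated figures would give 11.57 instead, which is why the sketch derives the delta from the 1064 additional hits.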

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update dece5ae...bfc4e96.

@azure-pipelines bot commented

Azure Pipelines successfully started running 1 pipeline(s).

@mhamilton723 (Collaborator) commented

@chenhuims Looks like the data download and load didn't work. Your displayed df shows 0 rows. It might have something to do with the schema you provided. Try setting inferSchema to true instead of building and passing a schema.

@chenhuims (Contributor Author) commented

> @chenhuims Looks like the data download and load didn't work. Your displayed df shows 0 rows. It might have something to do with the schema you provided. Try setting inferSchema to true instead of building and passing a schema.

Thanks for checking. There was indeed some issue with data downloading. I will try to fix this issue.
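
An empty DataFrame like the one described in this exchange is easy to miss in a notebook that still runs to completion. One way to surface it early is to check the downloaded file before handing it to Spark. A minimal sketch using only the Python standard library; `count_data_rows` and `require_rows` are hypothetical helpers, and this is not the fix that was eventually applied in this PR:

```python
import csv
import os


def count_data_rows(path: str, has_header: bool = False) -> int:
    """Count non-empty CSV records in a local file."""
    with open(path, newline="", encoding="utf-8", errors="replace") as handle:
        rows = sum(1 for record in csv.reader(handle) if record)
    return max(rows - 1, 0) if has_header else rows


def require_rows(path: str, has_header: bool = False) -> int:
    """Raise early if the download is missing or contains no data rows."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"download did not produce a file: {path}")
    rows = count_data_rows(path, has_header)
    if rows == 0:
        raise ValueError(f"{path} has 0 data rows; check the download step")
    return rows
```

Calling `require_rows(local_path)` right after the download cell would stop the notebook with a clear message instead of letting later cells train on an empty dataset.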

@microsoft microsoft deleted a comment from azure-pipelines bot Nov 20, 2019
@microsoft microsoft deleted a comment from azure-pipelines bot Nov 20, 2019
@microsoft microsoft deleted a comment from azure-pipelines bot Nov 20, 2019

@chenhuims (Contributor Author) commented

@mhamilton723 I made a fix and the notebook runs without issue on my ADB workspace. However, the E2E test still failed. Could you grant me access to the ADB workspace of the testing pipeline so I can check the detailed logs?

@chenhuims (Contributor Author) commented

/azp run

@azure-pipelines bot commented

Azure Pipelines successfully started running 1 pipeline(s).

@microsoft microsoft deleted a comment from azure-pipelines bot Nov 21, 2019

@chenhuims (Contributor Author) commented

/azp run

@azure-pipelines bot commented

Azure Pipelines successfully started running 1 pipeline(s).

@mhamilton723 merged commit 3da1d14 into microsoft:master on Nov 21, 2019

welcome bot commented Nov 21, 2019

Congrats on merging your first pull request, we appreciate your support! 🎉🎉🎉
