Win build and app tutorials #1436

gmanlan · 2018-06-18T23:41:13Z

Tutorial 1: Up-to-date (VS2017/mlpack3.0.2) Windows Build Guide (doc/guide/build_windows.hpp)
Tutorial 2: Sample ML C++ App for Windows (doc/guide/sample_ml_app.hpp)
VS project: Sample App for Windows (doc/examples/sample-ml-app)

Tested using Win10, VS2017, latest version of mlpack, armadillo 8.500, boost 1.66

zoq

This is awesome, the windows workflow sounds so much easier this way.

zoq · 2018-06-19T19:47:20Z

doc/guide/build_windows.hpp

+and make sure you can use it from the Command Prompt (may need to add to the PATH)
+
+- Download the latest mlpack release from here:
+<a href="http://www.mlpack.org/files/mlpack-3.0.2.tar.gz">mlpack-3.0.2</a>


I was wondering whether it would be useful to provide an alias for the latest package, so that we don't have to update this tutorial each time we have a new release. @rcurtin what do you think?

Changed the fixed path - now using the generic download path so users can grab the latest stable release

I already go through doc/ when I do every release and increment any version numbers, so it's not a huge issue either way. Aliases are nice if we can do it; do you have a suggested way to do it?

zoq · 2018-06-19T19:50:03Z

doc/guide/build_windows.hpp

+
+@section build_instructions Windows build instructions
+
+- Unzip mlpack to "C:\mlpack\mlpack-3.0.2"


Is there any way to use an existing home directory? Something like ~/ on Linux.

This is tricky. We could use %userprofile% on Windows, but that directory is not very friendly for an easy-to-follow tutorial (and it changes depending on the Windows version). For simplicity, I think we should use the most basic directory we can imagine (there is also a note that says this path is just for reference). Long paths or paths with spaces may cause issues on Windows, so we are aiming for the safest option.

Let me know if you still want to change it.

zoq · 2018-06-19T19:55:14Z

doc/guide/sample_ml_app.hpp

+(i.e.: mlpack/tests/data/german.csv), assuming the labels don't require normalization.
+
+@code
+bool loaded = mlpack::data::Load("data/german.csv", dataset);


Should we define dataset and labels first? I guess people might copy each line and end up with some errors.

Yep, added the missing dataset definition.

zoq · 2018-06-19T19:58:10Z

doc/guide/sample_ml_app.hpp

+Row<size_t> predictions;
+rf.Classify(dataset, predictions);
+const size_t correct = arma::accu(predictions == labels);
+printf("\nTraining Accuracy: %f", (double(correct) / double(labels.n_elem)));


Hm, should we use std::cout here? Usually we don't use printf in the codebase.

Changed all the printf to cout - thanks
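
For reference, the accuracy line can be sketched with the standard library alone, so it compiles without Armadillo. This is only an illustration: `CountCorrect` and `AccuracyMessage` are hypothetical helper names standing in for `arma::accu(predictions == labels)` and the tutorial's output line, and are not code from this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: counts positions where the prediction equals the label.
// Stands in for arma::accu(predictions == labels) from the tutorial.
std::size_t CountCorrect(const std::vector<std::size_t>& predictions,
                         const std::vector<std::size_t>& labels)
{
  assert(predictions.size() == labels.size());
  std::size_t correct = 0;
  for (std::size_t i = 0; i < labels.size(); ++i)
  {
    if (predictions[i] == labels[i])
      ++correct;
  }
  return correct;
}

// Hypothetical helper: builds the message with stream insertion instead of
// printf. Assumes total > 0.
std::string AccuracyMessage(std::size_t correct, std::size_t total)
{
  std::ostringstream out;
  out << "\nTraining Accuracy: "
      << (static_cast<double>(correct) / static_cast<double>(total));
  return out.str();
}
```

The result can then be written with `std::cout << AccuracyMessage(correct, total);`.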

zoq · 2018-06-19T19:59:29Z

doc/guide/sample_ml_app.hpp

+Now that our model is trained and validated, we save it to a file so we can use it later.
+
+@code
+mlpack::data::Save("mymodel.xml", "model", rf, false,


Save should be able to derive the format from the filename.

Right, removed the unnecessary parameter
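
As a rough illustration of what deriving the format from the filename means, the sketch below maps a file extension to a format name. `FormatFromFilename` is a hypothetical helper, and the extension list (`.xml`, `.bin`, `.txt`) is an assumption made for the example; it is not mlpack's actual detection code.

```cpp
#include <string>

// Hypothetical helper: picks a format name from the text after the last '.'.
// Returns "unknown" when there is no extension or it is not recognized.
std::string FormatFromFilename(const std::string& filename)
{
  const std::size_t dot = filename.rfind('.');
  if (dot == std::string::npos || dot + 1 == filename.size())
    return "unknown";

  const std::string ext = filename.substr(dot + 1);
  if (ext == "xml") return "xml";
  if (ext == "bin") return "binary";
  if (ext == "txt") return "text";
  return "unknown";
}
```

With a helper like this, a caller only passes the filename and never has to state the format separately.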

rcurtin

Hey there Germán,

Thanks so much for taking the time to write this. I think it's a huge improvement to documentation and will help a lot of Windows users. I think it's really nice to have the Windows project in the repo ready to go too, so that users can base their own code off the example solution you've given. So definitely this is a much needed improvement to the state of the documentation.

Some minor comments:

- Do you want to add the BSD license to the various code files? You could also add your name as '@author' if you like.
- Would it be possible to use AppVeyor to ensure that the example can build? This would really help us ensure that the code doesn't go out of date, which will happen over the years as Visual Studio changes versions, etc.

Thanks again! 👍 (or should I use the 🚀 emoji? I am still figuring these things out)

rcurtin · 2018-06-20T14:28:43Z

doc/guide/build_windows.hpp

@@ -0,0 +1,95 @@
+/*! @page build_windows Building mlpack From Source
+
+@section build_intro Introduction


I think (I would have to check) that Doxygen section identifiers are global rather than scoped to individual pages, so this reference build_intro will collide with the one from the build page. I guess we should change these to, e.g., build_windows_intro, etc.

rcurtin · 2018-06-20T14:33:09Z

doc/guide/build_windows.hpp

@@ -0,0 +1,95 @@
+/*! @page build_windows Building mlpack From Source


Should we add "On Windows" to the page title?

rcurtin · 2018-06-20T14:41:14Z

doc/guide/sample_ml_app.hpp

@@ -0,0 +1,197 @@
+/*! @page sample_ml_app Sample C++ ML App


Might be a good idea to add something about Windows here too.

rcurtin · 2018-06-20T14:45:47Z

doc/guide/sample_ml_app.hpp

+@section sample_crossvalidation Cross-Validating
+
+To evaluate the classifier, we use K-Fold cross-validation. We also define which metric to use in order
+to assess the quality of the trained model.


Hm, so the Random Forest rf isn't actually used in this section. So maybe it would be better to restate it as:

Instead of training the Random Forest directly, we could also use k-fold cross-validation for training, which will give us a measure of performance on a held-out test set. This can give us a better estimate of how the model will perform when given new data.

(or something like that?)

rcurtin · 2018-06-20T14:48:36Z

doc/guide/sample_ml_app.hpp

+KFoldCV<RandomForest<GiniGain, RandomDimensionSelect>, Accuracy> cv(k, 
+	dataset, labels, numClasses);
+double cvAcc = cv.Evaluate(numTrees, minimumLeafSize);
+cout << "\nKFoldCV Accuracy: " << cvAcc;


Should we extract one of the models trained on cross-validation?

rf = cv.Model(); // this will get the model trained on the last fold

Alternatively, I guess we could mention that we could train on all of the training data, and that the k-fold CV is just to get an idea of the performance. I'm not picky, I'm just hoping to ensure that the example doesn't confuse anyone.

Right. Added a note for that.
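
To make the distinction concrete: each fold holds out part of the data for testing and trains on the rest, so the reported accuracy is only an estimate, and a final model can still be trained on all of the data afterwards. The sketch below only computes held-out index ranges, using contiguous blocks for simplicity. `Fold` and `KFoldRanges` are hypothetical names, and this is not how mlpack's `KFoldCV` is implemented.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical type: the half-open index range [begin, end) held out for testing.
struct Fold
{
  std::size_t begin;
  std::size_t end;
};

// Hypothetical helper: splits n points into k contiguous held-out ranges.
// When n is not divisible by k, the first n % k folds get one extra point.
std::vector<Fold> KFoldRanges(std::size_t n, std::size_t k)
{
  std::vector<Fold> folds;
  if (k == 0)
    return folds;

  const std::size_t base = n / k;
  const std::size_t extra = n % k;
  std::size_t start = 0;
  for (std::size_t i = 0; i < k; ++i)
  {
    const std::size_t size = base + (i < extra ? 1 : 0);
    folds.push_back({start, start + size});
    start += size;
  }
  return folds;
}
```

Every point lands in exactly one held-out range, so each one is used for testing once and for training in the other k - 1 folds.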

rcurtin · 2018-06-20T15:21:36Z

doc/guide/sample_ml_app.hpp

+Now that our model is trained and validated, we save it to a file so we can use it later.
+
+@code
+mlpack::data::Save("mymodel.xml", "model", rf, false);


It might be worth pointing out that we could also save as mymodel.bin which will be much smaller. The XML saves are huge because of all the XML tags :(

gmanlan · 2018-06-21T02:19:59Z

I'm thinking that using AppVeyor for the example VS project would require modifying the project configuration (i.e. paths) so it works with the build system. However, this would break the consistency between the example project and the 'sample_project_config' section of the 'sample_ml_app.hpp' tutorial. What do you think?

zoq · 2018-06-23T20:59:19Z

I guess we could use mlpack-latest instead of mlpack-3.0.2 or something similar to get the same paths, do you think that would be reasonable, or is there anything else I missed? If the path is the main reason, we could also create an alias as part of the build script.

rcurtin · 2018-06-25T15:40:16Z

I can set up an mlpack-latest.tar.gz link on mlpack.org for the sake of documentation, but I don't think it's a problem to hardcode mlpack-3.0.2 and then update it after each release. What do you think, would that work?

gmanlan · 2018-06-26T01:06:46Z

To reduce the overhead of updating the doc each time a new release is available, I have changed the mlpack download path to "http://www.mlpack.org/download.html" so it always refers to the latest version.

rcurtin · 2018-07-03T19:36:39Z

I talked to @gmanlan over email, I think maybe the best thing to do here is merge as-is, and create an issue for the VS build of the tutorial.

For that build, I think the best idea is to have a special VS project configuration that we can keep in some directory like .appveyor/. Then in appveyor.yml we can just copy that configuration into place, overwriting the existing configuration, and then run the example build as an extra step.

zoq · 2018-07-03T20:28:37Z

Sounds reasonable to me, no need to delay this really helpful tutorial any longer.

zoq

Looks good to me, no more comments from my side.

ShikharJ

I am +1 for this as well.

rcurtin

Great, I'll leave 3 days before merge for any more comments. When I merge, I'll open an issue for the AppVeyor build that we can handle some other time.

rcurtin · 2018-07-06T15:08:26Z

Thanks again for the contribution! I really appreciate it. I forgot to add: if you'd like to add your name to src/mlpack/core.hpp and COPYRIGHT.txt, please feel free and I will merge it! And if you'd like some mlpack stickers to put on your laptop, feel free to send Marcus or me an email with your mailing address and we will get them sent. :)

I opened #1463 for the build part.

gmanlan · 2018-07-07T00:31:25Z

Great - I'm glad it helps. I just realized that we have not linked/updated the main doc/tutorials page at http://www.mlpack.org/docs/mlpack-3.0.2/doxygen/tutorials.html, so I will be updating this soon to make sure users can find both Linux and Windows tutorials.

gmanlan added 4 commits June 12, 2018 17:15

windows build tutorial using up-to-date versions

b65a22a

sample end-to-end ML C++ app tutorial

9186526

sample end-to-end ML C++ app tutorial

f29fefb

Sample ML C++ App for Windows (tutorial source code)

2519c0f

zoq reviewed Jun 19, 2018

gmanlan added 4 commits June 19, 2018 16:08

std::cout and model saving minor improvements

28b94c8

fixing xcopy cmd issue using *

09985f3

generic path to download the latest stable release

d3214f7

minor code tweaks + dataset note

9de581c

rcurtin reviewed Jun 20, 2018

gmanlan added 3 commits June 20, 2018 18:55

+ license headers

68f7bc3

sections/title fixed

c8a4abc

+ notes and clarifications on cv and model saving

65dc007

zoq approved these changes Jul 3, 2018

ShikharJ approved these changes Jul 3, 2018

rcurtin approved these changes Jul 3, 2018

rcurtin merged commit e3fe135 into mlpack:master Jul 6, 2018

rcurtin mentioned this pull request Jul 6, 2018

AppVeyor build for Windows tutorial #1463

Closed

