added install.R. runtime.txt, and R notebook #129

Closed

abernauer wants to merge 2 commits into mlpack:master from abernauer:MovieLens-prediction-notebook-R

abernauer commented Oct 21, 2020

Added the R notebook
Updated the binder directory with a runtime.txt file containing the R version.
Updated the binder directory with an install.R file listing the packages to install.


          added install.R. runtime.txt, and R notebook

13fac44

mlpack-bot bot added s: needs review s: unanswered s: unlabeled labels

mlpack-bot bot commented Oct 21, 2020

Thanks for opening your first pull request in this repository! Someone will review it when they have a chance. In the meantime, please be sure that you've handled the following things, to make the review process quicker and easier:

All code should follow the style guide
Documentation added for any new functionality
Tests added for any new functionality
Tests that are added follow the testing guide
Headers and license information added to the top of any new code files
HISTORY.md updated if the changes are big or user-facing
All CI checks should be passing

Thank you again for your contributions! 👍

review-notebook-app bot commented Oct 21, 2020

Check out this pull request on ReviewNB.

See visual diffs & provide feedback on Jupyter Notebooks.

zoq reviewed


binder/install.R Outdated

		@@ -0,0 +1,6 @@
		install.packages("mlpack")

Member

zoq Oct 21, 2020

I might be wrong, but I think mlpack isn't on CRAN (see mlpack/mlpack#2636). I guess the easiest workaround for now is to enable the R bindings in the conda package. On your local setup, did you build the R bindings manually?

Author

abernauer Oct 21, 2020

Good catch, I forgot about this when writing that line. You can install R packages from the command line using R CMD INSTALL mlpack.tar.gz. I got the bindings from a GitHub archive of the R bindings from yashwant.

Author

abernauer Oct 21, 2020

So that line should be removed. Yeah, enabling the R bindings in the conda package is probably the easiest. The other workaround is installing the package from the tarball with R CMD INSTALL, as I mentioned above.

Member

zoq Oct 23, 2020

I guess that, since installing the necessary packages takes a while, it makes sense to include the files in the conda package. I will update the package later and post an update here.

shrit added c: examples s: keep open and removed s: unanswered s: unlabeled labels

Author

abernauer left a comment (edited)

My bad, the comments were pending.

Member

birm commented Oct 26, 2020

Even if unintentional, this works towards #97. 🥇
This is awesome, but I don't feel I can give an approval since I learned exactly enough R to marginally pass assignments where it was needed 🙃

Author

abernauer commented Oct 27, 2020

@birm Yes, sort of unintentional. I had fun writing the example, though, and would love to adapt some of the other examples for the R bindings.


          removed install.packages(mlpack)

9884ce6

Author

abernauer commented Oct 28, 2020

Pushed the commit addressing the comments by @zoq 😄

rcurtin reviewed


movie_lens_prediction_with_cf/movie-lens-cf-r.ipynb

    
            @@ -0,0 +1,933 @@
          
              {

Member

rcurtin Nov 11, 2020

I think it would be really great if you added a quick header comment describing what this example does. You could use the exact same comment from the other notebooks.


Member

rcurtin Nov 11, 2020

The ratings-only.csv.gz file is 62MB, so this is a fairly large example. Maybe it would be good to mention that in case a user thinks that this notebook will only take a second to run? :)


Member

zoq Nov 11, 2020

Maybe we should switch to the small data set, as done for the other notebook. We can use the same settings, but it's much faster. In movie-lens-cf-py.ipynb I used https://lab.mlpack.org/data/MovieLens-small.zip, which is < 1MB.


Member

rcurtin Nov 11, 2020

Do you think we should put this comment just before we load the data? Also, it's probably good to introduce the whole idea of the MovieLens 20M dataset first. Marcus has a nice comment describing it in the C++ notebook. (Where 20M refers to the number of ratings.)

Also, some quick formatting changes might be helpful. For instance, "userId an integer id for the user" might be clearer as "userId, an integer id representing the user".


Member

rcurtin Nov 11, 2020

Let's :)


Member

rcurtin Nov 11, 2020

Peek not Peak :)


Member

rcurtin Nov 11, 2020

Ah, nice. If you want to use descriptive statistics, it's probably good to add a comment noting that this functionality comes from mlpack. Specifically, we are using mlpack to compute some nice statistics about each dimension in our data, but here there is only one dimension (the rating itself), so we get some nice statistics on the ratings.
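
To make the "one dimension" point concrete, here is a minimal plain-Python sketch of per-dimension summary statistics when the only dimension is the rating. It uses the standard `statistics` module rather than mlpack, and the `describe` helper and the sample ratings are made up for illustration; they are not from the notebook or the MovieLens data.

```python
import statistics

# Made-up ratings for illustration only; not taken from the dataset.
ratings = [4.0, 3.5, 5.0, 2.0, 4.0, 3.0, 4.5, 1.0]


def describe(values):
    """Return basic summary statistics for a single dimension.

    Hypothetical helper; this is not mlpack's implementation.
    """
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


summary = describe(ratings)
for name, value in summary.items():
    print(f"{name}: {value}")
```

With more than one dimension, the same helper would simply be applied to each column in turn.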


Member

rcurtin Nov 11, 2020

Above these cells it could be nice to add a comment block describing that we will train the model, that we want to do a split into a train/test set first, and we'll use mlpack tools for both of these tasks, etc. etc. Basically the reader will be scrolling through, and it's really helpful for them if we can describe why we're doing something instead of making them try and figure out why we did it. :)
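
The train/test split mentioned here could be explained to the reader with something like the following plain-Python sketch. It only illustrates the idea of holding out a fraction of the ratings before training; it does not use mlpack's own split tool, and the `train_test_split` helper, the 0.1 test ratio, the seed, and the sample (userId, movieId, rating) triples are all assumptions made for the example.

```python
import random


def train_test_split(rows, test_ratio=0.1, seed=42):
    """Shuffle the rows and hold out a fraction of them as a test set.

    Hypothetical helper for illustration; not mlpack's implementation.
    """
    if not 0.0 < test_ratio < 1.0:
        raise ValueError("test_ratio must be strictly between 0 and 1")
    shuffled = list(rows)                  # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_test = max(1, round(len(shuffled) * test_ratio))
    # First n_test shuffled rows form the test set, the rest the training set.
    return shuffled[n_test:], shuffled[:n_test]


# Made-up (userId, movieId, rating) triples: 5 users x 4 movies.
ratings = [(u, m, float((u * m) % 5 + 1)) for u in range(1, 6) for m in range(1, 5)]

train, test = train_test_split(ratings, test_ratio=0.1)
print(len(train), "training ratings,", len(test), "test ratings")
```

The model would then be trained on `train` only, with `test` kept aside to check how well the predicted ratings match ratings the model has never seen.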


Member

rcurtin Nov 11, 2020

Looks like maybe one more space is needed so the arguments to cf() align? Or maybe that is just my browser.

What does the # model comment mean?


Member

rcurtin Nov 11, 2020

let's :)


Member

rcurtin Nov 11, 2020 (edited)

I actually don't recognize a single one of these movies (that doesn't mean they aren't good 😄). I see that in the Python notebook Marcus used user 2; maybe user 2 has less obscure taste? :) (Marcus also printed all of the movies that the user had previously rated, so that the reader can see the correlation between the two. That would probably also be helpful here.)


Member

zoq Nov 11, 2020

Same, I don't know any of the listed movies. Maybe that means I should watch more movies. Also, the movies are quite random; I don't really see a pattern :)

Member

rcurtin commented Nov 11, 2020

Hey @abernauer, thanks for taking the time to implement this! It looks great. I left a bunch of comments through ReviewNB; hopefully they're helpful. I do have some slight (but not serious) concern about the CF model itself; I wonder if we need to tune it with different parameters so that the movies it shows are more common. Probably the rank parameter is what would need to be tuned. Maybe worth a shot to see what happens and if the results look qualitatively better or worse? 👍

Member

shrit commented Jul 3, 2024

@abernauer this is nice work, but I will have to close this PR. We did a major refactoring of this repository, and I doubt that it will be easy to resolve all the merge conflicts. If you are still interested, feel free to open this PR again.

shrit closed this


Labels

c: examples s: keep open s: needs review