#TidyTuesday hotel bookings and recipes | Julia Silge #26
Comments
Thank you for sharing these amazing techniques! I loved the skim function in particular. I got stuck on the GGally part though; I wasn't able to install it by running # Github I'm new to RStudio, but I hope to learn more from your amazing videos. Cheers, |
@jstello Try installing it straight from CRAN via |
Hey Julia, how do you get your code to look so neat and formatted? Is there an RStudio functionality that helps format your code as you type? |
Error: The first argument to [fit_resamples()] should be either a model or workflow. I don't know how to shake this error, even when I copy your code exactly. |
@ntihemuka I do make heavy use of one of the RStudio shortcuts to reindent lines, which helps with how code looks a lot. I select all (command-A on a mac) and then reindent (command-I). You can see lots of shortcuts here (https://support.rstudio.com/hc/en-us/articles/200711853-Keyboard-Shortcuts). The other thing I do is try to follow tidyverse style (https://style.tidyverse.org/) most of the time, but I'm not perfect on that. This blog post is older and predates a change in tune where now the first argument to functions like tune_grid() or fit_resamples() needs to be a model or a workflow; be sure to put that first now. If you want to see an updated version of this analysis, check out this Get Started article on tidymodels.org (https://www.tidymodels.org/start/case-study/). |
thanks! |
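To illustrate the argument-order point from the reply to @ntihemuka above, here is a minimal sketch of a fit_resamples() call with the model specification first. It assumes the tidymodels packages are installed; the mtcars data, the formula, and the object names (car_folds, lm_spec, lm_res) are stand-ins chosen for illustration and are not the code from the blog post.

```r
library(tidymodels)

set.seed(123)
# Monte Carlo cross-validation on a built-in dataset (illustrative only)
car_folds <- mc_cv(mtcars, prop = 0.8, times = 5)

# A simple model specification
lm_spec <- linear_reg() %>%
  set_engine("lm")

# Model (or workflow) goes first, then the preprocessor, then the resamples
lm_res <- fit_resamples(
  lm_spec,
  mpg ~ wt + hp,
  resamples = car_folds
)

collect_metrics(lm_res)
```

Passing the formula or recipe as the first argument, as older posts did, is what triggers the "should be either a model or workflow" error quoted earlier in the thread.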
Hi Dr. Silge, I tried this example from the website https://www.tidymodels.org/start/case-study/ and noticed an issue with the engine arguments. It appears you can't pass engine-specific arguments like "num.threads" or "importance = impurity" with the new workflow syntax. It does work with the old set_engine syntax. |
@gunnergalactico That is correct and as expected; you can only set engine-specific arguments within |
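As a sketch of the set_engine() route that @gunnergalactico reports working, the following passes num.threads and importance as engine arguments on a ranger random forest and then uses that specification inside a workflow. It assumes the ranger package is installed; the iris data, the number of trees, and the thread count are illustrative choices, not values from the thread.

```r
library(tidymodels)

# Engine-specific arguments are supplied in set_engine(),
# alongside the engine name
rf_spec <- rand_forest(trees = 500) %>%
  set_engine("ranger", num.threads = 2, importance = "impurity") %>%
  set_mode("classification")

# The specification can then be added to a workflow as usual
rf_wf <- workflow() %>%
  add_formula(Species ~ .) %>%
  add_model(rf_spec)

set.seed(123)
rf_fit <- fit(rf_wf, data = iris)

# Confirm the engine argument reached ranger
extract_fit_engine(rf_fit)$importance.mode
```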
Hi, I think that KNN is only for classification on the training data, and it shouldn't be used to predict for a new dataset (testing data). What do you think about it? Thank you and best regards |
@nguyenlovesrpy A nearest neighbor model can definitely be used to predict for a new dataset; check out examples here for both regression and classification. |
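A small sketch of predicting on new data with a nearest neighbor model, here for regression. It assumes the kknn package is installed; mtcars, the formula, and neighbors = 5 are illustrative choices and not taken from the blog post.

```r
library(tidymodels)

set.seed(234)
car_split <- initial_split(mtcars, prop = 0.75)
car_train <- training(car_split)
car_test  <- testing(car_split)

# Nearest neighbor specification; neighbors = 5 is an arbitrary example value
knn_spec <- nearest_neighbor(neighbors = 5) %>%
  set_engine("kknn") %>%
  set_mode("regression")

# Fit on the training data only
knn_fit <- fit(knn_spec, mpg ~ wt + hp, data = car_train)

# Predict for observations the model has never seen
knn_preds <- predict(knn_fit, new_data = car_test)
knn_preds
```

Each test row is predicted from its nearest rows in the training set, so the fitted model carries the training data with it rather than being limited to it.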
Hello. First of all, thank you for all these videos, they are really helpful! I have a question about the outcome in the confusion matrix. What are we evaluating exactly? Because when I sum the observations in the confusion matrix there are 22,900 observations, whereas the test set has 18,792 and the training set has 56,374. Why is this? |
Hello again. I think I figured it out. It is because of the Monte Carlo CV, which in this case holds out 10% of the data for validation 25 times, so we end up with 250% of the observations in the training set. |
Yep, those predictions that are used in the confusion matrix are from the 25-fold resampling, where the predictions are on the held out (or "assessment") observations in each resample. You may be interested in trying out the |
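The arithmetic in the comments above can be checked directly. This is a small sketch in base R that uses only numbers quoted in the thread (25 resamples, 10% held out each time, 22,900 predictions in the confusion matrix); the variable names are made up for illustration.

```r
n_resamples       <- 25     # number of Monte Carlo resamples
prop_assess       <- 0.10   # share of rows held out in each resample
total_predictions <- 22900  # sum of the confusion matrix cells

# Total predictions = rows resampled * share held out * number of resamples,
# so the number of rows that were resampled can be backed out
implied_rows <- total_predictions / (n_resamples * prop_assess)
implied_rows
# 9160
```

The implied 9,160 rows is well below the 56,374 rows of the full training set, which would be consistent with the resamples having been built from a downsampled version of the training data. That reading is an inference from the arithmetic, not something stated in the thread.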
Hi Julia, how does the knn model estimate the correct number of neighbors k? Does the model use a default value? |
@rcientificos You can check out details like that in the documentation for |
Thank you! What is the alternative for step_downsample in recipes? Or do I have to use the themis package? |
@rcientificos Yes, that's right. The function from |
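A minimal sketch of downsampling with step_downsample() from themis, assuming both tidymodels and themis are installed. The toy data frame below is fabricated purely for illustration (the column names children and adr are borrowed from the hotels dataset, but the values are random).

```r
library(tidymodels)
library(themis)

set.seed(345)
# Hypothetical imbalanced data: 20 "children" rows vs 180 "none" rows
toy <- tibble(
  children = factor(c(rep("children", 20), rep("none", 180))),
  adr      = rnorm(200, mean = 100, sd = 20)
)

# step_downsample() now lives in themis rather than recipes
toy_rec <- recipe(children ~ adr, data = toy) %>%
  step_downsample(children)

# Prep the recipe and pull out the processed training data
toy_down <- toy_rec %>%
  prep() %>%
  bake(new_data = NULL)

count(toy_down, children)
```

With the default settings the majority class is sampled down to the size of the minority class, so both levels end up with 20 rows here.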
Hello Julia, I noticed that you use the juiced data when you make the resamples in this vlog:
Am I correct that, to avoid leakage caused by It is a small point but I am thinking this is the modern simple example code:
Do I have this right? |
Yes @RaymondBalise that's right. You can see that the article here using the same hotel data takes an approach more like what you describe than what I have here. |
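The code from the exchange above did not survive in this thread, so the following is only a sketch of the general pattern being discussed, under the assumption that the concern is preprocessing being estimated on the whole training set before resampling. The raw training data is resampled, and the recipe is bundled with the model in a workflow so that it is re-estimated inside each resample. It assumes the kknn package is installed; iris and all object names are illustrative.

```r
library(tidymodels)

set.seed(456)
iris_split <- initial_split(iris, strata = Species)
iris_train <- training(iris_split)

# Resample the raw training data, not a prepped ("juiced") version of it
iris_folds <- mc_cv(iris_train, prop = 0.9, times = 5, strata = Species)

# Unprepped recipe: its statistics are estimated within each resample
iris_rec <- recipe(Species ~ ., data = iris_train) %>%
  step_normalize(all_numeric_predictors())

knn_spec <- nearest_neighbor() %>%
  set_engine("kknn") %>%
  set_mode("classification")

iris_wf <- workflow() %>%
  add_recipe(iris_rec) %>%
  add_model(knn_spec)

# The workflow is the first argument; the resamples follow
iris_res <- fit_resamples(iris_wf, resamples = iris_folds)
collect_metrics(iris_res)
```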
Last week I published my first screencast showing how to use the tidymodels framework for machine learning and modeling in R. Today, I’m using this week’s #TidyTuesday dataset on hotel bookings to show how to use one of the tidymodels packages recipes with some simple models!
https://juliasilge.com/blog/hotels-recipes/