Feedback #30

Open
DataStrategist opened this issue Sep 11, 2017 · 7 comments

Comments

@DataStrategist

Outstanding! Congratulations, this is really outstanding work, and I'm contemplating introducing this in my workplace!

I have some feedback though:

  1. I love how you have https://radiant-rstats.github.io/docs/design/doe.html, which perfectly mimics the app. That's awesome and a great idea! I think a bit of functionality could be added by putting a hyperlink with a question mark on each panel (perhaps next to the Print/Download icon that exists on each page?). This would provide a quick reference to each help item. Failing this, you could easily have a HELP menu that links to the documentation, along with an ABOUT tab and version info.

  2. I'm not too fond of the Data -> Manage visual. It's for all intents and purposes a crappier version of what you see in Data -> View, and for large datasets it doesn't really show anything interesting. Could I recommend that this first view contain three things (obviously linked to checkboxes): str(df), summary(df) and, as a visual, this: https://ropensci.org/blog/blog/2017/08/22/visdat (though you might prefer for this to be a separate tab).

  3. Two omissions in the Data Science portfolio that I noticed are Random Forests and PCA. Random Forests would obviously go under Modelling, and PCA ... tough. I would want to give it some visibility... but it probably makes most sense under a new menu: Data -> Reduce? I don't know...

  4. For the correlation plot, if you use PerformanceAnalytics::chart.Correlation, it even lets you group by another variable... check for example: PerformanceAnalytics::chart.Correlation(iris[-5],pch=21, bg='Species'), as well as showing some more stats. What do you think? I mean, this one is full of warnings, but it's cool, no?

  5. To connect to databases, I don't know if you have seen that RStudio has this new odbc package with a bunch of new features... I think, probably without too much trouble, we could let people save code snippets with the ODBC data needed to connect to their dbs... and as soon as they load that, we could populate "Datasets" with the tables in the db.

Anyway... don't want to overwhelm w/ too much stuff... I'm just excited! Please let me know what you think about all this stuff... if you are interested we can probably split up this issue into sub-issues. I would be willing to help with some of this stuff, if you want! What say you?
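A minimal sketch of what item 2 asks for, using base R only: `str()`, `summary()`, and a per-column overview of type and missingness as a rough stand-in for a visdat-style view. The `overview()` helper is hypothetical and is not part of Radiant or visdat.

```r
# Hypothetical helper (not part of Radiant): one row per column with its
# class and how many values are missing.
overview <- function(df) {
  data.frame(
    variable = names(df),
    class = vapply(df, function(x) class(x)[1], character(1)),
    n_missing = vapply(df, function(x) sum(is.na(x)), integer(1)),
    pct_missing = round(100 * vapply(df, function(x) mean(is.na(x)), numeric(1)), 1),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

df <- iris
df[c(3, 7), "Sepal.Length"] <- NA  # inject two missing values for illustration

str(df)             # compact structure of the data frame
print(summary(df))  # per-column summary statistics
ov <- overview(df)
print(ov)
```

The three outputs could each sit behind a checkbox or radio button, as suggested above.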

@vnijs
Contributor

vnijs commented Sep 12, 2017

Thanks for the input @mexindian! Replies below:

  1. The navbar in Radiant has a ? icon and each page also has a ? on the bottom left of the page. Did you miss that or did I misunderstand your comment?

  2. Nice idea! I added some radio buttons for preview, str, and summary. Note that instead of base R's summary function I'm using getsummary from radiant.data. visdat is nice but I'm not sure about adding another dependency for this. There are quite a few ways to visualize the data in Data > Visualize and Data > Pivot already.

  3. I plan to start on Random Forest soon-ish and perhaps XGBoost. PCA and Factor analysis are available in the Multivariate menu.

  4. Nice correlation plot but I'm not sure about adding another dependency. Also, the problem with these types of plots is that the scatter plots take really long to render for anything but smallish data. The correlation plot in Radiant only uses 1K datapoints for the scatter plots by default to limit this issue. Of course, the correlation estimates shown in the plot are based on all data. Getting histograms for all variables is possible through Data > Visualize under the "Distribution" plot type.

  5. This is a very important issue. I've had quite a few people ask about this, and ideally you'd be able to access RStudio's dialogs directly. See "Feature request: Access to Data Import and DB Connection dialogs from Shiny / Gadget" (rstudio/rstudioapi#63). Right now you would load the data through RStudio, start Radiant, and then use "from global workspace" in Data > Manage to access the data. You could of course also put the DB access code in R > Report yourself. I'm open to ideas and suggestions.

Thanks again @mexindian !
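The approach described in item 4 (estimates from all rows, scatter plots from a sample of about 1K points) could look roughly like the sketch below. This is not Radiant's actual implementation; the function name, the `n_plot` default, and the seed are assumptions made for illustration.

```r
# Hypothetical helper: correlations use every row, plotting uses a sample.
cor_with_sampled_plot <- function(df, n_plot = 1000, seed = 1234) {
  # keep numeric columns only
  num <- df[vapply(df, is.numeric, logical(1))]
  # correlation estimates are based on all available data
  estimates <- cor(num, use = "pairwise.complete.obs")
  # draw at most n_plot rows for the (slow) scatter plots
  set.seed(seed)
  rows <- if (nrow(num) > n_plot) sample(nrow(num), n_plot) else seq_len(nrow(num))
  list(estimates = estimates, plot_data = num[rows, , drop = FALSE])
}

# Simulated data, large enough that plotting every point would be slow
big <- data.frame(x = rnorm(50000))
big$y <- 0.5 * big$x + rnorm(50000)

res <- cor_with_sampled_plot(big)
print(res$estimates)

# Scatter plot matrix drawn from the sample only
if (interactive()) pairs(res$plot_data)
```

Sampling only affects what is drawn, so the numbers reported alongside the plot do not change with `n_plot`.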

vnijs added a commit to radiant-rstats/radiant.data that referenced this issue Sep 12, 2017
@DataStrategist
Author

  1. Nope, I missed it all... not sure how I could miss literally all of them even after looking for them! Great implementation!

  2. I guess you mean you have JUST added preview, str, getsummary? I agree that if you implement those then visdat is probably not required.

  3. Sorry man, I thought I went through every menu, but I confirm, I see it there now. doh 👎

  4. Totally get it about the dependency. Unneeded.

  5. So I guess on this one, perhaps I am being too idealistic. The two solutions you mentioned require that people know R. I'm kinda even hoping to show this tool as a replacement for SPSS, i.e., a fully no-code solution. That's why I was envisioning a situation where a file containing the db info could be loaded, and then people could select the db to access from within Radiant itself, which would connect to the dbs... but that would require a dependency on odbc... are you averse to that?

@vnijs
Contributor

vnijs commented Sep 12, 2017

  2. Correct. Use devtools::install_github("radiant-rstats/radiant.data") to install this version.

  5. Sounds interesting but, unless you are talking about a SQLite db, I'm not exactly sure what that would look like.

@DataStrategist
Author

re: 5... My use case. I work with about 20-30 databases, and frequently need to connect here and there. So I'm thinking of a df that contains DSN names, driver type and connection info for each database. Save it as a csv. Then this file could be loaded by an analyst, who would check a box saying "database index file" or some such, then click load. This would enable a dropdown with all the database DSN names. Once a user clicks on a db name, then Radiant connects to the db and harvests all the table names, which it populates into a multi-select dropdown. From there users can select what to read in. Is my use-case too narrow?
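A rough sketch of the "database index file" workflow described above, assuming the DBI and odbc packages and DSNs already configured on the analyst's machine. The column names (`dsn`, `driver`, `description`), the helper names, and the example entries are all made up for illustration; nothing here is existing Radiant functionality.

```r
# Hypothetical column layout for the index file
required_cols <- c("dsn", "driver", "description")

# Read the index CSV and check that it has the expected columns
read_db_index <- function(path) {
  index <- read.csv(path, stringsAsFactors = FALSE)
  missing <- setdiff(required_cols, names(index))
  if (length(missing) > 0) {
    stop("Index file is missing columns: ", paste(missing, collapse = ", "))
  }
  index
}

# Connect to the chosen DSN and return its table names.
# Needs the DBI and odbc packages plus a matching DSN on this machine.
list_db_tables <- function(index, dsn) {
  stopifnot(dsn %in% index$dsn)
  con <- DBI::dbConnect(odbc::odbc(), dsn = dsn)
  on.exit(DBI::dbDisconnect(con), add = TRUE)
  DBI::dbListTables(con)
}

# Example index file with made-up entries
index_file <- tempfile(fileext = ".csv")
writeLines(c(
  "dsn,driver,description",
  "sales_dw,PostgreSQL,Sales warehouse",
  "hr_prod,SQL Server,HR production"
), index_file)

index <- read_db_index(index_file)
dsn_choices <- index$dsn  # would populate the first dropdown
print(dsn_choices)

# Would populate the multi-select dropdown (left commented out because it
# needs a real DSN):
# tables <- list_db_tables(index, "sales_dw")
```

Keeping credentials in the ODBC DSN configuration rather than in the CSV means the index file itself only has to carry names and descriptions.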

@vnijs
Contributor

vnijs commented Sep 21, 2017

I think you can do something like this through RStudio's new Connections tab. Take a look at:

https://support.rstudio.com/hc/en-us/articles/115010915687-Using-RStudio-Connections

You may need the preview version of RStudio for this. I have posted an issue to the rstudioapi package to see if it is possible to get access to the feature and/or the connection history from a Shiny app. We'll see how they respond.

@DataStrategist
Author

Yeah, exactly... the connections are saved on my PC... but you can also output the info to a code chunk or a new file... and that file could be uploaded to Radiant as a data source.

My data frame idea above was just to manage multiple dbs rather than just one... but the concept remains the same.

@vnijs
Contributor

vnijs commented Sep 26, 2017

I'll look into this in more detail in the next few weeks @mexindian
