Add Binder Button for JupyterLab and RStudio Server #286

Open
davidrpugh opened this issue Jul 8, 2019 · 10 comments

@davidrpugh

commented Jul 8, 2019

Is there interest in adding Binder buttons to the README? I have already added a Binder button to the Software Carpentry Python lesson and am in the process of adding one to the Software Carpentry R lesson.

You would need to create two orphaned branches called binder-python and binder-r. I would then open a PR to add the files required to configure a Binder instance for JupyterLab and RStudio with all required packages/libraries pre-installed. I would then open a second PR to add two buttons to the README of the gh-pages branch.

Once complete, learners would be able to click the button for either Python or R and, after a minute or two, would have access to an instance of JupyterLab or RStudio Server running in the cloud with all software pre-installed and ready to go for the lesson.
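
For illustration, the README buttons described above could be generated along these lines. This is a minimal sketch, not the markup from any actual PR: it assumes the public mybinder.org service and its `v2/gh/<org>/<repo>/<ref>` launch URL pattern, with `urlpath=lab` opening JupyterLab and `urlpath=rstudio` opening RStudio. The `binder_badge` helper is hypothetical, and the organisation name is taken from the demonstration fork linked later in this thread.

```python
from urllib.parse import quote

# Assumed values for the public mybinder.org service.
BINDER_BASE = "https://mybinder.org/v2/gh"
BADGE_IMAGE = "https://mybinder.org/badge_logo.svg"


def binder_badge(org, repo, ref, urlpath):
    """Return README Markdown for a Binder launch badge (hypothetical helper)."""
    # The branch name is percent-encoded in case it contains a slash.
    launch_url = f"{BINDER_BASE}/{org}/{repo}/{quote(ref, safe='')}?urlpath={urlpath}"
    return f"[![Binder]({BADGE_IMAGE})]({launch_url})"


# One badge per orphaned branch: JupyterLab for Python, RStudio for R.
python_badge = binder_badge("kaust-vislab", "sql-novice-survey", "binder-python", "lab")
r_badge = binder_badge("kaust-vislab", "sql-novice-survey", "binder-r", "rstudio")
print(python_badge)
print(r_badge)
```

Pasting the two printed lines into the README would render as two clickable badges, one per environment.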

@remram44

Contributor

commented Jul 8, 2019

I don't think there is much point using JupyterLab for this lesson, over the graphical program DB Browser for SQLite. It just adds unnecessary complexity to import data and run SQL code.

@davidrpugh

Author

commented Jul 8, 2019

@remram44 Thanks for the feedback. If most learners were only interested in learning SQL for its own sake, then I would agree with you.

However, in my experience most (all?) learners want to learn to use SQL from either Python or R and are most interested in the last two episodes which demonstrate how this is done. Providing links to Binder instances makes it easier for learners to follow along with the last two episodes in an environment that more closely mimics their day-to-day research environment (than say DB Browser); learners can also easily replicate the Binder environments on their local machine later should they wish to do so.
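
As a rough idea of what "using SQL from Python" looks like inside a JupyterLab session, here is a minimal sketch with the standard-library `sqlite3` module. It uses an in-memory stand-in rather than the lesson's `survey.db`; the `Person` table layout and the rows are assumptions made up for the example, and `family_names` is a hypothetical helper.

```python
import sqlite3

# In-memory stand-in so the sketch runs anywhere. In a workshop you would
# connect to the lesson's database file instead, e.g. sqlite3.connect("survey.db").
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE Person (id TEXT, personal TEXT, family TEXT)")
connection.executemany(
    "INSERT INTO Person VALUES (?, ?, ?)",
    [("p1", "Ada", "Example"), ("p2", "Ben", "Sample")],  # made-up rows
)


def family_names(conn):
    """Return all family names in alphabetical order."""
    cursor = conn.execute("SELECT family FROM Person ORDER BY family")
    return [row[0] for row in cursor.fetchall()]


names = family_names(connection)
print(names)
connection.close()
```

The same query could be typed into the `sqlite3` command-line shell in a terminal first and then moved into a script or notebook cell like this one.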

Since both JupyterLab and RStudio provide a terminal window, I generally teach the first several SQL episodes using the terminal from within either JupyterLab or RStudio (depending on whether the majority of the learners are more interested in Python or R) and then switch to either notebooks or R scripts for the last two lessons. The ability to switch seamlessly from the terminal to a Python or R work environment helps reinforce the usefulness of learning SQL and integrating it into daily workflows.

Finally, by including the links in the README we are simply providing the option for instructors and learners to use the Binder instances if they wish.

@davidrpugh

Author

commented Jul 8, 2019

To give everyone an idea of what this would look like I have created the two orphaned branches mentioned above and created a third branch off of gh-pages where I have added the buttons.

https://github.com/kaust-vislab/sql-novice-survey/tree/add-binder-buttons

@henrykironde

Contributor

commented Jul 8, 2019

@remram44, I totally agree with what you have said, especially about the complexity of maintaining these branches. However, after looking at the sample, I think it could be useful for learners who have already worked through the first part using DB Browser.

If we decide to add this work to the project, I think we need to devise a test mechanism, or otherwise make sure that everything (code and data) stays in sync with changes in the main repo and that PRs against the main repo are tested against the notebook infrastructure before a merge.
@davidrpugh thanks for the issue and the sample. Let me know your thoughts on these concerns; I have not handled this kind of integration before.
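
One possible shape for such a sync check, sketched under stated assumptions: compare a checksum of the data file in a checkout of the main branch against the copy in a checkout of a Binder branch. The function names and file names below are hypothetical, and nothing here is an existing test in the repository.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_in_sync(main_copy, binder_copy):
    """True when both copies have byte-identical contents."""
    return sha256_of(main_copy) == sha256_of(binder_copy)


# Demonstration with throwaway files standing in for the two checkouts.
with tempfile.TemporaryDirectory() as tmp:
    main_db = Path(tmp) / "main_survey.db"
    binder_db = Path(tmp) / "binder_survey.db"
    main_db.write_bytes(b"same bytes")
    binder_db.write_bytes(b"same bytes")
    in_sync_before = files_in_sync(main_db, binder_db)

    binder_db.write_bytes(b"stale bytes")  # simulate a copy that drifted
    in_sync_after = files_in_sync(main_db, binder_db)

print(in_sync_before, in_sync_after)
```

A CI job could run a check like this on every PR and fail when the two copies diverge.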

@davidrpugh

Author

commented Jul 8, 2019

@henrykironde Thanks for your feedback. First, I can commit to maintaining those branches as the maintenance burden is low. The Binder service will automatically re-build the images when there are changes to those branches. Changes to the branches are most likely to occur as we version bump the dependencies over time.

The notebooks directory can/should be removed. Currently I have included some notebooks that replicate and extend the code in the Databases and Python episode, but this is just for demonstration purposes. That material, if there is interest in including it, should go into its own PR for possible inclusion in the main lesson material. This way the JupyterLab instance and the RStudio instance both start with only the survey.db file, and it would be up to individual instructors to determine how to leverage JupyterLab or RStudio as they see fit in their workshops. This also eliminates the need to sync any code between the main lesson and the Binder instances (which would be burdensome!).

Hopefully that makes sense!

@remram44

Contributor

commented Jul 9, 2019

The lesson as currently organized only deals with use from R or Python in the last chapter. Maybe only that one needs to be available from Binder then? This would make it a lot easier to maintain, and would avoid overloading the learners with this complicated environment (SQL in Python in Jupyter in Binder...).

There are also practical reasons not to conduct the full lesson in Binder. Exploring/manipulating the CSVs directly, through programs the learners are used to (Excel), would not be possible in such an environment.

@remram44

Contributor

commented Jul 9, 2019

On a separate note, does this need to be a separate branch? It could probably be on gh-pages, provided the out-of-date notebooks are removed (to not cause confusion).

@davidrpugh

Author

commented Jul 9, 2019

> The lesson as currently organized only deals with use from R or Python in the last chapter. Maybe only that one needs to be available from Binder then? This would make it a lot easier to maintain, and would avoid overloading the learners with this complicated environment (SQL in Python in Jupyter in Binder...).

I think I may have created confusion by including the notebooks in the binder-python branch (and have now removed them). The branch should only contain the survey.db file and the files necessary to run JupyterLab on Binder (similar for binder-r branch and RStudio). Instructors should live code the last two lessons just as they normally would and for Python can choose between scripts or notebooks for the coding.

> There are also practical reasons not to conduct the full lesson in Binder. Exploring/manipulating the CSVs directly, through programs the learners are used to (Excel), would not be possible in such an environment.

I think what is practical depends on the audience and the instructor and by providing the Binder instances we are providing additional options for instructors and learners.

There are two ways in which the JupyterLab and RStudio Binder instances could be used in teaching this lesson.

  1. Instructor teaches all but the last two episodes in the "normal" fashion but then uses the JupyterLab Binder instance to teach the Programming with Databases-Python lesson and/or uses the RStudio Binder instance to teach the Programming with Databases-R lesson.
  2. Instructor teaches the first 9 lessons using the terminal in either JupyterLab or RStudio (depending on the audience) and then switches to using scripts/notebooks in JupyterLab or scripts in RStudio for the last two lessons.

I have tried both approaches, and both my audiences and I have preferred option 2. When I teach the course, I use a Conda environment created on my local machine from the environment.yml file in the binder-python branch and then use JupyterLab as my teaching environment. The JupyterLab Binder instance lets learners replicate my environment without installing anything, and they can easily recreate it locally later should they wish. The same applies to R users and the RStudio Binder instance.

@davidrpugh

Author

commented Jul 9, 2019

> On a separate note, does this need to be a separate branch? It could probably be on gh-pages, provided the out-of-date notebooks are removed (to not cause confusion).

These branches do need to be separate from the main branch, for two reasons.

  1. If we moved the Binder config files, say for Python and JupyterLab, to the gh-pages branch, then when Binder builds the instance all of the files in the gh-pages branch would be copied into the working directory of the Binder instance. This is undesirable because it causes confusion for learners.
  2. We need Binder instances for both JupyterLab and RStudio, and you cannot put config files for multiple languages into a single branch.

@davidrpugh

Author

commented Jul 9, 2019

I have added a JupyterLab extension called jupyterlab_sql, which provides DB Browser-like functionality in JupyterLab. I have not taught with this extension yet but will do so this coming semester.

After launching the Binder instance, click the SQL launcher button and then provide the following URL to connect to the SQLite DB.

sqlite:///data/survey.db

After that things look pretty similar to DB Browser (to me at least!). Let me know if the above doesn't work for you...
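
A note on reading that URL, assuming the extension follows the SQLAlchemy-style convention for SQLite URLs (an assumption here, not something confirmed in this thread): three slashes introduce a path relative to the working directory, while four slashes introduce an absolute path. The sketch below only illustrates that convention with a hypothetical helper; the absolute path in the second call is made up.

```python
PREFIX = "sqlite:///"


def sqlite_url_to_path(url):
    """Return the file path encoded in a SQLAlchemy-style SQLite URL."""
    if not url.startswith(PREFIX):
        raise ValueError(f"not a SQLite file URL: {url!r}")
    # Everything after the third slash is the path; a fourth slash makes it absolute.
    return url[len(PREFIX):]


relative = sqlite_url_to_path("sqlite:///data/survey.db")
absolute = sqlite_url_to_path("sqlite:////home/jovyan/data/survey.db")  # hypothetical path
print(relative)
print(absolute)
```

Under that reading, the URL given above resolves against the directory JupyterLab was started in, so it should work as long as `data/survey.db` sits below that directory.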
