DOC: Add an interactive shell powered by JupyterLite to the website #47428

Merged
merged 21 commits into from
Jun 23, 2022

jtpio
Contributor

@jtpio jtpio commented Jun 20, 2022

This allows for easily trying pandas in a web browser without installing anything.

Since the default kernel is based on Pyodide, it also includes a couple of other libraries by default such as matplotlib.

image

Other popular libraries have also adopted JupyterLite to power their documentation with an interactive shell:


TODO

  • Embed the JupyterLite REPL application
  • Create a separate repo for the JupyterLite deployment with a csv file and other custom settings
  • Add example code that can be copied and pasted
  • Move the demo repo to the pandas-dev organization

@jtpio
Contributor Author

jtpio commented Jun 20, 2022

Example of reading the iris.csv file:

image

without installing anything on your computer:

<iframe
src="https://jtpio.github.io/pandas-repl/repl/index.html?toolbar=1&kernel=python&code=import pandas as pd"
Contributor Author

This will have to be updated after the repo is transferred to pandas-dev organization: https://github.com/jtpio/pandas-repl
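
Because the deployment URL may change, it can help to build the iframe `src` from its parts rather than hard-coding it. The following is a minimal sketch, not the code used in this PR: `REPL_BASE` and `build_repl_url` are hypothetical names, and the base URL is a placeholder. Only the `toolbar`, `kernel`, and `code` query parameters come from the snippet above.

```python
from urllib.parse import quote, urlencode

# Hypothetical placeholder; point this at wherever the REPL is actually deployed.
REPL_BASE = "https://example.org/lite/repl/index.html"


def build_repl_url(code: str, base: str = REPL_BASE,
                   kernel: str = "python", toolbar: bool = True) -> str:
    """Return a REPL URL that pre-fills `code` in the prompt."""
    params = {"toolbar": int(toolbar), "kernel": kernel, "code": code}
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return f"{base}?{urlencode(params, quote_via=quote)}"


url = build_repl_url("import pandas as pd")
print(url)
```

With this approach, moving the deployment only requires changing the base URL in one place.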

Member

Can the content be directly transferred and initialized in the pandas repo? We've been trying to minimize the number of auxiliary repos and our web deployment is done within this repo already.

Contributor Author

Usually it's simpler to keep the content in a separate repo because the JupyterLite website is then deployed to GitHub Pages automatically when new changes are merged.

But if there is already a CI workflow deploying the pandas website on push, then this could also be integrated there.

Contributor Author

> But if there is already a CI workflow deploying the pandas website on push, then this could also be integrated there.

Looks like this might be relevant:

- name: Upload web
  run: rsync -az --delete --exclude='pandas-docs' --exclude='docs' web/build/ docs@${{ secrets.server_ip }}:/usr/share/nginx/pandas
  if: github.event_name == 'push' && github.ref == 'refs/heads/main'

Contributor Author

@jtpio jtpio Jun 21, 2022

In that case the content currently in https://github.com/jtpio/pandas-repl could be placed under web/lite for instance.

And then there could be an additional step running on CI to build the JupyterLite assets with jupyter lite build to web/build/lite, which would then be copied by the existing rsync command?
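
The proposed build step could be sketched roughly as below. This is an illustration under assumptions, not the workflow that was merged: the helper names are hypothetical, and it assumes the `jupyter lite build` command accepts an `--output-dir` option. The paths `web/lite` and `web/build/lite` are the ones suggested in this thread.

```python
import subprocess
from pathlib import Path


def lite_build_command(output_dir: Path) -> list[str]:
    """Return the command line for building the JupyterLite assets."""
    # Assumption: the jupyterlite CLI supports --output-dir.
    return ["jupyter", "lite", "build", "--output-dir", str(output_dir)]


def build_lite(source_dir: Path, output_dir: Path) -> None:
    """Run the build from source_dir and write the site to output_dir."""
    # Resolve output_dir first, because the command runs with cwd=source_dir.
    subprocess.run(lite_build_command(output_dir.resolve()),
                   cwd=source_dir, check=True)


# Example call (requires jupyterlite to be installed):
# build_lite(Path("web/lite"), Path("web/build/lite"))
```

Since the output lands under `web/build/`, the existing rsync step would pick it up without further changes.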

Member

Yeah that workflow I think would be preferred

Member

+1 to this approach. I'd keep the assets for the terminal in web/interactive_terminal or a similarly named directory; I think it's clearer. When building, web/build/lite or whatever is standard is fine. If you copy files to that directory, the last steps of the website workflow should already take care of deploying it.

Also fine to keep things simple here and create a DataFrame from the constructor first. We can discuss the exact data and examples to use in a separate issue or PR. It's probably better not to mix the technical work on JupyterLite with the content of the terminal in the same discussion.

An example as simple as the following would work for me:

df = pd.DataFrame({'num_legs': [2, 4], 'num_wings': [2, 0]},
                  index=['falcon', 'dog'])
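
For illustration, here is a short runnable sketch of what a visitor could try in the terminal with that DataFrame. The variable names and the follow-up operations are assumptions added here, not content agreed on in this PR.

```python
import pandas as pd

df = pd.DataFrame({'num_legs': [2, 4], 'num_wings': [2, 0]},
                  index=['falcon', 'dog'])

# Look up a single value by row label and column name.
wings_of_falcon = df.loc['falcon', 'num_wings']

# Aggregate a column.
total_legs = df['num_legs'].sum()

# Filter rows with a boolean mask and list the matching index labels.
flyers = df[df['num_wings'] > 0].index.tolist()

print(wings_of_falcon, total_legs, flyers)
```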

Contributor Author

@jtpio jtpio Jun 22, 2022

Sounds good 👍

@jtpio jtpio changed the title from "Add an interactive shell powered by JupyterLite to the website" to "ENH: Add an interactive shell powered by JupyterLite to the website" Jun 20, 2022
@jtpio jtpio changed the title from "ENH: Add an interactive shell powered by JupyterLite to the website" to "DOC: Add an interactive shell powered by JupyterLite to the website" Jun 20, 2022
@jtpio
Contributor Author

jtpio commented Jun 20, 2022

> [ ] Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

It's not a new feature to pandas itself, but could be interesting to get it into the changelog somewhere for visibility?

@jtpio jtpio marked this pull request as ready for review June 20, 2022 14:34
@jtpio
Contributor Author

jtpio commented Jun 20, 2022

Marking as ready for review since the basic functionality should now be working. Happy to iterate more in this PR or follow-up PRs based on feedback.

As mentioned in the top comment, the repo used for the JupyterLite deployment should be moved to the pandas-dev organization on GitHub.

cc a couple of folks involved in the original issue: #46682

@datapythonista @bennaaym @psychemedia @hamedgibago

Thanks!

@jtpio
Contributor Author

jtpio commented Jun 20, 2022

Added some example code above the REPL that folks can copy paste:

image

@psychemedia
Contributor

psychemedia commented Jun 20, 2022

Would it be useful to also cover other pandas read/write methods, if only to test reading different file types and to surface file types that may not be handled correctly?

There is a full list available here: https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html

@jtpio
Contributor Author

jtpio commented Jun 20, 2022

@psychemedia have you tried to see if they work?

I think the main reason for having pd.read_csv() work in the REPL was to be able to load a dataset with a familiar API, and then manipulate it with pandas.

@psychemedia
Contributor

I do have a test notebook (cribbed from the pandas docs), but I can't even get read_csv() to work on data/iris.csv with https://jupyterlite.readthedocs.io/en/latest/_static/lab/index.html yet. (I was assuming that demo was the latest one? I tried reloading via the browser dev tools, which generally seems to clear the cache, but that made no difference.) Will try to have a look in a day or two: I'm currently buried under a teaching marking deadline.

@jtpio
Contributor Author

jtpio commented Jun 20, 2022

Thanks @psychemedia for sharing the example notebook, this is useful. It looks like there is an issue saving the json file:

image

Mind opening an issue on the JupyterLite repo? Looks like this is an issue that should be fixed there, and is not strictly blocking for this PR since it can be addressed separately.

EDIT: tracked in jupyterlite/jupyterlite#682

Member

@datapythonista datapythonista left a comment

Looks great. I added a few suggestions, but the general idea looks perfect. Thanks for working on this!

.github/workflows/docbuild-and-upload.yml (outdated, resolved)
web/pandas/interactive_terminal/README.md (outdated, resolved)
web/pandas_web.py (outdated, resolved)
web/pandas/getting_started.md (outdated, resolved)
web/pandas/getting_started.md (resolved)
web/pandas/interactive_terminal/README.md (outdated, resolved)
@jtpio
Contributor Author

jtpio commented Jun 22, 2022

Many thanks @datapythonista for the review and feedback!

The comments should have been addressed now.

@jtpio
Contributor Author

jtpio commented Jun 22, 2022

For those who would like to try it out, you can download the built website here: https://github.com/pandas-dev/pandas/actions/runs/2541517660

image

And then start a local server with python -m http.server:

image
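
The same local server can also be started from a script, which makes it easy to serve a specific directory. This is a minimal sketch using only the standard library; `make_server` and `serve` are hypothetical helper names, and the directory to serve depends on where the downloaded website was extracted.

```python
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_server(directory: str, port: int = 8000) -> ThreadingHTTPServer:
    """Create a server for static files in `directory`, bound to localhost."""
    handler = partial(SimpleHTTPRequestHandler, directory=directory)
    return ThreadingHTTPServer(("127.0.0.1", port), handler)


def serve(directory: str, port: int = 8000) -> None:
    """Serve `directory` until interrupted with Ctrl+C."""
    with make_server(directory, port) as server:
        print(f"Serving {directory} at http://127.0.0.1:{server.server_address[1]}")
        server.serve_forever()


# Example call, assuming the built website was extracted to the current directory:
# serve(".")
```

Binding to 127.0.0.1 keeps the preview reachable only from the local machine.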

Member

@datapythonista datapythonista left a comment

Thanks @jtpio really nice. lgtm

Only thing, not sure if I'd delete the iris csv, since now it's not used, and if we decide to use a csv later, we may probably try to find one with more dtypes than iris. But no big deal.

@jtpio
Contributor Author

jtpio commented Jun 22, 2022

> not sure if I'd delete the iris csv, since now it's not used,

Good catch. Happy to push a commit to remove it, so we keep things as simple as possible for now.

@jtpio
Contributor Author

jtpio commented Jun 22, 2022

> Happy to push a commit to remove it,

Done in 7d75312

@mroeschke mroeschke added Web pandas website Docs labels Jun 23, 2022
@mroeschke mroeschke added this to the 1.5 milestone Jun 23, 2022
@mroeschke mroeschke merged commit 2c947e0 into pandas-dev:main Jun 23, 2022
@mroeschke
Member

Very cool! Thanks @jtpio

@jtpio jtpio deleted the jupyterlite branch June 24, 2022 07:25
yehoshuadimarsky pushed a commit to yehoshuadimarsky/pandas that referenced this pull request Jul 13, 2022
…andas-dev#47428)

* Add an interactive shell powered by JupyterLite

* Update to the dedicated JupyterLite deployment

* Add example code

* Move build files to the pandas repo

* Build the jupyterlite website

* Load relative terminal

* Update example code

* Update wording

* Fix trailing spaces

* Move build dependencies to the top-level environment.yml

* Move to `web/interactive_terminal`

* Remove example code

* Add note about the loading time

* Update instructions in the README

* Update build command on CI

* Fix typo in .gitignore

* Lint environment.yml

* Remove unused import

* Undo unrelated changes in environment.yml

* Fix pre-commit check

* Remove unused csv file
Labels
Docs Web pandas website
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add interactive terminal to pandas website
4 participants