Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Opt in or opt out on vegafusion dependencies #3309

Open
mattijn opened this issue Jan 6, 2024 · 9 comments
Open

Opt in or opt out on vegafusion dependencies #3309

mattijn opened this issue Jan 6, 2024 · 9 comments

Comments

@mattijn
Copy link
Contributor

mattijn commented Jan 6, 2024

Opening this issue to discuss and collect insights.

Currently we have an altair package that relies on a few hard dependencies:

dependencies = [
    "typing_extensions>=4.0.1; python_version<\"3.11\"",
    "jinja2",
    "jsonschema>=3.0",
    "numpy",
    "pandas>=0.25",
    "toolz",
    "packaging"
]

In parallel efforts we are trying to reduce these hard dependencies (like pandas, do we really need numpy?)

But simultaneously, with the recent developments around vegafusion in combination with vl-convert, it would be great to have it somehow installed as a hard dependency instead as the current soft dependency approach.

I see four options, but maybe there are more (from a conda perspective)

  1. Do nothing, keep it as it is.
  2. Extend altair with vegafusion, anywidget and vl-convert-python as hard dependencies.
  3. Create an altair-base package which has the current minimal dependencies.
  4. Create an altair-vegafusion package that includes vegafusion, anywidget and vl-convert-python as hard dependencies.

Option 3 enables an opt-out approach.
Option 4 enables an opt-in approach.

Topics for consideration are at least:

  • WASM builds
  • ease-of-use
  • ease-of-installation
@jonmmease
Copy link
Contributor

Cross reference #2818. I think I would lean toward adding an extra_requires group for the optional dependencies, at least to start with as this would be a non-breaking change.

Eventually, I think the altair-base approach is worth considering. But this would need to correspond to a major version bump IMO.

@joelostblom
Copy link
Contributor

I think it would be nice to use vegafusion and vl-convert more by default in altair, e.g. enabling the vegafusion transformer as the default transformer and offline mode by default for jupytercharts. I don't have a strong preference whether it is via an opt-out or opt-in mechanism, but leaning slightly towards that the default is with all batteries includes / how we believe most people would get the most out of the library.

If we decide to go that path, I wonder what the default behavior should be when these libraries are not installed, should it be that altair shows a warning and asks to install vegafusion or change the renderer to "html". Or should it silently fall back to the html renderer when it can't find vegafusion? The former sounds safer and would likely reduce some confusing edgecases. I agree that a change like that would be in a major version.

@binste
Copy link
Contributor

binste commented Feb 20, 2024

I agree that it's nice if users can get as much as possible out of Altair by default without (much) configuration. At the same time, altair is mentioned in many requirements files by now and changing the dependencies to include a lot more, even in a major version bump, might ruffle some feathers.

I see it as two linked but somewhat separate discussions:

Simplifying installation

extra_requires groups solve this nicely in my opinion and I'd lean towards this instead of adding more dependencies to altair and creating an altair-base. It's the pip way to go to solve something like this and also taken by other libraries such as pandas. There example also shows nicely that this better scales to more optional dependencies in the future. Maybe we want to include avenger at one point or altair_ally, altair_tiles, ... The dependency groups allow us to do:

  • altair: What we have now
  • altair[all]: All optional dependencies below
  • altair[save]: Altair + vl-convert
  • altair[performance]: Altair + VegaFusion, Avenger?
  • altair[maps]: Altair + altair_tiles?
  • altair[ally]: Altair + altair_ally?
  • ...

Can be freely combined as altair[save, performance].

Enabling the magic of VegaFusion by default if it is installed

We could tackle this somewhat independently of the first topic. What do you think?

Related discussion: .to_dict/.to_json have to return a Vega spec if VegaFusion is used

@joelostblom
Copy link
Contributor

Separating into two issues sounds good 👍 For the packaging part, I'm on board with using extra_requires and keeping altair as the base with altair[all] being all optional dependencies (and that we create a corresponding altair-all conda package).

Would the next step be to decide which package to include in all? I'm thinking maybe vl-convert and everything vegafusion-related to start; maybe also vega_datasets? And then consider adding the additional packages (tiles, ally, avenger)? What do you think?

@jonmmease
Copy link
Contributor

For all, I'd recommend including at least:

  • vega-dataset (for all of our examples)
  • vl-convert-python (for image export, and other things)
  • vegafusion-python-embed and vegafusion (for chart.transformed_data and the "vegafusion" data transformer)
  • anywidget (for JupyterChart)

vegafusion-jupyter is going to be deprecated and won't be needed anymore with the integration of VegaFusion into JupyterChart.

avenger isn't ready for prime-time yet.

I would defer to @joelostblom and @binste on whether to include altair_tiles and altair_ally, but I don't have any objection. Do either have any dependencies that might be problematic?

@joelostblom
Copy link
Contributor

Good note on anywidget, I agree that should be included and your list looks great to me. I don't have any problematic dependencies in altair_ally at the moment, but I want to do some more work on the library before it is included in an official altair context; I'm planning to make time for it later in the spring.

@binste
Copy link
Contributor

binste commented Feb 29, 2024

altair_tiles only has 2 additional dependencies (mercantile and xyzservices) which are pure Python packages, both are well-maintained. So I think we could include it in all as well! I've been using it now for a while in production use cases and for single-view maps, it works. If we include it, I can also add a small example for it or some reference to it in the geoshape mark documentation.

I agree with your list for all @jonmmease.

@jonmmease
Copy link
Contributor

Great, let's include altair_tiles then as well! And yeah, it would be great to add some documentation for it directly in the main Altair docs.

@joelostblom
Copy link
Contributor

Related to packaging, I PRed an altair recipe into the default pyodide image in pyodide/pyodide#4580. The pyodide devs were very helpful and seem open to new packages being added, so let them know if there is anything else altair-related that you think should be in Pyodide by default.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants