Add ability to generate specification #311

Closed
philippjfr opened this issue Aug 30, 2022 · 5 comments

Comments

@philippjfr
Member

All components currently implement .from_spec methods which instantiate the component from the declarative specification. We want to be able to do the reverse and construct a specification from component instances by implementing .to_spec methods.

For all the basic component types this should be fairly straightforward, e.g. a Source, View or Filter simply has to serialize its parameters and its type. It becomes a little more difficult when we are dealing with references and variables, because View.pipeline should generally not inline and serialize the entire Pipeline specification.

def to_spec(self, allow_refs=True):
    """
    Converts the component to a declarative specification that can be serialized to YAML.
    Whether sub-component definitions are inlined depends on the type of component,
    e.g. Filter and Transform components will be inlined on a Pipeline but a Pipeline will
    not be inlined on a View.

    Arguments
    ---------
    allow_refs: boolean
      Whether to allow exporting references or to inline the materialized values.

    Returns
    -------
    Declarative specification containing the definition of this component.
    """

Goals

  • We can serialize every component type individually, as well as a whole Dashboard or Pipeline definition.
  • We can handle references and variables.
  • The exported specification faithfully roundtrips to an identical instance, i.e. we can go from instance -> specification -> instance and end up with an identical copy.
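The roundtrip goal above could be checked with a small helper along these lines. This is only a sketch: `FileSource` is a hypothetical dataclass standing in for a real component, and `roundtrips` is an assumed helper name, not something that exists in the project.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class FileSource:
    """Hypothetical component; a dataclass gives us value equality for free."""

    tables: dict
    cache_dir: Optional[str] = None

    def to_spec(self):
        return {"type": "file", **asdict(self)}

    @classmethod
    def from_spec(cls, spec):
        params = {k: v for k, v in spec.items() if k != "type"}
        return cls(**params)


def roundtrips(component):
    """Check instance -> specification -> instance yields an identical copy."""
    spec = component.to_spec()
    clone = type(component).from_spec(spec)
    # Both the instances and their exported specifications must agree.
    return clone == component and clone.to_spec() == spec


source = FileSource(tables={"penguins": "penguins.csv"})
ok = roundtrips(source)
```

Comparing the re-exported specification as well as the instances guards against a `from_spec` that silently drops a parameter which happens to equal its default.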
@jbednar
Member

jbednar commented Aug 30, 2022

Sounds good! Can you explain "because View.pipeline should generally not inline and serialize the entire Pipeline specification" a bit?

@philippjfr
Member Author

Sure, the problem in general is that what counts as a full specification depends on the context in which you plan to use the exported specification. Let's say you want to export a Pipeline as a standalone thing; in that case you want the specification to include the full definition of the Source. However, when you're exporting a full dashboard you don't want to inline the Source definition in the Pipeline specification, because multiple Pipeline objects may reference that very same Source. Therefore we need to be able to determine when to export a reference and when to inline the full specification.
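One way to picture the reference-versus-inline decision is to pass the enclosing specification down as a context. The sketch below is an assumption-laden illustration: the `Source` and `Pipeline` classes, the `context` argument and the `dashboard_to_spec` function are hypothetical and may not match how this was eventually implemented.

```python
class Source:
    """Hypothetical source with a name that other components can reference."""

    def __init__(self, name, uri):
        self.name = name
        self.uri = uri

    def to_spec(self):
        return {"type": "file", "uri": self.uri}


class Pipeline:
    """Hypothetical pipeline that depends on a single Source."""

    def __init__(self, name, source):
        self.name = name
        self.source = source

    def to_spec(self, context=None):
        if context is None:
            # Standalone export: inline the full Source definition.
            return {"source": self.source.to_spec()}
        # Export within a larger spec: register the Source once at the
        # top level and refer to it by name.
        context.setdefault("sources", {})[self.source.name] = self.source.to_spec()
        return {"source": self.source.name}


def dashboard_to_spec(pipelines):
    """Export several pipelines so that shared sources are defined only once."""
    spec = {"sources": {}, "pipelines": {}}
    for pipeline in pipelines:
        spec["pipelines"][pipeline.name] = pipeline.to_spec(context=spec)
    return spec


shared = Source("penguins", "penguins.csv")
a = Pipeline("a", shared)
b = Pipeline("b", shared)

standalone = a.to_spec()            # Source definition is inlined
dash = dashboard_to_spec([a, b])    # Source is defined once, referenced twice
```

In the standalone export the pipeline carries its own copy of the Source definition, while in the dashboard export both pipelines point at the single `penguins` entry under `sources`.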

@sophiamyang

Looks like it's partially implemented. Should we close this issue?

@philippjfr philippjfr added this to the v0.5.0 milestone Sep 27, 2022
@philippjfr
Member Author

This has for the most part been implemented now so yes, I'll close.

@github-actions

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 11, 2023