Pythonify #41

Merged
merged 20 commits into from
Mar 26, 2024


mtrivedi50
Contributor

v0.3.0, March 19, 2024

⚠️ Deprecations

  • CLI

    • The following CLI commands are deprecated:
      • prism agent [apply | build | run | delete]
      • prism compile
      • prism connect
      • prism create [agent | task | trigger]
      • prism spark-submit
    • The following CLI commands remain:
      • prism init
      • prism run
      • prism graph
  • Manifest

    • In previous versions, when a project was compiled, Prism would create a manifest.json that contained the project's targets, refs, and tasks.
    • Starting in v0.3.0, Prism uses a SQLite database to store project, task, target, and run-related information.
    • prism graph still uses a manifest.json when serving the visualizer UI.
  • tasks and hooks in task arguments

    • In previous versions, task functions had two required arguments: tasks and hooks. tasks was used to reference the output of other tasks, and hooks was used to access adapters specified in profile.yaml.
    • Starting in v0.3.0, Prism has replaced both of these with a CurrentRun object. See more information below.
  • profile.yaml

    • In previous versions, users could connect to SQL databases with adapters defined in their profile.yaml.
    • Starting in v0.3.0, Prism uses Connector instances to connect to databases. See more information below.
    • In addition, we have deprecated the dbt adapter.
  • triggers.yaml

    • In previous versions, users could run custom code after a project succeeded or failed using triggers.yaml.
    • Starting in v0.3.0, Prism uses Callback instances to run custom code based on a project's status. See more information below.

🚀 Performance improvements

N/A

✨ Enhancements

  • PrismProject entrypoint

    • In previous versions, users managed their project with a prism_project.py file and optional profile.yaml and triggers.yaml files. The entrypoint to a Prism project was the CLI: users had to run prism run to execute their project.
    • Starting in v0.3.0, the PrismProject class is the recommended entrypoint to a Prism project. This class has the following benefits:
      • Programmatic access: Prism projects can be instantiated and run via standard Python.
      • More fine-grained control over how Prism projects are run.
      • Projects as code, not as YAML: instead of dealing with multiple files, a Prism project can be written as code that lives in a single file.
    • The PrismProject class has two methods: run and graph. As the names suggest, run is used to execute Prism projects, and graph is used to launch the Prism Visualizer UI.
    • Here's how to use PrismProject:
      from pathlib import Path
      
      project = PrismProject(
          version="1.0",
          tasks_dir=Path.cwd() / "tasks",
          concurrency=2,
          ctx={
              "OUTPUT": Path.cwd() / "output"
          },
      )
      
      if __name__ == "__main__":
          project.run()
  • CurrentRun context object

    • Rather than separate tasks and hooks arguments, the CurrentRun object stores information about the current project run, specifically:
      CurrentRun.ctx(key: str, default_value: Any) -> Any    # for grabbing variables from the PrismProject's base context or runtime context
      CurrentRun.conn(connector_id: str) -> Connector    # for grabbing a connector instance passed to the PrismProject at instantiation
      CurrentRun.ref(task_id: str) -> Any    # for grabbing the output of a task
    • Here's how to use it in a task:
      # example_task.py
      
      from prism.decorators import task
      from prism.runtime import CurrentRun
      
      @task(id="example-task-id")
      def example_task():
          other_output = CurrentRun.ref(task_id="other_task_id")
          ...
    • This is a slightly cleaner user experience (one unified context object instead of two arguments), and it lets users take advantage of autocomplete in their IDE.
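    • To make the "one unified context object" idea concrete, here is a minimal, self-contained sketch. It does not import Prism and is not Prism's implementation: ToyRun, its record method, and the sample values are hypothetical stand-ins that only mirror the three accessors listed above (ctx, conn, ref).

```python
from typing import Any, Dict


class ToyRun:
    """Hypothetical stand-in for a unified run context (not Prism's CurrentRun)."""

    def __init__(self, ctx: Dict[str, Any], connectors: Dict[str, Any]) -> None:
        self._ctx = ctx
        self._connectors = connectors
        self._outputs: Dict[str, Any] = {}

    def ctx(self, key: str, default_value: Any = None) -> Any:
        # Look up a context variable, falling back to the supplied default.
        return self._ctx.get(key, default_value)

    def conn(self, connector_id: str) -> Any:
        # Return the connector registered under this ID.
        return self._connectors[connector_id]

    def ref(self, task_id: str) -> Any:
        # Return the stored output of a task that has already run.
        return self._outputs[task_id]

    def record(self, task_id: str, output: Any) -> None:
        # Hypothetical helper: store a task's output so later tasks can ref it.
        self._outputs[task_id] = output


run = ToyRun(
    ctx={"OUTPUT": "/tmp/output"},
    connectors={"snowflake-connector": object()},
)
run.record("other_task_id", [1, 2, 3])


def example_task() -> int:
    # A downstream task reads an upstream output through the single context object.
    other_output = run.ref("other_task_id")
    return sum(other_output)


result = example_task()
fallback = run.ctx("MISSING_KEY", "fallback")
```

      The point of the sketch is only that context variables, connectors, and upstream outputs are all reachable through one object, so a task function needs no extra arguments.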
  • Connectors

    • Instead of defining adapters in profile.yaml, connectors are defined in Python as instances of the Connector class.
    • There are six Connector subclasses:
      BigQueryConnector(Connector)
      PostgresConnector(Connector)
      PrestoConnector(Connector)
      RedshiftConnector(Connector)
      SnowflakeConnector(Connector)
      TrinoConnector(Connector)
    • PrismProject accepts a list of Connector objects via the connectors keyword argument, e.g.,
      snowflake = SnowflakeConnector(
          id="snowflake-connector",
          ...
      )
      
      project = PrismProject(
          ...,
          connectors=[snowflake]
      )
    • These connector objects can be accessed via CurrentRun, e.g.,
      conn = CurrentRun.conn(connector_id="snowflake-connector")
      conn.execute_sql(...)
  • Callbacks

    • Instead of defining custom code to run after a project has succeeded or failed via the triggers.yaml file, users can specify custom code as Python functions.
    • PrismProject accepts a list of functions via the on_success and on_failure keyword arguments.
    • Callback functions should not accept any arguments, e.g.,
      def print_success():
          print("Success!")
      
      project = PrismProject(
          ...,
          on_success=[print_success]
      )
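
The callback behavior described above (zero-argument functions passed as on_success and on_failure lists) can be illustrated with a small standalone sketch. This is not Prism's implementation: run_with_callbacks, failing_job, and the events list are hypothetical names used only to show the dispatch pattern.

```python
from typing import Callable, List, Optional

Callback = Callable[[], None]


def run_with_callbacks(
    job: Callable[[], None],
    on_success: Optional[List[Callback]] = None,
    on_failure: Optional[List[Callback]] = None,
) -> bool:
    """Run job, then call every callback registered for the resulting status."""
    try:
        job()
    except Exception:
        # The job failed: run the failure callbacks in the order given.
        for callback in on_failure or []:
            callback()
        return False
    # The job finished without raising: run the success callbacks.
    for callback in on_success or []:
        callback()
    return True


events: List[str] = []


def print_success() -> None:
    events.append("success")


def print_failure() -> None:
    events.append("failure")


def failing_job() -> None:
    raise RuntimeError("boom")


ok = run_with_callbacks(
    lambda: None, on_success=[print_success], on_failure=[print_failure]
)
failed = run_with_callbacks(
    failing_job, on_success=[print_success], on_failure=[print_failure]
)
```

Because the callbacks take no arguments, anything they need (for example, a notification target) has to come from module-level state or a closure.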

🐞 Bug fixes

N/A

📖 Documentation

TODO

🛠️ Other improvements

  • Rich logging - Prism uses rich to create beautiful logs for the user. This includes logs for events, tasks, and exceptions.
  • Updated triggers naming convention to callbacks. See above for more details.
  • Updated adapters naming convention to connectors. See above for more details.

@prism-admin prism-admin merged commit 6e87143 into main Mar 26, 2024
13 checks passed