snowflake io manager (#7726)
sryza committed May 9, 2022
1 parent b4115ef commit 1f0d9fb
Showing 30 changed files with 1,073 additions and 53 deletions.
12 changes: 12 additions & 0 deletions .buildkite/dagster-buildkite/dagster_buildkite/steps/dagster.py
@@ -462,6 +462,18 @@ def graphql_pg_extra_cmds_fn(_):
else [SupportedPython.V3_9]
),
),
ModuleBuildSpec(
"python_modules/libraries/dagster-snowflake-pandas",
supported_pythons=( # dropped python 3.6 support
[
SupportedPython.V3_7,
SupportedPython.V3_8,
SupportedPython.V3_9,
]
if (branch_name == "master" or is_release_branch(branch_name))
else [SupportedPython.V3_9]
),
),
ModuleBuildSpec(
"python_modules/libraries/dagster-postgres", extra_cmds_fn=postgres_extra_cmds_fn
),
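The branch-dependent `supported_pythons` expression above can be read in isolation as a small selection function. The following is a minimal sketch, not the Buildkite code itself: `SupportedPython` and `is_release_branch` are redefined here as hypothetical stand-ins, and the `release-` prefix rule is an assumption made only so the example runs.

```python
from enum import Enum
from typing import List


class SupportedPython(Enum):
    """Hypothetical stand-in for the enum used in the build steps."""

    V3_7 = "3.7"
    V3_8 = "3.8"
    V3_9 = "3.9"


def is_release_branch(branch_name: str) -> bool:
    # Assumption for illustration only: release branches start with "release-".
    return branch_name.startswith("release-")


def pythons_for_branch(branch_name: str) -> List[SupportedPython]:
    """Run the full version matrix on master and release branches, else only 3.9."""
    if branch_name == "master" or is_release_branch(branch_name):
        return [SupportedPython.V3_7, SupportedPython.V3_8, SupportedPython.V3_9]
    return [SupportedPython.V3_9]
```

Under these assumptions, a feature branch is tested against a single interpreter, which keeps pull-request builds short while master and release builds still cover every supported version.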
4 changes: 3 additions & 1 deletion docs/content/_apidocs.mdx
@@ -124,7 +124,9 @@ Dagster also provides a growing set of optional add-on libraries to integrate wi

- [Slack](/\_apidocs/libraries/dagster-slack) (`dagster-slack`) Provides a simple integration with Slack.

- [Snowflake](/\_apidocs/libraries/dagster-snowflake) (`dagster-snowflake`) Provides a resource for querying Snowflake from Dagster.
- [Snowflake](/\_apidocs/libraries/dagster-snowflake) (`dagster-snowflake`) Provides resources for querying Snowflake from Dagster.

- [Snowflake+Pandas](/\_apidocs/libraries/dagster-snowflake-pandas) (`dagster-snowflake-pandas`) Provides support for storing Pandas DataFrames in Snowflake.

- [Spark](/\_apidocs/libraries/dagster-spark) (`dagster-spark`) Provides an integration for working with Spark in Dagster.

4 changes: 4 additions & 0 deletions docs/content/_navigation.json
@@ -684,6 +684,10 @@
"title": "Snowflake (dagster-snowflake)",
"path": "/_apidocs/libraries/dagster-snowflake"
},
{
"title": "Snowflake + Pandas (dagster-snowflake-pandas)",
"path": "/_apidocs/libraries/dagster-snowflake-pandas"
},
{
"title": "Spark (dagster-spark)",
"path": "/_apidocs/libraries/dagster-spark"
71 changes: 36 additions & 35 deletions docs/content/integrations.mdx
@@ -25,38 +25,39 @@ This section includes guides on how to use Dagster with other tools.

Here is a complete list of Dagster's integration libraries. See full documentation in [API Reference](/\_apidocs#libraries).

| Integration | Library |
| ------------------ | ------------------------------------------------------------------ |
| Airbyte | [dagster-airbyte](/_apidocs/libraries/dagster-airbyte) |
| Airflow | [dagster-airflow](/_apidocs/libraries/dagster-airflow) |
| AWS | [dagster-aws](/_apidocs/libraries/dagster-aws) |
| Azure | [dagster-azure](/_apidocs/libraries/dagster-azure) |
| Celery | [dagster-celery](/_apidocs/libraries/dagster-celery) |
| Celery + Docker | [dagster-celery-docker](/_apidocs/libraries/dagster-celery-docker) |
| Dask | [dagster-dask](/_apidocs/libraries/dagster-dask) |
| Databricks | [dagster-databricks](/_apidocs/libraries/dagster-databricks) |
| Datadog | [dagster-datadog](/_apidocs/libraries/dagster-datadog) |
| Docker | [dagster-docker](/_apidocs/libraries/dagster-docker) |
| dbt | [dagster-dbt](/_apidocs/libraries/dagster-dbt) |
| Fivetran | [dagster-fivetran](/_apidocs/libraries/dagster-fivetran) |
| GCP | [dagster-gcp](/_apidocs/libraries/dagster-gcp) |
| Great Expectations | [dagster-ge](/_apidocs/libraries/dagster-ge) |
| Github | [dagster-github](/_apidocs/libraries/dagster-github) |
| Kubernetes | [dagster-k8s](/_apidocs/libraries/dagster-k8s) |
| Microsoft Teams | [dagster-msteams](/_apidocs/libraries/dagster-msteams) |
| MLflow | [dagster-mlflow](/_apidocs/libraries/dagster-mlflow) |
| MySQL | [dagster-mysql](/_apidocs/libraries/dagster-mysql) |
| PagerDuty | [dagster-pagerduty](/_apidocs/libraries/dagster-pagerduty) |
| Pandas | [dagster-pandas](/_apidocs/libraries/dagster-pandas) |
| Pandera | [dagster-pandera](/_apidocs/libraries/dagster-pandera) |
| Papermill | [dagstermill](/_apidocs/libraries/dagstermill) |
| Papertrail | [dagster-papertrail](/_apidocs/libraries/dagster-papertrail) |
| PostgreSQL | [dagster-postgres](/_apidocs/libraries/dagster-postgres) |
| Prometheus | [dagster-prometheus](/_apidocs/libraries/dagster-prometheus) |
| Pyspark | [dagster-pyspark](/_apidocs/libraries/dagster-pyspark) |
| Shell | [dagster-shell](/_apidocs/libraries/dagster-shell) |
| Slack | [dagster-slack](/_apidocs/libraries/dagster-slack) |
| Snowflake | [dagster-snowflake](/_apidocs/libraries/dagster-snowflake) |
| Spark | [dagster-spark](/_apidocs/libraries/dagster-spark) |
| SSH / SFTP | [dagster-ssh](/_apidocs/libraries/dagster-ssh) |
| Twilio | [dagster-twilio](/_apidocs/libraries/dagster-twilio) |
| Integration | Library |
| ------------------ | ------------------------------------------------------------------------ |
| Airbyte | [dagster-airbyte](/_apidocs/libraries/dagster-airbyte) |
| Airflow | [dagster-airflow](/_apidocs/libraries/dagster-airflow) |
| AWS | [dagster-aws](/_apidocs/libraries/dagster-aws) |
| Azure | [dagster-azure](/_apidocs/libraries/dagster-azure) |
| Celery | [dagster-celery](/_apidocs/libraries/dagster-celery) |
| Celery + Docker | [dagster-celery-docker](/_apidocs/libraries/dagster-celery-docker) |
| Dask | [dagster-dask](/_apidocs/libraries/dagster-dask) |
| Databricks | [dagster-databricks](/_apidocs/libraries/dagster-databricks) |
| Datadog | [dagster-datadog](/_apidocs/libraries/dagster-datadog) |
| Docker | [dagster-docker](/_apidocs/libraries/dagster-docker) |
| dbt | [dagster-dbt](/_apidocs/libraries/dagster-dbt) |
| Fivetran | [dagster-fivetran](/_apidocs/libraries/dagster-fivetran) |
| GCP | [dagster-gcp](/_apidocs/libraries/dagster-gcp) |
| Great Expectations | [dagster-ge](/_apidocs/libraries/dagster-ge) |
| Github | [dagster-github](/_apidocs/libraries/dagster-github) |
| Kubernetes | [dagster-k8s](/_apidocs/libraries/dagster-k8s) |
| Microsoft Teams | [dagster-msteams](/_apidocs/libraries/dagster-msteams) |
| MLflow | [dagster-mlflow](/_apidocs/libraries/dagster-mlflow) |
| MySQL | [dagster-mysql](/_apidocs/libraries/dagster-mysql) |
| PagerDuty | [dagster-pagerduty](/_apidocs/libraries/dagster-pagerduty) |
| Pandas | [dagster-pandas](/_apidocs/libraries/dagster-pandas) |
| Pandera | [dagster-pandera](/_apidocs/libraries/dagster-pandera) |
| Papermill | [dagstermill](/_apidocs/libraries/dagstermill) |
| Papertrail | [dagster-papertrail](/_apidocs/libraries/dagster-papertrail) |
| PostgreSQL | [dagster-postgres](/_apidocs/libraries/dagster-postgres) |
| Prometheus | [dagster-prometheus](/_apidocs/libraries/dagster-prometheus) |
| Pyspark | [dagster-pyspark](/_apidocs/libraries/dagster-pyspark) |
| Shell | [dagster-shell](/_apidocs/libraries/dagster-shell) |
| Slack | [dagster-slack](/_apidocs/libraries/dagster-slack) |
| Snowflake | [dagster-snowflake](/_apidocs/libraries/dagster-snowflake) |
| Snowflake + Pandas | [dagster-snowflake-pandas](/_apidocs/libraries/dagster-snowflake-pandas) |
| Spark | [dagster-spark](/_apidocs/libraries/dagster-spark) |
| SSH / SFTP | [dagster-ssh](/_apidocs/libraries/dagster-ssh) |
| Twilio | [dagster-twilio](/_apidocs/libraries/dagster-twilio) |
1 change: 1 addition & 0 deletions docs/docs-dagster-requirements.txt
@@ -24,6 +24,7 @@
-e ../python_modules/libraries/dagster-shell
-e ../python_modules/libraries/dagster-slack
-e ../python_modules/libraries/dagster-snowflake
-e ../python_modules/libraries/dagster-snowflake-pandas
-e ../python_modules/libraries/dagster-spark
-e ../python_modules/libraries/dagster-ssh
-e ../python_modules/libraries/dagster-twilio
1 change: 1 addition & 0 deletions docs/sphinx/index.rst
@@ -56,6 +56,7 @@
sections/api/apidocs/libraries/dagster-shell
sections/api/apidocs/libraries/dagster-slack
sections/api/apidocs/libraries/dagster-snowflake
sections/api/apidocs/libraries/dagster-snowflake-pandas
sections/api/apidocs/libraries/dagster-spark
sections/api/apidocs/libraries/dagster-ssh
sections/api/apidocs/libraries/dagster-twilio
@@ -0,0 +1,15 @@
Snowflake with Pandas (dagster-snowflake-pandas)
------------------------------------------------

This library provides an integration with the `Snowflake <https://www.snowflake.com/>`_ data
warehouse and Pandas data processing library.

To use this library, you should first ensure that you have an appropriate `Snowflake user
<https://docs.snowflake.net/manuals/user-guide/admin-user-management.html>`_ configured to access
your data warehouse.


.. currentmodule:: dagster_snowflake_pandas

.. autoclass:: SnowflakePandasTypeHandler

@@ -4,10 +4,6 @@ Snowflake (dagster-snowflake)
This library provides an integration with the `Snowflake <https://www.snowflake.com/>`_ data
warehouse.

It provides a ``snowflake_resource``, which is a Dagster resource for configuring
Snowflake connections and issuing queries, as well as a ``snowflake_op_for_query`` function for
constructing ops that execute Snowflake queries.

To use this library, you should first ensure that you have an appropriate `Snowflake user
<https://docs.snowflake.net/manuals/user-guide/admin-user-management.html>`_ configured to access
your data warehouse.
@@ -18,8 +14,11 @@ your data warehouse.
.. autoconfigurable:: snowflake_resource
:annotation: ResourceDefinition

.. autofunction:: build_snowflake_io_manager

.. autofunction:: snowflake_op_for_query

.. autoclass:: SnowflakeConnection
:members:
:undoc-members:
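The docs above list `build_snowflake_io_manager` next to a `SnowflakePandasTypeHandler` class, which suggests an IO manager assembled from pluggable per-type handlers. The sketch below illustrates that general pattern only. It does not use or reproduce the `dagster-snowflake` API: every name in it (`TypeHandler`, `ColumnDictHandler`, `SketchIOManager`, `build_sketch_io_manager`) is hypothetical, and an in-memory dict stands in for the warehouse.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Sequence, Type


class TypeHandler(ABC):
    """Hypothetical interface: converts one Python type to and from stored rows."""

    @property
    @abstractmethod
    def supported_type(self) -> Type:
        ...

    @abstractmethod
    def to_rows(self, obj: Any) -> List[dict]:
        ...

    @abstractmethod
    def from_rows(self, rows: List[dict]) -> Any:
        ...


class ColumnDictHandler(TypeHandler):
    """Handles column-oriented dicts such as {"a": [1, 2], "b": [3, 4]}."""

    @property
    def supported_type(self) -> Type:
        return dict

    def to_rows(self, obj: Any) -> List[dict]:
        columns = list(obj)
        return [dict(zip(columns, values)) for values in zip(*obj.values())]

    def from_rows(self, rows: List[dict]) -> Any:
        columns = list(rows[0]) if rows else []
        return {column: [row[column] for row in rows] for column in columns}


class SketchIOManager:
    """Stores tables in a dict and delegates type conversion to handlers."""

    def __init__(self, type_handlers: Sequence[TypeHandler]):
        self._handlers = {handler.supported_type: handler for handler in type_handlers}
        self._tables: Dict[str, List[dict]] = {}

    def _handler_for(self, py_type: Type) -> TypeHandler:
        if py_type not in self._handlers:
            raise TypeError(f"No type handler registered for {py_type.__name__}")
        return self._handlers[py_type]

    def handle_output(self, table: str, obj: Any) -> None:
        self._tables[table] = self._handler_for(type(obj)).to_rows(obj)

    def load_input(self, table: str, as_type: Type) -> Any:
        return self._handler_for(as_type).from_rows(self._tables[table])


def build_sketch_io_manager(type_handlers: Sequence[TypeHandler]) -> SketchIOManager:
    # Hypothetical factory, named only to echo the builder-style entry point.
    return SketchIOManager(type_handlers)
```

A design like this would keep the storage logic independent of any one in-memory representation, so support for another data type could be added by writing a handler rather than a new IO manager.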

26 changes: 21 additions & 5 deletions python_modules/dagster/dagster/core/execution/context/input.py
@@ -204,12 +204,28 @@ def resources(self) -> Any:
return self._resources

@property
def asset_key(self) -> Optional[AssetKey]:
if not self._name:
return None
return self.step_context.pipeline_def.asset_layer.asset_key_for_input(
def has_asset_key(self) -> bool:
return (
self._step_context is not None
and self._name is not None
and self._step_context.pipeline_def.asset_layer.asset_key_for_input(
node_handle=self.step_context.solid_handle, input_name=self._name
)
is not None
)

@property
def asset_key(self) -> AssetKey:
result = self.step_context.pipeline_def.asset_layer.asset_key_for_input(
node_handle=self.step_context.solid_handle, input_name=self.name
)
if result is None:
raise DagsterInvariantViolationError(
"Attempting to access asset_key, "
"but it was not provided when constructing the InputContext"
)

return result

@property
def step_context(self) -> "StepExecutionContext":
@@ -317,7 +333,7 @@ def add_input_metadata(

metadata = check.dict_param(metadata, "metadata", key_type=str)
self._metadata_entries.extend(normalize_metadata(metadata, []))
if self.asset_key:
if self.has_asset_key:
check.opt_str_param(description, "description")

observation = AssetObservation(
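The `input.py` change replaces an `Optional` return value with a guard-plus-accessor pair: `has_asset_key` answers whether a key exists, and `asset_key` raises when it does not. A minimal sketch of that pattern follows, using hypothetical stand-ins (`SketchInputContext`, `MissingAssetKeyError`, `describe`) rather than the real `InputContext` and `DagsterInvariantViolationError`.

```python
from typing import Optional


class MissingAssetKeyError(Exception):
    """Hypothetical stand-in for DagsterInvariantViolationError."""


class SketchInputContext:
    """Hypothetical, heavily simplified stand-in for InputContext."""

    def __init__(self, asset_key: Optional[str] = None):
        self._asset_key = asset_key

    @property
    def has_asset_key(self) -> bool:
        # Cheap check that never raises; callers use it as a guard.
        return self._asset_key is not None

    @property
    def asset_key(self) -> str:
        # Non-optional return type: a missing key is an error, not None.
        if self._asset_key is None:
            raise MissingAssetKeyError(
                "Attempting to access asset_key, but it was not provided"
            )
        return self._asset_key


def describe(context: SketchInputContext) -> str:
    """Call-site pattern: guard with has_asset_key before reading asset_key."""
    if context.has_asset_key:
        return f"asset:{context.asset_key}"
    return "no asset"
```

With this shape, callers that previously tested `if context.asset_key:` would need to switch to `if context.has_asset_key:`, as the second hunk does in `add_input_metadata`; otherwise the property access itself would raise for inputs without an asset key.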
2 changes: 2 additions & 0 deletions python_modules/libraries/dagster-snowflake-pandas/.coveragerc
@@ -0,0 +1,2 @@
[run]
branch = True
