Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update forked repository. #2

Merged
merged 350 commits into from
Jun 16, 2020
Merged

Conversation

DavidKatz-il
Copy link
Owner

No description provided.

schrockn and others added 30 commits June 3, 2020 10:32
Summary:
Rename this to indicate legacy status of repository yaml

Depends on D3243

Test Plan: BK

Reviewers: alangenfeld, max

Reviewed By: alangenfeld

Differential Revision: https://dagster.phacility.com/D3246
Summary:
We are going use code in this file in a different way
during workspace.yaml loading, so refactoring in its own diff

Depends on D3246

Test Plan: BK

Reviewers: alangenfeld, max

Reviewed By: alangenfeld

Differential Revision: https://dagster.phacility.com/D3248
…d document

Summary: Depends on D3248

Test Plan: BK

Reviewers: alangenfeld, max

Reviewed By: alangenfeld

Differential Revision: https://dagster.phacility.com/D3249
Summary:
This adds the code that takes a file or a module and finds
sole repository or pipeline symbol in it and if not, throws.

It was suggested that we also have a mode where we can just package
up all the top level definitions in a file and make a repository.

I decided not to do this for two reasons:

1) We would have to make a change to the core CodePointer abstract
to be able to point to entire module as opposed to a specific
symbol within that module

2) I'm not sure if its a good idea in essence. Seems like it might
be sensible to force repository creation when you have more than
one def

Depends on D3249

Test Plan: BK

Reviewers: alangenfeld, max

Reviewed By: alangenfeld

Differential Revision: https://dagster.phacility.com/D3250
Summary:
Adds the config schema for the workspace and tests for it.

Depends on D3250

Test Plan: BK

Reviewers: alangenfeld, max

Reviewed By: alangenfeld

Differential Revision: https://dagster.phacility.com/D3251
Summary:
This diff adds a new "Run Grouping" UI to the top right section of the Run viewer, allowing you to rapidly jump between runs with the same parent root.

It also incorporates @yuhan's recent "subset tag" to show the solid query syntax in places where the solid subset was previously a comma-separated list of steps.

I also made a first pass at improving the runs list, which has suffered from low information density for a while now. Tags have much more space and are less bold, and the timestamps have been shrunk down considerably. Each run's stats are now behind a hover state on the status dot, which is slightly larger now. This improves load time quite a bit.

{F153431}

{F149470}

Test Plan: Run tests

Reviewers: prha, alangenfeld, schrockn, yuhan, max

Reviewed By: schrockn

Subscribers: yuhan

Differential Revision: https://dagster.phacility.com/D3174
Summary: Fixes #2519

Test Plan: tested make buildnext / yarn dev locally and works

Reviewers: sashank, yuhan

Reviewed By: sashank

Differential Revision: https://dagster.phacility.com/D3257
Summary: testing the removal of fcntl from postgres

Test Plan: full pytest suite

Reviewers: max, nate

Reviewed By: max, nate

Differential Revision: https://dagster.phacility.com/D3210
…cessRepositoryLocation

Summary:
Here we implement the `get_external_pipeline` method on the `OutOfProcessRepositoryLocation`. This lets us enable the test matrix on several sets of tests that query for `pipelineOrError` or `runConfigSchemaOrError`.

This also adds a new method to the `DagsterGraphQLContext`: `validate_external_pipeline`. Given a selector, this method returns whether a give pipeline and a subset are valid. This is used in the `get_external_pipeline` method for proper error handling over the host/user process boundary.

Test Plan: unit

Reviewers: alangenfeld, schrockn

Reviewed By: alangenfeld, schrockn

Differential Revision: https://dagster.phacility.com/D3223
* Add dagster-azure package with various storage components

This adds the following components based on Azure Data Lake Storage
Gen2 (and Azure Blob Storage where appropriate):

- ADLS2FileCache and adls2_file_cache
- ADLS2FileManager
- ADLS2IntermediateStore
- ADLS2ObjectStore
- the adls2_resource providing direct access to Azure Data Lake Storage
- the adls2_system_storage system storage

This is pretty similar to the S3 implementation, the main difference
being configuration: Azure's SDK requires credentials to be passed
explicitly, so the credential is expected in configuration.

Tests currently require an access key to complete any tests marked
'nettest'.

* Rename Fake Azure classes and modules to more English-friendly names

* Add ADLS2Resource to wrap ADLS2/Blob clients

* Fix import order in dagster-azure

* Add dagster-azure to install_dev_python_modules make target

* Include azure-storage-blob in dagster-azure requirements

* Remove unused variable in tests

* Don't install dagster-azure as part of install_dev_python_modules make target

* Remove accidentally committed Azure Blob object/intermediate store implementations

These work but have no immediate use case and no tests, so seem like an
unnecessary maintenance burden. This commit can be reverted if they're needed!

* Wrap potentially incompatible imports to add a custom warning

This centralizes the various azure-storage/azure-core imports and wraps them,
plus the snowflake-connector import, in a try/except block, adding a custom
warning with a suggested solution if the import fails.

* Add README to dagster-azure and note about incompatibility to dagster-snowflake's README

* Isort

* Set buildkite container for dagster-azure tests

* Env variables in buildkite for Azure

* Add dagster-azure package with various storage components

Summary:
This adds the following components based on Azure Data Lake Storage
Gen2 (and Azure Blob Storage where appropriate):

- ADLS2FileCache and adls2_file_cache
- ADLS2FileManager
- ADLS2IntermediateStore
- ADLS2ObjectStore
- the adls2_resource providing direct access to Azure Data Lake Storage
- the adls2_system_storage system storage

This is pretty similar to the S3 implementation, the main difference
being configuration: Azure's SDK requires credentials to be passed
explicitly, so the credential is expected in configuration.

Tests currently require an access key to complete any tests marked
'nettest'.

Rename Fake Azure classes and modules to more English-friendly names

Add ADLS2Resource to wrap ADLS2/Blob clients

Fix import order in dagster-azure

Add dagster-azure to install_dev_python_modules make target

Include azure-storage-blob in dagster-azure requirements

Remove unused variable in tests

Don't install dagster-azure as part of install_dev_python_modules make target

Remove accidentally committed Azure Blob object/intermediate store implementations

These work but have no immediate use case and no tests, so seem like an
unnecessary maintenance burden. This commit can be reverted if they're needed!

Wrap potentially incompatible imports to add a custom warning

This centralizes the various azure-storage/azure-core imports and wraps them,
plus the snowflake-connector import, in a try/except block, adding a custom
warning with a suggested solution if the import fails.

Add README to dagster-azure and note about incompatibility to dagster-snowflake's README

Isort

Set buildkite container for dagster-azure tests

Merge pull request #1 from dagster-io/dagster-azure

Set buildkite container for dagster-azure tests

Env variables in buildkite for Azure

Test Plan: bk

Differential Revision: https://dagster.phacility.com/D3238

* more buildkite

* tox

* Run black on dagster_snowflake

* Add pylint to dagster-azure tox.ini

* Explicitly specify __all__ for dagster_azure utils modules

* Fix black issues

Co-authored-by: Sandy Ryza <sandy@remix.com>
Co-authored-by: Sandy Ryza <sandyryza@gmail.com>
Summary:
We now require the README.md files to point to the docs.dagster.io docs site. Update dagster-azure and add autodocs API docs

Test Plan: docs only

Reviewers: sandyryza, schrockn

Reviewed By: sandyryza, schrockn

Differential Revision: https://dagster.phacility.com/D3261
Test Plan: bk, saw only one button, re-executed from run page, runs page, playground

Reviewers: alangenfeld, bengotow

Reviewed By: bengotow

Differential Revision: https://dagster.phacility.com/D3235
…nd OutOfProcessRepositoryLocationHandle

Summary:
Begin to introduce a light class hierarhcy here to capture the
different use cases.

Test Plan: BK

Reviewers: alangenfeld

Reviewed By: alangenfeld

Differential Revision: https://dagster.phacility.com/D3259
Summary: Docs example for pipeline tags

Test Plan: ran locally

Reviewers: sashank

Reviewed By: sashank

Differential Revision: https://dagster.phacility.com/D3142
Test Plan: bk

Reviewers: alangenfeld, schrockn

Reviewed By: schrockn

Differential Revision: https://dagster.phacility.com/D3215
…ocation

Test Plan: unit

Reviewers: schrockn, alangenfeld

Reviewed By: alangenfeld

Differential Revision: https://dagster.phacility.com/D3230
Summary: Merged this into master but didn't catch the rename in the test

Test Plan: unit

Reviewers: alangenfeld, max

Reviewed By: alangenfeld

Differential Revision: https://dagster.phacility.com/D3264
Summary:
multi repo support for the run launcher

This current version expects the pipeline to be hosted in the same location and repo in the launch destination - which is the same as the implicit assumption we have now.

This also makes `external_pipeline` required on `launch_run`.

I think there are more clever ways to deal with this in the future, but this seems to me to be the most reasonable path forward for now.

Test Plan: bk

Reviewers: schrockn, sashank, max

Reviewed By: schrockn

Differential Revision: https://dagster.phacility.com/D3202
Summary: Spell things

Test Plan: BK

Reviewers: alangenfeld

Reviewed By: alangenfeld

Differential Revision: https://dagster.phacility.com/D3260
Summary: Lots of small fixes for docs; move everything to API docs from README files

Test Plan: docs only

Reviewers: sashank, yuhan, jordanbramble

Reviewed By: jordanbramble

Differential Revision: https://dagster.phacility.com/D3247
Summary:
Eliminating spurious error

./src/configeditor/codemirror-yaml/mode.tsx
  Line 79:32:  Unnecessary escape character: \[  no-useless-escape
  Line 79:61:  Unnecessary escape character: \[  no-useless-escape

Test Plan: ran make in js_modules/dagit. Saw no error

Reviewers: prha, bengotow

Reviewed By: bengotow

Differential Revision: https://dagster.phacility.com/D3272
Summary:
This adds a new format and method for modeling what
user-provided definitions are loaded by host processes.

Changes from repository.yaml

1) One can now load multiple repositories.
2) Those repositories can be loaded via multiple *locations*. The only
   supported location type right now is in-process. Soon we will support
   out-of-process and containers.
3) You no longer are required to specify the function/symbol name
   where a respository lives. If the target python module contains
   a single repositoroy, it will be automatically loaded. Likewise
   a target module with a single pipeline will be loaded
   with an ephemeral repository.

We also use scalar unions to provide shorthands. An example from
test cases. A single file `hello_world_repository.py` with
a repository defined at `hello_world_repository` is loadable
via these formats:

```
load_from:
  - python_file: hello_world_repository.py
```

```
load_from:
  - python_file:
      relative_path: hello_world_repository.py
```

```
load_from:
  - python_file:
      relative_path: hello_world_repository.py
      definition: hello_world_repository
```

Depends on D3251

Test Plan: BK

Reviewers: alangenfeld, max

Reviewed By: alangenfeld

Subscribers: sashank

Differential Revision: https://dagster.phacility.com/D3253
Test Plan: bk

Reviewers: nate

Reviewed By: nate

Differential Revision: https://dagster.phacility.com/D3256
Test Plan: Printed instance type

Reviewers: alangenfeld

Reviewed By: alangenfeld

Differential Revision: https://dagster.phacility.com/D3273
Test Plan: unit

Reviewers: max

Reviewed By: max

Differential Revision: https://dagster.phacility.com/D3274
Summary: Fixes #2456

Test Plan: tested in browser

Reviewers: sashank, alangenfeld, yuhan

Reviewed By: sashank

Differential Revision: https://dagster.phacility.com/D3265
Summary:
Fixes #2242, fixes #2489

Took a "show, don't tell" approach to instance config, with annotated instance and pipeline run YAML files

Test Plan: docs only

Reviewers: yuhan, sashank

Reviewed By: sashank

Differential Revision: https://dagster.phacility.com/D3258
Summary:
One can now load a workspace in dagit using dagit -w

Added a multi-location, multi-repository example. On
load, we get an expected exception with the message "Invariant failed. Description: [legacy] must have only one location"

Test Plan: BK and load dagit manually to see correct error

Reviewers: alangenfeld, prha, bengotow

Reviewed By: alangenfeld

Differential Revision: https://dagster.phacility.com/D3271
Summary:
Adds a first pass at the external schedule and partition set family of objects

Also hoists partition sets from schedules on to the repository and ensures name collisions don't happen, with related changes to make that pass.

Test Plan: i updated one test so thats something

Reviewers: sashank, prha, schrockn

Reviewed By: sashank

Differential Revision: https://dagster.phacility.com/D3262
Test Plan: bk

Reviewers: nate

Reviewed By: nate

Differential Revision: https://dagster.phacility.com/D3275
sryza and others added 29 commits June 12, 2020 17:14
Summary: #2569

Test Plan: bk

Reviewers: schrockn, sandyryza, max

Reviewed By: schrockn

Differential Revision: https://dagster.phacility.com/D3456
Summary:
{F161325}
{F161489}

Test Plan: bk, clicked through to nested assets

Reviewers: schrockn, sashank, bengotow

Reviewed By: schrockn

Differential Revision: https://dagster.phacility.com/D3480
Summary: If there is a reasonable asset key scheme label seems quite duplicative. Thoughts?

Test Plan: BK

Reviewers: prha

Reviewed By: prha

Differential Revision: https://dagster.phacility.com/D3484
Summary:
- all instances in docs
- all instances in examples
- instances that I missed the first time under python_modules

Test Plan: bk

Reviewers: yuhan, schrockn, max

Reviewed By: schrockn

Differential Revision: https://dagster.phacility.com/D3486
Summary: Short-term fix to unblock; filed #2603 for the long-term solution

Test Plan: buildkite

Reviewers: catherinewu, alangenfeld

Reviewed By: catherinewu

Differential Revision: https://dagster.phacility.com/D3491
Summary: Need to guard that this db is used for run_storage

Test Plan: bk

Reviewers: sashank, schrockn, alangenfeld

Reviewed By: alangenfeld

Differential Revision: https://dagster.phacility.com/D3490
Test Plan: bk

Reviewers: sashank, nate, sandyryza

Reviewed By: nate

Differential Revision: https://dagster.phacility.com/D3483
…mpat

Summary:
#2596

- update callsites from previous diffs
- diff 3 has the relatively most risky parts. happy to make them backcompat if needed
	* execute_plan
	* execute_plan_iterator
	* instance.create_run_for_pipeline
	* instance.create_run
	* instance.register_managed_run_for_test
	* create_execution_plan
- other relevant places

See details in revision commit messages below

followup:
- final pass to clean up call sites and variables

Test Plan: bk

Reviewers: schrockn, sandyryza

Reviewed By: schrockn

Differential Revision: https://dagster.phacility.com/D3477
…te variables

Summary: #2596

Test Plan: no environment_dict renaming warning in tests

Reviewers: schrockn, sandyryza

Reviewed By: schrockn

Differential Revision: https://dagster.phacility.com/D3481
Test Plan: docs build

Reviewers: nate

Reviewed By: nate

Differential Revision: https://dagster.phacility.com/D3494
Summary: resolves #2601

Test Plan: eyes

Reviewers: prha, max

Reviewed By: prha

Differential Revision: https://dagster.phacility.com/D3492
Summary: removes sftp_solid in favor of adding functions to SSHResource. Adds test for get, we can also add a test for put time permitting

Test Plan: BK

Reviewers: nate, sandyryza, max

Reviewed By: nate

Differential Revision: https://dagster.phacility.com/D3461
Summary: resolves #2599

Test Plan: added test case

Reviewers: schrockn, prha, sashank

Reviewed By: schrockn

Differential Revision: https://dagster.phacility.com/D3496
Test Plan:
make new $DAGSTER_HOME
git checkout 0.7.16
dagster -y examples/repository.yaml
git checkout master
dagster -w examples/workspace.yaml

{F163329}

Reviewers: sashank, prha

Reviewed By: prha

Differential Revision: https://dagster.phacility.com/D3497
Summary: update CHANGES.md

Test Plan: BK

Reviewers: prha

Reviewed By: prha

Differential Revision: https://dagster.phacility.com/D3498
Test Plan: inspection

Reviewers: alangenfeld

Reviewed By: alangenfeld

Differential Revision: https://dagster.phacility.com/D3499
Summary:
str(exc_info) introduces ellipses in some configurations, so that tests fail with:

```
E       assert "@composite_solid 'config_fn_only' defines a configuration function <lambda> but does not define a configuration schema." in '<ExceptionInfo DagsterInvalidDefinitionError("@composite_solid \'config_fn_only\' defines a configuration function <l...chema. If you intend this composite to take no config_schema, you must explicitly specify config_schema={}.") tblen=3>'
E        +  where '<ExceptionInfo DagsterInvalidDefinitionError("@composite_solid \'config_fn_only\' defines a configuration function <l...chema. If you intend this composite to take no config_schema, you must explicitly specify config_schema={}.") tblen=3>' = str(<ExceptionInfo DagsterInvalidDefinitionError("@composite_solid 'config_fn_only' defines a configuration function <lamb...schema. If you intend this composite to take no config_schema, you must explicitly specify config_schema={}.") tblen=3>)
```

Test Plan: Unit local

Reviewers: alangenfeld, prha

Reviewed By: alangenfeld

Differential Revision: https://dagster.phacility.com/D3495
Summary: Deal with out-of-process crashes better for resume-retry

Test Plan: Unit

Reviewers: alangenfeld

Reviewed By: alangenfeld

Differential Revision: https://dagster.phacility.com/D3502
Test Plan: manual

Reviewers: yuhan

Differential Revision: https://dagster.phacility.com/D3505
Summary:
Check for existence of a row without fetching and parsing the
entire snaphsot

Test Plan: BK

Reviewers: prha

Reviewed By: prha

Differential Revision: https://dagster.phacility.com/D3504
Summary: See title

Test Plan: BK

Differential Revision: https://dagster.phacility.com/D3511
@DavidKatz-il DavidKatz-il merged commit fe2cb3e into DavidKatz-il:master Jun 16, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.