Converting PickledObjectFilesystemIOManager to use UPathIOManager #10273

danielgafni · 2022-10-31T20:17:43Z

Summary & Motivation

The default IO manager can now be built on top of the recently added UPathIOManager.

The PickledObjectFilesystemIOManager class can now be used with any filesystem.

The fs_io_manager object, however, is still meant to be used with the local filesystem.
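
To illustrate the layering described above, here is a minimal stdlib-only sketch, not Dagster's actual implementation. `PathIOManagerSketch` and `PickleIOManagerSketch` are hypothetical names; the real UPathIOManager methods also receive a `context` argument and operate on `UPath` objects rather than `pathlib.Path`. The idea is that the base class owns path resolution, while the pickling subclass only knows how to write to and read from a given path.

```python
import pickle
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Any


class PathIOManagerSketch(ABC):
    """Hypothetical stand-in for a path-based IO manager base class."""

    def __init__(self, base_path: Path) -> None:
        self.base_path = base_path

    def _path_for(self, key: str) -> Path:
        # The base class decides where an object lives; subclasses never build paths.
        return self.base_path / key

    @abstractmethod
    def dump_to_path(self, obj: Any, path: Path) -> None:
        ...

    @abstractmethod
    def load_from_path(self, path: Path) -> Any:
        ...

    def handle_output(self, key: str, obj: Any) -> None:
        path = self._path_for(key)
        path.parent.mkdir(parents=True, exist_ok=True)
        self.dump_to_path(obj, path)

    def load_input(self, key: str) -> Any:
        return self.load_from_path(self._path_for(key))


class PickleIOManagerSketch(PathIOManagerSketch):
    """Only serialization lives here; everything else is inherited."""

    def dump_to_path(self, obj: Any, path: Path) -> None:
        with path.open("wb") as f:
            pickle.dump(obj, f)

    def load_from_path(self, path: Path) -> Any:
        with path.open("rb") as f:
            return pickle.load(f)


with tempfile.TemporaryDirectory() as tmp:
    manager = PickleIOManagerSketch(Path(tmp))
    manager.handle_output("my_job/my_op/result", {"rows": 3})
    loaded = manager.load_input("my_job/my_op/result")
```

Because the subclass touches the path only through `open`, swapping the path type for one backed by another filesystem would not require changes to the pickling code, which is the property the summary relies on.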

How I Tested These Changes

Running existing tests

danielgafni · 2022-10-31T20:18:23Z

@sryza here is the new PR

danielgafni · 2022-11-02T19:28:03Z

What should we do with CustomPathPickledObjectFilesystemIOManager? I also found VersionedPickledObjectFilesystemIOManager. Does it have to be updated too? It seems like the versioning logic could be included in the UPathIOManager as well.

sryza

I love to see lines of code deleted.

Overall, this change looks great.

I'm seeing a few test failures that I think might be related to these changes:

[2022-11-03T15:19:34Z] dagster_tests/core_tests/test_multiple_outputs.py::test_multiple_outputs_only_emit_one_multiproc FAILED [  3%]
[2022-11-03T15:19:56Z] dagster_tests/core_tests/test_pipeline_execution.py::test_multiproc_reexecution_fs_storage_after_fail FAILED [  6%]
[2022-11-03T15:32:10Z] dagster_tests/core_tests/partition_tests/test_partitioned_io_manager.py::test_partitioned_io_manager FAILED [ 66%]

danielgafni · 2022-11-04T19:40:53Z

    
        except DagsterError as e:
            raise DagsterInvalidDefinitionError(
                f"Problem using type '{inferred.annotation}' from type annotation for argument "
                f"'{inferred.name}', correct the issue or explicitly set the dagster_type "
                "via In()."
>           ) from e
E           dagster._core.errors.DagsterInvalidDefinitionError: Problem using type 'typing.Mapping[str, typing.Any]' from type annotation for argument 'hourly_asset', correct the issue or explicitly set the dagster_type via In().

Mapping type annotations don't seem to work with Dagster out of the box (without a DagsterType), so I'm changing the partitioned typing to Dict.

danielgafni · 2022-11-04T19:50:14Z

Also, do you know anything about this TODO? I get an error when trying to log something:

        for partition_key, path in paths.items():
            context.log.debug(f"Loading partition from {path} using {self.__class__.__name__}")
            try:
                obj = self.load_from_path(context=context, path=path)
                objs[partition_key] = obj
            except FileNotFoundError as e:
                if not allow_missing_partitions:
                    raise e
                context.log.debug(
                    f"Couldn't load partition {path} and skipped it "
                    f"because the input metadata includes allow_missing_partitions=True"
                )

        # TODO: context.add_output_metadata fails in the partitioned context. this should be fixed?
        return objs
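
The skip-or-raise behaviour of the loop above can be tried in isolation with a stdlib-only sketch. `load_partitions` and `read_text` are hypothetical helpers, and the `context` logging calls from the snippet are omitted; only the `allow_missing_partitions` handling is mirrored.

```python
import tempfile
from pathlib import Path
from typing import Any, Callable, Dict


def load_partitions(
    paths: Dict[str, Path],
    load_from_path: Callable[[Path], Any],
    allow_missing_partitions: bool = False,
) -> Dict[str, Any]:
    """Load every partition, optionally skipping the ones whose file is missing."""
    objs: Dict[str, Any] = {}
    for partition_key, path in paths.items():
        try:
            objs[partition_key] = load_from_path(path)
        except FileNotFoundError:
            if not allow_missing_partitions:
                # Strict mode: a missing partition is an error.
                raise
            # Lenient mode: leave the partition out of the result.
    return objs


def read_text(path: Path) -> str:
    return path.read_text()


with tempfile.TemporaryDirectory() as tmp:
    present = Path(tmp) / "2022-11-01.txt"
    present.write_text("a")
    paths = {"2022-11-01": present, "2022-11-02": Path(tmp) / "2022-11-02.txt"}

    loaded = load_partitions(paths, read_text, allow_missing_partitions=True)

    try:
        load_partitions(paths, read_text)
        strict_failed = False
    except FileNotFoundError:
        strict_failed = True
```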

sryza · 2022-11-04T23:08:07Z

We're going to remove both CustomPathPickledObjectFilesystemIOManager and
VersionedPickledObjectFilesystemIOManager, so I don't think it's worth updating them.

I think we should ultimately handle the versioning logic in the fs_io_manager / UPathIOManager itself, but I don't think that needs to be part of this PR. We're currently developing asset versioning, so it would make sense to wait for that.

docs/content/concepts/io-management/io-managers.mdx

sryza

I'm seeing a few complaints from the build:

dagster_tests/core_tests/storage_tests/test_upath_io_manager.py:6: in <module>
  | [2022-11-08T22:27:23Z]     import pandas as pd
  | [2022-11-08T22:27:23Z] E   ModuleNotFoundError: No module named 'pandas'

It looks like Pandas isn't currently a dependency of the Dagster tests. I'm a tiny bit hesitant to add it because installing Pandas could increase build times. If you do think it makes a lot of sense to use Pandas here, you could add it as part of the testenv in python_modules/dagster/tox.ini.

[2022-11-08T22:23:41Z] +ENOENT: no such file or directory, open '/workdir/examples/docs_snippets/docs_snippets/concepts/io_management/upath_io_manager.py'

Did you possibly forget to commit a file?

btw, this is what I run locally to check for these kinds of errors:

cd $HOME/dagster/docs/
make mdx-format
grep -r --include \*.mdx ENOENT ./
cd ..

one more:

[2022-11-08T22:25:39Z] /workdir/python_modules/dagster/dagster/_core/storage/upath_io_manager.py:docstring of dagster._core.storage.upath_io_manager.UPathIOManager:6: WARNING: Bullet list ends without a blank line; unexpected unindent.

Thanks for all the revisions you're doing on this one!

danielgafni · 2022-11-09T11:14:52Z

I don't think Pandas is really needed. I just wanted to test a type different than Any. I changed this test to run over List.

sryza · 2022-11-09T22:16:17Z

This is very close! It looks like buildkite has a few outstanding complaints:

This one is confusing to me because you listed universal_pathlib in Dagster's setup.py. Let me know if you can't figure this one out, and I can help.

[2022-11-09T20:01:14Z] ************* Module docs_snippets.concepts.io_management.filesystem_io_manager
--
  | [2022-11-09T20:01:14Z] docs_snippets/concepts/io_management/filesystem_io_manager.py:6:0: E0401: Unable to import 'universal_pathlib' (import-error)

Ran into the following error in the GraphQL tests. I think what's going on is that we're no longer doing context.log.debug(f"Writing file at: {filepath}") and context.log.debug(f"Loading file from: {filepath}"). We should probably bring these back for consistency.

[2022-11-09T20:08:18Z] dagster_graphql_tests/graphql/test_execute_pipeline.py::TestExecutePipeline::test_basic_start_pipeline_execution_and_subscribe[sqlite_with_default_run_launcher_managed_grpc_env] FAILED [ 49%]
...
[2022-11-09T20:16:23Z] E       AssertionError: assert ['RunStartingEvent',\n 'RunStartEvent',\n 'ResourceInitStartedEvent',\n 'ResourceInitSuccessEvent',\n 'LogsCapturedEvent',\n 'ExecutionStepStartEvent',\n 'ExecutionStepInputEvent',\n 'ExecutionStepOutputEvent',\n 'LogMessageEvent',\n 'HandledOutputEvent',\n 'ExecutionStepSuccessEvent',\n 'LogsCapturedEvent',\n 'ExecutionStepStartEvent',\n 'LogMessageEvent',\n 'LoadedInputEvent',\n 'ExecutionStepInputEvent',\n 'ExecutionStepOutputEvent',\n 'LogMessageEvent',\n 'HandledOutputEvent',\n 'ExecutionStepSuccessEvent',\n 'RunSuccessEvent'] == ['RunStartingEvent',\n 'RunStartEvent',\n 'ResourceInitStartedEvent',\n 'ResourceInitSuccessEvent',\n 'LogsCapturedEvent',\n 'ExecutionStepStartEvent',\n 'ExecutionStepInputEvent',\n 'ExecutionStepOutputEvent',\n 'HandledOutputEvent',\n 'ExecutionStepSuccessEvent',\n 'LogsCapturedEvent',\n 'ExecutionStepStartEvent',\n 'LoadedInputEvent',\n 'ExecutionStepInputEvent',\n 'ExecutionStepOutputEvent',\n 'HandledOutputEvent',\n 'ExecutionStepSuccessEvent',\n 'RunSuccessEvent']
--
  | [2022-11-09T20:16:23Z] E         At index 8 diff: 'LogMessageEvent' != 'HandledOutputEvent'
  | [2022-11-09T20:16:23Z] E         Left contains 3 more items, first extra item: 'HandledOutputEvent'
  | [2022-11-09T20:16:23Z] E         Full diff:
  | [2022-11-09T20:16:23Z] E           [
  | [2022-11-09T20:16:23Z] E            'RunStartingEvent',
  | [2022-11-09T20:16:23Z] E            'RunStartEvent',
  | [2022-11-09T20:16:23Z] E            'ResourceInitStartedEvent',
  | [2022-11-09T20:16:23Z] E            'ResourceInitSuccessEvent',
  | [2022-11-09T20:16:23Z] E            'LogsCapturedEvent',
  | [2022-11-09T20:16:23Z] E            'ExecutionStepStartEvent',
  | [2022-11-09T20:16:23Z] E            'ExecutionStepInputEvent',
  | [2022-11-09T20:16:23Z] E            'ExecutionStepOutputEvent',
  | [2022-11-09T20:16:23Z] E         +  'LogMessageEvent',
  | [2022-11-09T20:16:23Z] E            'HandledOutputEvent',
  | [2022-11-09T20:16:23Z] E            'ExecutionStepSuccessEvent',
  | [2022-11-09T20:16:23Z] E            'LogsCapturedEvent',
  | [2022-11-09T20:16:23Z] E            'ExecutionStepStartEvent',
  | [2022-11-09T20:16:23Z] E         +  'LogMessageEvent',
  | [2022-11-09T20:16:23Z] E            'LoadedInputEvent',
  | [2022-11-09T20:16:23Z] E            'ExecutionStepInputEvent',
  | [2022-11-09T20:16:23Z] E            'ExecutionStepOutputEvent',
  | [2022-11-09T20:16:23Z] E         +  'LogMessageEvent',
  | [2022-11-09T20:16:23Z] E            'HandledOutputEvent',
  | [2022-11-09T20:16:23Z] E            'ExecutionStepSuccessEvent',
  | [2022-11-09T20:16:23Z] E            'RunSuccessEvent',
  | [2022-11-09T20:16:23Z] E           ]
  | [2022-11-09T20:16:23Z]
  | [2022-11-09T20:16:23Z] dagster_graphql_tests/graphql/test_execute_pipeline.py:392: AssertionError

A couple mypy errors in the tests. Let me know if you want help figuring these out.


[2022-11-09T20:00:26Z] dagster_tests/core_tests/storage_tests/test_upath_io_manager.py:345: error: Item "StepFailureData" of "Union[StepOutputData, StepFailureData, StepSuccessData, StepMaterializationData, StepExpectationResultData, StepInputData, EngineEventData, HookErroredData, StepRetryData, PipelineFailureData, PipelineCanceledData, ObjectStoreOperationResultData, HandledOutputData, LoadedInputData, ComputeLogsCaptureData, AssetObservationData, AssetMaterializationPlannedData, None]" has no attribute "metadata_entries"  [union-attr]
 [2022-11-09T20:00:26Z] dagster_tests/core_tests/storage_tests/test_upath_io_manager.py:345: error: Value of type "Union[List[MetadataEntry], Any, None]" is not indexable  [index]
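
The thread does not show how these two mypy errors were resolved. One common way to address `union-attr` and `index` errors of this shape is to narrow the types before use, sketched below with hypothetical stand-in classes (`OutputDataSketch`, `FailureDataSketch`) rather than Dagster's real event data types.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class OutputDataSketch:
    # Optional on purpose, to mirror the "is not indexable" error.
    metadata_entries: Optional[List[str]] = None


@dataclass
class FailureDataSketch:
    error: str = "boom"


EventData = Union[OutputDataSketch, FailureDataSketch, None]


def first_metadata_entry(data: EventData) -> str:
    # isinstance narrows the union, so the attribute access type-checks.
    assert isinstance(data, OutputDataSketch)
    entries = data.metadata_entries
    # Ruling out None makes the value indexable for the type checker.
    assert entries is not None
    return entries[0]


entry = first_metadata_entry(OutputDataSketch(metadata_entries=["path"]))
```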

One of the toys is failing. This toy isn't super important to keep around, because it covers functionality that we've gotten rid of, so feel free to remove it, but I think it's worth seeing why it's failing to see if it might cause failures in some legitimate situation.

[2022-11-09T20:07:18Z] FAILED dagster_test_tests/test_toys.py::test_asset_lineage_job - dagster._cor...


[2022-11-09T20:05:12Z] dagster._core.errors.DagsterResourceFunctionError: Error executing resource_fn on ResourceDefinition my_db_io_manager
--
  | [2022-11-09T20:05:12Z]
  | [2022-11-09T20:05:12Z] Stack Trace:
  | [2022-11-09T20:05:12Z]   File "/workdir/python_modules/dagster/dagster/_core/execution/api.py", line 1091, in pipeline_execution_iterator
  | [2022-11-09T20:05:12Z]     for event in pipeline_context.executor.execute(pipeline_context, execution_plan):
  | [2022-11-09T20:05:12Z]   File "/workdir/python_modules/dagster/dagster/_core/executor/in_process.py", line 38, in execute
  | [2022-11-09T20:05:12Z]     yield from iter(
  | [2022-11-09T20:05:12Z]   File "/workdir/python_modules/dagster/dagster/_core/execution/api.py", line 1182, in __iter__
  | [2022-11-09T20:05:12Z]     yield from self.execution_context_manager.prepare_context()
  | [2022-11-09T20:05:12Z]   File "/workdir/python_modules/dagster/dagster/_utils/__init__.py", line 494, in generate_setup_events
  | [2022-11-09T20:05:12Z]     obj = next(self.generator)
  | [2022-11-09T20:05:12Z]   File "/workdir/python_modules/dagster/dagster/_core/execution/context_creation_pipeline.py", line 260, in execution_context_event_generator
  | [2022-11-09T20:05:12Z]     yield from resources_manager.generate_setup_events()
  | [2022-11-09T20:05:12Z]   File "/workdir/python_modules/dagster/dagster/_utils/__init__.py", line 494, in generate_setup_events
  | [2022-11-09T20:05:12Z]     obj = next(self.generator)
  | [2022-11-09T20:05:12Z]   File "/workdir/python_modules/dagster/dagster/_core/execution/resources_init.py", line 257, in resource_initialization_event_generator
  | [2022-11-09T20:05:12Z]     yield from _core_resource_initialization_event_generator(
  | [2022-11-09T20:05:12Z]   File "/workdir/python_modules/dagster/dagster/_core/execution/resources_init.py", line 220, in _core_resource_initialization_event_generator
  | [2022-11-09T20:05:12Z]     raise dagster_user_error
  | [2022-11-09T20:05:12Z]   File "/workdir/python_modules/dagster/dagster/_core/execution/resources_init.py", line 185, in _core_resource_initialization_event_generator
  | [2022-11-09T20:05:12Z]     for event in manager.generate_setup_events():
  | [2022-11-09T20:05:12Z]   File "/workdir/python_modules/dagster/dagster/_utils/__init__.py", line 494, in generate_setup_events
  | [2022-11-09T20:05:12Z]     obj = next(self.generator)
  | [2022-11-09T20:05:12Z]   File "/workdir/python_modules/dagster/dagster/_core/execution/resources_init.py", line 349, in single_resource_event_generator
  | [2022-11-09T20:05:12Z]     raise dagster_user_error
  | [2022-11-09T20:05:12Z]   File "/workdir/python_modules/dagster/dagster/_core/execution/resources_init.py", line 341, in single_resource_event_generator
  | [2022-11-09T20:05:12Z]     except StopIteration:
  | [2022-11-09T20:05:12Z]   File "/usr/local/lib/python3.9/contextlib.py", line 137, in __exit__
  | [2022-11-09T20:05:12Z]     self.gen.throw(typ, value, traceback)
  | [2022-11-09T20:05:12Z]   File "/workdir/python_modules/dagster/dagster/_core/errors.py", line 191, in user_code_error_boundary
  | [2022-11-09T20:05:12Z]     raise error_cls(
  | [2022-11-09T20:05:12Z]
  | [2022-11-09T20:05:12Z] The above exception was caused by the following exception:
  | [2022-11-09T20:05:12Z] TypeError: expected str, bytes or os.PathLike object, not NoneType
  | [2022-11-09T20:05:12Z]
  | [2022-11-09T20:05:12Z] Stack Trace:
  | [2022-11-09T20:05:12Z]   File "/workdir/python_modules/dagster/dagster/_core/errors.py", line 184, in user_code_error_boundary
  | [2022-11-09T20:05:12Z]     yield
  | [2022-11-09T20:05:12Z]   File "/workdir/python_modules/dagster/dagster/_core/execution/resources_init.py", line 325, in single_resource_event_generator
  | [2022-11-09T20:05:12Z]     resource_def.resource_fn(context)
  | [2022-11-09T20:05:12Z]   File "/workdir/python_modules/dagster-test/dagster_test/toys/asset_lineage.py", line 98, in my_db_io_manager
  | [2022-11-09T20:05:12Z]     return MyDatabaseIOManager()
  | [2022-11-09T20:05:12Z]   File "/workdir/python_modules/dagster/dagster/_core/storage/fs_io_manager.py", line 144, in __init__
  | [2022-11-09T20:05:12Z]     super().__init__(base_path=UPath(base_dir, **kwargs))
  | [2022-11-09T20:05:12Z]   File "/workdir/python_modules/dagster-test/.tox/py39/lib/python3.9/site-packages/upath/core.py", line 121, in __new__
  | [2022-11-09T20:05:12Z]     return pathlib.Path(*args, **kwargs)
  | [2022-11-09T20:05:12Z]   File "/usr/local/lib/python3.9/pathlib.py", line 1082, in __new__
  | [2022-11-09T20:05:12Z]     self = cls._from_parts(args, init=False)
  | [2022-11-09T20:05:12Z]   File "/usr/local/lib/python3.9/pathlib.py", line 707, in _from_parts
  | [2022-11-09T20:05:12Z]     drv, root, parts = self._parse_args(args)
  | [2022-11-09T20:05:12Z]   File "/usr/local/lib/python3.9/pathlib.py", line 691, in _parse_args
  | [2022-11-09T20:05:12Z]     a = os.fspath(a)
  | [2022-11-09T20:05:12Z]
  | [2022-11-09T20:05:12Z] FAILED

danielgafni · 2022-11-10T00:19:04Z

mypy - fixed
toy - updated the code. For some reason MyDatabaseIOManager inherited from PickledObjectFilesystemIOManager, which doesn't make much sense, so I removed this. The toy runs now. Did I miss something?
GraphQL errors - fixed the logging to be the same as before
package name - fixed to upath xD
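
For reference, the toy's failure in the traceback above comes from constructing a path from `None`: `MyDatabaseIOManager()` was called without a base directory, and the constructor passed that value straight to `UPath`, which fell through to `pathlib.Path`. The stdlib-only sketch below reproduces the raw error and shows one possible guard; `FilesystemManagerSketch` is a hypothetical class, and the guard is an illustration, not what the PR does.

```python
from pathlib import Path
from typing import Optional


class FilesystemManagerSketch:
    """Hypothetical manager that validates base_dir before building a path."""

    def __init__(self, base_dir: Optional[str] = None) -> None:
        if base_dir is None:
            # Fail early with a message that names the missing argument.
            raise ValueError("base_dir must be provided")
        self.base_path = Path(base_dir)


# Without a guard, pathlib itself rejects None with a TypeError.
try:
    Path(None)  # type: ignore[arg-type]
    raw_error = None
except TypeError as exc:
    raw_error = exc

# With the guard, the failure points at the real cause.
try:
    FilesystemManagerSketch()
    guarded_error = None
except ValueError as exc:
    guarded_error = exc
```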

sryza · 2022-11-10T21:42:14Z

Sorry, still seeing a few issues.

Building API docs. You can test this out with make apidoc-build:

[2022-11-10T18:47:07Z] /workdir/python_modules/dagster/dagster/_core/storage/upath_io_manager.py:docstring of dagster._core.storage.upath_io_manager.UPathIOManager:6: WARNING: Bullet list ends without a blank line; unexpected unindent.
--
  | [2022-11-10T18:47:07Z] /workdir/python_modules/dagster/dagster/_core/storage/upath_io_manager.py:docstring of dagster._core.storage.upath_io_manager.UPathIOManager:9: WARNING: Definition list ends without a blank line; unexpected unindent.
  | [2022-11-10T18:47:07Z] looking for now-outdated files... none found

Some GraphQL tests appear to still be failing for the reason above:

[2022-11-10T18:53:20Z] dagster_graphql_tests/graphql/test_execute_pipeline.py::TestExecutePipeline::test_basic_start_pipeline_execution_and_subscribe[sqlite_with_default_run_launcher_managed_grpc_env] FAILED [ 50%]
...

[2022-11-10T19:01:44Z] E       AssertionError: assert (['RunStartingEvent',\n 'RunStartEvent',\n 'ResourceInitStartedEvent',\n 'ResourceInitSuccessEvent',\n 'LogsCapturedEvent',\n 'ExecutionStepStartEvent',\n 'ExecutionStepInputEvent',\n 'ExecutionStepOutputEvent',\n 'LogMessageEvent',\n 'HandledOutputEvent',\n 'ExecutionStepSuccessEvent',\n 'ExecutionStepStartEvent',\n 'LogMessageEvent',\n 'LoadedInputEvent',\n 'ExecutionStepInputEvent',\n 'ExecutionStepOutputEvent',\n 'LogMessageEvent',\n 'HandledOutputEvent',\n 'ExecutionStepSuccessEvent',\n 'RunSuccessEvent'] == ['RunStartingEvent',\n 'RunStartEvent',\n 'ResourceInitStartedEvent',\n 'ResourceInitSuccessEvent',\n 'LogsCapturedEvent',\n 'ExecutionStepStartEvent',\n 'ExecutionStepInputEvent',\n 'ExecutionStepOutputEvent',\n 'HandledOutputEvent',\n 'ExecutionStepSuccessEvent',\n 'ExecutionStepStartEvent',\n 'LoadedInputEvent',\n 'ExecutionStepInputEvent',\n 'ExecutionStepOutputEvent',\n 'HandledOutputEvent',\n 'ExecutionStepSuccessEvent',\n 'RunSuccessEvent']
--
  | [2022-11-10T19:01:44Z] E         At index 8 diff: 'LogMessageEvent' != 'HandledOutputEvent'
  | [2022-11-10T19:01:44Z] E         Left contains 3 more items, first extra item: 'HandledOutputEvent'
  | [2022-11-10T19:01:44Z] E         Full diff:
  | [2022-11-10T19:01:44Z] E           [
  | [2022-11-10T19:01:44Z] E            'RunStartingEvent',
  | [2022-11-10T19:01:44Z] E            'RunStartEvent',
  | [2022-11-10T19:01:44Z] E            'ResourceInitStartedEvent',
  | [2022-11-10T19:01:44Z] E            'ResourceInitSuccessEvent',
  | [2022-11-10T19:01:44Z] E            'LogsCapturedEvent',
  | [2022-11-10T19:01:44Z] E            'ExecutionStepStartEvent',
  | [2022-11-10T19:01:44Z] E            'ExecutionStepInputEvent',
  | [2022-11-10T19:01:44Z] E            'ExecutionStepOutputEvent',
  | [2022-11-10T19:01:44Z] E         +  'LogMessageEvent',
  | [2022-11-10T19:01:44Z] E            'HandledOutputEvent',
  | [2022-11-10T19:01:44Z] E            'ExecutionStepSuccessEvent',
  | [2022-11-10T19:01:44Z] E            'ExecutionStepStartEvent',
  | [2022-11-10T19:01:44Z] E         +  'LogMessageEvent',
  | [2022-11-10T19:01:44Z] E            'LoadedInputEvent',
  | [2022-11-10T19:01:44Z] E            'ExecutionStepInputEvent',
  | [2022-11-10T19:01:44Z] E            'ExecutionStepOutputEvent',
  | [2022-11-10T19:01:44Z] E         +  'LogMessageEvent',
  | [2022-11-10T19:01:44Z] E            'HandledOutputEvent',
  | [2022-11-10T19:01:44Z] E            'ExecutionStepSuccessEvent',
  | [2022-11-10T19:01:44Z] E            'RunSuccessEvent',
  | [2022-11-10T19:01:44Z] E           ] or ['RunStartingEvent',\n 'RunStartEvent',\n 'ResourceInitStartedEvent',\n 'ResourceInitSuccessEvent',\n 'LogsCapturedEvent',\n 'ExecutionStepStartEvent',\n 'ExecutionStepInputEvent',\n 'ExecutionStepOutputEvent',\n 'LogMessageEvent',\n 'HandledOutputEvent',\n 'ExecutionStepSuccessEvent',\n 'ExecutionStepStartEvent',\n 'LogMessageEvent',\n 'LoadedInputEvent',\n 'ExecutionStepInputEvent',\n 'ExecutionStepOutputEvent',\n 'LogMessageEvent',\n 'HandledOutputEvent',\n 'ExecutionStepSuccessEvent',\n 'RunSuccessEvent'] == ['RunStartingEvent',\n 'RunStartEvent',\n 'ResourceInitStartedEvent',\n 'ResourceInitSuccessEvent',\n 'LogsCapturedEvent',\n 'ExecutionStepStartEvent',\n 'ExecutionStepInputEvent',\n 'ExecutionStepOutputEvent',\n 'HandledOutputEvent',\n 'ExecutionStepSuccessEvent',\n 'LogsCapturedEvent',\n 'ExecutionStepStartEvent',\n 'LoadedInputEvent',\n 'ExecutionStepInputEvent',\n 'ExecutionStepOutputEvent',\n 'HandledOutputEvent',\n 'ExecutionStepSuccessEvent',\n 'RunSuccessEvent']

test_asset_lineage_job also appears to still be failing.

And a couple lint errors:

[2022-11-10T18:48:12Z] ************* Module dagster_test.toys.asset_lineage
--
  | [2022-11-10T18:48:12Z] dagster_test/toys/asset_lineage.py:92:11: E0110: Abstract class 'MyDatabaseIOManager' with abstract methods instantiated (abstract-class-instantiated)
  | [2022-11-10T18:48:12Z] dagster_test/toys/asset_lineage.py:2:0: W0611: Unused import os (unused-import)
  | [2022-11-10T18:48:13Z]
  | [2022-11-10T18:48:13Z] -----------------------------------
  | [2022-11-10T18:48:13Z] Your code has been rated at 9.96/10
  | [2022-11-10T18:48:13Z]

Let me know if you get burnt out on this back and forth. I'd be happy to patch up these final issues on my own if helpful. Btw, I'm pushing harder on our infra team to see if we can get buildkite exposed.

danielgafni · 2022-11-10T23:00:58Z

Thanks! I'm doing fine, I was really trying to get everything ready for this release, but I can be more relaxed now since we are waiting for the next one :)

Re: buildkite - it would be really nice! It's very annoying to wait for you to run the build (I'm sure it's equally annoying for you too), especially since I'm in a very different time zone. It's also easy to forget to run some checks locally.

Perhaps a single make command for running all the checks could simplify the process?

Another problem is the source of errors: because the repo is already in a dirty state, it's harder to identify which errors your commits introduced.

Ideally, all existing errors across the repo would be fixed and merging with failed pipelines would be forbidden... but I understand that's a lot of work.

danielgafni · 2022-11-11T08:27:03Z

Another problem is running tests - it's taking a really long time. So instead of running all the tests I usually only run them for some files which I know I'm affecting. The problem is, sometimes there are tests I don't know I'm affecting, and I can't know about it until running the CI.

sryza · 2022-11-12T00:48:23Z

Latest build: The GraphQL error appears to be gone. test_asset_lineage_job and make apidoc-build appear to still have failures.

Our infra folks are actively looking into exposing buildkite publicly.

danielgafni · 2022-11-12T19:15:58Z

What is the error message for apidoc-build? I can't identify it... I only see this:

sryza · 2022-11-13T17:03:28Z

The apidoc-build errors are annoying - you need to scroll up to find them. It's this one:

[2022-11-11T20:36:11Z] /workdir/python_modules/dagster/dagster/_core/storage/upath_io_manager.py:docstring of dagster._core.storage.upath_io_manager.UPathIOManager:6: WARNING: Bullet list ends without a blank line; unexpected unindent.
--
  | [2022-11-11T20:36:11Z] /workdir/python_modules/dagster/dagster/_core/storage/upath_io_manager.py:docstring of dagster._core.storage.upath_io_manager.UPathIOManager:9: WARNING: Definition list ends without a blank line; unexpected unindent.
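
Both warnings are reStructuredText layout complaints: a bullet list (or definition list) in a docstring has to be separated from the text that follows it by a blank line. The sketch below shows a docstring laid out that way. The class name and the docstring wording are hypothetical and are not the actual UPathIOManager docstring.

```python
import inspect


class DocumentedManagerSketch:
    """Hypothetical IO manager docstring.

    Features:

    - first feature in the list
    - second feature in the list

    This paragraph is separated from the list by a blank line.
    """


doc_lines = inspect.cleandoc(DocumentedManagerSketch.__doc__).splitlines()

# Find the last bullet and look at the line that follows it.
last_bullet = max(
    i for i, line in enumerate(doc_lines) if line.lstrip().startswith("- ")
)
line_after_list = doc_lines[last_bullet + 1]
```

Placing a blank line between "Features:" and the first bullet also keeps the parser from reading the indented list as the body of a definition list.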

danielgafni · 2022-11-13T19:52:41Z

The docs build is now running correctly!

sryza · 2022-11-14T15:58:16Z

One last tiny issue!

[2022-11-14T05:57:50Z] dagster_test/toys/asset_lineage.py:2:0: W0611: Unused import os (unused-import)

Everything else looks great

danielgafni · 2022-11-14T16:05:28Z

fixed

sryza · 2022-11-14T16:51:21Z

The buildkite failures look unrelated. Going to merge this!

converted PickledObjectFilesystemIOManager to use UPathIOManager

19dc59d

danielgafni changed the title ~~Converting PickledObject IOManagers to use UPathIOManager~~ WIP: Converting PickledObject IOManagers to use UPathIOManager Oct 31, 2022

alangenfeld requested a review from sryza November 1, 2022 14:23

remove experimental annotation

c196d09

danielgafni changed the title ~~WIP: Converting PickledObject IOManagers to use UPathIOManager~~ Converting PickledObjectFilesystemIOManager to use UPathIOManager Nov 2, 2022

danielgafni added 2 commits November 2, 2022 20:34

add kwargs for UPath

4b4f048

fix docstring

6696341

sryza reviewed Nov 4, 2022

danielgafni added 2 commits November 4, 2022 20:41

refactor load_input logic

e676674

fix typo

71df66c

danielgafni added 2 commits November 7, 2022 20:33

add UPathIOManager docs

5cd6950

UPathIOManager brought into main dagster scope

18e8b46

sryza reviewed Nov 7, 2022

danielgafni added 2 commits November 7, 2022 21:02

fix wording

ab469c3

allow omitting type annotations for loading multiple partitions

26fd37b

sryza self-requested a review November 8, 2022 15:51

moved UPathIOManager docs to Examples

ae3fda0

sryza requested changes Nov 9, 2022

fix get_metadata call

f4069ad

danielgafni added 3 commits November 9, 2022 15:03

remove pandas from tests & fix some issues with docs

a90eb93

add blank line

fc4138e

make mdx-format

b86b5ee

sryza self-requested a review November 9, 2022 21:26

danielgafni added 5 commits November 10, 2022 04:18

fix issues

bb3dab3

merge master

b5ecbcb

fix import

f7def65

merge master

008dd94

fix docs issues

10ae18d

fix graphql test & toy script

43bf961

fix toy IO manager

29c524a

danielgafni added 3 commits November 13, 2022 23:39

fix apidoc

930a149

fix typo

e01b630

merge master

29e2521

sryza merged commit 40671c2 into dagster-io:master Nov 14, 2022

remove unused import

9962598
