Set default format of structured dataset to empty #1159

Merged (5 commits) on Dec 9, 2022

@pingsutw (Member) commented Sep 13, 2022

Signed-off-by: Kevin Su <pingsutw@apache.org>

Compatibility Note

This PR partly relies on flyteorg/flytepropeller@69158ea, which was merged into propeller v1.1.36 and Flyte v1.2. If you are running an older version of propeller, cached tasks will experience cache misses with this PR. See the TL;DR below for details. In short, the format in the signature of any task with dataframes in its inputs or outputs will now be "". Without the upgrade, propeller will treat the signature as changed, which results in a Data Catalog cache miss.
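
To make the cache-miss mechanism concrete, here is a toy sketch. It is not how propeller or Data Catalog actually compute cache keys; `signature_key` and the dictionary layout are hypothetical and only illustrate that a key derived from the task signature changes when the format string in that signature changes from "parquet" to "".

```python
import hashlib
import json


def signature_key(task_name: str, cache_version: str, outputs: dict) -> str:
    """Hypothetical cache key: a hash over the task name, cache version,
    and the typed interface (including the structured dataset format)."""
    payload = json.dumps(
        {"task": task_name, "version": cache_version, "outputs": outputs},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Same task and cache_version; only the format in the signature differs.
old_key = signature_key("t1", "1.0", {"o0": {"format": "parquet"}})
new_key = signature_key("t1", "1.0", {"o0": {"format": ""}})

# A lookup keyed on the signature no longer finds the entry stored under old_key.
print(old_key == new_key)
```

Under this toy model, entries cached under the old signature are simply not found, which is the cache miss described above.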

TL;DR

flyteorg/flyte#2864

The current default format is Parquet, which leads to two issues:

  1. The StructuredDataset (SD) transformer always converts a dataframe to Parquet instead of using cls.DEFAULT_FORMATS[df_type].
def t1() -> StructuredDataset:  # the default format of the structured dataset is Parquet here
    ...
  2. A BigQuery (BQ) task fails to run when caching is enabled because type validation fails: the format declared in the signature (Parquet) differs from the format of the returned dataset ("").
@task(cache=True, cache_version="1.0")
def t1(sd: StructuredDataset) -> StructuredDataset:  # the default format of the structured dataset is Parquet here
    # bq_uri is assumed to be a BigQuery table URI defined elsewhere
    df = pd.DataFrame({"len": [len(sd.open(pd.DataFrame).all())]})
    return StructuredDataset(df, uri=bq_uri)  # the format of the returned structured dataset is ""

In this PR, we set the default format to "":

def t1() -> StructuredDataset:  # the default format of the structured dataset is "" here
    ...

By default, handlers[df_type][protocol][""] is used to encode a dataframe. If handlers[df_type][protocol][""] does not exist, handlers[df_type][protocol][cls.DEFAULT_FORMATS[df_type]] is used instead.
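
The lookup order described above can be sketched as follows. This is a minimal illustration, not flytekit's implementation: the `handlers` dictionary, the string type keys, the `find_encoder` helper, and the placeholder encoders are all assumptions made for the example; only the "try the "" format first, then fall back to the per-type default format" order comes from the description.

```python
from typing import Callable, Dict

# Stand-in for cls.DEFAULT_FORMATS: the default format per dataframe type.
DEFAULT_FORMATS: Dict[str, str] = {"pandas.DataFrame": "parquet"}

# Stand-in registry: handlers[df_type][protocol][format] -> encoder.
handlers: Dict[str, Dict[str, Dict[str, Callable[[object], str]]]] = {
    "pandas.DataFrame": {
        # No "" handler for s3, so the default format (parquet) is used.
        "s3": {"parquet": lambda df: "parquet-encoder"},
        # A format-agnostic handler registered under "" for BigQuery.
        "bq": {"": lambda df: "bigquery-encoder"},
    }
}


def find_encoder(df_type: str, protocol: str) -> Callable[[object], str]:
    """Prefer the handler registered for the "" format; otherwise fall back
    to the handler for the type's default format."""
    by_format = handlers[df_type][protocol]
    if "" in by_format:
        return by_format[""]
    return by_format[DEFAULT_FORMATS[df_type]]


print(find_encoder("pandas.DataFrame", "bq")(None))
print(find_encoder("pandas.DataFrame", "s3")(None))
```

In this sketch a BigQuery handler registered under "" is picked up directly, while a protocol that only has a Parquet handler still resolves through the default format.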

Type

  • Bug Fix
  • Feature
  • Plugin

Are all requirements met?

  • Code completed
  • Smoke tested
  • Unit tests added
  • Code documentation added
  • Any pending items have an associated Issue

Tracking Issue

flyteorg/flyte#2864

Signed-off-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: Kevin Su <pingsutw@apache.org>
codecov bot commented Sep 13, 2022

Codecov Report

Merging #1159 (beec6a7) into master (2ccaed7) will increase coverage by 0.58%.
The diff coverage is 76.78%.

@@            Coverage Diff             @@
##           master    #1159      +/-   ##
==========================================
+ Coverage   68.51%   69.09%   +0.58%     
==========================================
  Files         288      295       +7     
  Lines       26085    26983     +898     
  Branches     2918     2537     -381     
==========================================
+ Hits        17871    18644     +773     
- Misses       7735     7843     +108     
- Partials      479      496      +17     
Impacted Files Coverage Δ
flytekit/clis/helpers.py 94.59% <ø> (ø)
flytekit/clis/sdk_in_container/__init__.py 100.00% <ø> (+33.33%) ⬆️
flytekit/configuration/internal.py 16.43% <0.00%> (+0.43%) ⬆️
flytekit/core/context_manager.py 39.61% <0.00%> (ø)
flytekit/core/launch_plan.py 57.89% <0.00%> (ø)
flytekit/core/map_task.py 43.11% <0.00%> (ø)
flytekit/deck/__init__.py 0.00% <ø> (ø)
flytekit/extras/tensorflow/__init__.py 0.00% <0.00%> (ø)
flytekit/models/literals.py 40.28% <0.00%> (ø)
flytekit/tools/fast_registration.py 81.53% <0.00%> (-7.53%) ⬇️
... and 101 more


Signed-off-by: Kevin Su <pingsutw@apache.org>
@wild-endeavor (Contributor) commented

We are going to hold off on this change for a while; we need to get the propeller change out and people migrated first. See flyteorg/flytepropeller#483.

@wild-endeavor (Contributor) left a comment

Unfortunately, I think there is a negative interaction with the PR that removed the setting of the default for many of the types. I think it's time to address the flaw introduced in that PR: there we took out the default_for_type switch because it was setting the default for both the format and the storage backend. I think we should add two new args, set_default_format and set_default_storage, or something like that.
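
A rough sketch of what splitting the switch could look like, purely as an illustration of the suggestion. `ToyRegistry` and its attributes are hypothetical, and the argument names are taken from the comment above rather than from any actual flytekit signature.

```python
from typing import Dict


class ToyRegistry:
    """Hypothetical handler registry with separate default switches."""

    def __init__(self) -> None:
        self.handlers: Dict[str, Dict[str, Dict[str, str]]] = {}
        self.default_formats: Dict[str, str] = {}
        self.default_storage: Dict[str, str] = {}

    def register(
        self,
        df_type: str,
        protocol: str,
        fmt: str,
        handler: str,
        set_default_format: bool = False,
        set_default_storage: bool = False,
    ) -> None:
        # Always record the handler under handlers[df_type][protocol][fmt].
        self.handlers.setdefault(df_type, {}).setdefault(protocol, {})[fmt] = handler
        # Each default is set independently of the other.
        if set_default_format:
            self.default_formats[df_type] = fmt
        if set_default_storage:
            self.default_storage[df_type] = protocol


registry = ToyRegistry()
# Make parquet the default format without touching the default storage.
registry.register("pandas.DataFrame", "s3", "parquet", "parquet-handler", set_default_format=True)
# Register a BigQuery handler without changing either default.
registry.register("pandas.DataFrame", "bq", "", "bq-handler")
```

The point of the two flags is that a handler can claim the default format for a type without also claiming the default storage backend, which a single combined switch cannot express.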

Review comment on flytekit/types/structured/structured_dataset.py (outdated, resolved)
Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>
@pingsutw pingsutw merged commit 6d78c56 into master Dec 9, 2022
eapolinario pushed a commit that referenced this pull request Feb 22, 2023
* Set default format of structured dataset to empty

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Fix tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Fix tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

* lint

Signed-off-by: Kevin Su <pingsutw@apache.org>

* last error (#1364)

Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>

Signed-off-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>
Co-authored-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>
eapolinario added a commit that referenced this pull request Feb 23, 2023
* Force flyteidl==1.2.9

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

* Sanitize query template input in sqlite task (#1359)

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Co-authored-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

* TypeTransformer for reading and writing from TensorFlowRecord format (#1240)

* first commit

Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>

* add tensorflow example tf record transformer

Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>

* refactor

Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>

* correct tfexample description

Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>

* fix test_native.py

Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>

* add tensorflow docs and reqs

Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>

* add tensorflow docs and reqs1

Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>

* tensorflow import in init

Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>

* fix failing tests

Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>

* add tensorflow pinned version to reqs

Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>

* pin grpcio-status to remove protobuf error

Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>

* add suggested changes

Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>

* redesign transformer

Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>

* remove old script

Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>

* fix type reference for TFREcordDataset

Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>

* refactor

Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>

* refactor

Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>

* spacing and uppercase

Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>

* redesign with tfdir and tfrecordfile subclass

Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>

* fix conflicts and typos

Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>

* address majority of comments

Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>

* refactor

Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>

* fix test with flytefile and metadata annotated

Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>

* fix check for example records in directory

Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>

* refactor and correct typing

Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>

* lint

Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>

* import annotated from typing_extensions

Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>

* tweak to tests to test case when Config not passed in as type

Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>

* add suggested changes

Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>

* add task for tfrecord dir with no config in test

Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>

* get filenames from local dir instead of remote

Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>

Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>

* update ray plugin dependency (#1361)

Signed-off-by: Kevin Su <pingsutw@apache.org>

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Set default format of structured dataset to empty (#1159)

* Set default format of structured dataset to empty

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Fix tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Fix tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

* lint

Signed-off-by: Kevin Su <pingsutw@apache.org>

* last error (#1364)

Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>

Signed-off-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>
Co-authored-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>

* Adds CLI reference for pyflyte (#1362)

* Adds pyflyte CLI reference guide

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* bump python version

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* bump python version

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* resolve docs error

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* set nested to none

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* remove flyteidl version constraint

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* update requirements

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* Signaling (#1133)

Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>

* Adding created and updated at to ExecutionClosure model (#1371)

Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>

* Add Databricks config to Spark Job (#1358)

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Add overwrite_cache option the to calls of remote and local executions (#1375)

Signed-off-by: H. Furkan Vural <hfurkanvural@blackshark.ai>

Implemented cache overwrite feature is added on flytekit as well for the completeness. In order to support the cache eviction RFC, an overwrite parameter was added, indicating the data store should replace an existing artifact instead of creating a new one on local calls.

* Remove project/domain from being overridden with execution values in serialized context (#1378)

Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>

* Use TaskSpec instead of TaskTemplate for fetch_task and avoid network when loading module (#1348)

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>

* Register Databricks config (#1379)

* Register databricks plugin

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Update databricks plugin

Signed-off-by: Kevin Su <pingsutw@apache.org>

* register databricks

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

Signed-off-by: Kevin Su <pingsutw@apache.org>
Co-authored-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>

* PodSpec should not require primary_container name (#1380)

For Pod tasks, if the primary_container_name is not specified, it should default.

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>

* fix(pyflyte): change -d to -D for --destination-dir as -d is already for --domain (#1381)

Co-authored-by: Eduardo Apolinario <653394+eapolinario@users.noreply.github.com>

* Handle Optional[FlyteFile] in Dataclass type transformer (#1393)

* Add support for Optional to dataclass transformer

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

* Add one more test

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

* Add one more test

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

* Fix serialization of optional flyte types

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Co-authored-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

* add FastSerializationSettings to docs (#1386)

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>
Co-authored-by: Kevin Su <pingsutw@apache.org>

* Added more pod tests and an example pod task (#1382)

* Added more pod tests and an example pod task

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>

* fixing test and name

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>

* updated

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>

* Convert default dict to json string in pyflyte run (#1399)

Signed-off-by: Kevin Su <pingsutw@apache.org>

Signed-off-by: Kevin Su <pingsutw@apache.org>
Co-authored-by: Eduardo Apolinario <653394+eapolinario@users.noreply.github.com>

* docs: update register help, non-fast version is supported (#1402)

Signed-off-by: Patrick Brogan <pbrogan12@gmail.com>

* Update log level for structured dataset (#1394)

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Add Niels to code owners (#1404)

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Signal use (#1398)

Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>

* User Documentation Proposal (#1200)

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Add support MLFlow plugin (#1274)

* MLFlow plugin in progress

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* update test

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* update readme

Signed-off-by: Kevin Su <pingsutw@apache.org>

* lint

Signed-off-by: Kevin Su <pingsutw@apache.org>

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* dwip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* change experiment name

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Add mlflow to index.rst

Signed-off-by: Kevin Su <pingsutw@apache.org>

* use experiment name that user provided

Signed-off-by: Kevin Su <pingsutw@apache.org>

* update doc-requirements.txt

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Add backend plugin deployment

Signed-off-by: Kevin Su <pingsutw@apache.org>

* generate doc for method

Signed-off-by: Kevin Su <pingsutw@apache.org>

* lint

Signed-off-by: Kevin Su <pingsutw@apache.org>

* update docstring

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* update docstring

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* Update tracking.py

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>
Signed-off-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>
Co-authored-by: Kevin Su <pingsutw@apache.org>
Co-authored-by: Niels Bantilan <niels.bantilan@gmail.com>

* fix remote API reference (#1405)

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* Read structured dataset from a folder  (#1406)

* Read polars dataframe in a folder

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Read polars dataframe in a folder

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Load huggingface and spark plugin implicitly

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Fix tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

* remove _pyspark alias

Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>

Signed-off-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>
Co-authored-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>

* Update default config to work out-of-the-box with flytectl demo (#1384)

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* Add dask plugin #patch (#1366)

* Add dummy task type to test backend plugin

Signed-off-by: Bernhard Stadlbauer <b.stadlbauer@gmx.net>

* Add docs page

Signed-off-by: Bernhard Stadlbauer <b.stadlbauer@gmx.net>

* Add dask models

Signed-off-by: Bernhard Stadlbauer <b.stadlbauer@gmx.net>

* Add function to convert resources

Signed-off-by: Bernhard Stadlbauer <b.stadlbauer@gmx.net>

* Add tests to `dask` task

Signed-off-by: Bernhard Stadlbauer <b.stadlbauer@gmx.net>

* Remove namespace

Signed-off-by: Bernhard Stadlbauer <b.stadlbauer@gmx.net>

* Update setup.py

Signed-off-by: Bernhard Stadlbauer <b.stadlbauer@gmx.net>

* Add dask to `plugin/README.md`

Signed-off-by: Bernhard Stadlbauer <b.stadlbauer@gmx.net>

* Add README.md for `dask`

Signed-off-by: Bernhard Stadlbauer <b.stadlbauer@gmx.net>

* Top level export of `JopPodSpec` and `DaskCluster`

Signed-off-by: Bernhard Stadlbauer <b.stadlbauer@gmx.net>

* Update docs for images

Signed-off-by: Bernhard Stadlbauer <b.stadlbauer@gmx.net>

* Update README.md

Signed-off-by: Bernhard Stadlbauer <b.stadlbauer@gmx.net>

* Update models after `flyteidl` change

Signed-off-by: Bernhard Stadlbauer <b.stadlbauer@gmx.net>

* Update task after `flyteidl` change

Signed-off-by: Bernhard Stadlbauer <b.stadlbauer@gmx.net>

* Raise error when less than 1 worker

Signed-off-by: Bernhard Stadlbauer <b.stadlbauer@gmx.net>

* Update flyteidl to >= 1.3.2

Signed-off-by: Bernhard Stadlbauer <b.stadlbauer@gmx.net>

* Update doc requirements

Signed-off-by: Bernhard Stadlbauer <b.stadlbauer@gmx.net>

* Update doc-requirements.txt

Signed-off-by: Bernhard Stadlbauer <b.stadlbauer@gmx.net>

* Re-lock dependencies on linux

Signed-off-by: Bernhard Stadlbauer <bernhard@pachama.com>

* Update dask API docs

Signed-off-by: Bernhard Stadlbauer <b.stadlbauer@gmx.net>

* Fix documentation links

Signed-off-by: Bernhard Stadlbauer <b.stadlbauer@gmx.net>

* Default optional model constructor arguments to `None`

Signed-off-by: Bernhard Stadlbauer <b.stadlbauer@gmx.net>

* Refactor `convert_resources_to_resource_model` to `core.resources`

Signed-off-by: Bernhard Stadlbauer <b.stadlbauer@gmx.net>

* Use `convert_resources_to_resource_model` in `core.node`

Signed-off-by: Bernhard Stadlbauer <b.stadlbauer@gmx.net>

* Incorporate review feedback

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

* Lint

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

Signed-off-by: Bernhard Stadlbauer <b.stadlbauer@gmx.net>
Signed-off-by: Bernhard Stadlbauer <bernhard@pachama.com>
Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Co-authored-by: Eduardo Apolinario <653394+eapolinario@users.noreply.github.com>
Co-authored-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

* Add support for overriding task configurations (#1410)

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Warning if git is not installed (#1414)

* warning if git is not installed

Signed-off-by: Kevin Su <pingsutw@apache.org>

* lint

Signed-off-by: Kevin Su <pingsutw@apache.org>

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Flip the settings for channel and logger (#1415)

Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>

* Preserving Exception in the LazyEntity fetch (#1412)

* Preserving Exception in the LazyEntity fetch

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>

* updated lint error

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>

* more tests

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>

* [Docs] SynchronousFlyteClient API reference #3095 (#1416)

Signed-off-by: Peeter Piegaze <peeter@union.ai>

Signed-off-by: Peeter Piegaze <peeter@union.ai>
Co-authored-by: Peeter Piegaze <peeter@union.ai>
Co-authored-by: Haytham Abuelfutuh <haytham@afutuh.com>

* Return error code on fail (#1408)

* AWS batch return error code once it fails

Signed-off-by: Kevin Su <pingsutw@gmail.com>

* AWS batch return error code once it fails

Signed-off-by: Kevin Su <pingsutw@gmail.com>

* update tests

Signed-off-by: Kevin Su <pingsutw@gmail.com>

* Update tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

Signed-off-by: Kevin Su <pingsutw@gmail.com>
Signed-off-by: Kevin Su <pingsutw@apache.org>

* wrapping flyte entity in a task node in call to flyte node constructor, not sure if integration tests are actually running (#1422)

Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>

Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>

* Sqlalchemy multiline query (#1421)

* SQLAlchemyTask should handle multiline strings for query template

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* sqlalchemy supports multi-line query

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* update base sql task

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* remove space

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* fix snowflake tests

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* fix lint

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* fix test

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* Sklearn type transformer should be automatically loaded with import flytekit (#1423)

* add flytekit.extras.sklearn to main __init__ import

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* fix docs

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* add temporary docs/requirements.txt for onnx plugins

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

---------

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* Bump isort to 5.12.0 (#1427)

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Co-authored-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

* Fixes guess type bug in UnionTransformer (#1426)

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>
Co-authored-by: Eduardo Apolinario <653394+eapolinario@users.noreply.github.com>

* Add `pod_template` and `pod_template_name` arguments for `PythonAutoContainerTask`, its downstream tasks, and `@task`. (#1425)

* Add `pod_template` and `pod_template_name` arguments for `PythonAutoContainerTask`, its downstream tasks, and `@task`

Signed-off-by: byhsu <byhsu@linkedin.com>

* clean

Signed-off-by: byhsu <byhsu@linkedin.com>

* fix test

Signed-off-by: byhsu <byhsu@linkedin.com>

* Fix taskmetadata

Signed-off-by: byhsu <byhsu@linkedin.com>

* add kubernetes in setup.py

Signed-off-by: byhsu <byhsu@linkedin.com>

* address comments

Signed-off-by: byhsu <byhsu@linkedin.com>

* Regenerate requirements using python 3.7

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Signed-off-by: byhsu <byhsu@linkedin.com>

* keep container validation

Signed-off-by: byhsu <byhsu@linkedin.com>

* bump idl version

Signed-off-by: byhsu <byhsu@linkedin.com>

* Regenerate requirements using python 3.7

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

* Regenerate doc-requirements.txt

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

* fix

Signed-off-by: byhsu <byhsu@linkedin.com>

---------

Signed-off-by: byhsu <byhsu@linkedin.com>
Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Co-authored-by: byhsu <byhsu@linkedin.com>
Co-authored-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

* Auto Backfill workflow (#1420)

* Fix primitive decoder when evaluating Promise (#1432)

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* set maximum python version to 3.10 (#1433)

* set maximum python version to 3.10

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* remove unneeded python version check

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* fix lint

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

---------

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* Revert "Remove project/domain from being overridden with execution values in serialized context (#1378)" (#1460)

* Revert "Remove project/domain from being overridden with execution values in serialized context (#1378)"

This reverts commit b3bfef5.

* Import os

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

* Lint

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

---------

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Co-authored-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

* Support checkpointing in local mode from cached tasks (#1457)

* support checkpointing in local mode from cached tasks

* clear cache before tests

---------

Co-authored-by: Stef Nelson-Lindall <stef@stripe.com>
Co-authored-by: Eduardo Apolinario <653394+eapolinario@users.noreply.github.com>

* Deprecate FlyteSchema (#1418)

* Deprecate FlyteSchema

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Remove version

Signed-off-by: Kevin Su <pingsutw@apache.org>

---------

Signed-off-by: Kevin Su <pingsutw@apache.org>
Co-authored-by: Eduardo Apolinario <653394+eapolinario@users.noreply.github.com>

* Use scarf images (#1434)

* Use scarf images

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

* Use scarf names in tests.

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

---------

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Co-authored-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

* add undocumented objects/functions to flytekit api ref (#1502)

* add reference_launch_plan to flytekit api ref

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* import in init, add docstrings

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* add more to references

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* fix lint

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* update

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* fix up docstrings

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

---------

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>
Co-authored-by: Eduardo Apolinario <653394+eapolinario@users.noreply.github.com>
Co-authored-by: Samhita Alla <aallasamhita@gmail.com>

* Use non-root user in default flytekit image (#1417)

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Fix PyTorch transformer (#1510)

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* Fix mypy errors (#1313)

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Fix mypy errors

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Fix mypy errors

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Fix tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Fix tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Fix tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* fix tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

* fix tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

* fix test

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Update type

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Fix tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Fix tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Fix tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* update dev-requirements.txt

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Address comment

Signed-off-by: Kevin Su <pingsutw@apache.org>

* upgrade torch

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* lint

Signed-off-by: Kevin Su <pingsutw@apache.org>

---------

Signed-off-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: Kevin Su <pingsutw@gmail.com>
Co-authored-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>

* Compile the workflow only at compile time (#1311)

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* add tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

* add tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

* support dynamic task

Signed-off-by: Kevin Su <pingsutw@apache.org>

* test

Signed-off-by: Kevin Su <pingsutw@apache.org>

* test

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* lazy compile

Signed-off-by: Kevin Su <pingsutw@apache.org>

* lint

Signed-off-by: Kevin Su <pingsutw@apache.org>

* add tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* test

Signed-off-by: Kevin Su <pingsutw@apache.org>

* test

Signed-off-by: Kevin Su <pingsutw@apache.org>

* lint

Signed-off-by: Kevin Su <pingsutw@apache.org>

* test

Signed-off-by: Kevin Su <pingsutw@apache.org>

* test

Signed-off-by: Kevin Su <pingsutw@apache.org>

* test

Signed-off-by: Kevin Su <pingsutw@apache.org>

* test

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* update test

Signed-off-by: Kevin Su <pingsutw@apache.org>

---------

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Get the origin type when serializing dataclass (#1508)

* Get the origin type when serializing dataclass

Signed-off-by: Kevin Su <pingsutw@apache.org>

* test

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* update test

Signed-off-by: Kevin Su <pingsutw@apache.org>

* lint

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

---------

Signed-off-by: Kevin Su <pingsutw@apache.org>
Co-authored-by: Niels Bantilan <niels.bantilan@gmail.com>

* Fix bad merge

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

* Delay initialization of SynchronousFlyteClient in FlyteRemote (#1514)

* Delay initialization of SynchronousFlyteClient in FlyteRemote

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

* Fix spark plugin flyteremote test.

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

* Lint

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

---------

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Co-authored-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

* Set flytekit and flyteidl bounds in plugins tests

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

* Revert "Fix mypy errors (#1313)"

This reverts commit 3798450.

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

* Fix requirements in dask and ray plugins

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

* Fix papermill tests requirements

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

* Fix doc-requirements

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

* dask plugin requirements

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

* Revert "Add dask plugin #patch (#1366)"

This reverts commit 41a9c7a.

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

---------

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Signed-off-by: Ryan Nazareth <ryankarlos@gmail.com>
Signed-off-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>
Signed-off-by: Samhita Alla <aallasamhita@gmail.com>
Signed-off-by: Ketan Umare <ketan.umare@gmail.com>
Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>
Signed-off-by: Patrick Brogan <pbrogan12@gmail.com>
Signed-off-by: Bernhard Stadlbauer <b.stadlbauer@gmx.net>
Signed-off-by: Bernhard Stadlbauer <bernhard@pachama.com>
Signed-off-by: Peeter Piegaze <peeter@union.ai>
Signed-off-by: Kevin Su <pingsutw@gmail.com>
Signed-off-by: byhsu <byhsu@linkedin.com>
Co-authored-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Co-authored-by: Ryan Nazareth <ryankarlos@gmail.com>
Co-authored-by: Kevin Su <pingsutw@apache.org>
Co-authored-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>
Co-authored-by: Samhita Alla <aallasamhita@gmail.com>
Co-authored-by: H. Furkan Vural <33652917+hfurkanvural@users.noreply.github.com>
Co-authored-by: Ketan Umare <16888709+kumare3@users.noreply.github.com>
Co-authored-by: mcloney-ddm <119345186+mcloney-ddm@users.noreply.github.com>
Co-authored-by: Niels Bantilan <niels.bantilan@gmail.com>
Co-authored-by: pbrogan12 <pbrogan12@gmail.com>
Co-authored-by: bstadlbauer <11799671+bstadlbauer@users.noreply.github.com>
Co-authored-by: Peeter Piegaze <peeter@piegaze.com>
Co-authored-by: Peeter Piegaze <peeter@union.ai>
Co-authored-by: Haytham Abuelfutuh <haytham@afutuh.com>
Co-authored-by: ByronHsu <byronhsu1230@gmail.com>
Co-authored-by: byhsu <byhsu@linkedin.com>
Co-authored-by: Stef Lindall <bethebunny@gmail.com>
Co-authored-by: Stef Nelson-Lindall <stef@stripe.com>